Patrick Steinhardt <ps@xxxxxx> writes: > On Thu, Mar 20, 2025 at 12:44:02PM +0100, Karthik Nayak wrote: >> diff --git a/refs.c b/refs.c >> index b34ec198f5..f719046f47 100644 >> --- a/refs.c >> +++ b/refs.c >> @@ -2540,6 +2540,7 @@ enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs >> const struct string_list *refnames, >> const struct string_list *extras, >> const struct string_list *skip, >> + struct ref_transaction *transaction, >> unsigned int initial_transaction, >> struct strbuf *err) >> { >> @@ -2599,12 +2601,26 @@ enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs >> if (!initial_transaction && >> !refs_read_raw_ref(refs, dirname.buf, &oid, &referent, >> &type, &ignore_errno)) { >> + if (transaction && ref_transaction_maybe_set_rejected( >> + transaction, *update_idx, >> + REF_TRANSACTION_ERROR_NAME_CONFLICT)) { >> + strset_remove(&dirnames, dirname.buf); >> + continue; >> + } >> + > > Okay. We have to remove the dirname from `dirnames` again so that the > next reference that creates a reference in the same directory would also > be marked as conflicting. It does have the consequence that we now have > to read the dirname N times again, where N is the number of refs that > are created below that directory. > > We could probably improve this by using another map that contains the > conflicting names, right? > Yes that's definitely possible, I will go ahead and add it! >> diff --git a/refs/files-backend.c b/refs/files-backend.c >> index be758ffff5..1d50d4013c 100644 >> --- a/refs/files-backend.c >> +++ b/refs/files-backend.c >> @@ -864,7 +868,9 @@ static enum ref_transaction_error lock_raw_ref(struct files_ref_store *refs, >> * make sure there is no existing packed ref that conflicts >> * with refname. This check is deferred so that we can batch it. >> */ >> - string_list_append(refnames_to_check, refname); >> + item = string_list_append(refnames_to_check, refname); >> + item->util = xmalloc(sizeof(update_idx)); >> + memcpy(item->util, &update_idx, sizeof(update_idx)); >> } >> >> ret = 0; > > Hm, so we have to allocate the `util` field now to store the update > index, which is a bit unfortunate because all of this is part of the hot > loop. We cannot store a direct pointer though because the array of > updates may be reallocated, which would invalidate any pointers pointing > into the array. > Yes, your inference is on point. > I was wondering whether we could abuse an `uintptr_t` and use it to > store the update index as a pointer. It does feel somewhat dirty though. > We can do something like this, it would be _clever_ but does feel very hacky. I did so some benchmarking here Benchmark 1: update-ref: create many refs (refformat = files, refcount = 100000, revision = master) Time (mean ± σ): 7.396 s ± 0.175 s [User: 0.962 s, System: 6.312 s] Range (min … max): 7.145 s … 7.688 s 10 runs Benchmark 2: update-ref: create many refs (refformat = files, refcount = 100000, revision = b4/245-partially-atomic-ref-updates) Time (mean ± σ): 7.514 s ± 0.144 s [User: 0.919 s, System: 6.438 s] Range (min … max): 7.297 s … 7.750 s 10 runs Summary update-ref: create many refs (refformat = files, refcount = 100000, revision = master) ran 1.02 ± 0.03 times faster than update-ref: create many refs (refformat = files, refcount = 100000, revision = b4/245-partially-atomic-ref-updates) Overall the perf degradation is very minimal. I also looked at the flamegraph to see if there is something. But most of the time seems to be spent in IO (reading refs and creating locks). So I think we should be ok here, wdyt? > Patrick
Attachment:
signature.asc
Description: PGP signature