On Mon, Mar 24, 2025 at 05:48:32PM +0000, Karthik Nayak wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > > On Thu, Mar 20, 2025 at 12:44:02PM +0100, Karthik Nayak wrote: > >> diff --git a/refs/files-backend.c b/refs/files-backend.c > >> index be758ffff5..1d50d4013c 100644 > >> --- a/refs/files-backend.c > >> +++ b/refs/files-backend.c > >> @@ -864,7 +868,9 @@ static enum ref_transaction_error lock_raw_ref(struct files_ref_store *refs, > >> * make sure there is no existing packed ref that conflicts > >> * with refname. This check is deferred so that we can batch it. > >> */ > >> - string_list_append(refnames_to_check, refname); > >> + item = string_list_append(refnames_to_check, refname); > >> + item->util = xmalloc(sizeof(update_idx)); > >> + memcpy(item->util, &update_idx, sizeof(update_idx)); > >> } > >> > >> ret = 0; > > > > Hm, so we have to allocate the `util` field now to store the update > > index, which is a bit unfortunate because all of this is part of the hot > > loop. We cannot store a direct pointer though because the array of > > updates may be reallocated, which would invalidate any pointers pointing > > into the array. > > > > Yes, your inference is on point. > > > I was wondering whether we could abuse an `uintptr_t` and use it to > > store the update index as a pointer. It does feel somewhat dirty though. > > > > We can do something like this, it would be _clever_ but does feel very > hacky. > > I did so some benchmarking here > > Benchmark 1: update-ref: create many refs (refformat = files, refcount > = 100000, revision = master) > Time (mean ± σ): 7.396 s ± 0.175 s [User: 0.962 s, System: 6.312 s] > Range (min … max): 7.145 s … 7.688 s 10 runs > > Benchmark 2: update-ref: create many refs (refformat = files, refcount > = 100000, revision = b4/245-partially-atomic-ref-updates) > Time (mean ± σ): 7.514 s ± 0.144 s [User: 0.919 s, System: 6.438 s] > Range (min … max): 7.297 s … 7.750 s 10 runs > > Summary > update-ref: create many refs (refformat = files, refcount = 100000, > revision = master) ran > 1.02 ± 0.03 times faster than update-ref: create many refs > (refformat = files, refcount = 100000, revision = > b4/245-partially-atomic-ref-updates) > > Overall the perf degradation is very minimal. I also looked at the > flamegraph to see if there is something. But most of the time seems to > be spent in IO (reading refs and creating locks). > > So I think we should be ok here, wdyt? Okay. I'm not a huge fan of using the `->util` pointer like this, but I don't have a better idea either, and the performance regression seems acceptable. So let's roll with it until somebody has a better idea. Thanks for doing the benchmark! Patrick