On Mon, Aug 25, 2025 at 11:39:20PM -0400, Jeff King wrote:

> > But the hash function being oidhash(), I am a bit surprised. It
> > shouldn't be so much more expensive to peek at the first 4 bytes and
> > then do the usual hashtable thing than looking at the in-object
> > commit->index. Is it a sign that the range of oidhash() is a bit
> > too small for a real workload?
> >
> > Nah, 4 byte unsigned integer should be sufficient for the number of
> > objects in the kernel.
>
> I was surprised, too. I expected it be maybe 20% slower or something.
> Which really makes me think I've managed to screw up the patch, but if
> so, I don't see it. I tried profiling the result, expecting to see a
> bunch of extra time spent in obj_timestamp_put() or obj_timestamp_get().
> But I don't. They account together for only a few percent of the
> run-time, according to perf.
>
> So I dunno. I am confused by the results, but I am not sure if I am
> holding it wrong.

OK, maybe I am just holding it wrong. I think I may have mistakenly been
using the wrong timing for my baseline (maybe --date-order instead of
--author-date-order; the latter is _way_ more expensive because we have
to open the commits to parse the author date).

Here's a more apples-to-apples comparison using hyperfine. On git.git:

  Benchmark 1: ./git.slab rev-list --author-date-order HEAD
    Time (mean ± σ):     547.3 ms ±  12.2 ms    [User: 535.8 ms, System: 11.3 ms]
    Range (min … max):   536.1 ms … 566.4 ms    10 runs

  Benchmark 2: ./git.hash rev-list --author-date-order HEAD
    Time (mean ± σ):     558.6 ms ±  11.2 ms    [User: 542.4 ms, System: 16.0 ms]
    Range (min … max):   544.4 ms … 572.6 ms    10 runs

  Summary
    ./git.slab rev-list --author-date-order HEAD ran
      1.02 ± 0.03 times faster than ./git.hash rev-list --author-date-order HEAD

So a little slowdown, but within the run-to-run noise.
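
One way to sanity-check "within the run-to-run noise" is to compare the
gap between the two means against the spread of a single benchmark. The
sketch below does that in C. Note that struct timing, slowdown_percent()
and gap_in_sigmas() are names invented for this illustration (they are
not part of git or hyperfine), and the "one sigma" yardstick is a rough
rule of thumb, not a proper significance test.

```c
#include <assert.h>

/* Mean and standard deviation of one benchmark, both in the same unit. */
struct timing {
	double mean;
	double stddev;
};

/* Relative slowdown of "slow" versus "fast", as a percentage. */
static double slowdown_percent(struct timing fast, struct timing slow)
{
	assert(fast.mean > 0);
	return (slow.mean - fast.mean) / fast.mean * 100.0;
}

/*
 * Gap between the two means, measured in units of the larger of the
 * two standard deviations. A value below 1 means the gap is smaller
 * than the spread of a single benchmark.
 */
static double gap_in_sigmas(struct timing fast, struct timing slow)
{
	double sigma = fast.stddev > slow.stddev ? fast.stddev : slow.stddev;

	assert(sigma > 0);
	return (slow.mean - fast.mean) / sigma;
}
```

Fed the git.git numbers above (547.3 ± 12.2 ms for git.slab, 558.6 ±
11.2 ms for git.hash), slowdown_percent() comes out at roughly 2% and
gap_in_sigmas() at a bit under 1, which is consistent with calling the
difference noise.
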
And on linux.git:

  Benchmark 1: ~/compile/git/git.slab rev-list --author-date-order HEAD
    Time (mean ± σ):     11.020 s ±  0.131 s    [User: 10.764 s, System: 0.254 s]
    Range (min … max):   10.886 s … 11.262 s    10 runs

  Benchmark 2: ~/compile/git/git.hash rev-list --author-date-order HEAD
    Time (mean ± σ):     11.682 s ±  0.204 s    [User: 11.398 s, System: 0.282 s]
    Range (min … max):   11.424 s … 12.139 s    10 runs

  Summary
    ~/compile/git/git.slab rev-list --author-date-order HEAD ran
      1.06 ± 0.02 times faster than ~/compile/git/git.hash rev-list --author-date-order HEAD

A little more measurable there. Those numbers are more in line with what
I was expecting.

I'm not sure what it all means, though. 6% is enough that it is probably
worth keeping a custom data type like slab around. Though it would be
nice to have a data type that worked on all object types and didn't
necessarily use a ton of memory.

This particular case may not be representative, either. I picked it
because it was easy to convert. But I wonder how bad it would be to put
the object flags for a traversal into a hash. Right now those are in the
original struct, not even in a commit-slab. So I'd guess it's an even
bigger slowdown.

-Peff
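
For readers who want the two lookup strategies side by side, here is a
toy sketch in C. It is not git's commit-slab or hashmap code: every
toy_* name is made up for illustration, the "slab" is a single flat
array where the real thing may be organized differently, and the hash
table is a fixed-size linear-probing table that never grows. The only
part taken from the thread is the idea of hashing by peeking at the
first 4 bytes of the object id.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_OID_LEN 20

/* Hypothetical stand-in for a commit: an object id plus a dense index. */
struct toy_commit {
	unsigned char oid[TOY_OID_LEN];
	unsigned int index; /* assumed to be handed out sequentially */
};

/* Slab-style storage: one flat array, addressed directly by index. */
struct toy_slab {
	int64_t *dates;
	size_t nr;
};

static int64_t *toy_slab_at(struct toy_slab *s, const struct toy_commit *c)
{
	if (c->index >= s->nr) {
		size_t new_nr = ((size_t)c->index + 1) * 2;
		int64_t *p = realloc(s->dates, new_nr * sizeof(*p));

		assert(p);
		memset(p + s->nr, 0, (new_nr - s->nr) * sizeof(*p));
		s->dates = p;
		s->nr = new_nr;
	}
	return &s->dates[c->index];
}

/* Hash-style storage: open addressing keyed on the object id. */
struct toy_entry {
	const struct toy_commit *key; /* NULL marks an empty bucket */
	int64_t date;
};

struct toy_hash {
	struct toy_entry *buckets;
	size_t size; /* must be a power of two */
	size_t used;
};

/* Use the leading bytes of the object id as the hash value. */
static unsigned int toy_oidhash(const unsigned char *oid)
{
	unsigned int h;

	memcpy(&h, oid, sizeof(h));
	return h;
}

static void toy_hash_init(struct toy_hash *t, size_t size)
{
	assert(size && !(size & (size - 1)));
	t->buckets = calloc(size, sizeof(*t->buckets));
	assert(t->buckets);
	t->size = size;
	t->used = 0;
}

/* Linear probing: returns the matching bucket, or the first empty one. */
static struct toy_entry *toy_hash_find(struct toy_hash *t,
				       const struct toy_commit *c)
{
	size_t i = toy_oidhash(c->oid) & (t->size - 1);

	while (t->buckets[i].key &&
	       memcmp(t->buckets[i].key->oid, c->oid, TOY_OID_LEN))
		i = (i + 1) & (t->size - 1);
	return &t->buckets[i];
}

static void toy_hash_put(struct toy_hash *t, const struct toy_commit *c,
			 int64_t date)
{
	struct toy_entry *e = toy_hash_find(t, c);

	if (!e->key) {
		/* keep at least one bucket empty so probing always ends */
		assert(t->used + 1 < t->size);
		e->key = c;
		t->used++;
	}
	e->date = date;
}

/* Returns 1 and fills *date if the commit has an entry, 0 otherwise. */
static int toy_hash_get(struct toy_hash *t, const struct toy_commit *c,
			int64_t *date)
{
	struct toy_entry *e = toy_hash_find(t, c);

	if (!e->key)
		return 0;
	*date = e->date;
	return 1;
}
```

The slab lookup is a bounds check plus an array index. The hash lookup
adds a memcpy, a mask, and at least one memcmp against a stored key, and
possibly more on a collision. That extra work per lookup is the kind of
overhead the timings above are measuring, though this toy version says
nothing about how large it is in practice.
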