Re: [PATCH] describe: use khash in finish_depth_computation()


 



On Mon, Aug 25, 2025 at 11:39:20PM -0400, Jeff King wrote:

> > But the hash function being oidhash(), I am a bit surprised.  It
> > shouldn't be so much more expensive to peek at the first 4 bytes and
> > then do the usual hashtable thing than looking at the in-object
> > commit->index.  Is it a sign that the range of oidhash() is a bit
> > too small for a real workload?
> > 
> > Nah, 4 byte unsigned integer should be sufficient for the number of
> > objects in the kernel.
> 
> I was surprised, too. I expected it to be maybe 20% slower or something.
> Which really makes me think I've managed to screw up the patch, but if
> so, I don't see it. I tried profiling the result, expecting to see a
> bunch of extra time spent in obj_timestamp_put() or obj_timestamp_get().
> But I don't. They account together for only a few percent of the
> run-time, according to perf.
> 
> So I dunno. I am confused by the results, but I am not sure if I am
> holding it wrong.

OK, maybe I am just holding it wrong. I think I may have mistakenly been
using the wrong timing for my baseline (maybe --date-order instead of
--author-date-order; the latter is _way_ more expensive because we have
to open the commits to parse the author date).

Here's a more apples-to-apples comparison using hyperfine. On git.git:

  Benchmark 1: ./git.slab rev-list --author-date-order HEAD
    Time (mean ± σ):     547.3 ms ±  12.2 ms    [User: 535.8 ms, System: 11.3 ms]
    Range (min … max):   536.1 ms … 566.4 ms    10 runs
  
  Benchmark 2: ./git.hash rev-list --author-date-order HEAD
    Time (mean ± σ):     558.6 ms ±  11.2 ms    [User: 542.4 ms, System: 16.0 ms]
    Range (min … max):   544.4 ms … 572.6 ms    10 runs
  
  Summary
    ./git.slab rev-list --author-date-order HEAD ran
      1.02 ± 0.03 times faster than ./git.hash rev-list --author-date-order HEAD

So a little slowdown, but within the run-to-run noise. And on linux.git:

  Benchmark 1: ~/compile/git/git.slab rev-list --author-date-order HEAD
    Time (mean ± σ):     11.020 s ±  0.131 s    [User: 10.764 s, System: 0.254 s]
    Range (min … max):   10.886 s … 11.262 s    10 runs
  
  Benchmark 2: ~/compile/git/git.hash rev-list --author-date-order HEAD
    Time (mean ± σ):     11.682 s ±  0.204 s    [User: 11.398 s, System: 0.282 s]
    Range (min … max):   11.424 s … 12.139 s    10 runs
  
  Summary
    ~/compile/git/git.slab rev-list --author-date-order HEAD ran
      1.06 ± 0.02 times faster than ~/compile/git/git.hash rev-list --author-date-order HEAD


A little more measurable there. Those numbers are more in line with what
I was expecting. I'm not sure what it all means, though. 6% is enough
that it is probably worth keeping a custom data type like slab around.
Though it would be nice to have a data type that worked on all object
types and didn't necessarily use a ton of memory.
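As a check on the arithmetic, the relative slowdown can be recomputed from the mean times reported by hyperfine above. This is only a sketch: slowdown_percent() is a hypothetical helper written for this illustration, not anything in git or hyperfine.

```c
#include <assert.h>

/*
 * Relative slowdown of `slow` versus `fast`, as a percentage.
 * Hypothetical helper for illustration only.
 */
static double slowdown_percent(double fast, double slow)
{
	assert(fast > 0.0);
	return (slow / fast - 1.0) * 100.0;
}
```

Feeding in the means from the runs above, slowdown_percent(547.3, 558.6) comes out at roughly 2% for git.git and slowdown_percent(11.020, 11.682) at roughly 6% for linux.git, which matches the 1.02 and 1.06 ratios in the hyperfine summaries.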

This particular case may not be representative, either. I picked it
because it was easy to convert. But I wonder how bad it would be to put
the object flags for a traversal into a hash. Right now those are in the
original struct, not even in a commit-slab. So I'd guess it's an even
bigger slowdown.
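To make the trade-off concrete, here is a minimal sketch contrasting flags stored inline in an object struct with flags kept in a side hash table keyed on the first 4 bytes of the object id. Everything in it is an assumption made for illustration: toy_object, toy_oidhash(), flag_table and the side_*() helpers are hypothetical names, and this is neither git's struct object nor its khash or commit-slab code. The table is fixed-size with linear probing and no resizing, and it compares keys by pointer, assuming one struct per object id.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_RAWSZ 20     /* id length in bytes; an assumption for the sketch */
#define TABLE_SIZE 1024  /* power of two; never grows in this sketch */

struct toy_object {
	unsigned char oid[TOY_RAWSZ];
	unsigned flags;  /* inline storage: reading it is one load from the struct */
};

/* Use the first 4 bytes of the id as the hash value. */
static uint32_t toy_oidhash(const unsigned char *oid)
{
	uint32_t h;
	memcpy(&h, oid, sizeof(h));
	return h;
}

struct flag_entry {
	const struct toy_object *key;  /* NULL marks an empty slot */
	unsigned flags;
};

/* Must be zero-initialized before use. */
struct flag_table {
	struct flag_entry slots[TABLE_SIZE];
};

/*
 * Return the slot holding `obj`, or the empty slot where it would go.
 * Linear probing; the caller must keep the table less than full.
 */
static struct flag_entry *flag_table_find(struct flag_table *t,
					  const struct toy_object *obj)
{
	uint32_t i = toy_oidhash(obj->oid) & (TABLE_SIZE - 1);

	for (;;) {
		struct flag_entry *e = &t->slots[i];
		if (!e->key || e->key == obj)
			return e;
		i = (i + 1) & (TABLE_SIZE - 1);
	}
}

/* Side-table storage: hash, probe, and compare before touching the flags. */
static void side_set_flags(struct flag_table *t,
			   const struct toy_object *obj, unsigned flags)
{
	struct flag_entry *e = flag_table_find(t, obj);
	e->key = obj;
	e->flags |= flags;
}

static unsigned side_get_flags(struct flag_table *t,
			       const struct toy_object *obj)
{
	struct flag_entry *e = flag_table_find(t, obj);
	return e->key ? e->flags : 0;
}
```

With inline storage, a flag check during traversal is just obj->flags & SOMETHING. With the side table, every check pays for the hash, at least one probe, and a likely cache miss in a separate allocation, which is why one would guess the slowdown to be larger for something touched as often as traversal flags.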

-Peff



