On Wed, Feb 05, 2025 at 03:14:21PM -0800, Emily Shaffer wrote:
> On Mon, Feb 3, 2025 at 1:55 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> >
> > Hi,
> >
> > due to a couple of performance regressions that we have hit over the
> > last couple of Git releases at GitLab, we have started to set up an
> > effort to implement continuous benchmarking for the Git project. The
> > intent is to have regular (daily) benchmarking runs against Git's
> > `master` and `next` branches to be able to spot any performance
> > regressions before they make it into the next release.
> >
> > I have started with a relatively simple setup:
> >
> >   - I have started collecting benchmarks that I myself run regularly
> >     [1]. These benchmarks are built on hyperfine and are thus not
> >     part of the Git repository itself.
> >
> >   - GitLab CI runs on a nightly basis, executing a subset of these
> >     benchmarks [2].
> >
> >   - Results are uploaded with a hyperfine adaptor to Bencher and are
> >     summarized in dashboards.
> >
> > This at least gives us some visibility into severe performance
> > outliers, whether these are improvements or regressions. Some
> > statistics are applied to this data to automatically generate alerts
> > when things change significantly.
> >
> > The setup is of course not perfect. It's built on top of CI jobs,
> > which by their very nature do not perform consistently. The scripts
> > are hosted outside of Git. And I'm the only one running this.
>
> For the CI "noisy neighbors" problem at least, it could be an option
> to try to host in GCE (or some other compute that isn't shared). I
> asked around a little inside Google and it seems like it's possible,
> I'll keep pushing on it and see just how hard it would be. I'd even be
> happy to trade on-push runs with noisy neighbors for nightly runs with
> no neighbors, which makes it not really a CI thing - guess I will find
> out if that's easier or harder for us to implement. :)

That would be awesome.

> > So I wonder whether there is a wider interest in the Git community
> > to have this infrastructure be part of the Git project itself. This
> > may include steps like the following:
> >
> >   - Extending the performance tests we have in "t/perf" to cover
> >     more benchmarks.
>
> Folks may be aware that our biggest (in terms of scale) internal
> customer at Google is the Android project. They are the ones who
> complain to me and my team the most about performance; they are also
> open to setting up a nightly performance regression test. Would it be
> appealing to get reports from such a test upstream? I think it's more
> compelling to our customer team if we run it against the closed-source
> Android repo, which means the Git project doesn't get to see as much
> about the shape and content of the repos the performance tests are
> running against, but we might be able to publish info about the shape
> without the contents. Would that be useful? What would help to know
> (# of commits, size of largest object, distribution of object sizes,
> # of branches, size of worktree...?) If not having the specifics of
> the repo-under-test is a dealbreaker we could explore running
> performance tests in public with the Android Open Source Project as
> the repo-under-test instead, but it's much more manageable than full
> Android.

The biggest question is whether such regression reports would be
actionable by the Git community. I often find performance issues to be
very specific to the repository at hand, and reconstructing the exact
situation tends to be extremely tedious or completely infeasible. I run
into the situation way too often where customers come knocking at my
door with a performance issue, but don't want to provide the underlying
data. More often than not I end up not being able to reproduce it, so I
have to push back on such reports. Ideally, any report should be
accompanied by a trivial reproducer that any developer can execute on
their local machine.
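That being said, the shape metrics you list are all cheap to collect
with stock plumbing commands, so publishing them alongside any report
should be doable even when the repository itself cannot be shared. As a
rough, untested sketch of what collecting them could look like (the
exact selection of metrics here is only an illustration, not something
our setup implements; it assumes a non-empty repository and a worktree
to run in):

    #!/usr/bin/env python3
    # Sketch: collect repository "shape" metrics with git plumbing.
    # Assumes it is run from inside a non-empty Git worktree.
    import subprocess

    def git(*args):
        return subprocess.check_output(("git",) + args, text=True)

    # Number of commits reachable from any ref.
    commits = int(git("rev-list", "--all", "--count"))

    # Object sizes, for the largest object and a crude distribution.
    sizes = sorted(
        int(line.split()[2])
        for line in git(
            "cat-file", "--batch-all-objects",
            "--batch-check=%(objectname) %(objecttype) %(objectsize)",
        ).splitlines()
    )

    # Number of branches.
    branches = len(git("for-each-ref", "refs/heads").splitlines())

    # Worktree size: tracked files in HEAD and their summed blob sizes.
    entries = [l.split()
               for l in git("ls-tree", "-r", "-l", "HEAD").splitlines()]
    tracked_bytes = sum(int(e[3]) for e in entries if e[3] != "-")

    print(f"commits={commits} branches={branches} objects={len(sizes)}")
    print(f"largest_object={sizes[-1]} "
          f"median_object={sizes[len(sizes) // 2]}")
    print(f"tracked_files={len(entries)} tracked_bytes={tracked_bytes}")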
> Maybe in the long term it would be even better to have some toy
> repo-under-test, like "sample repo with massive object store", "sample
> repo with massive history", etc. to help us pinpoint which ways we're
> scaling well and which ways we aren't. But having a ready-made
> repo-under-test, and a team who's got a very large stake in Git
> performing well with it (so they can invest their time in setting up
> tests), might be a good enough place to start.

That would be great. I guess this wouldn't be a single repository, but
a set of repositories with different characteristics.

> >   - Writing an adaptor that is able to upload the data generated by
> >     our perf scripts to Bencher.
> >
> >   - Setting up proper infrastructure to do the benchmarking. We may
> >     for now also continue to use GitLab CI, but as mentioned these
> >     runners are quite noisy overall. Dedicated servers would help
> >     here.
> >
> >   - Sending alerts to the Git mailing list.
>
> Yeah, I'd love to see reports coming to the Git mailing list, or at
> least bad-news reports (maybe we don't need "everything ran great!"
> every night, but would appreciate "last night the performance suite
> ran 50% slower than the last-6-months average"). That seems the
> easiest to integrate with the way the project runs now, and I think we
> are used to list noise :)

Oh, totally, I certainly don't think there's any benefit in reporting
anything when there is no information. Right now there still are semi-
frequent outliers where an alert is generated only because of a flake,
not a real performance regression. But my hope would be that we can
address this issue once we address the noisy-neighbour problem.

> > I'm happy to hear your thoughts on this. Any ideas are welcome,
> > including "we're not interested at all". In that case, we'd simply
> > continue to maintain the setup ourselves at GitLab.
>
> In general, though, yes! I am very interested! Google had trouble with
> performance regressions over the last 3 months or so, and I'd love to
> see the community noticing it more. I think in general we have a sense
> that performance matters during code review, but we aren't always sure
> where it matters most, and a regular performance test that anybody can
> see the results of would help a lot.

Thanks for your input!

Patrick
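P.S.: To make the adaptor step a bit more concrete, below is a rough
sketch of the kind of conversion I have in mind, going from hyperfine's
--export-json output to Bencher's JSON metric format. I'm writing both
formats down from memory, so treat the field names as assumptions to
verify against the current documentation:

    #!/usr/bin/env python3
    # Sketch: convert hyperfine --export-json output (times in seconds)
    # into Bencher Metric Format, using Bencher's built-in "latency"
    # measure (nanoseconds). Field names are from memory; verify them.
    import json
    import sys

    NANOS_PER_SECOND = 1_000_000_000

    def to_bencher(hyperfine):
        metrics = {}
        for result in hyperfine["results"]:
            metrics[result["command"]] = {
                "latency": {
                    "value": result["mean"] * NANOS_PER_SECOND,
                    "lower_value": result["min"] * NANOS_PER_SECOND,
                    "upper_value": result["max"] * NANOS_PER_SECOND,
                },
            }
        return metrics

    if __name__ == "__main__":
        json.dump(to_bencher(json.load(sys.stdin)), sys.stdout, indent=2)

An adaptor for the "t/perf" output would look similar, just with a
different parser on the input side.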
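P.P.S.: Regarding the flaky alerts, the statistics involved don't have
to be fancy to filter out one-off outliers. The general idea is to
compare recent runs against a rolling baseline and only alert on
sustained regressions; the window size and threshold below are made-up
numbers for illustration, not what our setup actually uses:

    #!/usr/bin/env python3
    # Sketch: a flake-resistant alerting rule. Alert only when several
    # consecutive runs exceed a rolling baseline by some threshold.
    from statistics import median

    def should_alert(history, recent, window=30, threshold=1.10,
                     streak=3):
        """history: past per-run timings in seconds, oldest first;
        recent: the most recent runs. Alert only if the last `streak`
        runs all exceed the rolling median by `threshold`."""
        if len(history) < window or len(recent) < streak:
            return False
        baseline = median(history[-window:])
        return all(t > baseline * threshold for t in recent[-streak:])

    # A single 1.3s spike over a ~1.0s baseline does not alert; three
    # consecutive spikes do.
    past = [1.0 + 0.01 * (i % 5) for i in range(40)]
    print(should_alert(past, [1.3]))              # False
    print(should_alert(past, [1.3, 1.31, 1.29]))  # True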