On Thu, Mar 27, 2025 at 05:58:19PM -0400, Taylor Blau wrote:

> On Thu, Mar 27, 2025 at 02:32:43AM -0400, Jeff King wrote:
> > The pathspec-trie stuff is, I think, still a reasonable idea for general
> > use. But IIRC, the rewritten blame-tree you guys worked on does not
> > benefit from it, because it ditches pathspecs entirely (both because
> > they're too slow without the tries, but also because it's important to
> > continually narrow the pathspec while traversing). That trie code was
> > never run in production, I think (and I see there is a patch to narrow
> > the pathspec while traversing; I suspect that likewise was never used).
>
> Yeah, the rewritten blame-tree code uses changed-path Bloom filters to
> narrow the set of revisions that we need to actually compute tree-diffs
> for.
>
> The general idea is that we have a set of paths that we have yet to
> blame, and those are the "interesting" ones. IOW, if a changed-path
> Bloom filter tells us that we are at some revision where there is maybe
> a change to one or more unblamed paths, we need to compute a tree-diff.
> But if the Bloom filter says "no", then we can skip the tree-diff at
> that layer entirely.

You'd still in theory benefit from the tree-diffs you _do_ run using a
continually narrowing pathspec. Skimming over the code from your
tb/blame-tree branch, it looks like it's just fed the original pathspec.
But that's probably good enough in practice. Especially for
non-recursive blame-trees, where pruning already-matched entries will
never save you from opening another tree anyway.

> > So yeah. I don't know if all of this is really a very good starting
> > point. Taylor, if you can share the current code that GitHub is running,
> > I think that would be beneficial for the community.
>
> Sure. You can fetch from the 'tb/blame-tree' branch from my tree (which
> is located at 'git@xxxxxxxxxx:ttaylorr/git.git'). I owe a huge "thank
> you" to Victoria Dye, who split out the various topics from GitHub's
> fork into individual rebased branches.

Thanks. I don't have time to pick it up as a topic myself, but hopefully
it can be useful to Toon (or any others interested in the topic).

-Peff
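
For readers following along, the Bloom-filter idea described in the
quoted text can be sketched roughly as below. This is a toy model, not
code from the tb/blame-tree branch: the names ToyBloom and blame_paths
are hypothetical, the SHA-256 based hashing is an assumption made for
illustration and does not reflect git's actual filter format, and the
"tree-diff" is reduced to a set intersection. It also shrinks the set of
unblamed paths as it goes, which is the narrowing discussed above.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter standing in for a per-commit changed-path filter."""

    def __init__(self, items, nbits=64, nhashes=3):
        self.nbits, self.nhashes, self.bits = nbits, nhashes, 0
        for item in items:
            for pos in self._positions(item):
                self.bits |= 1 << pos

    def _positions(self, item):
        # Derive nhashes bit positions from seeded SHA-256 digests (assumption).
        for seed in range(self.nhashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.nbits

    def maybe_contains(self, item):
        # False means "definitely not changed"; True means "maybe changed".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def blame_paths(history, paths):
    """history: list of (commit_id, set_of_changed_paths), newest first.

    Returns ({path: commit_id}, number_of_tree_diffs_computed).
    """
    filters = {cid: ToyBloom(changed) for cid, changed in history}
    unblamed = set(paths)  # the "interesting" paths still to be blamed
    blamed, diffs_run = {}, 0
    for cid, changed in history:
        if not unblamed:
            break
        if not any(filters[cid].maybe_contains(p) for p in unblamed):
            continue  # filter says "no" for every unblamed path: skip the diff
        diffs_run += 1  # stand-in for computing a real tree-diff
        for path in unblamed & changed:
            blamed[path] = cid
        unblamed -= changed  # narrow the set of paths still of interest
    return blamed, diffs_run


if __name__ == "__main__":
    history = [
        ("c3", {"README"}),
        ("c2", {"docs/x"}),
        ("c1", {"README", "Makefile"}),
    ]
    print(blame_paths(history, ["README", "Makefile"]))
```

A false positive from the filter only costs one wasted diff; it cannot
produce a wrong blame, because the diff result itself decides which
paths get assigned to the commit.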