On 9/2/25 09:57, Phillip Wood wrote:
On 01/09/2025 21:43, Colin Stagner wrote:
The outer loop in git-subtree.sh:983 appears to iterate from the root
commit forwards… and not from the HEAD backwards.
git rev-list --topo-order --reverse --parents $rev $unrevs
# ^^^^^^^^^
Since the iteration is ancestor-first, I'm having difficulty seeing
why `should_ignore_subtree_split_commit()` would want to do an
ancestor traversal at all. It already sees the commits ancestor-
first.
check_parents() recurses into process_split_commit() rather than the
loop that call should_ignore_subtree_split_commit(). I think what this
check does do is prune some parents which stops check_parents() from
recursing into other subtrees so the check is in the right place.
I agree. In the original patch [1], Zach indicated that the check
results in a significant speedup for rejoin-heavy repos. The check is
clearly doing something.
Performance improvements may be possible. Instead of looking at commits
one at a time, this operation might be faster as part of a one-shot
HEAD-to-root traversal:
git log --grep 'for stuff' --format='%(trailers:...)' $unrev..HEAD
Commits that are deemed "uninteresting" or unnecessary could then be
provided, in bulk, as negative refs to the `git rev-list` traversal.
But my plan is to make the smallest and most portable maint-2.44 bugfix.
I think that non-essential performance changes are a task for later.
I am *much* less certain about the second test:
test -z "$(git log -1 \
--grep="git-subtree-dir: $arg_prefix$" $rev)"
If I remove this `test -z`, all the unit tests still pass—including mine. There may not be any test coverage for this line.
I'm not very familiar with git-subtree but I thought this was ensuring
that we did not exclude the ancestors of a squash or split that involves
the subtree that we're interested in.
It does, but I am still having problems finding commits that actually
trigger it.
It appears that `find_existing_splits()` in git-subtree.sh:459 filters
out the commits that the `test -z` I quoted above would otherwise
detect. `find_existing_splits()` searches for a previous --rejoin commit
to use as an `$unrev` stopping point for the rev-walk. It searches for
commits matching
git log --grep="^git-subtree-dir: $dir/*\$"
in combination with `git-subtree-mainline:`.
This is essentially the same test as in
`should_ignore_subtree_split_commit()`.
In --ignore-joins mode, `find_existing_splits()` looks for different
commits. I experimented a bit with adding --ignore-joins to some of the
existing unit tests, but I still could not find any instance where this
`test -z` makes a difference.
That said... I am inclined to keep this second test. The bug I am
patching is the result of an overzealous prune. The last thing I want to
do is to inadvertently prune commits we need for the sake of a
performance boost.
I wouldn't be surprised if the test coverage was lacking.
There don't appear to be any tests at all for --ignore-joins, aside from
option parsing.
[1]: 98ba49ccc2 (subtree: fix split processing with multiple subtrees
present, 2023-12-01)