> On June 4, 2025, at 06:09, Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> On Tue, Jun 03, 2025 at 06:20:49AM +0000, Lidong Yan via GitGitGadget wrote:
>> From: Lidong Yan <502024330056@xxxxxxxxxxxxxxxx>
>>
>> In pack-bitmap.c:find_boundary_objects(), the roots_bitmap is only freed
>> if cascade_pseudo_merges_1() fails. Since cascade_pseudo_merges_1() only
>> use roots_bitmap as a mutable reference but not takes roots_bitmap's
>> ownership. Once cascade_pseudo_merges_1 succeed(), roots_bitmap leaks.
>> And this leak currently lacks a dedicated test to detect it.
>>
>> To fix this leak, remove if cascade_pseudo_merges_1() succeed check and
>> always calling bitmap_free(roots_bitmap);
>
> This sentence might be more clear if it were written as:
>
>     To fix this leak, unconditionally free the roots_bitmap regardless
>     of whether or not cascade_pseudo_merges_1() succeeds.
>
>> To trigger this leak, we need a pseudo-merge whose size is equal to
>> or smaller than roots_bitmap (which corresponds to the set of "haves"
>> commits in prepare_bitmap_walk()). To do this, we can create two
>> commits: A and B. Add A to the pseudo-merge list and perform a traversal
>> over the range A..B. In this scenario, the "haves" set will be {A},
>> and cascade_pseudo_merges_1() will succeed, thereby exposing the leak
>> due to the missing roots_bitmap cleanup.
>
> I don't think this is quite right. Calling cascade_pseudo_merges_1()
> succeeds (and returns a non-zero value) when one or more pseudo-merges
> are satisfied. A pseudo-merge is satisfied here when its parents bitmap
> is a *subset* of the roots_bitmap, not when it has a smaller size.
>
> The precise definition of one bitmap being a subset of another can be
> found in ewah/bitmap.c::ewah_bitmap_is_subset(). But in general one
> bitmap is a subset of the other if the set of bit positions with value
> "1" from one is a subset of the same set from the other bitmap.
>
> I think that's what you meant by "smaller", but I think it's worth
> clarifying here.

Yes, I meant "subset" here. I will rewrite this part of the commit message.

>
>> diff --git a/pack-bitmap.c b/pack-bitmap.c
>> index ac6d62b980c..8727f316de9 100644
>> --- a/pack-bitmap.c
>> +++ b/pack-bitmap.c
>> @@ -1363,8 +1363,8 @@ static struct bitmap *find_boundary_objects(struct bitmap_index *bitmap_git,
>>  		bitmap_set(roots_bitmap, pos);
>>  	}
>>
>> -	if (!cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap))
>> -		bitmap_free(roots_bitmap);
>> +	cascade_pseudo_merges_1(bitmap_git, cb.base, roots_bitmap);
>> +	bitmap_free(roots_bitmap);
>
> Makes sense.
>
>> diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
>> index 56674db562f..e665001a410 100755
>> --- a/t/t5333-pseudo-merge-bitmaps.sh
>> +++ b/t/t5333-pseudo-merge-bitmaps.sh
>> @@ -445,4 +445,24 @@ test_expect_success 'pseudo-merge closure' '
>>  	)
>>  '
>>
>> +test_expect_success 'use pseudo-merge in boundary traversal' '
>> +	git init pseudo-merge-boundary-traversal &&
>> +	(
>> +		cd pseudo-merge-boundary-traversal &&
>> +
>> +		git config bitmapPseudoMerge.test.pattern refs/ &&
>> +		git config bitmapPseudoMerge.test.threshold now &&
>
> Setting the unstable threshold here should be unnecessary, since the
> unstable portion of the group only includes matching commits beyond the
> threshold that *don't* already have a bitmap. Since "A" is the only
> commit at the time you write the bitmap below, it will always be
> selected, and thus never appear in the unstable portion of a
> pseudo-merge group.
>
>> +		git config bitmapPseudoMerge.test.stableThreshold now &&
>
> This one is technically unnecessary, but only because test_commit starts
> at the $test_tick value, which is very far in the past (beyond the
> default value of 1.month.ago).

Maybe it is time for me to re-read the pseudo-merge documentation.
>
>> +		test_commit A &&
>> +		git repack -adb &&
>> +		test_commit B &&
>> +
>> +		echo '1' >expect &&
>
> Please do not use single-quotes in a test script. It happens to work in
> this instance, but it is easy to break.

Got it.

>
>> +		GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1 \
>> +		git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
>
> This test needs to use the boundary-based bitmap traversal routines, but
> I'm unclear on why you're using the GIT_TEST_-environment variable to
> enable them.

I don't have a special reason for choosing the GIT_TEST variable over
`git config`. The test works either way, so I used GIT_TEST. I will
switch to `git config`.

>
> Is there a reason that we can't rely on the usual repository
> configuration here? I would have expected something like this (which
> should apply cleanly on top of your patch):
>
> --- 8< ---
> diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
> index e665001a41..491ef404ea 100755
> --- a/t/t5333-pseudo-merge-bitmaps.sh
> +++ b/t/t5333-pseudo-merge-bitmaps.sh
> @@ -453,14 +453,14 @@ test_expect_success 'use pseudo-merge in boundary traversal' '
>  		git config bitmapPseudoMerge.test.pattern refs/ &&
>  		git config bitmapPseudoMerge.test.threshold now &&
>  		git config bitmapPseudoMerge.test.stableThreshold now &&
> +		git config pack.useBitmapBoundaryTraversal true &&
>
>  		test_commit A &&
>  		git repack -adb &&
>  		test_commit B &&
>
> -		echo '1' >expect &&
> -		GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL=1 \
> -		git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
> +		echo 1 >expect &&
> +		git rev-list --count --use-bitmap-index HEAD~1..HEAD >actual &&
>  		test_cmp expect actual
>  	)
>  '
> --- >8 ---
>
>> +		test_cmp expect actual
>
> Hmm. I suppose, although it feels a little clunky to me to write
> something like "echo 1 >expect". I would imagine that you'd do something
> like:
>
>     test 1 -eq $(git rev-list --count --use-bitmap-index HEAD~1..HEAD)
>
> instead.
> Or if you wanted to split them off into separate lines, you
> could do:
>
>     nr=$(git rev-list --count --use-bitmap-index HEAD~1..HEAD) &&
>     test 1 -eq "$nr"
>

I like the latter one. I will use it in the next version of the series.

Thanks,
Lidong