On Thu, Feb 27, 2025 at 11:26:32AM -0800, Elijah Newren wrote: > > However, this process breaks down when we attempt to freshen an object > > packed in an earlier cruft pack that is larger than the threshold and > > thus will survive the repack. > > ...packed in an earlier cruft pack, and that cruft pack is larger than > the threshold... > > (Otherwise, it's unclear whether you are talking about the object or > the cruft pack it is in being larger than the threshold.) Good suggestion, thanks! > > When this is the case, it is impossible to freshen objects in cruft > > pack(s) which are larger than the threshold. This is because we avoid > > writing them in the new cruft pack entirely, for a couple of reasons. > > ...freshen objects in cruft packs when those cruft packs are larger > than the threshold... > > Again, just to clarify what thing is "larger". Likewise, this makes sense as well, and I applied it in my local copy. > Also, this paragraph while fine on its own is slightly unclear whether > you are discussing pre-patch or post-patch state, which when reading > the next two items causes some double takes. Perhaps just spell it > out slightly clearer here that for the next two enumerated items you > are discussing the existing state previous to your changes? I adjusted the paragraph before this one to make it a little clearer. Instead of saying "However, [...]", I rewrote it as "Prior to this patch, however, [...]". > > - exists in a non-cruft pack that we are retaining, regardless of that > > pack's mtime, or > > > > - exists in a cruft pack with an mtime more recent than the copy we are > > debating whether or not to pack, in which case freshening would be > > redundant. > > s/more recent than/at least as recent as/ ? Thanks for the careful read, and yes, the comparison here is a >= rather than a strict >, and that difference is worth being precise about. > > > > To do this, keep track of whether or not we have any cruft packs in our > > in-core kept list with a new 'ignore_packed_keep_in_core_has_cruft' > > flag. When we end up in this new special case, we replace a call to > > 'has_object_kept_pack()' to 'want_cruft_object_mtime()', and only > > reject objects when we have a copy in an existing cruft pack with a more > > recent mtime (in which case "freshening" would be redundant). > > Again, s/a more recent/at least as recent/ ? I like this suggestion, but I think the wording ends up a little awkward if applied as-is. I turned this sentence into: [...], and only reject objects when we have a copy in an existing cruft pack with at least as recent an mtime as our candidate (in which case "freshening" would be redundant). Let me know what you think! > > +test_expect_success '--max-cruft-size with freshened objects (previously cruft)' ' > > + git init max-cruft-size-threshold && > > + ( > > + cd max-cruft-size-threshold && > > + > > + test_commit base && > > + foo="$(generate_random_blob foo $((2*1024*1024)))" && > > + bar="$(generate_random_blob bar $((2*1024*1024)))" && > > + baz="$(generate_random_blob baz $((2*1024*1024)))" && > > + > > + test-tool chmtime --get -100000 \ > > + "$objdir/$(test_oid_to_path "$foo")" >foo.old && > > + test-tool chmtime --get -100000 \ > > + "$objdir/$(test_oid_to_path "$bar")" >bar.old && > > + test-tool chmtime --get -100000 \ > > + "$objdir/$(test_oid_to_path "$baz")" >baz.old && > > + > > + git repack --cruft -d && > > + > > + # Make a packed copy of object $foo with a more recent > > + # mtime. > > s/$foo/foo/ ? Eh. $foo holds the OID of that blob, so "foo" on its own doesn't really mean anything (even though the implicit meaning is clear from context). I think changing it is fine (leaving it alone is equally fine in my mind, but I don't feel strongly about it). > > + foo="$(generate_random_blob foo $((2*1024*1024)))" && > > I thought this was creating a completely different foo, which would > defeat the point of the test. It might be worth adding a comment that > because generate_random_blob uses a very simplistic and repeatable > random character generator with the first argument as the seed, that > this will regenerate the same loose object as above for foo. I think the part of the comment which reads "packed copy of" makes it clear-ish that we're creating an identical copy, but it doesn't hurt to be more explicit here. Thanks for the careful read! Thanks, Taylor