Re: [PATCH v2 7/8] pack-objects: introduce '--stdin-packs=follow'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 14, 2025 at 08:11:08PM -0700, Elijah Newren wrote:
> > diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc
> > index 7f69ae4855..c894582799 100644
> > --- a/Documentation/git-pack-objects.adoc
> > +++ b/Documentation/git-pack-objects.adoc
> > @@ -87,13 +87,19 @@ base-name::
> >         reference was included in the resulting packfile.  This
> >         can be useful to send new tags to native Git clients.
> >
> > ---stdin-packs::
> > +--stdin-packs[=<mode>]::
> >         Read the basenames of packfiles (e.g., `pack-1234abcd.pack`)
> >         from the standard input, instead of object names or revision
> >         arguments. The resulting pack contains all objects listed in the
> >         included packs (those not beginning with `^`), excluding any
> >         objects listed in the excluded packs (beginning with `^`).
> >  +
> > +When `mode` is "follow", pack objects which are reachable from objects
> > +in the included packs, but appear in packs that are not listed.
> > +Reachable objects which appear in excluded packs are not packed. Useful
> > +for resurrecting once-cruft objects to generate packs which are closed
> > +under reachability up to the excluded packs.
>
> Maybe:
>
> When `mode` is "follow", objects from packs not listed on stdin
> receive special treatment.  Objects within unlisted packs will be
> included if those objects (1) are reachable from the included packs,
> and (2) are not also found in any of the excluded packs.  This mode is
> useful for resurrecting once-cruft objects to generate packs which are
> closed under reachability up to the boundary set by the excluded
> packs.

I like it. I went with your version with some minor rewording and tweaks
on top.

> > +               /*
> > +                * Our 'to_pack' list was constructed by iterating all
> > +                * objects packed in included packs, and so doesn't
> > +                * have a non-zero hash field that you would typically
> > +                * pick up during a reachability traversal.
> > +                *
> > +                * Make a best-effort attempt to fill in the ->hash
> > +                * and ->no_try_delta here using a now in order to
> > +                * perhaps improve the delta selection process.
> > +                */
>
> I know you just moved this paragraph from below...but it doesn't parse
> for me.  "using a now in order to perhaps"?  What does that mean?

Yeah, this is just bogus, and was so before this patch. The rewording is
minor enough (just dropping "using a now") that I think we can just
squash it in with the movement in this patch.

> > +               oe->hash = pack_name_hash_fn(name);
> > +               oe->no_try_delta = name && no_try_delta(name);
> > +
> > +               stdin_packs_hints_nr++;
> > +       }
> > +}
> > +
> > +static void show_commit_pack_hint(struct commit *commit, void *data)
> > +{
> > +       enum stdin_packs_mode mode = *(enum stdin_packs_mode *)data;
> > +       if (mode == STDIN_PACKS_MODE_FOLLOW) {
> > +               show_object_pack_hint((struct object *)commit, "", data);
> >                 return;
> > +       }
> > +       /* nothing to do; commits don't have a namehash */
> >
> > -       /*
> > -        * Our 'to_pack' list was constructed by iterating all objects packed in
> > -        * included packs, and so doesn't have a non-zero hash field that you
> > -        * would typically pick up during a reachability traversal.
> > -        *
> > -        * Make a best-effort attempt to fill in the ->hash and ->no_try_delta
> > -        * here using a now in order to perhaps improve the delta selection
> > -        * process.
> > -        */
> > -       oe->hash = pack_name_hash_fn(name);
> > -       oe->no_try_delta = name && no_try_delta(name);
> > -
> > -       stdin_packs_hints_nr++;
> >  }
>
> It might be worth swapping the order of functions as a preparatory
> patch (both here and when you've done it elsewhere in this series),
> just because it'll make the diff so much easier to read when we can
> see the changes to the function without have to also deal with the
> order swapping (since order swapping looks like a large deletion and
> large addition of one of the two functions).

Fair enough.

> > @@ -4467,6 +4484,23 @@ static int is_not_in_promisor_pack(struct commit *commit, void *data) {
> >         return is_not_in_promisor_pack_obj((struct object *) commit, data);
> >  }
> >
> > +static int parse_stdin_packs_mode(const struct option *opt, const char *arg,
> > +                                 int unset)
> > +{
> > +       enum stdin_packs_mode *mode = opt->value;
> > +
> > +       if (unset)
> > +               *mode = STDIN_PACKS_MODE_NONE;
> > +       else if (!arg || !*arg)
> > +               *mode = STDIN_PACKS_MODE_STANDARD;
>
> I don't understand why you have both a None mode and a Standard mode,
> especially since the implementation seems to only care about whether
> or not the Follow mode has been set.  Shouldn't these both be setting
> mode to the same value?

I'm not sure I follow your question... stdin_packs is a tri-state. It
can be off, on in standard/legacy mode, or on in follow mode.

> > +test_expect_success 'setup for --stdin-packs=follow' '
> > +       git init stdin-packs--follow &&
> > +       (
> > +               cd stdin-packs--follow &&
> > +
> > +               for c in A B C D
> > +               do
> > +                       test_commit "$c" || return 1
> > +               done &&
> > +
> > +               A="$(echo A | git pack-objects --revs $packdir/pack)" &&
> > +               B="$(echo A..B | git pack-objects --revs $packdir/pack)" &&
> > +               C="$(echo B..C | git pack-objects --revs $packdir/pack)" &&
> > +               D="$(echo C..D | git pack-objects --revs $packdir/pack)" &&
> > +
> > +               git prune-packed
> > +       )
> > +'

Huh, I have no idea how this snuck in. This "setup" test does nothing
and creates a repository that isn't used later on in the script.
Probably leftover from writing these tests in the first place, but I've
removed it.

> I like the tests -- normal --stdin-packs, then --stdin-packs=follow,
> then --stdin-packs=follow + --unpacked.

I think the normal tests are accidental since we use pack-objects to
write packs A, B, C, and D. But the --stdin-packs vs.
--stdin-packs=follow and --stdin-packs=follow + --unpacked was
definitely intentional.

> However, would it be worthwhile to create commit E immediately after
> creating the packs?

Yeah, I think that is a good suggestion. We already have tests that
exercise --stdin-packs with --unpacked earlier in the same script, but
obviously not with --stdin-packs=follow. Moving the creation of commit E
earlier up makes a lot of sense to me, thanks!

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux