Re: [PATCH v2 0/8] repack: avoid MIDX'ing cruft pack(s) where possible

On Mon, Apr 14, 2025 at 07:57:52PM -0700, Elijah Newren wrote:
> On Mon, Apr 14, 2025 at 1:06 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
> >
> > Here is a non-RFC version of my series to explore creating MIDXs while
> > repacking that don't include the cruft pack.
> >
> > The core idea behind this approach is to ensure that packs generated via
> > geometric repacking traverse through objects that appear in packs which
> > are neither included nor excluded.
>
> This phrasing feels confusing -- what does it mean for packs to be
> neither included nor excluded?  Maybe:
>
> "The core idea behind this approach is to allow some (most) of the
> objects in a pack to be excluded, while still including some subset of
> objects from that pack as part of the repack.  In particular, we
> include the objects in that pack which are reachable from the other
> objects we repack.  This is different from our current handling which
> either entirely includes or entirely excludes all objects from a given
> pack."

I am admittedly having a little bit of a hard time parsing your version
of this, but I think this part:

    [...] In particular, we include the objects in that pack which are
    reachable from the other objects we repack.

isn't quite right. It's not that the output pack contains every object
reachable from the other objects we repack. Rather, it contains those
reachable objects only *if* they don't appear in an excluded pack given
as part of the input.
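
To make the distinction concrete, here is a toy model in Python. It is
only a sketch, not the pack-objects code: the function name, the object
names, and the choice to keep walking through excluded objects are all
assumptions made for illustration.

```python
from collections import deque

def objects_for_new_pack(edges, included, excluded):
    """Toy model: which objects would the new pack contain?

    edges    -- dict mapping an object to the objects it references
    included -- objects in the packs being repacked (traversal tips)
    excluded -- objects that already live in an excluded pack
    Objects in neither set stand in for cruft-pack contents.
    """
    out, seen = set(), set()
    queue = deque(included)
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        # Reachable objects are written unless an excluded pack has them.
        if obj not in excluded:
            out.add(obj)
        # Assumption: the walk continues through excluded objects.
        queue.extend(edges.get(obj, ()))
    return out

# Hypothetical object graph: commit2 reaches a blob that only the
# cruft pack holds (blob_cruft) and one that an excluded pack holds.
edges = {
    "commit2": ["commit1", "tree2"],
    "tree2": ["blob_cruft", "blob_kept"],
    "commit1": ["tree1"],
    "tree1": ["blob_kept"],
}
included = {"commit2", "tree2"}
excluded = {"commit1", "tree1", "blob_kept"}
result = objects_for_new_pack(edges, included, excluded)
```

Here blob_cruft ends up in the result because it is reachable and not in
an excluded pack, while commit1, tree1 and blob_kept are left out.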

> > Then if some commit (for example) in
> > a pack reaches some once-unreachable object stored in a cruft pack, the
> > pack generated via geometric repacking will pick up and write a copy of
> > that object during its traversal.
> >
> > If you repack consistently using this strategy, you can guarantee that
> > the union of geometrically-repacked packs are closed under reachability
> > without having to keep track of any cruft pack(s) in the MIDX.
>
> Also, if you do a single non-geometric repack with this strategy, you
> are also closed under reachability, right?  Is that the suggested
> transition plan for those that want to use this...first do a
> non-geometric repack, and then ensure that subsequent geometric
> repacks are done with this strategy?

Yeah, the last commit gets at this a bit. The property you have to
maintain is that the union of geometrically-repacked packs (which form
the MIDX) is and stays closed under reachability. I am pretty sure that,
the way this is constructed, adding new geometrically-repacked packs to
the chain does not violate this property[^1].
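
For readers who want the property spelled out, "closed under
reachability" can be written as a small check. This is a toy sketch in
Python with hypothetical names, treating each pack as a plain set of
object names rather than reading real packfiles.

```python
def is_closed_under_reachability(edges, packs):
    """True if every object referenced from the union of `packs`
    is itself contained in that union.

    edges -- dict mapping an object to the objects it references
    packs -- iterable of sets, one set of object names per pack
    """
    union = set().union(*packs)
    return all(
        ref in union
        for obj in union
        for ref in edges.get(obj, ())
    )

# Hypothetical graph: commit -> tree -> blob.
edges = {"commit": ["tree"], "tree": ["blob"]}

# The blob lives only in a cruft pack, so the MIDX'd packs are not closed.
before = is_closed_under_reachability(edges, [{"commit", "tree"}])

# After a repack that copies the blob into a new pack, they are.
after = is_closed_under_reachability(edges, [{"commit", "tree"}, {"blob"}])
```

In this example `before` is False and `after` is True, which is the
situation the series is trying to reach without putting the cruft pack
in the MIDX.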

But you can't guarantee it part of the way through a sequence of
geometric repacks, which is what midx_has_unknown_packs() is checking
for.

If you do an all-into-one cruft repack first, then there is no MIDX to
begin with, so there aren't any unknown packs to worry about. When that
property is met, then we can use the new behavior.
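
The shape of that check can be sketched roughly as follows. This is not
a transcription of midx_has_unknown_packs(); it is a guess at the idea
based on the description in this message, and every name and parameter
below is hypothetical.

```python
def has_unknown_packs(midx_packs, repacked, carried_over):
    """Toy model: does the existing MIDX name a pack that is neither
    being repacked nor carried over into the new MIDX?

    midx_packs   -- pack names listed in the existing MIDX (empty if none)
    repacked     -- packs being rolled up by this repack
    carried_over -- packs that will be listed in the new MIDX unchanged
    """
    known = set(repacked) | set(carried_over)
    return any(pack not in known for pack in midx_packs)

def can_use_new_behavior(midx_packs, repacked, carried_over):
    # Assumption: the cruft pack may be left out of the MIDX only when
    # nothing in the old MIDX is unaccounted for.
    return not has_unknown_packs(midx_packs, repacked, carried_over)

# No MIDX yet (e.g. right after an all-into-one repack): nothing unknown.
fresh = can_use_new_behavior([], ["pack-a"], [])

# pack-c was in the old MIDX but is neither repacked nor carried over.
stale = can_use_new_behavior(["pack-a", "pack-b", "pack-c"],
                             ["pack-a"], ["pack-b"])
```

With no existing MIDX the check passes trivially (`fresh` is True),
whereas the dropped pack-c makes `stale` False.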

Thanks,
Taylor

[^1]: So long as you don't drop part of the geometric progression, e.g.,
      if you have some pack that was in the existing MIDX, but wasn't
      repacked or included in the new MIDX.



