Re: [PATCH v2 3/3] index-pack: allow revisiting REF_DELTA chains

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 28, 2025 at 08:24:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> The crux of the matter is how the algorithm works when the REF_DELTAs
> point to base objects that exist in the local repository.

Hmm. I'm having trouble squaring this with these next two paragraphs:

> Consider the case where the packfile has two REF_DELTA objects, A and B,
> and the delta chain looks like "A depends on B" and "B depends on C" for
> some third object C, where C is already in the current repository. The
> algorithm _should_ start with all objects that depend on C, finding B,
> and then moving on to all objects depending on B, finding A.
>
> However, if the repository also already has object B, then the delta
> chain can be analyzed in a different order. The deltas with base B can
> be analyzed first, finding A, and then the deltas with base C are
> analyzed, finding B. The algorithm currently continues to look for
> objects that depend on B, finding A again. This fails due to A's
> 'real_type' member already being overwritten from OBJ_REF_DELTA to the
> correct object type.

ISTM that this A->B->C chain is a problem because (in the above example)
the server sent B as a REF_DELTA base for A but also had its own
pre-existing copy of B.

But the first quoted sentence suggests that the issue is with REF_DELTAs
that point to base objects that exist in the local repository. Does
"point to" mean that the REF_DELTA's base is the local object, or that
the local object itself was sent as a REF_DELTA against some other base?

I haven't fully wrapped my head around the implications of this all yet,
but I think that it's the former, though admittedly even typing this I
am not quite sure of myself. I *think* that doing this is OK if the only
path from base objects to their corresponding deltas is unique, and/or
there were no such paths at all.

I'm trying to think through the implications of this against my
series[1] from a while ago that converts OFS_DELTAs that weren't usable
as part of verbatim pack-reuse into REF_DELTAs. There are two cases
there that I was considering:

  - For some (delta, base) pair in a pack, there was an additional copy
    of 'base' in some other pack, and the MIDX chose the copy from that
    pack. That forms what I'm calling a "cross-pack" delta. This isn't
    currently reusable as an OFS_DELTA for a variety of reasons, but is
    OK as a REF_DELTA provided we know that the client either already
    has the base object or we are sending it as part of the pack anyway.

  - The other case is that we the client wants the delta-half of a
    delta/base-pair, but not the base object. In this case, we can't
    currently reuse the OFS_DELTA verbatim, but could if we converted it
    to a REF_DELTA based on the knowledge that the client already has
    the base object.

The latter is doable based on the information in the wants/haves bitmap.
The process there looks like: determine that the client doesn't want the
base object, realize that its bit is set in the 'haves' bitmap, and then
convert the delta object from a OFS_DELTA to a REF_DELTA.

But I think that all breaks for older clients that don't meet the unique
paths condition above. Does that sound right to you?

I think the cross-pack case is fine, provided we know ahead of time that
the client doesn't have the (converted-to-REF_DELTA) delta object in its
local copy.

Unfortunately, I think this means that [1] is a bit of a dead-end for
serves that have older clients (running a version that does not include
this patch). At least, I think that's true if we can construct a
situation where the server sends a REF_DELTA that it thinks the client
doesn't have but actually does. I'm not immediately sure what such a
situation would look like beyond cases like: "the client verbatim asked
for an object it already has, but isn't reachable from the set of
provided 'haves'".

Thanks,
Taylor

[1]: https://lore.kernel.org/git/cover.1728505840.git.me@xxxxxxxxxxxx/




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux