On Fri, May 16, 2025 at 12:49:35AM -0400, Jeff King wrote: > The cat-file command has some minor support for handling objects with > "unknown" types. I.e., strings that are not "blob", "commit", "tree", or > "tag". > > In theory this could be used for debugging or experimenting with > extensions to Git. But in practice this support is not very useful: > > 1. You can get the type and size of such objects, but nothing else. > Not even the contents! > > 2. Only loose objects are supported, since packfiles use numeric ids > for the types, rather than strings. > > 3. Likewise you cannot ever transfer objects between repositories, > because they cannot be represented in the packfiles used for the > on-the-wire protocol. All of these are good reasons. To add one more: they would probably prove to be a pain for pluggable object databases, too. No need to add that to the list here though given that there isn't even a design doc for those yet. > The support for these unknown types complicates the object-parsing code, > and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix > infinite loop on broken zlib input, 2025-02-25). So let's drop it. > > The first step is to remove the user-facing parts, which are accessible > only via cat-file. This is technically backwards-incompatible, but given > the limitations listed above, these objects couldn't possibly be useful > in any workflow. I agree. > However, we can't just rip out the option entirely. That would hurt a > caller who ran: > > git cat-file -t --allow-unknown-object <oid> > > and fed it normal, well-formed objects. There --allow-unknown-type was > doing nothing, but we wouldn't want to start bailing with an error. So > to protect any such callers, we'll retain --allow-unknown-type as a > noop. Okay, that sounds reasonable to me. > The code change is fairly small (but we'll able to clean up more code in > follow-on patches). The test updates drop any use of the option. We > still retain tests that feed the broken objects to cat-file without > --allow-unknown-type, as we should continue to confirm that those > objects are rejected. Note that in one spot we can drop a layer of loop, > re-indenting the body; viewing the diff with "-w" helps there. Shouldn't we have a test though that the option is still accepted, even though it doesn't do anything? > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > Documentation/git-cat-file.adoc | 6 +- > builtin/cat-file.c | 18 +-- > t/t1006-cat-file.sh | 211 ++++++++------------------------ > 3 files changed, 56 insertions(+), 179 deletions(-) > > diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc > index fc4b92f104..cde79ad242 100644 > --- a/Documentation/git-cat-file.adoc > +++ b/Documentation/git-cat-file.adoc > @@ -9,8 +9,7 @@ SYNOPSIS > -------- > [verse] > 'git cat-file' <type> <object> > -'git cat-file' (-e | -p) <object> > -'git cat-file' (-t | -s) [--allow-unknown-type] <object> > +'git cat-file' (-e | -p | -t | -s) <object> > 'git cat-file' (--textconv | --filters) > [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>] > 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects] > @@ -202,9 +201,6 @@ flush:: > only once, even if it is stored multiple times in the > repository. > > ---allow-unknown-type:: > - Allow `-s` or `-t` to query broken/corrupt objects of unknown type. > - Should we maybe introduce a "deprecated" section and spell out that this option is a no-op nowadays that will be removed for example in Git 3.0? Patrick