Re: [PATCH 02/13] cat-file: make --allow-unknown-type a noop

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 16, 2025 at 12:49:35AM -0400, Jeff King wrote:
> The cat-file command has some minor support for handling objects with
> "unknown" types. I.e., strings that are not "blob", "commit", "tree", or
> "tag".
> 
> In theory this could be used for debugging or experimenting with
> extensions to Git. But in practice this support is not very useful:
> 
>   1. You can get the type and size of such objects, but nothing else.
>      Not even the contents!
> 
>   2. Only loose objects are supported, since packfiles use numeric ids
>      for the types, rather than strings.
> 
>   3. Likewise you cannot ever transfer objects between repositories,
>      because they cannot be represented in the packfiles used for the
>      on-the-wire protocol.

All of these are good reasons. To add one more: they would probably
prove to be a pain for pluggable object databases, too. No need to add
that to the list here though given that there isn't even a design doc
for those yet.

> The support for these unknown types complicates the object-parsing code,
> and has led to bugs such as b748ddb7a4 (unpack_loose_header(): fix
> infinite loop on broken zlib input, 2025-02-25). So let's drop it.
> 
> The first step is to remove the user-facing parts, which are accessible
> only via cat-file. This is technically backwards-incompatible, but given
> the limitations listed above, these objects couldn't possibly be useful
> in any workflow.

I agree.

> However, we can't just rip out the option entirely. That would hurt a
> caller who ran:
> 
>   git cat-file -t --allow-unknown-object <oid>
> 
> and fed it normal, well-formed objects. There --allow-unknown-type was
> doing nothing, but we wouldn't want to start bailing with an error. So
> to protect any such callers, we'll retain --allow-unknown-type as a
> noop.

Okay, that sounds reasonable to me.

> The code change is fairly small (but we'll able to clean up more code in
> follow-on patches). The test updates drop any use of the option. We
> still retain tests that feed the broken objects to cat-file without
> --allow-unknown-type, as we should continue to confirm that those
> objects are rejected. Note that in one spot we can drop a layer of loop,
> re-indenting the body; viewing the diff with "-w" helps there.

Shouldn't we have a test though that the option is still accepted, even
though it doesn't do anything?

> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
>  Documentation/git-cat-file.adoc |   6 +-
>  builtin/cat-file.c              |  18 +--
>  t/t1006-cat-file.sh             | 211 ++++++++------------------------
>  3 files changed, 56 insertions(+), 179 deletions(-)
> 
> diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
> index fc4b92f104..cde79ad242 100644
> --- a/Documentation/git-cat-file.adoc
> +++ b/Documentation/git-cat-file.adoc
> @@ -9,8 +9,7 @@ SYNOPSIS
>  --------
>  [verse]
>  'git cat-file' <type> <object>
> -'git cat-file' (-e | -p) <object>
> -'git cat-file' (-t | -s) [--allow-unknown-type] <object>
> +'git cat-file' (-e | -p | -t | -s) <object>
>  'git cat-file' (--textconv | --filters)
>  	     [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
>  'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
> @@ -202,9 +201,6 @@ flush::
>  	only once, even if it is stored multiple times in the
>  	repository.
>  
> ---allow-unknown-type::
> -	Allow `-s` or `-t` to query broken/corrupt objects of unknown type.
> -

Should we maybe introduce a "deprecated" section and spell out that this
option is a no-op nowadays that will be removed for example in Git 3.0?

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux