Jeff King wrote: > On Mon, Jun 02, 2025 at 06:55:55PM +0000, Victoria Dye via GitGitGadget wrote: > >> To disambiguate without needing to invoke a separate Git process (e.g. >> 'ls-tree'), print the message "<oid> submodule" for such objects instead of >> "<object> missing". In addition to the change from "missing" to "submodule", >> the new message differs from the old in that it always prints the resolved >> tree entry's OID, rather than the input object specification. > > OK. I read over the discussion from last year, which I think mostly > centered around this patch. I do still think in the long run it would be > nice for cat-file to produce what output it _can_ for a missing object > (e.g., the oid and mode). One way to handle that could be changing the message to something like: submodule SP <mode> SP <oid> but... > > But I think it is OK to punt on that for now. Because "<oid> missing" > lines already exist, we'd probably need to put such behavior behind a > new command-line option. So while "<oid> submodule" lines would be > unnecessary in that hypothetical future world, we are not digging the > hole any deeper, from a backwards-compatibility standpoint. > > Although speaking of backwards compatibility, I guess older readers may > be surprised that the old "missing" message becomes a "submodule" one. > They may need to be updated if they were written carefully to bail on > unknown input (and were happy seeing "missing" messages for submodules). > So there may be some fallout, but it's not like the existing messages > were particularly useful in the first place. ...I suspect that'd be even less compatible with existing automation around 'cat-file' than just swapping out "submodule" for "missing", and users can theoretically infer that the mode is 160000 (S_IFGITLINK). That said, if at some point in the future we support submodules with a different mode, then an explicit value would be fairly useful. Happy to change it or keep it the same, I have no strong preference either way. > >> Note that this implementation maintains a distinction between submodules >> where the commit OID is not present in the repo, and submodules where the >> commit OID *is* present; the former will now print "<object> submodule", but >> the latter will still print the full object content. > > Hmm, that is an interesting point. It feels kind of arbitrary, but I'm > having trouble making a strong argument for one direction or the other. > The way you've written it means that readers need to be prepared to > parse _both_ the mode and "<oid> submodule" lines to find submodules. > But maybe there's some value in finding out more information about > submodule commits you do have in-repo. This was pretty much my thought process on it. It was a somewhat arbitrary choice, but what tipped me towards distinguishing the cases is that I'd rather have information like size, content, etc. about a commit and not need to use it, than need it but not have it available. That, and it does maintain the existing treatment of self-referential submodules.