On 8/18/25 7:05 AM, Jeff King wrote: > On Mon, Aug 18, 2025 at 12:48:07AM -0400, Jeff King wrote: > >> And indeed, the current code does then segfault on "git describe foo" at >> the spot I mentioned. Even though the repository state here is >> unexpected and corrupt, I do think we should probably be more defensive >> and avoid the segfault. > > So you almost nerd-sniped me into making a series. But the more I dug > into the rabbit-hole, the more I turned away in disgust. :) > > The set of problems I found are: > > 1. What should happen when traversing from HEAD does not find the blob > in question? Right now we print a blank line, which is...weird. Weird indeed. > Probably we should either print nothing, or return an error. The latter, consistent with "git describe <commit-ish>". > If we return an error, should we respect --always? The documentation says "git describe <blob>" takes no options. It could learn some, of course. But does it have to? Perhaps better keep that thing contained. > Are we stuck with the > current dumb behavior because it's a plumbing command? git describe is a porcelain command. The documentation doesn't say what happens when the blob is not found in HEAD's ancestry. I can't imagine why someone would use "git describe <blob>" to check whether a particular blob is linked to, but it _is_ slightly faster than "git rev-list --objects --no-object-names HEAD | grep <blob>" for me, and of course easier to type. "git log --find-object <blob>" is slower than either. > 2. When we are on an unborn branch, we print a confusing message: > > $ git init > $ git commit --allow-empty -m foo > $ git tag foo > $ git symbolic-ref HEAD refs/heads/unborn > $ git describe $(echo blob | git hash-object -w --stdin) > fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. > Use '--' to separate paths from revisions, like this: > 'git <command> [<revision>...] -- [<file>...]' > > We should probably resolve HEAD ourselves and either bail with an > empty output or an error (depending on what we do for (1) above). It already is an error, just needs a better message. It should still report an error even if we were to stick with showing blank lines for unrelated blobs. > 3. When we do traverse, if process_object() sees that we didn't find a > commit, we should detect that and either return an empty result or > an error (again, depending on the behavior of (2) above). This is > done by checking is_null_oid(&pcd->current_commit) there. OK, ending the search right there might be the best option. Traversing deeper into the forest that we then know to be cursed would be the unappealing alternative. > 4. Then we can teach describe_commit() to take a commit rather than an > oid (and the is_null_oid() check becomes a NULL check). 5. When process_object() has a commit, but it is indescribable, it shows an error: $ git describe 5afbe6da1d6ab0b8939343166636b828e561bf35 fatal: No tags can describe '3b681e255cd8577a77983958ef7f566b05806cd0'. Try --always, or create some tags. It's not immediately clear that the reported hash belongs to the found commit. And that suggestion to try --always is misleading, as "git describe <blob>" takes no options according to the documentation. I'm not sure I like it in general -- can't tell if the command is being snarky with me. > So it all depends on what to do with (1), and for a feature that IMHO > should not even exist in the first place, I had trouble summoning the > will-power to make this 4-patch series. 644eb60bd0 (builtin/describe.c: describe a blob, 2017-11-15) and 15af58c1ad (diffcore: add a pickaxe option to find a specific blob, 2018-01-04) confuse me; the latter's commit message sounds like the former wasn't (supposed to be?) merged. I think the issues you listed are independent, though. Or what's wrong with this demo that addresses point 3 in process_object() and 1 in describe_blob(). If we want a blank line for 1 then we apply only the first hunk. Or am I missing something? René diff --git a/builtin/describe.c b/builtin/describe.c index d7dd8139de..9e485240aa 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -507,8 +507,10 @@ static void process_object(struct object *obj, const char *path, void *data) if (oideq(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { reset_revision_walk(); - describe_commit(&pcd->current_commit, pcd->dst); - strbuf_addf(pcd->dst, ":%s", path); + if (!is_null_oid(&pcd->current_commit)) { + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + } free_commit_list(pcd->revs->commits); pcd->revs->commits = NULL; } @@ -519,6 +521,7 @@ static void describe_blob(struct object_id oid, struct strbuf *dst) struct rev_info revs; struct strvec args = STRVEC_INIT; struct process_commit_data pcd = { *null_oid(the_hash_algo), oid, dst, &revs}; + size_t orig_len = dst->len; strvec_pushl(&args, "internal: The first arg is not parsed", "--objects", "--in-commit-order", "--reverse", "HEAD", @@ -532,6 +535,8 @@ static void describe_blob(struct object_id oid, struct strbuf *dst) die("revision walk setup failed"); traverse_commit_list(&revs, process_commit, process_object, &pcd); + if (dst->len == orig_len) + die(_("unable to describe blob '%s'"), oid_to_hex(&oid)); reset_revision_walk(); release_revisions(&revs); strvec_clear(&args);