Re: Semantics of change IDs (Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer)

"Theodore Ts'o" <tytso@xxxxxxx> · Thu, 10 Apr 2025 09:44:26 -0400

On Wed, Apr 09, 2025 at 01:35:45PM -0500, Nico Williams wrote:
> I was using file rename heuristics to explain that I wouldn't like more
> of the same for other things; I was not trying to litigate renames.
> 
> I'm trying to litigate the _addition_ of more similarity-based
> heuristics for _other_ things.
> 
> If similarity heuristics were enough for CR tools then none would have
> introduced anything like change IDs.  Or perhaps CR tools authors have
> been flat out wrong to not try or use similarity heuristics exclusively
> over change IDs.  That's a topic worth discussing.  I've stated my
> preference for not relying solely on similarity heuristics.

There is quite a lot of similarity between trying to record file names
(and having the concept of "inode numbers" for files tracked by git),
and the discussion we've had about how to track user intent when a
commit gets split or merged, and having a "Change-ID" which functions
exactly like an "inode number", except for an individual commit
instead of a file.

The arguments about why we don't have an "inode number" for files,
because it *is* complicated and hard to get right, are *precisely*
the same arguments for why I remain unconvinced about having an "inode
number" for the semantic idea of a commit (read: Change-Id).

If you are someone who very much believes in the importance of doing
per-commit Code Review using something like Gerrit, then you might
think that a Change-ID is more *important* than an "inode number", and
so it is therefore worth the greater amount of complexity and/or
ambiguity when the Change-Id gets subject to the same levels of
incorrectness that having the IDE track the user intent behind a file
copy or rename might have.  That's a value judgement, and there's no
real right answer here.

After all, there are still people, for example as seen on a thread on
the Unix Heritage Society mailing list, who have argued that git
is a hot mess because we don't track file renames and copies the way
"real" source code management systems like BitKeeper and Perforce do
things.  I happen to disagree, but that's a value judgement about
what's important in a SCM design.

Regardless of how we come out on whether having an "inode number" for the
high-level semantic value of a commit is worth it, I do think having a
"patch set ID" which ties related commits together does make sense,
though.  That would solve some interesting problems both for the
web/forge review workflow as well as the mailing list review workflow.
I'd be curious what people might think about that.

Cheers,

						- Ted