Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer

Martin von Zweigbergk <martinvonz@xxxxxxxxxx> · Thu, 10 Apr 2025 14:40:34 -0700

On Thu, 10 Apr 2025 at 01:29, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Nico Williams <nico@xxxxxxxxxxxxxxxx> writes:
>
> > argument against similarity heuristics over change IDs.  I still think
> > that explicit change IDs would be better than using only commit
> > similarity heuristics.
>
> I do think it makes sense to explicitly record that this commit was
> (or "these commits were") created to refine and replace that commit
> (or "these other commits"), if we want to keep track of how a set of
> patches evolved, if such a determination can be reliably done.  And
> I suspect that IDEs can do a much better job keeping track of such
> correspondence than they can keep track of renames and copies, which
> I mentioned in an earlier message.
>
> It is insufficient to just record a single "change ID" to each
> commit, in order to handle anything other than "a single commit gets
> updated by another single commit" case.  It is insufficient to even
> keep track of "a single commit gets updated by another single
> commit, which in turn gets updated by yet another single commit"
> case, without assuming a globally synchronised clock in a distributed
> environment, simply because you only have three commit objects that
> share the same "change ID" string among themselves, and you cannot
> tell between A becoming B becoming C (in which case people would
> consider C is the latest in the iterations), or two developers
> started from A to produce B and C independently (in which case it is
> not yet decided which one between B and C should be considered the
> latest).
>
> Since we are all human, it is possible that we think things through
> and make a design as complete as humanly possible but it later turns
> out to be insufficient.  If we make such a mistake, we'd then need
> to deal with it and that is just simply a part of developers' life.
>
> But something that is _known_ to be structurally insufficient before
> it is added to the system?  We should refuse to make such a thing
> part of the very core of the data structure, like the header fields
> in commit objects.

I think we are talking about slightly different things. The change ID
proposal is about providing a stable way of referring to an evolving
commit. You can think of it almost like an automatically generated Git
branch name that follows the commit as it's rewritten (as if you had
passed `--update-refs` to every command). I think what you're
describing is more like the "Git Evolve" proposal [1] (also linked to
from elsewhere in this thread). I think that's also an interesting
feature but I see it as mostly a separate feature. There is certainly
overlap between the two features, such as how the simple centralized
flow I mentioned can work pretty well by relying only on the change
ID.
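
To make the branch-name analogy concrete, here is a toy Python sketch.
It is only an illustration under simplifying assumptions, not how
Jujutsu or Git actually store anything: `Commit`, `Repo`, `visible`,
and `rewrite` are made-up names, the change ID "xuqkrmzt" is invented,
and the commit ID is just a hash of a few fields.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    change_id: str                # stays the same across rewrites
    message: str
    parent: Optional[str] = None  # parent commit ID, if any

    @property
    def commit_id(self) -> str:
        # Content hash: any change to the commit yields a new commit ID.
        data = f"{self.change_id}\n{self.parent}\n{self.message}".encode()
        return hashlib.sha1(data).hexdigest()


class Repo:
    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}  # commit ID -> commit
        self.visible: Dict[str, str] = {}     # change ID -> current commit ID

    def add(self, commit: Commit) -> str:
        self.commits[commit.commit_id] = commit
        # Like a branch name, the change ID now points at this version.
        self.visible[commit.change_id] = commit.commit_id
        return commit.commit_id

    def rewrite(self, change_id: str, message: str) -> str:
        # Look up whatever the change ID currently points at and rewrite that.
        old = self.commits[self.visible[change_id]]
        return self.add(Commit(old.change_id, message, old.parent))


repo = Repo()
original = repo.add(Commit("xuqkrmzt", "draft"))
foo = repo.rewrite("xuqkrmzt", "foo")
bar = repo.rewrite("xuqkrmzt", "bar")
```

Each rewrite produces a new commit ID, but the change ID keeps
resolving to the latest version, so the second rewrite applies to the
result of the first rather than to the original commit.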

Change IDs in Jujutsu are very useful without support for changeset
evolution. With the indexing I mentioned and the prefix lookup
prioritizing "mutable commits" (roughly those that are not on a
remote), it's quite convenient to run `jj log` and see a highlighted
prefix of, say, "xu", and then you can do e.g. `jj show xu` instead of
having to paste a longer ID or type a branch name. A further advantage
of preferring change IDs over commit IDs in commands is that you don't
risk creating "divergent" commits (similar copies) by rewriting the
same commit twice. For example, `jj describe xu -m foo && jj describe
xu -m bar` will rewrite the original commit to have message "foo" and
then rewrite the rewritten commit to have message "bar". I understand
that this use case is less useful to Git users because Git doesn't
like to work in detached HEAD mode and doesn't rewrite descendants
automatically. Consider experimenting with jj to get a better sense of
how it works :)
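
The prefix lookup described above can be sketched roughly as follows.
This is a simplified Python illustration, not Jujutsu's actual index
or algorithm: `resolve_prefix` and `shortest_prefix` are hypothetical
helpers, and the change IDs are invented for the example.

```python
def resolve_prefix(prefix, mutable_ids, all_ids):
    """Resolve a change-ID prefix, preferring mutable changes.

    The prefix only has to be unique among mutable changes; the full
    set of change IDs is consulted only if no mutable change matches.
    """
    for pool in (mutable_ids, all_ids):
        matches = [cid for cid in pool if cid.startswith(prefix)]
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            raise ValueError(f"ambiguous prefix {prefix!r}: {matches}")
    raise KeyError(prefix)


def shortest_prefix(change_id, pool):
    """Shortest prefix of change_id that no other ID in pool shares."""
    others = [cid for cid in pool if cid != change_id]
    for n in range(1, len(change_id) + 1):
        if not any(o.startswith(change_id[:n]) for o in others):
            return change_id[:n]
    return change_id


# Invented IDs: two local (mutable) changes and two already on a remote.
mutable = ["xuqkrmzt", "xpwolkyn"]
everything = mutable + ["xutsnvyz", "kkmpptxz"]

highlighted = shortest_prefix("xuqkrmzt", mutable)
resolved = resolve_prefix(highlighted, mutable, everything)
```

Because uniqueness is checked against the mutable changes first, a
short prefix stays usable even when an immutable change elsewhere in
the history happens to share it.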

[1] https://lore.kernel.org/git/pull.1356.git.1663959324.gitgitgadget@xxxxxxxxx/