Re: Semantics of change IDs (Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer)

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 12 May 2025 10:03:35 -0700

Martin von Zweigbergk <martinvonz@xxxxxxxxxx> writes:

> If we instead had something like Mercurial's Changeset Evolution
> (explicitly recording how commits have evolved), then we could have a
> similar identifier that was based on the original version of a commit.
> To make lookup by this kind of change ID faster, we could have an
> index from commit ID to change ID (i.e. original commit ID). This
> seems to imply a commit can have 0 or 1 predecessors (0 for brand new
> commits, 1 for rewrites), which is different from Mercurial's
> Changeset Evolution, but not necessarily bad. For this kind of change
> ID to be the same across repos, and assuming the predecessor pointer
> is stored in the commit, we need to make sure to transfer all commits
> back to the original commit when we push to a remote. As I think we've
> talked about before here, that can be problematic because the user has
> to be careful to check that the intermediate commits did not have
> anything sensitive in them. It's also often wasteful to share all the
> intermediate commits with other developers. Another option is to
> transfer the predecessor pointer outside of the commit object. That
> has its own problems, like being able to create cycles in the
> predecessor graph.

A few comments (not necessarily strong suggestions).

 - I do not think you need to limit the predecessor pointers to 0 or
   1; when you started from N commits and worked to produce the
   final single commit, the result would naturally have N predecessors.

 - The predecessor pointers do not necessarily have to participate
   in the object transfer, just like filtered/lazy clones can ignore
   the tree pointer in a commit object when making a commit-only
   clone and the contained trees are fetched from the promisor
   on-demand.  They can even be filtered out by default, since
   transferring them incurs a cost that most people would not care
   to pay most of the time, and made available only when the user
   asks to see how the change arrived at its current shape.
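
To illustrate the first point, here is a minimal sketch in Python of
a predecessor graph in which a commit may have any number of
predecessors.  The table layout and the names (`Predecessors`,
`original_commits`) are hypothetical; nothing here is an existing
Git data structure or API.  It also shows that a lookup can tolerate
cycles and missing (e.g. filtered-out) predecessor records, in which
case a commit simply looks like an original.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical side table: commit ID -> IDs of the commits it was rewritten from.
# A commit with no entry (or an empty list) is treated as an original.
Predecessors = Dict[str, List[str]]


def original_commits(commit: str, predecessors: Predecessors) -> FrozenSet[str]:
    """Return the commits with no recorded predecessor reachable from `commit`."""
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; this also guards against cycles
        seen.add(current)
        preds = predecessors.get(current, [])
        if preds:
            stack.extend(preds)  # follow every predecessor, not just one
        else:
            roots.add(current)  # nothing recorded: treat as an original
    return frozenset(roots)


# Three commits squashed into s1, which was then amended into s2.
history: Predecessors = {"s1": ["a", "b", "c"], "s2": ["s1"]}
print(sorted(original_commits("s2", history)))
```

With N predecessors, "the original commit" becomes a set, so a
change ID derived this way would have to be defined over that set
rather than over a single commit ID.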