Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer

Elijah Newren <newren@xxxxxxxxx> · Thu, 3 Apr 2025 08:39:31 -0700

On Wed, Apr 2, 2025 at 11:48 AM Martin von Zweigbergk
<martinvonz@xxxxxxxxxx> wrote:
>
> Hi,
>
> The Gerrit, GitButler, and Jujutsu projects all have a concept of
> a "change id", and it behaves in a similar way between the three
> tools. The change id is conceptually associated with a commit.
> It follows a commit as it's rewritten (e.g. by amending and
> rebasing). The three projects currently store and format the
> change id differently. We would like to unify that so we can
> interoperate better. We hope the Git project is also interested
> in preserving and using this header.
>
> There are many benefits to having a change id even if it's just
> local. I mentioned some in my email to this mailing list in [1].
> For example, it enables
> `git rebase main <change ID>; git switch <change ID>` without
> requiring the user to look up the hash of the rewritten commit.

But <change ID> isn't unique, right?  The whole point of having the
change ID is to preserve it despite edits (e.g. rebase, commit
--amend, cherry-pick), meaning that you end up with multiple commits
with the same <change ID>.

Why would this work?

And if it does work, isn't it expensive since you'd need to walk
history to find it?  Or do you keep an extra lookup table on the side
somewhere?
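
To make the question concrete, here is a rough Python sketch of what a
side lookup table could look like. It assumes the proposed `change-id`
commit header discussed further down, which Git does not write today,
and it works on raw commit text such as the output of
`git cat-file commit <id>`. The names `parse_change_id` and
`build_index` are made up for illustration and are not part of any of
the tools mentioned.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def parse_change_id(raw_commit: str) -> Optional[str]:
    """Return the value of a `change-id` header, or None if absent.

    Assumption: the header is named `change-id` and lives in the commit
    header block, as proposed in this thread.
    """
    for line in raw_commit.split("\n"):
        if line == "":
            break  # a blank line ends the headers; the message follows
        key, _, value = line.partition(" ")
        if key == "change-id":
            return value
    return None


def build_index(commits: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each change id to every commit id that carries it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for commit_id, raw in commits.items():
        change_id = parse_change_id(raw)
        if change_id is not None:
            index[change_id].append(commit_id)
    return dict(index)
```

A change id that survives an amend or rebase shows up in the index with
more than one commit id, so a command taking <change ID> would still
need some rule for picking among them.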

> If the change id is also transferred between repos and preserved by
> a forge (such as Gerrit), it enables the change id to be used to
> identify a code review.
>
> Here's how the change ids are currently stored and formatted:
>
>  * Gerrit currently stores change ids in a commit trailer called
>    `Change-Id`. It always starts with the letter 'I' and is
>    followed by 40 hex digits. For example:
>    `Change-Id: Ib563e78c3fedcff262255fa025441daa3202311b`.
>
>  * GitButler currently stores change ids in a commit footer
>    called `gitbutler-change-id` (older versions used
>    `change-id`). It's written as 32 hex digits separated by
>    dashes as in the UUID format. For example:
>    `gitbutler-change-id 7d0fbc63-032d-413c-8ae8-610fbeb713c0`.
>
>  * Jujutsu currently stores change ids in a local storage outside
>    of the Git repo and is therefore not part of the Git commit
>    id. It is stored as 16 bytes. It is rendered to the user as
>    "reverse hex" using 'z' through 'k' as hex digits ('z' = 0,
>    'k' = 15). This allows even short prefixes to be distinguished
>    from commit ids, which is a very useful property when used in
>    the CLI.
>
> As mentioned, the three projects would like to use the same
> storage and format. I think we have a consensus to store it in a
> Git commit header called `change-id` as 32 reverse-hex digits.
> For example: `change-id ywlktllmukprnxnmzzprukpuwyztylwt`.

Yaay, I always hated it as a trailer.
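
For anyone unfamiliar with the "reverse hex" rendering quoted above,
here is a minimal Python sketch of the mapping as described ('z' = 0
through 'k' = 15). The function names are hypothetical, and this is
only one reading of the description, not Jujutsu's actual code.

```python
HEX = "0123456789abcdef"
REVERSE_HEX = "zyxwvutsrqponmlk"  # 'z' = 0 ... 'k' = 15, per the description

_TO_REVERSE = str.maketrans(HEX, REVERSE_HEX)
_FROM_REVERSE = str.maketrans(REVERSE_HEX, HEX)


def to_reverse_hex(raw: bytes) -> str:
    """Render bytes as reverse-hex digits (two digits per byte)."""
    return raw.hex().translate(_TO_REVERSE)


def from_reverse_hex(text: str) -> bytes:
    """Decode reverse-hex digits back to bytes; reject other characters."""
    if any(ch not in REVERSE_HEX for ch in text):
        raise ValueError("not a reverse-hex string: %r" % text)
    return bytes.fromhex(text.translate(_FROM_REVERSE))
```

Because the alphabet shares no characters with ordinary hex, a 16-byte
id becomes 32 letters in the range 'k' to 'z', which is why even a short
prefix cannot be confused with a commit id.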

> There is a design doc [2] about the impact on Gerrit and how to
> handle various cases where the client doesn't understand the
> `change-id` header. That also includes some discussion about
> whether cherry-picking should preserve the change id or create a
> new one. I think there is a lot of value in having a
> standardized header regardless of what we decide about
> cherry-picks.

cherry-pick & rebase preserve author name, email & time, while
creating a new committer name, email, & time.  To me, the change-id is
about the authorship, and since these commands already preserve
authorship, it'd seem weird to me to have cherry-pick not preserve the
change-id by default.

> So, to be clear, this is mostly a heads up at this point; we don't
> depend on any immediate changes from the Git project.

I appreciate the heads up, and agree based on what I've seen so far
that you can at least get started without Git changes.

However, I think you'd want git to preserve the change-id headers upon
git commit --amend, rebase, or cherry-pick, which would require some
git changes.  And you may want git to preserve them when doing a
fast-export, and be able to read them in with fast-import.

Anyway, I was a voice in the past that was kind of against these,
though that was mostly as a commit footer.  Plus, the number of
projects using them, hearing about their experience at Git Merge, and
realizing that part of my objection was due to misuse of Gerrit by
some folks in the past have all led me to change my opinion.