Re: Semantics of change IDs (Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer)

"D. Ben Knoble" <ben.knoble@xxxxxxxxx> · Tue, 22 Apr 2025 16:17:03 -0400

On Wed, Apr 16, 2025 at 7:36 AM Remo Senekowitsch <remo@xxxxxxxxxxx> wrote:
>
> On Mon Apr 14, 2025 at 9:54 PM CEST, D. Ben Knoble wrote:
> > On Wed, Apr 9, 2025 at 12:56 PM Nico Williams <nico@xxxxxxxxxxxxxxxx> wrote:
> >> Let's nail down the semantics of these change ID headers.  Here is a
> >> proposal to bang on:
> >>
> >>  - change IDs get preserved on cherry-pick and on `pick`s in rebases
> >>
> >>  - users can manually remove or change these change IDs, naturally,
> >>    though generall they would not
> >>
> >>  - the actual change IDs are either free-form or they are URIs -- pick
> >>    one, but if they are URIs they should be URIs to CRs, and approved
> >>    CRs should perhaps have links to integration reports etc.
> >
> > Using URIs [to code reviews] looks to me like it makes some
> > assumptions about what creates or consumes these headers, right?
> > Especially since the URI should point to a code review… Is there a way
> > to do that which is downstream-agnostic?
> >
> > Further, and maybe this is my ignorance of Gerrit showing: how would
> > you attach a URI to a local commit when authoring it? You don't have
> > the review URI when running `git commit`, do you? (Maybe I
> > misunderstood; I'm seeing an odd chicken-egg problem here.)
> >
> > Which begs another question: what/who applies the initial change ID to
> > a commit and when?
>
> These are all great questions, which the originally proposed format
> (fixed-width reverse-hex) has answers to. I think a URI would be
> strictly worse.

Well, I think we still missed "what/who applies the initial change ID
to a commit and when."

But the treatment below is something I agree with and failed to
convey, I think: namely, URIs seem to encode too much
"unportable"/"specific" information in Git. I feel like the current
design is not really "tool-agnostic" as much as "built on a universal
core." That seems valuable and prone to more longevity.

>
> * Using a reverse-hex ID makes no assumptions about what consumes these
>   headers. There can be multiple different consumers which treat the ID
>   differently with different URI schemes.
>
> * Attaching a reverse-hex ID to a local commit when authoring it is
>   trivial: you generate it randomly.
>
> This is one of those cases where being maximally restrictive about the
> format will enable maximal flexibility downstream.
>
> One example: GitHub has a URL scheme that looks like this:
> github.com/org/repo/compare/<ref1>..<ref2>
>
> This doesn't work if the refs contain slashes, as branches sometimes do
> (e.g. feat/foo, username/bar). If the change-id is a URI, this type of
> URL scheme doesn't work reliably.
>
> That is not to say we should design the change-id around GitHub, it's
> just an example how making the format more free-form (URI is more
> free-form than fixed-width reverse-hex) makes it more difficult to get
> stuff working downstream.
>
> And lastly, laser-etching the URI scheme of one particular tool into
> your commit history means the history is at great risk of degrading
> over time. URI schemes change, domains change, tools become outdated
> and are replaced.
>
> Adding some ephemeral configuration to a tool that constructs a URI out
> of a reverse-hex ID on the other hand is trivial.

Yep.

> > PS This discussion feels somewhat related to the classic GitHub
> > problem of not presenting interdiffs/range-diffs: GitHub shows a
> > too-flat source diff on force-pushes. Perhaps better web UI tooling
> > about interdiff review (which I think is one of the things Gerrit
> > does/wants to do?) makes change IDs less necessary, since interdiffs
> > help connect evolutions of commits?
>
> I think it's the other way around: Building a code review UI built on
> git and centered around interdiffs today is _hard_, that's why we don't
> have it yet. Adding change-ids to commits will make it much easier,
> paving the way for these tools to be implemented.
>
> Remo

Fair point, although GitHub's detection of force-pushes makes me think
it could split a PR into versions at that point, cross-link backwards
and forwards by one version (from the force-push detection), show
range-diffs between versions based on the target branch of the merge,
and even follow the cross-links to show an overall sequence of
versions.

But I don't work there, so presumably it's harder than that :)

I sincerely hope to make it easier if it's really that hard! And,
though my opinion matters little, I'm having a hard time piercing the
conversation to see a "universal core" that solves the desired
problem. Maybe I'm not reading carefully enough, and maybe a summary
would help. I greatly appreciated the work of previous folks to
summarize the current thread status.

-- 
D. Ben Knoble