On Mon, Apr 07, 2025 at 04:36:01PM -0500, Nico Williams wrote: > This is why I suggested earlier that there need to be multiple change > IDs, not just one. Perhaps one is a "code review ID" and another is > a "commit change ID". The code review ID would let you link together > all commits that were reviewed together, so if you have to split or > squash commits they would all still have that one code review ID. The > commit change ID would be shared by all sufficiently-similar versions of > a commit. If a commit is dropped or split or squashed then its commit > ID might get dropped too, but the code review ID would stay the same. I think "code review ID" makes a lot of sense, although what I would call it is "patch series ID". This has very clear semantic: it ties commits which should be grouped together as a single higher-level set of changes. It could be used by "git format-patch" / "git send-email" to automatically send a group of patches as a logical unit. I'd include the "patch series ID" in the e-mail that gets sent out, so that "git apply-message" would be able to retain the patch series ID. Patchwork could use the Patch series ID to automatically mark a v2 version of patch series as obsoleting the v1 version of the patch series. So it would be a lot more useful for than just for Gerrit-style workflows, and that's a good sign that feature makes sense from a design perspective. I'll note that even without the "commit change ID", just simply knowing that one patch series is a newer version of a pre-existing patch series is enough to allow Gerrit to intuit which commit is a newer version of another commit. For singleton commits, nothing else is necessary. For multi-commit patch series, gerrit could use the one-line commit description to associate commits; it could use ordering of the patches; it could just see which commit contents are similar to previous commits, much like how git detects renames. In my experience looking at how kernel developers use gerrit versus e-mail workflows, in general, gerrit patch series tend to involve a smaller number of commits, because looking at how various files change between commtis is awkward; and with e-mail workflows, the patch series tend involve a larger number of commits, because reviewing smaller commits is easier with e-mail. So if this true for other communities using web-based review workflows, using an hueristics instead of a "commit change ID" might be sufficient --- and for those communities that run into problems, they could continue to use a gerrit-style "Change-ID: " in the footer, with the hueristics being used if for some reason commits that don't have the Change-ID make it into Gerrit. > > Quite frankly, I think the concept of "change ID" is nice but it is > > not mechanically trustable. Recording them in the trailers is fine, > > but I somehow feel that they have a clear-cut semantics everybody > > can agree on to deserve to be in the header part of commit objects. > > I don't think they need to have such extremely detailed semantics in > order to be able to get a header. The semantics will ultimately be > somewhat project-defined, typically something like "during code review > you can use these to related newer updates to an MR/PR/CR to older > versions" and "once integrated you can use these to find the approved > code review as follows [details]". The [details] (probably a URI > template) for finding concluded CRs might vary. The CR tool might vary. > The construction of the change IDs might vary. The intent might not > vary at all. I disagree. From long experience, allowing something into an interface that doesn't have strongly defined semantics has lead to *huge* problems. This has certainly been the case for Kernel<->Userspace interfaces; so my bias is that if we can't define strong semantics, then we should probably avoid adding that interface until we can. Otherwise, this can lead to a huge number of headaches, both for developers and users. People *will* develop automation tools suing an official "commit change ID", assuming that how their project (or their forge site) uses the ill-defined Change ID is the One True Way that the badly defined field should be used. And other people will developer *other* tools assuming some other interpreation for that field. And then the git developers and users will be left trying to pick up the pieces. Cheers, - Ted