On Tue, Apr 08, 2025 at 08:55:21AM -0400, Theodore Ts'o wrote:
> On Mon, Apr 07, 2025 at 04:36:01PM -0500, Nico Williams wrote:
> > This is why I suggested earlier that there need to be multiple change
> > IDs, not just one. Perhaps one is a "code review ID" and another is
> > a "commit change ID". [...]
>
> I think "code review ID" makes a lot of sense, although what I would
> call it is "patch series ID". This has very clear semantics: it ties
> commits which should be grouped together as a single higher-level set
> of changes. It could be used by "git format-patch" / "git send-email"
> to automatically send a group of patches as a logical unit.
>
> [...]

Yes.

> I'll note that even without the "commit change ID", just simply
> knowing that one patch series is a newer version of a pre-existing
> patch series is enough to allow Gerrit to intuit which commit is a
> newer version of another commit. For singleton commits, nothing else
> is necessary. For multi-commit patch series, gerrit could use the
> one-line commit description to associate commits; it could use
> ordering of the patches; it could just see which commit contents are
> similar to previous commits, much like how git detects renames.

I'm not keen on CR tools "intuiting" from similarity checks. I don't
love Git's similarity checks for file renames. I get that for a
distributed VCS assigning something like "inode numbers" is tricky, but
as long as devs don't race to create the same files it was always
possible to have UUIDs as "inode numbers" and avoid the similarity
checks.

Strictly speaking we don't even need any of these change IDs to make it
possible for tools to use similarity checks to find all versions of a
commit or patch series or whatever, but it's very nice to have something
less heuristic and more exact.

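To make the "exact vs. heuristic" point concrete, here is a rough
sketch. Everything in it is made up for illustration: the commit
hashes, the subjects, the "I-1111"-style change IDs, and the idea that
a tool would compare one-line descriptions with Python's difflib. It
is not how Gerrit or Git actually do it, just a toy showing how a
retitled v2 can be missed by similarity while an explicit ID still
links it.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical commit records: (commit hash, subject line, change ID).
# The hashes, subjects, and ID values are invented for this sketch.
COMMITS = [
    ("a1f3", "fs: fix off-by-one in extent lookup", "I-1111"),
    ("b2e4", "fs: rework extent lookup bounds check", "I-1111"),  # v2, retitled
    ("c3d5", "fs: fix off-by-one in extent split", "I-2222"),     # unrelated change
]


def link_by_id(commits):
    """Exact linking: group commit hashes that carry the same change ID."""
    groups = defaultdict(list)
    for sha, _subject, change_id in commits:
        groups[change_id].append(sha)
    return dict(groups)


def most_similar_subject(target, commits):
    """Heuristic linking: return the hash of the other commit whose
    subject line is most similar to the target's subject line."""
    sha, subject, _change_id = target
    others = [c for c in commits if c[0] != sha]
    best = max(
        others,
        key=lambda c: SequenceMatcher(None, subject, c[1]).ratio(),
    )
    return best[0]


if __name__ == "__main__":
    print(link_by_id(COMMITS))
    print(most_similar_subject(COMMITS[0], COMMITS))
```

With this toy data the ID-based grouping puts a1f3 and b2e4 together,
while the subject-similarity heuristic pairs a1f3 with the unrelated
c3d5, because the v2 was retitled. A real tool would look at more than
the subject, of course, but the failure mode is the same in kind.
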
> In my experience looking at how kernel developers use gerrit versus
> e-mail workflows, in general, gerrit patch series tend to involve a
> smaller number of commits, because looking at how various files change
> between commits is awkward; and with e-mail workflows, the patch
> series tend to involve a larger number of commits, because reviewing
> smaller commits is easier with e-mail.

Yes.

> So if this is true for other communities using web-based review
> workflows, using heuristics instead of a [...]

I'm not keen :)

> > I don't think they need to have such extremely detailed semantics in
> > order to be able to get a header. The semantics will ultimately be
> > somewhat project-defined, typically something like "during code review
> > you can use these to relate newer updates to an MR/PR/CR to older
> > versions" and "once integrated you can use these to find the approved
> > code review as follows [details]". The [details] (probably a URI
> > template) for finding concluded CRs might vary. The CR tool might vary.
> > The construction of the change IDs might vary. The intent might not
> > vary at all.
>
> I disagree. From long experience, allowing something into an
> interface that doesn't have strongly defined semantics has led to
> *huge* problems. This has certainly been the case for
> Kernel<->Userspace interfaces; so my bias is that if we can't define
> strong semantics, then we should probably avoid adding that interface
> until we can. Otherwise, this can lead to a huge number of headaches,
> both for developers and users.

So how much of the [details] do you want specified?

If you want to be able to go from "change ID" to CR generically for all
CR tools then the best, and perhaps only reasonable, way is to make the
change ID a URI.

Or if you think the [details] can be elided and still have semantics
that are well-defined enough then I think you agree with me more than
you disagree :)

If we want to leave some details to be site-/project-local then perhaps
change IDs should have some type and domain/project identifier. Users
who cannot make use of that metadata (e.g., because the CR tool is not
reachable) can still use the change IDs to link commits and patch
series.

I think that linking is the only thing we absolutely must define
semantics for, and the rest can be site-/project-local. IMO.

Nico
--