What a thread!

On Tue, Apr 8, 2025, at 16:27, Junio C Hamano wrote:
> Martin von Zweigbergk <martinvonz@xxxxxxxxxx> writes:
>
>>> A set of individual commits that share the same "change ID" is,
>>> unlike reflog entries which is an ordered set of tip of topics, not
>>> inherently ordered. This is inevitable in the distributed world
>>> where many people can simultaneously work on improving a single
>>> "change" in many different ways, but making it difficult if not
>>> impossible to see how things evolved, simply because you first need
>>> to figure out the order of these commits that share the same "change
>>> ID". Some may be independently evolved from the same ancestor
>>> iteration. Some may be repeatedly worked on on a single strand of
>>> pearls (much like how development recorded in reflog entries of a
>>> single branch in a single user set-up goes). I guess you would need
>>> a way to record the predecessor vs successor relationship of various
>>> commits that share the same "change ID", much like commits form DAG
>>> to represent ancestor vs descendant relationship.
>>
>> That is correct. The change ID should be sufficient for handling
>> simple distributed cases involving a single remote but it's not a full
>> replacement for something like Mercurial's Changeset Evolution [1].
>
> Just a random thought. We could very easily replace "change ID"
> with a concept of predecessor-successor commits.
>
> Just like we can represent parents-children NxM transitive relation
> only with 0 or more "parent" commit object headers, we can record
> zero or more "predecessor" trailer in the commit log.
>
>  (1) a commit with no "predecessor" is like "root commit" in the
>      commit history topology. It is a brand new change that took
>      inspiration from nobody else and that is not a polished form of
>      any other existing commit.
>
>  (2) a commit created as a refinement for one or more existing
>      commits record each of them as "predecessor" to it. Having
>      more than one of them is like a "merge commit" in the commit
>      history topology and represents that two patches were squashed
>      into one.
>
>  (3) Splitting an originally large change into multiple changes can
>      be represented the same way. They share the same commit as
>      their "predecessor". Perhaps you have originally two-commit
>      series, A and B, and split them differently in such a way that
>      C has half of a and D has the rest of A plus B. In which case,
>      C has A as its predecessor while D has both A and B as its
>      predecessor.
>
>  (4) Just like we can use auxiliary data structures like bitmaps to
>      figure out reachability without following all the links in the
>      commit history topology, we should be able to learn how a new
>      change was born, and trace how it evolved into newer iteration
>      of the moral equivalent of the change, possibly as a series
>      with mutiple commits, using auxiliary data structure, which
>      would represent predecessor-successor NxM transitive relation
>      in a similar way in a form that is efficient to access.
>
> Something like this should allow us avoid relying on "change ID"s
> that can collide elsewhere in the world without having a central
> authority to assign them.

I have a few submissions where I recorded the commit hash and the
previous commits in the email headers.

https://lore.kernel.org/git/0ab05a4cf09ba02016b4493936ad1b092b1326aa.1730979849.git.code@xxxxxxxxxxxxxxx/

For this one (v3):[1]

```
X-Commit-Hash: 0ab05a4cf09ba02016b4493936ad1b092b1326aa
X-Previous-Commits: c50f9d405f9043a03cb5ca1855fbf27f9423c759 63a431537b78e2d84a172b5c837adba6184a1f1b
```

• `X-Commit-Hash`: my local commit for this patch
• `X-Previous-Commits`: the two previous commits (v1 and v2 in
  arbitrary order)

Version 1 just has the hash:

https://lore.kernel.org/git/63a431537b78e2d84a172b5c837adba6184a1f1b.1729451376.git.code@xxxxxxxxxxxxxxx/

```
X-Commit-Hash: 63a431537b78e2d84a172b5c837adba6184a1f1b
```

And v2:

https://lore.kernel.org/git/c50f9d405f9043a03cb5ca1855fbf27f9423c759.1730234365.git.code@xxxxxxxxxxxxxxx/

```
X-Commit-Hash: c50f9d405f9043a03cb5ca1855fbf27f9423c759
X-Previous-Commits: 63a431537b78e2d84a172b5c837adba6184a1f1b
```

† 1: The hash is in the message-id in my case. But I wanted a dedicated
   field instead of taking it out of the msg id. And the msg id makeup
   doesn’t seem documented. I’ve already seen a thread where someone
   relied on parsing data out of the msg id until it changed from
   under them.

>  (4) Just like we can use auxiliary data structures like bitmaps to
>      figure out reachability without following all the links in the
>      commit history topology, we should be able to learn how a new
>      change was born, and trace how it evolved into newer iteration
>      of the moral equivalent of the change, possibly as a series
>      with mutiple commits, using auxiliary data structure, which
>      would represent predecessor-successor NxM transitive relation
>      in a similar way in a form that is efficient to access.

I don’t know if this is related but it would be amazing if we users
could define custom indexes on the DB. Maybe people won’t agree on
what a change-id should mean (judging by this thread?) but with custom
indexes you could maybe get fast queries for whatever “id” you want to
define.

Unrelated example: defining an index on `git patch-id --stable` for
quick *cherry* checks without making your own table with:

```
<rev list> | git diff-tree --patch --stdin \
    | git patch-id --stable
```
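
For comparison, a rough sketch of what the do-it-yourself table could
look like today, assuming a plain text file of `<patch-id> <commit-id>`
lines stands in for the index. The function names and the file layout
are made up for illustration and are not part of Git; only
`git rev-list`, `git diff-tree` and `git patch-id` are real commands.

```shell
#!/bin/sh
# Sketch only: a flat file of "<patch-id> <commit-id>" lines stands in
# for the wished-for index. Function and file names are hypothetical.

# build_patch_id_index <index-file> <rev-list args...>
# Write one "<patch-id> <commit-id>" line per non-merge commit.
build_patch_id_index() {
    index_file=$1
    shift
    git rev-list --no-merges "$@" |
        git diff-tree --patch --stdin |
        git patch-id --stable >"$index_file"
}

# patch_id_of <commit>
# Print the stable patch ID of a single commit.
patch_id_of() {
    git diff-tree --patch "$1" | git patch-id --stable | cut -d' ' -f1
}

# find_by_patch_id <index-file> <patch-id>
# Print every indexed commit whose patch ID matches (nothing if none).
find_by_patch_id() {
    awk -v id="$2" '$1 == id { print $2 }' "$1"
}
```

With this you would run something like
`build_patch_id_index upstream.idx origin/master` once and then
`find_by_patch_id upstream.idx "$(patch_id_of HEAD)"` for each check.
The catch is that the file goes stale as soon as the history moves,
and keeping it fresh is exactly the job I would like a custom index to
take over.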