On Mon Apr 14, 2025 at 5:13 PM CEST, Junio C Hamano wrote:
> "Theodore Ts'o" <tytso@xxxxxxx> writes:
>
>> On Fri, Apr 11, 2025 at 10:44:43AM -0700, Junio C Hamano wrote:
>>>
>>> The submitting contributor must make a conscious arrangement to give
>>> a "patch set ID" shared among the messages in a single iteration,
>>> and everybody who is responding must make sure they do not add the
>>> same ID to the messages they throw at the thread in response. Those
>>> who use format-patch and send-email can do that with convention and
>>> automation and there is no reason to rely on In-Reply-To: header
>>> (which may confuse the automated recipient of manually created
>>> follow-up messages).
>>
>> So it all depends on how the patch set ID is implemented. Here's one
>> way that I had in mind. The reason why I like this over the
>> Change-ID approach is that the semantics can be very clearly defined,
>> and the only thing we rely on is the user saying "this new commit is
>> part of patch series which I'm putting together".
>>
>> By default when creating a new commit, the field is empty (in which
>> case the patch set ID is presumed to be the same as the commit ID), or
>> if the user gives a command-line flag say, "git commit --series"
>> which indicates that it is part of a patch series in which case the
>> patch set ID of the commit is set to the patch set ID of the current
>> commit (i.e., eventually, its parent commit).
>>
>> Whenever the commit is amended or rebased or cherry picked, if the
>> patch series ID is NULL, then it is set to the original commit ID.
>> Otherwise, the existing patch set ID is preserved.
>>
>> The patch set ID will be output by git format-patch (perhaps as "Patch
>> Series ID: sha hash" immediately after the --- line). And if it is
>> present, "git am" will import that patch series ID into the commit
>> which it creates when it sucks in the e-mail.
>>
>> The net effect of this is that for new versions of git which implement
>> the Patch Set ID, all new commits are treated as patch series of
>> length 1, unless a subsequent commit is created using "git commit
>> --series". And the Patch Set ID will be preserved across
>> cherry-picks, rebase operations, and git send-email/git apply-message
>> operations.
>>
>> So if someone replies to an existing e-mail thread with a new commit,
>> git format-patch will give it a different patch set ID, so we can
>> distinguish it from an amended copy of a patch in the patch series.
>>
>> It also means that for singleton commits, the patch set ID effectively
>> acts much like the traditional Change-ID. For multi-commit patch series,
>> all of the commits will have the same patch set ID.
>
> Yeah, I like that aspect the best---the case for single commit
> series falling out as a natural degenerate case of the more general
> case to support multi-commit series is a good sign that the design
> got something right ;-)
>
> I am still not sure what to think about the lack of explicit
> evolution history of one patch set that shares the same patch set ID.
>
> When we have 10 commits that share the same patch set ID, I can
> imagine that we can easily tell 3 are from one iteration, and 3 and
> 4 among the rest are from another two iterations by noticing that
> there are three strands of pearls, having 3, 3, and 4 commits on them.
> And we can identify the initial round by noticing that one of the
> commits has its name as the patch set ID, but I am not sure if we
> should be OK by not having anything but the committer timestamp to
> tell which one among the other two iterations is earlier, and we
> cannot tell anything about these two other iterations if they are
> independent rewrites of the original round.
>
> But other than that, I like something with clearly defined semantics
> (and the definition coming naturally out of the structure, not out
> of some arbitrary convention that forces to bring in some
> semantics), and what you outlined above looks reasonably clean and
> easy to use.

Doesn't a patch set ID suffer from the same kind of ambiguity the
change-id supposedly does? Patch sets can be split and merged, and a
commit from one patch set can be cherry-picked into another. What
patch set ID should such a cherry-picked commit have?

And I think the argument that a change-id for a singleton patch set
naturally falls out of the patch set ID can easily be reversed.
Admittedly, I don't have the most experience with the mailing list
workflow, but a multi-commit patch set usually comes with a cover
letter, right? And people like to track their cover letter in a
commit? IIUC, b4 is designed around that too. In that case, the cover
letter has its own change-id like any other commit, which will
naturally remain stable across every version of the patch set. It
would be nonsensical to squash, split or cherry-pick the cover letter
commit. Sounds like a great candidate for the patch set ID. So the
patch set ID can just as naturally flow out from the change-id.

I can see two concrete disadvantages of the patch set ID:

* It's strictly less powerful. As explained, the change-id can do
  everything the patch set ID can via the cover letter. But the patch
  set ID cannot help you track how individual commits within the patch
  set evolved.

* It's more complicated. While many Git users work with patch sets
  every day, it's not a concept in Git itself. Git only knows about
  commits. The patch set ID would introduce a new concept into Git
  unnecessarily, while the change-id naturally extends the language
  Git already speaks, that of commits.

Remo
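
To make the cherry-pick question above concrete, here is a minimal
Python sketch of the patch set ID rules as quoted from the proposal.
It is only a toy model under stated assumptions: the Commit class, the
commit() and rewrite() helpers, and the counter-based commit IDs are
all hypothetical and are not part of Git or of any proposed patch.

```python
from dataclasses import dataclass
from typing import Optional
import itertools

_ids = itertools.count(1)


def _new_commit_id() -> str:
    # Stand-in for a real object hash; purely illustrative.
    return f"c{next(_ids)}"


@dataclass(frozen=True)
class Commit:
    commit_id: str
    parent: Optional["Commit"] = None
    series_field: Optional[str] = None  # empty by default

    @property
    def patch_set_id(self) -> str:
        # An empty field means "presumed to be the same as the commit ID".
        return self.series_field or self.commit_id


def commit(parent: Optional[Commit], series: bool = False) -> Commit:
    # Hypothetical model of "git commit" / "git commit --series":
    # with series=True the new commit inherits the parent's patch set ID.
    field = parent.patch_set_id if (series and parent is not None) else None
    return Commit(_new_commit_id(), parent, field)


def rewrite(original: Commit, new_parent: Optional[Commit]) -> Commit:
    # Hypothetical model of amend / rebase / cherry-pick: an empty field is
    # set to the original commit ID, an existing ID is preserved.
    return Commit(_new_commit_id(), new_parent, original.patch_set_id)


base = commit(None)
a1 = commit(base)              # starts series A
a2 = commit(a1, series=True)   # second commit of series A
b1 = commit(base)              # starts series B
b2 = commit(b1, series=True)   # second commit of series B

# Cherry-pick a2 on top of series B: it keeps series A's ID.
picked = rewrite(a2, b2)
print(picked.patch_set_id == a1.patch_set_id)  # True
print(picked.patch_set_id == b2.patch_set_id)  # False
```

In this model the cherry-picked commit sits on top of series B but
still carries series A's patch set ID, which is the ambiguity the
question above is about.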