On Fri, Apr 04, 2025 at 02:05:13AM +0200, Remo Senekowitsch wrote:
> On Fri Apr 4, 2025 at 12:07 AM CEST, Nico Williams wrote:
> > If I cherry-pick a commit then I absolutely want its "change ID" to be
> > preserved by default. If I want to drop that I can always ask for that
> > or amend the commit to remove it. I will want the same behavior for
> > rebase and cherry-pick. Having to remember different defaults and
> > options for the two would be a cognitive load I do not need.
>
> Yeah, that's a very valid argument. In Jujutsu's CLI, there is a very
> clear separation between "rebase" and "duplicate", so there's no risk
> of confusion if one preserves the change-id and the other doesn't.
> [...]

"no risk of confusion"? It's higher cognitive load for the user, and
higher cognitive load does lead to confusion.

I don't understand whence the need to introduce a new name for an old
feature. Even Fossil, which lacks rebase, did not introduce a new name
for cherry-pick. Given cherry-pick one can always implement rebase, but
Fossil's devs hate rebase workflows and love merge workflows, so they
won't implement or accept rebase. But still, they did not think up a
new name for cherry-pick just to be different.

Mercurial also went through a phase of "we hate rebase", then later they
added: light-weight branches (i.e., Git-style branching), rebase, and
histedit (because mixing history editing with rebase is scary please
no!!), and the end result was just more unnecessary cognitive load.

I hope the next better-VCS project has more empathy for its users.

> Making them behave the same way can be seen as simpler.

Yes: because it reduces users' cognitive load!

Here's an example: Git's rebase, cherry-pick, and am commands have the
same --abort, --continue, and --skip options because they all do similar
things. This means I only ever needed to learn that once for one
subcommand and then that knowledge carried over to the others.
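To make the shared-options point concrete, here is a minimal sketch
that provokes the same conflict three ways in a throwaway repository and
backs out each time with the same --abort flag. The file name `f`, the
branch name `topic`, and the identity settings are placeholders made up
for the demo, not anything from the thread.

```shell
#!/bin/sh
# Sketch: one conflict, three commands, one way out (--abort).
set -eu

repo=$(mktemp -d)    # throwaway repository
patch=$(mktemp)      # mailbox-format patch kept outside the repository
cd "$repo"
git init -q
git config user.name "Demo User"              # placeholder identity
git config user.email "demo@example.invalid"
git config commit.gpgsign false

# Two branches that change the same line, so any replay conflicts.
echo base > f
git add f
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is

git checkout -q -b topic
echo topic > f
git commit -q -am "topic change"

git checkout -q "$main"
echo main > f
git commit -q -am "main change"

git format-patch -1 --stdout topic > "$patch"

# cherry-pick stops on the conflict; --abort restores the branch.
git cherry-pick topic >/dev/null 2>&1 || true
git cherry-pick --abort
after_pick=$(git log -1 --format=%s)

# am stops on the same conflict; --abort restores the branch.
git am "$patch" >/dev/null 2>&1 || true
git am --abort
after_am=$(git log -1 --format=%s)

# rebase stops on the same conflict; --abort restores the branch.
git checkout -q topic
git rebase "$main" >/dev/null 2>&1 || true
git rebase --abort
after_rebase=$(git log -1 --format=%s)

echo "$after_pick / $after_am / $after_rebase"
```

Swapping --abort for --continue (after resolving the conflict) or --skip
works the same way across all three commands.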
For all the UI hate Git gets, it gets things like that right, and I love
that.

> >> [...]. The ways rebase and cherry-pick are most often
> >> used are semantically very different from each other. (interactive)
> >
> > How do you know this is "most often" so? [...]
>
> I haven't conducted a study, this is my impression from talking to peers
> and reading chatter from other Git users online. Maybe the impression is
> wrong.

Rebase is notionally built out of cherry-pick, therefore they are
semantically similar even if users don't notice it.

Be careful with "chatter". You might only be hearing from merge-happy
users who don't use rebase and rarely use cherry-pick. Many users
prefer merge-based workflows because that elides the rebase/cherry-pick
cognitive load. I won't deny that rebasing requires more thinking than
merging, but leaving behind useful history always requires more thinking
than merging.

I recommend watching huge projects like OpenSSL, PostgreSQL, or Illumos
to get a good idea of how power users use Git.

For example, Illumos uses the same policy that Sun used to: strictly
linear history only in the upstream master branch, with no merge commits
ever. You can have topic branches, naturally, and release branches too,
but in each branch linear history only, which means you have to rebase
before pushing to upstream branches.

If you've never run into users who use or have to use such rebase-heavy
workflows then you might reach the wrong conclusions about what is
common. But at any rate in this case the fundamentals make it clear
that you cannot have just one commit with a given change ID and you
should not have different defaults for copying change IDs for rebase vs.
cherry-pick vs. am.

> Yeah, I was (over-)simplifying. rebase is the swiss-army knife of git
> commands. But for all of these operations, it holds that the previous
> version of the patch(es) won't be reachable in the commit tree anymore
> after the rebase is complete.
> (assuming potential descendant branches
> are also rebased, which is usually the case) So rebase doesn't generally
> cause duplicate change-ids, which is what I wanted to get at.

> it holds

Not so. For example, when forward-porting our local patches to
$external_open_source_project from 1.2.3 to 1.3.4 I do the following:

  : ; git checkout 1.2.3-patched
  : ; git checkout -b 1.3.4-patched
  : ; git rebase --onto 1.3.4 1.2.3
  : ; <address merge conflicts...>
  : ;
  : ; # Now branch 1.3.4-patched has the local patches from 1.2.3-patched
  : ; # but is based on 1.3.4.

and now I'll have multiple commits with the same change IDs.

> > When would you not want to preserve a change ID on cherry-pick? I can't
> > say I would ever have wanted to do that had Git had change IDs from day
> > 1, and I've been using Git for more than twenty years.
>
> That's not exactly how Jujutsu thinks about the change-id, but it's a
> useful piece of information. Gerrit does indeed use its change-id to
> track cherry-picks. I am in favor of measures to track that metadata
> (although duplicating change-ids is not my preferred option for that).

Say you insisted on adding a prefix or suffix to the change ID when
"duplicating" commits, how would you have Git enforce repo-wide
uniqueness of change IDs? The only tool Git has for this is refs, so
you'd have to create a ref for each change ID that points to the commit
with that change ID. But now you've defeated the whole point of change
IDs beyond code review, so I would insist on a multitude of types of
change ID so that I could have one that lets me have more than one
commit with the same change ID.

> Let's assume the change-id represents the origin of a patch. What should
> happen if a patch is split in two? Should they have the same change-id,
> because they ultimately have the same origin?

Maybe.
I, the author, get to decide whether a) they both keep the same change
ID, or b) one of them gets a new change ID, or c) both get new change
IDs, and/or maybe even d) with support for multiple change IDs I can
track that both came from the same original commit but also they now
have different additional change IDs.

If it's just a header I can do this by convention. If the VCS was going
to implement global uniqueness for change IDs then my life would get
more complicated in this case and I would not appreciate it.

> I don't attach too much semantic meaning to the change-id. It's a
> normally unique identifier for a change that persists as the change
> evolves. That's useful. The more commits with the same change-id as
> others there are, the less useful the concept becomes.

But it's perfect for all the other use-cases I mentioned, such as
backporting and forward-porting. Those are the use-cases that most
would benefit from change IDs. But even for the code-review-only case
the fact that you could look at a commit and use it to find
corresponding code review(s) is nice, even if you've had to cherry-pick
a commit for backports or for forward-porting.

> >> doesn't preserve the change-id for that reason. So if cherry-pick
> >
> > I have _never_ used cherry-pick to cause there to be duplicate commits
> > in the same branch. Therefore calling it "duplicate" seems terribly
> > wrong to me.
>
> Well, obviously not in the same branch. I meant duplicate among all
> visible commits (reachable from any branch). That's the issue we're
> discussing w.r.t. change-ids not always being unique identifiers for
> a single commit. What would you like me to call that situation instead
> of duplicate?

Cherry-pick. Because that's the name we already have.

> Can you maybe give some examples of how you use cherry-pick? I'd be
> interested in your use cases to maybe better understand where you're
> coming from. [...]
 - [take over someone else's work and] decide to pick some of their
   commits and drop others, but maybe do it in a new branch because the
   old branch is still useful due to dropped commits still being useful
   history (e.g., in case they are ever needed in the future)

 - fetch a branch from one upstream and pick selected commits onto
   another branch that normally tracks a different upstream (I keep
   local commits always "on top" of the upstream, rebasing as needed)

 - backports

 - forward-porting (this is arguably symmetric with backporting)

 - maintaining multiple related but different branches when researching
   different ways to implement some feature

> [...]. I myself almost never use cherry-pick, simply because I'm
> not involved in any backporting. I've seen cherry-pick used to get a
> bugfix from another branch onto your own, in order to avoid having to
> wait for the other branch to be merged. But that practice has always
> rubbed me the wrong way. [...]

You may not have worked in sufficiently complex environments/projects.

In my world cherry-pick is an essential tool I cannot do without. Where
a VCS was forced upon me that did not implement cherry-pick I've simply
used patch(1) to apply diffs from the commit I wanted to cherry-pick --
it's precisely because this is possible _and_ necessary that the VCS
might as well provide it. Given cherry-pick then rebase follows, ergo
the VCS might as well also provide rebase.

> [...]. I feel like the correct thing to do in that
> situation is to extract the bugfix to a separate dependency-free branch
> and make the two feature branches depend on it. That way, both feature

"extract the bugfix to a separate [...] branch" -- that's exactly what
cherry-pick does!

> branches can more easily track changes in the bugfix by rebasing. If the
> bugfix was cherry-picked, it's much harder to keep the two versions in
> sync. (And finally, the latter approach probably makes the bugfix land
> faster.)
> So yeah, interested to hear your use-cases for cherry-pick.

I don't see how using a built-in cherry-pick feature vs manually
"extracting" a commit makes it easier to "keep the two versions in
sync". On the contrary, manual operations always involve more cognitive
load than automated ones, and at any rate the thing that would help you
"keep the two versions in sync" is... change IDs!

Nico
--