On Wed, Apr 23, 2025 at 02:58:49PM +0200, Toon Claes wrote:
> Nico Williams <nico@xxxxxxxxxxxxxxxx> writes:
>
> At GitLab we keep track of the commit IDs a branch has been (maybe only
> if there is a Merge Request for that branch, I'm not sure). [...]

Do you mean "we keep track of the commit _hashes_ a branch has _seen_"?

But it can't be commit hashes, and there's no commit IDs, so GL could be
assigning synthetic, internal commit IDs based on commit similarity,
which proves Junio's and Theodore's point that similarity checking can
be enough.

> > The point is that GL demonstrates that these things can be done. And I
> > don't see how a change ID would have helped GL much except in cases
> > where one re-does all the commits with different subject lines etc, but
> > leaves the actual patches mostly the same. Now it does happen that I
> > split and squash commits, but it's rare that I completely redo them.
>
> That's because GL stores history about a branch ref (outside the Git
> object/ref database). If you don't do that, you can't. Having a
> Change-Id embedded in the commit, retains that information in Git's DB.

I.e., GL has an internal reflog on the server side.

I've sometimes wished that I could push and fetch reflogs (or subsets
thereof anyways). When doing code reviews I use [local, obv.] reflogs
to see the diffs between an earlier version of a branch that I fetched
and reviewed earlier and the latest that I just fetched and am
reviewing, and generally I don't need to see any other versions I never
fetched, but occasionally I've wished I could fetch those other
versions, but since there are no server-side refs for them, I can't.
[Or maybe I'm about to learn of some feature I didn't know about :)]

I agree that change IDs / commit IDs in commit headers can help one keep
track of versions of a branch w/o a server-side reflog, but how would
you keep track of their chronology?
I.e., how do you know which is version 1, which is version 2, ..., and
which is version N-1? (Version N being the head of the branch.) If you
don't index these then finding them is a full table scan, and if you
index them then you've implemented a server-side reflog.

Which makes me think that all that's needed for a good CR tool here is
a) a server-side reflog, b) similarity checking for commits. (a)
doesn't seem like a radical idea (that can be implemented with server
side hooks), and (b) is also not radical given that file rename / copy
operations are detected by Git using similarity checking already.
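
To make (a) and (b) a bit more concrete, here is a minimal sketch in
Python. It is not how GitLab or Git does any of this. The names RefLog,
similarity and match_commits are made up for illustration, the log
lives in memory rather than on disk, and difflib is only a stand-in for
Git's own similarity heuristic. The one thing taken from Git itself is
the post-receive hook input format, one "<old> <new> <ref>" line per
updated ref.

```python
import difflib
from collections import defaultdict


class RefLog:
    """Hypothetical server-side reflog: ordered tips per ref (in memory)."""

    def __init__(self):
        self._versions = defaultdict(list)  # ref name -> [commit hash, ...]

    def record(self, hook_stdin):
        """Consume post-receive input: one "<old> <new> <ref>" line per ref."""
        for line in hook_stdin.splitlines():
            parts = line.split()
            if len(parts) != 3:
                continue  # not a well-formed update line
            _old, new, ref = parts
            if set(new) == {"0"}:
                continue  # all-zero new value: ref deleted (not logged here)
            self._versions[ref].append(new)

    def count(self, ref):
        """Number of versions seen for ref (version N is the current head)."""
        return len(self._versions[ref])

    def version(self, ref, n):
        """Return version n of ref, 1-based, in the order pushes arrived."""
        if n < 1:
            raise IndexError("versions are numbered from 1")
        return self._versions[ref][n - 1]


def similarity(patch_a, patch_b):
    """Score two patch texts in [0, 1], comparing them line by line."""
    matcher = difflib.SequenceMatcher(
        None, patch_a.splitlines(), patch_b.splitlines()
    )
    return matcher.ratio()


def match_commits(old_patches, new_patches, threshold=0.5):
    """Map each new commit to its most similar old commit, if any.

    Both arguments map a commit hash to that commit's patch text.
    The threshold is an arbitrary assumption, not a value Git uses.
    """
    pairs = {}
    for new_id, new_patch in new_patches.items():
        best = max(
            old_patches,
            key=lambda old_id: similarity(old_patches[old_id], new_patch),
            default=None,
        )
        if best is None:
            continue
        if similarity(old_patches[best], new_patch) >= threshold:
            pairs[new_id] = best
    return pairs
```

With something like this, "which is version 1, which is version 2" is
an index lookup instead of a full scan, and pairing commits across two
versions of a branch does not need an ID stored in the commit itself.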