Re: [PATCH] [RFC] rebase -m: partial support for copying extra commit headers

Hi Junio

On 08/04/2025 15:44, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@xxxxxxxxx> writes:
>
>> Thanks for sharing that, it is an interesting list. On the subject of
>> encoding I do think our documentation could be clearer that the
>> encoding applies to all the headers as well as the commit message. As
>> far as I can see it only mentions the commit message, not the author
>> or committer identities but repo_logmsg_reencode() re-encodes the
>> whole commit buffer. Out of interest do you think we could be doing a
>> better job with fsck to pick up some of these problems earlier?
>>
>> I think "git rebase" only cares that the author identity can be parsed
>> by split_ident() which is fairly lenient.
>
> "rebase" already knows that it has to be picky which header fields
> need to be propagated and which must not be, doesn't it?

I'd say it's picky because of the way it is implemented: it calls "git commit", and there is no way to set "extra" header fields when doing that.
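
For anyone following along, this is roughly what I mean by "extra" header fields. It is a minimal Python sketch, not git's actual code. It assumes the standard fields are tree, parent, author, committer and encoding, and treats every other field before the blank line as "extra". The function name split_commit(), the x-example field and the sample commit are all made up for illustration.

```python
# Fields assumed to be "standard"; everything else counts as "extra" here.
STANDARD_FIELDS = {"tree", "parent", "author", "committer", "encoding"}


def split_commit(raw: bytes):
    """Split a raw commit object into (standard, extra, message)."""
    # Headers end at the first blank line; the rest is the message.
    head, _, message = raw.partition(b"\n\n")
    standard, extra = [], []
    current = None
    for line in head.split(b"\n"):
        if line.startswith(b" ") and current is not None:
            # A leading space continues the previous header's value
            # (this is how multi-line values such as gpgsig are stored).
            current[1] += b"\n" + line[1:]
            continue
        key, _, value = line.partition(b" ")
        current = [key.decode("ascii"), value]
        if current[0] in STANDARD_FIELDS:
            standard.append(current)
        else:
            extra.append(current)
    return standard, extra, message


# Made-up commit object with one signature and one unknown field.
SAMPLE = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1700000000 +0000\n"
    b"committer C O Mitter <committer@example.com> 1700000100 +0000\n"
    b"gpgsig -----BEGIN PGP SIGNATURE-----\n"
    b" (made-up signature body)\n"
    b" -----END PGP SIGNATURE-----\n"
    b"x-example made-up value\n"
    b"\n"
    b"Subject line\n"
)

standard, extra, message = split_commit(SAMPLE)
print([key for key, _ in extra])  # ['gpgsig', 'x-example']
```

The fields in the second list are the ones that are lost when the new commit is created by calling "git commit".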

> Can the same be said for arbitrary "extra" header fields?
>
> Information on some of the header fields is inherently destroyed
> when you refine an existing commit.  The value on the 'parent'
> headers may need to be updated (unless "rebase" is fast-forwarding
> an earlier part of the changes on the same base); the 'author'
> information usually wants to be preserved, but when the scale of the
> change since the previous iteration is so large, you may give it a
> new authorship; the 'committer' information should record who
> created the new commit object that records the result of rebasing;
> the 'gpgsig' and 'gpgsig-sha256' header fields would lose validity
> if you are creating a new object that is different from the original
> by even a single bit (if we are somehow recording which predecessor
> commit the new one replaces, it certainly is safe to drop these that
> have lost validity, as we can go back to the predecessor to see it
> has a valid signature, and the change in the new commit that lost
> the signature fields is the moral equivalent of the original.
> Otherwise, carrying a stale signature may serve as a reminder that
> the commit was rewritten in the past---I dunno).  And so on.
>
> Now, one thing that worries me is this.  If "extra" commit headers
> include truly extra fields with unknown semantics, the machinery
> cannot tell which ones are safe and beneficial to propagate.

That's true, and we could have a config key to select which "extra" headers are propagated. We do, however, unconditionally propagate all "extra" headers when amending a commit with "git commit --amend" and when rewriting it with "git replay", which I think GitHub have been using to rebase branches for over 18 months. If we're worried about rebase unconditionally copying these headers, we should do something to stop "git replay" doing the same. On the other hand, if "git replay" is being used in the wild without problems, maybe we don't need to worry.
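
To make the config idea a bit more concrete, here is a rough Python sketch of the selection step. It is only an illustration under assumptions, not a proposal for the real implementation: the function name and the "allowed" parameter are hypothetical and stand in for whatever a config key would supply. Signature fields are always dropped in this sketch, following the point above that they cannot stay valid once the object changes.

```python
# Signature fields cannot remain valid for a rewritten commit.
SIGNATURE_FIELDS = {"gpgsig", "gpgsig-sha256"}


def select_extra_headers(extra, allowed=None):
    """Pick which "extra" headers to copy to the rewritten commit.

    extra   -- list of (name, value) pairs from the original commit
    allowed -- hypothetical allowlist from config; None means
               "copy every non-signature field"
    """
    kept = []
    for key, value in extra:
        if key in SIGNATURE_FIELDS:
            continue  # never carry a stale signature
        if allowed is not None and key not in allowed:
            continue  # not selected by the (hypothetical) config
        kept.append((key, value))
    return kept


# Made-up field names, for illustration only.
headers = [
    ("gpgsig", "(made-up signature)"),
    ("x-example", "made-up value"),
    ("x-other", "another made-up value"),
]

print(select_extra_headers(headers))
# [('x-example', 'made-up value'), ('x-other', 'another made-up value')]
print(select_extra_headers(headers, {"x-example"}))
# [('x-example', 'made-up value')]
```

With no allowlist this copies every unknown field; with one, rebase would copy only the fields the user has opted into.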

Best Wishes

Phillip




