Re: `git remote rename` does not work when `refs/remotes/server/HEAD` is unborn (when right after `git remote add -m`)

On Thu, Jul 24, 2025 at 06:45:36AM -0400, Jeff King wrote:
> On Thu, Jul 24, 2025 at 09:59:45PM +1200, Han Jiang wrote:
> 
> > What did you expect to happen? (Expected behavior)
> > 
> > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs
> > "refs/remotes/server/master";
> > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs
> > "refs/remotes/server2/master".
> > 
> > What happened instead? (Actual behavior)
> > 
> > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs
> > "refs/remotes/server/master";
> > `git symbolic-ref 'refs/remotes/server2/HEAD'` outputs "fatal: ref
> > refs/remotes/server2/HEAD is not a symbolic ref".
> > `git symbolic-ref 'refs/remotes/server/HEAD'` outputs
> > "refs/remotes/server/master".
> 
> Thanks for the report. I can reproduce the issue easily here. Probably a
> simpler reproduction is just:
> 
>   git init
>   git remote add -m whatever server1 /does/not/need/to/exist
>   git remote rename server1 server2
>   git symbolic-ref refs/remotes/server2/HEAD
> 
> The problem is that the branch-renaming code in git-remote is not
> prepared to handle symrefs that don't resolve. This seems to make it
> work:
> 
> diff --git a/builtin/remote.c b/builtin/remote.c
> index 5dd6cbbaee..478ea3a80c 100644
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -630,7 +630,9 @@ static int read_remote_branches(const char *refname, const char *referent UNUSED
>  	if (starts_with(refname, buf.buf)) {
>  		item = string_list_append(rename->remote_branches, refname);
>  		symref = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
> -						 refname, RESOLVE_REF_READING,
> +						 refname,
> +						 RESOLVE_REF_READING |
> +						 RESOLVE_REF_NO_RECURSE,
>  						 NULL, &flag);
>  		if (symref && (flag & REF_ISSYMREF)) {
>  			item->util = xstrdup(symref);
> @@ -835,8 +837,8 @@ static int mv(int argc, const char **argv, const char *prefix,
>  	 * First remove symrefs, then rename the rest, finally create
>  	 * the new symrefs.
>  	 */
> -	refs_for_each_ref(get_main_ref_store(the_repository),
> -			  read_remote_branches, &rename);
> +	refs_for_each_rawref(get_main_ref_store(the_repository),
> +			     read_remote_branches, &rename);
>  	if (show_progress) {
>  		/*
>  		 * Count symrefs twice, since "renaming" them is done by
> 
> That is, we need two fixes:
> 
>   1. When iterating over the refs, we need to cover _all_ refs, not just
>      those that fully resolve (there's a related bug here: we'll
>      silently ignore an actual broken or corrupt ref, whereas I think
>      the right thing would probably be to try copying it and then
>      complain loudly if we don't have the object).
> 
>   2. When resolving each one, we shouldn't recurse. We're doing a
>      shallow copy, not a deep one.
> 
> Reading this code, though, I can't help but think that the recent "git
> refs migrate" command had to deal with all of these problems. I wonder
> if we could reuse its code. +cc pks for wisdom.

I'm not sure whether we can easily reuse the code -- the use case is
quite different, as the migration works across two totally independent
refdbs. So all refs are recreated 1:1, without any renaming involved.

But it certainly seems to me like this whole logic could use quite some
love:

  - We create N+M*2 separate ref transactions, where N is the number of
    direct remote refs we need to migrate and M is the number of
    symbolic refs. This is bad with the "reftable" backend. With the
    "files" backend it is even quadratic in the worst case: the N
    transactions are all renames that have to delete the old ref, so
    we may have to rewrite the packed-refs file for each of them.

  - It is way too brittle, as the update isn't even pretending to be
    atomic. We first delete everything, and then we recreate it. So if
    any of these updates fails we'll be left in an in-between state.

  - We shouldn't even have to call `refs_resolve_ref_unsafe()` at all,
    as `read_remote_branches()` nowadays gets the referent as a
    parameter.

To demonstrate:

    $ git init --ref-format=files repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ cd repo/
    $ git commit --allow-empty -m initial
    [main (root-commit) 00c2622] x
    $ git remote add origin /dev/null
    $ for i in $(seq 100000); do printf "create refs/remotes/origin/branch-%d HEAD\n" $i; done | git update-ref --stdin
    $ git pack-refs --all
    $ time git remote rename origin renamed
    Renaming remote references:   0% (2216/100000)

I stopped after a minute -- this will take hours to complete.
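
As a rough back-of-the-envelope model of why this is so slow, here is a
small sketch. It does not call into Git at all. It assumes that every
ref is packed and that each rename drops one entry from packed-refs and
rewrites all the remaining ones; the function names are made up for
illustration only.

```shell
#!/bin/sh
# Toy cost model only; the names and the cost assumptions are hypothetical.
# $1 = number of direct remote refs (N), $2 = number of symrefs (M).

# One transaction per renamed direct ref, plus one delete and one
# create per symref: N + M*2.
transactions_now() {
	echo $(( $1 + 2 * $2 ))
}

# Assumed worst case for the "files" backend: the i-th rename rewrites
# the (N - i) entries still left in packed-refs, so the total number of
# entries written is N*(N-1)/2.
packed_entries_written() {
	echo $(( $1 * ($1 - 1) / 2 ))
}

transactions_now 100000 1        # prints 100002
packed_entries_written 100000    # prints 4999950000
```

Under these assumptions the 100000 refs above turn into roughly five
billion packed-refs entries written, which matches the "hours" estimate
in spirit if not in exact numbers.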

So I think we should adapt this logic to use a single transaction.
There's one catch, as `refs_rename_ref()` also migrates any reflogs that
exist. But with the recent infra that Karthik has added we can now also
migrate reflogs, so that's all doable.
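
To illustrate what "a single transaction" could look like from the
command line (this is not the proposed change to builtin/remote.c, just
a sketch), the helper below turns a ref listing into instructions for
`git update-ref --stdin`, which applies everything it reads atomically.
The function name and the input layout are assumptions of mine, and the
sketch deliberately skips symrefs and does not touch reflogs or the
remote's config.

```shell
#!/bin/sh
# Hypothetical helper: reads "<oid> <refname> [<symref-target>]" lines on
# stdin and prints create/delete instructions for `git update-ref --stdin`.
# $1 = old remote name, $2 = new remote name.
rename_instructions() {
	old=$1
	new=$2
	while read -r oid ref target; do
		# Symrefs are skipped; this sketch only handles direct refs.
		if [ -n "$target" ]; then
			continue
		fi
		# Ignore anything outside the old remote's namespace.
		case $ref in
		"refs/remotes/$old/"*) ;;
		*) continue ;;
		esac
		suffix=${ref#"refs/remotes/$old/"}
		printf 'create refs/remotes/%s/%s %s\n' "$new" "$suffix" "$oid"
		printf 'delete %s %s\n' "$ref" "$oid"
	done
}

# Demo with placeholder object IDs; the HEAD symref line is skipped.
rename_instructions origin renamed <<'EOF'
1111111111111111111111111111111111111111 refs/remotes/origin/main
1111111111111111111111111111111111111111 refs/remotes/origin/HEAD refs/remotes/origin/main
EOF
```

In a real repository the input could come from
`git for-each-ref --format='%(objectname) %(refname) %(symref)' refs/remotes/origin/`
and the output could be piped into `git update-ref --stdin`, so that
either all refs move or none do.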

Patrick



