On Wed, Apr 30, 2025 at 10:58:24AM +0200, Patrick Steinhardt wrote: > On Tue, Apr 29, 2025 at 01:48:18PM -0700, Junio C Hamano wrote: > > Patrick Steinhardt <ps@xxxxxx> writes: > > > > > @@ -213,6 +228,8 @@ int cmd_mv(int argc, > > > struct cache_entry *ce; > > > struct string_list only_match_skip_worktree = STRING_LIST_INIT_DUP; > > > struct string_list dirty_paths = STRING_LIST_INIT_DUP; > > > + struct hashmap moved_dirs = HASHMAP_INIT(pathmap_cmp, NULL); > > > + struct strbuf pathbuf = STRBUF_INIT; > > > int ret; > > > > > > git_config(git_default_config, NULL); > > > @@ -331,11 +348,17 @@ int cmd_mv(int argc, > > > > > > dir_check: > > > if (S_ISDIR(st.st_mode)) { > > > + struct pathmap_entry *entry; > > > char *dst_with_slash; > > > size_t dst_with_slash_len; > > > int j, n; > > > int first = index_name_pos(the_repository->index, src, length), last; > > > > > > + entry = xmalloc(sizeof(*entry)); > > > + entry->path = src; > > > + hashmap_entry_init(&entry->ent, fspathhash(src)); > > > + hashmap_add(&moved_dirs, &entry->ent); > > > + > > > > OK, this collects in moved_dirs the directories that will get moved. > > And then a separate loop, ... > > > > > + for (i = 0; i < argc; i++) { > > > + const char *slash_pos; > > > + > > > + strbuf_addstr(&pathbuf, sources.v[i]); > > > > Shouldn't there be a call to strbuf_reset(&pathbuf) before doing > > this? > > Yup, indeed. > > > > + slash_pos = strrchr(pathbuf.buf, '/'); > > > > And start from the deepest directory, going one level up per > > iteration, ... > > > > > + while (slash_pos > pathbuf.buf) { > > > + struct pathmap_entry needle; > > > + > > > + strbuf_setlen(&pathbuf, slash_pos - pathbuf.buf); > > > + > > > + needle.path = pathbuf.buf; > > > + hashmap_entry_init(&needle.ent, fspathhash(pathbuf.buf)); > > > > ... see if the path being moved falls within that subdirectory. > > Ah, there's another gotcha here: when moving a directory, we also add > all of its children to `argc`. So this would now always fail when we > move directories around. > > I guess we can handle this by introducing another `MOVE_VIA_PARENT_DIR` > mode -- we'd then skip the verification for any entry marked like this. > > > > + if (!hashmap_get_entry(&moved_dirs, &needle, ent, NULL)) > > > + continue; > > > > If there is no overlap, we need to do anything special. > > > > > + if (!ignore_errors) > > > + die(_("cannot move both parent directory '%s' and its child '%s'"), > > > + pathbuf.buf, sources.v[i]); > > > > Otherwise we are in trouble. > > > > > + if (--argc > 0) { > > > + int n = argc - i; > > > + strvec_remove(&sources, i); > > > + strvec_remove(&destinations, i); > > > + MOVE_ARRAY(modes + i, modes + i + 1, n); > > > + MOVE_ARRAY(submodule_gitfiles + i, > > > + submodule_gitfiles + i + 1, n); > > > + i--; > > > + break; > > > + } > > > > So with > > > > $ git mv a/ a/b x y z/ > > > > then a/ is left in the argv[]/sources[]/destinations[] arrays, and > > upon inspecting a/b, we come here and in order to ignore a/b, we > > shift it out; the resulting arrays would have a/, x, and y being > > moved to z/. > > > > It somehow feels troubling that it would lead to a different result > > if I give a morally equivalent arguments, i.e. > > > > $ git mv a/b a/ x y z/ > > > > where a/b survives and a/ gets omitted. > > Fully agreed. I was quite surprised to see that git-mv(1) already > behaves like this with a couple of other error conditions. So I simply > continued to build on top of this behaviour, but I'm not a fan of it at > all. > > Note that this behaviour doesn't trigger by default though. So your > above command would cause us to die without doing any change at all. You > explicitly have to `git mv -k` (whatever 'k' is supposed to mean -- > maybe "keep going"?) to opt into this weird behaviour. Which makes this > overall a bit less awful. > > > One thing that came to my mind (without concrete "here is the right > > way to solve it" that I am myself convinced) is this. > > > > * Should this code path even have its own ignore-errors handling? > > "git mv a b z/", when 'a' does not exist, may ignore 'a' and move > > only 'b', which may make sense. But the original command line in > > that case is a plausibly correct one if there weren't missing or > > unmovable paths. The command line "git mv a/ a/b z/" seems to > > fall into a different category (aka "total nonsense"); no matter > > how you fix the items in your working tree files, you cannot make > > it plausibly correct. > > Fair. I guess the intent of '-k' is about handling the case where a > subset of files might be missing, not the case where the original > request didn't make any sense at all. I certainly wouldn't mind to > tighten this code. > > > a totally unrelated tangent that made me scratch my head while > > reading the original ocde is the dest_paths variable. It is never > > used as a collection to hold potentially multiple paths; it is a > > strvec only to be able to call internel_prefix_pathspec() with, and > > used only once with only one element in the vector. At least it > > should lose the plural 's' suffix to unconfuse its readers, I would > > think. > > Yeah. From my point of view this isn't the only confusing part about > this code. > > Patrick > I have polished this patch a bit now and sent it via [1]. Thanks! Patrick [1]: <20250430-pks-mv-parent-child-conflict-v1-0-11a87c55ffb9@xxxxxx>