Hi Patrick and thank you for the feedback! o/
On Tue, 09 Sep 2025, Patrick Steinhardt <ps@xxxxxx> wrote:
On Mon, Sep 08, 2025 at 05:01:09PM +0300, Adrian Ratiu wrote:
This is in preparation for encoding the submodule names to
avoid conflicts like submodules named foo and foo/bar together
with case-insensitive filesystem handling and other corner
cases like reserved filenames on Windows. Backward
compatibility is kept with plain-name modules already existing
at paths like .git/modules/<name>, however a clear separation
between legacy (plain) and new (encoded) namespaces is
desirable, to avoid situations like an existing plain-name
module containing the encoding escape character. Thus we split
the new-style (encoded) gitdir name paths to .git/submodules,
while legacy-style paths remain under .git/modules. This is
just a default directory change with the accompanying test
updates, in preparation for the actual encoding additions in
future commits.
One of the questions here is how this move will affect alternate
implementations of Git, like libgit2, JGit or Gitoxide. There are
two angles to this:
- Git needs to handle that those implementations continue to write
  submodules into ".git/modules".
- These implementations need to be able to handle the new-style
  paths.
The first item should work just fine, as we make sure that we
handle both paths. But do the other implementations need any
adjustment? I guess the answer is "yes", so we need to treat
this as a backwards incompatible change as they wouldn't be able
to find the submodule repositories anymore, right?
That is correct and also applies to older versions of Git itself
which do not have this mechanism. Phillip Wood suggested we add an
extension like "extensions.submoduleEncoding" (name suggestions
welcome).
I'll do that in v3 of this series.
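To make the compatibility concern concrete, here is a minimal sketch of a lookup that accepts both layouts. The function name, its parameters and the order in which the two directories are checked are assumptions for illustration only; this is not how git-submodule is actually implemented.

```python
from pathlib import Path
from typing import Optional


def find_submodule_gitdir(superproject_gitdir: Path, encoded_name: str,
                          plain_name: str) -> Optional[Path]:
    """Return the existing gitdir of a submodule, or None if none is found.

    Hypothetical helper: checks the new-style namespace first, then the
    legacy one. The real lookup order is an assumption here.
    """
    candidates = [
        superproject_gitdir / "submodules" / encoded_name,  # new-style (encoded)
        superproject_gitdir / "modules" / plain_name,       # legacy (plain name)
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

An implementation that only knows about ".git/modules" never looks at the first candidate, which is why the new layout has to be treated as an incompatible change.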
Ideally, the way that submodules were populated was less
fragile. For example, we could have a "submodule.*.repoPath"
config key that gets populated whenever we clone a submodule. If
Git clients knew to use that field they wouldn't have to
second-guess where a previous Git client stored a specific
submodule, but they could just read that path and then use
whatever is stored therein. This would even allow for changes
like using a hash to encode the submodule name.
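As a rough illustration of the idea above: a client would read the stored path if the key is present and only compute a default otherwise. The "submodule.*.repoPath" key does not exist in Git today, and the plain mapping used here is a stand-in for a real config reader; both are assumptions.

```python
from typing import Mapping


def resolve_submodule_gitdir(config: Mapping[str, str], name: str,
                             default_path: str) -> str:
    """Prefer a recorded path over a computed default.

    `config` is a simplified stand-in for parsed Git config, and the
    "submodule.<name>.repoPath" key is hypothetical.
    """
    key = f"submodule.{name}.repoPath"
    return config.get(key, default_path)
```

With such a key, the client would not need to know how the stored path was derived, be it a plain name, an encoded name or a hash.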
Slight tangent (I'll respond to your point after this):
Junio asked to please keep the name human-readable and that's why
we use url-encoding which is also widely known and well
understood.
I guess we could add a config to change the name encoding or
hashing mechanism while keeping url-encoding as the
default. Likely in a later series because this one is big enough
now at 10 patches and keeps growing.
One of my Collabora colleagues even suggested they would like to
use a pattern like "hash_name" to get the best of both worlds.
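For illustration, here is a small sketch of both ideas using generic percent-encoding from Python's standard library. The exact set of characters escaped by the series may differ, and the "hash_name" layout (a truncated SHA-256 prefix, an underscore, then the encoded name) is only my guess at what that pattern could look like.

```python
import hashlib
from urllib.parse import quote


def encode_name(name: str) -> str:
    """Percent-encode every reserved character, including '/'."""
    return quote(name, safe="")


def hash_name(name: str) -> str:
    """Hypothetical "hash_name" pattern: short hash prefix plus readable name."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:12]
    return f"{digest}_{encode_name(name)}"


# "foo" and "foo/bar" no longer nest inside each other once encoded.
print(encode_name("foo"))      # foo
print(encode_name("foo/bar"))  # foo%2Fbar
```

Note that this generic encoding leaves letter case untouched, so on its own it does not separate names that differ only by case.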
But to the best of my knowledge such a key does not currently
exist, which is too bad (please correct me if I'm wrong, I'm
definitely not an expert when it comes to submodules).
No, it does not exist. I've added something a little bit similar
with the gitdir path config option in this series, however it is
only used to override default paths computed by git-submodule,
when necessary.
There is also a config clutter problem, if such a key were to be
added by default, since most submodules use default paths.
Phillip had the idea to only compute the path once, during the
initial submodule clone, then reuse it from the .git file inside
the submodule workdir in later actions, however that is not enough
for compatibility with other implementations or older versions.
So yes, to avoid user confusion, multi-implementation
interoperability problems and any risk of repo inconsistency,
I'll make it a breaking change.