Dear git list,
sparse-checkout interacts badly with symlinks within a git repository:
if b/file is a symlink to a/file, and the user asks for a
sparse-checkout with only b/, they get a dead link (b/file points to
nothing).
I initially assumed that replacing a file with a symlink to another file
with the same content would not be observable by other users of the
repository. This assumption is incorrect in the presence of sparse
checkouts.
I would find it natural to have sparse-checkout "follow symlinks". When
checking out b/file as the user requests, git would notice that it is a
symlink and do one of the following:
1. if the link target a/file is not in the specified sparse checkout
set, copy its content instead of creating a dead symlink
(Downside: this could lead to duplication if several in-checkout
files point to a/file.)
2. or add a/file to the sparse checkout set
(Note: simply checking it out silently is not enough as 'reapply'
would then drop it)
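
To make the detection step concrete, here is a rough Python sketch of what
I have in mind. It is not git code: the function name
symlink_targets_outside is made up for illustration, it treats the sparse
set as a plain list of directories (as in cone mode), and it assumes
POSIX-style link targets. It only reads the link text, so it also works
when the target is absent from the working tree.

```python
import os
import posixpath
import tempfile


def symlink_targets_outside(repo_root, sparse_dirs):
    """Map each symlink under sparse_dirs to its target when that target
    lies outside every directory in sparse_dirs.

    Paths are relative to repo_root and '/'-separated. Hypothetical helper,
    not part of git.
    """
    sparse = [d.strip("/") for d in sparse_dirs]

    def in_sparse_set(path):
        return any(path == d or path.startswith(d + "/") for d in sparse)

    outside = {}
    for d in sparse:
        # os.walk does not follow symlinks by default; links show up in
        # dirnames or filenames depending on what they point to.
        for dirpath, dirnames, filenames in os.walk(os.path.join(repo_root, d)):
            for name in dirnames + filenames:
                full = os.path.join(dirpath, name)
                if not os.path.islink(full):
                    continue
                rel_link = os.path.relpath(full, repo_root).replace(os.sep, "/")
                # Resolve the link text against the link's own directory,
                # purely textually (the target may not exist on disk).
                target = posixpath.normpath(
                    posixpath.join(posixpath.dirname(rel_link), os.readlink(full))
                )
                if posixpath.isabs(target) or target == ".." or target.startswith("../"):
                    continue  # leaves the repository; out of scope here
                if not in_sparse_set(target):
                    outside[rel_link] = target
    return outside


# Small demonstration on a throwaway directory: b/file -> ../a/file,
# with only b/ in the sparse set.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "b"))
    os.symlink("../a/file", os.path.join(root, "b", "file"))
    print(symlink_targets_outside(root, ["b"]))
```

Alternative 1 would copy the content of each reported target in place of
the link; alternative 2 would add each reported target (or its directory)
to the sparse checkout set.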
Does this sound reasonable to you? Would you have recommendations on
what the interface for such a feature should look like?
- which of the alternatives above would you recommend?
- should this be enabled only by a new configuration or command-line
option (to which subcommand?), how would you name it?
Thanks in advance
## More details on the use-case
I'm trying to reduce the working directory size of a gigabyte-large git
repository (https://github.com/typst/packages), which contains a
substantial number of duplicated files, by replacing duplicates with
symlinks. The repository's continuous integration script runs automated
tests on each proposed change, using sparse-checkout on only the
directories listed as containing modified files. (The directories
correspond to independent "packages" so it makes sense to check them
separately.) This breaks when the modified directories contain symlinks
to other, non-modified directories.
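
As a stopgap on the CI side, I could imagine something along these lines.
This is only a sketch under assumptions: it is run from the repository
root, the sparse checkout is in cone mode, and the helper names
(parse_ls_tree, symlinks_in, dirs_to_add) are mine. It relies on symlinks
being stored in trees with mode 120000 and on the blob content being the
link text, and reads them with `git ls-tree` and `git cat-file` so that
nothing needs to be checked out first.

```python
import posixpath
import subprocess


def parse_ls_tree(output):
    """Parse `git ls-tree -r -z` output into (mode, object_id, path) tuples."""
    entries = []
    for record in output.split("\0"):
        if not record:
            continue
        meta, path = record.split("\t", 1)
        mode, _type, object_id = meta.split(" ")
        entries.append((mode, object_id, path))
    return entries


def symlinks_in(rev, directory):
    """Return {link_path: link_text} for symlinks under `directory` at `rev`."""
    listing = subprocess.run(
        ["git", "ls-tree", "-r", "-z", rev, "--", directory],
        check=True, capture_output=True, text=True,
    ).stdout
    links = {}
    for mode, object_id, path in parse_ls_tree(listing):
        if mode == "120000":  # tree entry mode used for symlinks
            links[path] = subprocess.run(
                ["git", "cat-file", "blob", object_id],
                check=True, capture_output=True, text=True,
            ).stdout
    return links


def dirs_to_add(links, checked_dirs):
    """Directories holding link targets that fall outside checked_dirs."""
    checked = [d.strip("/") for d in checked_dirs]
    extra = set()
    for link_path, link_text in links.items():
        target = posixpath.normpath(
            posixpath.join(posixpath.dirname(link_path), link_text)
        )
        if posixpath.isabs(target) or target == ".." or target.startswith("../"):
            continue  # target is outside the repository
        if any(target == d or target.startswith(d + "/") for d in checked):
            continue  # already covered by the requested directories
        parent = posixpath.dirname(target)
        if parent:  # root-level files are always present in cone mode
            extra.add(parent)
    return sorted(extra)
```

The CI script would then pass the result of dirs_to_add to
`git sparse-checkout add`. Note that this does not follow chains of
symlinks; if a target is itself a symlink, the step would have to be
repeated on the newly added directories.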