"brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes: > When we define a new repository format with REPOSITORY_FORMAT_INIT, we > always use GIT_HASH_SHA1, and this value ends up getting used as the > default value to initialize a repository if none of the command line, > environment, or config tell us to do otherwise. > > Because we might not always want to use SHA-1 as the default, let's > instead specify the default hash algorithm constant so that we will use > whatever the specified default is. All of the above hints the use of _DEFAULT instead of _SHA1 but ... > However, to make sure we continue to > read repositories without a specified hash algorithm as SHA-1, default > the repository format to the original hash algorithm (SHA-1) when > reading the repository format. ... this explains why we may want to - expect that we would be able to read the hash from repository - read from repository - if the repository specifies the hash, happily use it - otherwise it is a lagacy repository so use the SHA-1 Is that what is going on here? Because I find some things that happens in the code somewhat questionable. > Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> > --- > setup.c | 5 ++++- > setup.h | 2 +- > 2 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/setup.c b/setup.c > index 641c857ed5..fb38824a2b 100644 > --- a/setup.c > +++ b/setup.c > @@ -835,9 +835,12 @@ static void init_repository_format(struct repository_format *format) > int read_repository_format(struct repository_format *format, const char *path) > { > clear_repository_format(format); > + format->hash_algo = GIT_HASH_ORIGINAL; If we expect we can read from the config, and otherwise we fall back to the hardcoded legacy SHA-1, do we need this assignment? We cleared and version is set to -1, and then we read from the config ... > git_config_from_file(check_repo_format, path, format); ... so if the file said anything about "extensions.objectformat", we would know it by now. If not, wouldn't version be left to -1 as our previous clal to clear_repository_format() set it via its call to init_repository_format()? Ahh, this code is prepared to handle a repository that claims to use version 1 but does not set extensions.objectformat at all. And in such a case, we do want to use SHA-1. OK, the above code makes sense for that case. > - if (format->version == -1) > + if (format->version == -1) { And if there is no core.repositoryformatversion set, we will come here. According to the comment before handle_extension_v0(), some extensions.* should still be honored even in such a repository, and the above call to git_config_from_file() should have handled them just fine. However, I do not understand why we clear all of what we read with another call to clear_repository_format() here. > clear_repository_format(format); > + format->hash_algo = GIT_HASH_ORIGINAL; > + } > return format->version; > } Admittedly, I find that the way how check_repo_format() does its thing somewhat questionable. Even though it reads into the .version member the value of core.repositoryformatversion, it slurps in all the extensions.* regardless of what .version the repository claims to be in. So even though there are two separate functions to handle "historical compatibility" handle_extension_v0() and other extensions, we still end up honoring extensions.objectformat in a repository that does not say what format version it uses. And clearing them with a call to clear_repository_format() may make sense, but do we want to clear things we read with handle_extension_v0() as well? > diff --git a/setup.h b/setup.h > index 18dc3b7368..8522fa8575 100644 > --- a/setup.h > +++ b/setup.h > @@ -149,7 +149,7 @@ struct repository_format { > { \ > .version = -1, \ > .is_bare = -1, \ > - .hash_algo = GIT_HASH_SHA1, \ > + .hash_algo = GIT_HASH_DEFAULT, \ > .ref_storage_format = REF_STORAGE_FORMAT_FILES, \ > .unknown_extensions = STRING_LIST_INIT_DUP, \ > .v1_only_extensions = STRING_LIST_INIT_DUP, \