Patrick Steinhardt <ps@xxxxxx> writes:

> The "reftable" format has come a long way and has matured nicely since
> it has been merged into git via 57db2a094d5 (refs: introduce reftable
> backend, 2024-02-07). It fixes longstanding issues that cannot be fixed
> with the "files" format in a backwards-compatible way and performs
> significantly better in many use cases.
>
> Announce that we will switch to the "reftable" format in Git 3.0 for
> newly created repositories.
>

Nit: This commit does more than announce the switch. It also adds the
changes to use reftable when WITH_BREAKING_CHANGES is set. It would be
nice to call that out here.

> This switch is dependent on support in the larger Git ecosystem. Most
> importantly, libraries like JGit, libgit2 and Gitoxide should support
> the reftable backend so that we don't break all applications and tools
> built on top of those libraries.
>
> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  Documentation/BreakingChanges.adoc | 44 ++++++++++++++++++++++++++++++++++++++
>  help.c                             |  2 ++
>  repository.h                       |  6 ++++++
>  setup.c                            |  2 ++
>  t/t0001-init.sh                    | 11 ++++++++++
>  5 files changed, 65 insertions(+)
>
> diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
> index c6bd94986c5..614debcd740 100644
> --- a/Documentation/BreakingChanges.adoc
> +++ b/Documentation/BreakingChanges.adoc
> @@ -118,6 +118,50 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@xxxxxxxxxxx>,
>  <20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
>  <CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@xxxxxxxxxxxxxx>.
>
> +* The default storage format for references in newly created repositories will
> +  be changed from "files" to "reftable". The "reftable" format provides
> +  multiple advantages over the "files" format:
> ++
> +  ** It is impossible to store two references that only differ in casing on
> +     case-insensitive filesystems with the "files" format. This issue is common
> +     on Windows and macOS platforms. As the "reftable" backend does not use
> +     filesystem paths anymore to encode reference names this problem goes away.

Nit: s/anymore// makes it clearer, since reftable never used filesystem
paths.

> +  ** Similarly, macOS normalizes path names that contain unicode characters,
> +     which has the consequence that you cannot store two names with unicode
> +     characters that are encoded differently with the "files" backend. Again,
> +     this is not an issue with the "reftable" backend.
> +  ** Deleting references with the "files" backend requires Git to rewrite the
> +     complete "packed-refs" file. In large repositories with many references
> +     this file can easily be dozens of megabytes in size, in extreme cases it
> +     may be gigabytes. The "reftable" backend uses tombstone markers for
> +     deleted references and thus does not have to rewrite all of its data.
> +  ** Repository housekeeping with the "files" backend typically performs
> +     all-into-one repacks of references. This can be quite expensive, and
> +     consequently housekeeping is a tradeoff between the number of loose
> +     references that accumulate and slow down operations that read references,
> +     and compressing those loose references into the "packed-refs" file. The
> +     "reftable" backend uses geometric compaction after every write, which
> +     amortizes costs and ensures that the backend is always in a
> +     well-maintained state.
> +  ** Operations that write multiple references at once are not atomic with the
> +     "files" backend. Consequently, Git may see in-between states when it reads
> +     references while a reference transaction is in the process of being
> +     committed to disk.
> +  ** Writing many references at once is slow with the "files" backend because
> +     every reference is created as a separate file. The "reftable" backend
> +     significantly outperforms the "files" backend by multiple orders of
> +     magnitude.

Perhaps also add something about how reftable uses a binary format and
could save storage space.
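
As an aside on the "geometric compaction after every write" bullet
quoted above, here is a toy model of the idea. It is only a sketch and
not the heuristic the reftable library actually implements: the
function name sketch_add_and_compact() and the `factor` parameter are
made up for illustration, and tables are reduced to their sizes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, NOT part of the reftable library.
 *
 * `sizes` holds the sizes of the existing tables, oldest first, and
 * must have room for at least `n + 1` entries. The new table is
 * appended, then the newest table is merged into its older neighbour
 * for as long as that neighbour is smaller than `factor` times the
 * newest table. The sizes that remain therefore grow by at least
 * `factor` from newest to oldest. Overflow of the multiplication is
 * ignored in this toy model.
 *
 * Returns the number of tables left after compaction.
 */
size_t sketch_add_and_compact(uint64_t *sizes, size_t n,
			      uint64_t new_size, uint64_t factor)
{
	sizes[n++] = new_size;

	while (n > 1 && sizes[n - 2] < factor * sizes[n - 1]) {
		/* Merge the newest table into the one before it. */
		sizes[n - 2] += sizes[n - 1];
		n--;
	}

	return n;
}
```

With equally sized writes and a factor of 2 this behaves like a binary
counter: most writes merge only a few small tables, and a large merge
happens rarely, which is where the amortized cost comes from.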

> ++
> +Users that get immediate benefit from the "reftable" backend could continue to
> +opt-in to the "reftable" format manually by setting the "init.defaultRefFormat"
> +config. But defaults matter, and we think that overall users will have a better
> +experience with less platform-specific quirks when they use the new backend by
> +default.
> ++
> +A prerequisite for this change is that the ecosystem is ready to support the
> +"reftable" format. Most importantly, alternative implementations of Git like
> +JGit, libgit2 and Gitoxide need to support it.
> +
>  === Removals
>
>  * Support for grafting commits has long been superseded by git-replace(1).
> diff --git a/help.c b/help.c
> index 21b778707a6..89cd47e3b86 100644
> --- a/help.c
> +++ b/help.c
> @@ -810,6 +810,8 @@ void get_version_info(struct strbuf *buf, int show_build_options)
>  			    SHA1_UNSAFE_BACKEND);
>  #endif
>  		strbuf_addf(buf, "SHA-256: %s\n", SHA256_BACKEND);
> +		strbuf_addf(buf, "default-ref-format: %s\n",
> +			    ref_storage_format_to_name(REF_STORAGE_FORMAT_DEFAULT));
>  	}
>  }
>
> diff --git a/repository.h b/repository.h
> index c4c92b2ab9c..77c4189d5dc 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -20,6 +20,12 @@ enum ref_storage_format {
>  	REF_STORAGE_FORMAT_REFTABLE,
>  };
>
> +#ifdef WITH_BREAKING_CHANGES /* Git 3.0 */
> +# define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_REFTABLE
> +#else
> +# define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_FILES
> +#endif
> +

Okay this makes sense.

>  struct repo_path_cache {
>  	char *squash_msg;
>  	char *merge_msg;
> diff --git a/setup.c b/setup.c
> index f93bd6a24a5..f0c06c655a9 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -2541,6 +2541,8 @@ static void repository_format_configure(struct repository_format *repo_fmt,
>  		repo_fmt->ref_storage_format = ref_format;
>  	} else if (cfg.ref_format != REF_STORAGE_FORMAT_UNKNOWN) {
>  		repo_fmt->ref_storage_format = cfg.ref_format;
> +	} else {
> +		repo_fmt->ref_storage_format = REF_STORAGE_FORMAT_DEFAULT;
>  	}
>  	repo_set_ref_storage_format(the_repository, repo_fmt->ref_storage_format);
>  }

Shouldn't this change instead be made to REPOSITORY_FORMAT_INIT?

diff --git a/setup.h b/setup.h
index 18dc3b7368..c1b765043f 100644
--- a/setup.h
+++ b/setup.h
@@ -150,7 +150,7 @@ struct repository_format {
 	.version = -1, \
 	.is_bare = -1, \
 	.hash_algo = GIT_HASH_SHA1, \
-	.ref_storage_format = REF_STORAGE_FORMAT_FILES, \
+	.ref_storage_format = REF_STORAGE_FORMAT_DEFAULT, \
 	.unknown_extensions = STRING_LIST_INIT_DUP, \
 	.v1_only_extensions = STRING_LIST_INIT_DUP, \
 }

> diff --git a/t/t0001-init.sh b/t/t0001-init.sh
> index f11a40811f2..186664162fc 100755
> --- a/t/t0001-init.sh
> +++ b/t/t0001-init.sh
> @@ -658,6 +658,17 @@ test_expect_success 'init warns about invalid init.defaultRefFormat' '
>  	test_cmp expected actual
>  '
>
> +test_expect_success 'default ref format' '
> +	test_when_finished "rm -rf refformat" &&
> +	(
> +		sane_unset GIT_DEFAULT_REF_FORMAT &&
> +		git init refformat
> +	) &&
> +	git version --build-options | sed -ne "s/^default-ref-format: //p" >expect &&
> +	git -C refformat rev-parse --show-ref-format >actual &&
> +	test_cmp expect actual
> +'
> +
>  backends="files reftable"
>  for format in $backends
>  do
>
> --
> 2.50.0.195.g74e6fc65d0.dirty