Re: [PATCH 1/2] BreakingChanges: announce switch to "reftable" format

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt <ps@xxxxxx> writes:

> diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
> index c6bd94986c5..c96b5319cdd 100644
> --- a/Documentation/BreakingChanges.adoc
> +++ b/Documentation/BreakingChanges.adoc
> @@ -118,6 +118,45 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@xxxxxxxxxxx>,
>  <20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
>  <CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@xxxxxxxxxxxxxx>.
>  
> +* The default storage format for references in newly created repositories will
> +  be changed from "files" to "reftable". The "reftable" format provides
> +  multiple advantages over the "files" format:
> ++
> +  ** It is impossible to store two references that only differ in casing on
> ...
> +  ** Writing many references at once is slow with the "files" backend because
> +     every reference is created as a separate file. The "reftable" backend
> +     significantly outperforms the "files" backend by multiple orders of
> +     magnitude.

These list benefits of using "reftable".  Can we also add one point
that stresses why we want to make it the default?  Something like
"Having to do X once per user to make them opt-in is too cumbersome"
is probably good enough.

> +A prerequisite for this change is that the ecosystem is ready to support the
> +"reftable" format. Most importantly, alternative implementations of Git like
> +JGit, libgit2 and Gitoxide need to support it.

... in order for them to access the same repository.

How common is it to use a single repository from these multiple
implementations these days, I have to wonder?

> diff --git a/setup.c b/setup.c
> index f93bd6a24a5..3ab0f11fbfd 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -2541,6 +2541,12 @@ static void repository_format_configure(struct repository_format *repo_fmt,
>  			repo_fmt->ref_storage_format = ref_format;
>  	} else if (cfg.ref_format != REF_STORAGE_FORMAT_UNKNOWN) {
>  		repo_fmt->ref_storage_format = cfg.ref_format;
> +	} else {
> +#ifdef WITH_BREAKING_CHANGES
> +		repo_fmt->ref_storage_format = REF_STORAGE_FORMAT_REFTABLE;
> +#else
> +		repo_fmt->ref_storage_format = REF_STORAGE_FORMAT_FILES;
> +#endif
>  	}
>  	repo_set_ref_storage_format(the_repository, repo_fmt->ref_storage_format);
>  }

That's obvious one.  I think the approach taken by brian's SHA-256
topic would have introduced REF_STORAGE_FORMAT_DEFAULT and did the
build-time switching between the two in a single conditional
definition

        #ifndef WITH_BREAKING_CHANGES /* 3.0 */
        # define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_FILES
        #else
        # define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_REFTABLE
        #endif

somewhere in a header file.  Either way would work, but I wonder if
these breaking-changes definitions are collected together into a
single header file (say <bc.h>), it may make the transition at 3.0
version boundary simpler and less error-prone.  We can just discard
selected conditionals into unconditional definition more easily.
For example if we moved the default flip between SHA-1 and SHA-256,
i.e.

	#ifndef WITH_BREAKING_CHANGES /* 3.0 */
	# define GIT_HASH_DEFAULT GIT_HASH_SHA1
	#else
	# define GIT_HASH_DEFAULT GIT_HASH_SHA256
	#endif

out of hash.h and have it next to the above REF_STORAGE_FORMAT_DEFAULT
definition, and then in a subsystem specific header file, after
including <bc.h>, can say

	=== In hash.h ===
	#include <bc.h>
	#ifndef GIT_HASH_DEFAULT
	# define GIT_HASH_DEFAULT GIT_HASH_SHA256
	#endif

	=== In refs.h ===
	#include <bc.h>
	#ifndef REF_STORAGE_FORMAT_DEFAULT
        # define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_REFTABLE
	#endif

If some reason making reftable backend the default when unspecified
turns out to be a bit premature at 3.0 boundary while the world is
ready for SHA-256 by default for new repositories, then we can tweak
that single header file like so:

        -#ifndef WITH_BREAKING_CHANGES /* 3.0 */
        +#ifndef WITH_BREAKING_CHANGES /* 4.0? */
         # define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_FILES
         #else
         # define REF_STORAGE_FORMAT_DEFAULT REF_STORAGE_FORMAT_REFTABLE
         #endif

	-#ifndef WITH_BREAKING_CHANGES
	-# define GIT_HASH_DEFAULT GIT_HASH_SHA1
	-#else
	-# define GIT_HASH_DEFAULT GIT_HASH_SHA256
	-#endif

and optionally change the "if default is not set, use 256" in <hash.h>
to "unconditionally use 256 as the default", but forgetting to do so
would not break anything, which makes the process less error prone.

By doing something like this, we'll have a single place <bc.h> to
see what are being planned, and we can "git log that-header-file" to
see how our thinking has evolved over time.  Hopefully we do not
have to keep too many entries in that file and can retire the
conditionals as we plan ahead.

> diff --git a/t/t0001-init.sh b/t/t0001-init.sh
> index f11a40811f2..e0f27484192 100755
> --- a/t/t0001-init.sh
> +++ b/t/t0001-init.sh
> @@ -658,6 +658,22 @@ test_expect_success 'init warns about invalid init.defaultRefFormat' '
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'default ref format' '
> +	test_when_finished "rm -rf refformat" &&
> +	(
> +		sane_unset GIT_DEFAULT_REF_FORMAT &&
> +		git init refformat
> +	) &&
> +	if test_have_prereq WITH_BREAKING_CHANGES
> +	then
> +		echo reftable >expect
> +	else
> +		echo files >expect
> +	fi &&
> +	git -C refformat rev-parse --show-ref-format >actual &&
> +	test_cmp expect actual
> +'

Obvious ;-)

Thanks.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux