Re: [PATCH v2 08/13] pack-objects: enable --path-walk via config

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 24, 2025 at 03:22:44PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@xxxxxxxxx>
>
> Users may want to enable the --path-walk option for 'git pack-objects' by
> default, especially underneath commands like 'git push' or 'git repack'.
>
> This should be limited to client repositories, since the --path-walk option
> disables bitmap walks, so would be bad to include in Git servers when
> serving fetches and clones. There is potential that it may be helpful to
> consider when repacking the repository, to take advantage of improved deltas
> across historical versions of the same files.
>
> Much like how "pack.useSparse" was introduced and included in
> "feature.experimental" before being enabled by default, use the repository
> settings infrastructure to make the new "pack.usePathWalk" config enabled by
> "feature.experimental" and "feature.manyFiles".
>
> Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx>
> ---
>  Documentation/config/feature.adoc | 4 ++++
>  Documentation/config/pack.adoc    | 8 ++++++++
>  builtin/pack-objects.c            | 3 +++
>  repo-settings.c                   | 3 +++
>  repo-settings.h                   | 1 +
>  5 files changed, 19 insertions(+)
>
> diff --git a/Documentation/config/feature.adoc b/Documentation/config/feature.adoc
> index f061b64b748..cb49ff2604a 100644
> --- a/Documentation/config/feature.adoc
> +++ b/Documentation/config/feature.adoc
> @@ -20,6 +20,10 @@ walking fewer objects.
>  +
>  * `pack.allowPackReuse=multi` may improve the time it takes to create a pack by
>  reusing objects from multiple packs instead of just one.
> ++
> +* `pack.usePathWalk` may speed up packfile creation and make the packfiles be
> +significantly smaller in the presence of certain filename collisions with Git's
> +default name-hash.
>
>  feature.manyFiles::
>  	Enable config options that optimize for repos with many files in the
> diff --git a/Documentation/config/pack.adoc b/Documentation/config/pack.adoc
> index da527377faf..08d06271177 100644
> --- a/Documentation/config/pack.adoc
> +++ b/Documentation/config/pack.adoc
> @@ -155,6 +155,14 @@ pack.useSparse::
>  	commits contain certain types of direct renames. Default is
>  	`true`.
>
> +pack.usePathWalk::
> +	When true, git will default to using the '--path-walk' option in
> +	'git pack-objects' when the '--revs' option is present. This
> +	algorithm groups objects by path to maximize the ability to
> +	compute delta chains across historical versions of the same
> +	object. This may disable other options, such as using bitmaps to
> +	enumerate objects.
> +

Same note here as in the previous patch, I think it's fine to refer
readers to the git-pack-objects[1] documentation instead of repeating
yourself here. (And it removes the risk of these three descriptions
falling out of sync with one another).

>  pack.preferBitmapTips::
>  	When selecting which commits will receive bitmaps, prefer a
>  	commit at the tip of any reference that is a suffix of any value
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index a6b8a78d42a..0ea85754c52 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -4652,6 +4652,9 @@ int cmd_pack_objects(int argc,
>  		if (use_bitmap_index > 0 ||
>  		    !use_internal_rev_list)
>  			path_walk = 0;
> +		else if (the_repository->gitdir &&
> +			 the_repository->settings.pack_use_path_walk)
> +			path_walk = 1;
>  		else
>  			path_walk = git_env_bool("GIT_TEST_PACK_PATH_WALK", 0);
>  	}

The limited diff context makes it hard for me to tell for sure, but this
takes place after git_config(), right? If so, I think we can avoid using
the repository settings machinery here and just use the config API
directly.

(FWIW, I typically think of repository settings as a way to expose
config information to some part of the codebase that doesn't otherwise
have easy access to, e.g., a static field that was set by a git_config()
callback).

Separate from the above: should this have a sanity test to makes sure
that we read the config setting correctly?

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux