Re: [PATCH v2 00/13] PATH WALK II: Add --path-walk option to 'git pack-objects'

"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> This patch series does the following:
>
>  1. Add a new '--path-walk' option to 'git pack-objects' that uses the
>     path-walk API instead of the revision API to collect objects for delta
>     compression.
>
>  2. Add a new '--path-walk' option to 'git repack' to pass this option along
>     to 'git pack-objects'.
>
>  3. Add a new 'pack.usePathWalk' config option to opt into this option
>     implicitly, such as in 'git push'.
>
>  4. Optimize the '--path-walk' option using threading so it better competes
>     with the existing multi-threaded delta compression mechanism.
>
>  5. Update the path-walk API with a new 'edge_aggressive' option that
>     closely mirrors the --edge-aggressive option in the revision API. This
>     is useful when creating thin packs inside shallow clones.
>
> This feature works by using the path-walk API to emit groups of objects that
> appear at the same path. These groups are tracked so they can be tested for
> delta compression with each other, and then after those groups are tested a
> second pass using the name-hash attempts to find better (or first time)
> deltas across path boundaries. This second pass is much faster than a fresh
> pass since the existing deltas are used as a limit for the size of
> potentially new deltas, short-circuiting the checks when the delta size
> exceeds the current-best.
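
To make the two-pass idea above concrete, here is a rough toy sketch in
Python. It is not the code from the series: delta_size(), best_deltas(),
the mismatch-counting "delta" cost, the use of the basename as a stand-in
for the name-hash, and the window of 10 are all assumptions made up for
illustration. It only shows the shape of the technique: test objects at
the same path against each other first, then make a second pass across
path boundaries in which the best delta found so far acts as a size limit
that lets a candidate be rejected early.

```python
from collections import defaultdict
from os.path import basename


def delta_size(base, target, limit):
    """Toy delta cost: length difference plus mismatching byte positions.

    Returns None as soon as the cost reaches `limit` (the short-circuit).
    A real delta algorithm is far more involved; this is a stand-in.
    """
    size = abs(len(base) - len(target))
    if size >= limit:
        return None
    for a, b in zip(base, target):
        if a != b:
            size += 1
            if size >= limit:
                return None  # cannot beat the current best, stop early
    return size


def best_deltas(objects, window=10):
    """objects: list of (path, data) pairs.

    Returns {target_index: (base_index, delta_size)} for every object
    that found a delta smaller than its own full size.
    """
    best = {}

    def consider(group):
        # Try each object against the (up to `window`) objects before it.
        for pos, t in enumerate(group):
            for b in group[max(0, pos - window):pos]:
                # The existing delta, if any, bounds the new attempt.
                limit = best[t][1] if t in best else len(objects[t][1])
                size = delta_size(objects[b][1], objects[t][1], limit)
                if size is not None:
                    best[t] = (b, size)

    # Pass 1: objects that appear at exactly the same path.
    by_path = defaultdict(list)
    for i, (path, _) in enumerate(objects):
        by_path[path].append(i)
    for group in by_path.values():
        consider(group)

    # Pass 2: across path boundaries, ordered by a toy "name hash"
    # (hypothetical: just the basename, so similar names sort together).
    order = sorted(range(len(objects)),
                   key=lambda i: (basename(objects[i][0]), i))
    consider(order)
    return best
```

In this toy, a second pass over objects that already have a good delta
mostly hits the early return in delta_size(), which is the effect the
quoted paragraph describes.
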
> ...
> This feature was shipped with similar features in microsoft/git as of
> v2.47.0.vfs.0.3 [4]. This was used in CI machines for an internal monorepo
> that had significant repository growth due to constructing a batch of
> beachball [5] CHANGELOG.[md|json] files and pushing them to a release
> branch. These pushes were frequently 70-200 MB due to poor delta
> compression. Using the 'pack.usePathWalk=true' config, these pushes dropped
> in size by 100x while improving performance. Since these CI machines were
> working with a shallow clone, the 'edge_aggressive' changes were required to
> enable the path-walk option.
>
> [4] https://github.com/microsoft/git/releases/tag/v2.47.0.vfs.0.3
>
> [5] https://github.com/microsoft/beachball
>
>
> Updates in v2
> =============
>
>  * Re-added a dropped comment when moving code in patch 1.
>  * Updated documentation to include interaction with --use-bitmap-index.
>  * An UNUSED parameter is now used, reducing the use of global variables
>    slightly.

The iteration saw no comments from anybody, so I (naturally) forgot
about it for quite a long time.  Let me mark it for 'next'.

Thanks.



