"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > This patch series does the following: > > 1. Add a new '--path-walk' option to 'git pack-objects' that uses the > path-walk API instead of the revision API to collect objects for delta > compression. > > 2. Add a new '--path-walk' option to 'git repack' to pass this option along > to 'git pack-objects'. > > 3. Add a new 'pack.usePathWalk' config option to opt into this option > implicitly, such as in 'git push'. > > 4. Optimize the '--path-walk' option using threading so it better competes > with the existing multi-threaded delta compression mechanism. > > 5. Update the path-walk API with a new 'edge_aggressive' option that pairs > close to the --edge-aggressive option in the revision API. This is > useful when creating thin packs inside shallow clones. > > This feature works by using the path-walk API to emit groups of objects that > appear at the same path. These groups are tracked so they can be tested for > delta compression with each other, and then after those groups are tested a > second pass using the name-hash attempts to find better (or first time) > deltas across path boundaries. This second pass is much faster than a fresh > pass since the existing deltas are used as a limit for the size of > potentially new deltas, short-circuiting the checks when the delta size > exceeds the current-best. > ... > This feature was shipped with similar features in microsoft/git as of > v2.47.0.vfs.0.3 [4]. This was used in CI machines for an internal monorepo > that had significant repository growth due to constructing a batch of > beachball [5] CHANGELOG.[md|json] files and pushing them to a release > branch. These pushes were frequently 70-200 MB due to poor delta > compression. Using the 'pack.usePathWalk=true' config, these pushes dropped > in size by 100x while improving performance. 
Since these CI machines were > working with a shallow clone, the 'edge_aggressive' changes were required to > enable the path-walk option. > > [4] https://github.com/microsoft/git/releases/tag/v2.47.0.vfs.0.3 > > [5] https://github.com/microsoft/beachball > > > Updates in v2 > ============= > > * Re-added a dropped comment when moving code in patch 1. > * Updated documentation to include interaction with --use-bitmap-index. > * An UNUSED parameter is now used, reducing the use of global variables > slightly. The iteration saw no comments from anybody, so I (naturally) forgot about it for quite a long time. Let me mark it for 'next'. Thanks.
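
For readers who want a concrete picture of the two-pass scheme described in the quoted cover letter, here is a toy Python model. It is only a sketch under stated assumptions, not the code in the series: `toy_name_hash`, `toy_delta_size`, `try_window` and `path_walk_deltas` are hypothetical names, the "delta size" is a crude shared-prefix/suffix estimate, and the size limit is checked after the estimate is computed rather than aborting the computation early.

```python
from collections import defaultdict


def toy_name_hash(path):
    """Hypothetical stand-in for a name-hash: later characters dominate."""
    h = 0
    for ch in path:
        if ch.isspace():
            continue
        h = ((h >> 2) + (ord(ch) << 24)) & 0xFFFFFFFF
    return h


def toy_delta_size(base, target, limit):
    """Bytes of target outside a shared prefix/suffix; None if not below limit."""
    shortest = min(len(base), len(target))
    p = 0
    while p < shortest and base[p] == target[p]:
        p += 1
    s = 0
    while s < shortest - p and base[-1 - s] == target[-1 - s]:
        s += 1
    size = len(target) - p - s
    # Assumption: a real implementation would stop as soon as the limit is hit.
    return size if size < limit else None


def chain_contains(best, start, needle):
    """Follow the base chain from start; True if it reaches needle."""
    cur = start
    while cur is not None:
        if cur == needle:
            return True
        cur = best[cur][0]
    return False


def try_window(objs, best, window):
    """Try each object against up to `window` earlier objects in the list."""
    for i, (oid, _path, data) in enumerate(objs):
        for base_oid, _bpath, base_data in objs[max(0, i - window):i]:
            if chain_contains(best, base_oid, oid):
                continue  # using this base would create a delta cycle
            # The current best size is the limit a new delta must beat.
            size = toy_delta_size(base_data, data, best[oid][1])
            if size is not None:
                best[oid] = (base_oid, size)


def path_walk_deltas(objects, window=10):
    """objects: list of (oid, path, bytes). Returns {oid: (base_oid, size)}."""
    best = {oid: (None, len(data)) for oid, _path, data in objects}
    # Pass 1: only compare objects that appear at the same path.
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[1]].append(obj)
    for group in groups.values():
        try_window(group, best, window)
    # Pass 2: name-hash order, looking for better deltas across paths.
    by_hash = sorted(objects, key=lambda o: (toy_name_hash(o[1]), -len(o[2])))
    try_window(by_hash, best, window)
    return best


objects = [
    ("a1", "docs/CHANGELOG.md", b"# Changes\n- one\n"),
    ("a2", "docs/CHANGELOG.md", b"# Changes\n- one\n- two\n"),
    ("b1", "pkg/CHANGELOG.md", b"# Changes\n- one\n- two\n- three\n"),
]
best = path_walk_deltas(objects)
```

In this model the second pass is cheap for the reason the cover letter gives: every candidate is measured against the best size already recorded, so most comparisons are rejected immediately.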