From: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> Thanks to Taylor and Ben for their comments on V1. Changes since V1: - Patch 1: style fix and clarified commit message - Patch 4: reworded commit message and documentation Cover Letter for V1: This series fixes an overflow when running "git multi-pack-index repack" on an old raspberry pi and a couple of other small issues I noticed while reading the code. I'm unsure how realistic the example of integer overflow on 64 bit systems in patch 2 is. I'm happy to drop it if people who work with large repositories think its not worth worrying about. Base-Commit: cb96e1697ad6e54d11fc920c95f82977f8e438f8 Published-As: https://github.com/phillipwood/git/releases/tag/pw%2Fmidx-repack-overflow%2Fv2 View-Changes-At: https://github.com/phillipwood/git/compare/cb96e1697...a140181bd Fetch-It-Via: git fetch https://github.com/phillipwood/git pw/midx-repack-overflow/v2 Phillip Wood (4): midx repack: avoid integer overflow on 32 bit systems midx repack: avoid potential integer overflow on 64 bit systems midx: avoid negative array index midx docs: clarify tie breaking Documentation/git-multi-pack-index.adoc | 11 +++++++---- git-compat-util.h | 16 ++++++++++++++++ midx-write.c | 22 ++++++++++++++++------ 3 files changed, 39 insertions(+), 10 deletions(-) Range-diff against v1: 1: cbc5e69b908 ! 1: 9a1e6c81688 midx repack: avoid integer overflow on 32 bit systems @@ Commit message can overflow as both total_size and estimated_size can be greater that SIZE_MAX / 2. This is addressed by using saturating arithmetic for the - addition. + addition. Although estimated_size is of type uint64_t by the time we + reach this sum it is bounded by the batch size which is of type size_t + and so casting estimated_size to size_t does not truncate the value. Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> @@ midx-write.c: static void fill_included_packs_batch(struct repository *r, continue; - total_size += expected_size; -+ if (unsigned_add_overflows (total_size, (size_t)expected_size)) ++ if (unsigned_add_overflows(total_size, (size_t)expected_size)) + total_size = SIZE_MAX; + else + total_size += expected_size; 2: 9f07da4fe71 = 2: 54303d96c31 midx repack: avoid potential integer overflow on 64 bit systems 3: 688b0273604 = 3: 5b6cfb9d212 midx: avoid negative array index 4: 29769df1c60 ! 4: a140181bd57 midx docs: clarify tie breaking @@ Commit message midx docs: clarify tie breaking Clarify what happens when an object exists in more than one pack, but - not in the preferred pack. If the user does not pass a preferred pack - then the pack with the lowest mtime is chosen as the preferred pack. For - objects that are not in the preferred pack the pack with the highest - mtime is used. "git multi-pack-index repack" relies on this behavior. If - ties were resolved in favor of the oldest pack as the current - documentation suggests the multi-pack index would not reference any of - the objects in the pack created by "git multi-pack-index repack". + not in the preferred pack. "git multi-pack-index repack" relies on ties + for objects that are not in the preferred pack being resolved in favor + of the newest pack that contains a copy of the object. If ties were + resolved in favor of the oldest pack as the current documentation + suggests the multi-pack index would not reference any of the objects in + the pack created by "git multi-pack-index repack". + Helped-by: Taylor Blau <me@xxxxxxxxxxxx> Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> ## Documentation/git-multi-pack-index.adoc ## @@ Documentation/git-multi-pack-index.adoc: write:: + + + -- --preferred-pack=<pack>:: - Optionally specify the tie-breaking pack used when - multiple packs contain the same object. `<pack>` must +- Optionally specify the tie-breaking pack used when +- multiple packs contain the same object. `<pack>` must - contain at least one object. If not given, ties are - broken in favor of the pack with the lowest mtime. -+ contain at least one object. If not given the pack with -+ the lowest mtime is used as the preferred pack. Ties -+ for objects that are not contained in the preferred -+ are resolved in favor of the pack with the newest mtime. ++ When specified, break ties in favor of this pack when ++ there are additional copies of its objects in other ++ packs. Ties for objects not found in the preferred ++ pack are always resolved in favor of the copy in the ++ pack with the highest mtime. If unspecified, the pack ++ with the lowest mtime is used by default. The ++ preferred pack must have at least one object. --[no-]bitmap:: Control whether or not a multi-pack bitmap is written. -- 2.49.0.897.gfad3eb7d210