Here is a tiny reroll of my series to explore creating MIDXs while repacking that don't include the cruft pack. The only changes since last time are as follows: - removed a stray whitespace change in patch 2/9 caught by Junio and Elijah - reworded the final commit message based on helpful feedback from Elijah. The rest of the series is unchanged, and a range-diff is included below as usual for convenience. A meta-note on something new since last time: I have deployed a cherry-picked version of this series to GitHub's infrastructure a few weeks ago. The testing repository (whose maintenance failure many years ago precipitated commit ddee3703b3 (builtin/repack.c: add cruft packs to MIDX during geometric repack, 2022-05-20)) has been happily running maintenance tasks without any issues since then, and the cruft pack has been excluded from the MIDX. Thanks in advance for any review :-). Taylor Blau (9): pack-objects: use standard option incompatibility functions pack-objects: limit scope in 'add_object_entry_from_pack()' pack-objects: factor out handling '--stdin-packs' pack-objects: declare 'rev_info' for '--stdin-packs' earlier pack-objects: perform name-hash traversal for unpacked objects pack-objects: fix typo in 'show_object_pack_hint()' pack-objects: swap 'show_{object,commit}_pack_hint' pack-objects: introduce '--stdin-packs=follow' repack: exclude cruft pack(s) from the MIDX where possible Documentation/config/repack.adoc | 7 + Documentation/git-pack-objects.adoc | 10 +- builtin/pack-objects.c | 190 ++++++++++++++++++---------- builtin/repack.c | 163 +++++++++++++++++++++--- t/t5331-pack-objects-stdin.sh | 84 +++++++++++- t/t7704-repack-cruft.sh | 90 +++++++++++++ 6 files changed, 455 insertions(+), 89 deletions(-) Range-diff against v3: 1: 986bef29b5 ! 1: f8b31c6a8d pack-objects: limit scope in 'add_object_entry_from_pack()' @@ Metadata Author: Taylor Blau <me@xxxxxxxxxxxx> ## Commit message ## - pack-objects: limit scope in 'add_object_entry_from_pack()' + pack-objects: use standard option incompatibility functions - In add_object_entry_from_pack() we declare 'revs' (given to us through - the miscellaneous context argument) earlier in the "if (p)" conditional - than is necessary. Move it down as far as it can go to reduce its - scope. + pack-objects has a handful of explicit checks for pairs of command-line + options which are mutually incompatible. Many of these pre-date + a699367bb8 (i18n: factorize more 'incompatible options' messages, + 2022-01-31). + + Convert the explicit checks into die_for_incompatible_opt2() calls, + which simplifies the implementation and standardizes pack-objects' + output when given incompatible options (e.g., --stdin-packs with + --filter gives different output than --keep-unreachable with + --unpack-unreachable). + + There is one minor piece of test fallout in t5331 that expects the old + format, which has been corrected. Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> ## builtin/pack-objects.c ## -@@ builtin/pack-objects.c: static int add_object_entry_from_pack(const struct object_id *oid, - return 0; +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc, + strvec_push(&rp, "--unpacked"); + } - if (p) { -- struct rev_info *revs = _data; - struct object_info oi = OBJECT_INFO_INIT; -- - oi.typep = &type; +- if (exclude_promisor_objects && exclude_promisor_objects_best_effort) +- die(_("options '%s' and '%s' cannot be used together"), +- "--exclude-promisor-objects", "--exclude-promisor-objects-best-effort"); ++ die_for_incompatible_opt2(exclude_promisor_objects, ++ "--exclude-promisor-objects", ++ exclude_promisor_objects_best_effort, ++ "--exclude-promisor-objects-best-effort"); + if (exclude_promisor_objects) { + use_internal_rev_list = 1; + fetch_if_missing = 0; +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc, + if (!pack_to_stdout && thin) + die(_("--thin cannot be used to build an indexable pack")); + +- if (keep_unreachable && unpack_unreachable) +- die(_("options '%s' and '%s' cannot be used together"), "--keep-unreachable", "--unpack-unreachable"); ++ die_for_incompatible_opt2(keep_unreachable, "--keep-unreachable", ++ unpack_unreachable, "--unpack-unreachable"); + if (!rev_list_all || !rev_list_reflog || !rev_list_index) + unpack_unreachable_expiration = 0; + +- if (stdin_packs && filter_options.choice) +- die(_("cannot use --filter with --stdin-packs")); ++ die_for_incompatible_opt2(stdin_packs, "--stdin-packs", ++ filter_options.choice, "--filter"); + - if (packed_object_info(the_repository, p, ofs, &oi) < 0) { - die(_("could not get type of object %s in pack %s"), - oid_to_hex(oid), p->pack_name); - } else if (type == OBJ_COMMIT) { -+ struct rev_info *revs = _data; - /* - * commits in included packs are used as starting points for the - * subsequent revision walk + + if (stdin_packs && use_internal_rev_list) + die(_("cannot use internal rev list with --stdin-packs")); +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc, + if (cruft) { + if (use_internal_rev_list) + die(_("cannot use internal rev list with --cruft")); +- if (stdin_packs) +- die(_("cannot use --stdin-packs with --cruft")); ++ die_for_incompatible_opt2(stdin_packs, "--stdin-packs", ++ cruft, "--cruft"); + } + + /* + + ## t/t5331-pack-objects-stdin.sh ## +@@ t/t5331-pack-objects-stdin.sh: test_expect_success '--stdin-packs is incompatible with --filter' ' + cd stdin-packs && + test_must_fail git pack-objects --stdin-packs --stdout \ + --filter=blob:none </dev/null 2>err && +- test_grep "cannot use --filter with --stdin-packs" err ++ test_grep "options .--stdin-packs. and .--filter. cannot be used together" err + ) + ' + -: ---------- > 2: 2753e29648 pack-objects: limit scope in 'add_object_entry_from_pack()' 2: 6f8fe8a4e1 = 3: 32b49d9073 pack-objects: factor out handling '--stdin-packs' 3: 2a235461a6 = 4: a797ff3a83 pack-objects: declare 'rev_info' for '--stdin-packs' earlier 4: 240e90b68d = 5: 29bf05633a pack-objects: perform name-hash traversal for unpacked objects 5: 9a18fa2e52 = 6: 0696fa1736 pack-objects: fix typo in 'show_object_pack_hint()' 6: 6c997853f1 = 7: 1cc45b4472 pack-objects: swap 'show_{object,commit}_pack_hint' 7: 0ff699f056 = 8: 3e3d929bd0 pack-objects: introduce '--stdin-packs=follow' 8: 58891101f3 ! 9: 52a069ef48 repack: exclude cruft pack(s) from the MIDX where possible @@ Commit message MIDX with '--write-midx' to ensure that the resulting MIDX was always closed under reachability in order to generate reachability bitmaps. - Suppose (prior to this patch) you have a once-unreachable object packed - in a cruft pack, which later on becomes reachable from one or more - objects in a geometrically repacked pack. That once-unreachable object - *won't* appear in the new pack, since the cruft pack was specified as - neither included nor excluded to 'pack-objects --stdin-packs'. If the + While the previous patch added the '--stdin-packs=follow' option to + pack-objects, it is not yet on by default. Given that, suppose you have + a once-unreachable object packed in a cruft pack, which later becomes + reachable from one or more objects in a geometrically repacked pack. + That once-unreachable object *won't* appear in the new pack, since the + cruft pack was not specified as included or excluded when the + geometrically repacked pack was created with 'pack-objects + --stdin-packs' (*not* '--stdin-packs=follow', which is not on). If that new pack is included in a MIDX without the cruft pack, then trying to generate bitmaps for that MIDX may fail. This happens when the bitmap selection process picks one or more commits which reach the - once-unreachable objects, commit ddee3703b3 ensures that the MIDX will - be closed under reachability. Without it, we would fail to generate a - MIDX bitmap. + once-unreachable objects. + To mitigate this failure mode, commit ddee3703b3 ensures that the MIDX + will be closed under reachability by including cruft pack(s). If cruft + pack(s) were not included, we would fail to generate a MIDX bitmap. But ddee3703b3 alludes to the fact that this is sub-optimal by saying [...] it's desirable to avoid including cruft packs in the MIDX base-commit: 485f5f863615e670fd97ae40af744e14072cfe18 -- 2.49.0.640.ga4de40e6a8