[PATCH v4 0/9] repack: avoid MIDX'ing cruft pack(s) where possible

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Here is a tiny reroll of my series to explore creating MIDXs while
repacking that don't include the cruft pack.

The only changes since last time are as follows:

 - removed a stray whitespace change in patch 2/9 caught by Junio and
   Elijah

 - reworded the final commit message based on helpful feedback from
   Elijah.

The rest of the series is unchanged, and a range-diff is included below
as usual for convenience.

A meta-note on something new since last time: I have deployed a
cherry-picked version of this series to GitHub's infrastructure a few
weeks ago. The testing repository (whose maintenance failure many years
ago precipitated commit ddee3703b3 (builtin/repack.c: add cruft packs to
MIDX during geometric repack, 2022-05-20)) has been happily running
maintenance tasks without any issues since then, and the cruft pack has
been excluded from the MIDX.

Thanks in advance for any review :-).

Taylor Blau (9):
  pack-objects: use standard option incompatibility functions
  pack-objects: limit scope in 'add_object_entry_from_pack()'
  pack-objects: factor out handling '--stdin-packs'
  pack-objects: declare 'rev_info' for '--stdin-packs' earlier
  pack-objects: perform name-hash traversal for unpacked objects
  pack-objects: fix typo in 'show_object_pack_hint()'
  pack-objects: swap 'show_{object,commit}_pack_hint'
  pack-objects: introduce '--stdin-packs=follow'
  repack: exclude cruft pack(s) from the MIDX where possible

 Documentation/config/repack.adoc    |   7 +
 Documentation/git-pack-objects.adoc |  10 +-
 builtin/pack-objects.c              | 190 ++++++++++++++++++----------
 builtin/repack.c                    | 163 +++++++++++++++++++++---
 t/t5331-pack-objects-stdin.sh       |  84 +++++++++++-
 t/t7704-repack-cruft.sh             |  90 +++++++++++++
 6 files changed, 455 insertions(+), 89 deletions(-)

Range-diff against v3:
 1:  986bef29b5 !  1:  f8b31c6a8d pack-objects: limit scope in 'add_object_entry_from_pack()'
    @@ Metadata
     Author: Taylor Blau <me@xxxxxxxxxxxx>

      ## Commit message ##
    -    pack-objects: limit scope in 'add_object_entry_from_pack()'
    +    pack-objects: use standard option incompatibility functions

    -    In add_object_entry_from_pack() we declare 'revs' (given to us through
    -    the miscellaneous context argument) earlier in the "if (p)" conditional
    -    than is necessary.  Move it down as far as it can go to reduce its
    -    scope.
    +    pack-objects has a handful of explicit checks for pairs of command-line
    +    options which are mutually incompatible. Many of these pre-date
    +    a699367bb8 (i18n: factorize more 'incompatible options' messages,
    +    2022-01-31).
    +
    +    Convert the explicit checks into die_for_incompatible_opt2() calls,
    +    which simplifies the implementation and standardizes pack-objects'
    +    output when given incompatible options (e.g., --stdin-packs with
    +    --filter gives different output than --keep-unreachable with
    +    --unpack-unreachable).
    +
    +    There is one minor piece of test fallout in t5331 that expects the old
    +    format, which has been corrected.

         Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>

      ## builtin/pack-objects.c ##
    -@@ builtin/pack-objects.c: static int add_object_entry_from_pack(const struct object_id *oid,
    - 		return 0;
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 		strvec_push(&rp, "--unpacked");
    + 	}

    - 	if (p) {
    --		struct rev_info *revs = _data;
    - 		struct object_info oi = OBJECT_INFO_INIT;
    --
    - 		oi.typep = &type;
    +-	if (exclude_promisor_objects && exclude_promisor_objects_best_effort)
    +-		die(_("options '%s' and '%s' cannot be used together"),
    +-		    "--exclude-promisor-objects", "--exclude-promisor-objects-best-effort");
    ++	die_for_incompatible_opt2(exclude_promisor_objects,
    ++				  "--exclude-promisor-objects",
    ++				  exclude_promisor_objects_best_effort,
    ++				  "--exclude-promisor-objects-best-effort");
    + 	if (exclude_promisor_objects) {
    + 		use_internal_rev_list = 1;
    + 		fetch_if_missing = 0;
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 	if (!pack_to_stdout && thin)
    + 		die(_("--thin cannot be used to build an indexable pack"));
    +
    +-	if (keep_unreachable && unpack_unreachable)
    +-		die(_("options '%s' and '%s' cannot be used together"), "--keep-unreachable", "--unpack-unreachable");
    ++	die_for_incompatible_opt2(keep_unreachable, "--keep-unreachable",
    ++				  unpack_unreachable, "--unpack-unreachable");
    + 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
    + 		unpack_unreachable_expiration = 0;
    +
    +-	if (stdin_packs && filter_options.choice)
    +-		die(_("cannot use --filter with --stdin-packs"));
    ++	die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
    ++				  filter_options.choice, "--filter");
     +
    - 		if (packed_object_info(the_repository, p, ofs, &oi) < 0) {
    - 			die(_("could not get type of object %s in pack %s"),
    - 			    oid_to_hex(oid), p->pack_name);
    - 		} else if (type == OBJ_COMMIT) {
    -+			struct rev_info *revs = _data;
    - 			/*
    - 			 * commits in included packs are used as starting points for the
    - 			 * subsequent revision walk
    +
    + 	if (stdin_packs && use_internal_rev_list)
    + 		die(_("cannot use internal rev list with --stdin-packs"));
    +@@ builtin/pack-objects.c: int cmd_pack_objects(int argc,
    + 	if (cruft) {
    + 		if (use_internal_rev_list)
    + 			die(_("cannot use internal rev list with --cruft"));
    +-		if (stdin_packs)
    +-			die(_("cannot use --stdin-packs with --cruft"));
    ++		die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
    ++					  cruft, "--cruft");
    + 	}
    +
    + 	/*
    +
    + ## t/t5331-pack-objects-stdin.sh ##
    +@@ t/t5331-pack-objects-stdin.sh: test_expect_success '--stdin-packs is incompatible with --filter' '
    + 		cd stdin-packs &&
    + 		test_must_fail git pack-objects --stdin-packs --stdout \
    + 			--filter=blob:none </dev/null 2>err &&
    +-		test_grep "cannot use --filter with --stdin-packs" err
    ++		test_grep "options .--stdin-packs. and .--filter. cannot be used together" err
    + 	)
    + '
    +
 -:  ---------- >  2:  2753e29648 pack-objects: limit scope in 'add_object_entry_from_pack()'
 2:  6f8fe8a4e1 =  3:  32b49d9073 pack-objects: factor out handling '--stdin-packs'
 3:  2a235461a6 =  4:  a797ff3a83 pack-objects: declare 'rev_info' for '--stdin-packs' earlier
 4:  240e90b68d =  5:  29bf05633a pack-objects: perform name-hash traversal for unpacked objects
 5:  9a18fa2e52 =  6:  0696fa1736 pack-objects: fix typo in 'show_object_pack_hint()'
 6:  6c997853f1 =  7:  1cc45b4472 pack-objects: swap 'show_{object,commit}_pack_hint'
 7:  0ff699f056 =  8:  3e3d929bd0 pack-objects: introduce '--stdin-packs=follow'
 8:  58891101f3 !  9:  52a069ef48 repack: exclude cruft pack(s) from the MIDX where possible
    @@ Commit message
         MIDX with '--write-midx' to ensure that the resulting MIDX was always
         closed under reachability in order to generate reachability bitmaps.

    -    Suppose (prior to this patch) you have a once-unreachable object packed
    -    in a cruft pack, which later on becomes reachable from one or more
    -    objects in a geometrically repacked pack. That once-unreachable object
    -    *won't* appear in the new pack, since the cruft pack was specified as
    -    neither included nor excluded to 'pack-objects --stdin-packs'. If the
    +    While the previous patch added the '--stdin-packs=follow' option to
    +    pack-objects, it is not yet on by default. Given that, suppose you have
    +    a once-unreachable object packed in a cruft pack, which later becomes
    +    reachable from one or more objects in a geometrically repacked pack.
    +    That once-unreachable object *won't* appear in the new pack, since the
    +    cruft pack was not specified as included or excluded when the
    +    geometrically repacked pack was created with 'pack-objects
    +    --stdin-packs' (*not* '--stdin-packs=follow', which is not on). If that
         new pack is included in a MIDX without the cruft pack, then trying to
         generate bitmaps for that MIDX may fail. This happens when the bitmap
         selection process picks one or more commits which reach the
    -    once-unreachable objects, commit ddee3703b3 ensures that the MIDX will
    -    be closed under reachability. Without it, we would fail to generate a
    -    MIDX bitmap.
    +    once-unreachable objects.

    +    To mitigate this failure mode, commit ddee3703b3 ensures that the MIDX
    +    will be closed under reachability by including cruft pack(s). If cruft
    +    pack(s) were not included, we would fail to generate a MIDX bitmap. But
         ddee3703b3 alludes to the fact that this is sub-optimal by saying

             [...] it's desirable to avoid including cruft packs in the MIDX

base-commit: 485f5f863615e670fd97ae40af744e14072cfe18
--
2.49.0.640.ga4de40e6a8




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux