I was motivated to start looking closely at midx-write.c due to multiple users reporting Git crashes in their background maintenance, specifically during git multi-pack-index repack calls. I was eventually able to reproduce it in git multi-pack-index expire as well. Patch 1 is the only change we need to fix this bug. It includes a test case that will fail under --stress with SANITIZE=address. It requires creating many packfiles (50 was not enough, but 100 is enough). As far as I can tell, this bug has existed since Git 2.47.0 in October 2024, but I started hearing reports of this from users in July 2025 (and took a while to get a dump/repro). The remaining patches are cleanups based on my careful rereading of midx-write.c. There are some issues about error handling that needed some cleanup as well as a removal of the DISABLE_SIGN_COMPARE_WARNINGS macro. Updates in V2 ============= * A stale comment to an unsubmitted version of the test is removed. * More cases needing open_pack_index() are patched. * Typos fixed. * A new patch assumes error and sets result to zero only on the few successful paths. Thanks, -Stolee Derrick Stolee (6): midx-write: only load initialized packs midx-write: put failing response value back midx-write: use cleanup when incremental midx fails midx-write: use uint32_t for preferred_pack_idx midx-write: reenable signed comparison errors midx-write: simplify error cases midx-write.c | 134 +++++++++++++++++------------------- t/t5319-multi-pack-index.sh | 22 +++++- 2 files changed, 86 insertions(+), 70 deletions(-) base-commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1965%2Fderrickstolee%2Fmidx-write-cleanup-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1965/derrickstolee/midx-write-cleanup-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/1965 Range-diff vs v1: 1: 4a4b35c694 ! 1: e02a444315 midx-write: only load initialized packs @@ Commit message Add a new test that breaks under --stress when compiled with SANITIZE=address. The chosen number of 100 packfiles was selected to get the --stress output to fail about 50% of the time, while 50 packfiles - could not get a failure in most --stress runs. This test has a very - minor check at the end confirming only one packfile remaining. The - failing nature of this test actually relies on auto-GC cleaning up some - packfiles during the creation of the commits, as tests setting gc.auto - to zero make the packfile count match the number of added commits but - also avoids hitting the memory issue. + could not get a failure in most --stress runs. The test case is marked as EXPENSIVE not only because of the number of packfiles it creates, but because some CI environments were reporting @@ Commit message #4 0x562d5d46fff6 in cmd_multi_pack_index builtin/multi-pack-index.c:305 ... - This failure stack trace is disconnected from the real fix because it - the bad pointers are accessed later when closing the packfiles from the - context. + This failure stack trace is disconnected from the real fix because the bad + pointers are accessed later when closing the packfiles from the context. There are a few different aspects to this fix that are worth noting: @@ midx-write.c: static int fill_packs_from_midx(struct write_midx_context *ctx, - if (open_pack_index(m->packs[i])) - die(_("could not open index for %s"), - m->packs[i]->pack_name); +- } + if (prepare_midx_pack(ctx->repo, m, -+ m->num_packs_in_base + i)) { -+ error(_("could not load pack")); -+ return 1; - } ++ m->num_packs_in_base + i)) ++ return error(_("could not load pack")); + ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); fill_pack_info(&ctx->info[ctx->nr++], m->packs[i], @@ midx-write.c: static int write_midx_internal(struct repository *r, const char *o goto cleanup; } +@@ midx-write.c: static int write_midx_internal(struct repository *r, const char *object_dir, + struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p; + ctx.preferred_pack_idx = 0; + ++ /* ++ * Attempt opening the pack index to populate num_objects. ++ * Ignore failiures as they can be expected and are not ++ * fatal during this selection time. ++ */ ++ open_pack_index(oldest); ++ + if (packs_to_drop && packs_to_drop->nr) + BUG("cannot write a MIDX bitmap during expiration"); + +@@ midx-write.c: static int write_midx_internal(struct repository *r, const char *object_dir, + + if (!oldest->num_objects || p->mtime < oldest->mtime) { + oldest = p; ++ open_pack_index(oldest); + ctx.preferred_pack_idx = i; + } + } @@ midx-write.c: static int write_midx_internal(struct repository *r, const char *object_dir, if (ctx.preferred_pack_idx > -1) { 2: 709555c531 ! 2: a1dd3ed874 midx-write: put failing response value back @@ Commit message This instance of setting the result to 1 before going to cleanup was accidentally removed in fcb2205b77 (midx: implement support for writing - incremental MIDX chains, 2024-08-06). + incremental MIDX chains, 2024-08-06). Build upon a test that already deletes + a packfile to verify that this error propagates to full command failure. Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx> @@ midx-write.c: static int write_midx_internal(struct repository *r, const char *o goto cleanup; } + + ## t/t5319-multi-pack-index.sh ## +@@ t/t5319-multi-pack-index.sh: test_expect_success 'load reverse index when missing .idx, .pack' ' + mv $idx.bak $idx && + + mv $pack $pack.bak && +- git cat-file --batch-check="%(objectsize:disk)" <tip ++ git cat-file --batch-check="%(objectsize:disk)" <tip && ++ ++ test_must_fail git multi-pack-index write 2>err && ++ grep "could not load pack" err + ) + ' + 3: a5bee03601 = 3: c4f75cca09 midx-write: use cleanup when incremental midx fails 4: bd97db26f7 ! 4: 2290e27ded midx-write: use uint32_t for preferred_pack_idx @@ midx-write.c: static int write_midx_internal(struct repository *r, const char *o + struct packed_git *oldest = ctx.info[0].p; ctx.preferred_pack_idx = 0; - if (packs_to_drop && packs_to_drop->nr) + /* @@ midx-write.c: static int write_midx_internal(struct repository *r, const char *object_dir, * objects to resolve, so the preferred value doesn't * matter. 5: eb1abdca32 = 5: 35302f5228 midx-write: reenable signed comparison errors -: ---------- > 6: 7be25cf534 midx-write: simplify error cases -- gitgitgadget