> > Le 3 juin 2025 à 10:01, Patrick Steinhardt <ps@xxxxxx> a écrit : > > Hi, > > this patch series fixes races around locking the "packed-refs" file when > auto-maintenance decides to repack it. This issue has been reported e.g. > via [1] and [2]. > > The root cause is that git-gc(1) used to know to detach _after_ having > repacked references. As such, callers wouldn't continue with their thing > until we have already packed refs, and thus the race does not exist > there. git-maintenance(1) didn't have the same split though, so this > patch series retrofits that logic. > > The series is structured as follows: > > - Patches 1 and 2 do some light refactorings. > > - Patches 3 to 5 refactor how we set up the list of tasks to not rely > on globals anymore. Instead, we now have a single source of truth > for which tasks exactly will be run. > > - The remaining patches introduce the split of before/after-detach > tasks and wire them up for "pack-refs", "reflog-expire" and "gc" > tasks. > > Changes in v2: > - A couple of commit message improvements. > - Introduce `die(NULL)` to die with the correct exit code but no error > message. This gets rid of some magic numbers. > - Introduce an enum to discern the phases before and after detach. > - Link to v1: https://lore.kernel.org/r/20250527-b4-pks-maintenance-ref-lock-race-v1-0-e1ceb2dea66e@xxxxxx > > Changes in v3: > - Rework logic to talk about foreground/background tasks instead of > before/after detach. > - Link to v2: https://lore.kernel.org/r/20250530-b4-pks-maintenance-ref-lock-race-v2-0-d04e2f93e51f@xxxxxx > > Changes in v4: > - Some more massaging of commit messages. > - Link to v3: https://lore.kernel.org/r/20250602-b4-pks-maintenance-ref-lock-race-v3-0-587d44252dcb@xxxxxx > > Thanks! > > Patrick > > [1]: <CAJR-fbZ4X1+gN75m2dUvocR6NkowLOZ9F26cjBy8w1qd181OoQ@xxxxxxxxxxxxxx> > [2]: <CANi7bVAkNc+gY1NoXfJuDRjxjZLTgL8Lfn8_ZmWsvLAoiLPkNg@xxxxxxxxxxxxxx> > > --- > Patrick Steinhardt (12): > builtin/gc: use designated field initializers for maintenance tasks > builtin/gc: drop redundant local variable > builtin/maintenance: centralize configuration of explicit tasks > builtin/maintenance: mark "--task=" and "--schedule=" as incompatible > builtin/maintenance: stop modifying global array of tasks > builtin/maintenance: extract function to run tasks > builtin/maintenance: fix typedef for function pointers > builtin/maintenance: split into foreground and background tasks > builtin/maintenance: fix locking race with refs and reflogs tasks > usage: allow dying without writing an error message > builtin/gc: avoid global state in `gc_before_repack()` > builtin/maintenance: fix locking race when handling "gc" task > > builtin/am.c | 4 +- > builtin/checkout.c | 4 +- > builtin/fetch.c | 2 +- > builtin/gc.c | 410 +++++++++++++++++++++++++------------------- > builtin/submodule--helper.c | 12 +- > t/t7900-maintenance.sh | 19 +- > usage.c | 2 + > 7 files changed, 263 insertions(+), 190 deletions(-) > > Range-diff versus v3: > > 1: e46a65951b9 = 1: 280f13d2895 builtin/gc: use designated field initializers for maintenance tasks > 2: 73cd67f3e1a = 2: 16a017fb819 builtin/gc: drop redundant local variable > 3: a02452a6d6f = 3: 0ab3344ddb0 builtin/maintenance: centralize configuration of explicit tasks > 4: ccd7691e4d5 = 4: 69e768cb54e builtin/maintenance: mark "--task=" and "--schedule=" as incompatible > 5: 0e243fd81e6 = 5: 295e9e5ee9f builtin/maintenance: stop modifying global array of tasks > 6: c95bd62823e = 6: d94b0c86622 builtin/maintenance: extract function to run tasks > 7: 43d28434d8e = 7: 0bbba671cd0 builtin/maintenance: fix typedef for function pointers > 8: d5740a5c9d9 = 8: 4ce38539bb6 builtin/maintenance: split into foreground and background tasks > 9: 168eb3a9372 ! 9: 28092b9bed1 builtin/maintenance: fix locking race when packing refs and reflogs > @@ Metadata > Author: Patrick Steinhardt <ps@xxxxxx> > > ## Commit message ## > - builtin/maintenance: fix locking race when packing refs and reflogs > + builtin/maintenance: fix locking race with refs and reflogs tasks > > As explained in the preceding commit, git-gc(1) knows to detach only > - after it has already packed references and reflogs. This is done to > - avoid racing around their respective lockfiles. > + after it has already packed references and expired reflogs. This is done > + to avoid racing around their respective lockfiles. > > Adapt git-maintenance(1) accordingly and run the "pack-refs" and > "reflog-expire" tasks in the foreground. Note that the "gc" task has the > 10: 0ff01f6e2aa ! 10: b8ed080c67d usage: allow dying without writing an error message > @@ Commit message > usage: allow dying without writing an error message > > Sometimes code wants to die in a situation where it already has written > - an error message. To use the same error code as `die()` we have to open > - code the code with a call to `exit(128)` in such cases, which is easy to > - get wrong and leaves magical numbers all over our codebase. > + an error message. To use the same error code as `die()` we have to use > + `exit(128)`, which is easy to get wrong and leaves magic numbers all > + over our codebase. > > Teach `die_message_builtin()` to not print any error when passed a > `NULL` pointer as error string. Like this, such users can now call > 11: 93f53000e47 = 11: 5b149886263 builtin/gc: avoid global state in `gc_before_repack()` > 12: 01095d1bf88 = 12: 9ba01f143b3 builtin/maintenance: fix locking race when handling "gc" task > > --- > base-commit: 845c48a16a7f7b2c44d8cb137b16a4a1f0140229 > change-id: 20250527-b4-pks-maintenance-ref-lock-race-11ae5d68e06f Range-diff vs v3 (and v2!) looks good to me; thanks for taking the time!