[PATCH v3 00/12] builtin/maintenance: fix ref lock races when detaching

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

this patch series fixes races around locking the "packed-refs" file when
auto-maintenance decides to repack it. This issue has been reported e.g.
via [1] and [2].

The root cause is that git-gc(1) used to know to detach _after_ having
repacked references. As such, callers wouldn't continue with their thing
until we have already packed refs, and thus the race does not exist
there. git-maintenance(1) didn't have the same split though, so this
patch series retrofits that logic.

The series is structured as follows:

  - Patches 1 and 2 do some light refactorings.

  - Patches 3 to 5 refactor how we set up the list of tasks to not rely
    on globals anymore. Instead, we now have a single source of truth
    for which tasks exactly will be run.

  - The remaining patches introduce the split of before/after-detach
    tasks and wire them up for "pack-refs", "reflog-expire" and "gc"
    tasks.

Changes in v2:
  - A couple of commit message improvements.
  - Introduce `die(NULL)` to die with the correct exit code but no error
    message. This gets rid of some magic numbers.
  - Introduce an enum to discern the phases before and after detach.
  - Link to v1: https://lore.kernel.org/r/20250527-b4-pks-maintenance-ref-lock-race-v1-0-e1ceb2dea66e@xxxxxx

Changes in v3:
  - Rework logic to talk about foreground/background tasks instead of
    before/after detach.
  - Link to v2: https://lore.kernel.org/r/20250530-b4-pks-maintenance-ref-lock-race-v2-0-d04e2f93e51f@xxxxxx

Thanks!

Patrick

[1]: <CAJR-fbZ4X1+gN75m2dUvocR6NkowLOZ9F26cjBy8w1qd181OoQ@xxxxxxxxxxxxxx>
[2]: <CANi7bVAkNc+gY1NoXfJuDRjxjZLTgL8Lfn8_ZmWsvLAoiLPkNg@xxxxxxxxxxxxxx>

---
Patrick Steinhardt (12):
      builtin/gc: use designated field initializers for maintenance tasks
      builtin/gc: drop redundant local variable
      builtin/maintenance: centralize configuration of explicit tasks
      builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
      builtin/maintenance: stop modifying global array of tasks
      builtin/maintenance: extract function to run tasks
      builtin/maintenance: fix typedef for function pointers
      builtin/maintenance: split into foreground and background tasks
      builtin/maintenance: fix locking race when packing refs and reflogs
      usage: allow dying without writing an error message
      builtin/gc: avoid global state in `gc_before_repack()`
      builtin/maintenance: fix locking race when handling "gc" task

 builtin/am.c                |   4 +-
 builtin/checkout.c          |   4 +-
 builtin/fetch.c             |   2 +-
 builtin/gc.c                | 410 +++++++++++++++++++++++++-------------------
 builtin/submodule--helper.c |  12 +-
 t/t7900-maintenance.sh      |  19 +-
 usage.c                     |   2 +
 7 files changed, 263 insertions(+), 190 deletions(-)

Range-diff versus v2:

 1:  2c90fadd640 =  1:  00fa348fb9c builtin/gc: use designated field initializers for maintenance tasks
 2:  ebd37a92895 =  2:  d7a2072da28 builtin/gc: drop redundant local variable
 3:  90863ae9957 =  3:  2eef35388e1 builtin/maintenance: centralize configuration of explicit tasks
 4:  f0c05004ea3 =  4:  372efcf7fec builtin/maintenance: mark "--task=" and "--schedule=" as incompatible
 5:  85c8273d4f4 =  5:  979f27b19ce builtin/maintenance: stop modifying global array of tasks
 6:  a9d49980f8a =  6:  b89c8d237ed builtin/maintenance: extract function to run tasks
 7:  35a78300572 =  7:  96554899fd9 builtin/maintenance: fix typedef for function pointers
 8:  b5fe34348aa !  8:  b79befc82be builtin/maintenance: let tasks do maintenance before and after detach
    @@ Metadata
     Author: Patrick Steinhardt <ps@xxxxxx>
     
      ## Commit message ##
    -    builtin/maintenance: let tasks do maintenance before and after detach
    +    builtin/maintenance: split into foreground and background tasks
     
         Both git-gc(1) and git-maintenance(1) have logic to daemonize so that
         the maintenance tasks are performed in the background. git-gc(1) has
    @@ Commit message
         Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
     
      ## builtin/gc.c ##
    -@@ builtin/gc.c: typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
    +@@ builtin/gc.c: static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts
    + 
    + typedef int (*maintenance_task_fn)(struct maintenance_run_opts *opts,
    + 				   struct gc_config *cfg);
    +-
    +-/*
    +- * An auto condition function returns 1 if the task should run
    +- * and 0 if the task should NOT run. See needs_to_gc() for an
    +- * example.
    +- */
    + typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
      
      struct maintenance_task {
      	const char *name;
     -	maintenance_task_fn fn;
    -+	maintenance_task_fn before_detach;
    -+	maintenance_task_fn after_detach;
    ++
    ++	/*
    ++	 * Work that will be executed before detaching. This should not include
    ++	 * tasks that may run for an extended amount of time as it does cause
    ++	 * auto-maintenance to block until foreground tasks have been run.
    ++	 */
    ++	maintenance_task_fn foreground;
    ++
    ++	/*
    ++	 * Work that will be executed after detaching. When not detaching the
    ++	 * work will be run in the foreground, as well.
    ++	 */
    ++	maintenance_task_fn background;
    ++
    ++	/*
    ++	 * An auto condition function returns 1 if the task should run and 0 if
    ++	 * the task should NOT run. See needs_to_gc() for an example.
    ++	 */
      	maintenance_auto_fn auto_condition;
      };
      
    @@ builtin/gc.c: typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
      	[TASK_PREFETCH] = {
      		.name = "prefetch",
     -		.fn = maintenance_task_prefetch,
    -+		.after_detach = maintenance_task_prefetch,
    ++		.background = maintenance_task_prefetch,
      	},
      	[TASK_LOOSE_OBJECTS] = {
      		.name = "loose-objects",
     -		.fn = maintenance_task_loose_objects,
    -+		.after_detach = maintenance_task_loose_objects,
    ++		.background = maintenance_task_loose_objects,
      		.auto_condition = loose_object_auto_condition,
      	},
      	[TASK_INCREMENTAL_REPACK] = {
      		.name = "incremental-repack",
     -		.fn = maintenance_task_incremental_repack,
    -+		.after_detach = maintenance_task_incremental_repack,
    ++		.background = maintenance_task_incremental_repack,
      		.auto_condition = incremental_repack_auto_condition,
      	},
      	[TASK_GC] = {
      		.name = "gc",
     -		.fn = maintenance_task_gc,
    -+		.after_detach = maintenance_task_gc,
    ++		.background = maintenance_task_gc,
      		.auto_condition = need_to_gc,
      	},
      	[TASK_COMMIT_GRAPH] = {
      		.name = "commit-graph",
     -		.fn = maintenance_task_commit_graph,
    -+		.after_detach = maintenance_task_commit_graph,
    ++		.background = maintenance_task_commit_graph,
      		.auto_condition = should_write_commit_graph,
      	},
      	[TASK_PACK_REFS] = {
      		.name = "pack-refs",
     -		.fn = maintenance_task_pack_refs,
    -+		.after_detach = maintenance_task_pack_refs,
    ++		.background = maintenance_task_pack_refs,
      		.auto_condition = pack_refs_condition,
      	},
      	[TASK_REFLOG_EXPIRE] = {
      		.name = "reflog-expire",
     -		.fn = maintenance_task_reflog_expire,
    -+		.after_detach = maintenance_task_reflog_expire,
    ++		.background = maintenance_task_reflog_expire,
      		.auto_condition = reflog_expire_condition,
      	},
      	[TASK_WORKTREE_PRUNE] = {
      		.name = "worktree-prune",
     -		.fn = maintenance_task_worktree_prune,
    -+		.after_detach = maintenance_task_worktree_prune,
    ++		.background = maintenance_task_worktree_prune,
      		.auto_condition = worktree_prune_condition,
      	},
      	[TASK_RERERE_GC] = {
      		.name = "rerere-gc",
     -		.fn = maintenance_task_rerere_gc,
    -+		.after_detach = maintenance_task_rerere_gc,
    ++		.background = maintenance_task_rerere_gc,
      		.auto_condition = rerere_gc_condition,
      	},
      };
      
     +enum task_phase {
    -+	TASK_PHASE_BEFORE_DETACH,
    -+	TASK_PHASE_AFTER_DETACH,
    ++	TASK_PHASE_FOREGROUND,
    ++	TASK_PHASE_BACKGROUND,
     +};
     +
      static int maybe_run_task(const struct maintenance_task *task,
    @@ builtin/gc.c: typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
     +			  struct gc_config *cfg,
     +			  enum task_phase phase)
      {
    -+	int before = (phase == TASK_PHASE_BEFORE_DETACH);
    -+	maintenance_task_fn fn = before ? task->before_detach : task->after_detach;
    -+	const char *region = before ? "maintenance before" : "maintenance";
    ++	int foreground = (phase == TASK_PHASE_FOREGROUND);
    ++	maintenance_task_fn fn = foreground ? task->foreground : task->background;
    ++	const char *region = foreground ? "maintenance foreground" : "maintenance";
      	int ret = 0;
      
     +	if (!fn)
    @@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts
      
     +	for (size_t i = 0; i < opts->tasks_nr; i++)
     +		if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg,
    -+				   TASK_PHASE_BEFORE_DETACH))
    ++				   TASK_PHASE_FOREGROUND))
     +			result = 1;
     +
      	/* Failure to daemonize is ok, we'll continue in foreground. */
    @@ builtin/gc.c: static int maintenance_run_tasks(struct maintenance_run_opts *opts
      	for (size_t i = 0; i < opts->tasks_nr; i++)
     -		if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg))
     +		if (maybe_run_task(&tasks[opts->tasks[i]], r, opts, cfg,
    -+				   TASK_PHASE_AFTER_DETACH))
    ++				   TASK_PHASE_BACKGROUND))
      			result = 1;
      
      	rollback_lock_file(&lk);
 9:  79766b0afa0 <  -:  ----------- builtin/maintenance: fix locking race when packing refs and reflogs
 -:  ----------- >  9:  8b3f82b871d builtin/maintenance: fix locking race when packing refs and reflogs
10:  d74a6a3fe7f = 10:  7c61970f4aa usage: allow dying without writing an error message
11:  f8e77819f6e = 11:  39a1aa9dcb0 builtin/gc: avoid global state in `gc_before_repack()`
12:  31007c209cb ! 12:  8b263d0fc2b builtin/maintenance: fix locking race when handling "gc" task
    @@ Commit message
         for the "pack-refs" and "reflog-expire" tasks in preceding commits. Fix
         this by splitting up the logic of the "gc" task:
     
    -      - Before detaching we execute `gc_before_repack()`, which contains the
    -        logic that git-gc(1) itself would execute before detaching.
    +      - We execute `gc_before_repack()` in the foreground, which contains
    +        the logic that git-gc(1) itself would execute in the foreground, as
    +        well.
     
    -      - After detaching we spawn git-gc(1), but with a new hidden flag that
    +      - We spawn git-gc(1) after detaching, but with a new hidden flag that
             suppresses calling `gc_before_repack()`.
     
         Like this we have roughly the same logic as git-gc(1) itself and know to
         repack refs and reflogs before detaching, thus fixing the race.
     
    -    Note that `gc_before_repack()` is renamed to `gc_before_detach()` to
    +    Note that `gc_before_repack()` is renamed to `gc_foreground_tasks()` to
         better reflect what this function does.
     
         Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
    @@ builtin/gc.c: static int report_last_gc_error(void)
      }
      
     -static int gc_before_repack(struct maintenance_run_opts *opts,
    -+static int gc_before_detach(struct maintenance_run_opts *opts,
    - 			    struct gc_config *cfg)
    +-			    struct gc_config *cfg)
    ++static int gc_foreground_tasks(struct maintenance_run_opts *opts,
    ++			       struct gc_config *cfg)
      {
      	if (cfg->pack_refs && maintenance_task_pack_refs(opts, cfg))
    + 		return error(FAILED_RUN, "pack-refs");
     @@ builtin/gc.c: int cmd_gc(int argc,
      	pid_t pid;
      	int daemonized = 0;
      	int keep_largest_pack = -1;
    -+	int skip_maintenance_before_detach = 0;
    ++	int skip_foreground_tasks = 0;
      	timestamp_t dummy;
      	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
      	struct gc_config cfg = GC_CONFIG_INIT;
    @@ builtin/gc.c: int cmd_gc(int argc,
      			 N_("repack all other packs except the largest pack")),
      		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
      			   N_("pack prefix to store a pack containing pruned objects")),
    -+		OPT_HIDDEN_BOOL(0, "skip-maintenance-before-detach", &skip_maintenance_before_detach,
    -+			   N_("skip maintenance steps typically done before detaching")),
    ++		OPT_HIDDEN_BOOL(0, "skip-foreground-tasks", &skip_foreground_tasks,
    ++			   N_("skip maintenance tasks typically done in the foreground")),
      		OPT_END()
      	};
      
    @@ builtin/gc.c: int cmd_gc(int argc,
     -			ret = 0;
     -			goto out;
     -		}
    -+		if (!skip_maintenance_before_detach) {
    ++		if (!skip_foreground_tasks) {
     +			if (lock_repo_for_gc(force, &pid)) {
     +				ret = 0;
     +				goto out;
    @@ builtin/gc.c: int cmd_gc(int argc,
     -		if (gc_before_repack(&opts, &cfg) < 0)
     -			die(NULL);
     -		delete_tempfile(&pidfile);
    -+			if (gc_before_detach(&opts, &cfg) < 0)
    ++			if (gc_foreground_tasks(&opts, &cfg) < 0)
     +				die(NULL);
     +			delete_tempfile(&pidfile);
     +		}
    @@ builtin/gc.c: int cmd_gc(int argc,
      
     -	if (opts.detach <= 0)
     -		gc_before_repack(&opts, &cfg);
    -+	if (opts.detach <= 0 && !skip_maintenance_before_detach)
    -+		gc_before_detach(&opts, &cfg);
    ++	if (opts.detach <= 0 && !skip_foreground_tasks)
    ++		gc_foreground_tasks(&opts, &cfg);
      
      	if (!repository_format_precious_objects) {
      		struct child_process repack_cmd = CHILD_PROCESS_INIT;
    @@ builtin/gc.c: static int maintenance_task_prefetch(struct maintenance_run_opts *
      
     -static int maintenance_task_gc(struct maintenance_run_opts *opts,
     -			       struct gc_config *cfg UNUSED)
    -+static int maintenance_task_gc_before_detach(struct maintenance_run_opts *opts,
    -+					     struct gc_config *cfg)
    ++static int maintenance_task_gc_foreground(struct maintenance_run_opts *opts,
    ++					  struct gc_config *cfg)
     +{
    -+	return gc_before_detach(opts, cfg);
    ++	return gc_foreground_tasks(opts, cfg);
     +}
     +
    -+static int maintenance_task_gc_after_detach(struct maintenance_run_opts *opts,
    -+					    struct gc_config *cfg UNUSED)
    ++static int maintenance_task_gc_background(struct maintenance_run_opts *opts,
    ++					  struct gc_config *cfg UNUSED)
      {
      	struct child_process child = CHILD_PROCESS_INIT;
      
    @@ builtin/gc.c: static int maintenance_task_gc(struct maintenance_run_opts *opts,
      	else
      		strvec_push(&child.args, "--no-quiet");
      	strvec_push(&child.args, "--no-detach");
    -+	strvec_push(&child.args, "--skip-maintenance-before-detach");
    ++	strvec_push(&child.args, "--skip-foreground-tasks");
      
      	return run_command(&child);
      }
    @@ builtin/gc.c: static const struct maintenance_task tasks[] = {
      	},
      	[TASK_GC] = {
      		.name = "gc",
    --		.after_detach = maintenance_task_gc,
    -+		.before_detach = maintenance_task_gc_before_detach,
    -+		.after_detach = maintenance_task_gc_after_detach,
    +-		.background = maintenance_task_gc,
    ++		.foreground = maintenance_task_gc_foreground,
    ++		.background = maintenance_task_gc_background,
      		.auto_condition = need_to_gc,
      	},
      	[TASK_COMMIT_GRAPH] = {
    @@ t/t7900-maintenance.sh: test_expect_success 'run [--auto|--quiet]' '
     -	test_subcommand git gc --quiet --no-detach <run-no-auto.txt &&
     -	test_subcommand ! git gc --auto --quiet --no-detach <run-auto.txt &&
     -	test_subcommand git gc --no-quiet --no-detach <run-no-quiet.txt
    -+	test_subcommand git gc --quiet --no-detach --skip-maintenance-before-detach <run-no-auto.txt &&
    -+	test_subcommand ! git gc --auto --quiet --no-detach --skip-maintenance-before-detach <run-auto.txt &&
    -+	test_subcommand git gc --no-quiet --no-detach --skip-maintenance-before-detach <run-no-quiet.txt
    ++	test_subcommand git gc --quiet --no-detach --skip-foreground-tasks <run-no-auto.txt &&
    ++	test_subcommand ! git gc --auto --quiet --no-detach --skip-foreground-tasks <run-auto.txt &&
    ++	test_subcommand git gc --no-quiet --no-detach --skip-foreground-tasks <run-no-quiet.txt
      '
      
      test_expect_success 'maintenance.auto config option' '
    @@ t/t7900-maintenance.sh: test_expect_success 'run --task=<task>' '
     -	test_subcommand ! git gc --quiet --no-detach <run-commit-graph.txt &&
     -	test_subcommand git gc --quiet --no-detach <run-gc.txt &&
     -	test_subcommand git gc --quiet --no-detach <run-both.txt &&
    -+	test_subcommand ! git gc --quiet --no-detach --skip-maintenance-before-detach <run-commit-graph.txt &&
    -+	test_subcommand git gc --quiet --no-detach --skip-maintenance-before-detach <run-gc.txt &&
    -+	test_subcommand git gc --quiet --no-detach --skip-maintenance-before-detach <run-both.txt &&
    ++	test_subcommand ! git gc --quiet --no-detach --skip-foreground-tasks <run-commit-graph.txt &&
    ++	test_subcommand git gc --quiet --no-detach --skip-foreground-tasks <run-gc.txt &&
    ++	test_subcommand git gc --quiet --no-detach --skip-foreground-tasks <run-both.txt &&
      	test_subcommand git commit-graph write --split --reachable --no-progress <run-commit-graph.txt &&
      	test_subcommand ! git commit-graph write --split --reachable --no-progress <run-gc.txt &&
      	test_subcommand git commit-graph write --split --reachable --no-progress <run-both.txt

---
base-commit: 845c48a16a7f7b2c44d8cb137b16a4a1f0140229
change-id: 20250527-b4-pks-maintenance-ref-lock-race-11ae5d68e06f





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux