Re: [PATCH -rcu -next 5/7] rcu: Document GP init vs hotplug-scan ordering requirements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 08, 2025 at 10:22:22AM -0400, Joel Fernandes wrote:
> Add detailed comments explaining the critical ordering constraints
> during RCU grace period initialization, based on discussions with
> Frederic.
> 
> Co-developed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>

Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

> ---
>  .../RCU/Design/Requirements/Requirements.rst  | 41 +++++++++++++++++++
>  kernel/rcu/tree.c                             |  8 ++++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index 6125e7068d2c..771a1ce4c84d 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1970,6 +1970,47 @@ corresponding CPU's leaf node lock is held. This avoids race conditions
>  between RCU's hotplug notifier hooks, the grace period initialization
>  code, and the FQS loop, all of which refer to or modify this bookkeeping.
>  
> +Note that grace period initialization (rcu_gp_init()) must carefully sequence
> +CPU hotplug scanning with grace period state changes. For example, the
> +following race could occur in rcu_gp_init() if rcu_seq_start() were to happen
> +after the CPU hotplug scanning.
> +
> +.. code-block:: none
> +
> +   CPU0 (rcu_gp_init)                   CPU1                          CPU2
> +   ---------------------                ----                          ----
> +   // Hotplug scan first (WRONG ORDER)
> +   rcu_for_each_leaf_node(rnp) {
> +       rnp->qsmaskinit = rnp->qsmaskinitnext;
> +   }
> +                                        rcutree_report_cpu_starting()
> +                                            rnp->qsmaskinitnext |= mask;
> +                                        rcu_read_lock()
> +                                        r0 = *X;
> +                                                                      r1 = *X;
> +                                                                      X = NULL;
> +                                                                      cookie = get_state_synchronize_rcu();
> +                                                                      // cookie = 8 (future GP)
> +   rcu_seq_start(&rcu_state.gp_seq);
> +   // gp_seq = 5
> +
> +   // CPU1 now invisible to this GP!
> +   rcu_for_each_node_breadth_first() {
> +       rnp->qsmask = rnp->qsmaskinit;
> +       // CPU1 not included!
> +   }
> +
> +   // GP completes without CPU1
> +   rcu_seq_end(&rcu_state.gp_seq);
> +   // gp_seq = 8
> +                                                                      poll_state_synchronize_rcu(cookie);
> +                                                                      // Returns true!
> +                                                                      kfree(r1);
> +                                        r2 = *r0; // USE-AFTER-FREE!
> +
> +By incrementing gp_seq first, CPU1's RCU read-side critical section
> +is guaranteed to not be missed by CPU2.
> +
>  Scheduler and RCU
>  ~~~~~~~~~~~~~~~~~
>  
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 327848f436e7..32fdb66e9c9f 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1835,6 +1835,14 @@ static noinline_for_stack bool rcu_gp_init(void)
>  	start_new_poll = rcu_sr_normal_gp_init();
>  	/* Record GP times before starting GP, hence rcu_seq_start(). */
>  	old_gp_seq = rcu_state.gp_seq;
> +	/*
> +	 * Critical ordering: rcu_seq_start() must happen BEFORE the CPU hotplug
> +	 * scan below. Otherwise we risk a race where a newly onlining CPU could
> +	 * be missed by the current grace period, potentially leading to
> +	 * use-after-free errors. For a detailed explanation of this race, see
> +	 * Documentation/RCU/Design/Requirements/Requirements.rst in the
> +	 * "Hotplug CPU" section.
> +	 */
>  	rcu_seq_start(&rcu_state.gp_seq);
>  	/* Ensure that rcu_seq_done_exact() guardband doesn't give false positives. */
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) &&
> -- 
> 2.34.1
> 




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux