Re: [PATCH nf 4/4] netfilter: nf_conntrack: fix crash due to removal of uninitialised entry

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Wed, 16 Jul 2025 19:00:05 +0200

On Wed, Jul 16, 2025 at 05:59:41PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > On Mon, Jul 14, 2025 at 04:36:35PM +0200, Florian Westphal wrote:
> > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > > > On Thu, Jul 03, 2025 at 04:21:51PM +0200, Florian Westphal wrote:
> > > > > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > > > > > Thanks for the description, this scenario is esoteric.
> > > > > > 
> > > > > > Is this bug fully reproducible?
> > > > > 
> > > > > No.  Unicorn.  Only happened once.
> > > > > Everything is based off reading the backtrace and vmcore.
> > > > 
> > > > I guess this needs a chaos monkey to trigger this bug. Else, can we try to catch this unicorn again?
> > > 
> > > I would not hold my breath.  But I don't see anything that prevents the
> > > race described in 4/4, and all the things match in the vmcore, including
> > > increment of clash resolution counter.  If you think it's too perfect
> > > then ok, we can keep 4/4 back until someone else reports this problem
> > > again.
> > 
> > Hm, I think your sequence is possible, it is the SLAB_TYPESAFE_BY_RCU rule
> > that allows for this to occur.
> > 
> > Could this rare sequence still happen?
> > 
> > cpu x                   cpu y                   cpu z
> >  found entry E          found entry E
> >  E is expired           <preemption>
> >  nf_ct_delete()
> >  return E to rcu slab
> >                                         init_conntrack
> >                                         <preemption>     NOTE: ct->status not yet set to zero
> > 
> > cpu y resumes, it observes E as expired but CONFIRMED:
> >                         <resumes>
> >                         nf_ct_expired()
> >                          -> yes (ct->timeout is 30s)
> >                         confirmed bit set.
> 
> Yes, that can happen, but then the refcount can't be incremented
> as it's 0 (-> entry is skipped).

Right, refcount zero prevents it.

static void nf_ct_gc_expired(struct nf_conn *ct)
{
        if (!refcount_inc_not_zero(&ct->ct_general.use))
                return;

> If it's nonzero but the object was returned
> by the kmem cache we have a different kind of bug (free with refcount > 0),
> or use-after-free.
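
For illustration only, the refcount guard discussed above can be sketched
in userspace with C11 atomics. This is a rough sketch under stated
assumptions, not the kernel code: struct entry, ref_inc_not_zero(),
ref_put() and gc_try_examine() are hypothetical names standing in for
struct nf_conn, refcount_inc_not_zero() and the gc path. It only shows
that a recycled object whose count is still zero gets skipped before any
of its fields (such as a stale status word) are read.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct nf_conn; not the kernel layout. */
struct entry {
        atomic_uint use;        /* reference count, 0 means freed or being recycled */
        unsigned long status;   /* may still hold stale bits while use == 0 */
};

/* Userspace analogue of refcount_inc_not_zero(): take a reference only
 * if the count is currently nonzero.
 */
static bool ref_inc_not_zero(atomic_uint *r)
{
        unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

        while (old != 0) {
                /* On failure, 'old' is refreshed with the current value. */
                if (atomic_compare_exchange_weak_explicit(r, &old, old + 1,
                                                          memory_order_acquire,
                                                          memory_order_relaxed))
                        return true;
        }
        return false;
}

/* Drop a reference taken with ref_inc_not_zero(). */
static void ref_put(atomic_uint *r)
{
        atomic_fetch_sub_explicit(r, 1, memory_order_release);
}

/* Returns false if the entry had to be skipped (count was zero),
 * true if a reference was held while status was read.
 */
static bool gc_try_examine(struct entry *e, unsigned long *status_out)
{
        if (!ref_inc_not_zero(&e->use))
                return false;   /* freed or mid-reinit: do not trust any field */

        *status_out = e->status;        /* read only while holding a reference */
        ref_put(&e->use);
        return true;
}
```

With use == 0, gc_try_examine() returns false and never looks at status,
which is the "entry is skipped" case in the sequence above.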

OK, thanks for explaining, use set_bit() and post v2.