On Wed, 2025-06-18 at 18:41 +0800, chenxiaosong@xxxxxxxxxxxxxxxx wrote: > From: ChenXiaoSong <chenxiaosong@xxxxxxxxxx> > > Before commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"), > we had a null-ptr-deref in nfsd4_probe_callback() (Link[1]): > > nfsd: last server has exited, flushing export cache > NFSD: starting 90-second grace period (net f0000030) > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > ... > Call trace: > __queue_work+0xb4/0x558 > queue_work_on+0x88/0x90 > nfsd4_probe_callback+0x4c/0x58 [nfsd] > NFSD: starting 90-second grace period (net f0000030) > nfsd4_probe_callback_sync+0x20/0x38 [nfsd] > nfsd4_init_conn.isra.57+0x8c/0xa8 [nfsd] > nfsd4_create_session+0x5b8/0x718 [nfsd] > nfsd4_proc_compound+0x4c0/0x710 [nfsd] > nfsd_dispatch+0x104/0x248 [nfsd] > svc_process_common+0x348/0x808 [sunrpc] > svc_process+0xb0/0xc8 [sunrpc] > nfsd+0xf0/0x160 [nfsd] > kthread+0x134/0x138 > ret_from_fork+0x10/0x18 > Code: aa1c03e0 97ffffba aa0003e2 b5000780 (f9400262) > SMP: stopping secondary CPUs > Starting crashdump kernel... > Bye! > > One of the cases is: > > task A (cpu 1) | task B (cpu 2) | task C (cpu 3) > ---------------------|----------------------|--------------------------------- > nfsd_startup_generic | nfsd_startup_generic | > nfsd_users == 0 | nfsd_users == 0 | > nfsd_users++ | nfsd_users++ | > nfsd_users == 1 | | > ... | | > callback_wq == xxx | | > ---------------------|----------------------|--------------------------------- > | | nfsd_shutdown_generic > | | nfsd_users == 1 > | | --nfsd_users > | | nfsd_users == 0 > | | ... > | | callback_wq == xxx > | | destroy_workqueue(callback_wq) > ---------------------|----------------------|--------------------------------- > | nfsd_users == 1 | > | ... | > | callback_wq == yyy | > > After commit 38f080f3cd19 ("NFSD: Move callback_wq into struct nfs4_client"), > this issue no longer occurs, but we should still convert the nfsd_users > to atomic_t to prevent other similar issues. > > Link[1]: https://chenxiaosong.com/en/nfs/en-null-ptr-deref-in-nfsd4_probe_callback.html > Co-developed-by: huhai <huhai@xxxxxxxxxx> > Signed-off-by: huhai <huhai@xxxxxxxxxx> > Signed-off-by: ChenXiaoSong <chenxiaosong@xxxxxxxxxx> > --- > fs/nfsd/nfssvc.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c > index 9b3d6cff0e1e..08b1f9ebdc2a 100644 > --- a/fs/nfsd/nfssvc.c > +++ b/fs/nfsd/nfssvc.c > @@ -270,13 +270,13 @@ static int nfsd_init_socks(struct net *net, const struct cred *cred) > return 0; > } > > -static int nfsd_users = 0; > +static atomic_t nfsd_users = ATOMIC_INIT(0); > > static int nfsd_startup_generic(void) > { > int ret; > > - if (nfsd_users++) > + if (atomic_fetch_inc(&nfsd_users)) > return 0; > > ret = nfsd_file_cache_init(); > @@ -291,13 +291,13 @@ static int nfsd_startup_generic(void) > out_file_cache: > nfsd_file_cache_shutdown(); > dec_users: > - nfsd_users--; > + atomic_dec(&nfsd_users); > return ret; > } > > static void nfsd_shutdown_generic(void) > { > - if (--nfsd_users) > + if (atomic_dec_return(&nfsd_users)) > return; > > nfs4_state_shutdown(); Isn't nfsd_users protected by the nfsd_mutex? It looks like it's held in all of the places this counter is accessed. -- Jeff Layton <jlayton@xxxxxxxxxx>