On Thu, 01 May 2025, Mike Snitzer wrote:
> Hi Neil,
>
> With this change a simple write to a file (using pNFS flexfiles)
> triggers this patch's nfs_close_local_fh WARN_ON:
>
> [ 261.589009] ------------[ cut here ]------------
> [ 261.589016] WARNING: CPU: 2 PID: 7220 at fs/nfs_common/nfslocalio.c:344 nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589045] Modules linked in: tls nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfsidmap nfsd auth_rpcgss nfs_acl nft_nat nft_ct nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth bridge stp llc nfs lockd grace nfs_localio sunrpc netfs nf_tables nfnetlink overlay rfkill vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vfat fat intel_rapl_msr intel_rapl_common kvm_intel kvm nd_pmem dax_pmem nd_e820 pktcdvd libnvdimm irqbypass ppdev crct10dif_pclmul crc32_pclmul vmw_balloon i2c_piix4 ghash_clmulni_intel vmw_vmci pcspkr joydev i2c_smbus rapl parport_pc parport xfs sr_mod sd_mod cdrom sg ata_generic ata_piix mptspi crc32c_intel libata scsi_transport_spi serio_raw mptscsih vmxnet3 mptbase dm_mod fuse [last unloaded: sunrpc]
> [ 261.589403] CPU: 2 UID: 0 PID: 7220 Comm: dd Kdump: loaded Tainted: G W ------- --- 6.12.24.0.hs.62.snitm+ #15
> [ 261.589414] Tainted: [W]=WARN
> [ 261.589417] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
> [ 261.589423] RIP: 0010:nfs_close_local_fh+0x1dd/0x1f0 [nfs_localio]
> [ 261.589440] Code: e2 ba 02 00 00 00 48 89 ee 4c 89 e7 e8 9c 9f cb e1 48 8b 43 20 48 85 c0 75 e2 48 89 ee 4c 89 e7 e8 28 a7 cb e1 e9 6f ff ff ff <0f> 0b e8 bc 36 d0 e1 e9 63 ff ff ff e8 02 86 8a e2 66 90 90 90 90
> [ 261.589447] RSP: 0018:ffffb0fac4d5bc98 EFLAGS: 00010282
> [ 261.589455] RAX: 0000000000000000 RBX: ffff94354d5f0270 RCX: 0000000000000002
> [ 261.589461] RDX: ffff943544085040 RSI: ffff9435455559e8 RDI: ffff943544085040
> [ 261.589466] RBP: ffff943544199e80 R08: 58132c4e3594ffff R09: fffffffae10a4a38
> [ 261.589472] R10: 0000000000000001 R11: 000000000000000f R12: ffff94354baae380
> [ 261.589477] R13: ffff94354baae3c0 R14: ffff943546916d08 R15: ffff943546916cf0
> [ 261.589499] FS: 00007fd6724ff580(0000) GS:ffff943773b00000(0000) knlGS:0000000000000000
> [ 261.589512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 261.589518] CR2: 00007fdade5900a8 CR3: 000000010f3a8002 CR4: 00000000001726f0
> [ 261.589539] Call Trace:
> [ 261.589545] <TASK>
> [ 261.589556] ff_layout_free_mirror+0x78/0xc0 [nfs_layout_flexfiles]
> [ 261.589575] ff_layout_free_layoutreturn+0x64/0x110 [nfs_layout_flexfiles]
> [ 261.589594] pnfs_roc_release+0x7e/0x140 [nfsv4]
> [ 261.589830] nfs4_free_closedata+0x6c/0x80 [nfsv4]
> [ 261.590013] rpc_free_task+0x36/0x60 [sunrpc]
> [ 261.590209] nfs4_do_close+0x269/0x330 [nfsv4]
> [ 261.590398] __put_nfs_open_context+0xcb/0x150 [nfs]
> [ 261.590546] nfs_file_release+0x39/0x60 [nfs]
> [ 261.590700] __fput+0xdc/0x2a0
> [ 261.590713] __x64_sys_close+0x3e/0x70
> [ 261.590723] do_syscall_64+0x7b/0x160
> [ 261.590736] ? clear_bhb_loop+0x45/0xa0
> [ 261.590744] ? clear_bhb_loop+0x45/0xa0
> [ 261.590769] ? clear_bhb_loop+0x45/0xa0
> [ 261.590777] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 261.590789] RIP: 0033:0x7fd671f2ebf8
> [ 261.590796] Code: 01 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 65 6b 2a 00 8b 00 85 c0 75 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 40 c3 0f 1f 80 00 00 00 00 53 89 fb 48 83 ec
> [ 261.590803] RSP: 002b:00007ffd9826ea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> [ 261.590812] RAX: ffffffffffffffda RBX: 0000558223012120 RCX: 00007fd671f2ebf8
> [ 261.590819] RDX: 0000000000100000 RSI: 0000000000000000 RDI: 0000000000000001
> [ 261.590826] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000000
> [ 261.590834] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000001
> [ 261.590843] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fd67232d000
> [ 261.590858] </TASK>
> [ 261.590864] ---[ end trace 0000000000000000 ]---
>
> After this last patch is applied, in nfs_close_local_fh() you have:
>
> 	/* tell nfs_uuid_put() to wait for us */
> 	RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> 	spin_unlock(&nfs_uuid->lock);
> 	rcu_read_unlock();
>
> 	ro_nf = xchg(&nfl->ro_file, RCU_INITIALIZER(NULL));
> 	rw_nf = xchg(&nfl->rw_file, RCU_INITIALIZER(NULL));
> 	nfs_to_nfsd_file_put_local(ro_nf);
> 	nfs_to_nfsd_file_put_local(rw_nf);
>
> 	rcu_read_lock();
> 	if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid)) {
> 		rcu_read_unlock();
> 		return;
> 	}
> 	/* Remove nfl from nfs_uuid->files list and signal nfs_uuid_put()
> 	 * that we are done.
> 	 */
> 	spin_lock(&nfs_uuid->lock);
> 	list_del_init(&nfl->list);
> 	wake_up_var_locked(&nfl->nfs_uuid, &nfs_uuid->lock);
> 	spin_unlock(&nfs_uuid->lock);
> 	rcu_read_unlock();
> }
>
> this is bogus right?:
>
> 	RCU_INIT_POINTER(nfl->nfs_uuid, NULL);
> 	...
> 	if (WARN_ON(rcu_access_pointer(nfl->nfs_uuid) != nfs_uuid))
>
> not sure what you were trying to do, maybe stale debugging? [but you didn't test so... ;) ]

Hi Mike,
 thanks for highlighting that. Yes, clearly bogus.

This code went through several iterations until I felt it was the right
shape. I added that WARN_ON at one point because I had dropped
rcu_read_lock and reclaimed it, and that was (I thought) all that was
protecting nfs_uuid. But that is certainly not needed now, even if it
was earlier in the development.

Thanks,
NeilBrown
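
For anyone following along, here is a minimal userspace sketch of why the quoted check fires on every close. It is only a model: struct uuid_model, struct file_model and close_with_stale_check() are hypothetical stand-ins invented for illustration, not the kernel definitions, and locking and RCU are left out entirely. It keeps just the order of operations from the quoted snippet: save the pointer, clear the field, then compare the field with the saved value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct uuid_model {
	int unused;
};

struct file_model {
	struct uuid_model *nfs_uuid;
};

/*
 * Models the order of operations in the quoted nfs_close_local_fh() body.
 * Returns true when the condition inside the WARN_ON would be true.
 */
static bool close_with_stale_check(struct file_model *nfl)
{
	struct uuid_model *nfs_uuid = nfl->nfs_uuid;

	if (!nfs_uuid)
		return false;	/* already closed, nothing to check */

	/* models RCU_INIT_POINTER(nfl->nfs_uuid, NULL) */
	nfl->nfs_uuid = NULL;

	/* ... the file references would be dropped here ... */

	/* The field is NULL and the saved pointer is not, so this is always true. */
	return nfl->nfs_uuid != nfs_uuid;
}
```

Calling close_with_stale_check() on a file_model whose nfs_uuid is set returns true every time, which matches the WARN seen on a plain close. In this model nothing between the clear and the compare can restore the field, so the comparison carries no information.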