Re: unable to run NFSD in container if "options sunrpc pool_mode=pernode"

On Fri, May 23, 2025 at 06:40:45PM -0400, Jeff Layton wrote:
> On Fri, 2025-05-23 at 18:19 -0400, Mike Snitzer wrote:
> > On Fri, May 23, 2025 at 02:40:17PM -0400, Jeff Layton wrote:
> > > On Fri, 2025-05-23 at 14:29 -0400, Mike Snitzer wrote:
> > > > I don't know if $SUBJECT ever worked... but with latest 6.15 or
> > > > nfsd-testing if I just use pool_mode=global then all is fine.
> > > > 
> > > > If pool_mode=pernode then mounting the container's NFSv3 export fails.
> > > > 
> > > > I haven't started to dig into code yet but pool_mode=pernode works
> > > > perfectly fine if NFSD isn't running in a container.
> > > > 
> 
> Oops, I went and looked and nfsd isn't running in a container on these
> boxes. There are some other containerized apps running on the box, but
> nfsd isn't running in a container.

OK.

> > > > ps. yet another reason why pool_mode=pernode should be the default if
> > > > more than 1 NUMA node ;)
> > > 
> > > Huh, strange. I've no idea why that would be. What kernel is this?
> > 
> > It is this 6.12.24 based frankenbeast-ish kernel:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=kernel-6.12.24/main-testing
> > 
> > Basically just 6.12.24 + NFS and NFSD sync'd through nfs-testing and
> > nfsd-testing (so 6.15 NFS and NFSD going on 6.16).
> > 
> > But I also just verified that this kernel built on Chuck's
> > nfsd-testing branch (with 2 extra patches) has the same issue:
> > https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=cel-nfsd-testing-6.16
> > 
> > Here is the NFS related config:
> > 
> > CONFIG_NETWORK_FILESYSTEMS=y
> > CONFIG_NFS_FS=m
> > # CONFIG_NFS_V2 is not set
> > CONFIG_NFS_V3=m
> > CONFIG_NFS_V3_ACL=y
> > CONFIG_NFS_V4=m
> > # CONFIG_NFS_SWAP is not set
> > CONFIG_NFS_V4_1=y
> > CONFIG_NFS_V4_2=y
> > CONFIG_PNFS_FILE_LAYOUT=m
> > CONFIG_PNFS_BLOCK=m
> > CONFIG_PNFS_FLEXFILE_LAYOUT=m
> > CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
> > # CONFIG_NFS_V4_1_MIGRATION is not set
> > CONFIG_NFS_V4_SECURITY_LABEL=y
> > CONFIG_NFS_FSCACHE=y
> > # CONFIG_NFS_USE_LEGACY_DNS is not set
> > CONFIG_NFS_USE_KERNEL_DNS=y
> > CONFIG_NFS_DEBUG=y
> > CONFIG_NFS_DISABLE_UDP_SUPPORT=y
> > # CONFIG_NFS_V4_2_READ_PLUS is not set
> > CONFIG_NFSD=m
> > # CONFIG_NFSD_V2 is not set
> > CONFIG_NFSD_V3_ACL=y
> > CONFIG_NFSD_V4=y
> > CONFIG_NFSD_PNFS=y
> > # CONFIG_NFSD_BLOCKLAYOUT is not set
> > CONFIG_NFSD_SCSILAYOUT=y
> > # CONFIG_NFSD_FLEXFILELAYOUT is not set
> > # CONFIG_NFSD_V4_2_INTER_SSC is not set
> > CONFIG_NFSD_V4_SECURITY_LABEL=y
> > # CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
> > # CONFIG_NFSD_V4_DELEG_TIMESTAMPS is not set
> > CONFIG_GRACE_PERIOD=m
> > CONFIG_LOCKD=m
> > CONFIG_LOCKD_V4=y
> > CONFIG_NFS_ACL_SUPPORT=m
> > CONFIG_NFS_COMMON=y
> > CONFIG_NFS_COMMON_LOCALIO_SUPPORT=m
> > CONFIG_NFS_LOCALIO=y
> > CONFIG_NFS_V4_2_SSC_HELPER=y
> > CONFIG_SUNRPC=m
> > CONFIG_SUNRPC_GSS=m
> > CONFIG_SUNRPC_BACKCHANNEL=y
> > CONFIG_RPCSEC_GSS_KRB5=m
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA1=y
> > CONFIG_RPCSEC_GSS_KRB5_ENCTYPES_AES_SHA2=y
> > CONFIG_SUNRPC_DEBUG=y
> > CONFIG_SUNRPC_XPRT_RDMA=m
> > 
> > > FWIW, I just built a localio-enabled on a v6.12-uek kernel for our own
> > > purposes yesterday and it's running pool_mode=pernode. It seemed to
> > > work fine as a v3 DS, but I didn't test mounting the container's export
> > > directly.
> > 
> > OK, but you were able to access the v3 DS just fine (assuming pNFS
> > flexfiles layouts that point to your DS that is running NFSD in a
> > container) ?
> > 
> > I'm using nfs-utils-2.8.2.  I don't see any nfsd threads running if I
> > use "options sunrpc pool_mode=pernode".
> > 
> 
> I'll have a look soon, but if you figure it out in the meantime, let us
> know.

Will do.

Here is the latest info I have: with sunrpc's pool_mode=pernode, dd hangs
with this stack trace:

# cat /proc/8087/stack
[<0>] rpc_wait_bit_killable+0x25/0x80 [sunrpc]
[<0>] __rpc_execute+0x151/0x480 [sunrpc]
[<0>] rpc_execute+0xca/0xf0 [sunrpc]
[<0>] rpc_run_task+0x110/0x180 [sunrpc]
[<0>] nfs4_call_sync_custom+0xb/0x30 [nfsv4]
[<0>] nfs4_do_call_sync+0x69/0x90 [nfsv4]
[<0>] _nfs4_proc_getattr+0x128/0x160 [nfsv4]
[<0>] nfs4_proc_getattr+0x73/0x100 [nfsv4]
[<0>] nfs4_do_open+0x775/0x9d0 [nfsv4]
[<0>] nfs4_atomic_open+0xf7/0x100 [nfsv4]
[<0>] nfs_atomic_open+0x1e7/0x6c0 [nfs]
[<0>] path_openat+0xd38/0x11f0
[<0>] do_filp_open+0xae/0x120
[<0>] do_sys_openat2+0x24d/0x2a0
[<0>] do_sys_open+0x4f/0x90
[<0>] do_syscall_64+0x7b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

And if I just try to mount using v3 it fails with:

# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:52:04 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: portmap query retrying: RPC: Timed out
mount.nfs: prog 100003, trying vers=3, prot=17
mount.nfs: portmap query failed: RPC: Program not registered
mount.nfs: requested NFS version or transport protocol is not supported for /mnt/test

# rpcinfo -p 10.200.80.89
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100024    1   udp  45252  status
    100024    1   tcp  60557  status
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  40987  nlockmgr
    100021    3   udp  40987  nlockmgr
    100021    4   udp  40987  nlockmgr
    100021    1   tcp  36527  nlockmgr
    100021    3   tcp  36527  nlockmgr
    100021    4   tcp  36527  nlockmgr

(Not sure what's up with the portmap issues, or why it never progresses to
trying program 100005, which, as you can see below, it does with
pool_mode=global.)
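
As a rough aid for reading the output above (a sketch, not part of the original report): prot=6 and prot=17 in the mount.nfs log are the IP protocol numbers for TCP and UDP. The script below parses the pool_mode=pernode rpcinfo listing and checks whether program 100003 version 3 is registered for each transport. The helper names `parse_rpcinfo` and `is_registered` are hypothetical, introduced only for this illustration.

```python
# rpcinfo -p output from the pool_mode=pernode run, copied from above.
RPCINFO_PERNODE = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100024    1   udp  45252  status
    100024    1   tcp  60557  status
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  40987  nlockmgr
    100021    3   udp  40987  nlockmgr
    100021    4   udp  40987  nlockmgr
    100021    1   tcp  36527  nlockmgr
    100021    3   tcp  36527  nlockmgr
    100021    4   tcp  36527  nlockmgr
"""

# IP protocol numbers as printed by mount.nfs ("prot=6", "prot=17").
IP_PROTO = {6: "tcp", 17: "udp"}


def parse_rpcinfo(text):
    """Return the set of (program, version, proto) rows in rpcinfo -p output."""
    entries = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a data row.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        entries.add((int(fields[0]), int(fields[1]), fields[2]))
    return entries


def is_registered(entries, program, vers, ip_proto):
    """True if rpcbind lists the program/version for the given IP protocol."""
    return (program, vers, IP_PROTO[ip_proto]) in entries


registered = parse_rpcinfo(RPCINFO_PERNODE)
for prot in (6, 17):
    state = "registered" if is_registered(registered, 100003, 3, prot) else "not registered"
    print(f"prog 100003 vers 3 prot={prot} ({IP_PROTO[prot]}): {state}")
```

Read this way, the "Program not registered" result on prot=17 matches the listing, which shows 100003 for tcp only. That would leave the "Timed out" on the TCP attempt to port 2049 as the part still unexplained, though this is only an interpretation of the log.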

But if I just use sunrpc's default pool_mode=global:

# mount -vvvvvvv -o vers=3,nolock 10.200.80.89:/cvol_12_0 /mnt/test
mount.nfs: timeout set for Fri May 23 22:55:43 2025
mount.nfs: trying text-based options 'vers=3,nolock,addr=10.200.80.89'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.200.80.89 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.200.80.89 prog 100005 vers 3 prot UDP port 20048

# rpcinfo -p 10.200.80.89
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  54037  status
    100024    1   tcp  46339  status
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100005    2   udp  20048  mountd
    100005    2   tcp  20048  mountd
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    3   tcp   2049  nfs_acl
    100021    1   udp  36268  nlockmgr
    100021    3   udp  36268  nlockmgr
    100021    4   udp  36268  nlockmgr
    100021    1   tcp  44195  nlockmgr
    100021    3   tcp  44195  nlockmgr
    100021    4   tcp  44195  nlockmgr
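
To make the two rpcinfo listings above easier to compare, here is a minimal sketch (not part of the original report) that diffs them by (program, version, proto) and reports which programs changed port. The names `PERNODE`, `GLOBAL` and `parse_rpcinfo` are hypothetical, chosen only for this illustration.

```python
# rpcinfo -p rows copied from the two runs above (header lines omitted).
PERNODE = """\
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 20048 mountd
100005 1 tcp 20048 mountd
100005 2 udp 20048 mountd
100005 2 tcp 20048 mountd
100024 1 udp 45252 status
100024 1 tcp 60557 status
100005 3 udp 20048 mountd
100005 3 tcp 20048 mountd
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100021 1 udp 40987 nlockmgr
100021 3 udp 40987 nlockmgr
100021 4 udp 40987 nlockmgr
100021 1 tcp 36527 nlockmgr
100021 3 tcp 36527 nlockmgr
100021 4 tcp 36527 nlockmgr
"""

GLOBAL = """\
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 54037 status
100024 1 tcp 46339 status
100005 1 udp 20048 mountd
100005 1 tcp 20048 mountd
100005 2 udp 20048 mountd
100005 2 tcp 20048 mountd
100005 3 udp 20048 mountd
100005 3 tcp 20048 mountd
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 3 tcp 2049 nfs_acl
100021 1 udp 36268 nlockmgr
100021 3 udp 36268 nlockmgr
100021 4 udp 36268 nlockmgr
100021 1 tcp 44195 nlockmgr
100021 3 tcp 44195 nlockmgr
100021 4 tcp 44195 nlockmgr
"""


def parse_rpcinfo(text):
    """Map (program, version, proto) -> port for each rpcinfo -p data row."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            table[(int(fields[0]), int(fields[1]), fields[2])] = int(fields[3])
    return table


pernode = parse_rpcinfo(PERNODE)
global_ = parse_rpcinfo(GLOBAL)

# Registrations present in one mode but not the other.
only_pernode = sorted(set(pernode) - set(global_))
only_global = sorted(set(global_) - set(pernode))
# Programs registered in both modes but on different ports.
moved = sorted({key[0] for key in pernode.keys() & global_.keys()
                if pernode[key] != global_[key]})

print("only in pernode:", only_pernode)
print("only in global: ", only_global)
print("programs whose ports differ:", moved)
```

On this data the registered (program, version, proto) sets come out identical, and only status (100024) and nlockmgr (100021) change ports, which is normal for dynamically assigned ports. If that reading is right, rpcbind registration is not what distinguishes the two modes, which would be consistent with the earlier observation that no nfsd threads are running under pool_mode=pernode.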



