On Sat, 2025-05-24 at 10:33 -0400, Mike Snitzer wrote:
> On Sat, May 24, 2025 at 08:05:19AM -0400, Jeff Layton wrote:
> > On Fri, 2025-05-23 at 23:53 -0400, Mike Snitzer wrote:
> > > On Fri, May 23, 2025 at 07:09:27PM -0400, Mike Snitzer wrote:
> > > > On Fri, May 23, 2025 at 06:40:45PM -0400, Jeff Layton wrote:
> > > > > On Fri, 2025-05-23 at 18:19 -0400, Mike Snitzer wrote:
> > > > > > On Fri, May 23, 2025 at 02:40:17PM -0400, Jeff Layton wrote:
> > > > > > > On Fri, 2025-05-23 at 14:29 -0400, Mike Snitzer wrote:
> > > > > > > > I don't know if $SUBJECT ever worked... but with latest 6.15 or
> > > > > > > > nfsd-testing if I just use pool_mode=global then all is fine.
> > > > > > > >
> > > > > > > > If pool_mode=pernode then mounting the container's NFSv3 export fails.
> > > > > > > >
> > > > > > > > I haven't started to dig into code yet but pool_mode=pernode works
> > > > > > > > perfectly fine if NFSD isn't running in a container.
> > > > > > > >
> > > > >
> > > > > Oops, I went and looked and nfsd isn't running in a container on these
> > > > > boxes. There are some other containerized apps running on the box, but
> > > > > nfsd isn't running in a container.
> > > >
> > > > OK.
> > > >
> > > > > > I'm using nfs-utils-2.8.2. I don't see any nfsd threads running if I
> > > > > > use "options sunrpc pool_mode=pernode".
> > > > > >
> > > > >
> > > > > I'll have a look soon, but if you figure it out in the meantime, let us
> > > > > know.
> > > >
> > > > Will do.
> > > >
> > > > Just the latest info I have, with sunrpc's pool_mode=pernode dd hangs
> > > > with this stack trace:
> > >
> > > Turns out this pool_mode=pernode issue is a regression caused by the
> > > very recent nfs-utils 2.8.2 (I rebuilt EL10's nfs-utils package,
> > > because why not upgrade to the latest!?).
> > >
> > > If I use EL9.5's latest nfs-utils-2.5.4-37.el8.x86_64 then sunrpc's
> > > pool_mode=pernode works fine.
> > >
> > > And this issue doesn't have anything to do with running in a container
> > > (it seemed to be container related purely because I happened to be
> > > seeing the issue with an EL9.5 container that had the EL10-based
> > > nfs-utils 2.8.2 installed).
> > >
> > > Steved, unfortunately I'm not sure what the problem is with the newer
> > > nfs-utils and setting "options sunrpc pool_mode=pernode"
> > >
> >
> > I tried to reproduce this using fedora-41 VMs (no f42 available for
> > virt-builder yet), but everything worked. I don't have any actual NUMA
> > hw here though, so maybe that matters?
> >
> > Can you run this on the nfs server and send back the output? I'm
> > wondering if this setting might not track the module option properly on
> > that host for some reason:
> >
> > # nfsdctl pool-mode
>
> (from EL9.5 container with nfs-utils 2.8.2)
> # nfsdctl pool-mode
> pool-mode: pernode
> npools: 2
>
> (on host)
> # numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7
> node 0 size: 11665 MB
> node 0 free: 9892 MB
> node 1 cpus: 8 9 10 11 12 13 14 15
> node 1 size: 6042 MB
> node 1 free: 5127 MB
> node distances:
> node   0   1
>   0:  10  20
>   1:  20  10
>
> (and yeahh I was aware the newer nfs-utils uses the netlink interface,
> will be interesting to pin down what the issue is with
> pool-mode=pernode)

Hi Mike,

I submitted a patch for this a couple of weeks ago:

https://lore.kernel.org/linux-nfs/20250527-rpc-numa-v1-1-fa1d98e9a900@xxxxxxxxxx/

Were you able to test it, and did it fix your issue?

Thanks,
-- 
Jeff Layton <jlayton@xxxxxxxxxx>