Re: [PATCH v2] nfs: add dummy definition for nfsd_file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 23 Apr 2025, Pali Rohár wrote:
> On Wednesday 23 April 2025 07:54:40 NeilBrown wrote:
> > On Wed, 23 Apr 2025, Pali Rohár wrote:
> > > On Sunday 20 April 2025 12:12:22 Mike Snitzer wrote:
> > > > On Sat, Apr 19, 2025 at 01:52:31PM -0400, Chuck Lever wrote:
> > > > > On 4/18/25 5:34 PM, Mike Snitzer wrote:
> > > > > > On Wed, Apr 16, 2025 at 09:31:55AM -0400, Chuck Lever wrote:
> > > > > >> On 4/15/25 10:41 PM, Vincent Mailhol wrote:
> > > > > >>> +CC: Neil Brown
> > > > > >>> +CC: Olga Kornievskaia
> > > > > >>> +CC: Dai Ngo
> > > > > >>> +CC: Tom Talpey
> > > > > >>> +CC: Trond Myklebust
> > > > > >>> +CC: Anna Schumaker
> > > > > >>>
> > > > > >>> (just to make sure that anyone listed in
> > > > > >>>
> > > > > >>>   ./scripts/get_maintainer.pl fs/nfs_common/nfslocalio.c
> > > > > >>>
> > > > > >>> get copied).
> > > > > >>>
> > > > > >>> Here is the link to the full thread:
> > > > > >>>
> > > > > >>>   https://lore.kernel.org/all/Z_coQbSdvMWD92IA@xxxxxxxxxx/
> > > > > >>>
> > > > > >>>
> > > > > >>> On 10/04/2025 at 11:09, Mike Snitzer:
> > > > > >>>> Add dummy definition for nfsd_file in both nfslocalio.c and localio.c
> > > > > >>>> so various compilers (e.g. gcc 8.5.0 and 9.5.0) can be used. Otherwise
> > > > > >>>> RCU code (rcu_dereference and rcu_access_pointer) will dereference
> > > > > >>>> what should just be an opaque pointer (by using typeof(*ptr)).
> > > > > >>>>
> > > > > >>>> Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client")
> > > > > >>>> Cc: stable@xxxxxxxxxxxxxxx
> > > > > >>>> Tested-by: Jeff Johnson <jeff.johnson@xxxxxxxxxxxxxxxx>
> > > > > >>>> Tested-by: Pali Rohár <pali@xxxxxxxxxx>
> > > > > >>>> Tested-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>
> > > > > >>>> Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> > > > > >>>
> > > > > >>> Hi everyone,
> > > > > >>>
> > > > > >>> The build has been broken for several weeks already. Does anyone have
> > > > > >>> intention to pick-up this patch?
> > > > > >>>
> > > > > >>> (please ignore if someone already picked it up and if it is already on
> > > > > >>> its way to Linus's tree).
> > > > > >>
> > > > > >> I assumed that, like all LOCALIO-related changes, this fix would go
> > > > > >> in through the NFS client tree. Let me know if it needs to go via NFSD.
> > > > > > 
> > > > > > Since we haven't heard from Trond or Anna about it, I think you'd be
> > > > > > perfectly fine to pick it up.  It is a compiler fixup associated with
> > > > > > nfd_file being kept opaque to the client -- but given it is "struct
> > > > > > nfsd_file" that gives you full license to grab it (IMO).
> > > > > > 
> > > > > > I'm also unaware of any conflicting changes in the NFS client tree.
> > > > > 
> > > > > Hi Mike -
> > > > > 
> > > > > I just looked at this one again. The patch's diffstat is:
> > > > > 
> > > > >  fs/nfs/localio.c           | 8 ++++++++
> > > > >  fs/nfs_common/nfslocalio.c | 8 ++++++++
> > > > > 
> > > > > Although fs/nfs_common/ is part of both trees, fs/nfs/localio.c is
> > > > > definitely a client file. I'm still happy to pick it up, but technically
> > > > > I would need an Acked-by: from one of the NFS client maintainers.
> > > > > 
> > > > > My impression is that Trond is managing the NFS client pulls for v6.15.
> > > > 
> > > > Sure, that's my understanding too.  Feel free to offer your Acked-by
> > > > (for fs/nfs_common/) and hopefully it'll get picked up.  I can
> > > > followup with Trond later this coming week if/as needed.
> > > > 
> > > > Thanks,
> > > > Mike
> > > 
> > > Can we move forward here? The compilation of kernel is broken for at
> > > least 2 months. Also I have tried to contact Trond for more months but
> > > has not responded to my emails.
> > > 
> > > Could be this change picked with a slightly higher priority than just
> > > waiting another two months? Note that nobody objected this particular
> > > fix and there are 3+ Tested-by lines. And it is not a good image if some
> > > kernel component does not compile...
> > > 
> > 
> > Actually I do object to this fix (though I've been busy and hadn't had
> > much change to look at it properly).
> > The fix is ugly.  At the very least it should be wrapping in an 
> >    #if  GCC_VERSION  < whatever
> 
> The problem is that this compile issue happens also with some builds of
> gcc 13.3.0 as was discussed in the email thread.
> 
> So is not clear what is that "whatever". For me it looks like that it
> would be more than the version, probably also some build characteristics
> of gcc. But I have not been investigating it deeper.

Fair enough - let's skip that idea then.

I'm finding a few different races in this code.  They are subtle and
unlikely but should be fixed.  So I'd really rather it did continue to
fail to compile until the various problems are fixed.

For example nfs_close_local_fh() can be called either from
nfs_uuid_put() or from a couple of places in nfs code such 
as __put_nfs_open_context().  These can race.  We really need the files
to be closed before either completes, but there is no interlock to
ensure this.  So e.g. nfs_uuid_put() could module_put(nfs_mod) while
the other caller is still inside nfsd code trying to put the file.

Also nfs_uuid_put() moves the list of files onto a private list which is
no protected by any spinlock, and nfs_close_local_fh() could remove it
from that list asynchronously (thinking it is removing it from the
original lock-protected list) which could cause list corruption.

So thanks for reporting this problem!!

NeilBrown





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux