Re: [PATCH] nfsd: Implement large extent array support in pNFS

Sergey Bashirov <sergeybashirov@xxxxxxxxx> · Thu, 12 Jun 2025 11:13:04 +0300

On Wed, Jun 11, 2025 at 11:33:27PM -0700, Christoph Hellwig wrote:
> On Wed, Jun 11, 2025 at 03:19:29PM +0300, Sergey Bashirov wrote:
> > > Normal operation should not cause that, what did you see there?
> >
> > I think, this is not an NFS implementation issue, but rather a question
> > of how to properly implement the client fencing. In a distributed
> > storage system, there is a delay between the time NFS server requests
> > a blocking of writes to a shared volume for a particular client and the
> > time that blocking takes effect. If we choose an optimistic approach and
> > assume that fencing is done by simply sending a request (without waiting
> > for actual processing by the underlying storage system), then we might
> > end up in the following situation.
>
> I guess this is using block layout and your own fencing?  Because
> with the SCSI layout we fence right from the kernel path before
> force returning the layout.  The fact that block layout can't do
> reliable fencing is the reason why I came up with the SCSI layout,
> that can.

Yes, you are right.

By the way, even with SCSI Persistent Reservations, fencing is not
entirely clean and simple. We tested the SCSI layout against a
third-party enterprise storage system, and its SCSI PR implementation
seems to be imperfect: we occasionally observed PR_KEYs being erroneously
revoked by the storage system. The NFS part of this setup worked fine,
though.
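
To make the optimistic-fencing window quoted above concrete, here is a
toy model of it. This is only a sketch, not nfsd or storage code: the
Storage class, the tick-based clock, fence_delay, and the function names
are all hypothetical and exist just to show the ordering problem.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Toy shared volume (hypothetical). A fence request only takes
    effect fence_delay ticks after it was issued."""
    fence_delay: int
    now: int = 0
    fence_effective_at: dict = field(default_factory=dict)  # client -> tick
    writes: list = field(default_factory=list)

    def request_fence(self, client):
        # Record the request and return the tick at which it becomes effective.
        effective_at = self.now + self.fence_delay
        self.fence_effective_at[client] = effective_at
        return effective_at

    def tick(self, n=1):
        # Advance the toy clock.
        self.now += n

    def write(self, client, data):
        # Reject the write only if the fence for this client is already effective.
        effective_at = self.fence_effective_at.get(client)
        if effective_at is not None and self.now >= effective_at:
            return False
        self.writes.append((client, data))
        return True


def recall_layout(storage, client, wait_for_fence):
    """Fence the client, then treat its layout as free.

    With wait_for_fence=False the server returns right after sending the
    request (the optimistic approach); with True it waits until the
    storage has actually applied the fence."""
    effective_at = storage.request_fence(client)
    if wait_for_fence:
        storage.tick(effective_at - storage.now)
    return storage.now


def stale_write_lands(wait_for_fence):
    """Return True if a late write from the fenced client still reaches the volume."""
    storage = Storage(fence_delay=3)
    recall_layout(storage, "client-A", wait_for_fence)
    # The server now considers the layout free; client-A sends a late write.
    return storage.write("client-A", b"stale")
```

In this model stale_write_lands(False) lets the late write through, while
stale_write_lands(True) rejects it, which is the difference between
sending a fence request and waiting for it to be processed.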

--
Sergey Bashirov