On Wed, Jun 11, 2025 at 11:33:27PM -0700, Christoph Hellwig wrote:
> On Wed, Jun 11, 2025 at 03:19:29PM +0300, Sergey Bashirov wrote:
> > > Normal operation should not cause that, what did you see there?
> >
> > I think, this is not an NFS implementation issue, but rather a question
> > of how to properly implement the client fencing. In a distributed
> > storage system, there is a delay between the time NFS server requests
> > a blocking of writes to a shared volume for a particular client and the
> > time that blocking takes effect. If we choose an optimistic approach and
> > assume that fencing is done by simply sending a request (without waiting
> > for actual processing by the underlying storage system), then we might
> > end up in the following situation.
>
> I guess this is using block layout and your own fencing? Because
> with the SCSI layout we fence right from the kernel path before
> force returning the layout. The fact that block layout can't do
> reliable fencing is the reason why I came up with the SCSI layout,
> that can.

Yes, you are right. By the way, even with SCSI Persistent Reservations,
fencing is not entirely clean and simple. We tested the SCSI layout
against a third-party enterprise storage system, but the SCSI PR
implementation there seems to be imperfect: we occasionally observed
PR_KEYs being erroneously revoked by the storage system. The NFS part
of this setup worked fine, though.

--
Sergey Bashirov