Re: [PATCH v2 2/2] compat/mingw: fix EACCESS when opening files with `O_CREAT | O_EXCL`

On Fri, Mar 28, 2025 at 04:41:05PM +0100, Johannes Schindelin wrote:
> On Fri, 28 Mar 2025, Patrick Steinhardt wrote:
> > On Wed, Mar 26, 2025 at 01:20:12PM +0100, Johannes Schindelin wrote:
> > > I know that e.g. PostgreSQL used this undocumented function at least at
> > > some stage, but SQLite avoided it by introducing a simple poll strategy.
> > > We could also do that, but if there is already code in the reftable
> > > library that skips doing things if a `.lock` file exists, then doing the
> > > same if the `.lock` file cannot be created, too, should be a safe argument
> > > to make.
> >
> > I did stumble over the PostgreSQL patch at one point indeed, yeah.
> >
> > Thanks for the pointer to SQLite. It indeed has the following snippet:
> >
> >     #define winIoerrCanRetry1(a) (((a)==ERROR_ACCESS_DENIED)        || \
> >                                   ((a)==ERROR_SHARING_VIOLATION)    || \
> >                                   ((a)==ERROR_LOCK_VIOLATION)       || \
> >                                   ((a)==ERROR_DEV_NOT_EXIST)        || \
> >                                   ((a)==ERROR_NETNAME_DELETED)      || \
> >                                   ((a)==ERROR_SEM_TIMEOUT)          || \
> >                                   ((a)==ERROR_NETWORK_UNREACHABLE))
> >
> > The macro gets used via `winRetryIoerr()`, which various I/O functions
> > call to retry an operation, including `winOpen()` when it opens or
> > creates a file. And it indeed uses a rather simple polling system there
> > where it sleeps for 25ms up to 10 times.
> >
> > This certainly is something we could implement in `mingw_open()`: when
> > we see that `CreateFileW()` has returned any of the above errors we
> > simply retry the operation. It wouldn't fix the race itself, but it
> > would hopefully make it less likely to hit. If you would be okay with
> > such a solution I can implement it.
> >
> > Also, one thing to note: this problem isn't caused by the reftable
> > library, it's caused by the lockfile subsystem. So if we don't want to
> > do this in `mingw_open()`, any self-contained fix should go into the
> > lockfile system, not into the reftable library, because we may hit the
> > same symptoms anywhere else where we race around creation/deletion of a
> > lockfile. We just happen to hit this case in the reftable library
> > because the test is intentionally stress-testing and racing this code
> > path.
> 
> As I mentioned, I had hoped that we could address this at another layer.
> 
> But let's move forward with the `RtlGetLastNtStatus()` solution because,
> as you correctly pointed out, it is the only solution so far that lets Git
> determine precisely whether the underlying problem is a pending delete.
> 
> I had only one remaining concern: If `RtlGetLastNtStatus()` has not yet
> been initialized, would we not potentially overwrite the last NTSTATUS
> while initializing it? And the answer I can give to myself is: unlikely.
> The `ntdll` is already loaded, so there won't be an update to the
> `NTSTATUS` there; likewise, the `GetProcAddress()` call won't fail and
> hence won't update it either.
> 
> So let's go ahead with v2!

Great, thanks a lot for your expertise and guidance!

Patrick
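
The pending-delete check discussed above can be sketched roughly as follows. This is only an illustration and not the code from the patch: `last_nt_status()`, `errno_for_create_failure()` and `SKETCH_STATUS_DELETE_PENDING` are hypothetical names, and mapping the pending-delete case to `EEXIST` is an assumption about what a caller using `O_CREAT | O_EXCL` would want to see. The lookup is done lazily from the already-loaded `ntdll`, which is the situation the initialization concern above is about.

```c
#include <errno.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Win32 error code, defined here only so the sketch builds elsewhere. */
#define ERROR_ACCESS_DENIED 5L
#endif

/* NTSTATUS value reported for a file whose deletion is still pending. */
#define SKETCH_STATUS_DELETE_PENDING 0xC0000056UL

#ifdef _WIN32
typedef LONG (NTAPI *rtl_get_last_nt_status_fn)(void);

/*
 * Hypothetical helper: resolve RtlGetLastNtStatus() on first use.
 * ntdll is already mapped into every process, so GetModuleHandleW()
 * does not need to load anything. Returns 0 if the lookup fails.
 */
static unsigned long last_nt_status(void)
{
    static rtl_get_last_nt_status_fn fn;

    if (!fn) {
        HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
        if (ntdll)
            fn = (rtl_get_last_nt_status_fn)(void *)
                GetProcAddress(ntdll, "RtlGetLastNtStatus");
    }
    return fn ? (unsigned long)(ULONG)fn() : 0;
}
#endif

/*
 * Hypothetical helper: decide which errno to report after a failed
 * create. Returns EEXIST for "access denied because a delete is
 * pending", EACCES for any other access-denied failure, and 0 for
 * errors this sketch does not handle.
 */
static int errno_for_create_failure(unsigned long win32_err,
                                    unsigned long nt_status)
{
    if (win32_err != ERROR_ACCESS_DENIED)
        return 0;
    if (nt_status == SKETCH_STATUS_DELETE_PENDING)
        return EEXIST;
    return EACCES;
}
```

On Windows a caller would presumably pass `GetLastError()` and `last_nt_status()` right after the failed `CreateFileW()` call, before anything else has a chance to touch either value.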

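The SQLite-style polling quoted earlier in the thread, which was considered but not chosen, could look roughly like the sketch below. It follows the description given in the thread (25ms sleeps, up to 10 retries) rather than SQLite's actual code, and every name in it (`error_can_retry()`, `retry_transient()`, the callback types, `struct flaky_op`) is made up for illustration. The operation and the sleep are passed in as callbacks so that the loop can be exercised without touching the file system.

```c
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Win32 error codes, defined here only so the sketch builds elsewhere. */
#define ERROR_ACCESS_DENIED          5L
#define ERROR_SHARING_VIOLATION     32L
#define ERROR_LOCK_VIOLATION        33L
#define ERROR_DEV_NOT_EXIST         55L
#define ERROR_NETNAME_DELETED       64L
#define ERROR_SEM_TIMEOUT          121L
#define ERROR_NETWORK_UNREACHABLE 1231L
#endif

enum { RETRY_MAX = 10, RETRY_DELAY_MS = 25 };

/* Hypothetical callbacks: try the operation once (0 means success,
 * anything else is a Win32 error code), and sleep for some time. */
typedef unsigned long (*try_once_fn)(void *payload);
typedef void (*sleep_ms_fn)(unsigned ms);

/* Same set of errors as the macro quoted in the thread. */
static bool error_can_retry(unsigned long err)
{
    switch (err) {
    case ERROR_ACCESS_DENIED:
    case ERROR_SHARING_VIOLATION:
    case ERROR_LOCK_VIOLATION:
    case ERROR_DEV_NOT_EXIST:
    case ERROR_NETNAME_DELETED:
    case ERROR_SEM_TIMEOUT:
    case ERROR_NETWORK_UNREACHABLE:
        return true;
    default:
        return false;
    }
}

/* Try once, then retry up to RETRY_MAX times while the error looks
 * transient. Returns 0 on success or the last error code seen. */
static unsigned long retry_transient(try_once_fn try_once, void *payload,
                                     sleep_ms_fn sleep_ms)
{
    unsigned long err = try_once(payload);
    int attempt;

    for (attempt = 0; err && error_can_retry(err) && attempt < RETRY_MAX;
         attempt++) {
        if (sleep_ms)
            sleep_ms(RETRY_DELAY_MS);
        err = try_once(payload);
    }
    return err;
}

/* Usage example: a fake operation that fails a fixed number of times. */
struct flaky_op {
    int failures_left;
    unsigned long error;
    int calls;
};

static unsigned long flaky_try_once(void *payload)
{
    struct flaky_op *op = payload;

    op->calls++;
    if (op->failures_left > 0) {
        op->failures_left--;
        return op->error;
    }
    return 0;
}
```

As noted in the thread, a loop like this would not fix the race itself; it would only make it less likely to be hit.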


