Re: ETXTBSY window in __fput

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Tue, 26 Aug 2025 23:00:33 +0100

On Wed, Aug 27, 2025 at 12:05:38AM +0300, Alexander Monakov wrote:
> Dear fs hackers,
> 
> I suspect there's an unfortunate race window in __fput where file locks are
> dropped (locks_remove_file) prior to decreasing writer refcount
> (put_file_access). If I'm not mistaken, this window is observable and it
> breaks a solution to the ETXTBSY problem on exec'ing a just-written file, explained
> in more detail below.
> 
> The program demonstrating the problem is attached (a slightly modified version
> of the demo given by Russ Cox on the Go issue tracker, see URL in first line).
> It makes 20 threads, each executing an infinite loop doing the following:
> 
> 1) open an fd for writing with O_CLOEXEC
> 2) write executable code into it
> 3) close it
> 4) fork
> 5) in the child, attempt to execve the just-written file
> 
> If you compile it with -DNOWAIT, you'll see that execve often fails with
> ETXTBSY. This happens if another thread forked while we were holding an open fd
> between steps 1 and 3, our fd "leaked" in that child, and then we reached our
> step 5 before that child did execve (at which point the leaked fd would be
> closed thanks to O_CLOEXEC).

Egads...  Let me get it straight - you have a bunch of threads sharing descriptor
tables and some of them are forking (or cloning without shared descriptor tables)
while that is going on?

Frankly, in such a situation I would spawn a thread for that, have it do
unshare(CLONE_FILES), replace the binary and bugger off, with the parent
waiting for it to complete.
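
A rough sketch of that approach, in case it helps make the idea concrete. The
names write_job, write_private and write_file_unshared are made up for
illustration and are not taken from the attached demo. The assumption is that
a descriptor opened after unshare(CLONE_FILES) lives only in the helper
thread's private table, so a fork() in any other thread cannot inherit it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical job description handed to the helper thread. */
struct write_job {
	const char *path;
	const void *data;
	size_t len;
	int err;		/* 0 on success, otherwise an errno value */
};

static void *write_private(void *arg)
{
	struct write_job *job = arg;
	const char *p = job->data;
	size_t left = job->len;
	int fd;

	/* Detach this thread from the shared descriptor table. */
	if (unshare(CLONE_FILES) != 0) {
		job->err = errno;
		return NULL;
	}

	/* From here on, fd exists only in this thread's private table. */
	fd = open(job->path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0755);
	if (fd < 0) {
		job->err = errno;
		return NULL;
	}

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			job->err = errno;
			break;
		}
		p += n;
		left -= (size_t)n;
	}

	/* Close explicitly before returning, so the writer is gone by join time. */
	if (close(fd) != 0 && job->err == 0)
		job->err = errno;
	return NULL;
}

/* Replace the file from a helper thread; returns 0 or an errno value. */
static int write_file_unshared(const char *path, const void *data, size_t len)
{
	struct write_job job = { path, data, len, 0 };
	pthread_t tid;
	int rc = pthread_create(&tid, NULL, write_private, &job);

	if (rc != 0)
		return rc;
	pthread_join(tid, NULL);
	return job.err;
}
```

The caller would then fork and execve the file only after
write_file_unshared() has returned 0. Note that unshare(CLONE_FILES) copies
whatever is in the shared table at that moment, so this only helps if no
thread ever holds a write fd for such a binary in the shared table.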