On Wed, Aug 27, 2025 at 12:05:38AM +0300, Alexander Monakov wrote:
> Dear fs hackers,
>
> I suspect there's an unfortunate race window in __fput where file locks are
> dropped (locks_remove_file) prior to decreasing writer refcount
> (put_file_access). If I'm not mistaken, this window is observable and it
> breaks a solution to ETXTBSY problem on exec'ing a just-written file, explained
> in more detail below.
>
> The program demonstrating the problem is attached (a slightly modified version
> of the demo given by Russ Cox on the Go issue tracker, see URL in first line).
> It makes 20 threads, each executing an infinite loop doing the following:
>
> 1) open an fd for writing with O_CLOEXEC
> 2) write executable code into it
> 3) close it
> 4) fork
> 5) in the child, attempt to execve the just-written file
>
> If you compile it with -DNOWAIT, you'll see that execve often fails with
> ETXTBSY. This happens if another thread forked while we were holding an open fd
> between steps 1 and 3, our fd "leaked" in that child, and then we reached our
> step 5 before that child did execve (at which point the leaked fd would be
> closed thanks to O_CLOEXEC).

Egads...  Let me get it straight - you have a bunch of threads sharing
descriptor tables, and some of them are forking (or cloning without shared
descriptor tables) while that is going on?

Frankly, in such a situation I would spawn a thread for that, have it do
unshare(CLONE_FILES), replace the binary and bugger off, with the parent
waiting for it to complete.
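
For illustration only, here is a rough sketch of how the suggestion above might
look in userspace. It is one possible reading, not code from this thread: the
names write_job, writer_thread and write_binary_isolated are hypothetical, and
it assumes Linux with glibc and pthreads. The helper thread takes a private copy
of the descriptor table before opening the file, so the writable fd never
appears in the table that the forking threads share.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical job descriptor; not taken from the original demo program. */
struct write_job {
	const char *path;
	const void *data;
	size_t len;
	int err;		/* 0 on success, -1 on any failure */
};

static void *writer_thread(void *arg)
{
	struct write_job *job = arg;
	const char *p = job->data;
	size_t left = job->len;
	int fd;

	job->err = -1;

	/* Detach from the shared descriptor table: fds opened from here on
	 * are invisible to the other threads and to children they fork. */
	if (unshare(CLONE_FILES) != 0)
		return NULL;

	fd = open(job->path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0755);
	if (fd < 0)
		return NULL;

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n <= 0) {	/* treat errors and zero-length writes as failure */
			close(fd);
			return NULL;
		}
		p += n;
		left -= (size_t)n;
	}

	if (close(fd) == 0)
		job->err = 0;
	return NULL;
}

/* Write the file from a short-lived thread and wait for it to finish. */
int write_binary_isolated(const char *path, const void *data, size_t len)
{
	struct write_job job = { path, data, len, -1 };
	pthread_t t;

	if (pthread_create(&t, NULL, writer_thread, &job) != 0)
		return -1;
	if (pthread_join(t, NULL) != 0)
		return -1;
	return job.err;
}
```

A caller would do write_binary_isolated(path, buf, len) and only then fork and
execve. Note that the private table starts as a copy of the shared one, so the
helper thread still holds duplicates of the fds that were already open until it
exits. Some sandboxes also filter unshare(2), in which case the helper reports
failure.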