Re: [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale

Hi David,

I was surprised to see this because I'm working on the very same problem.
But since I didn't have a reproducer I've painstakingly worked through
the map reload related code.

I don't know if my changes have fixed the problem but I can post them
for you to try out. The main reason I would prefer to use my changes
(if they do fix the problem) is that I found quite a few problems with
map reload not working properly, which led to spending a bunch of time
on that. One of the changes fixes the valid entry lookup, removes setting
the entry negative in lookup_prune_one_cache() and, I think, fixes the
devid handling in do_readmap_mount().

I haven't quite finished the series yet, so I'll post it when I have,
hopefully today.

Umm, I probably should give your reproducer a go ... perhaps later ...

Ian


On 1/8/25 23:22, David Disseldorp wrote:
This effectively reverts commit 21ce28d ("autofs-5.1.4 - mark removed
cache entry negative"), which causes the kernel to stall in autofs_wait
for the following workload:

   cat > /etc/auto.direct <<EOF
   /nfs/share  $mount_args ${NFS_SERVER}:/${NFS_SHARE}
   EOF

   setsid --fork automount --debug --foreground &> /automount.log
   sleep 1

   touch /test.run
   setsid --fork /bin/bash -c \
     "while [[ -f /test.run ]]; do df -ia >> /test.log; sleep 1; done"
   echo "df loop logging to /test.log"

   sleep 2
   echo "changing and reloading auto.direct"
   echo > /etc/auto.direct
   killall -HUP automount

   sleep 2
   echo "unmounting..."
   umount /nfs/share || echo "umount failed"

The current behaviour sees us hit:
   handle_packet_missing_direct:1352: can't find map entry for ()
...which doesn't respond to the kernel, triggering the stall.

This approach adds a new MOUNT_FLAG_STALE flag to track removed map
entries, while keeping enough state around to respond in the
handle_packet_missing_direct case.

RFC:
- needs further testing (e.g. indirect maps)
- I'm not familiar with the codebase so this may be the wrong approach
- we may need a background job to purge MOUNT_FLAG_STALE entries?

Signed-off-by: David Disseldorp <ddiss@xxxxxxx>
---
  daemon/direct.c     |  8 ++++++--
  daemon/indirect.c   |  8 ++++++--
  daemon/lookup.c     | 11 ++++-------
  include/automount.h |  3 +++
  4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/daemon/direct.c b/daemon/direct.c
index 42baac8..5e78c40 100644
--- a/daemon/direct.c
+++ b/daemon/direct.c
@@ -1389,8 +1389,12 @@ int handle_packet_missing_direct(struct autofs_point *ap, autofs_packet_missing_
  		return 0;
  	}
-	/* Check if we recorded a mount fail for this key */
-	if (me->status >= monotonic_time(NULL)) {
+	/*
+	 * Check if we recorded a mount fail for this key, or the entry has
+	 * been removed.
+	 */
+	if (me->status >= monotonic_time(NULL) ||
+	    me->flags & MOUNT_FLAG_STALE) {
  		ops->send_fail(ap->logopt,
  			       ioctlfd, pkt->wait_queue_token, -ENOENT);
  		ops->close(ap->logopt, ioctlfd);
diff --git a/daemon/indirect.c b/daemon/indirect.c
index 7d4aad7..934bb74 100644
--- a/daemon/indirect.c
+++ b/daemon/indirect.c
@@ -798,8 +798,12 @@ int handle_packet_missing_indirect(struct autofs_point *ap, autofs_packet_missin
  	me = lookup_source_mapent(ap, pkt->name, LKP_DISTINCT);
  	if (me) {
-		/* Check if we recorded a mount fail for this key */
-		if (me->status >= monotonic_time(NULL)) {
+		/*
+		 * Check if we recorded a mount fail for this key, or the entry
+		 * has been removed.
+		 */
+		if (me->status >= monotonic_time(NULL) ||
+		    me->flags & MOUNT_FLAG_STALE) {
  			ops->send_fail(ap->logopt, ap->ioctlfd,
  				       pkt->wait_queue_token, -ENOENT);
  			cache_unlock(me->mc);
diff --git a/daemon/lookup.c b/daemon/lookup.c
index dc77948..ad0b460 100644
--- a/daemon/lookup.c
+++ b/daemon/lookup.c
@@ -1416,15 +1416,12 @@ void lookup_prune_one_cache(struct autofs_point *ap, struct mapent_cache *mc, ti
  		if (valid && valid->mc == mc) {
  			 /*
  			  * We've found a map entry that has been removed from
-			  * the current cache so it isn't really valid. Set the
-			  * mapent negative to prevent further mount requests
+			  * the current cache so it isn't really valid. Flag the
+			  * mapent stale to prevent further mount requests
  			  * using the cache entry.
  			  */
-			debug(ap->logopt, "removed map entry detected, mark negative");
-			if (valid->mapent) {
-				free(valid->mapent);
-				valid->mapent = NULL;
-			}
+			debug(ap->logopt, "removed map entry detected, mark stale");
+			valid->flags |= MOUNT_FLAG_STALE;
  			cache_unlock(valid->mc);
  			valid = NULL;
  		}
diff --git a/include/automount.h b/include/automount.h
index 9548db8..007d020 100644
--- a/include/automount.h
+++ b/include/automount.h
@@ -548,6 +548,9 @@ struct kernel_mod_version {
  /* Indicator for applications to ignore the mount entry */
  #define MOUNT_FLAG_IGNORE		0x1000
+/* map has been removed, but we can't clean up yet */
+#define MOUNT_FLAG_STALE		0x2000
+
  struct autofs_point {
  	pthread_t thid;
  	char *path;			/* Mount point name */
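
For anyone following the thread without the autofs tree to hand, the idea in the patch can be sketched in isolation. This is only an illustration, not the daemon code: struct sketch_mapent, sketch_mark_stale() and sketch_check_request() are made-up names, the struct is cut down to two fields, and the current time is passed in rather than read from monotonic_time(). Only the MOUNT_FLAG_STALE value and the shape of the condition come from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Flag value taken from the patch; everything else is a simplified stand-in. */
#define MOUNT_FLAG_STALE 0x2000

/* Hypothetical, cut-down map entry; the real struct mapent has many more fields. */
struct sketch_mapent {
	unsigned int flags;	/* MOUNT_FLAG_* bits */
	time_t status;		/* negative timeout; 0 means no recorded mount fail */
};

/* Prune step: flag the removed entry rather than freeing its map data. */
void sketch_mark_stale(struct sketch_mapent *me)
{
	assert(me != NULL);
	me->flags |= MOUNT_FLAG_STALE;
}

/*
 * Missing-mount step: return -ENOENT when the request should be failed
 * back to the kernel, or 0 when a mount attempt may go ahead.
 */
int sketch_check_request(const struct sketch_mapent *me, time_t now)
{
	assert(me != NULL);
	if (me->status >= now || (me->flags & MOUNT_FLAG_STALE))
		return -ENOENT;
	return 0;
}
```

The point of the sketch is that the entry stays in the cache after pruning, so a later lookup still finds it and the handler has something to answer with, instead of falling through to the "can't find map entry" path without replying.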



