[PATCH v3 0/3] align the behavior when opening "packed-refs"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi All:

As discussed in [1], we need to use mmap mechanism to open large
"packed_refs" file to save the memory usage. This patch mainly does the
following things:

1: Fix an issue that we would report an error when the "packed-refs"
file is empty, which does not align with the runtime behavior.
2-4: Extract some logic from the existing code and then use these
created helper functions to let fsck code to use mmap necessarily

[1] https://lore.kernel.org/git/20250503133158.GA4450@xxxxxxxxxxxxxxxxxxxxxxx

Really thank Peff and Patrick to suggest me to do above change.

---

Change in v2:

1. Update the commit message of [PATCH 1/4]. And use redirection to
create an empty file instead of using `touch`.
2. Don't use if for the refactored function in [PATCH 3/4] and then
update the commit message to align with the new function name.
3. Enhance the commit message of [PATCH 4/4].

---

Change in v3:

1. Drop the patch which creates a new function
"munmap_temporary_snapshot". As discussed, there is no need to munmap
the file during fsck.
2. Allocate snapshot variable in the stack instead of heap.
3. Fix rebase issue, remove unneeded code to check the file size
explicitly.

The range-diff seems unreadable. Briefly, for this new version, the only
change is the last patch compared to the v2.

    packed-backend: mmap large "packed-refs" file during fsck

Thanks,
Jialuo

shejialuo (3):
  packed-backend: fsck should allow an empty "packed-refs" file
  packed-backend: extract snapshot allocation in `load_contents`
  packed-backend: mmap large "packed-refs" file during fsck

 refs/packed-backend.c    | 70 ++++++++++++++++++++++------------------
 t/t0602-reffiles-fsck.sh | 13 ++++++++
 2 files changed, 52 insertions(+), 31 deletions(-)

Range-diff against v2:
-:  ---------- > 1:  26c3fd55a8 packed-backend: fsck should allow an empty "packed-refs" file
1:  c0609afac9 ! 2:  4604be8b51 packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    @@ Metadata
     Author: shejialuo <shejialuo@xxxxxxxxx>
     
      ## Commit message ##
    -    packed-backend: extract munmap operation for `MMAP_TEMPORARY`
    +    packed-backend: extract snapshot allocation in `load_contents`
     
    -    "create_snapshot" would try to munmap the file when the "mmap_strategy"
    -    is "MMAP_TEMPORARY". We also need to do this operation when checking the
    -    consistency of the "packed-refs" file.
    +    "load_contents" would choose which way to load the content of the
    +    "packed-refs". However, we cannot directly use this function when
    +    checking the consistency due to we don't want to open the file. And we
    +    also need to reuse the logic to avoid causing repetition.
     
    -    Create a new function "munmap_temporary_snapshot" to do above and
    -    change "create_snapshot" to align with the behavior.
    +    Let's create a new helper function "allocate_snapshot_buffer" to extract
    +    the snapshot allocation logic in "load_contents" and update the
    +    "load_contents" to align with the behavior.
     
         Suggested-by: Jeff King <peff@xxxxxxxx>
         Suggested-by: Patrick Steinhardt <ps@xxxxxx>
         Signed-off-by: shejialuo <shejialuo@xxxxxxxxx>
     
      ## refs/packed-backend.c ##
    -@@ refs/packed-backend.c: static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct st
    - 	return 1;
    - }
    +@@ refs/packed-backend.c: static int refname_contains_nul(struct strbuf *refname)
    + 
    + #define SMALL_FILE_SIZE (32*1024)
      
    -+static void munmap_temporary_snapshot(struct snapshot *snapshot)
    ++static int allocate_snapshot_buffer(struct snapshot *snapshot, int fd, struct stat *st)
     +{
    -+	char *buf_copy;
    ++	ssize_t bytes_read;
     +	size_t size;
     +
    -+	if (!snapshot)
    -+		return;
    ++	size = xsize_t(st->st_size);
    ++	if (!size)
    ++		return 0;
    ++
    ++	if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
    ++		snapshot->buf = xmalloc(size);
    ++		bytes_read = read_in_full(fd, snapshot->buf, size);
    ++		if (bytes_read < 0 || bytes_read != size)
    ++			die_errno("couldn't read %s", snapshot->refs->path);
    ++		snapshot->mmapped = 0;
    ++	} else {
    ++		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    ++		snapshot->mmapped = 1;
    ++	}
     +
    -+	/*
    -+	 * We don't want to leave the file mmapped, so we are
    -+	 * forced to make a copy now:
    -+	 */
    -+	size = snapshot->eof - snapshot->start;
    -+	buf_copy = xmalloc(size);
    ++	snapshot->start = snapshot->buf;
    ++	snapshot->eof = snapshot->buf + size;
     +
    -+	memcpy(buf_copy, snapshot->start, size);
    -+	clear_snapshot_buffer(snapshot);
    -+	snapshot->buf = snapshot->start = buf_copy;
    -+	snapshot->eof = buf_copy + size;
    ++	return 1;
     +}
     +
      /*
       * Depending on `mmap_strategy`, either mmap or read the contents of
       * the `packed-refs` file into the snapshot. Return 1 if the file
    -@@ refs/packed-backend.c: static struct snapshot *create_snapshot(struct packed_ref_store *refs)
    - 		verify_buffer_safe(snapshot);
    - 	}
    +@@ refs/packed-backend.c: static int refname_contains_nul(struct strbuf *refname)
    +  */
    + static int load_contents(struct snapshot *snapshot)
    + {
    +-	int fd;
    + 	struct stat st;
    +-	size_t size;
    +-	ssize_t bytes_read;
    ++	int ret = 1;
    ++	int fd;
      
    --	if (mmap_strategy != MMAP_OK && snapshot->mmapped) {
    --		/*
    --		 * We don't want to leave the file mmapped, so we are
    --		 * forced to make a copy now:
    --		 */
    --		size_t size = snapshot->eof - snapshot->start;
    --		char *buf_copy = xmalloc(size);
    + 	fd = open(snapshot->refs->path, O_RDONLY);
    + 	if (fd < 0) {
    +@@ refs/packed-backend.c: static int load_contents(struct snapshot *snapshot)
    + 
    + 	if (fstat(fd, &st) < 0)
    + 		die_errno("couldn't stat %s", snapshot->refs->path);
    +-	size = xsize_t(st.st_size);
     -
    --		memcpy(buf_copy, snapshot->start, size);
    --		clear_snapshot_buffer(snapshot);
    --		snapshot->buf = snapshot->start = buf_copy;
    --		snapshot->eof = buf_copy + size;
    +-	if (!size) {
    +-		close(fd);
    +-		return 0;
    +-	} else if (mmap_strategy == MMAP_NONE || size <= SMALL_FILE_SIZE) {
    +-		snapshot->buf = xmalloc(size);
    +-		bytes_read = read_in_full(fd, snapshot->buf, size);
    +-		if (bytes_read < 0 || bytes_read != size)
    +-			die_errno("couldn't read %s", snapshot->refs->path);
    +-		snapshot->mmapped = 0;
    +-	} else {
    +-		snapshot->buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    +-		snapshot->mmapped = 1;
     -	}
    -+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    -+		munmap_temporary_snapshot(snapshot);
    +-	close(fd);
    + 
    +-	snapshot->start = snapshot->buf;
    +-	snapshot->eof = snapshot->buf + size;
    ++	if (!allocate_snapshot_buffer(snapshot, fd, &st))
    ++		ret = 0;
      
    - 	return snapshot;
    +-	return 1;
    ++	close(fd);
    ++	return ret;
      }
    + 
    + static const char *find_reference_location_1(struct snapshot *snapshot,
2:  c868e3dd16 ! 3:  82e19a65c6 packed-backend: mmap large "packed-refs" file during fsck
    @@ Commit message
         "packed-backend.c" use this way, we should make "fsck" align with the
         current codebase.
     
    -    As we have introduced two helper functions "allocate_snapshot_buffer"
    -    and "munmap_snapshot_if_temporary", we could simply call these functions
    -    to use mmap mechanism.
    +    As we have introduced the helper function "allocate_snapshot_buffer", we
    +    could simple use this function to use mmap mechanism.
     
         Suggested-by: Jeff King <peff@xxxxxxxx>
         Suggested-by: Patrick Steinhardt <ps@xxxxxx>
    @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
      	struct packed_ref_store *refs = packed_downcast(ref_store,
      							REF_STORE_READ, "fsck");
     -	struct strbuf packed_ref_content = STRBUF_INIT;
    -+	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
    ++	struct snapshot snapshot = { 0 };
      	unsigned int sorted = 0;
      	struct stat st;
      	int ret = 0;
     @@ refs/packed-backend.c: static int packed_fsck(struct ref_store *ref_store,
    - 	if (!st.st_size)
    + 		goto cleanup;
    + 	}
    + 
    +-	if (!st.st_size)
    ++	if (!allocate_snapshot_buffer(&snapshot, fd, &st))
      		goto cleanup;
      
     -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
     -		ret = error_errno(_("unable to read '%s'"), refs->path);
    -+	if (!allocate_snapshot_buffer(snapshot, fd, &st))
    - 		goto cleanup;
    +-		goto cleanup;
     -	}
    - 
    +-
     -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
     -				      packed_ref_content.buf + packed_ref_content.len);
    -+	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
    -+		munmap_temporary_snapshot(snapshot);
    -+
    -+	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
    -+				      snapshot->eof);
    ++	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot.start,
    ++				      snapshot.eof);
      	if (!ret && sorted)
     -		ret = packed_fsck_ref_sorted(o, ref_store, packed_ref_content.buf,
     -					     packed_ref_content.buf + packed_ref_content.len);
    -+		ret = packed_fsck_ref_sorted(o, ref_store, snapshot->start,
    -+					     snapshot->eof);
    ++		ret = packed_fsck_ref_sorted(o, ref_store, snapshot.start,
    ++					     snapshot.eof);
      
      cleanup:
      	if (fd >= 0)
      		close(fd);
     -	strbuf_release(&packed_ref_content);
    -+	clear_snapshot_buffer(snapshot);
    -+	free(snapshot);
    ++	clear_snapshot_buffer(&snapshot);
      	return ret;
      }
      
-- 
2.49.0





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux