Re: [PATCH v2 4/4] packed-backend: mmap large "packed-refs" file during fsck

shejialuo <shejialuo@xxxxxxxxx> · Fri, 9 May 2025 23:21:34 +0800

On Thu, May 08, 2025 at 04:07:41PM -0400, Jeff King wrote:
> On Wed, May 07, 2025 at 10:54:03PM +0800, shejialuo wrote:
> 
> > diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> > index ae6b6845a6..ff744f1d4c 100644
> > --- a/refs/packed-backend.c
> > +++ b/refs/packed-backend.c
> > @@ -2079,7 +2079,7 @@ static int packed_fsck(struct ref_store *ref_store,
> >  {
> >  	struct packed_ref_store *refs = packed_downcast(ref_store,
> >  							REF_STORE_READ, "fsck");
> > -	struct strbuf packed_ref_content = STRBUF_INIT;
> > +	struct snapshot *snapshot = xcalloc(1, sizeof(*snapshot));
> 
> Minor, but is there any reason to allocate this here and not just:
> 
>   struct snapshot snapshot = { 0 };
> 
> ?

I simply copied this from the existing code. I will change it.

> 
> > @@ -2126,21 +2126,23 @@ static int packed_fsck(struct ref_store *ref_store,
> >  	if (!st.st_size)
> >  		goto cleanup;
> >  
> > -	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> > -		ret = error_errno(_("unable to read '%s'"), refs->path);
> > +	if (!allocate_snapshot_buffer(snapshot, fd, &st))
> >  		goto cleanup;
> > -	}
> 
> Looking at allocate_snapshot_buffer(), it will return 0 only when the
> file is empty (and thus there is nothing to allocate) and will
> otherwise die(). So we do not need to report any error when it fails.
> Good.
> 
> But that makes the "!st.st_size" check in the context redundant, doesn't
> it? It can just go away.
> 

Good catch. As I remember, this check did not exist in v1. I probably
made a mistake when rebasing the code. Thanks!

> > -	ret = packed_fsck_ref_content(o, ref_store, &sorted, packed_ref_content.buf,
> > -				      packed_ref_content.buf + packed_ref_content.len);
> > +	if (mmap_strategy == MMAP_TEMPORARY && snapshot->mmapped)
> > +		munmap_temporary_snapshot(snapshot);
> > +
> > +	ret = packed_fsck_ref_content(o, ref_store, &sorted, snapshot->start,
> > +				      snapshot->eof);
> 
> Why are we unmapping here before we use the content? That will create an
> allocated in-memory copy of the mmap'd content. I thought the whole
> point here was to avoid doing so.
> 

I simply followed what "create_snapshot" does. Actually, I am also quite
confused about this. If we eventually copy the content into user-space
memory anyway, why do we mmap on Windows in the first place?

My understanding is that after mmapping, we need to do some sanity
checks and then, if needed, sort the "packed-refs" file. So would
mmapping gain us some efficiency on Windows for this part?

> It does shorten the amount of time we hold the temporary mmap in place,
> but I don't think we care about that here. The whole point of
> MMAP_TEMPORARY is that we usually hold the packed-refs file open across
> many requests, and on some platforms (like Windows) we don't want to do
> that. But in this code path we plan to mmap, do our verification, and
> then drop the snapshot. So we're always "temporary" anyway.
> 
> I.e., I'd have expected this code to allocate_snapshot_buffer(), do its
> checks, and then call clear_snapshot_buffer().
> 

I will improve this in the next version.
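
As a rough illustration of the flow suggested above (map, check in
place, then drop the mapping), here is a standalone POSIX sketch. It is
not Git code: verify_file() and check_content() are hypothetical names,
and the check itself (the content must end with a newline) is only a
stand-in for the real fsck checks. In the patch, the corresponding steps
would be allocate_snapshot_buffer(), packed_fsck_ref_content() and
clear_snapshot_buffer().

```c
/* Standalone illustration, not Git code: map a file, check it, unmap it. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for the real checks: content must end with '\n'. */
static int check_content(const char *start, const char *eof)
{
	return (start == eof || eof[-1] == '\n') ? 0 : -1;
}

/* Returns 0 if the file is empty or passes the check, -1 otherwise. */
int verify_file(const char *path)
{
	struct stat st;
	char *buf;
	int ret = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0)
		goto cleanup;
	if (!st.st_size) {
		ret = 0; /* nothing to map, nothing to check */
		goto cleanup;
	}

	buf = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (buf == MAP_FAILED)
		goto cleanup;

	/* Check the mapped bytes in place; no heap copy is made. */
	ret = check_content(buf, buf + st.st_size);

	/* Drop the mapping as soon as the checks are done. */
	munmap(buf, st.st_size);
cleanup:
	close(fd);
	return ret;
}
```

The mapping only lives for the duration of the check, so the path is
"temporary" without ever copying the content into allocated memory.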

> -Peff