Re: [PATCH 1/2] BreakingChanges: announce switch to "reftable" format

On Wed, Jul 02, 2025 at 12:17:50PM -0500, Justin Tobler wrote:
> On 25/07/02 12:14PM, Patrick Steinhardt wrote:
> > diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc
> > index c6bd94986c5..c96b5319cdd 100644
> > --- a/Documentation/BreakingChanges.adoc
> > +++ b/Documentation/BreakingChanges.adoc
> > @@ -118,6 +118,45 @@ Cf. <2f5de416-04ba-c23d-1e0b-83bb655829a7@xxxxxxxxxxx>,
> >  <20170223155046.e7nxivfwqqoprsqj@LykOS.localdomain>,
> >  <CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@xxxxxxxxxxxxxx>.
> >  
> > +* The default storage format for references in newly created repositories will
> > +  be changed from "files" to "reftable". The "reftable" format provides
> > +  multiple advantages over the "files" format:
> > ++
> > +  ** It is impossible to store two references that only differ in casing on
> > +     case-insensitive filesystems with the "files" format. This issue is
> > +     especially common on Windows, but also on older versions of macOS. As the
> > +     "reftable" backend does not use filesystem paths anymore to encode
> > +     reference names this problem goes away.
> 
> I believe even modern macOS by default uses a case-insensitive
> file-system. Maybe we should instead say:
> 
>   This limitation is common on Windows and macOS platforms.

Okay, thanks for the clarification. I thought recent versions of macOS
were case-sensitive by default.
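
To illustrate the kind of collision discussed here, a minimal sketch
follows. It is not Git code: `colliding_refs` is a hypothetical helper, and
NFC normalization plus case folding is only an approximation of what a
case-insensitive, normalizing filesystem does with path names. It groups
reference names that would map to the same path under that approximation,
which is the situation the "files" backend cannot represent.

```python
import unicodedata
from collections import defaultdict


def colliding_refs(ref_names):
    """Group ref names that share a key after NFC normalization and case folding.

    Assumption: this approximates a case-insensitive, Unicode-normalizing
    filesystem; real filesystems differ in the exact rules they apply.
    """
    groups = defaultdict(list)
    for name in ref_names:
        key = unicodedata.normalize("NFC", name).casefold()
        groups[key].append(name)
    # Only groups with more than one member are actual collisions.
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    refs = [
        "refs/heads/Feature",
        "refs/heads/feature",
        "refs/heads/main",
    ]
    for group in colliding_refs(refs):
        print("would collide:", group)
```

Because the "reftable" backend stores reference names as data rather than
as filesystem paths, names like these can coexist there.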

> > +  ** Similarly, macOS normalizes path names that contain unicode characters,
> > +     which has the consequence that you cannot store two names with unicode
> > +     characters that are encoded differently with the "files" backend. Again,
> > +     this is not an issue with the "reftable" backend.
> > +  ** Deleting references with the "files" backend requires Git to rewrite the
> > +     complete "packed-refs" file. In large repositories with many references
> > +     this file can easily be dozens of megabytes in size, in extreme cases it
> > +     may be gigabytes. The "reftable" backend uses tombstone markers for
> > +     deleted references and thus does not have to rewrite all of its data.
> > +  ** Repository housekeeping with the "files" backend typically performs
> > +     all-into-one repacks of references. This can be quite expensive, and
> > +     consequently housekeeping is a tradeoff between the number of loose
> > +     references that accumulate and slow down operations that read references,
> > +     and compressing those loose references into the "packed-refs" file. The
> > +     "reftable" backend uses geometric compaction after every write, which
> > +     amortizes costs and ensures that the backend is always in a
> > +     well-maintained state.
> > +  ** Operations that write multiple references at once are not atomic with the
> > +     "files" backend. Consequently, Git may see in-between states when it reads
> > +     references while a reference transaction is in the process of being
> > +     committed to disk.
> > +  ** Writing many references at once is slow with the "files" backend because
> > +     every reference is created as a separate file. The "reftable" backend
> > +     significantly outperforms the "files" backend by multiple orders of
> > +     magnitude.
> 
> The examples above do a good job at explaining individual technical
> benefits. I do wonder if we should include a more general statement
> aimed at users as to why the change to reftables is beneficial. Maybe
> something like:
> 
>   The reftables backend addresses several performance concerns as the
>   number of references scale in a repository. 

I think this would be a bit too handwavy. I'd rather point out the
specific cases where we know it to perform better.

Patrick



