Re: [PATCH 02/10] hash: add a constant for the original hash algorithm

On Fri, Jun 20, 2025 at 08:43:07PM +0000, brian m. carlson wrote:
> On 2025-06-20 at 01:56:02, Junio C Hamano wrote:
> > "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes:
> > 
> > > We have a variety of uses of GIT_HASH_SHA1 littered throughout our
> > > code.  Some of these really mean to represent specifically SHA-1, but
> > > some actually represent the original hash algorithm used in Git which is
> > > implied by older formats and protocols which do not contain hash
> > > information.  For instance, the bundle v1 and v2 formats do not contain
> > > hash algorithm information, and thus SHA-1 is implied by the use of
> > > these formats.
> > 
> > Does that mean use of _ORIGINAL is a sign that these places should
> > keep using SHA-1 and should not change?
> 
> Yes.

I think this makes sense. There have been a bunch of locations in our
code base where I was left wondering whether the use of SHA-1 is
intentional or not. Making these explicit should make it a lot more
obvious which of these categories a callsite falls into.

[snip]
> > > Add a constant for documentary purposes which indicates this value.  It
> > > will always be the same as SHA-1, since this is an essential part of
> > > these formats, but its use indicates this particular reason and not any
> > > other reason why SHA-1 might be used.
> > 
> > I am not sure what this means.  If we use GIT_HASH_SHA1 in such
> > places explicitly (as opposed to GIT_HASH_DEFAULT), isn't it a sign
> > enough that with different versions of Git, that particular code
> > path should keep using SHA-1 no matter what the default is?
> 
> If we have a test helper that computes hashes and someone specified
> "sha1" on the command line, that's GIT_HASH_SHA1.  Someone said, "I'd
> like to use SHA-1."  Similarly, in the reftable code, we can read the
> byte value indicating that the reftable is in SHA-1 and that's an
> explicit decision.

Tiny nit: even for the reftable format it is not always clear whether it
is GIT_HASH_SHA1 or GIT_HASH_ORIGINAL. There are two versions of the
format:

  - The first version implicitly uses SHA-1, so this would be
    GIT_HASH_ORIGINAL.

  - The second version specifies the hash format, so it would be either
    GIT_HASH_SHA1 or GIT_HASH_SHA256.

But again, I think that this distinction is actually useful.
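
To illustrate the two cases, here is a rough sketch rather than the
actual reftable code. The enum, its values and the function name are
all made up for this example, and it assumes the version 2 hash field
has already been decoded into one of the enum values:

```c
#include <assert.h>

/* Hypothetical stand-ins for Git's hash algorithm constants. */
enum example_hash {
	EXAMPLE_HASH_UNKNOWN = 0,
	EXAMPLE_HASH_SHA1 = 1,
	EXAMPLE_HASH_SHA256 = 2,
	/* Same value as SHA-1, but records "implied by a legacy format". */
	EXAMPLE_HASH_ORIGINAL = EXAMPLE_HASH_SHA1
};

static_assert(EXAMPLE_HASH_ORIGINAL == EXAMPLE_HASH_SHA1,
	      "the original hash is always SHA-1");

/*
 * Pick the hash for a reftable given its format version and, for
 * version 2, the algorithm that the table itself declared.
 */
enum example_hash example_reftable_hash(int version, enum example_hash declared)
{
	switch (version) {
	case 1:
		/* Version 1 carries no hash information; SHA-1 is implied. */
		return EXAMPLE_HASH_ORIGINAL;
	case 2:
		/* Version 2 states its hash explicitly. */
		if (declared == EXAMPLE_HASH_SHA1 || declared == EXAMPLE_HASH_SHA256)
			return declared;
		return EXAMPLE_HASH_UNKNOWN;
	default:
		/* Unknown format version. */
		return EXAMPLE_HASH_UNKNOWN;
	}
}
```

Both branches can end up returning the same number, but the name used
in the version 1 branch tells the reader that nobody chose SHA-1 there.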

> If we default to SHA-1 because nobody specified extensions.objectformat,
> then that's GIT_HASH_ORIGINAL.  Nobody made a decision or opted into an
> algorithm; we just didn't think hard enough about cryptographic agility
> in the original Git and we assumed SHA-1.
> 
> They're both the same numeric constant here and always will be (even if,
> in a future version of Git, we get rid of SHA-1 altogether and we
> otherwise die on that code).  But there's a difference in intention: one
> explicitly stated SHA-1 as opposed to a different algorithm and one just
> got a default because that's the compatible legacy behaviour.

Yup.
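
For the extensions.objectformat case, the difference in intention
could be sketched roughly like this. None of these names exist in Git;
the enum and the helper are hypothetical, and the config value is
assumed to be passed in as a plain string (NULL when unset):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for Git's hash algorithm constants. */
enum example_hash {
	EXAMPLE_HASH_UNKNOWN = 0,
	EXAMPLE_HASH_SHA1 = 1,
	EXAMPLE_HASH_SHA256 = 2,
	/* Documentary alias: SHA-1 because of legacy behaviour. */
	EXAMPLE_HASH_ORIGINAL = EXAMPLE_HASH_SHA1
};

static_assert(EXAMPLE_HASH_ORIGINAL == EXAMPLE_HASH_SHA1,
	      "the original hash is always SHA-1");

/* Map an object format setting (or its absence) to a hash algorithm. */
enum example_hash example_repo_hash(const char *object_format)
{
	/* Nothing configured: nobody opted in, so use the legacy default. */
	if (!object_format)
		return EXAMPLE_HASH_ORIGINAL;
	/* Someone explicitly asked for SHA-1. */
	if (!strcmp(object_format, "sha1"))
		return EXAMPLE_HASH_SHA1;
	if (!strcmp(object_format, "sha256"))
		return EXAMPLE_HASH_SHA256;
	return EXAMPLE_HASH_UNKNOWN;
}
```

The unset case and the explicit "sha1" case compare equal at runtime;
only the spelling at the callsite differs.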

Patrick



