Re: Question: how will sha256sum be implemented in git

On 5 July 2025 12:57:29 am IST, "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>On 2025-07-04 at 11:18:12, Aditya Garg wrote:
>> Hi all
>> 
>> I just read that Git aims to transition to SHA-256 by default, and that conversion from SHA-1 to SHA-256 is needed for old
>> repos. I was just curious how that will be achieved.
>> 
>> Dumb idea, but maybe we can just hash the existing SHA-1 sums' hex strings with SHA-256?
>> 
>> E.g.:
>> 
>> $ echo -n 8994f255af5451b6cd1db01ee16d8cf15b9df81e | sha256sum
>> bf8d6d915848377db81ee47e883c0a683b3d86a49ab120191ea1c3d76a30c33f *-
>> 
>> so bf8d6d915848377db81ee47e883c0a683b3d86a49ab120191ea1c3d76a30c33f will be our new commit hash.
>
>This would unfortunately still be vulnerable to collisions in SHA-1,
>which is the problem we're trying to avoid.  For instance, if I can
>create two blobs with that SHA-1 hash, then I can also create two blobs
>with the corresponding SHA-256 value, since the input in this case is
>just the SHA-1 value.
>
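A minimal sketch of the collision argument above. Real SHA-1 collisions are expensive to produce, so this assumes a stand-in "weak" hash (SHA-1 truncated to 16 bits) purely so that a collision can be found quickly; `weak_id`, `derived_id`, and `direct_id` are hypothetical names for illustration, not anything in Git.

```python
import hashlib
from itertools import count


def weak_id(data: bytes) -> str:
    # Stand-in for a broken hash: SHA-1 truncated to 16 bits (assumption,
    # chosen only so that collisions are cheap to find).
    return hashlib.sha1(data).hexdigest()[:4]


def derived_id(data: bytes) -> str:
    # The proposed scheme: SHA-256 over the hex string of the weak hash.
    return hashlib.sha256(weak_id(data).encode("ascii")).hexdigest()


def direct_id(data: bytes) -> str:
    # SHA-256 computed over the content itself.
    return hashlib.sha256(data).hexdigest()


def find_collision():
    # Brute-force two different inputs with the same weak hash.
    seen = {}
    for i in count():
        data = b"blob-%d" % i
        h = weak_id(data)
        if h in seen:
            return seen[h], data
        seen[h] = data


a, b = find_collision()
print("weak ids equal:   ", weak_id(a) == weak_id(b))
print("derived ids equal:", derived_id(a) == derived_id(b))
print("direct ids equal: ", direct_id(a) == direct_id(b))
```

Because the derived ID only ever sees the weak hash, two inputs that collide under the weak hash necessarily collide under the derived ID too, while hashing the content directly keeps them apart.
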
>The way we do the transition is pretty simple.  Blobs don't change; we
>just hash them with either SHA-1 or SHA-256.  For trees, we re-write all
>of the entries to use the SHA-256 object IDs instead of the SHA-1 object
>IDs and then we hash the result with SHA-256.  And for commits and tags,
>the headers that represent objects (tree, parent, and object) are
>converted in a similar manner and then, again, hashed with SHA-256.
>
>You can actually see how the conversion operates in
>`object-file-convert.c`.  `repo_oid_to_algop` converts an object from
>one format to another based on the loose object map outlined in
>`Documentation/technical/hash-function-transition.adoc`, or the v3 pack
>index functionality which is not yet upstream but is available in my
>`sha256-interop` branch.  In general, the hash function transition
>document explains a lot of the decisions behind why we're doing what
>we're doing and how it works.  I have to give credit to Jonathan Nieder
>for writing the document and to many people on the list for helping to
>contribute to it, and I encourage you to read it: it's not too complex.
>

I'll have a look.
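
A minimal sketch of the blob and tree steps described above, assuming the standard Git object encoding (a `<type> <size>\0` header followed by the body, and tree entries of the form `<mode> <name>\0<raw object ID>`). The helper names `object_id`, `build_tree`, and `convert_tree` are hypothetical, the in-memory dictionary stands in for the loose object map, and this is not the code in `object-file-convert.c`. It only handles flat trees of regular files.

```python
import hashlib


def object_id(algo: str, kind: bytes, body: bytes) -> bytes:
    # Git hashes "<type> <size>\0" followed by the object body.
    header = b"%s %d\x00" % (kind, len(body))
    return hashlib.new(algo, header + body).digest()


def build_tree(algo: str, blobs: dict) -> bytes:
    # Tree body: "<mode> <name>\0<raw oid>" per entry, sorted by name.
    out = bytearray()
    for name in sorted(blobs):
        out += b"100644 " + name + b"\x00"
        out += object_id(algo, b"blob", blobs[name])
    return bytes(out)


def convert_tree(sha1_body: bytes, sha1_to_sha256: dict) -> bytes:
    # Copy each "<mode> <name>\0" unchanged and swap the 20-byte SHA-1
    # object ID for the mapped 32-byte SHA-256 object ID.
    out = bytearray()
    pos = 0
    while pos < len(sha1_body):
        nul = sha1_body.index(b"\x00", pos)
        out += sha1_body[pos:nul + 1]
        out += sha1_to_sha256[sha1_body[nul + 1:nul + 21]]
        pos = nul + 21
    return bytes(out)


blobs = {b"a.txt": b"hello\n", b"b.txt": b"world\n"}

# Blobs don't change: the same content is simply hashed with each algorithm.
mapping = {
    object_id("sha1", b"blob", c): object_id("sha256", b"blob", c)
    for c in blobs.values()
}

sha1_tree = build_tree("sha1", blobs)
converted = convert_tree(sha1_tree, mapping)

print("sha1 tree:  ", object_id("sha1", b"tree", sha1_tree).hex())
print("sha256 tree:", object_id("sha256", b"tree", converted).hex())
```

The converted tree is byte-for-byte what you would get by building the tree with SHA-256 object IDs from scratch, so the SHA-256 tree ID never depends on any SHA-1 value.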

>So with this approach, the SHA-256 object ID is computed totally
>independently of the SHA-1 object ID but in the exact same way, just
>with SHA-256 object IDs inside.  We already have support for
>SHA-256-only repositories right now: you can do `git init
>--object-format=sha256` and create one, although not all forges and
>tools currently support them.
>
>The process of the conversion when we're in interoperability mode means
>that we can take a repository that's in SHA-1, convert it to SHA-256,
>continue to interoperate with the old SHA-1 version if we like, and
>then, when we no longer want to use SHA-1, simply stick with the SHA-256
>version and avoid using SHA-1 at all.  That's part of what I'm working
>on right now, and I'm pleased to report that I'm making a good amount of
>progress.  If you're able to attend Git Merge this year, either in
>person or remotely, I'll be giving a talk on this topic.

I'll see if attending remotely is possible. I don't have a US visa for attending in person, nor does it suit my budget.

>
>I'm also planning to open a discussion on the list within the next
>couple days or weeks about some protocol extensions that will be
>necessary to let us fetch, clone, and push all repositories in
>interoperability mode, so please feel free to follow along for that.

Great!




