On 8/11/2025 3:06 PM, Junio C Hamano wrote: > When you have two or more objects with object names that share more > than half the length of the hash algorithm in use (e.g. 10 bytes for > SHA-1 that produces 20-byte/160-bit hash), find_unique_abbrev() > fails to show disambiguation. > > To see how many leading letters of a given full object name is > sufficiently unambiguous, the algorithm starts from a initial > length, guessed based on the estimated number of objects in the > repository, and see if another object that shares the prefix, and > keeps extending the abbreviation. The loop stops at GIT_MAX_RAWSZ, > which is counted as the number of bytes, since 5b20ace6 (sha1_name: > unroll len loop in find_unique_abbrev_r(), 2017-10-08); Wow. My first Git contribution. Some style issues that you're opportunistically cleaning up are due to my newness, for sure. > before that > change, it extended up to GIT_SHA1_HEXSZ, which was the correct > limit because the loop is adding one output letter per iteration > and back then SHA256 was not in the picture. > > Pass the max length of the hash being in use in the current > repository down the code path, and use it to compute the code to > update the abbreviation length required to make it unique. > > Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> > --- > object-name.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/object-name.c b/object-name.c > index 11aa0e6afc..8f9af57c0a 100644 > --- a/object-name.c > +++ b/object-name.c > @@ -680,6 +680,7 @@ static unsigned msb(unsigned long val) > struct min_abbrev_data { > unsigned int init_len; > unsigned int cur_len; > + unsigned int max_len; > char *hex; > struct repository *repo; > const struct object_id *oid; > @@ -699,12 +700,12 @@ static inline char get_hex_char_from_oid(const struct object_id *oid, > static int extend_abbrev_len(const struct object_id *oid, void *cb_data) > { > struct min_abbrev_data *mad = cb_data; > - > unsigned int i = mad->init_len; > + > while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i)) > i++; > > - if (i < GIT_MAX_RAWSZ && i >= mad->cur_len) > + if (mad->cur_len <= i && i < mad->max_len) > mad->cur_len = i + 1; This logic is all about not extending the abbreviation length to beyond the length of the hex array, so your limits make sense. Moving the comparisons here helps with readability. > return 0; > @@ -864,6 +865,7 @@ int repo_find_unique_abbrev_r(struct repository *r, char *hex, > mad.repo = r; > mad.init_len = len; > mad.cur_len = len; > + mad.max_len = hexsz; This new parameter required for allowing SHA256 is valuable. I agree that we shouldn't need a test case to guarantee this in the future. Good to clean up unnecessary uses of the macro limits. Thanks, -Stolee