"Ezekiel Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > +extern u64 xxh3_64(u8 const* ptr, usize size); > + > + > static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_t const *xpp, > xdlclassifier_t *cf, xdfile_t *xdf) { > unsigned long *ha; > @@ -175,14 +178,26 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_ > > xdl_parse_lines(mf, narec, xdf); > > + if ((xpp->flags & XDF_WHITESPACE_FLAGS) == 0) { > + for (usize i = 0; i < (usize) xdf->nrec; i++) { > + xrecord_t *rec = xdf->recs[i]; > + rec->ha = xxh3_64(rec->ptr, rec->size); > + } > + } else { > + for (usize i = 0; i < (usize) xdf->nrec; i++) { > + xrecord_t *rec = xdf->recs[i]; > + char const* dump = (char const*) rec->ptr; > + rec->ha = xdl_hash_record(&dump, (char const*) (rec->ptr + rec->size), xpp->flags); > + } > + } As a technology demonstration and proof of concept patch, this is very nice, but to be upstreamed for real, we'd want a variant of xxhash that can work with the contents with whitespace squashed to be usable with various whitespace ignoring modes of operation. When that happens, and when the result turns out to be more performant, we can lose the xdl_hash_record() and require only the xxhash, which would be great. And that variant of xxhash that understands whitespace squashing can of course be written in Rust as a part of this effort when the series loses its RFC status. At the same time, those who want to use our xdiff code in third-party software (like libgit2 and vim) may want to reimplement it in C in their copy. Thanks.