From: Ezekiel Newren <ezekielnewren@xxxxxxxxx>

Currently the whitespace iterator is slower than git's C implementation,
so we skip the whitespace iterator if there are no whitespace flags.
Special-case --ignore-cr-at-eol similarly to make it performant.

The rest of the whitespace flags will be slower for now, but as more of
Xdiff is translated into Rust it'll be easier to revisit and optimize
whitespace processing.

Optimizing the other whitespace flags now would be difficult because:
  * xxHash uses chunk-based processing.
  * The same iterator is used for hashing and equality, which means
    the iterator could be optimized to return large chunks for fast
    hashing, or it could return each byte, making equality testing
    faster. I opted for faster hashing. The data structures in C need
    to be cleaned up before they're interoperable with Rust. Once
    that's done I believe a faster method of whitespace processing
    will be possible.
  * Trying to write heavily optimized code across 2 languages that
    aren't easily interoperable in their current state means the code
    can be either fast or easy to maintain, but not both. Once enough
    of Xdiff is written in Rust I believe that a fast and maintainable
    method can be implemented.

Signed-off-by: Ezekiel Newren <ezekielnewren@xxxxxxxxx>
---
 rust/xdiff/src/xutils.rs | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/rust/xdiff/src/xutils.rs b/rust/xdiff/src/xutils.rs
index 796a5708b6bf..1ea9cfa02db5 100644
--- a/rust/xdiff/src/xutils.rs
+++ b/rust/xdiff/src/xutils.rs
@@ -33,6 +33,18 @@ impl<'a> Iterator for WhitespaceIter<'a> {
             return None;
         }
 
+        // optimize case where --ignore-cr-at-eol is the only whitespace flag
+        if (self.flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+            if self.index == 0 && self.line.ends_with(b"\r\n") {
+                self.index = self.line.len() - 1;
+                return Some(&self.line[..self.line.len() - 2])
+            } else {
+                let off = self.index;
+                self.index = self.line.len();
+                return Some(&self.line[off..])
+            }
+        }
+
         loop {
             let start = self.index;
             if self.index == self.line.len() {
@@ -172,6 +184,28 @@ pub fn line_equal(lhs: &[u8], rhs: &[u8], flags: u64) -> bool {
         return lhs == rhs;
     }
 
+    // optimize case where --ignore-cr-at-eol is the only whitespace flag
+    if (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+        let a = lhs.ends_with(b"\r\n");
+        let b = rhs.ends_with(b"\r\n");
+
+        if !(a ^ b) {
+            return lhs == rhs;
+        } else {
+            let lm = if a { 1 } else { 0 };
+            let rm = if b { 1 } else { 0 };
+
+            if lhs.len() - lm != rhs.len() - rm {
+                return false;
+            } else if &lhs[..lhs.len() - 1 - lm] != &rhs[..rhs.len() - 1 - rm] {
+                return false;
+            } else if lhs[lhs.len() - 1] != rhs[rhs.len() - 1] {
+                return false;
+            }
+            return true;
+        }
+    }
+
     let lhs_it = WhitespaceIter::new(lhs, flags);
     let rhs_it = WhitespaceIter::new(rhs, flags);
 
-- 
gitgitgadget
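
For anyone who wants to try the --ignore-cr-at-eol rule outside the tree, here is a minimal standalone sketch of the two fast paths in this patch. The names chunks_ignoring_cr_at_eol and equal_ignoring_cr_at_eol are hypothetical and do not exist in xutils.rs. The flag check is left out, the chunks are collected into a Vec instead of being yielded by an iterator, and strip_suffix stands in for the index arithmetic used in the patch, so treat it as an illustration rather than the submitted code.

```rust
/// Hypothetical stand-in for the iterator fast path: the chunks a line
/// yields when a CR directly before the final LF is ignored.
fn chunks_ignoring_cr_at_eol(line: &[u8]) -> Vec<&[u8]> {
    if line.is_empty() {
        return Vec::new();
    }
    match line.strip_suffix(b"\r\n") {
        // "body\r\n" yields "body" and then the final "\n", skipping the CR.
        Some(body) => vec![body, &line[line.len() - 1..]],
        // Any other line is returned whole, as one large chunk.
        None => vec![line],
    }
}

/// Hypothetical stand-in for the equality fast path: compare two lines
/// while treating a trailing "\r\n" the same as a trailing "\n".
fn equal_ignoring_cr_at_eol(lhs: &[u8], rhs: &[u8]) -> bool {
    match (lhs.strip_suffix(b"\r\n"), rhs.strip_suffix(b"\r\n")) {
        // Exactly one side ends in CRLF: the other side must be the same
        // body followed by a bare LF.
        (Some(body), None) => rhs.strip_suffix(b"\n") == Some(body),
        (None, Some(body)) => lhs.strip_suffix(b"\n") == Some(body),
        // Both or neither end in CRLF: plain byte equality is enough.
        _ => lhs == rhs,
    }
}
```

With these definitions, b"abc\r\n" and b"abc\n" compare equal, while b"abc\r" and b"abc" do not, because a CR that is not followed by LF is left alone. Concatenating the chunks of a line gives the bytes that the hash and the equality check agree on.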