[PATCH v2 16/17] xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag

From: Ezekiel Newren <ezekielnewren@xxxxxxxxx>

Currently the whitespace iterator is slower than git's C implementation,
so we skip it when no whitespace flags are set. Special-case
--ignore-cr-at-eol in the same way to make it performant. The remaining
whitespace flags will be slower for now, but as more of xdiff is
translated into Rust it will be easier to revisit and optimize
whitespace processing. Optimizing the other whitespace flags now would
be difficult because:

  * xxHash uses chunk-based processing.
  * The same iterator is used for hashing and equality, so it can either
    return large chunks, which makes hashing fast, or return one byte at
    a time, which makes equality testing fast. I opted for faster
    hashing. The data structures in C need to be cleaned up before
    they're interoperable with Rust. Once that's done I believe a faster
    method of whitespace processing will be possible.
  * Heavily optimizing code that spans two languages that aren't easily
    interoperable in their current state forces a choice between fast
    code and maintainable code. Once enough of xdiff is written in Rust
    I believe a method that is both fast and maintainable can be
    implemented.

Signed-off-by: Ezekiel Newren <ezekielnewren@xxxxxxxxx>
---
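
Note (not part of the commit): a minimal standalone sketch of the
semantics the fast paths are meant to preserve, assuming that
--ignore-cr-at-eol only drops a CR that directly precedes the trailing
LF. The names chunks_ignoring_cr_at_eol and equal_ignoring_cr_at_eol
are hypothetical and do not exist in xutils.rs; the sketch favors
clarity over speed and is not the code in this patch.

```rust
/// Hypothetical helper: the chunks a line is split into when
/// "ignore CR at EOL" is the only whitespace flag. A CRLF line yields
/// the body and then the LF; any other line yields itself unchanged.
fn chunks_ignoring_cr_at_eol(line: &[u8]) -> Vec<&[u8]> {
    if line.is_empty() {
        return Vec::new();
    }
    if line.ends_with(b"\r\n") {
        // Skip the CR: everything before it, then the final LF.
        vec![&line[..line.len() - 2], &line[line.len() - 1..]]
    } else {
        // No CRLF terminator: a lone trailing CR is kept as-is.
        vec![line]
    }
}

/// Hypothetical helper: two lines are equal if their chunks flatten to
/// the same byte sequence. This is the slow, obviously-correct form
/// that a length-based fast path should agree with.
fn equal_ignoring_cr_at_eol(lhs: &[u8], rhs: &[u8]) -> bool {
    chunks_ignoring_cr_at_eol(lhs)
        .into_iter()
        .flatten()
        .eq(chunks_ignoring_cr_at_eol(rhs).into_iter().flatten())
}
```

Under these assumptions "abc\r\n" and "abc\n" compare equal, while
"abc\r" and "abc" do not, because the CR is not followed by an LF.
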
 rust/xdiff/src/xutils.rs | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/rust/xdiff/src/xutils.rs b/rust/xdiff/src/xutils.rs
index 796a5708b6bf..1ea9cfa02db5 100644
--- a/rust/xdiff/src/xutils.rs
+++ b/rust/xdiff/src/xutils.rs
@@ -33,6 +33,18 @@ impl<'a> Iterator for WhitespaceIter<'a> {
             return None;
         }
 
+        // optimize case where --ignore-cr-at-eol is the only whitespace flag
+        if (self.flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+            if self.index == 0 && self.line.ends_with(b"\r\n") {
+                self.index = self.line.len() - 1;
+                return Some(&self.line[..self.line.len() - 2])
+            } else {
+                let off = self.index;
+                self.index = self.line.len();
+                return Some(&self.line[off..])
+            }
+        }
+
         loop {
             let start = self.index;
             if self.index == self.line.len() {
@@ -172,6 +184,28 @@ pub fn line_equal(lhs: &[u8], rhs: &[u8], flags: u64) -> bool {
         return lhs == rhs;
     }
 
+    // optimize case where --ignore-cr-at-eol is the only whitespace flag
+    if (flags & XDF_WHITESPACE_FLAGS) == XDF_IGNORE_CR_AT_EOL {
+        let a = lhs.ends_with(b"\r\n");
+        let b = rhs.ends_with(b"\r\n");
+
+        if !(a ^ b) {
+            return lhs == rhs;
+        } else {
+            let lm = if a { 1 } else { 0 };
+            let rm = if b { 1 } else { 0 };
+
+            if lhs.len() - lm != rhs.len() - rm {
+                return false;
+            } else if &lhs[..lhs.len() - 1 - lm] != &rhs[..rhs.len() - 1 - rm] {
+                return false;
+            } else if lhs[lhs.len() - 1] != rhs[rhs.len() - 1] {
+                return false;
+            }
+            return true;
+        }
+    }
+
     let lhs_it = WhitespaceIter::new(lhs, flags);
     let rhs_it = WhitespaceIter::new(rhs, flags);
 
-- 
gitgitgadget




