[PATCH v2 0/4] Integrate the sparse index with 'git apply' and interactive add, checkout, and reset

The sparse index helps make some Git commands faster when using
sparse-checkout in cone mode. However, not all code paths are aware that the
index can have non-blob entries, so we are careful about rolling this
feature out gradually. The cost of this rollout is that some commands are
slower with the sparse index as they need to expand a sparse index into a
full index in memory, which requires parsing tree objects to construct the
full path list.

This patch series focuses on 'git add -p' and related patch-mode commands,
which are slow with the sparse index for three reasons, handled in the
first three patches:

 1. 'git add -p' uses 'git apply' as a subcommand and 'git apply' needs
    integration with the sparse index. Luckily, we just need to add the repo
    setting and appropriate tests to confirm it behaves as expected.
 2. The interactive modes of 'git add' ('-p' and '-i') leave cmd_add()
    before the code that sets the repo setting to allow for a sparse index.
    Patch 2 fixes this and adds appropriate tests to confirm the behavior in
    a sparse-checkout.
 3. The interactive mode of 'git reset' leaves cmd_reset() before the code
    that sets the repo setting to allow for the sparse index.

The fourth patch adds performance tests to p2000-sparse-operations.sh to
confirm that we are getting the performance improvement we expect:

  Test                                      BASE  PATCH 1      PATCH 2      PATCH 3
  -------------------------------------------------------------------------------------
  2000.118: ... git add -p (full-v3)        0.79  0.79  +0.0%  0.82  +3.8%  0.82  +3.8%
  2000.119: ... git add -p (full-v4)        0.74  0.76  +2.7%  0.74  +0.0%  0.76  +2.7%
  2000.120: ... git add -p (sparse-v3)      1.94  1.28 -34.0%  0.07 -96.4%  0.07 -96.4%
  2000.121: ... git add -p (sparse-v4)      1.93  1.28 -33.7%  0.06 -96.9%  0.06 -96.9%
  2000.122: ... git checkout -p (full-v3)   1.18  1.18  +0.0%  1.18  +0.0%  1.19  +0.8%
  2000.123: ... git checkout -p (full-v4)   1.10  1.12  +1.8%  1.11  +0.9%  1.11  +0.9%
  2000.124: ... git checkout -p (sparse-v3) 1.31  0.11 -91.6%  0.11 -91.6%  0.11 -91.6%
  2000.125: ... git checkout -p (sparse-v4) 1.29  0.11 -91.5%  0.11 -91.5%  0.11 -91.5%
  2000.126: ... git reset -p (full-v3)      0.81  0.80  -1.2%  0.83  +2.5%  0.83  +2.5%
  2000.127: ... git reset -p (full-v4)      0.78  0.77  -1.3%  0.77  -1.3%  0.78  +0.0%
  2000.128: ... git reset -p (sparse-v3)    1.58  0.92 -41.8%  0.91 -42.4%  0.07 -95.6%
  2000.129: ... git reset -p (sparse-v4)    1.58  0.92 -41.8%  0.92 -41.8%  0.07 -95.6%
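
The percentage columns are relative changes against BASE. A small sketch
for recomputing them from the timings in the table follows; pct_change is a
helper name invented for this example. Because the timings are rounded to
two decimals, a recomputed value could in principle differ from the
reported one in the last digit.

```shell
#!/bin/sh
# Print the relative change from a baseline time to a new time,
# formatted like the table above (sign, one decimal, percent sign).
pct_change () {
	awk -v base="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

pct_change 1.94 1.28   # git add -p (sparse-v3), BASE to PATCH 1: -34.0%
pct_change 1.94 0.07   # git add -p (sparse-v3), BASE to PATCH 2: -96.4%
pct_change 1.58 0.07   # git reset -p (sparse-v3), BASE to PATCH 3: -95.6%
```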



Updates in v2
=============

Thanks to the careful review from Elijah and the pointer from Phillip, we
have these changes:

 1. The tests no longer have different expansion behaviors for 'git add -p'
    and 'git add -i' due to partially-expanded indexes on disk.
 2. We now test 'git checkout -p' and 'git reset -p'.
 3. 'git reset -p' needed some changes to the builtin (similar to 'git add')
    to be fast.

Thanks, -Stolee

Derrick Stolee (4):
  apply: integrate with the sparse index
  git add: make -p/-i aware of sparse index
  reset: integrate sparse index with --patch
  p2000: add performance test for patch-mode commands

 builtin/add.c                            |   7 +-
 builtin/apply.c                          |   7 +-
 builtin/reset.c                          |   6 +-
 t/perf/p2000-sparse-operations.sh        |   3 +
 t/t1092-sparse-checkout-compatibility.sh | 151 +++++++++++++++++++++++
 5 files changed, 167 insertions(+), 7 deletions(-)


base-commit: 6c0bd1fc70efaf053abe4e57c976afdc72d15377
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1914%2Fderrickstolee%2Fapply-sparse-index-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1914/derrickstolee/apply-sparse-index-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1914

Range-diff vs v1:

 1:  0e6e199cd19 ! 1:  1adf81ecb2c apply: integrate with the sparse index
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +
      +	# Expands when using --index.
      +	ensure_expanded apply --index ../patch-outside &&
     ++
     ++	# Does not when index is partially expanded.
     ++	git -C sparse-index reset --hard &&
     ++	ensure_not_expanded apply --cached ../patch-outside &&
     ++
     ++	# Try again with a reset and collapsed index.
      +	git -C sparse-index reset --hard &&
     ++	git -C sparse-index sparse-checkout reapply &&
      +
     -+	# Does not expand when using --cached.
     -+	ensure_not_expanded apply --cached ../patch-outside
     ++	# Expands when index is collapsed.
     ++	ensure_expanded apply --cached ../patch-outside
      +'
      +
       test_expect_success 'advice.sparseIndexExpanded' '
 2:  63caae87634 ! 2:  0a2752721d0 git add: make -p/-i aware of sparse index
     @@ Commit message
      
          It turns out that control flows out of cmd_add() in the interactive
          cases before the lines that confirm that the builtin is integrated with
     -    the sparse index. We need to move that earlier to ensure it prevents a
     -    full index expansion on read.
     +    the sparse index.
      
     -    Add more test cases that confirm that these interactive add options work
     -    with the sparse index. One interesting aspect here is that the '-i'
     -    option avoids expanding the sparse index when a sparse directory exists
     -    on disk while the '-p' option does hit the ensure_full_index() method.
     -    This leaves some room for improvement, but this case should be atypical
     -    as users should remain within their sparse-checkout.
     +    Moving that integration point earlier in cmd_add() allows 'git add -p'
     +    and 'git add -i' to operate without expanding a sparse index to a full
     +    one.
     +
     +    Add test cases that confirm that these interactive add options work with
     +    the sparse index.
      
          Signed-off-by: Derrick Stolee <stolee@xxxxxxxxx>
      
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'add, commit, chec
       	init_repos &&
       
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded: git apply' '
     - 	ensure_not_expanded apply --cached ../patch-outside
     + 	ensure_expanded apply --cached ../patch-outside
       '
       
      +test_expect_success 'sparse-index is not expanded: git add -p' '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +	git -C sparse-index reset &&
      +	ensure_not_expanded add -i <in &&
      +
     ++	# -p does expand when edits are outside sparse checkout.
      +	mkdir -p sparse-index/folder1 &&
      +	echo "new content" >sparse-index/folder1/a &&
     -+
     -+	# -p does expand when edits are outside sparse checkout.
      +	test_write_lines y n y >in &&
      +	ensure_expanded add -p <in &&
      +
     -+	# but -i does not expand.
     -+	git -C sparse-index reset &&
     ++	# Fully reset the index.
     ++	git -C sparse-index reset --hard &&
     ++	git -C sparse-index sparse-checkout reapply &&
     ++
     ++	# -i does expand when edits are outside sparse checkout.
     ++	mkdir -p sparse-index/folder1 &&
     ++	echo "new content" >sparse-index/folder1/a &&
      +	test_write_lines u 2 3 "" q >in &&
     -+	ensure_not_expanded add -i <in
     ++	ensure_expanded add -i <in
      +'
      +
       test_expect_success 'advice.sparseIndexExpanded' '
 -:  ----------- > 3:  d1482a29d8f reset: integrate sparse index with --patch
 3:  7a777281626 ! 4:  a50c57f7628 p2000: add performance test for 'git add -p'
     @@ Metadata
      Author: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
      
       ## Commit message ##
     -    p2000: add performance test for 'git add -p'
     +    p2000: add performance test for patch-mode commands
      
     -    The previous two changes contributed performance improvements to 'git
     -    apply' and 'git add -p' when using a sparse index. Add a performance
     -    test to demonstrate this (and to help validate that performance remains
     -    good in the future).
     +    The previous three changes contributed performance improvements to 'git
     +    apply', 'git add -p', and 'git reset -p' when using a sparse index. The
     +    improvement to 'git apply' also improved 'git checkout -p'. Add
     +    performance tests to demonstrate this (and to help validate that
     +    performance remains good in the future).
      
          In the truncated test output below, we see that the full checkout
          performance changes within noise expectations, but the sparse index
     -    cases improve 33% and then 96%.
     -
     -                          HEAD~3     HEAD~2         HEAD~1
     -    ---------------------------------------------------------
     -    2000.118: (full-v3)     0.80   0.84 +5.0%     0.84  +5.0%
     -    2000.119: (full-v4)     0.76   0.79 +3.9%     0.80  +5.3%
     -    2000.120: (sparse-v3)   2.09   1.39 -33.5%    0.07 -96.7%
     -    2000.121: (sparse-v4)   2.09   1.39 -33.5%    0.07 -96.7%
     +    cases improve 33% and then 96% for 'git add -p' and 41% and then 95% for
     +    'git reset -p'. 'git checkout -p' improves immediately by 91% because it
     +    does not need any change to its builtin.
     +
     +      Test                                    HEAD~4  HEAD~3       HEAD~2       HEAD~1
     +      -------------------------------------------------------------------------------------
     +      2000.118: ... git add -p (full-v3)        0.79  0.79  +0.0%  0.82  +3.8%  0.82  +3.8%
     +      2000.119: ... git add -p (full-v4)        0.74  0.76  +2.7%  0.74  +0.0%  0.76  +2.7%
     +      2000.120: ... git add -p (sparse-v3)      1.94  1.28 -34.0%  0.07 -96.4%  0.07 -96.4%
     +      2000.121: ... git add -p (sparse-v4)      1.93  1.28 -33.7%  0.06 -96.9%  0.06 -96.9%
     +      2000.122: ... git checkout -p (full-v3)   1.18  1.18  +0.0%  1.18  +0.0%  1.19  +0.8%
     +      2000.123: ... git checkout -p (full-v4)   1.10  1.12  +1.8%  1.11  +0.9%  1.11  +0.9%
     +      2000.124: ... git checkout -p (sparse-v3) 1.31  0.11 -91.6%  0.11 -91.6%  0.11 -91.6%
     +      2000.125: ... git checkout -p (sparse-v4) 1.29  0.11 -91.5%  0.11 -91.5%  0.11 -91.5%
     +      2000.126: ... git reset -p (full-v3)      0.81  0.80  -1.2%  0.83  +2.5%  0.83  +2.5%
     +      2000.127: ... git reset -p (full-v4)      0.78  0.77  -1.3%  0.77  -1.3%  0.78  +0.0%
     +      2000.128: ... git reset -p (sparse-v3)    1.58  0.92 -41.8%  0.91 -42.4%  0.07 -95.6%
     +      2000.129: ... git reset -p (sparse-v4)    1.58  0.92 -41.8%  0.92 -41.8%  0.07 -95.6%
      
          It is worth noting that if our test was more involved and had multiple
          hunks to evaluate, then the time spent in 'git apply' would dominate due
     @@ t/perf/p2000-sparse-operations.sh: test_perf_on_all git diff-tree HEAD
       test_perf_on_all "git worktree add ../temp && git worktree remove ../temp"
       test_perf_on_all git check-attr -a -- $SPARSE_CONE/a
      +test_perf_on_all 'echo >>a && test_write_lines y | git add -p'
     ++test_perf_on_all 'test_write_lines y y y | git checkout --patch -'
     ++test_perf_on_all 'echo >>a && git add a && test_write_lines y | git reset --patch'
       
       test_done

-- 
gitgitgadget



