Am 24.05.25 um 07:57 schrieb René Scharfe: > Am 23.05.25 um 22:51 schrieb Alex via GitGitGadget: >> From: jinyaoguo <guo846@xxxxxxxxxx> >> >> The loop in xdl_build_script used `i1 >= 0 || i2 >= 0`, causing >> `i1` (or `i2`) to reach 0 and then access `rchg1[i1-1]` (or >> `rchg2[i2-1]`), which underflows the buffer. >> This commit adds explicit `i1 > 0` and `i2 > 0` checks around >> those array accesses to prevent invalid negative indexing. > > xdl_prepare_ctx() in xdiff/xprepare.c allocates an extra entry at both > ends for rchg arrays, so an index of -1 should be within the bounds. > > i1 and i2 are decreased in lockstep, though, so one of them can become > smaller than -1 if nrec is different between the files. And that's how > this code run can indeed run off into the weeds. Actually no, i1 can't seem to reach 0 without i2 also being 0 and vice versa. Or can it? It makes sense that we reach the start of both buffers at the same time if we walk backwards from the end, don't misstep and have consistent rchg array contents, but I'm not sure. Are you able to demonstrate any out-of-bounds access with e.g., Valgrind, AddressSanitizer or an assertion? > Curiously, AddressSanitizer doesn't report anything, but if I add the > following line after the outer for, I can trigger it to report a > heap-buffer-overflow with e.g., git show 8613c2bb6c: > > if (i1 < 0 || i2 < 0) fprintf(stderr, "Oops: %ld %ld\n", i1, i2); That's because I forgot to add braces. D'oh! I can't trigger any out-of-bounds access or that Oops with them properly in place. So I let myself get fooled by a daring coding style. :-| > >> >> Signed-off-by: Alex Guo <alexguo1023@xxxxxxxxx> >> --- >> Fix buffer underflow in xdl_build_script >> >> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1976%2Fmugitya03%2Fbuf-1-v1 >> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1976/mugitya03/buf-1-v1 >> Pull-Request: https://github.com/git/git/pull/1976 >> >> xdiff/xdiffi.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c >> index 5a96e36dfbe..2e983965328 100644 >> --- a/xdiff/xdiffi.c >> +++ b/xdiff/xdiffi.c >> @@ -951,9 +951,10 @@ int xdl_build_script(xdfenv_t *xe, xdchange_t **xscr) { >> * Trivial. Collects "groups" of changes and creates an edit script. Trivial for Davide perhaps (libxdiff author), but not my mushy brain.. >> */ >> for (i1 = xe->xdf1.nrec, i2 = xe->xdf2.nrec; i1 >= 0 || i2 >= 0; i1--, i2--) > > Should the || be a && instead? From a birds-eye view I would assume we > can stop scanning for changes when we exhaust (reach the top) of either > side. We just have to make sure everything from the other side is > accounted for in the last added change. > >> - if (rchg1[i1 - 1] || rchg2[i2 - 1]) { >> - for (l1 = i1; rchg1[i1 - 1]; i1--); >> - for (l2 = i2; rchg2[i2 - 1]; i2--); >> + if ((i1 > 0 && rchg1[i1 - 1]) || >> + (i2 > 0 && rchg2[i2 - 1])) { >> + for (l1 = i1; i1 > 0 && rchg1[i1 - 1]; i1--); >> + for (l2 = i2; i2 > 0 && rchg2[i2 - 1]; i2--); > > Nit: The indentation of that line is off. > >> >> if (!(xch = xdl_add_change(cscr, i1, i2, l1 - i1, l2 - i2))) { >> xdl_free_script(cscr); >> >> base-commit: 8613c2bb6cd16ef530dc5dd74d3b818a1ccbf1c0 > >