Justin Tobler <jltobler@xxxxxxxxx> writes: > On 25/08/03 11:52AM, René Scharfe wrote: >> On 8/3/25 12:08 AM, Justin Tobler wrote: >> > I've tested the patch against this specific file, and it fixes the >> > issue, but I'm uncertain how to reproduce and test this issue more >> > generically. I'm open to suggestions if anyone has some ideas :) >> >> Not sure how to fill up zlib's pending buffer most efficiently. >> Reducing the size of the output buffer would make the bug easier to >> trigger, though. Yes, in my local testing I also have been reducing the size of the output buffer. > Ya, I was able to trigger this issue more frequently by making the > output buffer smaller than the input buffer. I was really hoping though > to find a way to reproduce this without code changes so we could add a > test. Not sure if that is really feasible though in this case. I think it's really nice you were able find a repository that reproduces this issue. Having an existing example in the "wild" is better proof to verify this fix work. >> > I've so far only been able to reproduce this issue in the chromium.git >> > repository with a specific file: >> > >> > git clone --depth=1 https://github.com/chromium/chromium.git >> > cd chromium >> > git -c core.bigFileThreshold=1 archive -o foo.zip --format=zip HEAD -- \ >> > ui/events/ozone/evdev/touch_filter/palm_model/onedevice_train_palm_detection_filter_inference.cc Weird, for me this file didn't trigger the error, but the following file did: chrome/test/data/third_party/kraken/tests/kraken-1.1/imaging-darkroom-data.js But also with this file, I can confirm this fix works. >> > diff --git a/archive-zip.c b/archive-zip.c >> > index df8866d5bae..29e7c9f5e3f 100644 >> > --- a/archive-zip.c >> > +++ b/archive-zip.c >> > @@ -492,14 +492,22 @@ static int write_zip_entry(struct archiver_args *args, >> > >> > zstream.next_in = buf; >> > zstream.avail_in = 0; >> > - result = git_deflate(&zstream, Z_FINISH); >> > - if (result != Z_STREAM_END) >> > - die("deflate error (%d)", result); >> > + >> > + do { >> > + result = git_deflate(&zstream, Z_FINISH); >> > + if (result != Z_OK && result != Z_STREAM_END) >> > + die("deflate error (%d)", result); >> > + >> > + out_len = zstream.next_out - compressed; >> > + if (out_len > 0) { >> > + write_or_die(1, compressed, out_len); >> > + compressed_size += out_len; >> > + zstream.next_out = compressed; >> > + zstream.avail_out = sizeof(compressed); >> > + } >> > + } while (result != Z_STREAM_END); >> > >> > git_deflate_end(&zstream); >> > - out_len = zstream.next_out - compressed; >> > - write_or_die(1, compressed, out_len); >> > - compressed_size += out_len; >> >> Looks good. Could probably rolled into the first loop, but that just >> would make this fix more complicated. > > I was also considering rolling this into the first loop, but ultimately > went with the minimal patch to fix the issue. I don't mind rerolling if > we prefer it the other way though. :) When I was researching, I have been working on a version that rolls the changes into the first loop, but that diff is a lot more substantial. The thing is, you cannot seem to avoid introducing another (nested) loop. My refactoring is based on the zlib example page[1] page and example source code[2], and also there two nested loops are used. Ideally, we'd modify our code so it follows the example code more closely, but I'm worried this might introduce new breakage, so it's probably not worth it. On the other hand, I was able to trigger failure when I made the output buffer equally large as the input buffer. In this case git-archive completes successfully, but when inflated, the content is mangled. It seems my version is protected against that. I've submitted my version too[3]. I'm on the fence which solution is better, so I'd love to get input from the both of you. So I'm happy to hear any thoughts. [1]: https://zlib.net/zlib_how.html [2]: https://zlib.net/zpipe.c [3]: https://lore.kernel.org/git/20250804-toon-archive-zip-fix-v1-0-ca89858e5eaa@xxxxxxxxx/ -- Cheers, Toon