On 8/4/25 6:56 PM, Toon Claes wrote:
> The function git_deflate() might not complete to deflate all the input
> data in one go. While the function is already being called in a loop,
> every loop fresh data is read from the stream. This is not correct,
> because input data might get lost.
>
> As we see in many other callsites, git_deflate() should be called in a
> loop on the existing input to make it process all the input data.
>
> Add in a nested loop around git_deflate() to process the input buffer
> completely, before continuing the parent loop that reads from more data
> from the input stream.
>
> Co-authored-by: Justin Tobler <jltobler@xxxxxxxxx>
> Signed-off-by: Toon Claes <toon@xxxxxxxxx>
> ---
>  archive-zip.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/archive-zip.c b/archive-zip.c
> index d41a12de5f..25a0224130 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -471,15 +471,17 @@ static int write_zip_entry(struct archiver_args *args,
>
>  			zstream.next_in = buf;
>  			zstream.avail_in = readlen;
> -			zstream.next_out = compressed;
> -			zstream.avail_out = sizeof(compressed);
> -			result = git_deflate(&zstream, 0);
> -			if (result != Z_OK)
> -				die(_("deflate error (%d)"), result);
> -			out_len = zstream.next_out - compressed;
> +			do {
> +				zstream.next_out = compressed;
> +				zstream.avail_out = sizeof(compressed);
> +				result = git_deflate(&zstream, 0);
> +				if (result != Z_OK)
> +					die(_("deflate error (%d)"), result);
> +				out_len = zstream.next_out - compressed;
>
> -			write_or_die(1, compressed, out_len);
> -			compressed_size += out_len;
> +				write_or_die(1, compressed, out_len);
> +				compressed_size += out_len;
> +			} while (zstream.avail_out == 0);
>  		}
>  		close_istream(stream);
>  		if (readlen)
>

Makes sense. If deflate somehow fills the output buffer (with
internally pending data, I suppose -- the fresh input data alone is not
enough), this clears it and lets it go another round without feeding
new input.

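To make that concrete, here is a rough sketch of the drain pattern with
a toy encoder standing in for git_deflate(). Nothing in it is zlib or
Git API: struct toy_stream, toy_encode() and encode_all() are made up
for illustration, and the encoder merely writes two hex digits per
input byte so that it can overflow a small output buffer and keep a
byte pending internally. With drain == 0 it behaves like the old code
(one call per chunk); with drain != 0 it repeats the call for as long
as the output buffer came back full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for z_stream; only the cursor fields are modelled. */
struct toy_stream {
	const unsigned char *next_in;
	size_t avail_in;
	unsigned char *next_out;
	size_t avail_out;
	int pending;	/* output byte held back internally, or -1 if none */
};

/* Toy encoder: two hex digits per input byte; stops when output is full. */
static void toy_encode(struct toy_stream *s)
{
	static const char hex[] = "0123456789abcdef";

	while (s->avail_out) {
		if (s->pending >= 0) {
			/* flush the byte left over from an earlier round */
			*s->next_out++ = (unsigned char)s->pending;
			s->avail_out--;
			s->pending = -1;
			continue;
		}
		if (!s->avail_in)
			break;
		*s->next_out++ = (unsigned char)hex[*s->next_in >> 4];
		s->avail_out--;
		s->pending = hex[*s->next_in & 15];
		s->next_in++;
		s->avail_in--;
	}
}

enum { SCRATCH = 4 };	/* deliberately tiny output buffer */

/*
 * Push one chunk of input through the encoder and copy whatever comes
 * out to dst. Returns the number of bytes written to dst.
 */
static size_t encode_all(const unsigned char *in, size_t len,
			 unsigned char *dst, int drain)
{
	unsigned char scratch[SCRATCH];
	struct toy_stream s = { in, len, NULL, 0, -1 };
	size_t total = 0;

	assert(dst != NULL);
	do {
		size_t out_len;

		s.next_out = scratch;
		s.avail_out = sizeof(scratch);
		toy_encode(&s);
		out_len = (size_t)(s.next_out - scratch);
		memcpy(dst + total, scratch, out_len);
		total += out_len;
		/* a full buffer means there may be more to collect */
	} while (drain && s.avail_out == 0);
	return total;
}
```

Feeding the three bytes "abc" through encode_all() with drain set
yields all six hex digits; without it, only the first four make it out
and the rest stays stuck in the stream.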
The mistake was thinking that the existence of deflateBound(), which
gives a maximum deflated size for any given input, implies a similarly
tight limit on individual chunks of input, which makes no sense in
hindsight.

René