Re: [PATCH] archive: flush deflate stream until Z_STREAM_END

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/3/25 12:08 AM, Justin Tobler wrote:
> In `archive-zip.c:write_zip_entry()` when using a stream as input for
> deflating a file, the call to `git_deflate()` with Z_FINISH always
> expects Z_STREAM_END to be returned. Per zlib documentation[1]:
> 
>         If the parameter flush is set to Z_FINISH, pending input is
>         processed, pending output is flushed and deflate returns with
>         Z_STREAM_END if there was enough output space. If deflate
>         returns with Z_OK or Z_BUF_ERROR, this function must be called
>         again with Z_FINISH and more output space (updated avail_out)
>         but no more input data, until it returns with Z_STREAM_END or an
>         error. After deflate has returned Z_STREAM_END, the only
>         possible operations on the stream are deflateReset or
>         deflateEnd.
> 
> In scenarios where the output buffer is not large enough to write all
> the compressed data, it is perfectly valid for the underlying
> `deflate()` to return Z_OK. Thus, expecting a single pass of `deflate()`
> here to always return Z_STREAM_END is a bug. Update the code to flush
> the deflate stream until Z_STREAM_END is returned.
> 
> [1]: https://zlib.net/manual.html

Good find.  I guess back then I thought making the output buffer twice
as big as the input buffer was sufficient, as deflateBound() guarantees
compression is possible with a much lower ratio.  But this doesn't take
the internal state of a stream into account.  Oof!

> Helped-by: Toon Claes <toon@xxxxxxxxx>
> Signed-off-by: Justin Tobler <jltobler@xxxxxxxxx>
> ---
> 
> Greetings,
> 
> At GitLab, we received a report of a user getting the following error
> when generating a zip archive of their repository via git-archive(1):
> 
>         fatal: deflate error (0)
> 
> I've so far only been able to reproduce this issue in the chromium.git
> repository with a specific file:
> 
>         git clone --depth=1 https://github.com/chromium/chromium.git
>         cd chromium
>         git -c core.bigFileThreshold=1 archive -o foo.zip --format=zip HEAD -- \
>                 ui/events/ozone/evdev/touch_filter/palm_model/onedevice_train_palm_detection_filter_inference.cc
> 
> In the above example, `core.bigFileThreshold` is set to a low value to
> cause more files to use a stream as input while being deflated. This is
> the codepath that produces the specific error.
> 
> I've tested the patch against this specific file, and it fixes the
> issue, but I'm uncertain how to reproduce and test this issue more
> generically. I'm open to suggestions if anyone has some ideas :)

Not sure how to fill up zlib's pending buffer most efficiently.
Reducing the size of the output buffer would make the bug easier to
trigger, though.

> Thanks,
> -Justin
> 
> ---
>  archive-zip.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index df8866d5bae..29e7c9f5e3f 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -492,14 +492,22 @@ static int write_zip_entry(struct archiver_args *args,
>  
>  		zstream.next_in = buf;
>  		zstream.avail_in = 0;
> -		result = git_deflate(&zstream, Z_FINISH);
> -		if (result != Z_STREAM_END)
> -			die("deflate error (%d)", result);
> +
> +		do {
> +			result = git_deflate(&zstream, Z_FINISH);
> +			if (result != Z_OK && result != Z_STREAM_END)
> +				die("deflate error (%d)", result);
> +
> +			out_len = zstream.next_out - compressed;
> +			if (out_len > 0) {
> +				write_or_die(1, compressed, out_len);
> +				compressed_size += out_len;
> +				zstream.next_out = compressed;
> +				zstream.avail_out = sizeof(compressed);
> +			}
> +		} while (result != Z_STREAM_END);
>  
>  		git_deflate_end(&zstream);
> -		out_len = zstream.next_out - compressed;
> -		write_or_die(1, compressed, out_len);
> -		compressed_size += out_len;

Looks good.  Could probably rolled into the first loop, but that just
would make this fix more complicated.

>  		zip_offset += compressed_size;
>  
>  		write_zip_data_desc(size, compressed_size, crc);






[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux