Re: [PATCH 4/4] archive-zip: move git_deflate() with Z_FINISH into the loop

René Scharfe <l.s.r@xxxxxx> · Thu, 7 Aug 2025 21:51:22 +0200

On 8/4/25 6:56 PM, Toon Claes wrote:
> Instead duplicating code to do the final deflate (with `flush` value
> Z_FINISH), bring this call inside the loop that's deflate parts of the
> input stream. This causes also this final deflate to be wrapped in a
> loop to ensure the whole input is taken care of.
> 
> This change makes crc32() to be called without checking if the `readlen`
> is greater than zero, but looking at the zlib manual[1] should be
> allowed.

Reading the manual is surely allowed, that's what it was written for.
Snarkiness aside, s/should/it should/

But if you're referring to the example with a zero length there:

	uLong crc = crc32(0L, Z_NULL, 0);

... then note that it's also giving a Z_NULL pointer, which crc32_z()
(https://github.com/madler/zlib/blob/develop/crc32.c#L585) has special
handling for; the length is ignored in this example.

> 
> This patch concluded some refactoring, making the code more similar to
> the example usage of the official zlib docs[2].
> 
> [1]: https://zlib.net/manual.html
> [2]: https://zlib.net/zlib_how.html
> 
> Co-authored-by: Justin Tobler <jltobler@xxxxxxxxx>
> Signed-off-by: Toon Claes <toon@xxxxxxxxx>
> ---
>  archive-zip.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/archive-zip.c b/archive-zip.c
> index 25a0224130..559ed267be 100644
> --- a/archive-zip.c
> +++ b/archive-zip.c
> @@ -451,7 +451,7 @@ static int write_zip_entry(struct archiver_args *args,
>  		unsigned char buf[STREAM_BUFFER_SIZE];
>  		ssize_t readlen;
>  		git_zstream zstream;
> -		int result;
> +		int result, flush;
>  		size_t out_len;
>  		unsigned char compressed[STREAM_BUFFER_SIZE * 2];
>  
> @@ -459,44 +459,37 @@ static int write_zip_entry(struct archiver_args *args,
>  
>  		compressed_size = 0;
>  
> -		for (;;) {
> +		do {
>  			readlen = read_istream(stream, buf, sizeof(buf));
> -			if (readlen <= 0)
> +			if (readlen < 0)
>  				break;
>  			crc = crc32(crc, buf, readlen);
> -			if (is_binary == -1)
> +			if ((is_binary == -1) && readlen)

It's probably fine to call crc32 with zero length, but since you have the
readlen condition here anyway it would basically be free to extend it to
that call as well.

>  				is_binary = entry_is_binary(args->repo->index,
>  							    path_without_prefix,
>  							    buf, readlen);
>  
> +			flush = readlen ? Z_NO_FLUSH : Z_FINISH;

... and setting Z_FINISH could be moved under there as well to avoid
setting Z_NO_FLUSH over and over.  I'm probably nitpicking here, though.

>  			zstream.next_in = buf;
>  			zstream.avail_in = readlen;
>  			do {
>  				zstream.next_out = compressed;
>  				zstream.avail_out = sizeof(compressed);
> -				result = git_deflate(&zstream, 0);
> -				if (result != Z_OK)
> +				result = git_deflate(&zstream, flush);
> +				if ((result != Z_OK) && (result != Z_STREAM_END))
>  					die(_("deflate error (%d)"), result);
>  				out_len = zstream.next_out - compressed;
>  
>  				write_or_die(1, compressed, out_len);
>  				compressed_size += out_len;
>  			} while (zstream.avail_out == 0);

Hmm, the manual says "After deflate has returned Z_STREAM_END, the only
possible operations on the stream are deflateReset or deflateEnd.".
Here we could have deflate fill the output buffer to the brim, be done
and return Z_STREAM_END and we'd go again, which would result in an
error, I suppose.  So the while loop above needs to end on Z_STREAM_END
as well, no?

René

> -		}
> +		} while (flush != Z_FINISH);
> +
>  		close_istream(stream);
>  		if (readlen)
>  			return readlen;
>  
> -		zstream.next_in = buf;
> -		zstream.avail_in = 0;
> -		result = git_deflate(&zstream, Z_FINISH);
> -		if (result != Z_STREAM_END)
> -			die("deflate error (%d)", result);
> -
>  		git_deflate_end(&zstream);
> -		out_len = zstream.next_out - compressed;
> -		write_or_die(1, compressed, out_len);
> -		compressed_size += out_len;
>  		zip_offset += compressed_size;
>  
>  		write_zip_data_desc(size, compressed_size, crc);
>