Re: [PATCH] bulk-checkin: fix sign compare warnings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> On Fri, Mar 21, 2025 at 05:08:06PM -0400, Karthik Nayak wrote:
>
>> > @@ -192,7 +188,7 @@ static int stream_blob_to_pack(struct bulk_checkin_packfile *state,
>> >  			offset += rsize;
>> >  			if (*already_hashed_to < offset) {
>> >  				size_t hsize = offset - *already_hashed_to;
>> > -				if (rsize < hsize)
>> > +				if ((size_t)rsize < hsize)
>>
>> Something I found peculiar here is that `rsize` is of type ssize_t'.
>> But it only seems to store a positive value.
>
> I assumed it was ssize_t because it would hold the result of a read
> call. But it doesn't! We put that into the "read_result" variable.
>
> So it could just be a size_t in the first place. And indeed it is better
> as one, because we assign from "size", which is itself a size_t. We do
> not yet warn about type mismatches outside of comparisons, but really it
> is equally bad.

Nice, thanks for exploring this thought out more. I did look at the
code, but was more cursory.

> However, if you switch it, then we get a different -Wsign-compare
> problem: we compare "rsize" and "read_result". So you still have to
> cast, but at a different spot.
>

True. But this would be better in my regards, since this would directly
follow the

  if (read_result < 0)
     die_errno("failed to read from '%s'", path);

code, so a `if ((size_t)read_result != rsize)` here makes logical sense
since we can clearly see that this section is only reached when
`read_result` has a positive value.

> If we are doing this a lot (and really this conversion is necessary any
> time you look at the outcome of a read call), I do still wonder if we
> should have a helper like:
>
> static inline int safe_scast(ssize_t ret, size_t *out)
> {
> 	if (ret < 0)
> 		return 0;
> 	/* cast is safe because of check above */
> 	*out = (size_t)ret;
> 	return 1;
> }
>
> (yes, I know the name is lousy). That would allow something like this:
>
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index f6f79cb9e2..fbffc7c8d6 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -178,9 +178,10 @@ static int stream_blob_to_pack(struct bulk_checkin_packfile *state,
>
>  	while (status != Z_STREAM_END) {
>  		if (size && !s.avail_in) {
> -			ssize_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf);
> -			ssize_t read_result = read_in_full(fd, ibuf, rsize);
> -			if (read_result < 0)
> +			size_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf);
> +			size_t read_result;
> +
> +			if (!safe_scast(read_in_full(fd, ibuf, rsize), &read_result))
>  				die_errno("failed to read from '%s'", path);
>  			if (read_result != rsize)
>  				die("failed to read %d bytes from '%s'",
>

This does look nice, but I'm worried something like `safe_scast` would
just not be used througout the codebase, causing inconsistencies. But I
think we can drive that through reviews.

> Though it does kind of obscure the call to read_in_full(). You can use
> two variables, like:
>
>   ssize_t read_result;
>   size_t bytes_read;
>
>   read_result = read_in_full(fd, ibuf, rsize);
>   if (!safe_scast(read_result, &bytes_read))
> 	die_errno(...);
>
> which is a bit more verbose but perhaps clearer.

Yeah this is much better too.

> This reminded me a bit of the issues we had with write_in_full() before,
> where:
>
>   if (write_in_full(fd, buf, len) < len)
>
> behaves unexpectedly because of integer conversions. There the solution
> was to never check against "len", because write_in_full() either writes
> everything or returns an error. So:
>
>   if (write_in_full(fd, buf, len) < 0)
>
> is correct and sufficient.
>
> But alas, we can't do the same here, because reading returns three
> cases: error, a full read, or a partial read (maybe even EOF!). So we
> really do need to record and compare the return value between what we
> asked for and what we got.
>

This is to some extent a flaw in the way errors are generally
structured where the error indication (-1 here) and a potential result
(bytes read) are combined into a single return.

It is unfortunate indeed.

> -Peff

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux