Jeff King <peff@xxxxxxxx> writes: > On Fri, Mar 21, 2025 at 05:08:06PM -0400, Karthik Nayak wrote: > >> > @@ -192,7 +188,7 @@ static int stream_blob_to_pack(struct bulk_checkin_packfile *state, >> > offset += rsize; >> > if (*already_hashed_to < offset) { >> > size_t hsize = offset - *already_hashed_to; >> > - if (rsize < hsize) >> > + if ((size_t)rsize < hsize) >> >> Something I found peculiar here is that `rsize` is of type ssize_t'. >> But it only seems to store a positive value. > > I assumed it was ssize_t because it would hold the result of a read > call. But it doesn't! We put that into the "read_result" variable. > > So it could just be a size_t in the first place. And indeed it is better > as one, because we assign from "size", which is itself a size_t. We do > not yet warn about type mismatches outside of comparisons, but really it > is equally bad. Nice, thanks for exploring this thought out more. I did look at the code, but was more cursory. > However, if you switch it, then we get a different -Wsign-compare > problem: we compare "rsize" and "read_result". So you still have to > cast, but at a different spot. > True. But this would be better in my regards, since this would directly follow the if (read_result < 0) die_errno("failed to read from '%s'", path); code, so a `if ((size_t)read_result != rsize)` here makes logical sense since we can clearly see that this section is only reached when `read_result` has a positive value. > If we are doing this a lot (and really this conversion is necessary any > time you look at the outcome of a read call), I do still wonder if we > should have a helper like: > > static inline int safe_scast(ssize_t ret, size_t *out) > { > if (ret < 0) > return 0; > /* cast is safe because of check above */ > *out = (size_t)ret; > return 1; > } > > (yes, I know the name is lousy). That would allow something like this: > > diff --git a/bulk-checkin.c b/bulk-checkin.c > index f6f79cb9e2..fbffc7c8d6 100644 > --- a/bulk-checkin.c > +++ b/bulk-checkin.c > @@ -178,9 +178,10 @@ static int stream_blob_to_pack(struct bulk_checkin_packfile *state, > > while (status != Z_STREAM_END) { > if (size && !s.avail_in) { > - ssize_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf); > - ssize_t read_result = read_in_full(fd, ibuf, rsize); > - if (read_result < 0) > + size_t rsize = size < sizeof(ibuf) ? size : sizeof(ibuf); > + size_t read_result; > + > + if (!safe_scast(read_in_full(fd, ibuf, rsize), &read_result)) > die_errno("failed to read from '%s'", path); > if (read_result != rsize) > die("failed to read %d bytes from '%s'", > This does look nice, but I'm worried something like `safe_scast` would just not be used througout the codebase, causing inconsistencies. But I think we can drive that through reviews. > Though it does kind of obscure the call to read_in_full(). You can use > two variables, like: > > ssize_t read_result; > size_t bytes_read; > > read_result = read_in_full(fd, ibuf, rsize); > if (!safe_scast(read_result, &bytes_read)) > die_errno(...); > > which is a bit more verbose but perhaps clearer. Yeah this is much better too. > This reminded me a bit of the issues we had with write_in_full() before, > where: > > if (write_in_full(fd, buf, len) < len) > > behaves unexpectedly because of integer conversions. There the solution > was to never check against "len", because write_in_full() either writes > everything or returns an error. So: > > if (write_in_full(fd, buf, len) < 0) > > is correct and sufficient. > > But alas, we can't do the same here, because reading returns three > cases: error, a full read, or a partial read (maybe even EOF!). So we > really do need to record and compare the return value between what we > asked for and what we got. > This is to some extent a flaw in the way errors are generally structured where the error indication (-1 here) and a potential result (bytes read) are combined into a single return. It is unfortunate indeed. > -Peff
Attachment:
signature.asc
Description: PGP signature