Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt <ps@xxxxxx> writes:

> On Mon, Aug 04, 2025 at 11:13:28AM +0200, Oswald Buddenhagen wrote:
>> On Mon, Aug 04, 2025 at 10:17:18AM +0200, Patrick Steinhardt wrote:
>> > When writing a new commit graph we have a couple of counters that
>> > provide statistics around what kind of bloom filters we have or have not
>> > written. These counters naturally count from zero and are only ever
>> > incremented, but they use a signed integer as type regardless.
>> > 
>> > Refactor those fields to be of type `size_t` instead.
>> > 
>> mind elaborating on that choice?
>
> We tend to use `size_t` when counting stuff.

And I would have to say that it is wrong and we need to wean
ourselves from such a superstition.  Unless you are measuring how
big a memory block you ask from the allocator, the platform natural
integer is often the right type to do the counting.

Each of your "stuff" may weigh N megabytes in core, and if you have
M of them, you may have to ask (N*2**20)*M bytes of memory from the
allocator.  Your (N*2**20)*M must fit size_t _and_ you must compute
it without overflowing or wrapping around.

None of the above mean you have to express N in size_t, though.
And more importantly, nobody gives you any extra guarantee that you
would compute the result correctly if you used size_t.  You can write
the right code with platform natural integer, and you have to take
the same care (e.g. by using st_mult()) to catch integer overflows
even if you used size_t.

> ... Regarding the data size I
> don't really think that matters much. It's not like we have hundreds of
> thousands of commit graphs in-memory at any point in time.

Aren't you saying that a platform natural integer is a much better
fit?

As to signedness, it sometimes is better for a struct member that is
used to record the number of "stuff" you have to be a signed integer
that is initialized to -1 to signal "we haven't counted so we do not
yet know how many there are".  So

    These counters naturally count from zero and are only ever
    incremented.

is not always a valid excuse to insist that such a variable must be
unsigned.

In short, not all but much of the recent "use size_t" topics are
misguided, and -Wsign-compare is usually a wrong thing to rely on.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux