Re: [PATCH 2/9] commit-graph: stop using signed integers to count bloom filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 04, 2025 at 05:44:06PM -0400, Taylor Blau wrote:
> On Mon, Aug 04, 2025 at 11:34:22AM -0700, Junio C Hamano wrote:
> > Patrick Steinhardt <ps@xxxxxx> writes:
> > Aren't you saying that a platform natural integer is a much better
> > fit?
> >
> > As to signedness, it sometimes is better for a struct member that is
> > used to record the number of "stuff" you have to be a signed integer
> > that is initialized to -1 to signal "we haven't counted so we do not
> > yet know how many there are".  So
> >
> >     These counters naturally count from zero and are only ever
> >     incremented.
> >
> > is not always a valid excuse to insist that such a variable must be
> > unsigned.
> 
> I wrote these counters in 312cff5207 (bloom: split 'get_bloom_filter()'
> in two, 2020-09-16) and 59f0d5073f (bloom: encode out-of-bounds filters
> as non-empty, 2020-09-17), and I don't see a compelling reason that
> these should be unsigned.

I think that is going backwards though: the question to ask is why
should these be signed if they cannot ever be negative?

> It's true that we don't have any need for negative values here since we
> are counting from zero, but I don't think that alone justifies changing
> the signed-ness here.
> 
> Is there a reason beyond "these are always non-negative" that changing
> the signed-ness is warranted? If so, let's discuss that and make sure
> that it is documented in the commit message. If not, I think we could
> drop this patch (and optionally the patch before it as well).

Yes, there is: it makes code easier to reason about for the reader, and
it means that we don't have to guard ourselves against negative values.

If I see a counting variable that is signed I immediately jump to the
question of whether or not it can ever be a negative value. I assume
that the author of this code _intentfully_ made it signed to cater to a
specific edge case.

Whether or not we should be using `size_t`... I don't mind that one too
much, and I'm fine to use `unsigned` instead. But I really think that we
should do our best and help readers by using the proper type for the
task at hand and not let them figure out whether or not they have to
care about edge cases where the counting value could be negative.

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux