Re: question: what does "garbage" field in "git count-objects -v" represent? Is it broken?

On Thu, 28 Aug 2025 at 19:05, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Daniele Sassoli <danielesassoli@xxxxxxxxx> writes:
>
> > When reading the output of `git count-objects -v` there is a `garbage` field. At
> > first I thought this would highlight objects that are considered "garbage", i.e.
> > could be garbage collected. However, I kept noticing that this wasn't the case:
> > despite my repository having plenty of dangling objects (that were removed once
> > I ran `git gc --prune=now`), garbage kept being 0.
>
> count-objects is about quick housekeeping stats and does not (and
> should never) analyze reachability like fsck does, which is required
> to tell which objects are dangling.

Totally agree with this.
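
For anyone following along, here is a minimal sketch of that distinction. It assumes a POSIX shell and a throwaway repository created with mktemp; the variable names are only illustrative. It writes one unreferenced blob, then compares what fsck and count-objects say about it.

```shell
#!/bin/sh
# Sketch: a dangling object is not "garbage" as far as count-objects is concerned.
set -e

repo=$(mktemp -d)            # throwaway repository (assumption: mktemp is available)
cd "$repo"
git init -q .

# Write a blob that nothing references: a valid loose object that is unreachable.
blob=$(echo "unreferenced content" | git hash-object -w --stdin)

# fsck analyzes reachability, so it reports the blob as dangling.
dangling=$(git fsck 2>/dev/null | grep "^dangling blob $blob" || true)

# count-objects sees a valid loose object and counts it; it is not garbage.
count=$(git count-objects -v | grep '^count:')
garbage=$(git count-objects -v | grep '^garbage:')

echo "$dangling"
echo "$count"
echo "$garbage"
```

In a fresh repository I would expect the last two lines to read "count: 1" and "garbage: 0", with fsck alone flagging the blob.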

>
> > I then turned to reading the docs, which state:
> > garbage: the number of files in the object database that are neither
> > valid loose objects nor valid packs
> >
> > I don't think I've ever seen a definition of an invalid object?
> > I tried adding random chars to an object, effectively corrupting
> > the repository (which `git fsck` correctly picked up), but
> > count-objects kept returning 0 at the garbage field.
>
> count-objects is about quick housekeeping stats and does not (and
> should never) analyze object contents like fsck does, which is
> required to tell which objects are corrupt.
>
> > The only way I've been able to get count-objects to report some garbage is by
> > creating files in the packs directory (or in any of the sub-directories of
> > `objects` folder) with random names, like "test", or sometimes I've seen it
> > report the existence of lock files or even preserved files.
> >
> > So my question is, am I fundamentally misunderstanding what garbage means, are
> > the docs simply unclear or is the functionality not working as expected?
> >
> > Thanks for taking the time to read this and respond.
> > Dani
>
> Your understanding is fundamentally correct.  The command tells
> you it found garbage when you do this:
>
>     $ mkdir -p .git/objects/00 && >.git/objects/00/tmp-garbage
>     $ git count-objects -v
>     warning: garbage found: .git/objects/00/tmp-garbage
>
Do you both agree that the term garbage is somewhat misleading? I've spoken
about this both on Discord[1] and in person at the recent Git Mini Summit, and
both times people expected to see garbage-collectable objects being reported by
this field (which, as Junio says, wouldn't be correct, but that's what people
think of).
Other users also seem to be confused by this as shown by [2].

At the very least, I believe the documentation could do with some clarification
and maybe we should even look at changing the wording of the field.
I wanted to triple check my understanding was correct before submitting a patch.
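
To make that understanding concrete, here is a small sketch of the behaviour described in this thread. It assumes a POSIX shell and a scratch repository; the file name tmp-garbage is taken from Junio's example, and everything else is illustrative. It checks the `garbage` line before and after dropping a stray file into a fan-out directory of the object database.

```shell
#!/bin/sh
# Sketch: "garbage" counts stray files in the object database, not collectable objects.
set -e

repo=$(mktemp -d)            # scratch repository (assumption: mktemp is available)
cd "$repo"
git init -q .

# Baseline: a fresh repository has no stray files under .git/objects.
before=$(git count-objects -v 2>/dev/null | grep '^garbage:')

# Create a file whose name is not a valid loose-object name.
mkdir -p .git/objects/00
: > .git/objects/00/tmp-garbage

# The warning goes to stderr; the statistics go to stdout.
warning=$(git count-objects -v 2>&1 >/dev/null)
after=$(git count-objects -v 2>/dev/null | grep '^garbage:')

echo "$before"
echo "$warning"
echo "$after"
```

The field should go from 0 to 1 purely because of the file name, without any object being inspected for reachability or corruption.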

Thanks for your feedback.

[1] https://discord.com/channels/1042895022950994071/1156706741875130499/1408738703156973640
[2] https://stackoverflow.com/questions/30999879/git-garbage-size-out-of-control-need-understanding
