On 25/07/07 08:01AM, Patrick Steinhardt wrote:
> On Fri, Jul 04, 2025 at 06:40:11PM -0300, Lucas Seiki Oshiro wrote:
> > > Would it make sense to maybe have such whole-repo commands
> > > grouped together in a `git repo` top-level command? E.g. `git repo info`
> > > for your command, `git repo size` to gather information about the repo
> > > size.
> >
> > It seems to be very nice for me! In fact, this being a home also for
> > statistics is something I considered while writing the first versions of
> > my GSoC proposal.
> >
> > And what about merging the two codes into a single API? Something like:
> >
> > ```
> > git repo-info layout.bare references.format survey.commit-count
> > {
> >    "layout": {
> >       "bare": true
> >    },
> >    "references": {
> >       "format": "files"
> >    },
> >    "survey": {
> >       "commit-count": 42
> >    }
> > }
> >
> > ?
>
> We could in theory do that. But there's two things we need to be
> cautious about:
>
>   1. We should be mindful about what specifically this tool is about. It
>      shouldn't become the next tool that does way too many different
>      things.
>
>   2. One of the idea of git-survey(1) is to eventually replace
>      git-sizer(1). This will require very specific presentation formats
>      that aren't really compatible with any of the other information.
>
> Out of these two I think the second item is the more important one why
> git-survey(1) should exist as a standalone tool, either as a top-level
> command or as a subcommand.

As Patrick mentioned, the focus for git-survey(1) is to be an eventual
substitute for git-sizer(1).
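
To make the dotted-key idea from Lucas's example above a bit more
concrete, here is a rough Python sketch of how `category.field` keys
could be folded into the nested JSON shape he shows. The helper name
`nest_dotted` is made up for illustration, and this is not how
git-repo-info(1) is implemented; it only shows that the flat and nested
forms carry the same information.

```python
import json


def nest_dotted(flat):
    """Fold dotted keys such as "layout.bare" into nested dicts.

    Assumes no key is both a value and a category (e.g. "a" and "a.b").
    """
    root = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = root
        for part in parents:
            # Create intermediate categories on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Values taken from the quoted example.
flat = {
    "layout.bare": True,
    "references.format": "files",
    "survey.commit-count": 42,
}
print(json.dumps(nest_dotted(flat), indent=2))
```

Going the other way (nested to flat) is just as mechanical, so the same
data could back both a JSON and a key/value output.
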
For the initial implementation I was imagining a simple plaintext format
that outputs key/value pairs and looks something like the following
example:

  references.branches.count=15
  references.tags.count=2
  references.remotes.count=5
  references.others.count=1
  objects.commits.count=50
  objects.commits.total_size=1234567
  objects.commits.max_size.oid=1817dc08b8ea00fce4cd1fb6bc75713ad00a74d3
  objects.commits.max_size.size=1234
  objects.commits.max_parents.oid=1817dc08b8ea00fce4cd1fb6bc75713ad00a74d3
  objects.commits.max_parents.count=8
  objects.trees.count=100
  objects.trees.total_size=12345
  objects.trees.total_tree_entries=999
  objects.trees.max_tree_entries.oid=1817dc08b8ea00fce4cd1fb6bc75713ad00a74d3
  objects.trees.max_tree_entries.count=99
  objects.blobs.count=142
  objects.blobs.total_size=99999999
  objects.blobs.max_size.oid=1817dc08b8ea00fce4cd1fb6bc75713ad00a74d3
  objects.blobs.max_size.size=999999
  objects.tags.count=1
  repo.max_depth=999
  <etc...>

The command will also need to eventually support other output formats,
namely a more human-friendly table format that provides something similar
to git-sizer(1).

As laid out above, this looks like it could also work well with the
git-repo-info(1) JSON format. This makes me wonder if we should add this
functionality as a separate flag for git-repo-info(1). Maybe something
like `--stats` that appends the info to the output. If we want a clearer
distinction though, we could implement this as a separate subcommand.

For a more human-readable format, maybe we could still implement a
standalone git-survey(1) that is more of a porcelain command and uses
git-repo-info(1) under the hood. I think the other information such as
reference format and object format may be useful to provide in
git-survey(1) output.

> > During our meetings, Karthik suggested (I'm planning to it later) to also
> > allow to request an entire category instead of only the fields.
> > Then, this would also be possible:
> >
> > ```
> > $ git repo-info survey
> > {
> >    "survey": {
> >       "commit-count": 42,
> >       "blob-count": 1234
> >    }
> > ```
>
> It raises another question though: if we ever were to add `--all` we'll
> need to step a bit careful about what kind of information we add to this
> tool. All of the information proposed so far can be computed rather
> trivially. But computing repository sizes has way higher computational
> complexity and may easily take seconds, maybe even minutes in large
> repositories.
>
> That to me further points into the direction of giving those two tools a
> common top-level command (`git repo info`, `git repo survey`), but to
> not mix concerns too much with one another.

Getting the info for git-survey(1) is certainly more computationally
complex so there should be a way to run the command without performing
the more expensive checks if the user doesn't want them. At the same
time, I think it may be nice to have a way for a user to request a dump
of "interesting" repository info via a single command.

> > But I don't know what are Justin's plans for git-survey, if it would be a
> > porcelain command for showing those stats to the user of if it is targeted
> > for being parsed like this `repo-info`.

I think the intent for git-survey was to provide a more porcelain command
to display interesting repository stats to the user, but also provide an
option to print in a machine-parsable format. I like the idea of
computing everything as part of git-repo-info though. This could allow a
standalone git-survey to focus on just being a human-friendly porcelain
command. For scripted use-cases, users could then just use git-repo-info.

-Justin