On 5/6/25 7:09 AM, Patrick Steinhardt wrote:
this patch series refactors the object store subsystem to become more self-contained by getting rid of `the_repository`. Instead of passing in the repository explicitly, we start to pass in the object store itself, which is in contrast to many other refactorings we did, but in line with what we did for the ref store, as well. This series also starts to properly scope functions to the carved out object database subsystem, which requires a bit of shuffling. This allows us to have a short-and-sweet `odb_` prefix for functions and prepares us for a future with pluggable object backends. The series is structured as follows: - Patches 1 to 3 rename `struct object_store` and `struct object_directory` as well as the code files.
Patches 1 and 2 involve renaming some core structures, and I had some questions around these names (since we hope to be stuck with the new names for a long time). I was thinking out loud on a per- patch basis, but now want to collect my thoughts around these: * raw_object_store currently describes the abstraction that contains all objects that can be accessed within the repository. This may include multiple alternates. Patch 1 renames this to 'object_database'. * object_directory currently describes a single directory that has the same structure as $GIT_DIR/objects/ but may be an alternate or a submodule object directory. Patch 2 renames this to 'odb_backend'. My concerns around this are basically around not liking "backend" for this purpose. When I think of a backend, I'm thinking about the implementation details (like the refs backend being files or reftable) and not multiple distinct locations that have their own objects. In this sense, I'm partial to being brief for the most-common structure that will be passed around and then more descriptive about the smaller pieces: * 'struct raw_object_store' could be renamed to 'struct odb' to match its use in all of the new odb_*() methods. This represents the "object database abstraction" and consumers don't care if this is one or many object directories in a trench coat. * 'struct object_directory' could be renamed to 'struct odb_shard' or 'struct odb_slice' or similar. I may even recommend 'odb_partition' though that does imply some disjointness that is not guaranteed (an object can exist in multiple parts). * In the event that we create multiple implementations for storing objects, then a 'struct odb_shard' could point to a backend to help find the appropriate methods for interacting with its storage. * "alternate refs" are locked in as names based on the following config key names: - core.alternateRefsCommand - core.alternateRefsPrefix These user-facing names should not change. This may be valuable to make sure that the 'odb_shard's still have a state of "I'm an alternate" or "I'm the base read/write shard for this repo".
- Patches 4 to 12 refactor "odb.c" to get rid of `the_repository`.
These are carefully done. Thanks. I only have a few nitpicks here and there.
- Patches 13 to 17 adjust the name of remaining functions so that they can be clearly attributed to the ODB. I'm happy to kick these patches out of this series and resend them at a later point in case they create too much turmoil.
I like that these are present, especially because you included compatibility macros for in-flight topics. I do mention that the rename of the object-store.[c|h] files may be unnecessary, or perhaps could be delayed until this series is merged and the collateral is calmed. --- This was clearly a lot of work to put together. Thanks for working hard to thoughtfully rename things while refactoring to reduce our dependence on global state. Thanks, -Stolee