On Tue, May 06, 2025 at 09:40:16PM -0400, Derrick Stolee wrote: > On 5/6/25 7:09 AM, Patrick Steinhardt wrote: > > > this patch series refactors the object store subsystem to become more > > self-contained by getting rid of `the_repository`. Instead of passing in > > the repository explicitly, we start to pass in the object store itself, > > which is in contrast to many other refactorings we did, but in line with > > what we did for the ref store, as well. > > > > This series also starts to properly scope functions to the carved out > > object database subsystem, which requires a bit of shuffling. This > > allows us to have a short-and-sweet `odb_` prefix for functions and > > prepares us for a future with pluggable object backends. > > > > The series is structured as follows: > > > > - Patches 1 to 3 rename `struct object_store` and `struct > > object_directory` as well as the code files. > > Patches 1 and 2 involve renaming some core structures, and I had > some questions around these names (since we hope to be stuck with > the new names for a long time). I was thinking out loud on a per- > patch basis, but now want to collect my thoughts around these: > > * raw_object_store currently describes the abstraction that contains > all objects that can be accessed within the repository. This may > include multiple alternates. Patch 1 renames this to > 'object_database'. > > * object_directory currently describes a single directory that > has the same structure as $GIT_DIR/objects/ but may be an alternate > or a submodule object directory. Patch 2 renames this to > 'odb_backend'. > > My concerns around this are basically around not liking "backend" for > this purpose. When I think of a backend, I'm thinking about the > implementation details (like the refs backend being files or reftable) > and not multiple distinct locations that have their own objects. That is very much the intent eventually. Right now the backend is always the one that uses loose objects and packfiles. But eventually, the goal is to introduce different backends. But regardless of that, ... > In this sense, I'm partial to being brief for the most-common structure > that will be passed around and then more descriptive about the smaller > pieces: > > * 'struct raw_object_store' could be renamed to 'struct odb' to match > its use in all of the new odb_*() methods. This represents the > "object database abstraction" and consumers don't care if this is > one or many object directories in a trench coat. NB: I think having a long name here is nicer, even if it's abbreviated in the functions. But that's mostly my own preference, I don't care too much. I'll keep this as-is in the next iteration, but if you feel strongly I'm certainly happy to rename it to `struct odb`. Just give me a ping and I'll do so. > * 'struct object_directory' could be renamed to 'struct odb_shard' or > 'struct odb_slice' or similar. I may even recommend 'odb_partition' > though that does imply some disjointness that is not guaranteed (an > object can exist in multiple parts). > > * In the event that we create multiple implementations for storing > objects, then a 'struct odb_shard' could point to a backend to help > find the appropriate methods for interacting with its storage. ... I have decided to rename this to `odb_alternate`. I don't think "shard" works well, as shard is an extremely generic term that doesn't really convey much meaning. On the other hand, I think that `odb_alternate` is quite a good fit. We already use it all over the place to mean almost exactly what we are after here. And it doesn't seem far-fetched to have an `odb_packed_alternate`, `odb_loose_alternate` and `odb_redis_alternate` for different backends. The only stretch is that the primary object directory is now the primary alternate. I think that this is acceptable though' > * "alternate refs" are locked in as names based on the following > config key names: > > - core.alternateRefsCommand > - core.alternateRefsPrefix > > These user-facing names should not change. This may be valuable to > make sure that the 'odb_shard's still have a state of "I'm an > alternate" or "I'm the base read/write shard for this repo". Agreed. > > - Patches 4 to 12 refactor "odb.c" to get rid of `the_repository`. > > These are carefully done. Thanks. I only have a few nitpicks here > and there. > > > - Patches 13 to 17 adjust the name of remaining functions so that they > > can be clearly attributed to the ODB. I'm happy to kick these > > patches out of this series and resend them at a later point in case > > they create too much turmoil. > > I like that these are present, especially because you included > compatibility macros for in-flight topics. > > I do mention that the rename of the object-store.[c|h] files may be > unnecessary, or perhaps could be delayed until this series is merged > and the collateral is calmed. > > --- > > This was clearly a lot of work to put together. Thanks for working > hard to thoughtfully rename things while refactoring to reduce our > dependence on global state. Well. Frankly, the hard work is only just starting. Next step: push down `struct packed_git` from `struct object_database` to `odb_alternate`. I'm scared what I'll find there. Patrick