On 9/4/2025 7:12 PM, Junio C Hamano wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > >> commit graphs are currently stored on the object database level. This >> doesn't really make much sense conceptually, given that commit graphs >> are specific to one object source. Furthermore, with the upcoming >> pluggable object database effort, an object source's backend may not >> evene have a commit graph in the first place but store that information >> in a different format altogether. >> >> This patch series prepares for that by moving the commit graph from >> `struct object_database` into `struct odb_source`. > > Hmph, I am finding the above hard to agree with at the conceptual > level. In some future, we may use multiple object stores in a > single repository. Perhaps we would be storing older parts of > history in semi-online storage while newer parts are stored in > readily available storage. But the side data structure that allows > us to quickly learn who are parents of one commit is without having > to go to the object store in order to parse the actualy commit > object can be stored for the entire history if we wanted to, or more > recent part of the history but not limited to the "readily available > storage" part. IOW, where the boundary between the older and the > newer parts of the history lies and which commits the commit graph > covers should be pretty much independent. > > So moving from object_database (i.e. the whole world) to individual > odb_source (i.e. where one particular subset of the history is > stored) feels like totally backwards to me. Surely, a commit graph > file may be defined over a set of packfiles and remaining loose > object files, but it is not like an instance of the commit-graph > file is tied to packfiles in the sense that it uses the index into > some packfile instead of the actual object names to refer to > commits, or anything like that (this is quite different from other > files that are very specific to a single object store, like midx > that is tied to the packfiles it describes). This is an interesting aspect to things, where the commit-graph file is a "structured cache" of certain commit information. It happens to be located within the object stores (either local or in an alternate) but is conceptually different in a few ways. The biggest difference is that you can only open one commit-graph (or chain of commit-graphs). Having multiple files across different object stores will not accumulate additional context. Instead, we have a "first one wins" approach. This does seem to be something that you are attempting to change by including the ability to load a commit-graph for each odb (and closing them in sequence as we close a repo). So in this sense, the commit-graph lives at the repository level, not an object store level. When doing I/O to write or read a graph, we use a specific object store at a time. The other direction to consider is what context we have when we interact with a commit-graph. We generally are parsing commits from a repository or loading Bloom filter data during file history walks. Each of these do not have a predictable nature of which object store will "own" the commit we are inspecting, so it wouldn't make sense to restrict things like odb_parse_commit() over repo_parse_commit(). With these thoughts in mind, I have these big-picture thoughts: 1. Patches 1-5 are great. Nice cleanups. 2. Some of Patch 6 is great, including having the I/O methods use an odb_source to help focus the specific location of the files being read or written. However, the movement of the struct into the odb_source makes less sense and should still exist at the object_database level. Thanks, -Stolee