[PATCH v6 00/17] object-store: carve out the object database subsystem

Hi,

this patch series refactors the object store subsystem to become more
self-contained by getting rid of its dependency on `the_repository`.
Instead of passing in the repository explicitly, we start to pass in the
object store itself. This is in contrast to many other refactorings we
did, but in line with what we did for the ref store.
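
To illustrate the direction, here is a minimal sketch of the pattern,
not code taken from the series. The struct layouts, the `primary_path`
field and both function names are hypothetical and heavily simplified;
only the idea of replacing the implicit global with an explicit object
database parameter is what the series is about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins; not Git's real definitions. */
struct object_database {
	const char *primary_path; /* hypothetical field */
};

struct repository {
	struct object_database *objects;
};

/* Stand-in for the process-wide global that the series moves away from. */
static struct repository *the_repository;

/* Before: implicitly operates on whatever the global points at. */
static const char *primary_path_implicit(void)
{
	return the_repository->objects->primary_path;
}

/* After: the caller states which object database is meant. */
static const char *primary_path_explicit(const struct object_database *odb)
{
	assert(odb != NULL);
	return odb->primary_path;
}
```

With the explicit variant a caller can work with two object databases in
the same process, which the implicit variant cannot express without
swapping the global around.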

This series also starts to properly scope functions to the carved out
object database subsystem, which requires a bit of shuffling. This
allows us to have a short-and-sweet `odb_` prefix for functions and
prepares us for a future with pluggable object backends.

The series is structured as follows:

  - Patches 1 to 3 rename `struct object_store` and `struct
    object_directory` as well as the code files.

  - Patches 4 to 12 refactor "odb.c" to get rid of `the_repository`.

  - Patches 13 to 17 adjust the names of the remaining functions so that
    they can be clearly attributed to the ODB. I'm happy to kick these
    patches out of this series and resend them at a later point in case
    they create too much turmoil.
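
The parent pointers introduced in patch 4 (visible in the range-diff
below as `alternate->odb = odb`) can be pictured roughly as follows.
This is a sketch under assumptions: the `next` and `sources` members and
the `add_source()` helper are hypothetical, and the real structures
carry more state than shown here.

```c
#include <assert.h>
#include <stdlib.h>

struct object_database; /* forward declaration */

/* Hypothetical, simplified source; the real struct has more members. */
struct odb_source {
	struct object_database *odb; /* parent pointer back to the owner */
	const char *path;
	struct odb_source *next;     /* hypothetical list linkage */
};

struct object_database {
	struct odb_source *sources;  /* hypothetical singly linked list */
};

/* Hypothetical helper: allocate a source and link it to its database. */
static struct odb_source *add_source(struct object_database *odb,
				     const char *path)
{
	struct odb_source *source;

	assert(odb != NULL);
	source = calloc(1, sizeof(*source));
	if (!source)
		return NULL;
	source->odb = odb;           /* lets the source find its owner */
	source->path = path;
	source->next = odb->sources; /* prepend to the list */
	odb->sources = source;
	return source;
}
```

With the back pointer in place, a function that is handed only a source
can still reach the owning object database without consulting any
global state.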

This series is built on top of 6f84262c44a (The eleventh batch,
2025-05-05) with ps/object-store-cleanup at 8a9e27be821 (object-store:
drop `repo_has_object_file()`, 2025-04-29) merged into it. There are a
couple of trivial conflicts when merged with "seen"; I have appended the
merge conflict resolution as a patch at the end of this mail.

Changes in v2:
  - Fix for a copy-and-pasted commit message.
  - Rename `struct odb_backend` to `struct odb_alternate`. I'm happy to
    revert to the previous name if we ultimately think it's the better
    suited one.
  - A couple of fixes to move changes into the correct commit. `git
    rebase -x 'meson compile -C build'` is now clean.
  - I _didn't_ back out the rename to "odb.{c,h}". Junio has already
    fixed the fallout, so it's probably more work for him to kick it out
    again than to just leave it in.
  - Link to v1: https://lore.kernel.org/r/20250506-pks-object-store-wo-the-repository-v1-0-c05b82e7b126@xxxxxx

Changes in v3:
  - Polishing for some comments and commit messages.
  - Link to v2: https://lore.kernel.org/r/20250509-pks-object-store-wo-the-repository-v2-0-103f59bf8e28@xxxxxx

Changes in v4:
  - Rebased the patch series on top of 7014b55638d (A bit more topics
    for -rc1, 2025-05-30). This fixes a couple of merge conflicts, most
    importantly with jk/no-funny-object-types.
  - Rename `struct odb_alternate` to `odb_source`.
  - Link to v3: https://lore.kernel.org/r/20250514-pks-object-store-wo-the-repository-v3-0-47df1d4ead22@xxxxxx

Changes in v5:
  - Some polishing to fix leftover terminology from previous rounds.
  - Link to v4: https://lore.kernel.org/r/20250602-pks-object-store-wo-the-repository-v4-0-e986804a7c62@xxxxxx

Changes in v6:
  - Fix a mis-merged comment.
  - A couple of commit message improvements.
  - Link to v5: https://lore.kernel.org/r/20250605-pks-object-store-wo-the-repository-v5-0-779d1c28774b@xxxxxx

Thanks!

Patrick

---
Patrick Steinhardt (17):
      object-store: rename `raw_object_store` to `object_database`
      object-store: rename `object_directory` to `odb_source`
      object-store: rename files to "odb.{c,h}"
      odb: introduce parent pointers
      odb: get rid of `the_repository` in `find_odb()`
      odb: get rid of `the_repository` in `assert_oid_type()`
      odb: get rid of `the_repository` in `odb_mkstemp()`
      odb: get rid of `the_repository` when handling alternates
      odb: get rid of `the_repository`  in `for_each()` functions
      odb: get rid of `the_repository` when handling the primary source
      odb: get rid of `the_repository` when handling submodule sources
      odb: trivial refactorings to get rid of `the_repository`
      odb: rename `oid_object_info()`
      odb: rename `repo_read_object_file()`
      odb: rename `has_object()`
      odb: rename `pretend_object_file()`
      odb: rename `read_object_with_reference()`

 Documentation/user-manual.adoc          |   4 +-
 Makefile                                |   2 +-
 apply.c                                 |  14 +-
 archive-tar.c                           |   2 +-
 archive-zip.c                           |   2 +-
 archive.c                               |   6 +-
 attr.c                                  |   4 +-
 bisect.c                                |   8 +-
 blame.c                                 |  22 +-
 builtin/backfill.c                      |   6 +-
 builtin/blame.c                         |   6 +-
 builtin/cat-file.c                      |  62 ++---
 builtin/checkout.c                      |   2 +-
 builtin/clone.c                         |  14 +-
 builtin/commit-graph.c                  |  20 +-
 builtin/commit-tree.c                   |   4 +-
 builtin/count-objects.c                 |   6 +-
 builtin/describe.c                      |   5 +-
 builtin/difftool.c                      |   4 +-
 builtin/fast-export.c                   |  10 +-
 builtin/fast-import.c                   |  49 ++--
 builtin/fetch.c                         |  21 +-
 builtin/fsck.c                          |  31 ++-
 builtin/gc.c                            |  16 +-
 builtin/grep.c                          |  26 +-
 builtin/hash-object.c                   |   2 +-
 builtin/index-pack.c                    |  29 +-
 builtin/log.c                           |   4 +-
 builtin/ls-files.c                      |   4 +-
 builtin/ls-tree.c                       |   6 +-
 builtin/merge-file.c                    |   2 +-
 builtin/merge-tree.c                    |  14 +-
 builtin/mktag.c                         |   6 +-
 builtin/mktree.c                        |  10 +-
 builtin/multi-pack-index.c              |   6 +-
 builtin/notes.c                         |   8 +-
 builtin/pack-objects.c                  |  70 ++---
 builtin/pack-redundant.c                |   2 +-
 builtin/prune.c                         |   6 +-
 builtin/receive-pack.c                  |   9 +-
 builtin/remote.c                        |   6 +-
 builtin/repack.c                        |   7 +-
 builtin/replace.c                       |  12 +-
 builtin/rev-list.c                      |   8 +-
 builtin/show-ref.c                      |   6 +-
 builtin/submodule--helper.c             |  11 +-
 builtin/tag.c                           |  10 +-
 builtin/unpack-file.c                   |   4 +-
 builtin/unpack-objects.c                |  12 +-
 bulk-checkin.c                          |   6 +-
 bundle-uri.c                            |   5 +-
 bundle.c                                |   6 +-
 cache-tree.c                            |  17 +-
 combine-diff.c                          |   4 +-
 commit-graph.c                          | 106 +++----
 commit-graph.h                          |  20 +-
 commit.c                                |  15 +-
 config.c                                |   4 +-
 connected.c                             |   2 +-
 contrib/coccinelle/the_repository.cocci |   2 +-
 diagnose.c                              |  12 +-
 diff.c                                  |  20 +-
 dir.c                                   |   2 +-
 entry.c                                 |   6 +-
 fetch-pack.c                            |  17 +-
 fmt-merge-msg.c                         |   6 +-
 fsck.c                                  |   4 +-
 grep.c                                  |   6 +-
 http-backend.c                          |   2 +-
 http-push.c                             |  20 +-
 http-walker.c                           |  12 +-
 http.c                                  |   6 +-
 list-objects-filter.c                   |   4 +-
 list-objects.c                          |   6 +-
 log-tree.c                              |   2 +-
 loose.c                                 |  46 ++--
 mailmap.c                               |   4 +-
 match-trees.c                           |   6 +-
 merge-blobs.c                           |  10 +-
 merge-ort.c                             |   8 +-
 meson.build                             |   2 +-
 midx-write.c                            |   2 +-
 midx.c                                  |   6 +-
 notes-cache.c                           |   4 +-
 notes-merge.c                           |   4 +-
 notes.c                                 |  19 +-
 object-file.c                           |  94 +++----
 object-file.h                           |  12 +-
 object-name.c                           |  24 +-
 object-store.h                          | 338 -----------------------
 object.c                                |   8 +-
 object-store.c => odb.c                 | 413 +++++++++++++++-------------
 odb.h                                   | 473 ++++++++++++++++++++++++++++++++
 oss-fuzz/fuzz-pack-idx.c                |   2 +-
 pack-bitmap-write.c                     |   9 +-
 pack-bitmap.c                           |  10 +-
 pack-check.c                            |   2 +-
 pack-mtimes.c                           |   2 +-
 pack-objects.h                          |   2 +-
 pack-revindex.c                         |   2 +-
 pack-write.c                            |  10 +-
 packfile.c                              |  29 +-
 packfile.h                              |   8 +-
 path.c                                  |   4 +-
 promisor-remote.c                       |   6 +-
 protocol-caps.c                         |   4 +-
 reachable.c                             |   2 +-
 read-cache.c                            |  14 +-
 ref-filter.c                            |   6 +-
 reflog.c                                |   8 +-
 refs.c                                  |   7 +-
 remote.c                                |   9 +-
 replace-object.c                        |   2 +-
 replace-object.h                        |   2 +-
 repository.c                            |  21 +-
 repository.h                            |   4 +-
 rerere.c                                |   7 +-
 revision.c                              |   5 +-
 send-pack.c                             |   4 +-
 sequencer.c                             |   7 +-
 server-info.c                           |   2 +-
 shallow.c                               |  14 +-
 streaming.c                             |  10 +-
 submodule-config.c                      |   9 +-
 submodule.c                             |  32 +--
 submodule.h                             |   9 -
 t/helper/test-find-pack.c               |   2 +-
 t/helper/test-pack-mtimes.c             |   2 +-
 t/helper/test-partial-clone.c           |   4 +-
 t/helper/test-read-graph.c              |   8 +-
 t/helper/test-read-midx.c               |   2 +-
 t/helper/test-ref-store.c               |   4 +-
 tag.c                                   |  10 +-
 tmp-objdir.c                            |  30 +-
 tree-walk.c                             |  18 +-
 tree.c                                  |   6 +-
 unpack-trees.c                          |   2 +-
 upload-pack.c                           |   4 +-
 walker.c                                |   6 +-
 xdiff-interface.c                       |   4 +-
 140 files changed, 1453 insertions(+), 1298 deletions(-)

Range-diff versus v5:

 1:  9df738c135b =  1:  55efa04c9b5 object-store: rename `raw_object_store` to `object_database`
 2:  85ee1dd80f0 =  2:  9e259ec9129 object-store: rename `object_directory` to `odb_source`
 3:  8a9e759fcfa =  3:  4bef9e8ca2e object-store: rename files to "odb.{c,h}"
 4:  872828f8061 !  4:  4a82e103b22 odb: introduce parent pointers
    @@ odb.c: static int link_alt_odb_entry(struct repository *r, const struct strbuf *
      		goto error;
      
      	CALLOC_ARRAY(alternate, 1);
    --	/* pathbuf.buf is already in r->objects->source_by_path */
     +	alternate->odb = odb;
    -+	/* pathbuf.buf is already in r->objects->alternate_by_path */
    + 	/* pathbuf.buf is already in r->objects->source_by_path */
      	alternate->path = strbuf_detach(&pathbuf, NULL);
      
      	/* add the alternate entry */
 5:  bf292f80e6a =  5:  d1096993665 odb: get rid of `the_repository` in `find_odb()`
 6:  03f57d8efbc =  6:  8bd70f6e303 odb: get rid of `the_repository` in `assert_oid_type()`
 7:  2aafcbaf706 =  7:  97cd748c462 odb: get rid of `the_repository` in `odb_mkstemp()`
 8:  9a9eaa9fe0f !  8:  bfc550d81e6 odb: get rid of `the_repository` when handling alternates
    @@ Commit message
         odb: get rid of `the_repository` when handling alternates
     
         The functions to manage alternates all depend on `the_repository`.
    -    Refactor them to accept an object database as parameter and adjusting
    -    all callers. The functions are renamed accordingly.
    +    Refactor them to accept an object database as a parameter and adjust all
    +    callers. The functions are renamed accordingly.
     
         Note that right now the situation is still somewhat weird because we end
    -    up using the path provided by the object store's repository anyway. This
    -    will be adapted over time though so that we instead store the path to
    -    the primary object directory in the object database itself.
    +    up using the object store path provided by the object store's repository
    +    anyway. Consequently, we could have instead passed in a pointer to the
    +    repository instead of passing in the pointer to the object store. This
    +    will be addressed in subsequent commits though, where we will start to
    +    use the path owned by the object store itself.
     
         Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
     
 9:  1618716a75f =  9:  34649c4cbe1 odb: get rid of `the_repository`  in `for_each()` functions
10:  9c282be2a37 = 10:  5954680f7be odb: get rid of `the_repository` when handling the primary source
11:  eb31130c720 = 11:  25b07546210 odb: get rid of `the_repository` when handling submodule sources
12:  a5d6a5fb8a1 = 12:  945c95ba26c odb: trivial refactorings to get rid of `the_repository`
13:  61e3cb25aa2 = 13:  624c80b44cb odb: rename `oid_object_info()`
14:  1ab82f81ff5 = 14:  366c2733c69 odb: rename `repo_read_object_file()`
15:  427eb9893b9 = 15:  cf287279010 odb: rename `has_object()`
16:  bdf62e5cf47 ! 16:  42c14c70181 odb: rename `pretend_object_file()`
    @@ Commit message
         functions related to the object database and our modern coding
         guidelines.
     
    -    No compatibility wrapper is introduces as the function is not used a lot
    +    No compatibility wrapper is introduced as the function is not used a lot
         throughout our codebase.
     
         Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
17:  550d4a75562 ! 17:  ad0b56350b0 odb: rename `read_object_with_reference()`
    @@ Commit message
         been found. This is generally referred to as "peeling", so the new name
         should be way more descriptive.
     
    -    No compatibility wrapper is introduces as the function is not used a lot
    +    No compatibility wrapper is introduced as the function is not used a lot
         throughout our codebase.
     
         Signed-off-by: Patrick Steinhardt <ps@xxxxxx>

---
base-commit: 7014b55638da979331baf8dc31c4e1d697cf2d67
change-id: 20250505-pks-object-store-wo-the-repository-9c6cbdf8d4b1




