Re: [PATCH 1/8] midx: start tracking per object database source

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25/07/09 09:54AM, Patrick Steinhardt wrote:
> Multi-pack indices are tracked via `struct multi_pack_index`. This data
> structure is stored as a linked list inside `struct object_database`,
> which is the global database that spans across all of the object
> sources.
> 
> This layout causes two problems:
> 
>   - Multi-pack indices aren't global to an object database, but instead
>     there can be one multi-pack index per source. This creates a
>     mismatch between the on-disk layout and how things are organized in
>     the object database subsystems and makes some parts, like figuring
>     out whether a source has an MIDX, quite awkward.

In a future where there is a separate implementation for each object
source, each multi-pack index is really only relevant for a particular
source. Some source implementations will also not require a midx to
begin with. For organization purposes, tracking a single midx per source
for now seems much more ideal.

>   - Multi-pack indices are an implementation detail of how efficient
>     access for packfiles work. As such, they are neither relevant in the
>     context of loose objects, nor in a potential future where we have
>     pluggable backends.

Agreed.

> Refactor `prepare_multi_pack_index_one()` so that it works on a specific
> source, which allows us to easily store a pointer to the multi-pack
> index inside of it. For now, this pointer exists next to the existing
> linked list we have in the object database. Users will be adjusted in
> subsequent patches to instead use the per-source pointers.

Looking at `prepare_multi_pack_index_one()`, this function is
responsible for checking if a multi-pack index is already loaded in the
linked list and, if not, loading/inserting it into the list. So now we
will want it to handle setting up the object source midx as well.

> 
> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  midx.c     | 19 +++++++++++--------
>  midx.h     |  7 ++++---
>  odb.h      | 11 +++++++++--
>  packfile.c |  4 +++-
>  4 files changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/midx.c b/midx.c
> index 3c5bc821730..a91231bfcdf 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -724,28 +724,29 @@ int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id)
>  	return 0;
>  }
>  
> -int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local)
> +int prepare_multi_pack_index_one(struct odb_source *source, int local)
>  {
> +	struct repository *r = source->odb->repo;
>  	struct multi_pack_index *m;
> -	struct multi_pack_index *m_search;
>  
>  	prepare_repo_settings(r);
>  	if (!r->settings.core_multi_pack_index)
>  		return 0;
>  
> -	for (m_search = r->objects->multi_pack_index; m_search; m_search = m_search->next)
> -		if (!strcmp(object_dir, m_search->object_dir))
> -			return 1;

We are able to avoid searching the linked list because we know if the
source already has a multi_pack_index it must also be inserted into the
linked list. Makes sense.

> -	m = load_multi_pack_index(r, object_dir, local);
> +	if (source->multi_pack_index)
> +		return 1;
>  
> +	m = load_multi_pack_index(r, source->path, local);
>  	if (m) {
>  		struct multi_pack_index *mp = r->objects->multi_pack_index;
>  		if (mp) {
>  			m->next = mp->next;
>  			mp->next = m;
> -		} else
> +		} else {
>  			r->objects->multi_pack_index = m;
> +		}
> +		source->multi_pack_index = m;

We still insert the multi-pack index into the list, but more importantly
we now store it in the source it pertains to. It is a bit awkward that
source now points to a list of midx when we really only care about the
first entry, but I suspect that will be addressed later in the series by
doing away with the list entirely.

> +
>  		return 1;
>  	}
>  
> @@ -837,6 +838,8 @@ void clear_midx_file(struct repository *r)
>  	if (r->objects && r->objects->multi_pack_index) {
>  		close_midx(r->objects->multi_pack_index);
>  		r->objects->multi_pack_index = NULL;
> +		for (struct odb_source *source = r->objects->sources; source; source = source->next)
> +			source->multi_pack_index = NULL;
>  	}
>  
>  	if (remove_path(midx.buf))
> diff --git a/midx.h b/midx.h
> index 9d1374cbd58..b1626a9a7c4 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -3,11 +3,12 @@
>  
>  #include "string-list.h"
>  
> +struct bitmapped_pack;
> +struct git_hash_algo;
>  struct object_id;
> +struct odb_source;
>  struct pack_entry;
>  struct repository;
> -struct bitmapped_pack;
> -struct git_hash_algo;
>  
>  #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
>  #define MIDX_VERSION 1
> @@ -123,7 +124,7 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa
>  int midx_contains_pack(struct multi_pack_index *m,
>  		       const char *idx_or_pack_name);
>  int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id);
> -int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
> +int prepare_multi_pack_index_one(struct odb_source *source, int local);
>  
>  /*
>   * Variant of write_midx_file which writes a MIDX containing only the packs
> diff --git a/odb.h b/odb.h
> index e922f256802..8e79c7be520 100644
> --- a/odb.h
> +++ b/odb.h
> @@ -9,10 +9,11 @@
>  #include "string-list.h"
>  #include "thread-utils.h"
>  
> +struct multi_pack_index;
>  struct oidmap;
>  struct oidtree;
> -struct strbuf;
>  struct repository;
> +struct strbuf;
>  
>  /*
>   * Compute the exact path an alternate is at and returns it. In case of
> @@ -55,6 +56,13 @@ struct odb_source {
>  	/* Map between object IDs for loose objects. */
>  	struct loose_object_map *loose_map;
>  
> +	/*
> +	 * private data
> +	 *
> +	 * should only be accessed directly by packfile.c and midx.c
> +	 */
> +	struct multi_pack_index *multi_pack_index;
> +
>  	/*
>  	 * This is a temporary object store created by the tmp_objdir
>  	 * facility. Disable ref updates since the objects in the store
> @@ -75,7 +83,6 @@ struct odb_source {
>  };
>  
>  struct packed_git;
> -struct multi_pack_index;
>  struct cached_object_entry;
>  
>  /*
> diff --git a/packfile.c b/packfile.c
> index af9ccfdba62..16efc2fdca3 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -372,6 +372,8 @@ void close_object_store(struct object_database *o)
>  	if (o->multi_pack_index) {
>  		close_midx(o->multi_pack_index);
>  		o->multi_pack_index = NULL;
> +		for (struct odb_source *source = o->sources; source; source = source->next)
> +			source->multi_pack_index = NULL;

Ok, so cleanup of the multi-pack index is still handled by recursively
iterating through the global list via `cose_midx()`, but we now unset
the midx in each source. Looks good.

-Justin

>  	}
>  
>  	close_commit_graph(o);
> @@ -1037,7 +1039,7 @@ static void prepare_packed_git(struct repository *r)
>  	odb_prepare_alternates(r->objects);
>  	for (source = r->objects->sources; source; source = source->next) {
>  		int local = (source == r->objects->sources);
> -		prepare_multi_pack_index_one(r, source->path, local);
> +		prepare_multi_pack_index_one(source, local);
>  		prepare_packed_git_one(r, source->path, local);
>  	}
>  	rearrange_packed_git(r);
> 
> -- 
> 2.50.1.327.g047016eb4a.dirty
> 
> 




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux