Re: [PATCH 07/14] refs/iterator: separate lifecycle from iteration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt <ps@xxxxxx> writes:

> The ref and reflog iterators have their lifecycle attached to iteration:
> once the iterator reaches its end, it is automatically released and the
> caller doesn't have to care about that anymore. When the iterator should
> be released before it has been exhausted, callers must explicitly abort
> the iterator via `ref_iterator_abort()`.
>
> This lifecycle is somewhat unusual in the Git codebase and creates two
> problems:
>
>   - Callsites need to be very careful about when exactly they call
>     `ref_iterator_abort()`, as calling the function is only valid when
>     the iterator itself still is. This leads to somewhat awkward calling
>     patterns in some situations.
>
>   - It is impossible to reuse iterators and re-seek them to a different
>     prefix. This feature isn't supported by any iterator implementation
>     except for the reftable iterators anyway, but if it was implemented
>     it would allow us to optimize cases where we need to search for
>     specific references repeatedly by reusing internal state.
>
> Detangle the lifecycle from iteration so that we don't deallocate the
> iterator anymore once it is exhausted. Instead, callers are now expected
> to always call a newly introduce `ref_iterator_free()` function that
> deallocates the iterator and its internal state.
>
> While at it, drop the return value of `ref_iterator_abort()`, which
> wasn't really required by any of the iterator implementations anyway.
> Furthermore, stop calling `base_ref_iterator_free()` in any of the
> backends, but instead call it in `ref_iterator_free()`.
>
> Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> ---
>  builtin/clone.c              |  2 +
>  dir-iterator.c               | 24 +++++------
>  dir-iterator.h               | 13 ++----
>  refs.c                       |  7 +++-
>  refs/debug.c                 |  9 ++---
>  refs/files-backend.c         | 36 +++++------------
>  refs/iterator.c              | 95 ++++++++++++++------------------------------
>  refs/packed-backend.c        | 27 ++++++-------
>  refs/ref-cache.c             |  9 ++---
>  refs/refs-internal.h         | 31 +++++----------
>  refs/reftable-backend.c      | 34 ++++------------
>  t/helper/test-dir-iterator.c |  1 +
>  12 files changed, 99 insertions(+), 189 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index fd001d800c6..ac3e84b2b18 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -426,6 +426,8 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
>  		strbuf_setlen(src, src_len);
>  		die(_("failed to iterate over '%s'"), src->buf);
>  	}
> +
> +	dir_iterator_free(iter);
>  }
>

A bit puzzled to see `dir_iterator_*` change here, I'm assuming it's
linked to the 'files-backend' and perhaps similar to the changes
mentioned about `ref_iterator_*` in the commit message. Would be nice to
call out in the commit message too.

[snip]

> @@ -223,7 +223,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  		} else {
>  			if (level->entries_idx >= level->entries.nr) {
>  				if (pop_level(iter) == 0)
> -					return dir_iterator_abort(dir_iterator);
> +					return ITER_DONE;
>  				continue;
>  			}
>
> @@ -232,22 +232,21 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>
>  		if (prepare_next_entry_data(iter, name)) {
>  			if (errno != ENOENT && iter->flags & DIR_ITERATOR_PEDANTIC)
> -				goto error_out;
> +				return ITER_ERROR;
>  			continue;
>  		}
>
>  		return ITER_OK;
>  	}
> -
> -error_out:
> -	dir_iterator_abort(dir_iterator);
> -	return ITER_ERROR;
>  }

Okay yeah, we're getting rid of `dir_iterator_abort` so potentially add
`dir_iterator_free` below

>
> -int dir_iterator_abort(struct dir_iterator *dir_iterator)
> +void dir_iterator_free(struct dir_iterator *dir_iterator)
>  {
>  	struct dir_iterator_int *iter = (struct dir_iterator_int *)dir_iterator;
>
> +	if (!iter)
> +		return;
> +
>  	for (; iter->levels_nr; iter->levels_nr--) {
>  		struct dir_iterator_level *level =
>  			&iter->levels[iter->levels_nr - 1];
> @@ -266,7 +265,6 @@ int dir_iterator_abort(struct dir_iterator *dir_iterator)
>  	free(iter->levels);
>  	strbuf_release(&iter->base.path);
>  	free(iter);
> -	return ITER_DONE;
>  }
>
>  struct dir_iterator *dir_iterator_begin(const char *path, unsigned int flags)

Okay this makes sense!

[snip]

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 29f08dced40..9511b6f3448 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -919,10 +919,6 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  		return ITER_OK;
>  	}
>
> -	iter->iter0 = NULL;
> -	if (ref_iterator_abort(ref_iterator) != ITER_DONE)
> -		ok = ITER_ERROR;
> -
>

Since we're explicitly going to call `ref_iterator_free`, this makes sense.

>  	return ok;
>  }
>
> @@ -935,23 +931,17 @@ static int files_ref_iterator_peel(struct ref_iterator *ref_iterator,
>  	return ref_iterator_peel(iter->iter0, peeled);
>  }
>
> -static int files_ref_iterator_abort(struct ref_iterator *ref_iterator)
> +static void files_ref_iterator_release(struct ref_iterator *ref_iterator)
>  {
>  	struct files_ref_iterator *iter =
>  		(struct files_ref_iterator *)ref_iterator;
> -	int ok = ITER_DONE;
> -
> -	if (iter->iter0)
> -		ok = ref_iterator_abort(iter->iter0);
> -
> -	base_ref_iterator_free(ref_iterator);
> -	return ok;
> +	ref_iterator_free(iter->iter0);
>  }
>

I like how much more cleaner it looks now.

[snip]

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index a7b6f74b6e3..38a1956d1a8 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -954,9 +954,6 @@ static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
>  		return ITER_OK;
>  	}
>
> -	if (ref_iterator_abort(ref_iterator) != ITER_DONE)
> -		ok = ITER_ERROR;
> -
>  	return ok;
>  }
>

The merged_iterator is used to combine the files and packed backend
iterators to provide a uniform view over them. Likewise the changes here
seem similar too.

> @@ -976,23 +973,19 @@ static int packed_ref_iterator_peel(struct ref_iterator *ref_iterator,
>  	}
>  }
>
> -static int packed_ref_iterator_abort(struct ref_iterator *ref_iterator)
> +static void packed_ref_iterator_release(struct ref_iterator *ref_iterator)
>  {
>  	struct packed_ref_iterator *iter =
>  		(struct packed_ref_iterator *)ref_iterator;
> -	int ok = ITER_DONE;
> -
>  	strbuf_release(&iter->refname_buf);
>  	free(iter->jump);
>  	release_snapshot(iter->snapshot);
> -	base_ref_iterator_free(ref_iterator);
> -	return ok;
>  }
>
>  static struct ref_iterator_vtable packed_ref_iterator_vtable = {
>  	.advance = packed_ref_iterator_advance,
>  	.peel = packed_ref_iterator_peel,
> -	.abort = packed_ref_iterator_abort
> +	.release = packed_ref_iterator_release,
>  };
>
>  static int jump_list_entry_cmp(const void *va, const void *vb)
> @@ -1362,8 +1355,10 @@ static int write_with_updates(struct packed_ref_store *refs,
>  	 */
>  	iter = packed_ref_iterator_begin(&refs->base, "", NULL,
>  					 DO_FOR_EACH_INCLUDE_BROKEN);
> -	if ((ok = ref_iterator_advance(iter)) != ITER_OK)
> +	if ((ok = ref_iterator_advance(iter)) != ITER_OK) {
> +		ref_iterator_free(iter);

Nit: Since we don't return early here, wouldn't the `ref_iterator_free`
at the end of the function be sufficient? I think the only early return
when `iter == NULL` is towards the end of the function, where it might
be better to add a `goto error`.

[snip]

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux