The `git-for-each-ref(1)` command is used to iterate over references
present in a repository. In large repositories with millions of
references, it would be optimal to paginate this output such that we
can start iteration from a given reference. This would avoid having to
iterate over all references from the beginning each time when paginating
through results.

This series adds a '--start-after' option to 'git-for-each-ref(1)'. When
used, the reference iteration seeks to the first reference following the
marker alphabetically. When paging, it should be noted that references
may be deleted, modified or added between invocations. Output will only
yield those references which follow the marker lexicographically. If the
marker does not exist, output begins from the first reference that would
come after it alphabetically.

This enables efficient pagination workflows like:

    git for-each-ref --count=100
    git for-each-ref --count=100 --start-after=refs/heads/branch-100
    git for-each-ref --count=100 --start-after=refs/heads/branch-200

To add this functionality, we expose the `ref_iterator` outside the
'refs/' namespace and modify `ref_iterator_seek()` to actually seek to a
given reference and only set the prefix when the
`REF_ITERATOR_SEEK_SET_PREFIX` flag is set. On the reftable and packed
backends, the changes are simple. But since the files backend uses
'ref-cache' for reference handling, the changes there are a little more
involved, since we need to set up the right levels and the indexing.

Initially I was also planning to clean up all the `refs_for_each...()`
functions in 'refs.h' by simply using the iterator, but this bloated the
series. So I've left that for another day.

Changes in v4:
- Patch 3/4: Move around the documentation for the flag and rename the
  seek variable to refname.
- Patch 4/4: Clean up the commit message and also the documentation.
- Link to v3: https://lore.kernel.org/r/20250708-306-git-for-each-ref-pagination-v3-0-8cfba1080be4@xxxxxxxxx

Changes in v3:
- Change the behavior of the command to exclude the marker provided.
  With this, rename the flag to '--start-after'.
- Extend the documentation to add a note about concurrent modifications
  to the reference database.
- Link to v2: https://lore.kernel.org/r/20250704-306-git-for-each-ref-pagination-v2-0-bcde14acdd81@xxxxxxxxx

Changes in v2:
- Modify 'ref_iterator_seek()' to take in flags instead of a
  'set_prefix' variable. This improves readability, where users would
  use the 'REF_ITERATOR_SEEK_SET_PREFIX' instead of simply passing '1'.
- When the set prefix flag isn't used, reset any previously set prefix.
  This ensures that the internal prefix state is always reset whenever
  we seek and unifies the behavior between 'ref_iterator_seek' and
  'ref_iterator_begin'.
- Don't allow '--skip-until' to be run with '--sort', since the seeking
  always takes place before any sorting and this can be confusing.
- Some styling fixes:
  - Remove extra newline
  - Skip braces around single lined if...else clause
  - Add braces around 'if' clause
  - Fix indentation
- Link to v1: https://lore.kernel.org/git/20250701-306-git-for-each-ref-pagination-v1-0-4f0ae7c0688f@xxxxxxxxx/

Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
---
 Documentation/git-for-each-ref.adoc |  10 +-
 builtin/for-each-ref.c              |   8 ++
 ref-filter.c                        |  80 +++++++++++---
 ref-filter.h                        |   1 +
 refs.c                              |   6 +-
 refs.h                              | 155 ++++++++++++++++++++++++++++
 refs/debug.c                        |   7 +-
 refs/files-backend.c                |   7 +-
 refs/iterator.c                     |  26 +++--
 refs/packed-backend.c               |  17 ++--
 refs/ref-cache.c                    |  99 ++++++++++++++----
 refs/ref-cache.h                    |   7 --
 refs/refs-internal.h                | 152 ++--------------------------
 refs/reftable-backend.c             |  21 ++--
 t/t6302-for-each-ref-filter.sh      | 194 ++++++++++++++++++++++++++++++++++++
 15 files changed, 564 insertions(+), 226 deletions(-)

Karthik Nayak (4):
      refs: expose `ref_iterator` via 'refs.h'
      ref-cache: remove unused function 'find_ref_entry()'
      refs: selectively set prefix in the seek functions
      for-each-ref: introduce a '--start-after' option

Range-diff versus v3:

1:  eed39162f5 = 1:  9e6ecff291 refs: expose `ref_iterator` via 'refs.h'
2:  b9db49d31b = 2:  22f5222e4f ref-cache: remove unused function 'find_ref_entry()'
3:  502e2696fd ! 3:  0e71d8ffd9 refs: selectively set prefix in the seek functions
    @@ refs.h: struct ref_iterator *refs_ref_iterator_begin(
     +enum ref_iterator_seek_flag {
     +	/*
    -+	 * Also set the seek pattern as a prefix for iteration. This ensures
    -+	 * that only references which match the prefix are yielded.
    ++	 * When the REF_ITERATOR_SEEK_SET_PREFIX flag is set, the iterator's prefix is
    ++	 * updated to match the provided string, affecting all subsequent iterations. If
    ++	 * not, the iterator seeks to the specified reference and clears any previously
    ++	 * set prefix.
     +	 */
     +	REF_ITERATOR_SEEK_SET_PREFIX = (1 << 0),
     +};
    @@ refs.h: struct ref_iterator *refs_ref_iterator_begin(
     - * passed when creating the iterator will remain unchanged.
     + * This function is expected to behave as if a new ref iterator has been
     + * created, but allows reuse of existing iterators for optimization.
    -+ *
    -+ * When the REF_ITERATOR_SEEK_SET_PREFIX flag is set, the iterator's prefix is
    -+ * updated to match the seek string, affecting all subsequent iterations. If
    -+ * not, the iterator seeks to the specified reference and clears any previously
    -+ * set prefix.
       *
       * Returns 0 on success, a negative error code otherwise.
       */
     -int ref_iterator_seek(struct ref_iterator *ref_iterator,
     -		      const char *prefix);
    -+int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *seek,
    ++int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *refname,
     +		      unsigned int flags);
      /*
    @@ refs/debug.c: static int debug_ref_iterator_advance(struct ref_iterator *ref_ite
      static int debug_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				   const char *prefix)
    -+				   const char *seek, unsigned int flags)
    ++				   const char *refname, unsigned int flags)
      {
      	struct debug_ref_iterator *diter =
      		(struct debug_ref_iterator *)ref_iterator;
     -	int res = diter->iter->vtable->seek(diter->iter, prefix);
     -	trace_printf_key(&trace_refs, "iterator_seek: %s: %d\n", prefix ? prefix : "", res);
    -+	int res = diter->iter->vtable->seek(diter->iter, seek, flags);
    ++	int res = diter->iter->vtable->seek(diter->iter, refname, flags);
     +	trace_printf_key(&trace_refs, "iterator_seek: %s flags: %d: %d\n",
    -+			 seek ? seek : "", flags, res);
    ++			 refname ? refname : "", flags, res);
      	return res;
      }
    @@ refs/files-backend.c: static int files_ref_iterator_advance(struct ref_iterator
      static int files_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				   const char *prefix)
    -+				   const char *seek, unsigned int flags)
    ++				   const char *refname, unsigned int flags)
      {
      	struct files_ref_iterator *iter =
      		(struct files_ref_iterator *)ref_iterator;
     -	return ref_iterator_seek(iter->iter0, prefix);
    -+	return ref_iterator_seek(iter->iter0, seek, flags);
    ++	return ref_iterator_seek(iter->iter0, refname, flags);
      }
      static int files_ref_iterator_peel(struct ref_iterator *ref_iterator,
    @@ refs/files-backend.c: static int files_reflog_iterator_advance(struct ref_iterat
      static int files_reflog_iterator_seek(struct ref_iterator *ref_iterator UNUSED,
     -				      const char *prefix UNUSED)
    -+				      const char *seek UNUSED,
    ++				      const char *refname UNUSED,
     +				      unsigned int flags UNUSED)
      {
      	BUG("ref_iterator_seek() called for reflog_iterator");
    @@ refs/iterator.c: int ref_iterator_advance(struct ref_iterator *ref_iterator)
     -int ref_iterator_seek(struct ref_iterator *ref_iterator,
     -		      const char *prefix)
    -+int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *seek,
    ++int ref_iterator_seek(struct ref_iterator *ref_iterator, const char *refname,
     +		      unsigned int flags)
      {
     -	return ref_iterator->vtable->seek(ref_iterator, prefix);
    -+	return ref_iterator->vtable->seek(ref_iterator, seek, flags);
    ++	return ref_iterator->vtable->seek(ref_iterator, refname, flags);
      }
      int ref_iterator_peel(struct ref_iterator *ref_iterator,
    @@ refs/iterator.c: static int empty_ref_iterator_advance(struct ref_iterator *ref_
      static int empty_ref_iterator_seek(struct ref_iterator *ref_iterator UNUSED,
     -				   const char *prefix UNUSED)
    -+				   const char *seek UNUSED,
    ++				   const char *refname UNUSED,
     +				   unsigned int flags UNUSED)
      {
      	return 0;
    @@ refs/iterator.c: static int merge_ref_iterator_advance(struct ref_iterator *ref_
      static int merge_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				   const char *prefix)
    -+				   const char *seek, unsigned int flags)
    ++				   const char *refname, unsigned int flags)
      {
      	struct merge_ref_iterator *iter =
      		(struct merge_ref_iterator *)ref_iterator;
    @@ refs/iterator.c: static int merge_ref_iterator_seek(struct ref_iterator *ref_ite
      	iter->iter1 = iter->iter1_owned;
     -	ret = ref_iterator_seek(iter->iter0, prefix);
    -+	ret = ref_iterator_seek(iter->iter0, seek, flags);
    ++	ret = ref_iterator_seek(iter->iter0, refname, flags);
      	if (ret < 0)
      		return ret;
     -	ret = ref_iterator_seek(iter->iter1, prefix);
    -+	ret = ref_iterator_seek(iter->iter1, seek, flags);
    ++	ret = ref_iterator_seek(iter->iter1, refname, flags);
      	if (ret < 0)
      		return ret;
    @@ refs/iterator.c: static int prefix_ref_iterator_advance(struct ref_iterator *ref
      static int prefix_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				    const char *prefix)
    -+				    const char *seek, unsigned int flags)
    ++				    const char *refname, unsigned int flags)
      {
      	struct prefix_ref_iterator *iter =
      		(struct prefix_ref_iterator *)ref_iterator;
    @@ refs/iterator.c: static int prefix_ref_iterator_advance(struct ref_iterator *ref
     +
     +	if (flags & REF_ITERATOR_SEEK_SET_PREFIX) {
     +		free(iter->prefix);
    -+		iter->prefix = xstrdup_or_null(seek);
    ++		iter->prefix = xstrdup_or_null(refname);
     +	}
    -+	return ref_iterator_seek(iter->iter0, seek, flags);
    ++	return ref_iterator_seek(iter->iter0, refname, flags);
      }
      static int prefix_ref_iterator_peel(struct ref_iterator *ref_iterator,
    @@ refs/packed-backend.c: static int packed_ref_iterator_advance(struct ref_iterato
      static int packed_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				    const char *prefix)
    -+				    const char *seek, unsigned int flags)
    ++				    const char *refname, unsigned int flags)
      {
      	struct packed_ref_iterator *iter =
      		(struct packed_ref_iterator *)ref_iterator;
    @@ refs/packed-backend.c: static int packed_ref_iterator_advance(struct ref_iterato
     -	if (prefix && *prefix)
     -		start = find_reference_location(iter->snapshot, prefix, 0);
    -+	if (seek && *seek)
    -+		start = find_reference_location(iter->snapshot, seek, 0);
    ++	if (refname && *refname)
    ++		start = find_reference_location(iter->snapshot, refname, 0);
      	else
      		start = iter->snapshot->start;
    @@ refs/packed-backend.c: static int packed_ref_iterator_advance(struct ref_iterato
     +	FREE_AND_NULL(iter->prefix);
     +
     +	if (flags & REF_ITERATOR_SEEK_SET_PREFIX)
    -+		iter->prefix = xstrdup_or_null(seek);
    ++		iter->prefix = xstrdup_or_null(refname);
     +
      	iter->pos = start;
      	iter->eof = iter->snapshot->eof;
    @@ refs/ref-cache.c: static int cache_ref_iterator_seek(struct ref_iterator *ref_it
      }
     +static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
    -+				   const char *seek, unsigned int flags)
    ++				   const char *refname, unsigned int flags)
     +{
     +	struct cache_ref_iterator *iter =
     +		(struct cache_ref_iterator *)ref_iterator;
     +
     +	if (flags & REF_ITERATOR_SEEK_SET_PREFIX) {
    -+		return cache_ref_iterator_set_prefix(iter, seek);
    -+	} else if (seek && *seek) {
    ++		return cache_ref_iterator_set_prefix(iter, refname);
    ++	} else if (refname && *refname) {
     +		struct cache_ref_iterator_level *level;
    -+		const char *slash = seek;
    ++		const char *slash = refname;
     +		struct ref_dir *dir;
     +
     +		dir = get_ref_dir(iter->cache->root);
     +
     +		if (iter->prime_dir)
    -+			prime_ref_dir(dir, seek);
    ++			prime_ref_dir(dir, refname);
     +
     +		iter->levels_nr = 1;
     +		level = &iter->levels[0];
    @@ refs/ref-cache.c: static int cache_ref_iterator_seek(struct ref_iterator *ref_it
     +			sort_ref_dir(dir);
     +
     +			slash = strchr(slash, '/');
    -+			len = slash ? slash - seek : (int)strlen(seek);
    ++			len = slash ? slash - refname : (int)strlen(refname);
     +
     +			for (idx = 0; idx < dir->nr; idx++) {
    -+				cmp = strncmp(seek, dir->entries[idx]->name, len);
    ++				cmp = strncmp(refname, dir->entries[idx]->name, len);
     +				if (cmp <= 0)
     +					break;
     +			}
    @@ refs/refs-internal.h: void base_ref_iterator_init(struct ref_iterator *iter,
       */
      typedef int ref_iterator_seek_fn(struct ref_iterator *ref_iterator,
     -				 const char *prefix);
    -+				 const char *seek, unsigned int flags);
    ++				 const char *refname, unsigned int flags);
      /*
       * Peels the current ref, returning 0 for success or -1 for failure.
    @@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_ite
      static int reftable_ref_iterator_seek(struct ref_iterator *ref_iterator,
     -				      const char *prefix)
    -+				      const char *seek, unsigned int flags)
    ++				      const char *refname, unsigned int flags)
      {
      	struct reftable_ref_iterator *iter =
      		(struct reftable_ref_iterator *)ref_iterator;
    @@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_ite
     +	iter->prefix_len = 0;
     +
     +	if (flags & REF_ITERATOR_SEEK_SET_PREFIX) {
    -+		iter->prefix = xstrdup_or_null(seek);
    -+		iter->prefix_len = seek ? strlen(seek) : 0;
    ++		iter->prefix = xstrdup_or_null(refname);
    ++		iter->prefix_len = refname ? strlen(refname) : 0;
     +	}
    -+	iter->err = reftable_iterator_seek_ref(&iter->iter, seek);
    ++	iter->err = reftable_iterator_seek_ref(&iter->iter, refname);
      	return iter->err;
      }
    @@ refs/reftable-backend.c: static int reftable_reflog_iterator_advance(struct ref_
      static int reftable_reflog_iterator_seek(struct ref_iterator *ref_iterator UNUSED,
     -					 const char *prefix UNUSED)
    -+					 const char *seek UNUSED,
    ++					 const char *refname UNUSED,
     +					 unsigned int flags UNUSED)
      {
      	BUG("reftable reflog iterator cannot be seeked");
4:  a571579886 ! 4:  e4e9dddd15 for-each-ref: introduce a '--start-after' option
    @@ Commit message
         'git-for-each-ref(1)'. When used, the reference iteration seeks to the
         lexicographically next reference and iterates from there onward.
    -    This enables efficient pagination workflows like:
    +    This enables efficient pagination workflows, where the calling script
    +    can remember the last provided reference and use that as the starting
    +    point for the next set of references:
             git for-each-ref --count=100
             git for-each-ref --count=100 --start-after=refs/heads/branch-100
             git for-each-ref --count=100 --start-after=refs/heads/branch-200
    @@ Documentation/git-for-each-ref.adoc: TAB %(refname)`.
      --include-root-refs::
      	List root refs (HEAD and pseudorefs) apart from regular refs.
    -+--start-after::
    ++--start-after=<marker>::
     +	Allows paginating the output by skipping references up to and including the
     +	specified marker. When paging, it should be noted that references may be
     +	deleted, modified or added between invocations. Output will only yield those
    -+	references which follow the marker lexicographically. If the marker does not
    -+	exist, output begins from the first reference that would come after it
    -+	alphabetically. Cannot be used with general pattern matching or custom
    -+	sort options.
    ++	references which follow the marker lexicographically. Output begins from the
    ++	first reference that would come after the marker alphabetically. Cannot be
    ++	used with general pattern matching or custom sort options.
     +
      FIELD NAMES
      -----------

base-commit: cf6f63ea6bf35173e02e18bdc6a4ba41288acff9
change-id: 20250605-306-git-for-each-ref-pagination-0ba8a29ae646

Thanks
- Karthik
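
For illustration only, the '--start-after' semantics described in the
documentation hunk (the marker itself is excluded, and it does not need
to exist) can be sketched over a sorted in-memory array. The helpers
`first_after()` and `next_page()` below are hypothetical names that are
not part of Git; the series itself seeks inside the ref backends rather
than over an array like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not Git code: return the index of the first
 * entry in the strcmp()-sorted array `refs` that sorts strictly after
 * `marker`, or `nr` if no such entry exists. The marker does not have
 * to be present in the array.
 */
static size_t first_after(const char *const *refs, size_t nr,
			  const char *marker)
{
	size_t lo = 0, hi = nr;

	assert(refs || !nr);
	assert(marker);

	/* Binary search for the upper bound of `marker`. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(refs[mid], marker) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Hypothetical helper, not Git code: store up to `count` refs that
 * follow `marker` in `page` and return how many were stored. A NULL
 * marker starts from the first ref, like a plain '--count=<n>' call.
 */
static size_t next_page(const char *const *refs, size_t nr,
			const char *marker, size_t count,
			const char **page)
{
	size_t start = marker ? first_after(refs, nr, marker) : 0;
	size_t n = 0;

	while (n < count && start + n < nr) {
		page[n] = refs[start + n];
		n++;
	}
	return n;
}
```

A caller would pass the last refname of the previous page as the marker
for the next call, which mirrors how a script would chain the
'--count' and '--start-after' invocations shown in the commit message.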