On Wed, Aug 20, 2025 at 03:42:11PM -0400, Derrick Stolee wrote:
> On 8/20/2025 3:02 PM, Junio C Hamano wrote:
> > "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> >
> >> The core problem here is that the "maybe_interesting" member of 'struct
> >> type_and_oid_list' is not initialized to '1'. This member was added in
> >> 6333e7ae0b (path-walk: mark trees and blobs as UNINTERESTING,
> >> 2024-12-20) in a way to help when creating packfiles for a small commit
> >> range using the sparse path algorithm (enabled by pack.useSparse=true).
> >
> > OK, in other words, the bug is fairly contained within the path-walk
> > traversal. We treat things as reachable not just from ref tips and
> > reflogs (where path-walk code can use the tree object to compute on
> > what pathname each blob comes from) and the main index array (that
> > has paths, even though it needs separate way to compute than those
> > for trees), but also from places like REUC and TREE extensions that
> > make associations between pathnames and objects. Are they also OK?
>
> The key integration point is the "pending" list operating a bit
> different from walking directly from tags or commits. I was trying
> to reproduce the issue from all of those other sources before unlocking
> the "singleton" nature of the problem, and failed to do so.
>
> The resolve-undo cache (REUC) is something that I had not tested
> previously. Adding "git rm --cached x/y" to the test in the previous
> case leads to the 'git fsck' call giving a "dangling blob" warning,
> so that could be an interesting way to strengthen the test.

Thanks, I also wonder a bit about the future -- if we ever add a new
source for pending objects, would the author have to amend "path-walk.c"
to take this new pending source into account? I guess the answer is
"yes", which does make me feel a bit uneasy as it is very easy to now
corrupt the repository.

Patrick