On Fri, Apr 11, 2025 at 11:29:55AM +0200, Patrick Steinhardt wrote:

> Split out functions relating to the index subsystem from "object-file.c"
> to help us separate concerns.

I know these functions all start with "index_", and they do take an
index_state variable, but I'm not sure they are really about Git's index
subsystem at all. The term "index" here is more about "compute the sha1
index of the content".

E.g., the function index_path() goes all the way back to ec1fcc16af
(Show original and resulting blob object info in diff output.,
2005-10-07)! Back then it did not take an index struct, or even care
about having an index at all.

Later, they learned to call convert_to_git() in 6c510bee20 (Lazy man's
auto-CRLF, 2007-02-13). And that function may check the index for
.gitattributes files. It originally just used the global the_index
variable for that, but later commits like 58bf2a4cc7 (sha1-file.c:
remove implicit dependency on the_index, 2018-09-21) passed the istate
around the call stack.

So having access to an index struct is mostly incidental to these
functions. Which makes sense looking at the callers: there are many
pure-object operations that would work without an index (or even a repo
in some cases!) like hash-object, git-replace, diff.

  Side note: I'm actually not even sure we would read attributes from
  the index, since we don't set GIT_ATTR_INDEX. So I wondered if we
  could simply pass NULL to convert_to_git() here. But I think these
  days some of the "auto" CRLF modes also have heuristics based on
  what's the content we find in the index for that path. See
  has_crlf_in_index() and its callers.

So it seems to me that these really are more about creating objects than
they are about the index. I don't mind splitting them out, but it seems
like they're equally weird in read-cache.[ch].

-Peff