Han Young <hanyang.tony@xxxxxxxxxxxxx> writes: > If the promisor objects fail to fetch, we check the remaining objects > to see if they are indeed promisor objects. Then, we die on the first > remaining promisor object. However, this promisor object check is > unnecessary because callers of promisor_remote_get_direct already filter > out local objects. All objects passed to promisor_remote_get_direct are > promisor objects. > > The is_promisor_object check essentially iterates through every object > in the local packfiles and adds them to an oid set. This process is > agonizingly slow for large repositories. > Remove the check so that we fail immediately. Thanks for CC'ing Christian and Jonathan Tan, who may indeed be good reviewers for this change. let me think aloud a bit, but because I haven't looked at this part of the system for quite some time, what I mumble may not make much sense. Please bear with me. So the incoming list remaining_oids[] is what was already filtered by the caller and everything in it should be promisor? If so ... > - for (i = 0; i < remaining_nr; i++) { > - if (is_promisor_object(repo, &remaining_oids[i])) > - die(_("could not fetch %s from promisor remote"), > - oid_to_hex(&remaining_oids[i])); ... the first iteration of this loop, after is_promisor_object() says the object is a promisor object (which is guaranteed if everybody in that remaining_oids[] array is promisor object), will die immediately. So your "agonizingly slow" comes from just one single call to is_promisor_object() function is ultra slow? If that is the case, the patch, including the above explanation, makes perfect sense (note that I didn't verify the "everybody in the remaining_oids[] array is guaranteed to be a promisor object" claim myself, though). But at the same time, it sounds like is_promisor_object() seriously is wrong. Perhaps we need to tell pack-objects to pre-compute the packfile.c:add_promisor_object() stuff and cache the result in an on-disk file, just like reverse index is stored in an auxiliary file? What do the callers of the function use to "filter out local objects" to ensure that "all objects passed ... are promisor objects" do? Have they already spent agonizingly large amount of time to do so? I sampled a few existing callers but they do not explicitly restrict what they throw at the function to promisor objects---they prepare an oid array, throw anything they do not see locally in the array and call this function. So objects in the array are either promisor objects or missing due to repository corruption---we simply cannot tell. I suspect that the claim "everything the caller calls the function with is a promisor object" is not exactly correct. The end-result when remaining_nr is not zero (i.e., some objects that the caller wanted us to be fetched) would not exactly be the same. The function used to die only when the object we failed to obtain was what a promisor remote promised to give us. With this change, we also die when an object the caller asked us to fetch is not promised by any promisor. I do not know what the implication of this behaviour change would be. Reviewers, comments? > + if (remaining_nr) { > + die(_("could not fetch %s from promisor remote"), > + oid_to_hex(&remaining_oids[0])); > }