On Wed, Sep 03, 2025 at 09:23:25PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@xxxxxx> writes: > > diff --git a/upload-pack.c b/upload-pack.c > > index 4f26f6afc7..fba3e339e2 100644 > > --- a/upload-pack.c > > +++ b/upload-pack.c > > @@ -476,20 +476,21 @@ static void create_pack_file(struct upload_pack_data *pack_data, > > > > static int do_got_oid(struct upload_pack_data *data, const struct object_id *oid) > > { > > - int we_knew_they_have = 0; > > struct object *o = parse_object_with_flags(the_repository, oid, > > PARSE_OBJECT_SKIP_HASH_CHECK | > > PARSE_OBJECT_DISCARD_TREE); > > > > if (!o) > > die("oops (%s)", oid_to_hex(oid)); > > + > > + if (o->flags & THEY_HAVE) > > + return 0; > > + o->flags |= THEY_HAVE; > > + > > if (o->type == OBJ_COMMIT) { > > struct commit_list *parents; > > struct commit *commit = (struct commit *)o; > > - if (o->flags & THEY_HAVE) > > - we_knew_they_have = 1; > > - else > > - o->flags |= THEY_HAVE; > > + > > OK, so regardless of the type, an object that we already know they > have do not need to be processed twice, which makes sense? > > For commits, we used to ensure that even if we see a commit that we > know they have for the second time, instead of returning, we marked > its immediate parents with THEY_HAVE bit. We no longer do so. > > Instead, the new code only paints the parents of a commit that we > haven't painted with THEY_HAVE. Otherwise we return without doing > anything. > > I wonder if they are equivalent. In the original code, when our > commit A has a parent B, where neither of them we haven't painted, > we paint A with THEY_HAVE, we paint all parents of A with THEY_HAVE, > and put A in the have_obj array and return 1. Then later when we > feed B to this function, we already know they have B, so we do not > add it to have_obj array, but otherwise we still paint B's parents > with THEY_HAVE bit. In the new code with the same set-up, we paint > A and B with THEY_HAVE and put A in the have_obj array and return 1. > When the caller feeds B to this function, we pretty much do nothing > and return, so B's parent will not be painted, will it? > > I must be misreading the patch, but I do not quite see how. Oh, I think you're indeed correct. I totally missed that we wouldn't paint the grandparents anymore with this change. It's quite surprising that this didn't cause any issues whatsoever. But that might be due to how clients typically negotiate HAVEs with the remote side: when the server ACKs an object there is no reason for the client to go deeper, so they stop advertising any of its parents. There will be cases where this breaks though. So we should probably move the return until after we have marked parents, like in the below patch. Patrick diff --git a/upload-pack.c b/upload-pack.c index fba3e339e2..9b9b149068 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -483,10 +483,6 @@ static int do_got_oid(struct upload_pack_data *data, const struct object_id *oid if (!o) die("oops (%s)", oid_to_hex(oid)); - if (o->flags & THEY_HAVE) - return 0; - o->flags |= THEY_HAVE; - if (o->type == OBJ_COMMIT) { struct commit_list *parents; struct commit *commit = (struct commit *)o; @@ -499,6 +495,10 @@ static int do_got_oid(struct upload_pack_data *data, const struct object_id *oid parents->item->object.flags |= THEY_HAVE; } + if (o->flags & THEY_HAVE) + return 0; + o->flags |= THEY_HAVE; + add_object_array(o, NULL, &data->have_obj); return 1; }