On Tue, May 06, 2025 at 10:02:49PM -0500, Justin Tobler wrote:
> During git-receive-pack(1), connectivity of the object graph is
> validated to ensure that the received packfile does not leave the
> repository in a broken state.
>
> Generally, this check is critical to avoid an incomplete receieved

s/receieved/received/

> packfile from corrupting a repository. In situations where server
> operators validate the connectivity of incoming objects outside of Git,
> such a check may be redundant.

This is a bit handwavy. _I_ know why we at GitLab are doing this, but
other readers won't have the necessary context to be able to judge
whether this really is a good idea.

I think the important question to answer is: why does the server side
want to perform the check if Git already does it anyway? Why is it in a
better position to do so? And why can't we instead have Git itself
perform it in the same "better" way?

Ultimately it boils down to having more knowledge around exactly how Git
is being used on the server side. With the additional information we can
make better decisions, and we can make assumptions that a general user of
the connectivity check cannot make. Most importantly, we know that all
objects in the repository will always be fully connected. Received
packfiles get filtered, so we won't ever accept an object that isn't
fully connected. Neither will Gitaly write such an object.

So what we can do in Gitaly specifically is similar to what I proposed
in [1] a while ago: we simply walk all received objects and then verify
that the edges resolve to objects in the existing repository. The idea
was rightfully shot down because we cannot assume in general that a mere
object walk in the quarantine directory really means that all objects
are fully connected. But with the extra knowledge we have in Gitaly we
can indeed do this optimization.

Patrick

[1]: <cover.1621451532.git.ps@xxxxxx>
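
To make the Gitaly-specific check described above a bit more concrete,
here is a minimal sketch in Python. It is not Git's or Gitaly's actual
code: the object graph is modelled as a plain dict, and
`find_dangling_edges`, `received` and `existing` are hypothetical names
made up for illustration. It relies on the assumption stated above that
everything already in the repository is fully connected, so each edge
only needs to be resolved one level deep.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_dangling_edges(
    received: Dict[str, Iterable[str]],
    existing: Set[str],
) -> List[Tuple[str, str]]:
    """Return (source, target) pairs for edges that resolve nowhere.

    `received` maps each newly received object ID to the object IDs it
    references. `existing` holds the IDs already in the repository, which
    are ASSUMED to be fully connected, so they are never walked.
    """
    dangling = []
    for oid, edges in received.items():
        for target in edges:
            # An edge is fine if it points at another received object
            # (which this loop also visits) or at an existing object.
            if target not in received and target not in existing:
                dangling.append((oid, target))
    return dangling


# Objects already in the repository (hypothetical IDs).
existing = {"commit-a", "tree-a", "blob-a"}

# A push whose objects all resolve.
complete_push = {
    "commit-b": ["tree-b", "commit-a"],
    "tree-b": ["blob-a", "blob-b"],
    "blob-b": [],
}

# The same push with "blob-b" missing from the packfile.
incomplete_push = {
    "commit-b": ["tree-b", "commit-a"],
    "tree-b": ["blob-a", "blob-b"],
}

print(find_dangling_edges(complete_push, existing))
print(find_dangling_edges(incomplete_push, existing))
```

Without the invariant, an edge that resolves to an existing object says
nothing about whether that object's own references are present, which is
why this cannot replace the connectivity check in general.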