Re: [PATCH v3 0/4] fetch/receive: use batched reference updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Karthik Nayak <karthik.188@xxxxxxxxx> writes:

> The git-fetch(1) and git-receive-pack(1) commands update references as
> part of the flow. Each reference update is treated as a single entity
> and a transaction is created for each update.
>
> This can be really slow, specifically in reference backends where there
> are optimizations which ensure a single transaction with 'N' reference
> update perform much faster than 'N' individual transactions. Also having
> 'N' individual transactions has buildup and teardown costs. These costs
> add up in repositories with a large number of references.
>
> Also specifically in the reftable backend, 'N' individual transactions
> would also trigger auto-compaction for every transaction.
>
> The reasoning for using individual transactions is because we want to
> allow partial updates of references in these commands. Using a single
> transaction would be an all-or-nothing scenario.
>
> Recently we introduced an in-between solution called batched reference
> updates in 23fc8e4f61 (refs: implement batch reference update support,
> 2025-04-08). This allows us to batch a set of reference updates, where
> individual updates can pass/fail without affecting the batch.
>
> This patch series, modifies both 'git-fetch(1)' and
> 'git-receive-pack(1)' to use this mechanism. With this, we see a
> significant performance boost:
>
> +---------------------+---------------+------------------+
> |                     | files backend | reftable backend |
> +---------------------+---------------+------------------+
> | git-fetch(1)        | 1.25x         | 22x              |
> | git-receive-pack(1) | 1.21x         | 18x              |
> +---------------------+---------------+------------------+

Very nice.

> The first and third patch handle the changes for 'git-fetch(1)' and
> 'git-receive-pack(1)' respectively. The second patch fixes a small
> memory leak I encountered while working on this series.
>
> This is based on top of master: 7a1d2bd0a5 (Merge branch 'master' of
> https://github.com/j6t/gitk, 2025-05-09). There were no conflicts
> observed with next or seen.
>
> Junio, since the release window for 2.50 is closing in. I would prefer
> we mark this for 2.51, so perhaps when/if we merge it to 'next', we let
> it bake there till the next release window opens. While all the tests
> pass, any bugs here would be high impact and it would be nice to catch
> it before it hits a release.

I've read the difference since the last iteration, "git diff @{1}",
and everything looked sensible.

Not an issue with this series at all, but one thing I wondered is if
it makes sense to change the type of strmap_get/strmap_put to deal
with "const void *".  That way, it would not be necessary to cast
away the constness like so:

>     -+	strmap_put(failed_refs, refname, ref_transaction_error_msg(err));
>     ++	strmap_put(failed_refs, refname, (char *)ref_transaction_error_msg(err));

without harming the other side, namely

>     @@ builtin/receive-pack.c: static void BUG_if_skipped_connectivity_check(struct com
>      +		if (reported_error)
>      +			cmd->error_string = reported_error;
>      +		else if (strmap_contains(&failed_refs, cmd->ref_name))
>     -+			cmd->error_string = xstrdup(strmap_get(&failed_refs, cmd->ref_name));
>     ++			cmd->error_string = strmap_get(&failed_refs, cmd->ref_name);

this piece of code.

It may not make sense, and even if it did, of course, it is totally
outside of this series.

Thanks.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux