Re: [PATCH 1/3] fetch: use batched reference updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Patrick Steinhardt <ps@xxxxxx> writes:

[snip]

> Yup, this makes sense, as well. We lose a bunch of overhead by creating
> separate transactions for each of the updates, but the slow path is
> still that we have to create 10000 files for each of the references. So
> it's expected to see a small performance improvement, but nothing game
> changing.
>

Exactly, but it is still a good bump in speed. Which is always welcome

>> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> index 5279997c96..1558f6d1e8 100644
>> --- a/builtin/fetch.c
>> +++ b/builtin/fetch.c
>> @@ -1688,6 +1644,32 @@ static int set_head(const struct ref *remote_refs, struct remote *remote)
>>  	return result;
>>  }
>>
>> +struct ref_rejection_data {
>> +	int *retcode;
>> +	int conflict_msg_shown;
>> +	const char *remote_name;
>> +};
>> +
>> +static void ref_transaction_rejection_handler(const char *refname UNUSED,
>> +					      const struct object_id *old_oid UNUSED,
>> +					      const struct object_id *new_oid UNUSED,
>> +					      const char *old_target UNUSED,
>> +					      const char *new_target UNUSED,
>> +					      enum ref_transaction_error err,
>> +					      void *cb_data)
>> +{
>> +	struct ref_rejection_data *data = (struct ref_rejection_data *)cb_data;
>> +
>> +	if (err == REF_TRANSACTION_ERROR_NAME_CONFLICT && !data->conflict_msg_shown) {
>> +		error(_("some local refs could not be updated; try running\n"
>> +			" 'git remote prune %s' to remove any old, conflicting "
>> +			"branches"), data->remote_name);
>> +		data->conflict_msg_shown = 1;
>> +	}
>> +
>> +	*data->retcode = 1;
>> +}
>> +
>>  static int do_fetch(struct transport *transport,
>>  		    struct refspec *rs,
>>  		    const struct fetch_config *config)
>
> Okay, so we now handle errors over here. Is the handled error the only
> error that we may see, or do we also accept other errors like D/F now?
> If the latter, wouldn't it mean that we don't print any error messages
> for those other failures at all? That might be quite confusing.
>

I was mostly trying to replicate the current behavior, which is
- For F/D conflicts print this error message.
- For any other error, simply propagate the error code.

I do think there is merit in changing this, and since you're pointing
it out too, I think we should make this change. I've modified this to
now print a better message for other errors.

>> @@ -1808,6 +1790,20 @@ static int do_fetch(struct transport *transport,
>>  			retcode = 1;
>>  	}
>>
>> +	/*
>> +	 * If not atomic, we can still use batched updates, which would be much
>> +	 * more performent. We don't initiate the transaction before pruning,
>
> s/performent/performant/
>

Ah! Thanks.

>> +	 * since pruning must be an independent step, to avoid F/D conflicts.
>> +	 */
>> +	if (!transaction) {
>> +		transaction = ref_store_transaction_begin(get_main_ref_store(the_repository),
>> +							  REF_TRANSACTION_ALLOW_FAILURE, &err);
>> +		if (!transaction) {
>> +			retcode = -1;
>> +			goto cleanup;
>> +		}
>> +	}
>> +
>>  	if (fetch_and_consume_refs(&display_state, transport, transaction, ref_map,
>>  				   &fetch_head, config)) {
>>  		retcode = 1;
>
> Don't transactions handle D/F conflicts for us? Isn't that the sole
> reason why for example `refs_verify_refname_available()` accepts an
> "extras" parameter that is supposed to contain refs that are about to be
> deleted?
>

My understanding was a little different, from the documentation for the
function:

  If extras is non-NULL, it is a list of additional refnames with which
  refname is not allowed to conflict.

This is to capture additional conflicts. We want a way to avoid said
conflicts. That said, there is a 'skip' parameter which does exactly
what you're saying. But the transaction logic doesn't incorporate this
entirely. Specifically in the files backend, where we create a lock in
the filesystem, this would cause a conflict, consider the following:

  ❯ eza --tree .git/refs/remotes/
  .git/refs/remotes
  └── origin
      ├── dir
      │   └── file.lock
      ├── dir.lock
      └── HEAD

This is from the test 'branchname D/F conflict resolved by --prune', the
test prunes the existing reference 'refs/remotes/origin/dir/file' while
adding 'refs/remotes/origin/dir'. In 'lock_raw_ref()' we lock both and
read the reference, but this causes an issue since
'refs/remotes/origin/dir' exists as a directory already.

I would say this is logically solvable if we start treating conflict
resolution within updates as a first class problem. But perhaps that's
something of a patch series in itself and better solved outside of this?

> Patrick

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux