Re: [PATCH v2 1/6] remote: fix tear down of struct branch and struct remote

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 17 Jun 2025 15:25:14 -0700

Jacob Keller <jacob.e.keller@xxxxxxxxx> writes:

> From: Jacob Keller <jacob.keller@xxxxxxxxx>
>
> The branch_release() and remote_clear() functions fail to completely
> release all of the memory they point to. This results in leaked memory
> if you read the configuration for an arbitrary repository and then
> release that repository.
>
> This should be caught by the leak sanitizer. However, for callers which
> use ``the_repository``, the values never go out of scope and the
> sanitizer won't complain.
>
> A future change is going to add a caller of read_config() for a
> submodule repository structure. Doing so reveals one immediate issue due
> to a bad NULL pointer access, as well as the mentioned leaks.
>
>  * The branch->merge array is accessed without checking if its NULL.
>    Since this array is only setup by calling set_merge, it may in fact
>    not be initialized even though merge_nr is non-zero.
>
>  * The remote->push and remote->fetch refspecs are never cleared.
>
>  * The branch->merge_name array is never cleared.
>
>  * The individual elements of branch->merge are not released.
>
> Add a check against branch->merge before accessing it and calling
> refspec_item_clear. Update remote_clear() with calls to refspec_clear()
> for both the push and fetch refspecs. Add a release of the merge_name
> items as well as a final release of the merge_name array.
>
> Freeing merge_name elements results in a warning because we discard the
> const qualifier on the parameter name. These values come from a call to
> add_merge() in handle_config(), which always copies the names with
> xstrdup. This makes ownership of the memory difficult to track in the
> code.
>
> Move the call to xstrdup inside add_merge() so that its clear that the
> memory is duplicated here and must be released when the merge_name array
> is released. Drop the const qualifier on the branch structure to allow
> calling free without an explicit cast.
>
> These changes make it safe to call read_config() on a submodule
> repository without crashing or leaking any of the memory when the
> submodule repository is released.
>
> There is still some confusion with the difference between branch->merge
> and branch->merge_name, and the confusion of using branch->merge_nr for
> both. That could probably use some future cleanup.

Nicely described.  One thing that puzzles me is this part:

> @@ -253,9 +256,16 @@ static void branch_release(struct branch *branch)
>  	free((char *)branch->refname);
>  	free(branch->remote_name);
>  	free(branch->pushremote_name);
> -	for (int i = 0; i < branch->merge_nr; i++)
> -		refspec_item_clear(branch->merge[i]);
> +	for (int i = 0; i < branch->merge_nr; i++) {
> +		if (branch->merge) {
> +			refspec_item_clear(branch->merge[i]);
> +			free(branch->merge[i]);
> +		}
> +		if (branch->merge_name)
> +			free(branch->merge_name[i]);
> +	}
>  	free(branch->merge);
> +	free(branch->merge_name);
>  }

where we iterate over branch->merge[] and branch->merge_name[]
for branch->nr times and each time we check the NULL-ness of these
two pointers.

merge_nr is only incremented inside add_merge() when merge_name[]
array is grown by the same amount.  Do we need to check for the
NULL-ness of branch->merge_name?

Near the beginning of set_merge() we allocate branch->merge[] 
only when branch->merge_nr is non-zero (the assumption seems to be
that we accumulate in the merge_names[] while reading the config,
so as long as branch->merge is initialized properly to NULL, it will
never be NULL if merge_nr is not 0, no?

Thanks.