On 6/17/2025 3:25 PM, Junio C Hamano wrote: > Jacob Keller <jacob.e.keller@xxxxxxxxx> writes: > >> From: Jacob Keller <jacob.keller@xxxxxxxxx> >> >> The branch_release() and remote_clear() functions fail to completely >> release all of the memory they point to. This results in leaked memory >> if you read the configuration for an arbitrary repository and then >> release that repository. >> >> This should be caught by the leak sanitizer. However, for callers which >> use ``the_repository``, the values never go out of scope and the >> sanitizer won't complain. >> >> A future change is going to add a caller of read_config() for a >> submodule repository structure. Doing so reveals one immediate issue due >> to a bad NULL pointer access, as well as the mentioned leaks. >> >> * The branch->merge array is accessed without checking if its NULL. >> Since this array is only setup by calling set_merge, it may in fact >> not be initialized even though merge_nr is non-zero. >> >> * The remote->push and remote->fetch refspecs are never cleared. >> >> * The branch->merge_name array is never cleared. >> >> * The individual elements of branch->merge are not released. >> >> Add a check against branch->merge before accessing it and calling >> refspec_item_clear. Update remote_clear() with calls to refspec_clear() >> for both the push and fetch refspecs. Add a release of the merge_name >> items as well as a final release of the merge_name array. >> >> Freeing merge_name elements results in a warning because we discard the >> const qualifier on the parameter name. These values come from a call to >> add_merge() in handle_config(), which always copies the names with >> xstrdup. This makes ownership of the memory difficult to track in the >> code. >> >> Move the call to xstrdup inside add_merge() so that its clear that the >> memory is duplicated here and must be released when the merge_name array >> is released. Drop the const qualifier on the branch structure to allow >> calling free without an explicit cast. >> >> These changes make it safe to call read_config() on a submodule >> repository without crashing or leaking any of the memory when the >> submodule repository is released. >> >> There is still some confusion with the difference between branch->merge >> and branch->merge_name, and the confusion of using branch->merge_nr for >> both. That could probably use some future cleanup. > > > Nicely described. One thing that puzzles me is this part: > >> @@ -253,9 +256,16 @@ static void branch_release(struct branch *branch) >> free((char *)branch->refname); >> free(branch->remote_name); >> free(branch->pushremote_name); >> - for (int i = 0; i < branch->merge_nr; i++) >> - refspec_item_clear(branch->merge[i]); >> + for (int i = 0; i < branch->merge_nr; i++) { >> + if (branch->merge) { >> + refspec_item_clear(branch->merge[i]); >> + free(branch->merge[i]); >> + } >> + if (branch->merge_name) >> + free(branch->merge_name[i]); >> + } >> free(branch->merge); >> + free(branch->merge_name); >> } > > where we iterate over branch->merge[] and branch->merge_name[] > for branch->nr times and each time we check the NULL-ness of these > two pointers. > > merge_nr is only incremented inside add_merge() when merge_name[] > array is grown by the same amount. Do we need to check for the > NULL-ness of branch->merge_name? > Probably not. > Near the beginning of set_merge() we allocate branch->merge[] > only when branch->merge_nr is non-zero (the assumption seems to be > that we accumulate in the merge_names[] while reading the config, > so as long as branch->merge is initialized properly to NULL, it will > never be NULL if merge_nr is not 0, no? > We initialize branch->merge with set_merge() which is called by branch_get() and which is the only way for callers external to remote.c of getting a branch structure. The issue is that merge_nr can be non-zero because if no caller has done a branch_get() on the given branch, we still have merge_nr is non-zero and merge is NULL. I suspect you're right that merge_name will never be NULL while merge_nr is non-zero. Really, I think we should find a better way to handle this than having two separate arrays which interact with merge_nr. I especially just noticed that set_merge will assign merge_nr to 0 if remote_name is bad, but it doesn't release the memory, so we'd leak on teardown in that case... I wonder if it is possible to just get rid of merge_names entirely and only have the merge array? > Thanks.