Re: [PATCH v6 4/4] last-modified: use Bloom filters when available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 01, 2025 at 06:23:08PM +0200, Toon Claes wrote:
> Patrick Steinhardt <ps@xxxxxx> writes:
> > On Wed, Jul 30, 2025 at 07:55:10PM +0200, Toon Claes wrote:
> >> diff --git a/builtin/last-modified.c b/builtin/last-modified.c
> >> index e4c73464c7..19bf25f8a5 100644
> >> --- a/builtin/last-modified.c
> >> +++ b/builtin/last-modified.c
> >> @@ -179,6 +192,27 @@ static void last_modified_diff(struct diff_queue_struct *q,
> >>  	}
> >>  }
> >>  
> >> +static int maybe_changed_path(struct last_modified *lm, struct commit *origin)
> >> +{
> >> +	struct bloom_filter *filter;
> >> +	struct last_modified_entry *ent;
> >> +	struct hashmap_iter iter;
> >> +
> >> +	if (!lm->rev.bloom_filter_settings)
> >> +		return 1;
> >> +
> >> +	filter = get_bloom_filter(lm->rev.repo, origin);
> >> +	if (!filter)
> >> +		return 1;
> >> +
> >> +	hashmap_for_each_entry(&lm->paths, &iter, ent, hashent) {
> >> +		if (bloom_filter_contains(filter, &ent->key,
> >> +					  lm->rev.bloom_filter_settings))
> >> +			return 1;
> >> +	}
> >> +	return 0;
> >> +}
> >
> > This function is basically the same as `maybe_changed_paths()` in
> > "blame.c", but that isn't a huge issue from my point of view. What makes
> > me wonder though is why we have an additional check over there for
> > whether or not the commit has a valid generation number.
> 
> I've been asking me the same question. And I couldn't find a good reason
> (neither from the commit history, or from my reasoning). This check was
> in the version shared by Taylor, but because we were ignoring the return
> value from generation_numbers_enabled() in that version, it didn't make
> sense to me to do this check. That's why I removed it.

Okay. It might make sense to point this out in the commit message.

> >> @@ -227,6 +264,9 @@ static int last_modified_init(struct last_modified *lm, struct repository *r,
> >>  		return argc;
> >>  	}
> >>  
> >> +	prepare_commit_graph(lm->rev.repo);
> >> +	lm->rev.bloom_filter_settings = get_bloom_filter_settings(lm->rev.repo);
> >> +
> >
> > So this here is why we export `prepare_commit_graph()`. How about we
> > instead expose `bloom_filters_enabled()` that mirrors what we do in
> > `generation_numbers_enabled()` and `corrected_commit_dates_enabled()`?
> > That would both be on a higher level and do exactly what we want to
> > achieve.
> 
> I've got another proposal, what if we let get_bloom_filter_settings()
> call prepare_commit_graph()? Functions like
> repo_find_commit_pos_in_graph() and lookup_commit_in_graph() do this
> too. 

Yeah, I don't see any issue with that, either.

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux