Re: [PATCH v2 1/3] commit: convert pop_most_recent_commit() to prio_queue

René Scharfe <l.s.r@xxxxxx> · Mon, 4 Aug 2025 21:56:27 +0200

On 8/3/25 6:48 PM, Junio C Hamano wrote:
> René Scharfe <l.s.r@xxxxxx> writes:
> 
>> We add and retrieve each commit in the (relevant part of) history.  That
>> takes O(N) and O(1) for the sorted list, and O(log N) and O(log N) for
>> the prio_queue, where N is the length of the list.
>>
>> So the best-case history is a string of single-parent commits, keeping
>> only a single item on the list/queue throughout.  That requires no
>> sorting or heaping, making the additions and retrievals O(1).  The
>> overall complexity is then O(N) for both variants, N being the number
>> of commits in the history.
>>
>> Worst-case history might be a single merge of all commits -- a
>> centipede or myriapod?  With all commits on the sorted list we get a
>> complexity of O(N²) for the traversal, and O(N log N) with a prio_queue.
> 
> In other words, for a typical two-parent merge, we peek the current
> one, "replace" it with its first parent and then do the usual "put
> and sift it down into place" for the second one.  
> 
> I am wondering if there is a more optimization opportunity if we
> allowed "put more than one, and then sift all of them down into
> place".  In other words, if I told the machinery:
> 
>     I am doing this put.  I promise I won't do get until I say "now
>     I'll start doing get's, so you are free to delay your internal
>     state maintenance and do so immediately before my next 'get'".
> 
> and did such put's a few times before I do a 'get', would there be a
> way to teach the machinery to take advantage of the promise?

Well, we could reestablish the heap at a cost of O(N), which only pays
off if it's less than the O(P log N) needed for regular puts of P
parents, with N being the number of queue elements.  This starts to
lose once queues become too long -- just when an optimization would be
most welcome.  So it seems impractical.

We could replace our binary heap with an algorithm that has O(1)
inserts, like a pairing heap, though.

René