Re: [PATCH 1/2] userdiff: add support for R programming language

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 27 May 2025 08:04:14 -0700

Johannes Sixt <j6t@xxxxxxxx> writes:

>> +	"^[ \t]*([a-zA-z][a-zA-Z0-9_.]*[ \t]*<-[ \t]*function.*)$",
>
> I wonder how useful this is in practice. Unlike C or Java for example,
> code can live outside of functions in R scripts. If you have a script
> without any functions, there would not be any hunk headers. If you have
> a script with a mix of functions and code outside of functions, the code
> after a function would be attributed to the function. I'm not saying
> that this is bad, but just asking if this is part of the plan.

Isn't it the same as shell, perl, python, e-lisp and perhaps others?

If we can reliably detect that we are outside of any function and
set it to an empty string that would be great ;-).
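
To make the concern concrete, here is a rough Python sketch (not git's
actual code) of how a hunk header would be picked with the quoted
pattern. It assumes the header is simply the nearest matching line
above the change; hunk_header() and the sample script are made up for
illustration, and Python's re stands in for the POSIX regexes git uses.

```python
import re

# Pattern copied verbatim from the quoted hunk (C escapes become regex escapes).
FUNCNAME = re.compile(r"^[ \t]*([a-zA-z][a-zA-Z0-9_.]*[ \t]*<-[ \t]*function.*)$")


def hunk_header(lines, changed_index):
    """Return the nearest funcname match above changed_index, or ''.

    Hypothetical helper: an approximation of the backward search, not git's code.
    """
    for i in range(changed_index - 1, -1, -1):
        m = FUNCNAME.match(lines[i])
        if m:
            return m.group(1)
    return ""


script = [
    "x <- 1",                    # 0: top-level code before any function
    "f <- function(a) {",        # 1
    "  g <- function(b) b + 1",  # 2: nested, indented definition
    "  g(a)",                    # 3
    "}",                         # 4
    "print(f(x))",               # 5: top-level code after f
]

for idx in (0, 3, 5):
    print(idx, repr(hunk_header(script, idx)))
```

Under these assumptions, a change at line 0 gets no header, while a
change at line 5, which is outside any function, is still attributed
to the nested "g", the last definition seen above it.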

>> +	/* -- */
>> +	"[a-zA-Z_][a-zA-Z0-9_.]*"),
>
> This singles out identifiers. Every other single character would be its
> own word. I'd consider this a disimprovement. If you are not prepared to
> provide worddiff patterns, I recommend using "[^ \t]+", which roughly
> amounts to the default behavior. It can be improved incrementally in
> later patches.

Good point.
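
A quick way to see the difference is the Python sketch below. It
assumes that any non-space character not covered by the word pattern
becomes a one-character word, which is what the "|[^[:space:]]" tail
that userdiff appends to each word regex amounts to; words() and the
sample line are made up for illustration.

```python
import re


def words(pattern, line):
    """Split line into words: pattern matches first, else single non-space chars.

    Hypothetical helper approximating how a word regex tokenizes a line.
    """
    return re.findall(f"(?:{pattern})|\\S", line)


line = "total <- x1 + 10"

# Word pattern from the patch: only identifiers are multi-character words.
ident_only = words(r"[a-zA-Z_][a-zA-Z0-9_.]*", line)

# Suggested fallback: every run of non-blank characters is one word.
non_blank = words(r"[^ \t]+", line)

print(ident_only)  # ['total', '<', '-', 'x1', '+', '1', '0']
print(non_blank)   # ['total', '<-', 'x1', '+', '10']
```

With the identifier-only pattern, the assignment operator "<-" and the
number "10" are each split into single characters, so a word diff
would treat a change from 10 to 11 as touching only the second digit.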

> Please squash the test cases into this patch. Don't forget to test an
> indented function, and while at it, test a function definition *nested*
> in a function definition: that documents what the expected outcome is.

Again, good point.

Thanks.