Re: [PATCH 1/2] userdiff: add support for R programming language

Johannes Sixt <j6t@xxxxxxxx> · Mon, 26 May 2025 07:57:33 +0200

Am 25.05.25 um 23:02 schrieb Rodrigo Carvalho:
> The patch appends userdiff.c file in order to support R programming
> language function header. This will be useful for those who use Git
> for versioning .R files.
> 
> Signed-off-by: Rodrigo Carvalho <rodrigorsdc@xxxxxxxxx>
> ---
>  userdiff.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/userdiff.c b/userdiff.c
> index da75625020..d1d31ea67e 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -317,6 +317,10 @@ PATTERNS("python",
>  	 "|[-+0-9.e]+[jJlL]?|0[xX]?[0-9a-fA-F]+[lL]?"
>  	 "|[-+*/<>%&^|=!]=|//=?|<<=?|>>=?|\\*\\*=?"),
>  	 /* -- */
> +PATTERNS("r",

Rant mode: I am not a fan of the name "R". It is so un-searchable. It
hijacks a single character for its own purpose.

I don't see a negative impact in this case, though.

> +	"^[ \t]*([a-zA-z][a-zA-Z0-9_.]*[ \t]*<-[ \t]*function.*)$",

I wonder how useful this is in practice. Unlike C or Java for example,
code can live outside of functions in R scripts. If you have a script
without any functions, there would not be any hunk headers. If you have
a script with a mix of functions and code outside of functions, the code
after a function would be attributed to the function. I'm not saying
that this is bad, but just asking if this is part of the plan.

> +	/* -- */
> +	"[a-zA-Z_][a-zA-Z0-9_.]*"),

This singles out identifiers. Every single other characters would be its
own word. I'd consider this a disimprovement. If you are not prepared to
provide worddiff patterns, I recommend to use "[^ \t]+", which roughly
amounts to the default behavior. It can be improved incrementally in
later patches.

>  PATTERNS("ruby",
>  	 "^[ \t]*((class|module|def)[ \t].*)$",
>  	 /* -- */

Please squash the test cases into this patch. Don't forget to test an
indented function, and while at it, test a function definition *nested*
in a function definition: that documents what the expected outcome is.

-- Hannes