Am 25.05.25 um 23:02 schrieb Rodrigo Carvalho: > The patch appends userdiff.c file in order to support R programming > language function header. This will be useful for those who use Git > for versioning .R files. > > Signed-off-by: Rodrigo Carvalho <rodrigorsdc@xxxxxxxxx> > --- > userdiff.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/userdiff.c b/userdiff.c > index da75625020..d1d31ea67e 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -317,6 +317,10 @@ PATTERNS("python", > "|[-+0-9.e]+[jJlL]?|0[xX]?[0-9a-fA-F]+[lL]?" > "|[-+*/<>%&^|=!]=|//=?|<<=?|>>=?|\\*\\*=?"), > /* -- */ > +PATTERNS("r", Rant mode: I am not a fan of the name "R". It is so un-searchable. It hijacks a single character for its own purpose. I don't see a negative impact in this case, though. > + "^[ \t]*([a-zA-z][a-zA-Z0-9_.]*[ \t]*<-[ \t]*function.*)$", I wonder how useful this is in practice. Unlike C or Java for example, code can live outside of functions in R scripts. If you have a script without any functions, there would not be any hunk headers. If you have a script with a mix of functions and code outside of functions, the code after a function would be attributed to the function. I'm not saying that this is bad, but just asking if this is part of the plan. > + /* -- */ > + "[a-zA-Z_][a-zA-Z0-9_.]*"), This singles out identifiers. Every single other characters would be its own word. I'd consider this a disimprovement. If you are not prepared to provide worddiff patterns, I recommend to use "[^ \t]+", which roughly amounts to the default behavior. It can be improved incrementally in later patches. > PATTERNS("ruby", > "^[ \t]*((class|module|def)[ \t].*)$", > /* -- */ Please squash the test cases into this patch. Don't forget to test an indented function, and while at it, test a function definition *nested* in a function definition: that documents what the expected outcome is. -- Hannes