Feature Request: Support character escapes in .gitignore

Summary: .gitignore rules should support using character escapes to
match for non-printable and non-ASCII characters in paths, e.g. '\r'
for carriage-return.

Background: After a recent incident in my project where a bunch of
.DS_Store files got accidentally committed to our repo, I decided to
update our .gitignore file to exclude a bunch of common metadata files
and other cruft.

One of the files my research turned up as something we would want to
exclude was macOS's Icon file (
https://superuser.com/questions/298785/icon-file-on-os-x-desktop ).
However, there was a complication: the full name of these files is
"Icon\r", i.e. the last character of the file name is an ASCII
carriage-return. I looked into how to write a .gitignore rule that would
match this, and was distressed to discover the only way to do so would
be to have a literal ASCII carriage-return in the .gitignore file.

Now, including a literal carriage-return in the file was undesirable
for a number of reasons:
- There are a number of tools in our dev environments (various
different editors, linters, even Git itself with the 'core.whitespace'
and 'core.autocrlf' options) which might incorrectly warn about the
carriage-return as a line-break of the incorrect type, or worse try to
'fix' it automatically
- Most editors would either display it as nothing at all or as a
line-break, either of which would be misleading and make it easy to
break while editing the file
- It is just a bad idea in general to have ASCII control characters
in a text file if you aren't using them for their control-character
purpose

The workaround I ended up using was to add a rule, 'Icon?', which
ignored 'Icon' followed by any one character, then another rule,
'!Icon[ -~]', which un-ignored 'Icon' followed by any printable ASCII
character. (This had the minor side-effect of causing git to also
ignore any file named 'Icon' followed by a control character other
than carriage return, and the more major side-effect of causing git to
ignore any file named 'Icon' followed by a non-ASCII character.)
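To illustrate the workaround and its side-effects, here is a rough
Python sketch. It uses the standard-library fnmatch module as a
stand-in for Git's matcher, which is only an approximation (Git uses
its own wildmatch code), and is_ignored is a hypothetical helper of
mine, not anything Git provides.

```python
from fnmatch import fnmatchcase

# The two rules from the workaround, in .gitignore order.
RULES = ["Icon?", "!Icon[ -~]"]

def is_ignored(name: str) -> bool:
    """Apply the rules in order; the last matching rule wins."""
    ignored = False
    for rule in RULES:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        # fnmatchcase approximates gitignore globbing for these simple patterns.
        if fnmatchcase(name, pattern):
            ignored = not negated
    return ignored

if __name__ == "__main__":
    # Carriage return, no suffix, printable suffix, BEL, non-ASCII suffix.
    for name in ["Icon\r", "Icon", "Icons", "Icon\x07", "Icon\u00e9"]:
        print(repr(name), is_ignored(name))
```

Under this approximation, "Icon\r" is ignored and "Icons" is not, but
"Icon" followed by BEL or by a non-ASCII character is ignored as well,
which is the over-matching described above.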

The proper solution would instead be for .gitignore to have some
mechanism to include non-printable characters in a rule without
requiring those characters to literally be in the file. .gitignore
already supports using backslash as an escape character to disable the
special effects of certain punctuation characters (e.g. *, [, leading
! or #), so the obvious choice is to enable its use for character
escapes as well. (I'd suggest borrowing the list of escapes used by
Python ( https://docs.python.org/3.11/reference/lexical_analysis.html#escape-sequences
).)
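As a sketch of what I have in mind (in Python, purely for
illustration; this is not how Git parses patterns today), a
pre-processing step could expand a small set of Python-style escapes
and leave every other backslash sequence alone, so existing escapes
such as \* or \# keep their current meaning. The function name
expand_escapes and the exact set of escapes are my assumptions.

```python
import re

# Hypothetical subset of Python-style single-character escapes.
_SIMPLE = {"a": "\a", "b": "\b", "f": "\f", "n": "\n",
           "r": "\r", "t": "\t", "v": "\v"}

# A backslash followed by either xHH or any single character.
_ESCAPE_RE = re.compile(r"\\(x[0-9A-Fa-f]{2}|.)", re.DOTALL)

def expand_escapes(pattern: str) -> str:
    """Expand \\r, \\n, \\xHH, etc.; leave other backslash pairs untouched."""
    def repl(match: re.Match) -> str:
        body = match.group(1)
        if len(body) == 3:            # the xHH form
            return chr(int(body[1:], 16))
        if body in _SIMPLE:
            return _SIMPLE[body]
        return match.group(0)         # e.g. \*, \[, \!, \#, \\ stay as written
    return _ESCAPE_RE.sub(repl, pattern)

if __name__ == "__main__":
    print(repr(expand_escapes(r"Icon\r")))   # rule written with an escape
    print(repr(expand_escapes(r"\*.txt")))   # existing escape is preserved
```

With something like this, the rule could be written as the six
visible characters Icon\r, and an escaped backslash followed by 'r'
would still mean a literal backslash and a literal 'r'.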

One potential issue: ensuring this does not break compatibility by
changing the function of existing .gitignore files. The .gitignore
documentation doesn't define the semantics of a backslash applied to a
non-special character, so I'm not sure what the current behaviour is.
One option to mitigate the issue would be to only enable these escapes
inside [character classes], which would reduce the probability of
triggering it accidentally at the cost of making it a bit more
cumbersome to use intentionally.
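A very simplified Python sketch of that more conservative option:
escapes are expanded only inside bracket expressions, so a rule like
Icon[\r] would work while a bare Icon\r keeps whatever meaning it has
today. The helper name expand_in_classes is hypothetical, only \n, \r
and \t are handled, and the bracket detection deliberately ignores
escaped brackets and escaped backslashes, which a real implementation
would have to deal with.

```python
import re

# Hypothetical minimal escape set for the sketch.
_SIMPLE = {"n": "\n", "r": "\r", "t": "\t"}

# Naive bracket expression: '[' ... ']' with no ']' inside.
_CLASS_RE = re.compile(r"\[[^\]]+\]")
_ESC_RE = re.compile(r"\\([nrt])")

def expand_in_classes(pattern: str) -> str:
    """Expand \\n, \\r, \\t only where they appear inside [...]."""
    def expand_class(match: re.Match) -> str:
        return _ESC_RE.sub(lambda e: _SIMPLE[e.group(1)], match.group(0))
    return _CLASS_RE.sub(expand_class, pattern)

if __name__ == "__main__":
    print(repr(expand_in_classes(r"Icon[\r]")))  # expanded inside the class
    print(repr(expand_in_classes(r"Icon\r")))    # left alone outside a class
```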

-- 
So many books, so little time... - Anon.

You haven't lived
'Till you've heard the floor ring
To the whoop and the call
Of 'Balance and swing!'



