Re: [GIT PULL] bcachefs fixes for 6.15-rc4

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Fri, 25 Apr 2025 20:40:35 +0100

On Fri, Apr 25, 2025 at 09:35:27AM -0700, Linus Torvalds wrote:
> Now, if filesystem people were to see the light, and have a proper and
> well-designed case insensitivity, that might change. But I've never
> seen even a *whiff* of that. I have only seen bad code that
> understands neither how UTF-8 works, nor how unicode works (or rather:
> how unicode does *not* work - code that uses the unicode comparison
> functions without a deeper understanding of what the implications
> are).
> 
> Your comments blaming unicode is only another sign of that.
> 
> Because no, the problem with bad case folding isn't in unicode.
> 
> It's in filesystem people who didn't understand - and still don't,
> after decades - that you MUST NOT just blindly follow some external
> case folding table that you don't understand and that can change over
> time.

I think this is something that NTFS actually got right.  Each filesystem
carries with it a 128KiB table that maps each codepoint to its
case-insensitive equivalent.  So there's no ambiguity about "which
version of the Unicode standard are we using", "Does the user care
about Turkish language rules?", "Is Aachen a German or Danish word?".
The sysadmin specified all that when they created the filesystem, and it
doesn't matter what the Unicode standard changes in the future; if you
need to change how the filesystem sorts things, you can update the table.
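
To make the idea concrete, here is a rough sketch of a per-filesystem
folding table.  It is not NTFS's on-disk format or code; it assumes the
table holds one 16-bit entry per 16-bit code unit (65,536 x 2 bytes =
128KiB), and it uses Python's str.upper() as a stand-in for whatever rules
the sysadmin picks at mkfs time.  All function names are hypothetical.

```python
import struct

TABLE_ENTRIES = 1 << 16  # assumption: one entry per 16-bit code unit


def build_table(overrides=None):
    """Build the table once, at 'mkfs' time, and return its byte image.

    str.upper() is only a stand-in for the rules chosen at creation time.
    `overrides` maps single characters to their folded form, e.g. for
    Turkish dotted/dotless i.
    """
    table = list(range(TABLE_ENTRIES))  # default: every unit folds to itself
    for cp in range(TABLE_ENTRIES):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate halves stay as they are
        up = chr(cp).upper()
        # Keep only one-to-one mappings that fit in 16 bits.
        if len(up) == 1 and ord(up) < TABLE_ENTRIES:
            table[cp] = ord(up)
    for src, dst in (overrides or {}).items():
        table[ord(src)] = ord(dst)
    return struct.pack("<65536H", *table)


def load_table(image):
    """Decode the stored byte image into a tuple indexed by code unit."""
    return struct.unpack("<65536H", image)


def fold(name, table):
    """Fold a name one UTF-16 code unit at a time using the stored table."""
    raw = name.encode("utf-16-le")
    units = struct.unpack("<%dH" % (len(raw) // 2), raw)
    return tuple(table[u] for u in units)


def names_equal(a, b, table):
    """Case-insensitive comparison as this particular filesystem defines it."""
    return fold(a, table) == fold(b, table)


# Two filesystems created with different rules.
default_image = build_table()
turkish_image = build_table({"i": "\u0130", "\u0131": "I"})

default_fs = load_table(default_image)
turkish_fs = load_table(turkish_image)
```

Lookups only ever consult the table stored with the filesystem, so
names_equal("file", "FILE", default_fs) and the same call against
turkish_fs can legitimately disagree, and neither answer moves when the
Unicode standard or the host's libraries change.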

It's not the perfect solution, but it might be the least-bad one I've
seen.