On Thu, Aug 07, 2025 at 07:42:38PM +0700, Aquinas Admin wrote:
> Generally, this drama is more like a kindergarten. I honestly don't understand
> why there's such a reaction. It's a management issue, solely a management
> issue. The fact is that there are plenty of administrative possibilities to
> resolve this situation.

Yes, this is accurate. I've been getting entirely too many emails from Linus about how pissed off everyone is, completely absent of details - or anything engineering related, for that matter. Lots of "you need to work with us better" - i.e. bend to demands - without being willing to put forth an argument that stands up to scrutiny.

This isn't high school, and it's not a popularity contest. This is engineering, and it's about engineering standards.

Those engineering standards have been notably lacking in the Linux filesystem world. When btrfs shipped, it did so with clear design issues that have never been adequately resolved. These were brought up on the list in the very early days of btrfs, when it was still experimental, with detailed analysis - that was ignored. The issues in btrfs are the stuff of legend; I've been to conferences (past LSFs) where after dinner the stories kept coming out from people who had worked on it - for easily an _hour_ - and had people falling out of their chairs.

As a result, to this day, people don't trust it, and for good reason. Multidevice data corruptions, unfixed bugs with no real information, people who have tried to help out and fund getting this stuff fixed only to be turned away.

This stuff is still going on:
https://news.ycombinator.com/item?id=44508601

This is what you'd expect to happen when you rush to have all the features, skip the design, and don't build a community that's focused on working with users.
Let's compare what's going on in bcachefs:

Bug tracker:
https://github.com/koverstreet/bcachefs/issues?q=is%3Aissue%20state%3Aopen%20-label%3Aenhancement%20-label%3A%22waiting%20confirmation%20fixed%22

Syzbot, and the other major filesystems for comparison:
https://syzkaller.appspot.com/upstream/s/bcachefs
https://syzkaller.appspot.com/upstream/s/ext4
https://syzkaller.appspot.com/upstream/s/xfs
https://syzkaller.appspot.com/upstream/s/btrfs

(Does btrfs even have a central bug tracker?)

An important note: with bcachefs, most of the activity doesn't happen on the bug tracker; it's on IRC (and the IRC channel is by far the most active out of all the major filesystems). The bug tracker is for making sure bugs don't get lost if they can't get fixed right away - most bugs never make it there. So the bug tracker is a good measure of outstanding bugs, but not of fixed bugs or of usage.

How did we get here, what are we doing differently - and where are we now?

The recipe has been: patient, methodical engineering, with a focus on the users and building the user community, and working closely with the people who are using, testing and QAing. Get the design right, keep the codebase reasonably clean and well organized so that we can work efficiently; _heavy_ focus on assertions, automated testing (i.e. basic modern engineering best practices), introspection and debug tooling. Get enough feature work done to validate the design, and then - fix every last bug, and work with users to make sure that bugs are fixed and it's working well; work with people who are doing every kind of torture testing imaginable.

A refrain I've been hearing has been about "working with the community", but to the kernel community, I need to hammer the point home that the community is not just us; it's all the people running our code, too.
We have to actively work with those people if we want our code to actually work reliably in the real world, and this is something that's been frighteningly absent elsewhere in filesystem development these days.

30 years ago, Linux took over by being a real community effort. But now, most of the development is very corporate, and getting corporate developers to actually engage with the community and do anything that smells of unpaid support is worse than pulling teeth - it just doesn't happen.

Now bcachefs is the community-based up-and-comer...

But it's not really "up and coming" anymore. 6.16 is "unofficially unexperimental" - it's solid. It's attracting real interest and feedback from the ZFS community, and that hasn't happened before; those are the people who care about reliability and good engineering above all else.

All the hard engineering problems are solved, stabilizing is basically done. We've got petabyte scalability, the majority of online fsck in place, all the multi device stuff rock solid (a major area where btrfs falls over); the error handling, logging and debugging tools are top notch. Repair is comprehensive and robust, with real defense in depth, and an extensive suite of tools for analyzing issues and making sure we can debug anything that may occur in the wild.

The kernel community is being caught with their pants down here.

The decisionmaking process has, at every step in the way, been "things couldn't possibly be that insane" - and yet, I am continually proven wrong.

Post btrfs, I seriously expected there to be real design review for any future filesystems, and a retrospective on development process. Needless to say, that did not happen - it seems we're still in the "trust me bro, I got this" stage in the development of an engineering culture.

But a cowboy culture only takes you so far; at some point you really do need actual engineering standards. You need to be able to explain your designs, your methods, your processes and decisionmaking.
I've talked at length in the past about the need for a tight feedback loop on getting bugfixes out to users if we want to be able to work with those users (and to be honest, that should not even have been a discussion; I've been going over RC pull requests and there's been nothing remotely unusual about what I've been sending - except for volume, which is exactly what you want and expect for a filesystem that's been rapidly stabilizing).

But "shipping bugfixes" has been called "whining" - that's the mentality we're dealing with here.

I have to hammer on this one: there are certain bedrock principles of systems engineering we all know. "Make sure things work and stay working" is one of them. The rest of the kernel knows this as "do not break userspace", but in filesystem land that same underlying principle is written as "we do not lose user data".

Our job is to ship things that work, and make sure they work.

I also talk a lot about the need for automated testing; and that's another area where the kernel is woefully behind - and it's been one of the sources of conflict. I've asked people in other subsystems to please make sure they have tests when regressions have hit bcachefs; it's good for everyone, not just bcachefs. But this has been cited (!) as one of the causes of conflict that's been pissing Linus off.

Engineering principles. Basic stuff, here.

And regarding management processes: Linus has been saying repeatedly (and loudly, and in public) that it's his decision whether or not to remove bcachefs from the kernel - but the criteria and decisionmaking process have been notably absent.

It is not for me to say whether or not the kernel should still be a personal project, with decisions made in this way. And at the end of the day, we're all human beings, I'm not going to argue against the human factor, or against considering the people behind these projects.
But the uncertainty this has caused has created massive problems for building a sustainable developer community around this thing, it should be noted.

For my part, I just want to reassure people that I'm not going anywhere; bcachefs will continue to be developed and supported, in or out of the kernel.

Cheers,
Kent