Essential insights from Hacker News discussions

Bcachefs Goes to "Externally Maintained"

This Hacker News discussion covers several key themes related to bcachefs, kernel development practices, and the comparison with other filesystems like ZFS and btrfs.

The bcachefs Development and Upstreaming Issues

A central theme revolves around the difficulties bcachefs has faced in being maintained within the Linux kernel. The core of the problem is the conflict between bcachefs maintainer Kent Overstreet and the kernel community, particularly Linus Torvalds.

  • DKMS vs. In-Tree Maintenance: The initial comment highlights the dilemma users now face: build bcachefs through DKMS (Dynamic Kernel Module Support) and risk breakage on kernel updates, or stay on older kernel versions. This sets the stage for the discussion about bcachefs's status. Volundr notes, "Damn. I was enjoying not having to deal with the fun of ZFS and DKMS, but it seems like now bcachefs will be in the same boat..."
  • Kent Overstreet's Interaction Style and "Drama": A significant portion of the conversation focuses on Kent Overstreet's perceived inability to adhere to kernel development workflows and his tendency to create friction. Several users describe a pattern of him disregarding established processes, challenging authority, and engaging in arguments.
    • ffsm8 observes, "Drama really does seem to follow Kent around for one reason or another. And it's never his fault if you take him by his public statements..."
    • sarlalian elaborates, "The common thread seems to be him not respecting workflows and processes that those places have, that inconvenience his goals. So, he ignores the workflows and processes of those places, and creates a constant state of friction..."
    • toast0 states, "IMHO, what his communications show is an unwillingness to acknowledge that other projects that include his work have focus, priorities, and policies that are not the same as that of his project."
    • sevg believes, "Sadly, Kent responds to everything in an email except the key part that is being pointed out to him (usually his behavior). Or deflects by going on the attack. And generally refuses to apologise."
    • arp242 paints a picture of perpetual conflict: "Kent just does not listen. Every time the discussion starts from the top. Even if you do agree on some compromise, in a month or two he'll just do the same thing again and all the same arguments start again."
  • The "Journal Rewind" Incident: A specific event, the journal_rewind patch, is cited as the catalyst for the recent escalation. Overstreet viewed it as a critical bugfix, while Linus Torvalds classified it as a "feature" that should have gone through the normal merge window.
    • koverstreet explains, "The patch that kicked off the current conflict was the 'journal_rewind' patch; we recently (6.15) had the worst bug in the entire history upstream..." He argues it was a critical bugfix and that Linus overreacted, stating, "Linus then flipped out because it was listed as a 'feature' in the pull request..."
    • motorest, however, counters this, linking to an article and suggesting Overstreet is "trying to gaslight everyone in the thread" and that Linus had criticized Overstreet for "lack of testing and collaboration before submitting patches."
    • nirava summarizes the kernel's perspective: "Regardless of whether correct or not, it's Linus that decides what's a feature and what's not in Linux. ... Repair code is a feature if Linus says it is a feature."
  • Decision to Ship as DKMS: The outcome of this conflict is that bcachefs is being moved out of the mainline kernel to be distributed as a DKMS module. This is seen as a pragmatic but unfortunate decision.
    • koverstreet states, "at this point, cutting ties with the kernel community and shipping as a DKMS module is really the only path forwards."
    • saubeidl fears, "I don't think getting the FS kicked out of the kernel is best by the users. Good engineering requires long term thinking."
  • The "Bus Factor" and Developer Burnout: Concerns are raised about bcachefs having a "bus factor of one" (meaning the project's future depends heavily on a single person), and the general burnout experienced by kernel developers due to the friction.
    • uecker asks, "Who would use a file system which essentially seems to be developed by a single person? A bus-factor of one seems unacceptable for a FS."
    • koverstreet notes, "The XFS folks have had their own issues with interference, leading to burnout... And I'm still seeing Linus lashing out at people on practically a weekly basis. I could never ask anyone else to have to deal with that."

Legal and Licensing Concerns with ZFS

The discussion indirectly touches upon why ZFS is not integrated into the Linux kernel, with licensing and legal concerns regarding Oracle being mentioned.

  • jchw offers a straightforward reason: "I think the Linux Kernel just doesn't want to be potentially in violation of Oracle's copyrights. That really doesn't seem that unreasonable to me..."
  • kstrauser agrees, "I can think of non-religious reasons to want to avoid legal fights with Oracle."
  • Volundr refers to the "ZFS + Linux situation" and calls it "mostly Linux religiosity gone wild," although others point out the legal basis.

btrfs Stability, Features, and Comparisons

A significant portion of the conversation, triggered by comparisons, delves into the stability and features of btrfs, often contrasted with ZFS.

  • RAID Parity Issues: The primary criticism leveled against btrfs is the instability of its RAID 5 and RAID 6 modes, which can lead to data loss.
    • betaby states, "btrfs was unusable in multi disk setup for kernels 6.1 and older. Didn't try since then. How's stable btrs today in such setups?" and later, "There is no 'modern' ZFS-like fs in Linux nowadays."
    • AaronFriel questions, "How can this be a stable filesystem if parity is unstable and risks data loss? How has this been allowed to happen?"
    • qalmakka is strongly critical: "Btrfs is constantly eating people data, it's a bad joke nowadays."
  • Counterpoints on btrfs Stability: Others defend btrfs, arguing that its RAID 5/6 issues are specific and that other modes are stable. They also point out that many issues are hardware-related or due to user error.
    • wtallis argues, "If you're not interested in a multi-disk storage system that doesn't have (stable, non-experimental) parity modes, that's a valid personal preference but not at all a justification for the position that the rest of the features cannot be stable..."
    • goneri defends btrfs: "Btrfs is NOT constantly eating people data. You have nothing to back this statement. It's widely used and the default filesystem of several distributions."
    • cmurf states, "Single, dup, raid0, raid1, raid10 have been usable and stable for a decade or more."
  • btrfs Features and Usability: Users discuss btrfs's advantages, such as its rapid mounting of many subvolumes, quotas, and COW snapshots, compared to ZFS.
    • williamstein highlights performance benefits: "It's also much faster than ZFS at mounting a disk with a large number of filesystems (=subvolumes)..." and details use cases at scale for Meta.
    • wmf mentions, "The btrfs commands and UX are really awkward (for me) compared to ZFS, but btrfs is extremely efficient at some things where ZFS just falls down."
    • doubletwoyou praises space saving: "I’ve noticed pretty good space savings on the order of like 100 GB from zstd compression and CoW on my personal disks with btrfs."
  • Data Integrity and Verification: The reliability of filesystems in detecting and handling data corruption (bit rot) is a persistent topic.
    • anon-3988 states, "have you tried processing terabytes of data every day and storing them? It gets better now with DDR5 but bit flips do actually happen."
    • ajross suggests that block device layers like dm-integrity can handle checksumming, rather than filesystems integrating it themselves. This is met with skepticism regarding performance and usability.
    • koverstreet defends filesystem-level integrity checks, stating, "From what I've seen, ext4 and bcachefs are the gold standard here; both can recover from basically arbitrary corruption and have no single points of failure." He critiques ZFS for potentially needing data recovery services after catastrophic failures, unlike bcachefs, which has fsck.
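
The checksumming idea debated above can be illustrated with a minimal sketch. It is not how btrfs, ZFS, bcachefs, or dm-integrity implement integrity checking; the block size, function names, and in-memory "disk" are all hypothetical and chosen only for illustration. The point is simply that a checksum stored alongside each block lets a later read notice a single flipped bit.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this sketch


def write_blocks(data: bytes) -> list[tuple[bytes, int]]:
    """Split data into blocks and keep a CRC32 checksum next to each block."""
    blocks = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        blocks.append((block, zlib.crc32(block)))
    return blocks


def find_corrupt_blocks(blocks: list[tuple[bytes, int]]) -> list[int]:
    """Return the indexes of blocks whose contents no longer match their checksum."""
    return [
        index
        for index, (block, checksum) in enumerate(blocks)
        if zlib.crc32(block) != checksum
    ]


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate bit rot by flipping one bit in a copy of the block."""
    damaged = bytearray(block)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


# 16 KiB of sample data becomes four 4 KiB blocks.
stored = write_blocks(bytes(range(256)) * 64)
clean_result = find_corrupt_blocks(stored)

# Corrupt block 2 without updating its stored checksum.
block, checksum = stored[2]
stored[2] = (flip_bit(block, 100), checksum)
damaged_result = find_corrupt_blocks(stored)

print("before corruption:", clean_result)
print("after corruption:", damaged_result)
```

Detection is only half of the picture: a checksum says which block is bad, while repair still needs a good copy from redundancy (a mirror, parity, or a backup). This sketch models neither that nor where a real filesystem keeps its checksums.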

Alternatives and Operating Systems

The conversation also touches on alternative operating systems and storage solutions.

  • ZFS on FreeBSD and TrueNAS: Many users express satisfaction with ZFS on FreeBSD or TrueNAS, highlighting its robust integration and data protection features.
    • NewJazz notes, "FreeBSD is giving me a sultry look as I ponder my NAS build."
    • _0xdd shares, "Go for it. I made the switch ~10 years ago and didn't regret it at all. First-class, rock solid ZFS integration. Saved my data on more than one occasion."
    • kstrauser contemplates switching to FreeBSD as well.
  • Device Mapper (dm) and LVM: The use of dm and LVM for features like snapshots and integrity is discussed as an alternative to integrated filesystem features.
    • tptacek mentions, "DM has targets that facilitate block-level snapshots, lazy cloning of filesystems, compression, &c. Most people interact with those features through LVM2."
  • Automated Tiered Storage: bcachefs's automated tiered storage (using SSDs and HDDs together seamlessly) is highlighted by one user as a significant advantage not found in other Linux filesystems.
    • ThatPlayer expresses a long-held desire: "For me bcachefs provides a feature no other filesystem on Linux has: automated tiered storage. I've wanted this ever since I got an SSD more than 10 years ago..."

The Nature of Technical Discourse and Rules

A philosophical debate emerges about adhering to rules versus prioritizing technical correctness and user outcomes.

  • Rules vs. Correctness: Some users emphasize that rules and processes, even if not perfectly "correct," are necessary for large-scale collaboration in projects like the Linux kernel; the opposing position holds that making things work must come first.
    • nirava states, "When rules and authority start to take precedence over making sure things work, things have gone off the rails and we're not doing engineering anymore." This is rebutted by others who see process as essential.
    • immibis uses a train analogy: "Linus is trying to run the release cycle on a strict schedule, like a train station. You are trying to delay the train so that you can load more luggage on, instead of just waiting for the next train."
  • "Speaking Truth to Power" vs. "Deferring to Authority": One user frames the conflict as a fundamental psychological divide between those who prioritize correctness and those who prioritize hierarchy and social harmony.
    • quotemstr posits, "it highlights a psychological division in tech and humanity in general between people who prioritize 1) deferring to authority, reading the room, knowing your place and people who prioritize 2) insisting on your concept of excellence, standing up against a crowd, and speaking truth to power." This perspective is met with a "citation needed" comment.
  • The Role of the Maintainer and Collaboration: The idea of a single-person project versus collaborative development is debated, with concerns about the "bus factor" and the ability to handle inevitable conflicts constructively.
    • Szpadel advises Overstreet, "consider expanding team to have few developers... learn to control your pride... working with (and coordinating) other developers could make you understand better upstream kernel community."