TPDE-LLVM: Faster LLVM -O0 Back-End

The Hacker News discussion revolves around TPDE, a new compiler back-end that promises significantly faster compilation, particularly for debug builds, while keeping the runtime performance of the generated code competitive with LLVM's -O0 output. This has sparked a broader conversation about the state of compiler technology, LLVM's limitations, and the future of programming language toolchains.

The Promise of Faster Compilation and TPDE's Impact

A central theme is excitement about TPDE's potential to cut long compilation times, a known pain point for many developers. Users note that while LLVM is powerful, its code generation speed is a bottleneck, especially for iterative development.

"The problem with LLVM has always been that it takes a long time to produce code. The post in the link promises a new backend that produces a slower artifact, but does so 10-20x quicker. This is great for debug builds." - testdelacc1
"I hope this trickles down to the Swift compiler, sometime soon." - woadwarrior01

Commenters also point out that TPDE's speedup applies to the back-end code generation stage only, not to the entire compilation pipeline.

"The 10-20x improvement described here doesn’t mean the compilation as a whole gets quicker. There are 3 steps in compilation... Nevertheless, this is still an incredible win for compile times because the other two steps can be optimised independently." - testdelacc1

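The point above is an instance of Amdahl's law: speeding up one stage only helps in proportion to the share of total time that stage takes. A minimal sketch, assuming purely hypothetical back-end shares of build time (the thread gives no such measurements, and the function name is illustrative):

```python
def overall_speedup(backend_fraction: float, backend_speedup: float) -> float:
    """Whole-build speedup when only the back-end stage gets faster (Amdahl's law).

    backend_fraction: share of the original build time spent in the back-end (0..1).
    backend_speedup:  factor by which the back-end alone becomes faster.
    """
    if not 0.0 <= backend_fraction <= 1.0:
        raise ValueError("backend_fraction must be between 0 and 1")
    if backend_speedup <= 0:
        raise ValueError("backend_speedup must be positive")
    # Time left after the change, relative to an original build time of 1.0:
    # the other stages are unchanged, the back-end shrinks by the given factor.
    new_time = (1.0 - backend_fraction) + backend_fraction / backend_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical shares of build time spent in the back-end, not measured values.
    for fraction in (0.3, 0.5, 0.8):
        for speedup in (10, 20):
            total = overall_speedup(fraction, speedup)
            print(f"back-end share {fraction:.0%}, back-end {speedup}x faster "
                  f"-> whole build {total:.2f}x faster")
```

For example, if half of the build time were spent in the back-end, a 10x faster back-end would make the whole build only about 1.8x faster, which is why the commenter stresses that the other stages have to be optimised separately.
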
Some express optimism that TPDE could reduce reliance on other fast code generators such as Cranelift, for example when compiling WebAssembly.

"This is awesome, I didn't know this was possible and may replace the need of using Cranelift for having fast builds of Wasm bytecode into assembly in Wasmer." - syrusakbary

Concerns and Limitations of TPDE

Counterpoints and reservations are raised regarding TPDE's practicality and its comparison to LLVM. A key concern is the limited subset of LLVM IR that TPDE currently supports, which could restrict its applicability to real-world, complex projects that utilize LLVM's more advanced features.

"There's no free lunch. I don't know where you're getting this "Pareto improvements" thing from because it's a much more constrained codegen framework than LLVM's backend. It supports a much smaller subset, and real world code built by LLVM will use a lot of features like vectors even at -O0 for things like intrinsics." - pertymcpert
"I wonder what such a "typical" subset is. How exotic should something be to not work with it?" - SkiFire13

The same commenter also raises the maintenance cost of keeping a completely separate compiler path for -O0.

"There's a maintenance cost to having a completely separate path for -O0." - pertymcpert

However, one reply clarifies that TPDE's coverage of LLVM IR is growing and now includes most of the constructs frequently generated by rustc, the Rust compiler, although it has not yet been integrated into rustc or benchmarked there.

"We now support most LLVM-IR constructs that are frequently generated by rustc (most notably, vectors). I just didn't get around to actually integrate it into rustc and get performance data." - aengelke

The State of LLVM and Compiler Development

The discussion touches upon the broader ecosystem around LLVM, including its development process and inherent challenges. The perceived lack of product management driving LLVM's roadmap is cited as a reason why promising research or improvements might not receive adequate attention.

"Most contributions are driven by university students papers that somehow managed to get merged into LLVM, there is hardly a product manager driving its roadmap, thus naturally not everything gets the same attention span." - pjmlb

The TPDE paper (written "TDPE" in the comment below), which underpins TPDE's approach, is a point of interest, with one user expressing surprise that its implications haven't been more widely adopted or debated within the compiler community.

"It feels sometimes that compiler backend devs haven't quite registered the implications of the TDPE paper. As far as I can tell it gets pareto improvements far above LLVM, Cranelift, and any WebAssembly backend out there. You'd expect there to be a rush to either adopt their techniques or at least find arguments why they wouldn't work for generalist use cases, but instead it feels like maintainers of the above project have absolutely no curiosity about it." - PoignardAzur

The "Correctness vs. Performance" Debate and IR Semantics

A significant thread delves into the perennial tension between compiler correctness and performance, with LLVM's IR semantics being a specific point of contention. One user argues that LLVM's IR has poorly defined semantics, leading to bugs, and that "faster but wrong" is not a valuable outcome.

"IMO the worst problem with LLVM isn't that it's slow, the worst problem is that its IR has poorly defined semantics or its team doesn't actually deliver those semantics and a bug ticket saying "Hey, what gives?" goes in the pile of never-never tickets, making it less useful as a compiler backend even if it was instant. This is the old "correctness versus performance" problem and we already know that "faster but wrong" isn't meaningfully faster it's just wrong, anybody can give a wrong answer immediately and that's not at all useful." - tialaramex

The same commenter goes on to describe how difficult it is to create and maintain a compiler back-end with a coherent, well-defined IR, and how hard it is to win community buy-in when LLVM offers slightly faster generated code or broader target support.

"The really difficult thing would be to write a new compiler backend with a coherent IR that everybody understands and you'll stick to. Unfortunately you can be quite certain that after you've done the incredible hard work to build such a thing, a lot of people's assessment of your backend will be: 1. The code produced was 10% slower than LLVM, never use this, speed is all that matters anyway and correctness is irrelevant. 2. This doesn't support the Fongulab Splox ZV406 processor made for six years in the 1980s, whereas LLVM does, therefore this is a waste of time." - tialaramex

LLM Accusations and Discussion Moderation

A distinct, albeit temporary, theme emerged concerning accusations of comments being generated by Large Language Models (LLMs). This led to a meta-discussion about HN's moderation policies and user conduct.

"Maybe HN should add "Don't accuse comments of being LLM generated" to the guidelines, because this sure seems like it'll be in the same category as people moaning that they were downvoted or more closely people saying "Have you read the link?"" - testdelacc1
"I feel like a fuck you to the accuser is sufficient. It proves that you’re not an LLM and is a reasonable response to an unfounded accusation. LLMs decline when asked to say fuck you." - testdelacc1

Moderator responses indicated that such accusations are not explicitly against the rules but are discouraged, and that flagging unfit comments is the preferred course of action.

"We've talked about this but we're not adding it to the guidelines. It's already covered indirectly by the established guidelines, and "case law" (in the form of moderator replies) makes it explicit." - tomhow

Miscellaneous and Humorous Remarks

The discussion also included lighter moments and tangential observations:

Jokes about advanced compiler flags: "I thought everybody moved to Arch. I hear they patched the compiler to go to -O11." - wmf
A reference to Turbo Pascal: "Turbo pascal 3 is back !" - Agingcoder
Observations about the timing of the HN post and its relationship to a previous discussion on the same topic. - aw1621107, tomhow
Acknowledgment of a rare "Comex" comment: "That rare Comex comment." - twothreeone