Converting a large mathematical software package written in C++ to C++20 modules

Here's a summary of the themes from the Hacker News discussion:

C++ Modules vs. Precompiled Headers (PCH)

A significant portion of the discussion revolves around the comparison between C++ modules and precompiled headers (PCH) as mechanisms for improving compilation times. Users debate their effectiveness, implementation, and applicability.

  • Performance Potential: Some users believe modules offer greater speedups, especially when leveraging C++23 standard library modules. One user states, "As per Office team, modules are much faster, especially if you also make use of C++ standard library as module, available since C++23." However, others suggest PCH could achieve similar gains with less effort, with one user expressing, "I have a suspicion that using precompiled headers could provide the same build time gains with much less work."
  • Platform Differences: The effectiveness and adoption of PCH have historically been seen as platform-dependent. "Pre-compiled headers have only worked well on Windows, and OS/2 back in the day," notes one user, adding, "For whatever reason UNIX compilers never had a great implementation of it. With exception of clang header maps, which is anyway one of the first approaches to C++ modules."
  • Nature of Improvement: Modules are seen as primarily addressing time spent parsing, while their impact on code generation (codegen) can be neutral or even negative by enabling more inlining. "modules only really help address time spent parsing stuff, not time spent doing codegen. Actually they can negatively impact codegen performance because they can make more definitions available for inlining/global opts, even in non-lto builds," explains one commenter.
  • Challenges with Implementation: Users point out that effectively building and integrating modules, especially from existing large codebases, is complex. "Header units are supposed to partially address this but right now they are not supported in any build systems properly (except perhaps msbuild?)" one user observes. Another adds that "clang is pretty bad at pruning the global module fragment of unused data, which makes this worse again."
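To make the module discussion above concrete, here is a minimal sketch of C++20 module syntax; the `math_utils` module name and file layout are invented for illustration, and building it requires a modules-aware toolchain (recent MSVC, clang, or GCC) plus build-system support of the kind the commenters note is still uneven:

```cpp
// math_utils.cppm: a module interface unit (hypothetical file/module names)
export module math_utils;

// Only names marked 'export' are visible to importers.
export int square(int x) { return x * x; }

// Unexported names stay private to the module, unlike the internals of a
// textual header, which leak into every translation unit that includes it.
int detail_helper(int x) { return x + 1; }

// ---- main.cpp: a consumer translation unit ----
// import math_utils;   // replaces '#include'; no textual re-parsing
// int main() { return square(3) == 9 ? 0 : 1; }
```

Unlike a precompiled header, the importer sees only the exported surface, which is the parsing savings the discussion attributes to modules.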

Strategies for Reducing Compilation Time

Beyond modules and PCH, the discussion touches on various other techniques and architectural decisions that impact C++ compilation speeds.

  • Thoughtful C++ Practices: Basic good coding practices are highlighted as crucial for managing compile times. "Compile times ain't an issue if you pay a little attention. Precompiled header, thoughtful forward declarations, and not abusing templates get you a long way," a user advises, citing their experience with a 1 million line C++ codebase compiling in 30-40 seconds.
  • Engine and Library Architecture: Designing codebases with compilation speed in mind is seen as essential for large projects. One user points to a "modern cryengine" that "compiles very fast" due to architects prioritizing "interfaces that are on very thin headers."
  • Unity Builds: Unity builds, where multiple source files are combined into a single compilation unit, are mentioned as a potential optimization, though their effectiveness is debated. "Fwiw doing a unity build with thin-lto can yield lovely results. That way you still get parallel and incremental codegen," one user suggests. However, another notes that Google found them slower and problematic in Chromium's build system.
  • Lightweight and Decoupled Headers: Regardless of the compilation acceleration technique used, keeping headers small and minimizing dependencies is consistently valued. "And neither is a substitute for keeping headers lightweight and decoupled," states one contributor.

The Role of AI/LLMs in Code Refactoring and Management

A significant thread of the conversation explores the potential and limitations of using AI, particularly Large Language Models (LLMs), to assist with large-scale code refactoring, especially in the context of improving code structure for compilation or maintenance.

  • Potential for Automation: Several users see LLMs as a promising tool for automating tedious or complex code modification tasks that would otherwise be required for things like modularization. "This is a great task for LLMs, honestly," one user suggests. Another elaborates on how LLMs can handle refactoring like changing function arguments by typing a description, which "does the legwork" and prevents "sloppy fixes" and "technical debt."
  • Trust and Reliability Concerns: A primary concern raised is the reliability of LLMs in accurately transforming code, especially their tendency to introduce small errors while copying or lightly modifying it. "The thing which killed the whole thing is that [it] can't be trusted to cut+paste code: a clang warning informed me, when a 200 line function had been moved and slightly adjusted, a == was turned into a = deep inside an if statement," recounts one user's negative experience. This leads to a preference for having LLMs generate instructions for verifiable tools rather than edit code directly.
  • Review and Verification Processes: The need for robust review and verification when using LLMs for code changes is emphasized. Tools that streamline the review of AI-generated edits, such as collapsing unchanged sections or outlining modifications, are seen as valuable. One developer is building a GUI for this purpose that "auto-collapses functions/classes that were unchanged" and provides "an outline window that shows how many blocks were changed."
  • Broader Applicability: The utility of AI code assistance is not seen as exclusive to AI-driven changes. "Why does it need to be AI specific? This would be valuable for reviewing human code changes aswell right?" a user questions, highlighting the potential for these review tools to benefit human collaboration as well.
  • LLMs for Legacy Language Conversion: The potential of LLMs to convert older programming languages like Fortran or COBOL into modern ones is also mentioned as a potential application. "I really wonder whether LLMs are helpful in this case. This kind of task should be the forte of LLMs: well-defined syntax and requirements, abundant training material available, and outputs that are verifiable and validatable," one user posits.

The Practicalities and Feasibility of Modularizing Large Codebases

The discussion acknowledges the inherent difficulty and resources required to retrofit modern C++ features, such as modules, onto decades-old, large codebases like deal.II.

  • Manual Effort vs. Automation: The paper under discussion highlights the immense manual effort required to modularize a large project, with tasks like reorganizing code and annotating exports taking significant time. One user notes, however, that this "boring copy-paste work... often has unexpected benefits; you get to know the code better and may discover better ways to organize the code."
  • Project Scale and Rewrite Feasibility: The sheer size of projects like deal.II makes wholesale rewrites or substantial modifications infeasible. The conclusion from the paper's data is described as "underwhelming in this iteration and perhaps speaks to 'module-fication' of existing source code... rather than doing it from scratch."
  • Iterative Approach and Gradual Improvement: Even if complete modularization is a monumental task, incremental improvements through techniques like PCH or forward declarations are still valuable. Users note that "good speedup with just pch, forward decls etc. (more than 10%)" can be achieved, suggesting that a complete overhaul isn't always necessary for tangible benefits.

Architectural Trade-offs: Compile Time vs. Runtime Performance

A subtle but important theme is the potential conflict between optimizing for compile times and maintaining optimal runtime performance.

  • Interface-Heavy Designs: Architectures that use many interfaces to keep headers light can lead to runtime costs. "their trick is that they have architected everything to go through interfaces that are on very thin headers... But its a shame we need to do tricks like this for compile speed as they harm runtime performance," a user explains.
  • Virtual Calls and Inlining: The specific example given is using virtual calls for simple operations like getting a texture size, which can prevent inlining and introduce cache misses. "Because you now need to go through virtual calls on functions that dont really need to be virtual, which means the possible cache miss from loading the virtual function from vtable, and then the impossibility of them being inlined," is how this is described.
  • Compiler Optimizations: The possibility of compiler optimizations (like devirtualization) mitigating these runtime costs is discussed, with specific compiler flags mentioned. "In my experience, as long as there's only a single implementation, devirtualization works well, and can even inline the functions. But you need to pass something along the lines of '-fwhole-program-vtables -fstrict-vtable-pointer' + LTO," mentions one user.

Code Formatting and Presentation

A minor theme is the presentation of code within the discussion itself.

  • Code Block Styling: One user comments on the presentation of code, stating, "The code block styling is less than ideal," a reminder that how snippets are rendered affects the readability of the discussion.