Essential insights from Hacker News discussions

LLMs pose an interesting problem for DSL designers

Here's a breakdown of the key themes from the Hacker News discussion:

Doubts About DSLs and the Rise of LLMs

A dominant theme is the re-evaluation of Domain-Specific Languages (DSLs) in light of the capabilities and limitations of Large Language Models (LLMs). Many participants express growing skepticism about DSLs.

  • "DSLs look like noise to everyone else, including Gemini and Claude." (ajross)
  • "Now that we're in the era of LLMs and Coding Agents it's never been more clear that DSLs should be avoided; because LLMs cannot reason about them as well as popular languages, and that's just a fact." (quantadev)
  • "Once you need to stop what you're doing and figure out your ninth or eleventh oddball syntax, you realize that (as per the article) Everything is Easier in Python." (ajross)
  • "I guess if you love writing DSLs this is an unfortunate development, but for me it's more of a glass half full: I can have the AI spit out boilerplate I need to solve a problem instead of spending a week building a one-off DSL compiler." (jbellis)

LLMs and Stifled Innovation

Some argue that LLMs are not only challenging the relevance of DSLs but also potentially stifling innovation in programming languages and frameworks.

  • "To be honest I don't think this is necessarily a bad thing, but it does mean that there is a stifling effect on fresh new DSL's and frameworks. It isn't an unsolvable problem... However, there will always be a strong force in LLM's pushing users towards the runtimes and frameworks that have the most training data in the LLM." (NathanKP)
  • "Yep, I hadn't considered how LLMs would affect frameworks in existing languages, but it makes sense that there's a very similar effect of reinforcing the incumbents and stifling innovation." (gopiandcode)

DSL Design and LLM Comprehension

Several comments discuss what makes a DSL more or less amenable to LLM understanding and generation. They highlight two factors: how much meaning each token carries on its own, and how concise the syntax is.

  • "Two extremes as examples: Regex is a DSL that is not written in tokens that have inherent semantic meaning. LLM's can only understand Regex by virtue of the fact that it has been around for a long time and there are millions of examples for the LLM to work from. And even then LLM's still struggle with reading and writing Regex. Tailwind is an example of a DSL is that is very semantically rich. When an LLM sees: class="text-3xl font-bold underline" it pretty much knows what that means out of the box, just like a human does." (NathanKP)
  • "Basically, a fresh new DSL can succeed much faster if it is closer to Tailwind than it is to Regex." (NathanKP)
  • "...more concise, equals less tokens, equals faster coding agents and faster responses from prompts. But too much conciseness (in the manner of Regex), leads to semantically confusing syntax, and then LLM's struggle." (NathanKP)
  • "Embedded DSLs have their own challenge, since the LLM can easily move out of the DSL into the host language in ways that aren’t valid for the eDSL." (seanmcdirmid)
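The Regex-versus-Tailwind contrast can be made concrete with a small sketch. The date pattern and the names TERSE, READABLE, and parse_date below are illustrative assumptions, not taken from the thread. The sketch writes the same pattern twice with Python's standard re module: once in the usual compact form, and once with named groups and inline comments, so that each token says what it matches.

```python
import re

# Terse form: compact, but the tokens carry little meaning on their own.
TERSE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

# Same pattern with named groups and comments. re.VERBOSE makes the regex
# engine ignore the whitespace and the "#" comments inside the pattern.
READABLE = re.compile(
    r"""
    ^
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    $
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return the named parts of a YYYY-MM-DD string, or None if it does not match."""
    match = READABLE.match(text)
    return match.groupdict() if match else None
```

Both patterns accept the same strings. The second costs more tokens, but a reader can tell what each group is for without prior exposure to the syntax, which is the trade-off between conciseness and semantic clarity that the comments describe.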

Defending DSLs

Despite the prevalent LLM-driven skepticism, some users defend DSLs, pointing out specific advantages and use cases.

  • "The benefit of (some) DSLs is that they make invalid states unrepresentable, which isn't possible with the entire surface-area of a programming language at your (or the LLM's) disposal." (mplanchard)
  • "Not just frameworks, but libraries also. Interacting with some of the most expressive libraries is often akin to working with a DSL... In fact, the paradigms of some libraries required such expressiveness that they spawned their own in-language DSLs, like JSX for React, or LINQ expressions in C#. These are arguably the most successful DSLs out there." (TimTheTinker)
  • "DSLs are not all created equal... Consider MiniZinc. This DSL is super cool and useful for writing constraint-solving problems once and running them through any number of different backend solvers." (TimTheTinker)
  • "Codegen DSLs are also amazing for some applications, especially for creating custom boilerplate -- write what's unique to the scenario at hand in the DSL and have the template-based codegen use the provided data to generate code in the target language." (TimTheTinker)
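The template-based codegen idea in the last comment can be sketched in a few lines. Everything here is a hypothetical illustration rather than a tool mentioned in the thread: the SPEC dictionary stands in for the "DSL" (only the scenario-specific data), and CLASS_TEMPLATE plus generate turn it into boilerplate in the target language, which is Python in this sketch.

```python
from string import Template

# Hypothetical mini "DSL": each entry names a record type and its typed fields.
SPEC = {
    "User": [("name", "str"), ("age", "int")],
    "Point": [("x", "float"), ("y", "float")],
}

# Boilerplate shared by every generated class lives in the template.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $params):\n"
    "$assignments\n"
)


def generate(spec):
    """Render Python class source code from the spec."""
    chunks = []
    for name, fields in spec.items():
        params = ", ".join(f"{field}: {typ}" for field, typ in fields)
        assignments = "\n".join(
            f"        self.{field} = {field}" for field, _ in fields
        )
        chunks.append(
            CLASS_TEMPLATE.substitute(
                name=name, params=params, assignments=assignments
            )
        )
    return "\n".join(chunks)


source = generate(SPEC)

# Load the generated code to show that it runs. exec is acceptable here only
# because the spec is trusted; a real tool would usually write a file instead.
namespace = {"__name__": "generated"}
exec(source, namespace)
user = namespace["User"](name="Ada", age=36)
```

Printing source shows ordinary Python classes. Adding a record type means adding one entry to SPEC rather than writing another constructor by hand.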

Critique of Popularity Indices (TIOBE)

A significant offshoot of the discussion centers on the validity of programming language popularity indices, specifically the TIOBE index.

  • "Python increase in Tiobe index is scary: https://www.tiobe.com/tiobe-index/" (averkepasa)
  • "I'm sorry, I simply refuse to take seriously an outlet that publishes the following... [regarding SQL and NoSQL]." (qsort)
  • "I was pretty convinced by this article to not use TIOBE as a mark of a language's popularity: https://nindalf.com/posts/stop-citing-tiobe/ Its primary point is that TIOBE is based on number of search results on a weighted list of search engines, not actual usage in Github, search volume, job listings, or any of the other number of signals you'd expect a popularity index to use." (arciini)
  • "It's even worse than "Stop Citing TIOBE" makes it sound... The TIOBE rank is based on the number of hits reported from "25 search engines", which amount to [a list dominated by shopping websites]." (duskwuff)
  • "This is actually a great site; it feels much more representative of what I actually see in job ads and the real world than some other rankings. If all I did was browse HN all day I'd think Rust is the only language people use for new projects" (guywithahat)

Broader Implications

Some contributors consider the broader effects LLMs may have on computing, including impacts on language evolution and software development practices.

  • "Linguistics and history of language folk: isn't there an observed slowdown of evolution of spoken language as the printing press becomes widespread? Also, "international english"? Is this an observation of a similar phenomenon?" (scelerat)

  • "The fewer languages there are in the world (as a general rule) the better off everyone is. We do need a low level language like C++ to exist and a high level one like TypeScript, but we don't need multiple of each. The fact that there are already multiple of each is a challenge to be dealt with, and not a goal what we reached on purpose." (quantadev)