Here's a summary of the themes from the Hacker News discussion:
The "Parse, Don't Validate" Philosophy
A central theme is the discussion and interpretation of the "Parse, Don't Validate" principle, most notably linked to an article by Alexis King. The core idea, as understood by many, is to transform input data into a specific, well-defined type that inherently encodes the valid states and invariants of the application. This contrasts with simply validating existing data and then continuing to use the potentially still "loose" type that might contain invalid states. The advantage, proponents argue, is that the type system then guarantees the correctness of the data, offloading the burden of constant checking from the developer to the compiler.
- "The point is you don't check that your string only contains valid characters and then continue passing that string through your system. You parse your string into a narrower type, and none of the rest of your system needs to be programmed defensively." - yakshaving_jgt
- "Parsing includes validation." - yakshaving_jgt
- "The key point is you never need to look at a loose type and think 'I don't need to check this is valid, because it was checked before'; the type system tracks that for you." - dwattttt
- "When you create a program, eventually you'll need to process & check whether input data is valid or not. In C-like language, you have 2 options... 'Parse, don't validate' is just trying to say don't do `void validate(struct Data d)` (procedure with `void`), but do `ValidatedData validate(struct Data d)` (function returning `ValidatedData`) instead." - lock1
- "The goal should be to move from interface type ... to internal domain type ... as quickly as possible. That way, more of the application can be written to use those rich data types, avoiding errors or unnecessary defensive programming." - MrJohz
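The contrast lock1 describes can be sketched in TypeScript, one of the languages the thread mentions. This is a minimal illustration, not code from the discussion: the `name` and `age` fields, the age bounds, and the `greet` function are all assumptions made up for the example.

```typescript
// Loose "interface" shape as it might arrive from a form or an API.
// The fields are hypothetical and chosen only for illustration.
interface Data {
  name: string;
  age: number;
}

// Narrower domain type. The constructor is private, so the only way to
// obtain an instance is through ValidatedData.parse.
class ValidatedData {
  readonly name: string;
  readonly age: number;

  private constructor(name: string, age: number) {
    this.name = name;
    this.age = age;
  }

  // Parse-style: returns the narrow type, or null when the input is rejected.
  static parse(d: Data): ValidatedData | null {
    const name = d.name.trim();
    if (name.length === 0) return null;
    if (!Number.isInteger(d.age) || d.age < 0 || d.age > 150) return null;
    return new ValidatedData(name, d.age);
  }
}

// Validate-style: checks and throws, but the caller is still left holding
// the same loose Data afterwards.
function validate(d: Data): void {
  if (ValidatedData.parse(d) === null) throw new Error("invalid data");
}

// Downstream code asks for the narrow type, so it has nothing to re-check.
function greet(user: ValidatedData): string {
  return `Hello, ${user.name} (${user.age})`;
}

const ok = ValidatedData.parse({ name: " Ada ", age: 36 });
const bad = ValidatedData.parse({ name: "", age: 36 });
const greeting = ok !== null ? greet(ok) : "rejected";
```

Here `validate` and `ValidatedData.parse` run the same checks. The difference is only in what the caller holds afterwards: a `Data` that may or may not have been checked, or a `ValidatedData` that cannot exist unless it was.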
The Ubiquity and Practicality of Parsing
The initial premise of the discussion is that parsing is a much more common and relevant task for programmers than often assumed. While some agree that parsing is indeed everywhere (handling user input, API responses, configuration, etc.), others feel that "parsing" in the context of the "Parse, Don't Validate" principle is often applied to tasks solvable with existing, higher-level libraries, and that truly writing parsers from scratch is less frequent in day-to-day work.
- "Programmers should be writing parsers all the time!" - yakshaving_jgt
- "Last week my primary task was writing a github action that needed to log in to Heroku and push the current code on main and development branches to the production and staging environments respectively. The week before that, I wrote some code to make sure the type the object was included in the filters passed to an API call. ... It's just not required all that often in my day-to-day work." - WJW
- "The three most common things I think about when coding are DAGs, State Machines and parsing." - dkubb
- "I think most security issues are just due to people not parsing input at all/properly." - eska
Tooling and Abstraction Levels
A significant portion of the discussion revolves around the appropriate level of abstraction and the tools used for parsing. Many point to libraries like Zod, Pydantic, argparse, and type systems like TypeScript as examples of how parsing and validation can be handled effectively without needing to write low-level parsing logic. The debate touches on whether these libraries truly embody the "parse" paradigm or if they are simply sophisticated validation tools.
- "Every mainstream language has libraries for parsing into general types, but none of them will have libraries for parsing values specific to your application." - yakshaving_jgt
- "When you get JSON from an API, you don't just parse it as any and then write a bunch of if-statements. You use something like Zod to parse it directly into the shape you want. Invalid data? The parser rejects it. Done." - jmull
- "What OP calls an 'combinatorial parser' I'd call object schema validation and that's more similar to pydantic[0] than argparse in python land." - whilenot-dev
- "And that is why there are plenty of parser generators so you dont have to write the parser yourself every time." - ThinkBeat
- "A valid type for server and port should be a single value. Stop parse it separately please. ':3000' -> use port 3000 with a default host. 'some-host' -> use host with a default port. 'some-host:3000' -> you guess it." - bvrmn
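bvrmn's host-and-port suggestion is a small case of the application-specific parsing that, per yakshaving_jgt, general libraries will not supply. A hand-written sketch might look like the following. The `Endpoint` type, the default values, and the decision to reject IPv6 literals are assumptions for this example, not something specified in the thread.

```typescript
// Hypothetical domain type: host and port travel together as one value.
interface Endpoint {
  readonly host: string;
  readonly port: number;
}

// Defaults are assumptions made for this sketch.
const DEFAULT_HOST = "localhost";
const DEFAULT_PORT = 8080;

// Accepts ":3000", "some-host", or "some-host:3000".
// Returns null when the input is rejected. IPv6 literals are out of scope.
function parseEndpoint(input: string): Endpoint | null {
  const text = input.trim();
  if (text.length === 0) return null;

  const colon = text.indexOf(":");
  if (colon === -1) return { host: text, port: DEFAULT_PORT };
  if (colon !== text.lastIndexOf(":")) return null; // more than one colon

  const hostPart = text.slice(0, colon);
  const portPart = text.slice(colon + 1);
  if (!/^\d{1,5}$/.test(portPart)) return null; // rejects "", "3.14", "-1"

  const port = Number(portPart);
  if (port < 1 || port > 65535) return null;

  return { host: hostPart.length > 0 ? hostPart : DEFAULT_HOST, port };
}
```

With these defaults, `parseEndpoint(":3000")` yields host `localhost` and port `3000`, `parseEndpoint("some-host")` yields port `8080`, and `parseEndpoint("some-host:3.14")` yields `null`.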
Error Handling and User Experience
The practical implications of parsing and validation, particularly concerning error messages, are a key point of discussion. Users question how comprehensive and user-friendly error reporting can be when strict parsing models are employed, especially when multiple errors can occur in an input. The trade-offs between strict parsing (failing fast) and more forgiving approaches that collect all errors are explored.
- "The problem I run into here is - how do you create good error messages when you do this? If the user has passed you input with multiple problems, how do you build a list of everything that's wrong with it if the parser crashes out halfway through?" - 12_throw_away
- "Most validation libraries worth their salt give you options to deal with this sort of thing? They'll hand you an aggregate error with an 'errors' array, or they'll let you write an error message 'prettify-er' to make a particular validation error easier to read." - ambicapter
- "Parsers can be made to not fail on first error. You return either a parsed structure or an array of found error. Html5 parser is notoriously friendly to errors." - Ygg2
- "How is getting an error array not making invalid input unrepresentable. You either get the correctly parsed data or you get an error array. The incorrect input was never represented in code, vs a 0 value being returned or even worse random gibberish." - jpc0
- "I don't see anything in the post or the linked tutorial that gives a flavor of the user experience when you supply an invalid option. ... What happens when you pass --foo, --target bar, or --port 3.14?" - SloopJon
- "I had a similar question: to me, the output format 'or' statement looks like it might deterministically pick one winner instead of alerting the user that they erred. A good parser is terrific, but it needs to give useful feedback." - macintux
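The "parsed structure or an array of errors" shape that Ygg2 and jpc0 describe can be sketched as a result type that collects every problem before returning. This is an illustrative sketch only: the `ParseResult` and `ServerOptions` types, the option names, and the message wording are assumptions, and it does not reproduce how the library from the article reports errors.

```typescript
// Either a parsed value or every problem found; never a half-valid object.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Hypothetical options for a small CLI; names are illustrative only.
interface ServerOptions {
  readonly target: "staging" | "production";
  readonly port: number;
}

function parseServerOptions(
  raw: Record<string, string | undefined>,
): ParseResult<ServerOptions> {
  const errors: string[] = [];

  // Report unknown keys instead of silently ignoring them.
  for (const key of Object.keys(raw)) {
    if (key !== "target" && key !== "port") errors.push(`unknown option --${key}`);
  }

  let target: ServerOptions["target"] | null = null;
  const targetText = raw["target"];
  if (targetText === "staging" || targetText === "production") {
    target = targetText;
  } else {
    errors.push(`--target must be "staging" or "production", got ${JSON.stringify(targetText)}`);
  }

  let port: number | null = null;
  const portText = raw["port"] ?? "";
  if (/^\d+$/.test(portText) && Number(portText) >= 1 && Number(portText) <= 65535) {
    port = Number(portText);
  } else {
    errors.push(`--port must be an integer from 1 to 65535, got ${JSON.stringify(portText)}`);
  }

  // Keep going past the first failure, then decide once at the end.
  if (target === null || port === null || errors.length > 0) {
    return { ok: false, errors };
  }
  return { ok: true, value: { target, port } };
}

// The three inputs from SloopJon's question, supplied together.
const cliResult = parseServerOptions({ foo: "1", target: "bar", port: "3.14" });
```

For that input, `cliResult.ok` is `false` and `cliResult.errors` holds three messages, one per problem. The caller still never sees a partially valid `ServerOptions`, which is jpc0's point.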
The Role of Type Systems
The integration of parsing with type systems, particularly with languages like TypeScript, is a recurring theme. The benefit of having the type system enforce invariants is highlighted, with some users appreciating how it can serve as a compile-time check that makes runtime validation in application logic less necessary.
- "The article tries to teach folks to utilize the type system to their advantage. Rather than praying to never forget invoking `validate(d)` on every single call site, make the type signature only accept `ValidatedData` type so the compiler will complain loudly if future maintainers try to shove `Data` type to it. This strategy offloads the mental burden of remembering things from the dev to the compiler." - lock1
- "The type isn't just there to make it easy to understand when you do it, it's for you a year later when you need to make a change further inside a codebase, far from where it's validated. Or for someone else who's never even seen the validation section of code." - dwattttt
- "I was recently thinking about type safety and validation strategies are particularly thorny in languages where the typings are just annotations. E.g. the Typescript/Zod or Python/Pydantic universes." - parhamn
- "It's almost like you want compile time type safety" - hahn-kev
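One way to get the compiler complaint lock1 describes in TypeScript, where parhamn notes the typings are only annotations, is a "branded" type. The sketch below is an assumption-laden illustration rather than anything posted in the thread: the `email` field, the deliberately simple regular expression, and the `register` function are hypothetical.

```typescript
// Loose shape straight from the outside world (hypothetical field).
interface Data {
  readonly email: string;
}

// The brand exists only in the type system and is erased at runtime, so the
// cast inside parseData is the single place where the claim is made.
declare const validatedBrand: unique symbol;
type ValidatedData = Data & { readonly [validatedBrand]: true };

// Deliberately simple check; real email rules are more involved.
function parseData(d: Data): ValidatedData | null {
  return /^[^@\s]+@[^@\s]+$/.test(d.email) ? (d as ValidatedData) : null;
}

// Only the narrow type is accepted, so no call site can forget to validate.
function register(d: ValidatedData): string {
  return `registered ${d.email}`;
}

// Never called; it only shows what the compiler rejects.
function futureMaintainerMistake(raw: Data): string {
  // @ts-expect-error: Data is not assignable to ValidatedData.
  return register(raw);
}

const checked = parseData({ email: "ada@example.com" });
const outcome = checked !== null ? register(checked) : "rejected";
```

Because the brand leaves no trace in the emitted JavaScript, the guarantee is only as strong as the discipline of never casting to `ValidatedData` outside `parseData`.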
Perceived Author Identity and Writing Style
A minor but present theme involves speculation about the origin of the article's text, with multiple users identifying stylistic elements as reminiscent of AI-generated content (specifically ChatGPT). This leads to discussions about AI's role in writing and the characteristics of such output.
- "Stopped reading after realising this is written by ChatGPT" - HL33tibCe7
- "I thought the style was like ChatGPT in a 'clever, casual, snarky' prompt flavor as well." - bobbiechen
- "I found the content novel and helpful ... and the tone very enjoyable. In fact, it's so idiomatically written that I can't even believe it's just a machine translation of something written in another language." - akoboldfrying