Essential insights from Hacker News discussions

SQL Needed Structure

This Hacker News discussion revolves around the idea of leveraging modern SQL capabilities, particularly JSON support and features like LATERAL, to simplify data fetching and reduce the need for complex application-level logic. Several key themes emerge:

Embracing Modern SQL and Database Capabilities

A significant portion of the discussion centers on the idea that modern databases, especially PostgreSQL, offer powerful features that can solve complex data manipulation problems directly within the database. This contrasts with the traditional approach of fetching "flat" data and re-assembling it in the application layer.

  • tucnak highlights this by stating, "This is NOT the case with modern SQL, as it supports JSON. The typical Postgres use-case would be to produce denormalised views for the specific relationships and/or aggregates in JSON form. LATERAL, the correlated sub-query, is really convenient in avoiding double-counting." They further critique the common practice of reinventing data access: "Time and time again, software engineers reinvent the data-access wheel out of ignorance to database capabilities."
  • MaxMonteil expresses a similar sentiment: "My thinking is everything I could get done by the DB avoids heavier and maybe slower application code." They also ask for more resources to learn this approach.
  • fredguth directly attributes their learning to the discussion: "Today I learned the power of LATERALs... thanks. :-)"
  • sakesun laments the tendency to create new solutions instead of improving existing ones: "When something in SQL becomes cumbersome or difficult to work with, people often tend to create entirely new solutions rather than improving the existing ones. Few developers seem interested in refining SQL itself."

Storing and Querying JSON Data

The use of JSON within relational databases is a recurrent theme, with users discussing its benefits and practical implementation.

  • mg states, "I am becoming more and more comfortable with storing everything in JSON." They provide an example of efficient querying in SQLite: sql SELECT cars->>'model' FROM cars They also express a desire for similar arrow operators in MariaDB, noting its current cumbersomeness: sql SELECT JSON_UNQUOTE(JSON_EXTRACT(cars, '$.model')) FROM cars
  • tgv offers a nuanced view: "If that's all, I still recommend classical columns. Now if you have array values and/or values with varying structures which you don't need to query (frequently), JSON makes makes sense."
  • mr_toad queries the flexibility of JSON querying: "Can β€˜model’ be a variable, or does it have to be a constant literal?"
  • roenxi discusses the practical aspects of JSON blobs in databases: "And, in practice a lot of these frontenders end up storing a JSON blob with either no nesting or just one level of nesting that looks a lot like a database schema built by someone being stubborn about not normalising data." They also acknowledge that databases are adapting: "some sense that is fine, databases seem to be learning to just treat simple JSON blobs as normalised data but it is hard to make that as efficient as a guaranteed schema."

Architectural Patterns and Best Practices

The discussion touches upon architectural choices, including the placement of business logic, schema design, and testing.

  • taffer provides valuable resources and personal experience: "Putting business logic in the database: https://sive.rs/pg ... Personally I am using an architecture similar to https://sive.rs/pg in my personal projects and am very happy with it." They emphasize a key practice: "It is important that you put data and procedures/views in different schemas so that you can use migrations for your data but automatically delete and recreate the procedures/views during deployment. Also use pgtap or something similar for testing."
  • hbrundage suggests extending the idea further: "I would take it one step further β€” the structure of the joins is something that should most often come from the schema, not the query." They also mention their work on a hybrid language: "We built a GraphQL / SQL hybrid expression language that does just this here: https://docs.gadget.dev/guides/data-access/gelly#gelly , curious what yall think."
  • stared contrasts their experience with relational databases after working with NoSQL: "I used to work bit with MongoDB (over 10 years ago) and it did wonders in making me fall back into love with relational databases. I finally saw the light of Codd!" They agree on the potential of standardized JSON tooling but defend relational databases for separating concerns: "Yet, with relational databased you can separate concerns of: what is your data and what you want to display. If you use JSON-like way to store data, it can do the job until you want to change data or queries."

Terminology and Semantics

One user points out a potential difference in how technical terms are understood across different domains.

  • greggyb notes: "Coming from a data engineering and business analytics background, the terms "structured" and "unstructured" seem to be swapped in meaning here. I'd expect to talk about anything in a DB as "structured data". Flexible serialization formats such as JSON or XML are "semi-structured". And something from e.g., an office document is "unstructured"." They clarify this isn't a criticism but an observation about perspectives.

Alternatives and Related Concepts

The discussion briefly touches on alternative approaches and related database concepts.

  • maweki asks if the author's proposal is essentially GraphQL: "Isn't the author just proposing GraphQL?"
  • silon42 mentions a common challenge: "I've seen 'decorellation' mentioned... 95% of the time I am fighting the query optimizer is to get the correlation and a loop over few items working instead of a table scan."
  • Mikhail_Edoshin links to a related HN discussion about a paper: "'What if an SQL statement returned a database?' https://news.ycombinator.com/item?id=38606146"
  • jbverschoor makes a humorous, albeit critical, comment: "Time to start taking that those CS classes before blogging"