Essential insights from Hacker News discussions

OpenAI slams court order to save all ChatGPT logs, including deleted chats

Here's a breakdown of the key themes and opinions expressed in the Hacker News discussion, supported by direct quotations:

The Illusion of Deletion: "Deleted" Doesn't Always Mean Gone

A central theme is the common industry practice of "soft deletes," where data is not erased but merely marked as deleted or hidden. This practice clashes with user expectations, and some commenters argue it should be illegal or, at the very least, more transparent.

  • "It should be illegal to call something deleted when it is not." - hyperhopper

  • "I don't disagree, but that ship sailed at least 15+ years ago. Soft delete is the name of the game basically everywhere..." - girvo

  • "With how modern systems, languages, databases and file systems are designed, deletion often means 'mark this as deleted' or 'erase the location of this data'." - miki123211
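To make the "mark this as deleted" pattern concrete, here is a minimal sketch of a soft delete using Python's built-in sqlite3 module. The table name `chats`, the `deleted_at` column, and the helper functions are hypothetical and chosen only for illustration; nothing here describes how any particular company stores its data.

```python
import sqlite3
from datetime import datetime, timezone


def make_db():
    # Hypothetical schema: a NULL deleted_at means the row is "live".
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chats ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL, deleted_at TEXT)"
    )
    return conn


def add_chat(conn, body):
    cur = conn.execute("INSERT INTO chats (body) VALUES (?)", (body,))
    return cur.lastrowid


def soft_delete(conn, chat_id):
    # The row is only stamped with a deletion time; its contents stay on disk.
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("UPDATE chats SET deleted_at = ? WHERE id = ?", (now, chat_id))


def visible_chats(conn):
    # What the user interface would show: rows not marked as deleted.
    rows = conn.execute(
        "SELECT body FROM chats WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def all_stored_chats(conn):
    # What is still physically stored, including "deleted" rows.
    rows = conn.execute("SELECT body FROM chats ORDER BY id")
    return [row[0] for row in rows]
```

After `soft_delete`, the chat disappears from `visible_chats` but still appears in `all_stored_chats`, which is the gap between user expectation and storage reality that the commenters describe.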

Data Retention and Court Orders vs. User Privacy

The discussion highlights the conflict between legal obligations (like court orders to preserve data), user privacy expectations, and practical data management. Some participants object to mass collection and indefinite retention of user data, even if motivated by legal necessity.

  • "The privacy of millions of people should take precedence over ease of evidence collection for a lawsuit." - djrj477dhsnv

  • "Or maybe it should be illegal to have a court order that the privacy of millions of people should be infringed? I’m with OpenAI on this one, regardless of their less than pure reasons. You don’t get to wiretap all of the US population, and that’s essentially what they are doing here." - Aeolun

Data Deletion Challenges: Technical Complexity and Performance Trade-offs

Several users emphasize the technical difficulty and performance cost of truly and securely deleting data, particularly in complex systems such as databases. They argue that guaranteed hard deletion would add substantial overhead, from extra drive wear to slower file operations.

  • "As a technical matter, it is surprisingly difficult and expensive to unrecoverably delete something with high assurance. Most deletes in real systems are much softer than people assume because it dramatically improves performance, scalability, and cost." - jandrewrogers

  • "Changing this would slow computers down massively. Just to give a few examples, backups would be prohibited, so would be garbage collection and all existing SSD drives. File systems would have to wipe data on unlink(), which would increase drive wear and turn operations which everybody assumed were O(1) for years into O(n), and existing software isn't prepared for that." - miki123211

  • "There have been many attempts to build e.g. databases that support deterministic hard deletes. Unfortunately, that feature is sufficiently ruinous to efficient software architecture that performance is extremely poor such that no one uses them." - jandrewrogers
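The cost difference that miki123211 describes can be sketched as follows. This is an illustration only: the function names are hypothetical, and overwriting a file in place is not a guarantee of physical erasure, since SSD wear leveling, copy-on-write file systems, snapshots, and backups may all keep older copies of the blocks.

```python
import os


def unlink_only(path):
    # Removes the directory entry. The work does not depend on file size,
    # and the old data blocks are typically left in place until reused.
    os.remove(path)


def overwrite_then_unlink(path, chunk_size=64 * 1024):
    # Writes zeros over every byte before unlinking, so the work grows
    # linearly with file size and adds write wear to the drive.
    size = os.path.getsize(path)
    written = 0
    with open(path, "r+b") as f:
        while written < size:
            n = min(chunk_size, size - written)
            f.write(b"\x00" * n)
            written += n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the zeros to the device
    os.remove(path)
    return written  # number of bytes that had to be rewritten
```

The second function touches every byte of the file, which is the shift from roughly constant-time to linear-time deletion mentioned in the quote.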

Deletion Through Encryption: A Potential Solution, But With Limitations

One proposal is to encrypt data and then "delete" it by destroying the decryption key. The discussion also acknowledges the limits of this approach, especially when many users' data is interleaved in the same storage blocks.

  • "Encrypt everything, and to delete you delete the decryption key. If a user wants to clear their personal data, you delete their decryption key and all of their data is burned without having to physically modify it." - Gigachad

  • "That only works if you have a single key at the block level, like an encryption key per file. It essentially doesn’t work for data that is finely mixed with different keys such as in a database. Encryption works on byte blocks, 16-bytes in the case of AES. Modern data representations interleave data at the bit level for performance and efficiency reasons. How do you encrypt a block with several users data in it? Separating these out into individual blocks is extremely expensive in several dimensions." - jandrewrogers
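The key-deletion idea can be sketched in a few lines. The class and method names below are hypothetical, and the cipher is a toy keystream built from SHA-256 so that the example runs with only the standard library; it is unauthenticated and stands in for a real scheme such as AES-GCM, so it should not be used to protect actual data. The sketch also assumes the easy case, where each stored record belongs to exactly one user, and so it does not address the objection about finely interleaved data.

```python
import hashlib
import secrets


def _keystream(key, nonce, length):
    # Toy keystream: SHA-256 over key + nonce + counter. Illustration only.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


class CryptoShredStore:
    def __init__(self):
        self._keys = {}     # user_id -> per-user secret key
        self._records = []  # (user_id, nonce, ciphertext), never modified

    def put(self, user_id, plaintext):
        if user_id not in self._keys:
            self._keys[user_id] = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)  # fresh nonce for every record
        stream = _keystream(self._keys[user_id], nonce, len(plaintext))
        self._records.append((user_id, nonce, _xor(plaintext, stream)))

    def read(self, user_id):
        key = self._keys.get(user_id)
        if key is None:
            return None  # key is gone, so the ciphertext is unreadable
        return [
            _xor(ct, _keystream(key, nonce, len(ct)))
            for uid, nonce, ct in self._records
            if uid == user_id
        ]

    def shred(self, user_id):
        # "Deletion" touches only the small key table, not the stored records.
        self._keys.pop(user_id, None)

    def stored_record_count(self):
        return len(self._records)
```

Calling `shred` leaves every ciphertext in place, which is why the approach is cheap. It only counts as deletion if no other copy of the key survives, for example in a backup of the key table.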

GDPR Compliance and Data Management Practices

The discussion touches on GDPR requirements for data deletion and how companies manage these requests in practice. Considerations include verifying user identity, handling backups, and communicating limitations.

  • "At work we dutifully delete all data on a GDPR request" - eurekin

  • "Backup retention policy 60 days, respond within a week or two telling someone that you have purged their data from the main database but that these backups exist and cannot be changed, but that they will be automatically deleted in 60 days." - crdrost
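The backup policy in crdrost's comment comes down to simple date arithmetic. The sketch below uses the 60-day figure from the quote; the function names and the wording of the notice are hypothetical, and it assumes the newest backup that still contains the data was taken no later than the day of the purge.

```python
from datetime import date, timedelta

BACKUP_RETENTION_DAYS = 60  # retention period taken from the quoted comment


def backup_expiry(purged_on, retention_days=BACKUP_RETENTION_DAYS):
    # Assumption: the last backup holding the data dates from the purge day
    # at the latest, so it ages out retention_days after that.
    return purged_on + timedelta(days=retention_days)


def erasure_notice(purged_on):
    # Hypothetical reply text for a deletion request.
    expiry = backup_expiry(purged_on)
    return (
        f"Your data was purged from the main database on {purged_on.isoformat()}. "
        f"Backups cannot be changed, but the last one containing your data "
        f"will be deleted automatically by {expiry.isoformat()}."
    )
```

For example, a purge on 2025-01-01 gives a backup expiry of 2025-03-02.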

Data Security and Long-Term Risks: Data Breaches and Future Ownership

Some users voice concerns about the long-term security of preserved data, highlighting the potential for data breaches and changes in ownership to expose "deleted" data.

  • "Consequently all your 'deleted chats' might one day become public if someone manages to dump some tables off OpenAI's databases." - aranelsurion

  • "Maybe not today on its heyday, but who knows what happens in 20 years once OpenAI becomes Yahoo of AI, or loses much of its value, gets scrapped for parts and bought by less sophisticated owners. It's better to regard that data as already public." - aranelsurion

Transparency and User Expectations: The Importance of Clear Communication

The need for transparency in data deletion practices is emphasized, with some arguing that companies should clearly communicate to users how their data is actually handled.

  • "It's easy to change the words we use instead to make it clear to users that the data isn't irrevocably deleted." - girvo

Moral and Ethical Considerations: Privacy as a Right

Some participants suggest that companies have a moral or ethical obligation to prioritize user privacy, even if it comes with increased costs or technical challenges.

  • "It is true that protecting the user's privacy costs more than not protecting it, but some organizations feel a moral obligation or have a legal duty to do so. And some users value their own privacy enough that they are willing to deal with the decreased convenience." - alisonatwork