Essential insights from Hacker News discussions

Information has been permanently deleted, for small values of permanently

The following themes emerged from the Hacker News discussion:

Enforcement and Efficacy of Data Deletion Laws

A central theme is the perceived lack of effective enforcement and the loopholes that undermine data deletion requests, even under regulations like GDPR. Users expressed skepticism about whether companies truly delete data and whether legal frameworks are sufficient to ensure compliance.

  • "until there's actual enforcement, there isn't the incentive to tell the truth..." - therobot24
  • "GDPR can only be enforced by regulators. The bar for a valid complaint is quite high, and a company can lie and essentially remove your grounds for said complaint. And even once you do get a valid complaint in, it'll stay in limbo for years." - Nextgrid
  • "the GDPR isn't mentioned, but as one of the more stringent privacy regulation regimes, its 'right to erasure' has all sorts of conditions attached to it where a customer might be told that all of their data has been deleted, but some legally has to be (or can be) stored." - petercooper
  • "No, it's lying. If they say your data has been deleted without a qualifier that some of it remains undeleted (regardless of the reason), that's just a straight-up lie because their statement is factually untrue and they know it." - JohnFen

The Ambiguity of "Deletion" and Corporate Definitions

The discussion highlights how terms like "deleted" and "permanently deleted" are open to broad interpretation by companies, often through legalese in their terms of service. This creates a disconnect between user expectations and corporate practices.

  • "*For definitions of “your”, “information”, “permanently” and “deleted”, please refer to one of the dense, poorly worded contracts you implicitly agreed to when you thought about our site." - clickety_clack
  • "It's probably in the Privacy Policy they just emailed about." - codeplea
  • "It's less “poorly worded” than finely tuned legalese that gives the company carte blanche." - tempodox
  • "These definitions are subject to not just variance from their common meaning but also unilateral change without notice." - reverendsteveii
  • "It's semantics, but one man's "lying" is another man's pragmatic, non-legalese customer-facing wording." - petercooper
  • "No, it's lying. If they say your data has been deleted without a qualifier that some of it remains undeleted (regardless of the reason), that's just a straight-up lie because their statement is factually untrue and they know it." - JohnFen (replying to petercooper)

Technical Challenges of Data Deletion and Backups

Participants delved into the technical complexities of ensuring data is truly erased, particularly concerning backups and distributed systems. The practicalities of scrubbing data from all backup media were a recurring point of concern.

  • "GDPR requires data to be deleted where feasible. A common area where this falls apart is in backups made of systems implemented prior to GDPR rules, or systems which have not implemented a mechanism to allow user level deletion from backups." - ygjb
  • "One of the headaches of system design in this area is how do you deal with backups. Lets say you do regular backups to s3, glacier, tape, stone tablets. When you tell your customer "we have deleted all your data", are you loading all your backups and scrubbing their data from there as well? Probably not as it would probably be too expensive..." - nemothekid
  • "When someone "really really" needs data from a week ago and gets a one off backup, it has deleted customer data in it." - nemothekid
  • "With data deletion requests, you sometimes do need a mechanism to keep track of who/what you deleted. This inevitably involves PII. What comes to mind is CCPA requests to delete data from private data brokers - there is an inherent problem that to avoid re-ingesting your data into their system, they need to know what that data is in the first place." - JohnMakin
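
The re-ingestion and backup problems described above can be sketched together. This is a minimal illustration rather than a compliance recipe: it keeps a suppression list of keyed hashes instead of raw email addresses, and re-applies that list whenever an old backup is restored. The names SUPPRESSION_KEY, SuppressionList, and scrub_restored are hypothetical, and the sketch assumes records are keyed by email. A keyed hash of an identifier is probably still personal data in a legal sense, so this narrows what is stored rather than removing the tension JohnMakin describes.

```python
import hashlib
import hmac

# Hypothetical secret key (an assumption for this sketch); a real system
# would load it from a secrets manager rather than from source code.
SUPPRESSION_KEY = b"example-key-not-for-production"


def fingerprint(email: str) -> str:
    """Return a keyed SHA-256 hash of a normalized email address."""
    normalized = email.strip().lower()
    return hmac.new(
        SUPPRESSION_KEY, normalized.encode("utf-8"), hashlib.sha256
    ).hexdigest()


class SuppressionList:
    """Remembers which identifiers were erased without storing them in the clear."""

    def __init__(self) -> None:
        self._erased: set[str] = set()

    def record_erasure(self, email: str) -> None:
        # Only the fingerprint is kept, never the address itself.
        self._erased.add(fingerprint(email))

    def is_erased(self, email: str) -> bool:
        return fingerprint(email) in self._erased


def scrub_restored(records: list[dict], suppressed: SuppressionList) -> list[dict]:
    """Drop erased users from records loaded out of an old backup."""
    return [r for r in records if not suppressed.is_erased(r["email"])]


suppressed = SuppressionList()
suppressed.record_erasure("Alice@Example.com")

# A week-old backup that still contains the erased user.
backup = [
    {"email": "alice@example.com", "plan": "pro"},
    {"email": "bob@example.com", "plan": "free"},
]
restored = scrub_restored(backup, suppressed)
print([r["email"] for r in restored])  # ['bob@example.com']
```

The same is_erased check could guard an ingestion pipeline, which is the data broker case: the broker has to retain something derived from the person's data in order to recognize it and refuse it later.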

The Inherent Conflict Between Data Minimization and Business Practices

There's a tension between the desire for data privacy and deletion, and the practical needs of businesses, such as legal retention requirements, auditing, and even marketing databases. This conflict often leads to companies retaining data that users believe should be or has been deleted.

  • "Can you claim to that user it is deleted when it is not just because you're holding onto it for legal reasons? I understand the need or requirement to hold some documents, but I don't understand how companies can lie to users claiming their information was deleted when it was not." - asadotzler
  • "For example, you can store a record that an erased user requested erasure so you can prove it later on if needed in a legal situation (article 17.3.e)." - petercooper
  • "But really, aren't companies legally required to retain a lot of information anyway? Such as invoices needed for tax purposes?" - codeplea

User Skepticism and the Strategy of Data Minimization

Given the difficulties in verifying deletion and the prevalence of perceived deception, users expressed a lack of trust in companies' claims and advocated for minimizing the amount of data shared in the first place.

  • "Since it's impossible to verify, and there's a whole ton of deception by companies, I simply don't trust that any data is ever deleted just because I've been told it is. Instead, I try to minimize the amount of data that others have." - JohnFen
  • "I did. It was against my better judgement, but I did it anyway as a favor for a family member. I regret making an exception for that, and won't make that mistake again." - JohnFen (referring to 23andMe)

The Problem of Proving Deletion (or Lack Thereof)

The difficulty of proving whether data has actually been deleted was also raised as a significant obstacle for users seeking recourse. Subtle discrepancies or reappearances of data are hard to demonstrate in court.

  • "The thing is people generally aren't going to court when they suddenly notice a deleted photo still in their cloud docs. They're going to think it's a glitch or that they hadn't deleted it. Proving something like this in court is tricky - how would you prove to the judge you deleted something and that it randomly reappeared after some years?" - trinix912

Incompetence vs. Malice in Data Handling

Some participants suggested that failures to delete data might stem more from systemic incompetence or poorly designed, denormalized databases than from deliberate malice, although incompetence can have a similar effect.

  • "I'm leaning toward incompetence on this one. Certainly if they were deliberately keeping his data against his will, it would be stupid to email him about it. The people responsible for deleting the account info probably deleted everything they knew about or had access to, but his email was also in some other database run by marketing or something. Or their databases are just overall horrifically denormalized and inconsistent." - andrewflnr
  • "(Of course, sufficiently advanced incompetence is indistinguishable from malice. Hard to say if that's applicable here.)" - andrewflnr