Here's a breakdown of the key themes discussed in the Hacker News thread, supported by direct quotes where appropriate:
Typewriters and Character Substitution
A central theme is that many typewriters omitted certain characters, so typists had to substitute others. The primary example is the missing '1' key, with a lowercase 'l' typed in its place.
- "My parents had a typewriter without a 1 or a 0. I always thought it was to provide room for two other valuable characters like the old "cents" c with a bar through it." - bediger4000
- "Typewriter keys cost money, and dropping the 1 allowed them to drop a key without significantly affecting the use of it. As far as I can tell, that's effectively the entire rationale." - thedufer
- "Was it some technical constraint of the typewriter that caused “1” to become more like “l” come XX century? The typewriter I grew up with simply didn't have a key for it. It also didn't have a 0 or an exclamation mark or a plus sign. There were well known substitutes: For the number 1, type lowercase letter l. For the number 0, type uppercase letter o." - adrianmonk
- "In the UK, we also used to type '£' with an 'L' and backspacing and overtyping an '='." - ralferoo
The Economics and Mechanics of Typewriter Design
The discussion explores the economic and mechanical constraints that led to reduced character sets: every extra key added cost, and the space around the typing area was limited.
- "They didn't just cost money. They were competing to the limited space around the typing area, what meant they were constrained at the border of a circumference that would be entirely filled with mechanisms. In other words, the cost in both money, size, and weight depended on the square of the number of keys." - marcosdumay
- "With limited space and resources, I wonder what other letter or number could be dropped and meaning retained. 0 and O might be worth considering?" - lostlogin
Overstriking and Character Composition in Typewriters and Computing
The thread delves into overstriking: typing one character on top of another to produce a character that the keyboard lacks. Commenters cover both the historical practice on typewriters and its persistence, now partly retired, in tools such as GNU groff.
- "The font designer in you is being influenced by computer displays and movable type. The world of typewriting had some idiosyncrasies, not even shared with (say) the world of dot-matrix printing, one of which was pressure to make the largest composable range of characters with the smallest number of physical keys. Which means common base shapes." - JdeBP
- "It's a little known fact that some parts of the computing world are faithfully reproducing this aspect of typewriters, trying to write bullet points, daggers, and currency symbols not with actual Unicode but with a very limited ASCII repertoire and overstriking, even today." - JdeBP
- "Here, for example, is how it currently, in its UTF-8-disabled mode (as unfortunately still used by manual page readers on several operating systems), composes a down arrow in the typewriter style of overstriking a vertical bar with a 'v' letter:" - JdeBP
- "In its UTF-8 mode, for bullet points GNU groff uses the actual Unicode characters that are available. Until 2024, in "ascii" mode it overstruck a plus symbol with a letter 'o', one of many such typewriting tricks, which no-one but those printing manual pages to old printers capable of the same typewriting trick would have ever seen as it was supposed to be seen. On VDUs, such bullet points just came out as the letter 'o'." - JdeBP
- "The technical constraint still applies. My keyboard uses ' ' and '-' to represent many different symbols." - devnullbrain
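The overstriking described above is usually encoded in a text stream as a character, a backspace, and a second character. The following is a minimal sketch of what a video terminal effectively does with such a sequence: the last character written to a column wins. The function name `resolve_overstrikes` is hypothetical, and the sketch assumes a single line of input; it is not groff's own code.

```python
def resolve_overstrikes(text: str) -> str:
    """Collapse backspace overstrikes so the last character in each column wins."""
    columns: list[str] = []
    pos = 0  # current print column
    for ch in text:
        if ch == "\b":
            # Backspace moves the print position left without erasing anything.
            pos = max(pos - 1, 0)
        elif pos < len(columns):
            # Column already holds a character: overwrite it, as a screen would.
            columns[pos] = ch
            pos += 1
        else:
            columns.append(ch)
            pos += 1
    return "".join(columns)


# A plus sign overstruck with 'o', the old "ascii"-mode bullet described above.
print(resolve_overstrikes("+\bo item"))  # o item
```

On paper both glyphs would remain visible on top of each other; on a screen only the second survives, which matches the observation that such bullets appeared as a plain 'o'.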
Context and Distinguishing Characters
Several users noted that humans often rely on context to differentiate between similar-looking characters, particularly '1' and 'l'. However, this becomes problematic in situations like reading code or passwords.
- "was it that in prior years a reader could usually distinguish 1 from l by context. Even today, very few things cause me to need to te11 a 1 from a l." - hidingfearful
- "it matters when reading code and random string (what we now call passwords, though back then passwords were things you could pronounce, unlike say ywtr466Nh%vX)." - hidingfearful
Data Quality, Outliers, and Skepticism
The thread touches on data quality and the need to treat outliers with caution. Several related ideas come up, including using the median instead of the mean, Olympic-style scoring, and checking how data was sourced.
- "This is why one of my principles is to be skeptical of outliers. Often they are not real and therefore misrepresent the true data." - djoldman
- "The lesson I took from this is that it is useful and important to dig into how any piece of data was sourced." - yen223
- "Similar to Twyman's Law: “Any figure that looks interesting or different is usually wrong.”" - dustincoates
- "You can tell how much they cared about data quality because they never took the time to look at context-dependent glyph equivalencies. And some context-sensitive algorithms might not make the same mistakes as a naive “guess what characters are here” algorithm that just uses glyph shapes. You run into this a LOT with ALPR systems because some of the presses excluded some characters. O and 0 are the most common character equivalency. But only in certain places." - throwaway173738
- "OCR is actually complicated if you’re trying to rely on the data for something." - throwaway173738
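To illustrate why commenters distrust outliers, here is a small sketch comparing the mean, the median, and an "Olympic-style" average. The thread only names these ideas; reading Olympic scoring as "drop the single highest and lowest value, then average" is an assumption, `olympic_mean` is a hypothetical helper, and the readings are made-up data.

```python
from statistics import mean, median


def olympic_mean(values):
    """Drop the single highest and lowest values, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least three values")
    trimmed = sorted(values)[1:-1]
    return mean(trimmed)


# Made-up readings; 1000 plays the role of a suspect outlier (say, an OCR misread).
readings = [10, 11, 9, 10, 1000]

print(mean(readings))          # 208: dragged upward by the single outlier
print(median(readings))        # 10: unaffected by it
print(olympic_mean(readings))  # about 10.33: outlier discarded before averaging
```

Neither robust statistic says whether the outlier is real. That still requires checking how the value was sourced, as the comments above point out.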
The Original Post
The original post concerned an apparent OCR error: typewritten lowercase l's used in place of the digit 1. As one commenter notes, the OCR was sometimes reading the page literally rather than making a mistake.
- "tl,dr: It's an OCR error" - esafak
- "Or, sometimes, not; one of the more interesting takeaways was typewritten lowercase ells instead of ones: “When the algorithm read October llth, it was far more correct than we have been giving it credit.”" - dahart
- "Naming an event after its date will have a limited run." - mensetmanusman
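The typewriter substitutions mentioned in the thread (lowercase l for 1, uppercase O for 0) suggest a context-sensitive cleanup step for OCR output. The sketch below is one hypothetical heuristic, not the method used in the original post: a token is treated as numeric only if it consists of digits and stand-in letters and either contains a real digit or carries an ordinal suffix. The function name and the rule itself are assumptions, and the rule will misfire on some real words.

```python
import re

# Hypothetical rule: digits and the stand-ins l/O, with an optional ordinal suffix.
_NUMERIC_TOKEN = re.compile(r"\b([0-9lO]+)(st|nd|rd|th)?\b")
_STAND_INS = str.maketrans({"l": "1", "O": "0"})


def normalize_typewriter_digits(text: str) -> str:
    """Replace l/O with 1/0 inside tokens that look numeric from context."""

    def fix(match: re.Match) -> str:
        body = match.group(1)
        suffix = match.group(2) or ""
        # Without a real digit or an ordinal suffix there is no numeric context,
        # so the token is left untouched.
        if not suffix and not any(c.isdigit() for c in body):
            return match.group(0)
        return body.translate(_STAND_INS) + suffix

    return _NUMERIC_TOKEN.sub(fix, text)


print(normalize_typewriter_digits("October llth, l9O5"))  # October 11th, 1905
```

Ordinary words such as "October" or "hello" are left alone because the pattern must span a whole token. A production system would need a far richer notion of context, as the comments about ALPR suggest.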