Essential insights from Hacker News discussions

The math of shuffling cards almost brought down an online poker empire

Here's a summary of the themes from the Hacker News discussion:

The Immensity of 52! and Human Comprehension

A significant portion of the discussion revolves around the sheer magnitude of 52 factorial (52!) and how humans perceive such large numbers. A computer can represent 52! in a couple hundred bits, but intuitively grasping its scale is another matter; a quick calculation after the quotes bears out the bounds cited below.

  • "The number 52! is absurdly large." (shagie)
  • "Numbers like Tree(3) or Loader's number or the various 'these are really big mathematical functions'... we know we can't comprehend them. But 52! - that's something that we think we can get our head around... and it's mind boggling large once you start doing the math and relating it to real things that we can relate to (which are mind bogglingly large)." (shagie)
  • "52! < 2^226, so it's not that big in computer terms." (gucci-on-fleek)
  • "Still mindbogglingly huge, but not uncommonly large." (gucci-on-fleek)

The Role and Requirements of Random Number Generation (RNG) for Shuffling

The core technical issue discussed is the inadequacy of the random number generation (RNG) in the described shuffling algorithm, in particular how the generator was seeded. This leads to a broader conversation about what constitutes "good" or "true" randomness and the practicalities of achieving it; a sketch of the rejection-sampling fix mentioned below follows the quotes.

  • "RNG was seeded with a millisecond time of day, leading to 86.4M possible decks. Oooph." (ajross)
  • "It's not that the probability theory was a novel and unexplored field at the time." (alexey-salmin)
  • "The issue in the article isn't application of pseudorandom numbers. It's seeding the generator." (RainyDayTmrw)
  • "So? You’re only using one deck at a time; so you only need to generate 1 bit randomly 226 times — then use that deck." (zmgsabst)
  • "Without a bunch of (unentangled) electrons you can observe, getting 226 truly random bits in quick succession doesn't seem all that easy." (bobbylarrybobby)
  • "Depends on what you mean by 'truly random' but if it's 'cryptographically secure random' then it's not particularly hard." (alexey-salmin)
  • "Sanzig: RDRAND has a throughput of hundreds of megabytes per second on most processors. Getting 226 bits is pretty easy, even shuffling millions of decks per second you'd be bottlenecked elsewhere." (Sanzig)
  • "The non-evenly-diving divisor thing is a bit trickier, which is why standard libraries implement Fisher-Yates for you. But the solution is basically: Using your PRNG, generate the number of bits you need. So if you need a number 0-51, generate 6 bits. 6 bits can hold 0-63. If you get a number in the range 52-63, discard that and generate a new 6-bit number with the PRNG." (kerkeslager)
  • "Rethink your RNG choices. Are you using /dev/random, /dev/urandom, or a custom implementation? What is your seeding strategy?" (dr_death)
  • "RDRAND and RDSEED are both using quantum principles (aka: heat and temperature / truly quantumly random noise at the microscopic level in the CPU's transistors) to generate random numbers... Well... a seed at least. And then they are expanded using AES encryption IIRC (which 'shouldn't' be breakable, and even if it were breakable it'd probably be very difficult to follow)." (dragontamer)

The Suitability of the Fisher-Yates Shuffle Algorithm

The Fisher-Yates (or Knuth) shuffle algorithm is repeatedly brought up as the correct, proven method for producing an unbiased shuffle. The discussion clarifies its principles and how its quality depends on the underlying RNG; a minimal implementation follows the quotes.

  • "Go (and probably other languages) has the Fisher-Yates algorithm for shuffling" (awesome_dude)
  • "Really? It repeats a point from the article, and in so doing utterly misses the point of the comment it's replying to. Fisher-Yates is at least not broken, but it can only be as good as the underlying RNG." (andrewflnr)
  • "Nobody in this thread is criticizing Fisher-Yates, because in all likelihood all of us are using Fisher-Yates. We're discussing the failure of the algorithm used in the article." (kerkeslager)
  • "If you do it correctly, you've reinvented Fisher-Yates[1]. If you do it wrong, you've reinvented this unnamed, broken algorithm[2], instead." (RainyDayTmrw)
  • "Today many online poker sites use the Fisher–Yates algorithm, also called the Knuth shuffle (which sounds delightfully like a dance). It’s easy to implement and delivers satisfactory results." (insaider)
  • "But it's not just good enough, it's optimal. It is equivalent to picking a random deck from the set of all possible decks assuming your random source is good. More random than a real shuffle." (pfg_)
  • "You can: (a) Use Fischer-Yates, which is fast and, if correctly implemented, unbiased, or (b) Come up with an ad-hoc scheme like this, do some fancy math to bound its total variation distance..." (amluto)

Questioning the Premise: Humans vs. Computers in Shuffling

Several users challenge the article's assertion that computers cannot replicate human shuffling, arguing that computers, given proper algorithms and RNGs, achieve superior randomness. One caveat raised along the way, that sorting with a random comparator does not yield an unbiased shuffle, is demonstrated after the quotes.

  • "> Card dealers create a unique deck with each shuffle, something computers cannot replicate This is entirely, unequivocally false." (sneak)
  • "Shuffling as an algorithm to implement is easy to fuck up, but if you use a good one and a true RNG source, computers can shuffle better than humans - just as randomly, and many orders of magnitude faster." (sneak)
  • "If anything it's humans (unless specifically trained) who produce horribly biased shuffles. Whole chunks of sequences from the previous deal can show up, either directly or spaced 2x or 4x. Try ordering a deck first, then shuffle it, then open each card and see if you notice any patterns." (alexey-salmin)
  • "As a sort of TL;DR: it's basically like selection sort but instead of selecting the sort-compliant choice among the remaining ones, you ask the RNG and pick a random one." (namibj)
  • "However, note that using a random function as the comparator for a sort function does not generally result in an unbiased shuffle." (namibj)
  • "Maybe they mean that computers can't replicate the imperfections/shortcomings that might define human/hand shuffling? That's kind of how I took it." (indigodaddy)
  • "Those human imperfections likely decrease randomness - for example leaving cards that started adjacent more likely to remain adjacent han by strict chance." (evrydayhustling)
  • "They most definitely decrease randomness. But I guess the article’s point is that human imperfections offset that with lower correlated failure modes." (harrall)
  • "Given > an algorithm would randomly select an arrangement from the 52! possible decks. But no computer has enough memory to evaluate all of these possibilities It sounds more like they don't understand how a computer can shuffle something." (dwattttt)
  • "“In an ideal world, an algorithm would randomly select an arrangement from the 52! possible decks. But no computer has enough memory to evaluate all of these possibilities, and a perfect random number generator doesn’t yet exist. Therefore, developers generally rely on algorithms that simulate card shuffling.” You’ve got to be kidding me. What’s wrong with sampling each card independently and uniformly (without replacement)?" (ziofill)
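
namibj's caveat about random comparators is easy to check empirically. A hypothetical sketch (standard library only) that counts where the first card lands under a coin-flip comparator:

```python
import functools
import random
from collections import Counter

def comparator_shuffle(items):
    """The anti-pattern: sorting with a comparator that flips a coin."""
    def coin(a, b):
        return random.choice((-1, 1))  # ignores the cards entirely
    return sorted(items, key=functools.cmp_to_key(coin))

# Where does card 0 of [0, 1, 2] end up across many trials?
counts = Counter(comparator_shuffle([0, 1, 2])[0] for _ in range(60_000))
print(counts)  # noticeably non-uniform; a fair shuffle gives ~20,000 each
```

The bias is structural: the sort performs a bounded number of fair-coin comparisons, so every outcome probability is a multiple of 1/2^k and can never equal the exact 1/3 that uniformity requires here.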

Historical Context and Developer Overconfidence

The discussion touches upon the history of shuffling algorithms and highlights the overconfidence of developers who implemented flawed methods, especially when these methods were publicly shared.

  • "In the late 1990s the development platform ASF Software supplied several online poker providers, such as Planet Poker, with card-shuffling algorithms. The platform even posted the algorithm on its website as proof that the game was reliably programmed. What amazes me is the level of overconfidence the developers of the broken algorithm had to post it online." (alexey-salmin)
  • "The original Fisher–Yates shuffle was published in 1938 by Ronald Fisher and Frank Yates in their book Statistical tables for biological, agricultural and medical research. The modern version of the Fisher–Yates shuffle, designed for computer use, was introduced by Richard Durstenfeld in 1964. It is also known as the Knuth shuffle after Donald Knuth who popularized it in Volume Two of The Art of Computer Programming (1969)" (defrost)
  • "Picking a bad algorithm, as they did here in this example from the 1990's is timeless. Good shuffle algorithms existed then, and had existed since the 1930s." (defrost)
  • "It actually has some depth into the multiple problems of the algorithm published by the gaming company." (kristianp)

Real-World Implementations and Anecdotes (MTGO, etc.)

Users share personal experiences and observations from real-world applications, particularly online gaming platforms like Magic: The Gathering Online (MTGO) and online poker, where RNG quality has been a point of suspicion or critical failure. A rough sketch of one commenter's dealing scheme follows the quotes.

  • "This became a hot discussion issue in magic the gathering online (MTGO). I always felt my shuffles in MTGO felt different somehow than my offline paper shuffles. It's hard to know for sure when its all closed source. I know a lot of people were suspicious though." (mmmpetrichor)
  • "When I did it back in the day for a poker site we also had the constraint of never storing the positions of shuffled cards for security reasons. We always sorted the deck and when it came to dealing I developed a shuffle server that got noise from a taped up webcams cmos sensor, when a player needed a card it polled the server which was just outputting random positions from 0 to 51 continuously, if the card position was already dealt, it re requested a new one, entropy was gotten from player decision time and network speed. It was a fun thing to build." (rekttrader)
  • "Online poker died because having 1 other person at the table sharing cards surreptitiously drastically increases the odds in your favor, which ruins cash games. Let alone AI which has mostly solved the game if you have a bot making your decisions. Poker must be played in person, otherwise it’s a dead game" (monero-xmr)

The Difficulty of Proving Algorithm Correctness and Security

A critical point raised is that simply "testing thoroughly" is insufficient for security-critical algorithms. A rigorous mathematical proof of correctness and freedom from bias is often necessary, and ad-hoc methods should be treated with extreme skepticism; a sketch of the standard uniformity argument follows the quotes.

  • "If you want to propose an algorithm, start with trying to prove the security claims of the algorithm. In this case, you'd need to prove that your algorithm creates a permutation that is indistinguishable from random. If you can't prove it, it's highly probable that your algorithm doesn't create a random permutation and someone will figure out how to break it." (kerkeslager)
  • "'Testing it thoroughly' is not adequate. Start by proving the algorithm is correct. If you don't have a proof the algorithm works there's no reason to even start implementing it so there's nothing to test." (kerkeslager)
  • "Jesus Christ, no. If you still believe anything ChatGPT says then security is not the field for you." (kerkeslager)
  • "and end up with a slower algorithm when you’re done." (amluto)