Here's a summary of the themes discussed in the Hacker News conversation regarding the use of booleans:
Booleans vs. Richer Data Types (Enums, Timestamps, etc.)
A central theme is the debate over whether booleans are the best representation for data, or if richer data types like enums, timestamps, or more complex structures offer advantages. Proponents of richer types argue they communicate intent better, are more extensible, and can prevent invalid states.
- "The consideration for choosing types is often to communicate intend to others (and your future self). I think that’s also why code is often broken up into functions, even if the logic does not need to be modular / repeatable: the function signature kind of “summarizes” that bit of code." - Fraterkes
- "Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion. The fact that you only save a binary true/false value tells the person looking at the code a ton about what the program currently is meant to do." - Fraterkes
- "Recording verifications as events handles this trivially, whilst this doesn't work with booleans." - turboponyy
- "Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields (imagine storing say 7 or 8 different kinds of events linked to some model)." - turboponyy
- "Booleans are good toggles and representatives of 2 states like on/off, public/private. But sometimes an association, or datetime, or field presence can give you more data and said data is more useful to know than a separate attribute." - usernamed7
- "The timestamps instead of boolean thing is something good engineers stumble upon pretty reliably." - alphazard
- "It's almost never a boolean. It's almost always an enum. Enums are just better. You can't accidentally pass a strong enum into the wrong parameter. Enums can be extended. There's nothing more depressing than seeing: do_stuff(id, true, true, false, true, false, true);" - jmyeet
- "If you find a Boolean isn't semantically clear, or you need a third variant, then move to an enum." - the__alchemist
- "Booleans beget more booleans. Once you have one or two argument flags, they tend to proliferate, as programmers try to cram more and more modalities into the same function signature. The set of possible inputs grows with 2^N, but usually not all of them are valid combinations. This is a source of bugs. Again, enums / sum-types solve this because you can make the cardinality of the input space precisely equal to the number of valid inputs." - dain
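The cardinality argument dain makes can be illustrated with a small Python sketch. The three flags and the ExportMode enum below are hypothetical names chosen for illustration, not anything taken from the thread.

```python
from enum import Enum
from itertools import product

# Three hypothetical boolean flags admit 2**3 = 8 combinations,
# whether or not every combination is meaningful.
flag_combinations = list(product([False, True], repeat=3))


class ExportMode(Enum):
    """Hypothetical enum that lists only the combinations considered valid."""

    COMPACT = "compact"
    PRETTY = "pretty"
    PRETTY_SORTED = "pretty_sorted"


print(len(flag_combinations))  # 8 representable inputs with booleans
print(len(ExportMode))         # 3 representable inputs with the enum
```

With the enum, the invalid combinations cannot be written down at all, which is the property the enum proponents are pointing at.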
Conversely, others argue that booleans are appropriate for genuinely binary states, and that reaching for richer types without a concrete need is over-engineering that adds complexity and wastes space.
- "A lot of boolean data is representing a temporal event having happened. For example, websites often have you confirm your email. This may be stored as a boolean column, is_confirmed, in the database. It makes a lot of sense." - amelius
- "How about using Booleans for binary things? Is the LED on or off, is the button pressed or not, is the microcontroller pin low or high? Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice." - bsoles
- "Since Lua allows you to change arity of a function without changing call-sites (missing arguments are just nil), they had just added a flag as an argument. And then another flag. And then another." - OskarS (Illustrating problem with implicit boolean flags)
- "This is semantic pedantry. The association true/1/high and false/0/low is well-known and understood." - bsoles (Responding to the idea that physical signals aren't boolean)
- "Booleans are a cornerstone of programming and logic. They're great. I don't know where this 'booleans are bad' idea came from, but it's the opposite of communicating intention clearly in code. That boolean should probably stay a boolean unless there's an actual reason to change it." - crazygringo
- "Turning boolean database values into timestamps is a weird hack that wastes space. Why do you want to record when an email was verified, but not when any other fields that happen to be strings or numbers or blobs were changed? Either implement proper event logging or not, but don't do some weird hack where only booleans get fake-logged but nothing else does." - crazygringo
- "No. All of this is breaking the primary rule of programming: KISS (keep it simple, stupid). Don't add unnecessary complexity. Avoid premature optimization. Tons of things are correctly booleans and should stay that way." - crazygringo
The "Typestate" or "Nullability as State" Debate
A significant portion of the discussion revolves around using the presence or absence of a value (often a timestamp or NULL) to represent a state, as suggested by the article's premise. This is contrasted with explicit booleans.
- "If you have two fields (i.e. UserHasVerified, UserVerificationDate) doesn't waste THAT much more space, and leaves no room for interpretation." - chikinchinpotpi
- "What happens when they get out of sync?" - cratermoon (Raising a concern about the two-field approach)
- "But it does leave room for 'UserHasVerified = false, UserVerificationDate = 2025/08/25' and 'UserHasVerified = true, UserVerificationDate = NULL'." - jerf
- "The better databases can be given a key to force the two fields to match. Most programming languages can be written in such a way that there's no way to separate the two fields and represent the broken states I show above. However the end result of doing that ends up isomorphic to simply having the UserVerificationDate also indicate verification. You just spent more effort to get there. You were probably better off with a comment indicating that 'NULL' means not verified." - jerf
- "The author example, checking if 'Datetime is null' to check if user is authorized or not, is not clear." - mrheosuper
- "Or if you receive Null in Datetime field, is it because the user has not login, or because there is problem when retriving Datetime ?" - mrheosuper
- "And for is_current, I still think a nullable timestamp could be useful there instead of a boolean. You might have a policy to delete old email addresses after they've been inactive for a certain amount of time, for example. But I'll admit that a boolean is fine there too, if you really don't care when the user removed an email from the current list." - kelnos
- "Changing a boolean database field like 'is_confirmed' to a nullable datetime is a simple, cheap hack that records a little bit of information about an event. It's appropriate when you're not sure you care about the event." - skybrian
- "Even more useful is a log of all the changes in the database. This gives you what you want, and it would be automatic for any data you store. So: keep the Boolean, and use a log." - amelius
- "Okay, but for something like SQL this seems like a bad idea." - aydyn (Responding to storing history in SQL)
- "It should be: std::optional (or Optional[datetime] or equivalent in others languages)" - afc
- "Maybe for the DB domain author is talking about but the nice thing about a bool is that it's true or false. I don't have to dig around documentation or look through the code what the convention of converting enum, datetime, etc. to true/false is. 1970/1/1 (I was four years old then, just sayin), -6000 or something else?" - zwieback
- "So the Boolean should be something else + NULL? Now we have another problem ..." - amelius
- "It should be a timestamp of the last time the email was verified. It's a surprisingly useful piece of data to have." - buckle8017
- "The Boolean type is the massive whaste, not the enum. A boolean in c is just a full int. So definitely not a whaste to use an enum which is also an int." - aDyslecticCrow
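A minimal Python sketch of the single nullable field that jerf and afc describe might look like the following. The User class and the email_verified_at field are hypothetical names, and treating None as "not verified" is an assumed convention of exactly the kind jerf says deserves a comment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    # Assumed convention: None means the email has not been verified.
    email_verified_at: Optional[datetime] = None

    @property
    def is_verified(self) -> bool:
        # Derived from the timestamp, so the flag and the date cannot disagree.
        return self.email_verified_at is not None


user = User()
print(user.is_verified)  # False

user.email_verified_at = datetime(2025, 8, 25, tzinfo=timezone.utc)
print(user.is_verified)  # True
```

Because there is only one stored field, the out-of-sync states jerf lists cannot be represented. The sketch does not address mrheosuper's concern about telling "never verified" apart from "failed to load the value".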
Booleans in Function Arguments and API Design
A recurring point of contention is the use of booleans as arguments to functions. Many argue this leads to unreadable code, combinatorial complexity, and difficulties in testing.
- "A piece of advice I read somewhere early in my career was 'a boolean should almost never be an argument to a function'." - OskarS
- "What does those extra arguments do? Who knows, it's impossible without looking at the function definition." - OskarS
- "Basically, what had happened was that the developer had written a function ('serialize()', in this example) and then later discovered that they wanted slightly different behaviour in some cases (maybe pretty printed or something)." - OskarS
- "If the tiny fraction is small enough, just write different functions for it ('serialize()' and 'prettyPrint()'). If it's not feasible to do it, have a good long think about the API design and if you can refactor it nicely." - OskarS
- "Basically, if you have a function takes a boolean in your API, just have two functions instead with descriptive names." - OskarS
- "Yeah right like I’m going to expand this function that takes 10 booleans into 1024 functions. I’m sticking with it. /s" - hamburglar
- "If your function has a McCabe complexity higher than 1024, then boolean arguments are the least of your problems..." - OrderlyTiamat
- "Named arguments don't stop the deeper problem, which is that N booleans have 2^N possible states." - dain
- "I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many." - OskarS
- "Really? That sounds unjustified outside of some specific context. As a general rule I just can't see it. I don't see whats fundamentally wrong with it. Whats the alternative? Multiple static functions with different names corresponding to the flags and code duplication, plus switch statements to select the right function?" - nutjob2
- "You're correct in principle, but I'm saying that 'in practice', boolean arguments are usually feature flag that changes the behavior of the function in some way instead of being some pure value. And that can be really problematic, not least for testing where you now aren't testing a single function, you're testing a combinatorial explosions worth of functions with different feature flags." - OskarS
- "I personally believe very strongly that people shouldn’t use programming languages lacking basic functionalities. Named arguments are a solution to precisely this issue." - StopDisinfo910
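The two remedies raised in this part of the thread, separate descriptively named functions (OskarS) and named arguments (StopDisinfo910), can be sketched in Python roughly as follows. The function names echo OskarS's serialize() and prettyPrint() example, but the bodies and the serialize_with_options helper are assumptions made for illustration.

```python
import json


def serialize(data: dict) -> str:
    """Compact form: one function, one behaviour."""
    return json.dumps(data, separators=(",", ":"))


def pretty_print(data: dict) -> str:
    """Human-readable form, named for what it does."""
    return json.dumps(data, indent=2, sort_keys=True)


def serialize_with_options(data: dict, *, pretty: bool = False) -> str:
    # The bare * makes `pretty` keyword-only, so a call site must spell it out:
    # serialize_with_options(data, pretty=True), never serialize_with_options(data, True).
    return pretty_print(data) if pretty else serialize(data)


print(serialize({"a": 1}))                            # {"a":1}
print(serialize_with_options({"a": 1}, pretty=True))  # same output as pretty_print
```

As dain notes, keyword-only arguments improve readability at the call site but leave the 2^N input space untouched.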
Mutually Exclusive States and Enumerations
The discussion touches on how to represent states that should be mutually exclusive, with enums often being presented as a superior alternative to multiple booleans.
- "One argument that I’m missing in the article is that with an enumerated, states are mutually exclusive, while with several booleans, there could be some limbo state of several bool columns with value true, e.g. is_guest and is_admin, which is an invalid state." - ck45
- "In that case, you set the enumeration up to use separate bit flags for each boolean, e.g., is_guest is the least significant bit, is_admin is the second least significant bit, etc." - cjs_ac
- "look up the typestate pattern." - cratermoon
- "Many have been burned. If you decided to make your boolean a timestamp, and now realize you need a field with 3 states, now what? If you'd kept your boolean, you could convert the field from BOOL to TINYINT without changing any data. [0, 1] becomes [0, 1, 2] easily." - bluGill (arguing for initial flexibility) and crazygringo (counterpoint); hahn-kev disagreed, suggesting synthetic primary keys.
- "Some other examples - what is the rate of verifications per unit of time? How many verification emails do we have to send out? Flipping a boolean when the first of these events occurs without storing the event itself works in special cases, but not in general." - turboponyy
- "Enums are better because you can carve out precisely the state space you want and no more." - dain
- "The same way we don't start with floats rather than ints 'just in case' we need fractional values later on." - crazygringo (analogy for not over-engineering)
- "And usually you use operations to isolate the bit from a status byte or word, which is how it's also stored and accessed in registers anyway. So its still no boolean type despite expressing boolean things. Enums also help keep the state machine clear. {Init, on, off, error} capture a larger part of the program behavior in a clear format than 2-3 binary flags, despite describing the same function. Every new boolean flag is a two state composite state machine hiding edgecases." - aDyslecticCrow
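As a rough Python illustration of the difference between ck45's mutually exclusive states and the bit-flag layout cjs_ac describes, consider the sketch below. Role and Permission are hypothetical names, not types from the thread.

```python
from enum import Enum, Flag, auto


class Role(Enum):
    """Mutually exclusive: a variable holds exactly one of these."""

    GUEST = auto()
    MEMBER = auto()
    ADMIN = auto()


class Permission(Flag):
    """Independent bits: any combination of members is a legal value."""

    READ = auto()   # least significant bit
    WRITE = auto()  # second least significant bit


role = Role.GUEST  # "guest and admin at once" cannot be expressed
perms = Permission.READ | Permission.WRITE

print(role)                       # Role.GUEST
print(Permission.READ in perms)   # True
print(perms.value)                # 3
```

A Flag type still permits every combination of its bits, so it suits options that are independent of one another. It does not by itself rule out a combination such as guest plus admin; the plain Enum does.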
Performance and Space Considerations (Embedded vs. General Software)
The trade-offs between memory usage and expressiveness are discussed, particularly concerning embedded systems where memory is a critical constraint.
- "How about using Booleans for binary things? Is the LED on or off, is the button pressed or not, is the microcontroller pin low or high? Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice." - bsoles
- "I would say for your specific example, you shouldn't have boolean flags for that in the user_emails table, but instead have a primary_email column in the users table, that has a foreign key reference to the user_emails table. That way you can also ensure that the user always has exactly one primary email." - kelnos (suggesting alternative design for clarity)
- "I work at an industrial plant we use boolean datatypes for stateful things like this. For example is Conveyor belt running (1) or stopped (0). Sure we could store the data by logging the start timestamp and a stop timestamp but our data is stored on a time series basis (i.e. in a Timeseries DB, the timestamp is already the primary key for each record) When you are viewing the trend (such a on control room screen) you get a nice square-wave type effect you can easily see when the state changes." - bigger_cheese
- "The Boolean type is the massive whaste, not the enum. A boolean in c is just a full int. So definitely not a whaste to use an enum which is also an int." - aDyslecticCrow
- "If embedded projects start using C standards from the past quarter century, they can join in on type discourse." - devnullbrain
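The single-bit storage that bsoles and aDyslecticCrow discuss usually comes down to masking bits in a status byte. The Python sketch below shows only the masking logic; the mask names are hypothetical, and Python integers are not actually one byte wide, so it says nothing about real memory use on a microcontroller.

```python
# Hypothetical bit positions within one 8-bit status register.
LED_ON = 1 << 0
BUTTON_PRESSED = 1 << 1
PIN_HIGH = 1 << 2


def set_flag(status: int, mask: int) -> int:
    """Turn the masked bit on, keeping the result within 8 bits."""
    return (status | mask) & 0xFF


def clear_flag(status: int, mask: int) -> int:
    """Turn the masked bit off, keeping the result within 8 bits."""
    return status & ~mask & 0xFF


def is_set(status: int, mask: int) -> bool:
    """Isolate one bit and report it as a boolean."""
    return (status & mask) != 0


status = 0
status = set_flag(status, LED_ON)
status = set_flag(status, PIN_HIGH)
print(bin(status))  # 0b101

status = clear_flag(status, LED_ON)
print(is_set(status, LED_ON), is_set(status, PIN_HIGH))  # False True
```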
Durability and Data Modeling in Databases
The conversation frequently returns to database design: how data should be stored durably, and whether "overdesign" is acceptable in a schema that is hard to change later.
- "In the case of a database you often can't fix mistakes so overdesign just in case makes sense. Many have been burned." - bluGill
- "You might persist that value as an optimisation, but if you make it your source of truth, and discard your inputs, you better make sure you never ever ever ever have a bug in deriveValuedCustomer() or else you have lost data permanently" - jbreckmckye (cautioning against derived values as source of truth)
- "You wouldn't likely durably store a boolean in an OLTP store, but your ETL into the OLAP store may capture a boolean to simplify logic for all the systems using the OLAP store to drive decision support. That is, it's an optimization." - taylodl
- "Parcel carrier shipment transaction: ReturnServiceRequested: True/False. I can think of many more of these that are options of some transaction that should be stored and naturally are represented as boolean." - RaftPeople
- "The scope of TFA is data modelling, where it advises to use more descriptive data values, such as enums or happenedAtTimestamp." - fifticon
- "I think the onus is on embedded to vocally differentiate itself from normal software development, not for it to be assumed that general software advice applies to embedded." - devnullbrain
- "I think the advice, in that context [data modelling], feels like decent advice that would prevent a lot of mess." - padjo
- "Typically, booleans should be derived not stored. This generally holds true until you need to store them as an optimization for processing and analysis, like your example with is_valued_customer." - jbreckmckye
- "It's a failing of many type systems of older languages (except Pascal). The best way in many languages for flags is using unsigned integers that are botwise-ORed together." - lelannthran (advocating for bitwise operations)
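The "derive, don't store" position from jbreckmckye, combined with turboponyy's events, could be sketched in Python as below. The event list and the helper names are hypothetical, and a real system would read these rows from a database rather than an in-memory list.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# Hypothetical verification events: (user_id, verified_at).
Event = Tuple[int, datetime]

events: List[Event] = [
    (1, datetime(2025, 8, 25, tzinfo=timezone.utc)),
    (1, datetime(2025, 9, 1, tzinfo=timezone.utc)),
    (2, datetime(2025, 9, 3, tzinfo=timezone.utc)),
]


def is_verified(user_id: int, log: List[Event]) -> bool:
    """The boolean is derived: true if any verification event exists."""
    return any(uid == user_id for uid, _ in log)


def last_verified_at(user_id: int, log: List[Event]) -> Optional[datetime]:
    """The underlying data also answers questions a stored boolean cannot."""
    times = [ts for uid, ts in log if uid == user_id]
    return max(times) if times else None


print(is_verified(1, events))       # True
print(is_verified(3, events))       # False
print(last_verified_at(3, events))  # None
```

Persisting the derived boolean alongside the events remains possible as an optimisation, provided the events stay the source of truth.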