Essential insights from Hacker News discussions

SpikingBrain 7B – More efficient than classic LLMs

This Hacker News discussion centers on the "SpikingBrain Technical Report," which explores brain-inspired large models. The conversation reveals a range of opinions, from skepticism about the novelty of the fundamental approach to acknowledgment of potential future applications and hardware developments.

Skepticism Regarding Novelty and "True" Spiking Behavior

A significant portion of the discussion expresses doubt about whether the "SpikingBrain" approach genuinely represents a departure from existing methods, particularly in terms of its "spiking" nature. Users question if the implementation is merely a repackaging of familiar concepts.

  • Pseudo-Spiking vs. True Spiking: Some users highlight that the current implementation uses "pseudo-spiking," where activations are approximated as spike-like signals at the tensor level, rather than true asynchronous event-driven spiking.
    • "The current implementation adopts pseudo-spiking, where activations are approximated as spike-like signals at the tensor level, rather than true asynchronous event-driven spiking on neuromorphic hardware." (cpldcpu)
  • Similarity to Quantization-Aware Training (QAT): The technique used in the paper is likened to quantization-aware training (QAT), specifically the straight-through estimator (STE) approach commonly used for quantization.
    • "This seems to be very similar to the straight-through-estimator (STE) approach that us usually used for quantization aware training. I may be wrong though." (cpldcpu)
    • cpldcpu further elaborates that spikes are simulated as integer events in the forward pass while a continuous gradient is used for the backward pass, drawing parallels to QAT methods.
  • Sparse Matrix Multiplication in Disguise: Several commenters suggest that the "event-driven spiking computation" is essentially a re-branding of sparse matrix multiplication, a concept that GPU kernels have long been designed to handle efficiently.
    • "To me it sounds like sparse matrix multiplication repackaged as 'event-driven spiking computation', where the spikes are simply the non-zero elements that sparse GPU kernels have always been designed to process." (augment_me)
    • augment_me also states, "The supposedly dynamic/temporal nature of the model seems to be not applied for GPU execution, collapsing it into a single static computation equivalent to just applying a pre-calculated sparsity mask."
  • "Neuromorphic Marketing" and Jargon: There's a strong sentiment that the field often uses biological or brain-inspired jargon to present well-established or incremental ideas as groundbreaking innovations.
    • "Perhaps a bit cynical of me, but it feels like wrapping standard sparse computing and operator fusion in complex, biological jargon..." (augment_me)
    • "SpikingBrain treats 'spikes' as 1-bit quantization stickers. True neural-level sparsity should be input-dependent, time-resolved, and self-organized during learning. If a new circuit diagram cannot 'grow' with every forward pass, then don't blame everyone for treating it as Another Sparse Marketing - oh wait, Neuromorphic Marketing." (RLAIF)
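The two critiques above can be made concrete in a few lines of NumPy. The sketch below is a hypothetical illustration of the mechanism commenters describe, not code from the paper: a pseudo-spiking activation quantizes values to integer spike counts in the forward pass (with a straight-through gradient), and propagating those "events" through a weight matrix turns out to be numerically identical to an ordinary matrix-vector product on the quantized, mostly-zero vector.

```python
import numpy as np

def pseudo_spike(x, t_max=7):
    """Forward pass: quantize activations to integer spike counts in [0, t_max]."""
    return np.round(np.clip(x, 0.0, t_max))

def pseudo_spike_grad(x, t_max=7):
    """Backward pass via straight-through estimator (STE): treat the rounding
    step as the identity, so the gradient is 1 inside the clip range, 0 outside."""
    return ((x > 0.0) & (x < t_max)).astype(float)

rng = np.random.default_rng(0)
x = rng.normal(size=8)        # pre-activations; roughly half will be <= 0
W = rng.normal(size=(4, 8))   # weight matrix

spikes = pseudo_spike(x)      # sparse integer "spike counts"

# "Event-driven" propagation: visit only the non-zero spike counts...
y_events = np.zeros(4)
for i in np.flatnonzero(spikes):
    y_events += W[:, i] * spikes[i]

# ...which collapses into a single static matvec on the quantized vector,
# i.e. a dense product with a pre-computed sparsity mask.
y_dense = W @ spikes

assert np.allclose(y_events, y_dense)
```

This is exactly the pattern augment_me describes: on a GPU, the "temporal" event loop disappears and what remains is standard sparse (or masked dense) linear algebra.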

Historical Context and Criticism of Neuromorphic Computing

A recurring theme is the historical context of neuromorphic computing and a critique of the field's perceived lack of significant breakthroughs.

  • Long History of "Banality": Some users argue that the "brain-inspired" community has a history of repackaging old ideas as novel insights, dating back to the origins of neuromorphic computing.
    • "The 'brain-inspired' community has always been doing this, since Carver Mead introduced the term 'neuromorphic' in the late 1980s. Reselling banalities as a new great insight." (GregarianChild)
  • Decades of Failure: A particularly strong critique is leveled against the field for its perceived lack of success in both AI progress and understanding the brain.
    • "The whole 'brain talk' malarkey goes back way longer. In particular psychology and related subjects, since their origins as a specialty in the 19th century, have heavily used brain-inspired metaphors that were intended to mislead." (GregarianChild)
    • "The community has now multiple decades of failure under it's belt. Not a single success. Failure to make progress in AI and failure to say anything of interest about the brain." (GregarianChild)
    • GregarianChild paraphrases a well-known aphorism to emphasize the point: "To paraphrase a US president: In this world nothing can be said to be certain, except death, taxes and neuromphicists exaggerating."
  • Marketing Hype: The role of marketing and hype in the field is explicitly mentioned as a contributor to the perception of innovation.
    • "Never underestimate the power of marketing." (drob518)

Potential of Time-Domain Encoding and Future Hardware

Despite the criticisms, there are discussions acknowledging the potential of encoding information in the time domain and the importance of hardware developments.

  • Information in the Time Domain: The idea that information can be encoded in the timing or gaps between spikes is recognized as a key differentiator, even if not yet fully realized in current implementations.
    • "I believe the argument is that you can also encode information in the time domain." (cpldcpu)
    • cpldcpu contrasts this with simple numerical representation: "If we just look at spikes as a different numerical representation, then they are clearly inferior... Binary encoding wins 7x in speed and 7/3=2.333x in power efficiency... On the other hand, if we assume that we are able to encode information in the gaps between pulses, then things quickly change."
  • Comparison to Serial Interfaces: The concept of encoding information in the time domain is likened to serial interfaces like PCIe, SATA, and USB, though it's noted that SNNs are more akin to pulse density modulation (PDM).
    • "Also known as a serial interface. They are very successful: PCIe lane, SATA, USB." (dist-epoch)
    • cpldcpu clarifies, "These interfaces use serialized binary encoding. SNNs are more similar to pulse density modulation (PDM), if you are looking for an electronic equivalent."
  • Brain as a Complex System: The brain's multifaceted processing, potentially involving techniques like frequency-division multiplexing, is cited as a basis for complex biological computation.
    • "The brain is doing shit like this." (CuriouslyC)
  • Importance of Non-Nvidia Hardware: A point of interest is the paper's use of non-Nvidia GPUs, namely MetaX. This is seen as significant for future hardware development, particularly in the context of China's technological advancements.
    • "There is something interesting in this post, namely that it's based on non-Nvidia GPUs, in this case MetaX [2]." (GregarianChild)
    • "In a few years China will be completely independent from Nvidia. They have GPU manufacturers that nobody in the west has ever heard of." (imtringued, referencing the MetaX link provided by GregarianChild)
  • Future Deployment on SNN Hardware: The practical implication of such pseudo-spiking models is that they could be deployed on actual neuromorphic hardware if it becomes widely available.
    • "Well, it would still allow to deploy the trained model to SNN hardware, if it existed." (cpldcpu)
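cpldcpu's contrast between binary encoding and encoding in pulse timing can be sketched with a toy pulse-density-modulation (PDM) scheme. This is a hypothetical illustration for intuition only (the function names are invented, and it is not the paper's encoding): a value in [0, 1] is represented by the fraction of time slots that carry a pulse, and decoded by averaging; the 7/3 ≈ 2.33x figure cited above is simply unary versus binary coding cost for a 3-bit value.

```python
import numpy as np

def pdm_encode(value, n_slots=64, seed=0):
    """Encode a value in [0, 1] as a pulse train: on average, a fraction
    `value` of the time slots carry a pulse (a crude PDM-like scheme)."""
    rng = np.random.default_rng(seed)
    return (rng.random(n_slots) < value).astype(int)

def pdm_decode(pulses):
    """Decode by averaging pulse density over the observation window."""
    return pulses.mean()

train = pdm_encode(0.75, n_slots=1000)
print(round(pdm_decode(train), 2))  # close to 0.75; longer windows decode better

# Unary (spike-count) vs binary cost for a 3-bit value v in 0..7:
v = 7
unary_pulses = v        # worst case: 7 pulses to count to 7
binary_bits = 3         # ceil(log2(8)) = 3 binary symbols
print(unary_pulses / binary_bits)  # the 7/3 ratio cpldcpu cites
```

The trade-off the thread circles around: a plain rate/density code like this is strictly worse than binary, and the interesting (largely unrealized) case is when information is carried in the precise gaps between pulses rather than their count.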