Two startups want to replace how AI learns: one just raised $180M, the other is seeking a valuation of up to $1B


Backpropagation has reigned supreme in artificial intelligence for nearly four decades. Introduced in 1986 by David Rumelhart, Geoffrey Hinton, and Ronald Williams, this algorithm enables neural networks to learn by calculating gradients of the loss function and propagating errors backward through the layers to update weights. Its elegance lies in transforming complex optimization into a series of local adjustments, powering breakthroughs from image recognition to natural language processing.
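The mechanics described above can be made concrete with a toy example. The sketch below trains a two-weight scalar network by hand: the forward pass stores an intermediate activation, and the backward pass applies the chain rule to turn the loss into local weight adjustments. All values (network shape, learning rate, target) are illustrative.

```python
import math

# Minimal backpropagation sketch for a two-layer scalar network:
# y_hat = w2 * tanh(w1 * x), loss = (y_hat - y)^2.
# The forward pass keeps the intermediate activation h, which the
# backward pass reuses to compute gradients via the chain rule.

def train_step(w1, w2, x, y, lr=0.1):
    # forward pass (activations retained for the backward pass)
    a = w1 * x
    h = math.tanh(a)
    y_hat = w2 * h
    loss = (y_hat - y) ** 2
    # backward pass: propagate the error from output toward input
    d_yhat = 2 * (y_hat - y)
    d_w2 = d_yhat * h
    d_h = d_yhat * w2
    d_a = d_h * (1 - h ** 2)  # tanh'(a) = 1 - tanh(a)^2
    d_w1 = d_a * x
    return w1 - lr * d_w1, w2 - lr * d_w2, loss

w1, w2 = 0.5, -0.3
losses = []
for _ in range(200):
    w1, w2, loss = train_step(w1, w2, x=1.0, y=0.8)
    losses.append(loss)
```

Each update is local to a weight, but computing it requires the full backward sweep and the stored activation — exactly the costs the startups below aim to remove.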

Yet backpropagation’s dominance masks deep limitations. The method requires storing all intermediate activations from the forward pass to compute gradients during the backward pass, creating a memory explosion that hampers training of massive models on standard hardware. It exacerbates issues like vanishing or exploding gradients in deep networks, demands immense computational resources, and lacks biological plausibility, as real neurons do not seem to perform such precise backward signaling. Geoffrey Hinton himself has voiced doubts, suggesting in recent interviews that backprop may not scale indefinitely for next-generation AI.
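A back-of-envelope calculation shows why storing activations dominates memory. The numbers below (layer count, batch size, context length, hidden size, fp16 storage) are assumed round figures for illustration, and the formula counts only one activation tensor per layer — real training stores several per layer, so this is a lower bound.

```python
# Rough illustration of activation memory in deep-network training.
# Assumes one (batch, seq_len, hidden) tensor kept per layer, stored
# in fp16 (2 bytes per value); real stacks retain several such tensors.

def activation_bytes(layers, batch, seq_len, hidden, bytes_per_val=2):
    return layers * batch * seq_len * hidden * bytes_per_val

# e.g. a 48-layer transformer, batch 8, 4096-token context, hidden size 8192
gib = activation_bytes(48, 8, 4096, 8192) / 2**30  # 24.0 GiB
```

Even this conservative estimate consumes a large fraction of a single accelerator's memory before counting weights, gradients, or optimizer state.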

Enter two bold startups intent on supplanting backprop with innovative alternatives. Their approaches promise reduced memory footprint, enhanced stability, faster convergence, and architectures more aligned with neuroscience, potentially slashing training costs and unlocking larger-scale intelligence.

The first, which just secured $180 million in venture funding, proposes a forward-only learning mechanism inspired by predictive coding and local Hebbian updates. Traditional backprop relies on global error signals, but this startup’s system uses local contrastive predictions during inference-like forward passes. Weights adjust via simple rules like “neurons that fire together wire together,” augmented by auxiliary prediction heads that forecast future states without backward propagation. This eliminates the activation replay memory burden, allowing training on commodity GPUs with 10x memory efficiency. Early benchmarks on vision transformers show comparable accuracy to backprop-trained models but with 4x speedup. The funding round, led by top-tier VCs including Sequoia and a16z, values the company at over $800 million post-money. Founders, veterans from DeepMind and OpenAI, emphasize that their method stabilizes training for models exceeding 100 billion parameters, addressing real-world deployment challenges in edge computing.
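The startup's actual method is not public, but the flavor of a local, forward-only rule can be sketched with a classic example: Oja's variant of Hebbian learning. Each weight updates using only its own pre- and post-synaptic activity — no backward pass, no stored activations. The data distribution and learning rate here are illustrative assumptions, not the company's system.

```python
import random

# Hedged sketch of forward-only, local learning (Oja's rule, a stabilized
# form of "neurons that fire together wire together"). The update for each
# weight depends only on the input x_i and the unit's output y.

random.seed(0)

def oja_step(w, x, lr=0.01):
    # forward pass only: post-synaptic activity is a dot product
    y = sum(wi * xi for wi, xi in zip(w, x))
    # local Hebbian update; Oja's decay term -y^2*w keeps |w| bounded
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

w = [0.1, 0.1]
for _ in range(5000):
    # inputs stretched along the first axis; the rule converges toward
    # the principal direction of the input distribution
    x = [random.gauss(0, 2.0), random.gauss(0, 0.5)]
    w = oja_step(w, x)
```

Note what is absent: no loss gradient, no error propagated through layers, nothing retained between passes — which is where the claimed memory savings would come from.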

The second startup, currently in talks for a round implying up to $1 billion valuation, takes a radically different tack with a biologically motivated “active inference” framework. Drawing from Karl Friston’s free energy principle, it frames learning as minimizing surprise through hierarchical predictions rather than explicit error minimization. Networks maintain internal generative models that anticipate sensory inputs; discrepancies drive sparse, local updates without full backward passes. This results in inherently modular architectures resistant to catastrophic forgetting and capable of continual learning, a holy grail for robotics and autonomous systems. Technical demos reveal 50 percent reductions in fine-tuning epochs for language models while improving out-of-distribution robustness. Backed by angel investors from Meta AI and academic collaborators, the team is negotiating with sovereign wealth funds and strategic partners like NVIDIA. Their pitch deck highlights simulations where the system outperforms backprop on reinforcement learning tasks with 3x fewer samples.
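Again, the company's implementation is private, but the predictive-coding idea at the heart of active inference can be sketched in a few lines: an internal estimate generates a prediction of the input, and the prediction error ("surprise") drives local updates to both the estimate and the generative weight. The one-unit model, learning rates, and constant input stream below are all illustrative assumptions.

```python
# Hedged sketch of predictive-coding style learning (not the startup's
# actual system). A latent estimate mu predicts the input as w * mu;
# the prediction error updates mu (inference) and w (learning) locally,
# with no global backward pass.

def predictive_coding_fit(observations, w=0.5, lr_mu=0.5, lr_w=0.05):
    mu = 0.0                       # internal latent estimate
    for x in observations:
        for _ in range(30):        # inference: settle mu for this input
            err = x - w * mu       # prediction error ("surprise")
            mu += lr_mu * w * err  # local update toward lower error
        w += lr_w * err * mu       # Hebbian-like generative-weight update
    return w, mu

w, mu = predictive_coding_fit([1.0] * 50)
```

Because inference and learning are both driven by locally available error signals, updates stay sparse and modular — the property the startup credits for resistance to catastrophic forgetting.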

Both ventures arrive at a pivotal moment. AI training costs have skyrocketed, with GPT-4's training run reportedly exceeding $100 million. Hardware advances like TPUs and H100 GPUs deliver incremental gains, but an algorithmic sea change could yield order-of-magnitude improvements. If successful, these methods could democratize AI development, enabling startups to rival hyperscalers without supercomputer farms. Skeptics note the risks: unproven scalability on frontier models, potential accuracy trade-offs, and the inertia of heavily optimized backprop frameworks like PyTorch. Nevertheless, endorsements from Hinton and Yann LeCun lend credibility, as both pioneers advocate exploring a post-backprop era.

Industry watchers predict consolidation or acquisitions if prototypes shine on public leaderboards. For now, these startups embody AI’s restless evolution, challenging orthodoxy to forge more efficient paths to artificial general intelligence.


What are your thoughts on this? I’d love to hear about your own experiences in the comments below.