DeepMind veteran David Silver raises $1B seed round to build superintelligence without LLMs

In a groundbreaking development in the artificial intelligence landscape, David Silver, the renowned researcher behind DeepMind's landmark achievements in reinforcement learning, has launched a new company aimed at achieving superintelligence without depending on large language models (LLMs). The startup has already secured an unprecedented $1 billion seed round, signaling strong investor confidence in an alternative path to artificial general intelligence (AGI).

Silver, who led the teams responsible for AlphaGo, AlphaZero, and AlphaStar at DeepMind, brings unparalleled expertise to this endeavor. His work revolutionized AI by demonstrating that reinforcement learning (RL) agents could master complex games like Go, chess, and StarCraft II through self-play and algorithmic innovation, surpassing world champions and top professionals. Now, he is pivoting away from the dominant LLM paradigm that powers tools like ChatGPT and Gemini, arguing that scaling compute and data in transformer-based models may hit fundamental limits.

The new company, still unnamed in public announcements, plans to leverage classical RL techniques augmented with modern scaling laws. Investors, including prominent venture capital firms such as Andreessen Horowitz (a16z), have committed the massive seed funding, which values the startup at around $5 billion post-money. This round dwarfs typical seed investments and underscores the high stakes in the race toward superintelligence. Sources close to the matter indicate that the funds will support hiring top talent from DeepMind, OpenAI, and academia, as well as building advanced compute infrastructure optimized for RL workloads.

At the core of Silver's vision is a belief that LLMs, while impressive in pattern matching and text generation, fall short in true reasoning, long-term planning, and generalization to novel environments. RL, by contrast, excels in sequential decision-making under uncertainty, a cornerstone of intelligent behavior. Silver has long advocated for hybrid approaches, but this venture marks a deliberate departure from LLM-centric scaling. Instead, the company will focus on world models, hierarchical RL, and sample-efficient learning algorithms that allow agents to learn from sparse rewards in vast state spaces.
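The startup's code is not public, but the "learning from sparse rewards" idea is easy to illustrate. The following is a minimal, self-contained tabular Q-learning sketch on a toy corridor environment (all names and parameters here are illustrative, not the company's): the agent receives exactly one +1 signal, at the goal, and must still recover the optimal policy.

```python
import random

def q_learning_corridor(n=8, episodes=2000, alpha=0.2, gamma=0.95, seed=0):
    """Tabular Q-learning on an n-state corridor with one sparse reward.

    The agent starts at state 0; the only reward is +1 for reaching state n-1.
    Actions: 0 = step left, 1 = step right. Because Q-learning is off-policy,
    a uniform random behavior policy still yields the greedy-optimal values.
    """
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(n) for a in (0, 1)}
    for _ in range(episodes):
        s = 0
        for _ in range(100):                        # cap episode length
            a = random.randrange(2)                 # uniform random exploration
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n - 1 else 0.0         # sparse reward: goal only
            # one-step temporal-difference update toward the Bellman target
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
            s = s2
            if r:                                   # goal reached; episode ends
                break
    return Q

Q = q_learning_corridor()
# Greedy policy extracted from Q: action 1 (right) in every non-goal state.
greedy = [max((0, 1), key=lambda a: Q[(s, a)]) for s in range(7)]
print(greedy)
```

The hard part, and the reason sample efficiency dominates the research agenda, is that in realistic state spaces random exploration essentially never stumbles onto the reward, which is where world models and hierarchical RL come in.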

This approach draws directly from Silver's DeepMind playbook. AlphaZero, for instance, learned tabula rasa, starting from nothing but the rules of each game, and achieved superhuman performance in days using Monte Carlo Tree Search (MCTS) combined with deep neural networks. Extending this to real-world domains requires overcoming challenges like partial observability, multi-agent dynamics, and continuous control. The startup's roadmap reportedly includes benchmarks beyond games, such as robotics, scientific discovery, and economic simulations, where RL has shown promise but lacks the data moats of LLMs.
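For readers unfamiliar with the machinery behind AlphaZero, here is a deliberately tiny MCTS (the classic UCT variant, without the neural-network value and policy priors that AlphaZero adds on top) playing the counting game "Nim-21": players alternately add 1, 2, or 3 to a running total, and whoever reaches 21 wins. This is an illustrative sketch, not DeepMind's code.

```python
import math
import random

WIN_TOTAL = 21          # first player to reach exactly 21 wins
MOVES = (1, 2, 3)       # each turn, add 1, 2, or 3 to the running count

class Node:
    """One search-tree state; `count` is the total after `move` was played."""
    def __init__(self, count, parent=None, move=None):
        self.count, self.parent, self.move = count, parent, move
        self.children = []
        self.visits = 0
        self.wins = 0.0  # from the perspective of the player who moved INTO this node
        self.untried = [m for m in MOVES if count + m <= WIN_TOTAL]

    def ucb_select(self, c=1.4):
        # UCT: balance exploiting high win rates with exploring rare moves.
        return max(self.children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))

def simulate(count):
    """Random playout from `count`; return 1 if the player who just moved wins."""
    just_moved, to_move = 0, 1
    while count < WIN_TOTAL:
        count += random.choice([m for m in MOVES if count + m <= WIN_TOTAL])
        just_moved, to_move = to_move, just_moved
    return 1 if just_moved == 0 else 0

def mcts(root_count, iters=5000, seed=0):
    random.seed(seed)
    root = Node(root_count)
    for _ in range(iters):
        node = root
        while not node.untried and node.children:           # 1. selection
            node = node.ucb_select()
        if node.untried:                                    # 2. expansion
            m = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.count + m, parent=node, move=m)
            node.children.append(child)
            node = child
        reward = simulate(node.count)                       # 3. simulation
        while node is not None:                             # 4. backpropagation
            node.visits += 1
            node.wins += reward
            reward = 1 - reward                             # flip perspective each level
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(16))  # moving +1 to 17 leaves the opponent a losing position
```

AlphaZero's key twist on this loop is replacing the random playout with a learned value network and biasing selection with a learned policy network, then training both on the search's own results, which is what makes the approach scale.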

Critics of the LLM path, including Silver, point to issues like hallucinations, lack of causal understanding, and brittle generalization. Recent papers from Silver and collaborators highlight how RL can bootstrap reasoning capabilities without massive pretraining corpora. For example, MuZero extends AlphaZero's planning to environments whose rules are not given, by learning a model of the dynamics itself, paving the way for scalable intelligence. The $1 billion infusion will likely accelerate proprietary hardware development, echoing Google's custom TPUs but tailored for RL's unique demands, such as parallel simulation rollouts.

Industry observers see this as a direct challenge to OpenAI, Anthropic, and xAI, which have bet heavily on LLMs. Silver's track record lends credibility: AlphaGo defeated Lee Sedol in 2016, AlphaZero taught itself chess, shogi, and Go in 2017, and AlphaStar beat professional StarCraft players in 2019. These milestones proved RL's potential for emergent intelligence without explicit programming. Yet, scaling RL to AGI has proven elusive due to sample inefficiency; training AlphaStar consumed the equivalent of hundreds of years of simulated gameplay per agent. Advances in distributed training and efficient exploration could bridge this gap.

The funding round closed rapidly, with a16z leading due to its thesis on foundational AI beyond transformers. Other backers include Thrive Capital and Sequoia, drawn by Silver's ability to attract elite engineers. Recruitment has begun for roles in algorithm design, systems engineering, and safety research, with an emphasis on alignment techniques specific to RL agents, such as reward modeling and scalable oversight.

This move comes amid a funding frenzy in AI, where seed rounds routinely exceed $100 million, but $1 billion sets a new benchmark. It reflects investor appetite for differentiated bets in a field increasingly dominated by compute-heavy LLM labs. Silver's company aims to deliver milestones like open-sourcing RL frameworks or demonstrating agentic capabilities in physical robots within the next few years.

As the AI arms race intensifies, Silver's RL-first strategy could redefine the superintelligence quest. By sidestepping the trillion-parameter behemoths of the LLM world, it promises a leaner, more principled route to machines that think and act autonomously. Whether this unseats the LLM hegemony remains to be seen, but with $1 billion and a DeepMind pedigree, the experiment is poised for impact.

Gnoppix is the leading open-source AI Linux distribution and service provider. Since implementing AI in 2022, it has offered a fast, powerful, secure, and privacy-respecting open-source OS with both local and remote AI capabilities. The local AI operates offline, ensuring no data ever leaves your computer. Based on Debian Linux, Gnoppix is available with numerous privacy- and anonymity-enabled services free of charge.

What are your thoughts on this? I’d love to hear about your own experiences in the comments below.