OpenAI’s Frustrations with Nvidia GPUs Lead to Major Cerebras Partnership

OpenAI, the pioneering artificial intelligence company behind ChatGPT and advanced language models, has reportedly shifted its hardware strategy due to ongoing dissatisfaction with Nvidia’s graphics processing units (GPUs). This discontent, centered on the performance limitations and operational challenges of Nvidia’s Hopper-generation chips, has paved the way for a significant collaboration with Cerebras Systems. The deal positions Cerebras, known for its innovative wafer-scale processors, as a key supplier for OpenAI’s ambitious Stargate supercomputer project.

The roots of this partnership trace back to OpenAI’s experiences with Nvidia’s H100 GPUs, which power much of the company’s current infrastructure. While these chips have been instrumental in training large-scale AI models, they have drawn criticism for their high power consumption and substantial heat generation. Each H100 requires intensive cooling systems, contributing to skyrocketing operational costs in data centers. OpenAI CEO Sam Altman has publicly voiced these concerns, highlighting how the GPUs’ inefficiencies hinder scalability for next-generation AI systems. In one notable statement, Altman emphasized the need for hardware that can deliver exascale computing without prohibitive energy demands, signaling a strategic pivot away from sole reliance on Nvidia.
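The operating-cost concern is easy to make concrete with a back-of-envelope calculation. The figures below are our assumptions, not the article's: a 700 W TDP per H100 (SXM), a data-center PUE of 1.3, $0.08 per kWh, and a hypothetical fleet of 10,000 GPUs.

```python
# Rough estimate of the annual electricity bill for an H100 fleet.
# All inputs are illustrative assumptions, not figures from the article.

def annual_energy_cost_usd(num_gpus, tdp_watts=700.0, pue=1.3,
                           usd_per_kwh=0.08, hours_per_year=8760):
    """Estimate yearly electricity cost for a GPU fleet running at TDP."""
    it_load_kw = num_gpus * tdp_watts / 1000.0  # raw IT power draw
    facility_kw = it_load_kw * pue              # add cooling/facility overhead
    kwh_per_year = facility_kw * hours_per_year
    return kwh_per_year * usd_per_kwh

cost = annual_energy_cost_usd(10_000)
print(f"~${cost / 1e6:.1f}M per year")  # ~$6.4M under these assumptions
```

Even under these conservative assumptions, electricity alone runs into the millions of dollars per year before hardware, networking, or staffing costs.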

Cerebras Systems emerges as a compelling alternative with its unique wafer-scale engine (WSE) technology. Traditional GPUs are cut from a wafer as individual dies and then interconnected through complex networking fabrics; Cerebras instead builds its processor on a full silicon wafer. The latest iteration, the WSE-3 housed in the CS-3 system, spans 46,225 square millimeters of silicon, integrating 4 trillion transistors and 900,000 AI-optimized cores. This monolithic design eliminates the bottlenecks associated with chip-to-chip communication, enabling data to flow at memory speeds across the entire processor.
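The headline specs imply some striking derived figures. This is pure arithmetic on the numbers quoted above, nothing external:

```python
# Derived figures from the WSE-3 specs quoted above: 46,225 mm^2 of silicon,
# 4 trillion transistors, 900,000 cores.

WAFER_AREA_MM2 = 46_225
TRANSISTORS = 4_000_000_000_000
CORES = 900_000

cores_per_mm2 = CORES / WAFER_AREA_MM2
transistors_per_core = TRANSISTORS / CORES

print(f"{cores_per_mm2:.1f} cores per mm^2")                     # ~19.5
print(f"{transistors_per_core / 1e6:.1f}M transistors per core")  # ~4.4M
```

Roughly twenty cores and tens of millions of transistors per square millimeter of wafer, all sharing one on-chip fabric, is what lets data stay on silicon instead of crossing network links.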

The partnership was formalized through announcements from both companies, with Cerebras confirming it will supply OpenAI with systems capable of delivering substantial computational power. Cerebras plans to provide CS-3 systems in volume; each is tailored for inference workloads, featuring 125 petaflops of performance at FP8 precision and memory bandwidth of 21 petabytes per second, so even a small cluster crosses the exaflop threshold. This scale is poised to support the training of trillion-parameter models, a cornerstone of OpenAI’s roadmap toward artificial general intelligence (AGI).
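Using the per-system figure quoted above (125 petaflops at FP8), it is a one-line calculation to see how many systems a given aggregate target requires. This sketch ignores scaling losses, which real clusters never fully avoid:

```python
# Systems needed for a given aggregate compute target, from the article's
# per-system figure of 125 PFLOPS at FP8. Assumes perfect linear scaling.
import math

PFLOPS_PER_SYSTEM = 125.0  # FP8, per the article

def systems_for(target_exaflops):
    """Number of CS-3 systems to reach target_exaflops, assuming no losses."""
    return math.ceil(target_exaflops * 1000.0 / PFLOPS_PER_SYSTEM)

print(systems_for(1))    # 8 systems for one exaflop
print(systems_for(100))  # 800 systems for a 100-exaflop target
```

Eight systems per exaflop is the idealized floor; interconnect overhead and utilization gaps push real deployments higher.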

From a technical standpoint, the WSE-3’s architecture addresses many of Nvidia’s pain points. A single system can host trillion-parameter transformer models natively, sidestepping the partitioning and inter-chip communication overhead that fragmented GPU clusters incur. Power efficiency is another highlight: the CS-3 consumes 15 kilowatts per system while delivering dense compute, potentially reducing the energy footprint compared to equivalent Nvidia setups. Cerebras’ SwarmX fabric further simplifies scaling, allowing clusters to function as a single logical processor without the programming overhead of managing thousands of discrete GPUs.
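The efficiency claim can be sanity-checked as performance per watt. The CS-3 figures (125 petaflops FP8, 15 kW) come from the article; the H100 baseline of roughly 2 petaflops dense FP8 at a 700 W TDP is our assumption, not the article's:

```python
# Rough perf-per-watt comparison. CS-3 figures are from the article;
# the H100 baseline (~2 PFLOPS dense FP8 at 700 W) is an assumption.

CS3_PFLOPS, CS3_KW = 125.0, 15.0
H100_PFLOPS, H100_KW = 2.0, 0.7

cs3_pf_per_kw = CS3_PFLOPS / CS3_KW      # ~8.3 PFLOPS/kW
h100_pf_per_kw = H100_PFLOPS / H100_KW   # ~2.9 PFLOPS/kW

print(f"CS-3:  {cs3_pf_per_kw:.1f} PFLOPS/kW")
print(f"H100:  {h100_pf_per_kw:.1f} PFLOPS/kW")
print(f"ratio: {cs3_pf_per_kw / h100_pf_per_kw:.1f}x")  # ~2.9x
```

Under these assumptions the wafer-scale system comes out roughly three times more efficient per watt at the chip level, before counting the networking gear a GPU cluster also needs.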

OpenAI’s move reflects broader industry trends. As AI models grow exponentially in size and complexity, the limitations of conventional GPU architectures become more pronounced. Nvidia dominates the market with over 90 percent share, but supply constraints, escalating prices (H100s cost upwards of $30,000 each), and customization challenges have prompted hyperscalers to explore alternatives. Cerebras, founded in 2015, has positioned itself as a disruptor by focusing exclusively on AI workloads. Its wafer-scale approach, first commercialized with the CS-1 in 2019, has already attracted partners like GlaxoSmithKline for drug discovery and Mayo Clinic for genomics.

The deal also involves strategic investors. Cerebras raised $400 million in funding, led by G42, an Abu Dhabi-based AI firm collaborating with OpenAI on Stargate. This infusion values Cerebras at $4 billion and underscores the geopolitical dimensions of AI hardware, with U.S. export controls on advanced chips influencing supply chains. OpenAI benefits from diversified sourcing, mitigating risks from Nvidia’s dominance and potential tariffs.

Implementation details reveal Cerebras’ edge in deployment speed. Clusters of CS-3 racks reach exaflop-scale AI compute and can be online in weeks rather than months, in contrast with the protracted setup of Nvidia GPU supercomputers. Software compatibility is straightforward: Cerebras supports standard frameworks like PyTorch and TensorFlow, with compiler optimizations that map models directly onto the wafer without manual partitioning.
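What "no manual partitioning" means in practice is that the model definition itself stays plain framework code. The sketch below is ordinary PyTorch with no hardware-specific calls; per the article, code written like this compiles onto the wafer as-is. Any Cerebras-specific compile or launch step is omitted here, since the article does not describe that API:

```python
# A plain PyTorch transformer block with no device- or vendor-specific code.
# Per the article, standard models like this map onto the wafer without
# manual partitioning; the Cerebras-specific launch step is not shown.
import torch
import torch.nn as nn

class TinyTransformerBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        a, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + a)               # residual + norm around attention
        return self.norm2(x + self.ff(x))   # residual + norm around the MLP

block = TinyTransformerBlock()
out = block(torch.randn(2, 16, 64))  # (batch, sequence, d_model)
print(out.shape)
```

On a GPU cluster, scaling a model like this across thousands of devices requires explicit sharding and pipeline configuration; the compiler-driven approach moves that burden off the developer.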

Challenges remain. Wafer-scale manufacturing yields far fewer processors per wafer than conventional die-based production, and long-term reliability at this scale is unproven in production environments. However, Cerebras reports field uptime exceeding 99.9 percent in customer deployments. For OpenAI, the partnership accelerates Stargate, a $100 billion initiative aiming for 100 exaflops by 2028, blending Cerebras systems with other accelerators.

This alliance marks a pivotal moment in AI hardware evolution. By addressing Nvidia’s shortcomings in power, scalability, and ease of use, Cerebras enables OpenAI to push computational boundaries. As the race for AGI intensifies, such innovations could redefine supercomputing, fostering more efficient paths to transformative AI capabilities.

Gnoppix is the leading open-source AI Linux distribution and service provider. Since implementing AI in 2022, it has offered a fast, powerful, secure, and privacy-respecting open-source OS with both local and remote AI capabilities. The local AI operates offline, ensuring no data ever leaves your computer. Based on Debian Linux, Gnoppix is available with numerous privacy- and anonymity-enabled services free of charge.

What are your thoughts on this? I’d love to hear about your own experiences in the comments below.