To deliver that speed, OpenAI is relying on Cerebras' third-generation wafer-scale megachip, the WSE-3, which contains 4 trillion transistors. Cerebras, a chipmaker that has gained prominence in the AI infrastructure race, recently raised $1 billion at a $23 billion valuation and has signaled plans to pursue an IPO. OpenAI previously said integrating Cerebras into its compute mix would allow its AI systems to "respond much faster," and it now calls Spark the first tangible step in that effort.
In its announcement, OpenAI emphasized that Spark is built specifically to minimize latency within Codex. The company framed the release as the beginning of a dual-mode approach: one mode focused on real-time collaboration for rapid iteration, and another geared toward heavier, long-running tasks that require more extensive reasoning and execution. Cerebras' hardware, OpenAI said, is particularly well suited to workflows that demand extremely low latency.
OpenAI CEO Sam Altman hinted at the launch ahead of the announcement, writing on X that something special was coming for Codex Pro users and that it "sparks joy" for him, a nod to the new model's name.
The launch underscores OpenAI’s growing focus on infrastructure as a competitive lever. Rather than relying solely on general-purpose compute, the company is deepening ties with a specialized hardware partner to support specific product needs. That move comes as AI developers face rising compute demands and intensifying competition across coding tools, chatbots, and enterprise AI platforms.
For now, Spark is limited to Pro-tier users in a research preview. But its debut signals OpenAI's intent to pair model development more tightly with purpose-built hardware, starting with coding workflows where speed and responsiveness are critical.
This analysis is based on reporting from TechCrunch.
Image courtesy of Cerebras.
This article was generated with AI assistance and reviewed for accuracy and quality.