On June 24, 2026, OpenAI and Broadcom announced that they had completed and begun testing Jalapeño, their first jointly developed AI inference chip [1, 2, 3, 4]. The chip is designed for inference on large language models, including OpenAI’s GPT series, and is not intended for training [3, 4, 5].
Jalapeño is intended to reduce OpenAI’s reliance on Nvidia GPUs and to diversify its AI hardware supply, alongside hardware from AMD, Cerebras, and Amazon (Trainium) [1, 3, 4]. Broadcom CEO Hock Tan said, "This chip's performance rivals Nvidia's Blackwell GPU and Google's TPU," and pointed to the companies' plans for long-term collaboration [6].
According to initial tests run since sample testing began on June 24, 2026, the chip reduces costs by about 50% compared with traditional GPUs and delivers significantly better performance per watt [1, 2, 4, 7]. OpenAI’s hardware lead Richard Ho said Jalapeño "can efficiently execute critical workloads close to hardware theoretical limits" [3].
Broadcom and OpenAI completed the chip design in about nine months, accelerated by AI-assisted design optimization techniques [3, 5]. The compute die measures approximately 840 square millimeters [5]. Taiwan Semiconductor Manufacturing Company (TSMC) handles chip fabrication, while Celestica builds server systems integrating Jalapeño for OpenAI [6, 7].
Deployment of Jalapeño chips is planned to start in late 2026 across data centers operated by Microsoft and other OpenAI partners, with a target of 10 gigawatts of total power capacity for AI infrastructure [3, 4, 8]. OpenAI president Greg Brockman described the project as "part of our long-term infrastructure strategy to serve advanced AI efficiently" [9]. Further scaling is expected in 2027 and early 2028, when large-scale production and operation are due to begin [9].