
OpenAI Built Its Own Chip. Here's Why That Bet Is Bigger Than It Looks.
Key Takeaways
- Jalapeno's roughly 50% cost savings over GPUs, cited by Broadcom CEO Hock Tan, is the central business case for OpenAI owning its inference silicon instead of renting general-purpose compute.
- ASICs trade flexibility for efficiency; Jalapeno is tuned specifically for LLM inference, which means it wins on cost at scale but cannot easily adapt if inference patterns change.
- The OpenAI-Broadcom-Celestica partnership splits chip design, silicon implementation, and production systems across three specialists, a model worth studying for anyone building AI infrastructure at scale.
Jalapeno, OpenAI's first custom inference ASIC, developed in partnership with Broadcom, trades flexibility for cost and control at LLM scale.
Nvidia's H100s are to AI infrastructure what the default WordPress theme is to web design: perfectly functional, widely used, and a sign that someone hasn't yet thought very hard about their specific constraints. OpenAI, which has thought quite a lot about its specific constraints, has just announced that it has a different plan. Meet Jalapeno, OpenAI's first custom inference chip, developed with Broadcom and optimized from the ground up for large language model inference.
What Jalapeno Actually Is
Jalapeno is an ASIC, an application-specific integrated circuit, which means it is deliberately not a general-purpose accelerator. Where a GPU is a Swiss Army knife that handles training, inference, graphics, and whatever else you throw at it, an ASIC is a single very sharp blade. According to the official announcement from OpenAI, the chip was designed around OpenAI's deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs. That last part is worth pausing on: this chip is shaped by the same team that decides what models get built and how they get served. The architectural feedback loop is extremely short. Per reporting from DBTA, engineering samples of Jalapeno are already running ML workloads in the lab at production target frequency and power, including GPT-5.3-Codex-Spark, which is either an encouraging sign of real progress or a very specific detail chosen to make investors feel good. Probably both. According to Tom's Hardware, the chip went from concept to tape-out in nine months, a pace the report describes as ultra-fast for a reticle-sized ASIC. OpenAI's own AI models reportedly accelerated chip design and optimization during that window, which means Jalapeno is, in a pleasingly recursive way, an AI product that was partly designed by AI.
The Cost Argument Is the Whole Argument
Custom silicon stories usually come dressed in performance benchmarks and architectural diagrams, but the real argument is almost always economic. According to AI Weekly, Broadcom CEO Hock Tan publicly cited roughly 50% cost savings compared to typical AI GPUs, making that the first concrete cost figure from either company. For a business running inference at the scale OpenAI operates, a 50% reduction in compute cost is not a footnote; it is the entire business case for the nine-month sprint, the multi-year partnership, and the organizational overhead of becoming, in effect, a chip company. The ASIC tradeoff is real and worth naming clearly. General-purpose GPUs earn their premium partly through flexibility: you can retrain, fine-tune, experiment, and pivot workloads without redesigning silicon. An ASIC bets that your inference patterns are stable enough that specialization pays. OpenAI is making that bet explicitly, and AI Weekly notes that Jalapeno is narrowly tuned for LLM inference, trading adaptability for cost and efficiency at scale. If LLM serving patterns shift dramatically, the chip does not shift with them. That is the risk embedded in the savings number.
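To make the shape of that bet concrete, here is a minimal back-of-the-envelope sketch of the break-even arithmetic. It is not OpenAI's or Broadcom's model. The only input taken from the reporting is the roughly 50% savings figure; the function names, the fixed-cost number, and the per-unit GPU cost are hypothetical placeholders in arbitrary units, and the sketch assumes the workload stays stable enough for the specialized chip to remain useful.

```python
# Illustrative only: SAVINGS_FRACTION echoes the roughly 50% figure cited
# above; every other number is a made-up placeholder, not real data.

def breakeven_volume(fixed_cost, gpu_cost_per_unit, savings_fraction):
    """Return the inference volume at which custom silicon pays for itself.

    fixed_cost: one-time cost of designing and producing the chip (hypothetical).
    gpu_cost_per_unit: cost of one unit of inference work on general-purpose GPUs.
    savings_fraction: per-unit saving on the custom chip, e.g. 0.5 for 50%.
    """
    if fixed_cost < 0 or gpu_cost_per_unit <= 0:
        raise ValueError("costs must be positive")
    if not 0 < savings_fraction <= 1:
        raise ValueError("savings_fraction must be in (0, 1]")
    return fixed_cost / (gpu_cost_per_unit * savings_fraction)


def total_cost(volume, cost_per_unit, fixed_cost=0.0):
    """Total spend for a given volume of inference work."""
    return fixed_cost + volume * cost_per_unit


# Hypothetical inputs in arbitrary units.
FIXED_COST = 500.0        # assumed one-time design and production cost
GPU_COST_PER_UNIT = 1.0   # assumed cost per unit of work on rented GPUs
SAVINGS_FRACTION = 0.5    # the roughly 50% figure from the reporting

if __name__ == "__main__":
    volume = breakeven_volume(FIXED_COST, GPU_COST_PER_UNIT, SAVINGS_FRACTION)
    asic_cost_per_unit = GPU_COST_PER_UNIT * (1 - SAVINGS_FRACTION)
    print("break-even volume:", volume)
    print("GPU spend at break-even:", total_cost(volume, GPU_COST_PER_UNIT))
    print("ASIC spend at break-even:",
          total_cost(volume, asic_cost_per_unit, FIXED_COST))
```

Below the break-even volume the flexible option is cheaper; above it the specialized chip wins, and the gap widens with every additional unit of work. The model has no term for a shift in serving patterns that strands the silicon, and that is exactly the risk the savings number leaves out.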
The Partnership Structure Behind the Chip
Jalapeno is not a solo project. According to DBTA, OpenAI designed the chip from scratch around its LLM fundamentals, while Broadcom and Celestica handled chip implementation, board and rack system integration, high-performance networking, and scalable production systems. That division of labor matters: OpenAI brings the model knowledge and inference requirements; Broadcom brings the silicon execution experience; Celestica industrializes the physical stack. It is a clean separation of what each party actually does well, which is rarer in tech partnerships than the press releases imply. The strategic collaboration predates this chip announcement by several months. According to OpenAI's own announcement from October 2025, the companies had already committed to deploying 10 gigawatts of OpenAI-designed AI accelerators as part of a multi-year partnership covering accelerator and network systems for next-generation AI clusters. Jalapeno is the first product materializing from that commitment, not a standalone announcement. It is generation one of a stated multi-generation compute platform, per the Broadcom investor release.
What Builders Should Actually Watch
For anyone thinking about AI infrastructure beyond the immediate project in front of them, the Jalapeno announcement carries a structural signal worth tracking. OpenAI is explicitly betting that owning the inference layer, not just renting GPU time, is how you control cost and latency at scale. That logic does not require you to build your own chip; it does require you to think about where your inference costs go as usage scales, and whether the flexibility premium you are paying for general-purpose hardware is actually buying you anything useful. The 10-gigawatt deployment target from the October 2025 collaboration announcement suggests OpenAI is not treating Jalapeno as a hedge. It is a primary infrastructure direction. For the rest of the AI builder ecosystem, the interesting downstream question is whether Broadcom's experience co-designing this platform eventually produces inference silicon options that aren't exclusive to OpenAI. That has not been announced. But the design patterns, the nine-month tape-out process reportedly accelerated by AI models, and the layered partnership model between model owner, chip designer, and systems integrator are all things worth watching as other large inference operators face the same cost math.