AI Daily Brief — June 24, 2026: OpenAI Builds Its Own Inference Chip, GPT-5 Cracks a Three-Year Immunology Mystery, and Claude Moves Into Slack Full-Time

Image: AI Tech Spectrum daily brief cover for June 24, 2026, headlined "Who owns the stack".

Good morning. Five stories, and the throughline is ownership of the stack: who designs the silicon, who runs the supercomputers, and what all that compute is finally being pointed at. It opens with OpenAI reaching one layer deeper than it ever has: its own chip.

Today's stories

OpenAI and Broadcom unveil Jalapeño, OpenAI's first chip
GPT-5 Pro helps crack a three-year immunology mystery
Anthropic makes Claude a full-time Slack teammate
NVIDIA still powers 81% of the world's fastest supercomputers
IBM ships two dozen working agentic-app examples

1. OpenAI and Broadcom unveil Jalapeño, OpenAI's first chip

Image: summary card for the OpenAI and Broadcom Jalapeño chip.

OpenAI reached a layer deeper into its own stack than ever before. With Broadcom (NASDAQ: AVGO), it unveiled Jalapeño, what it calls its first "Intelligence Processor" — a from-scratch accelerator built specifically for LLM inference rather than a general-purpose chip adapted from older AI workloads. OpenAI says it designed the architecture around its own models, kernels and serving systems, with Broadcom handling silicon implementation and Tomahawk networking and Celestica building the board, rack and system. Engineering samples are already running ML workloads in the lab at production target frequency and power, including a model OpenAI names GPT-5.3-Codex-Spark, and early testing reportedly shows performance per watt "substantially better than current state-of-the-art" — though OpenAI cautions it is still measuring and a full technical report is months away.

Why it matters. The most striking claim is speed of execution: OpenAI says Jalapeño went from initial design to manufacturing tape-out in just nine months — which it believes is the fastest ASIC development cycle ever achieved in high-performance semiconductors — partly because it used its own models to accelerate the design. "The world is moving to a compute-powered economy," said OpenAI president Greg Brockman, framing the chip as part of a long-term, full-stack strategy to make inference cheaper and more abundant. Read against story four below, the target is clear: inference is the softest spot in NVIDIA's near-total grip on AI hardware, and a custom chip is OpenAI's bid to control its own unit economics. What to watch. Real numbers, not adjectives — the promised performance report — plus whether the "initial deployment by the end of 2026" timeline holds and whether gigawatt-scale racks actually ship with Microsoft and other partners on schedule.

2. GPT-5 Pro helps crack a three-year immunology mystery

Image: summary card for the GPT-5 Pro immunology case study.

For a sense of what all that compute is for, OpenAI published an account of GPT-5 in the lab. It detailed how immunologist Derya Unutmaz of The Jackson Laboratory for Genomic Medicine used GPT-5 Pro to untangle a question his lab had been stuck on for three years: how glucose shapes the way T cells develop and specialize. Working through the data, GPT-5 Pro proposed a specific mechanism: that deoxyglucose interferes with construction of a protein called IL-2, which can stop T cells from becoming an inflammatory cell type known as Th17. In a separate and sharper test, Unutmaz asked the model to predict the outcome of a lymphoma experiment he had already run; GPT-5 Pro correctly anticipated the boost in CD8 T cells' ability to kill lymphoma cells.

Why it matters. This is the difference between an AI that summarizes the literature and one that proposes a testable mechanism a domain expert hadn't pinned down, and then survives a held-out prediction. Unutmaz has said the approach could compress parts of the discovery cycle from years toward weeks. What to watch. It is one case, narrated by the model's maker, and the real proof is independent replication at the bench across labs. The honest framing: GPT-5 Pro generated and stress-tested a hypothesis that a human still had to design experiments around. That makes it promising as a research accelerator, not a substitute for wet-lab confirmation.

3. Anthropic makes Claude a full-time Slack teammate

Image: summary card for Anthropic's Claude Tag.

Anthropic moved Claude from a chat box to a standing member of the team. Claude Tag replaces the older Claude-for-Slack app with a persistent agent that lives in a channel: tag @Claude and it responds in-thread, but it also remembers context across conversations, learns from shared documents, and stays present without being re-invited. An "ambient" mode lets it proactively surface information across the organization and chase down threads or tasks that have gone quiet. Crucially for governance, administrators can define separate Claude identities scoped to specific channels, tools and data — a Claude configured for sales won't share its memories or access with one configured for engineering.

Why it matters. A teammate that continuously reads your company's messages and accumulates memory is genuinely useful and a genuine data-governance question at the same time; Anthropic's answer is scoping and isolation rather than one all-knowing bot. It is launching as a research preview for Claude Enterprise and Team customers, with administrators given until August 3, 2026, to opt in and configure before Anthropic auto-migrates existing workspaces. What to watch. Whether "ambient" reads as helpful or noisy in practice, and how the scoped-memory model holds up the first time a team discovers Claude remembered something it arguably shouldn't have.
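For readers who think in code, the scoping idea in this story can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's API or implementation: the `ScopedIdentity` class, its fields, and the channel and tool names are all hypothetical, invented here only to show what "each identity's memory stays within its boundaries" might look like as a data structure.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class ScopedIdentity:
    """Hypothetical admin-defined identity; not Anthropic's actual API."""

    name: str
    channels: FrozenSet[str]  # channels this identity may read and write
    tools: FrozenSet[str]     # tools this identity may call
    memory: List[str] = field(default_factory=list)  # one list per identity, never shared

    def _check(self, channel: str) -> None:
        # Every memory operation is gated on the identity's channel scope.
        if channel not in self.channels:
            raise PermissionError(f"{self.name} is not scoped to {channel}")

    def remember(self, channel: str, note: str) -> None:
        self._check(channel)
        self.memory.append(note)

    def recall(self, channel: str) -> List[str]:
        self._check(channel)
        return list(self.memory)  # return a copy so callers cannot mutate the store

    def can_use(self, tool: str) -> bool:
        return tool in self.tools


# Two illustrative identities with disjoint scopes (names are made up).
sales = ScopedIdentity("claude-sales", frozenset({"#sales"}), frozenset({"crm"}))
eng = ScopedIdentity("claude-eng", frozenset({"#eng"}), frozenset({"ci"}))

sales.remember("#sales", "Q3 renewal at risk")
```

In this sketch the engineering identity has no path to the sales note: its own memory is empty, and asking it to recall from `#sales` raises `PermissionError`. Whether the real product enforces isolation this way is not something the announcement specifies.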

4. NVIDIA still powers 81% of the world's fastest supercomputers

For the scoreboard against which OpenAI's chip should be read: at ISC High Performance 2026 in Hamburg, the new TOP500 list showed NVIDIA technology powering more than 400 of the world's 500 fastest supercomputers — 81% of the list, a gain of 17 systems since the last ranking, with roughly nine of every ten systems new to the list built on NVIDIA. The detail underneath is just as lopsided: 376 of the 500 are interconnected with NVIDIA networking, 26 systems now use the NVIDIA Grace CPU (up eight), and on the energy-efficiency Green500, NVIDIA GPUs run the top eight systems and nine of the top ten — led by KAIROS, which tops the list on a single Grace Hopper Superchip.

Why it matters. This is the wall OpenAI's Jalapeño is built against. NVIDIA's dominance is strongest in training and HPC, where its GPUs and networking are effectively the default; inference — exactly what Jalapeño targets — is the part of the stack where custom ASICs have the most room to chip away. What to watch. Whether the wave of custom inference silicon from OpenAI, the hyperscalers and others starts showing up as a dent in NVIDIA's share at the inference layer specifically, even as its supercomputer grip holds.
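The shares quoted in this story are easy to sanity-check. The short sketch below only redoes the arithmetic on the figures reported above; the variable names are illustrative, and because the 81% figure is rounded, the system count it implies is approximate.

```python
# Figures as quoted in the story; names are illustrative only.
LIST_SIZE = 500
NVIDIA_SHARE = 0.81       # "81% of the list" (a rounded figure)
NVIDIA_NETWORKED = 376    # systems interconnected with NVIDIA networking
GRACE_CPU_SYSTEMS = 26    # systems using the Grace CPU
GRACE_CPU_GAIN = 8        # "up eight" since the last ranking


def share(count: int, total: int = LIST_SIZE) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Roughly 405 systems, consistent with "more than 400".
nvidia_systems = round(NVIDIA_SHARE * LIST_SIZE)

# Networking reach and Grace CPU footprint as shares of the full list.
networking_share = share(NVIDIA_NETWORKED)
grace_share = share(GRACE_CPU_SYSTEMS)

# Grace CPU systems on the previous list, implied by the reported gain.
grace_previous = GRACE_CPU_SYSTEMS - GRACE_CPU_GAIN

print(nvidia_systems, networking_share, grace_share, grace_previous)
```

The numbers hang together: about 405 systems overall, networking in roughly three quarters of the list, and a Grace CPU footprint that is still small (about 5%) but grew from 18 systems to 26.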

5. IBM ships two dozen working agentic-app examples

While the labs build the silicon and the teammates, the open-source middle layer is racing to make agentic apps reproducible. IBM Research published roughly two dozen working example apps on Hugging Face for CUGA, its open-source "Configurable Generalist Agent" framework (Apache 2.0). CUGA handles multi-step tasks through structured planning, a dynamic task ledger and failure recovery, integrates with OpenAPI, MCP servers and LangChain, and has posted competitive results on agent benchmarks such as AppWorld and WebArena. The new release is less about a fresh model than about something builders have been short on: concrete, runnable examples on a lightweight harness rather than a framework you have to wire up yourself.

Why it matters. A persistent teammate like the one in story three assumes an agent that runs for a long time with real tools; CUGA is the open answer to "how do I actually build one of those without a frontier lab's infrastructure?" Working examples lower the barrier from "interesting framework" to "I shipped something this afternoon." What to watch. Whether open scaffolding like CUGA becomes the default way smaller teams build agents, or gets absorbed by the labs' own agent SDKs as those mature.
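To make "task ledger" and "failure recovery" concrete, here is a generic sketch of the pattern. It is an assumption-laden illustration, not CUGA's code or API: the `Step` class, the `run_ledger` function and the retry policy are hypothetical, and a real framework would plan the steps dynamically rather than take a fixed list.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One ledger entry (hypothetical structure, not CUGA's)."""

    name: str
    action: Callable[[], str]
    status: str = "pending"
    attempts: int = 0
    result: Optional[str] = None


def run_ledger(steps: List[Step], max_attempts: int = 2) -> List[Step]:
    """Run steps in order, retrying each failed step up to max_attempts times."""
    for step in steps:
        while step.attempts < max_attempts and step.status != "done":
            step.attempts += 1
            try:
                step.result = step.action()
                step.status = "done"
            except Exception as exc:
                # Record the failure in the ledger so a retry or a human can see it.
                step.status = "failed"
                step.result = str(exc)
        if step.status != "done":
            break  # stop here; later steps stay "pending"
    return steps


# A made-up tool call that times out once, then succeeds.
calls = {"n": 0}


def flaky_lookup() -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("timeout")
    return "42 open tickets"


ledger = run_ledger([
    Step("lookup", flaky_lookup),
    Step("summarize", lambda: "summary written"),
])
```

Here the first step fails once, is retried, and succeeds on its second attempt, after which the second step runs normally. The ledger keeps the attempt counts and results, which is the part that makes long-running agent runs inspectable.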

What to take from today

One current runs under all five: ownership of the stack is becoming the whole game. OpenAI is reaching down into its own silicon to control the cost of inference; NVIDIA still owns the supercomputer floor that everyone else rents; and the compute is being pointed at harder work — a genuine scientific hypothesis, a teammate that lives in your Slack, agents that run themselves for hours. The decision framework that keeps paying off is the same as ever, and it scales down to your own choices: when a layer of the stack consolidates, ask who controls it, what it now touches, and whether the savings or the capability actually reach you — before the press release tells you they do.

Tomorrow's brief lands by 15:00 UTC. If you'd rather read this in your inbox once a week — just the stories that actually matter — subscribe here.