ONE — A NUMBER THAT SUMMARIZES THE DAY

200 milliseconds — the latency budget Mira Murati picked for her first product at Thinking Machines Lab, Interaction Models, released as a research preview this morning. It is the threshold below which human conversation stops feeling like turn-taking and starts feeling like presence. Tomasz Tunguz named the same number at the other end of the stack on Sunday in Localmaxxing, explaining why half of his daily work now runs on a 35B local model at 2x the cloud Ferrari's speed. Latency is the binding variable across every layer of the suit. Find the slowest piece. Optimize there. I am Iron Man.

THREE — ACTIONS TO TAKE TODAY

Audit the slowest piece of your AI workflow before you leave the office today. Open a doc. List your three most important AI-assisted workflows. For each, identify the slowest step — how you talk to the system, how agents execute, which model handles which call, how the output lands in front of a human eyeball. That step is your binding constraint. Every dollar you spend optimizing anywhere else is capped by the throughput of the slowest layer. Jeff Wilke told Bezos the same thing about Amazon's idea pipeline in 1999: "you have to release the work at the right rate that the organization can accept it." The slowest layer rate-limits everything else. Amdahl's Law has not been repealed.
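The arithmetic behind that constraint is Amdahl's Law itself. A minimal sketch — the workflow fractions and speedup factors below are made-up numbers for illustration, not measurements:

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of a workflow is
    accelerated by factor s and the rest is left untouched."""
    return 1.0 / ((1.0 - p) + p / s)

# Make 70% of the workflow 10x faster; the untouched 30%
# (the slowest layer) caps the overall gain.
print(round(amdahl_speedup(0.70, 10), 2))  # → 2.7

# Attack the binding constraint instead: 95% accelerated.
print(round(amdahl_speedup(0.95, 10), 2))  # → 6.9
```

Same 10x engineering effort, wildly different payoff — the only variable that moved is which layer you pointed it at.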

Run the Tunguz model-routing audit on your team's actual AI volume this week. Tomasz Tunguz spent five weeks routing his workload to a 35B-parameter Qwen on a MacBook Pro M5 and benchmarked against Claude Opus 4.5. Half of his 1,471 tasks ran successfully on local hardware at roughly twice the speed. Email, scheduling, summarization, admin — 41.8% of his volume — never needed the frontier model. Your team's actual ratio is somewhere between 30 and 70 percent. You will not know until you measure it. The cost savings are real but secondary. The latency improvement is the compounding advantage — every workflow that drops from a six-second cloud roundtrip to a two-second local response gets used three times more often, because friction is fatal to adoption.
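The audit itself is a tally, not a project. A sketch of the measurement, assuming you log each task with a category and whether it actually needed the frontier model — the log entries and categories below are hypothetical, not Tunguz's data:

```python
from collections import Counter

# Hypothetical task log: (category, needed_frontier_model)
task_log = [
    ("email", False), ("scheduling", False), ("summarization", False),
    ("code_review", True), ("architecture", True), ("admin", False),
    ("research", True), ("email", False),
]

# What fraction of volume is routable to the local model?
local_ok = sum(1 for _, frontier in task_log if not frontier)
ratio = local_ok / len(task_log)
print(f"routable to local model: {ratio:.0%}")

# Which categories dominate the routable volume?
by_category = Counter(cat for cat, frontier in task_log if not frontier)
print(by_category.most_common(3))
```

A week of real volume through a tally like this gives you your own 41.8% — and the routing policy writes itself from the category breakdown.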

Stop deploying agents before you stand up identity and orchestration. Aligned News named it tonight: agent sprawl is the top new enterprise concern. Palo Alto Networks launched Idira today — the first identity platform built for AI agents, not humans. UiPath opened enterprise orchestration to any coding agent. Anthropic's Claude Platform went GA on AWS with native IAM on Sunday. The infrastructure that converts ten thousand individual operators into a coherent enterprise is now available to buy. Pause the third-tier agent rollouts. Stand up the identity and orchestration layer first. The agents you deploy on top of it next quarter will compound. The agents you already deployed without it are an audit risk you will have to triage anyway.

Today's actions are squarely the work we've been doing with clients this quarter — auditing the slowest layer of the suit, building the model-routing policy, standing up identity and orchestration before the next agent ships. If your team is staring at the four-layer stack and is not sure which piece to optimize first, that is the conversation Outsider Labs Advisory is built for.

FIVE — STORIES TO KEEP YOU INFORMED

Tuesday, May 12

1. Mira Murati shipped the JARVIS console — and quietly told the entire agentic-first race they're solving the wrong frontier. Thinking Machines Lab released Interaction Models this morning. 200-millisecond streaming voice/video/text, sub-perceptual latency, a fast loop that keeps talking while a second background model handles the slow reasoning. While every other lab is racing to build longer-horizon autonomous agents, Murati just put a flag in the half of the operator's day everyone else is treating as a wait state. Real-time, multimodal, bidirectional — not faster dictation. (Full analysis below.)

2. Sam Altman gave Bill Clinton's answer to Steve Rogers' question. Steven Molo, Musk's lead counsel, opened cross-examination in Oakland this morning with "Are you completely trustworthy?" — and then read into the record the Sutskever, Brockman, Murati, and Shear testimony already on the docket. Asked directly whether he always tells the truth, Altman: "I'm sure there are some times in my life when I did not." Concede the categorical impossibility, refuse the specific instance. The architecture at the top of the trade is now on the public record. (Full analysis below.)

3. Cognition's Devin is at $445M ARR and doubling every eight weeks. The vampire gap is now a compounding curve. Marc Andreessen called them "AI vampires" this week — the engineers who absorbed AI tools instead of being displaced by them. Devin priced them. Eight-week doubling for twelve months is roughly 90x — the real number will bend, but even half is the kind of gap that makes a non-vampire seat at 2024 comp look like a structural liability. U.S. Army and Goldman Sachs are the named customers.

4. Cerebras files for a $35B IPO. Pinecone is shopping itself at $2B after Notion walked. The moats moved up the stack. AI silicon gets its first major non-Nvidia public market test. Vector search just commoditized — Aligned's editorial line: "vector search became a commodity feature in every major database." The next moat is governance, identity, orchestration. The Idira/UiPath/AWS-IAM stack is where durable value now sits.

5. Nick Bostrom called for "the great retirement of humanity." The founder of AI existential risk just published a working paper titled Optimal Timing for Superintelligence arguing a small chance of human extinction is worth the risk because AI might relieve humanity of "its universal death sentence." He calls himself "a fretful optimist." The man who wrote the doom book changed sides. (Full analysis below.)

We're at the Homebrew Computer Club, May 1975. The components are on the bench. Nobody has shipped the integrated product yet.

SEVEN — SIGNAL / NOISE

Menlo Park, May 12

A garage in Menlo Park, March 1975: the first meeting of the Homebrew Computer Club. Within a year, Steve Wozniak will walk in carrying a hand-soldered prototype that will eventually be called the Apple I, and Bill Gates will write his Open Letter to Hobbyists complaining about software piracy on the Altair. Tonight, the Altair 8800 — a $397 kit computer with toggle switches and no monitor — sits on a folding table next to a hot soldering iron. None of the components on that table have been assembled into an integrated product yet. The hobbyists who figured out the assembly first became Apple and Microsoft. The ones who did not became footnotes in tech-press archives nobody now reads.

Tuesday, May 12, 2026. The components are on the bench again. Mira Murati shipped a 200-millisecond input layer this morning — voice, video, and text streaming below the threshold of human perception, the JARVIS console as a commercial product. Palo Alto Networks launched Idira, the first identity layer designed for AI agents rather than for the humans who used to walk into the office and badge in. UiPath opened enterprise orchestration to any coding agent. Tomasz Tunguz published five weeks of empirical data showing that half of his daily AI workload runs successfully on a 35B local model at twice the cloud Ferrari's speed. Anthropic's Claude went generally available on AWS with native IAM on Sunday. The components are on the bench. Nobody has shipped the integrated product yet.

That is the operator's situation in Q2 2026. Last night's Lucy piece on Omar Ismail at Ascend was the existence proof — one COO, $20M ARR into $27.6M ARR in six months, zero growth hires, the entire growth engine built on a personal stack of Claude Code slash commands. He assembled his version of the suit on his own workbench and walked out of the cave with a 38% ARR increase as the receipt. Microsoft's 2026 Work Trend Index put the population numbers next to the existence proof: 19% in the "Frontier" tier (suit-ready operators), 10% "Blocked" (suit-ready operators stuck in companies that cannot use them), half "Emergent" middle, the rest catching up.

Jeff Wilke's line to Bezos from the early Amazon years, surfacing again on X this week, says the operating principle out loud: "you have to release the work at the right rate that the organization can accept it." The bottleneck is not the idea generator. It is the slowest layer's absorption rate. In the suit metaphor: find the slowest layer, optimize there, repeat. In the Homebrew metaphor: the components are all here; the operator who assembles them in the right order first ships the integrated product. Same insight. Different image.

There is a contrarian read of agent sprawl that nobody else is naming, and it lands here. If interoperability across organizations is the binding constraint on AI deployment — if every node where one person's agents touch another person's is a potential point of failure — then the optimal-size organization in 2026 is the one with the fewest interop boundaries. A solopreneur has zero. Two people have one. Ten have forty-five. A thousand have roughly five hundred thousand. The boundary count scales with the square of the operator count. Walmart is fine — physical dependencies are still moats — but the forty-person marketing agency, the twenty-person consulting practice, and the fifteen-person software shop are under structural pressure they have not yet priced.
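The scaling claim is just the handshake formula — every pair of operators is a potential interop boundary, so the count is n choose 2:

```python
def interop_boundaries(n):
    """Pairwise interop boundaries among n operators: n choose 2."""
    return n * (n - 1) // 2

for n in (1, 2, 10, 40, 1000):
    print(n, interop_boundaries(n))
# 1 → 0, 2 → 1, 10 → 45, 40 → 780, 1000 → 499500
```

The forty-person shop already carries 780 boundaries — quadratic growth is why the pressure hits mid-size firms long before it hits the solopreneur.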

Cognition's Devin at $445M ARR with usage doubling every eight weeks is the price tag on the compounding curve. Cyan Banister's mycologist on a podcast this week — the one who walked through a forest, noticed bee behavior, and developed a Reishi-and-Chaga extract that boosted bee immunity, without ever becoming an apiologist — is the polymath proof point. Her line lifted verbatim: "the polymaths win, not because they know everything, but because they have the tools to connect what specialists cannot." That is the operator with the four-layer suit, in one sentence.

The hobbyists at Menlo Park in 1975 had two options. Assemble the components into a usable rig, or stay in the audience while somebody else did. The operators reading this newsletter in May 2026 have the same two options. The components are on the bench. The integrated product has not shipped. Whoever assembles them first becomes the Apple of the next decade — or, at minimum, gets twelve more months of life out of their existing business before someone else ships the integrated product and absorbs the function. Either way: stop watching. Start assembling.

At COAI today: I Am Iron Man, on Murati's 200ms input layer, Wilke's release-rate doctrine, Tunguz's Camry-and-Ferrari, the four-layer suit, Cyan Banister's accidental polymath, Sam Altman's Clinton-grade deposition, Nick Bostrom's "Big Retirement" pivot, and what happens when you take the suit off. Full analysis at getcoai.com.

— Harry and Anthony
