Skip to content

Sakana AI releases Fugu Ultra, an orchestration model that routes across frontier LLMs to match top benchmark scores

· by Pondero Newsdesk

The short version

Sakana AI launched Fugu and Fugu Ultra on June 22, 2026, delivering multi-model orchestration through a single OpenAI-compatible API and pitching it as a hedge against export-control disruptions like the one that cut off Anthropic's Fable and Mythos models.

Sakana AI releases Fugu Ultra, an orchestration model that routes across frontier LLMs to match top benchmark scores

Sakana AI shipped two models on June 22, 2026 that reframe what "a model" means: instead of bigger weights, Fugu and Fugu Ultra are trained to orchestrate a swappable pool of frontier LLMs through a single API call. The timing is deliberate. The company cited the recent export-control restrictions on Anthropic's Fable and Mythos models as proof that single-vendor dependency is "a material vulnerability."

What

Fugu takes a user request through one OpenAI-compatible endpoint and decides whether to answer directly or assemble a team of specialist models. That pool currently includes GPT-5.5, Claude Opus 4.8, and Gemini 3.1 Pro, per Sakana's announcement. Fugu Ultra, the higher-tier variant, coordinates a deeper pool for hard, multi-step tasks where accuracy matters more than latency.

The architecture draws on two ICLR 2026 papers from Sakana, Trinity and Conductor, which established learned orchestration as a research direction. Fugu is the productized version.

Benchmark scores, per Sakana, show Fugu Ultra scoring 93.2 on LiveCodeBench v6, 50.0 on Humanity's Last Exam, and 86.6 on CharXiv Reasoning per MarkTechPost's coverage of the launch. Sakana claims Fugu Ultra topped 10 of 11 standard benchmarks tested, though the specific benchmark table was released as an image in the technical report rather than in a structured format that allows third-party verification.

The agent pool is swappable by design. Fugu picks models internally; users see one response. For teams with data compliance requirements, the standard Fugu tier lets operators opt specific sub-agents out of the pool.

Why it matters

The export-control angle is the real news here. Sakana is not just selling performance. It is selling continuity. When Anthropic restricted Fable and Mythos access, organizations that had built workflows on those specific models had no fallback. Fugu's architecture makes the underlying model layer replaceable at runtime, so a provider restriction triggers a reroute rather than an outage.

For AI-tool operators evaluating how to structure their LLM dependencies in 2026, Fugu is the first generally available product framing vendor diversification as a first-class feature rather than an integration footnote. That is a distinct positioning move. The cost is opacity: users cannot inspect which sub-agents handled their query, which matters for regulated use cases that require audit trails.

The product entered early access with close to 500 beta users, per Sakana's announcement. Early-user feedback shaped the launch version.

Context

Sakana AI is a Tokyo-based lab founded in 2023 by former Google Brain researchers, including David Ha. The company has concentrated on evolutionary and collective intelligence approaches since its founding. Fugu is its first product aimed at production API consumers rather than researchers.

The orchestration-as-a-model framing is notably different from common multi-agent frameworks like LangGraph or Autogen, where the developer writes orchestration logic. Fugu's distinguishing claim is that the orchestration itself is learned: the model was trained to route, not programmed to.

Community reception was mixed in the first 24 hours after launch. Technical skeptics questioned whether the system is a router with extra steps or a genuine orchestrator with emergent coordination behavior.

What to watch next

Sakana said Fugu will add its own in-house models to the agent pool as they ship, which would reduce its current dependence on third-party provider APIs. The first native Sakana model entering the pool would be a meaningful milestone: it would let Sakana offer frontier-level performance on tasks where no single third-party model has been restricted, without needing access to Fable, Mythos, or any other model that may face future export controls. Enterprise pricing tiers have not been announced; the product is currently in early access.

Sources