When was Claude Opus 4.8 released?

Anthropic released Claude Opus 4.8 on May 28, 2026. It is available everywhere at launch, including claude.ai, Claude Code, the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.

What is the biggest improvement in Claude Opus 4.8?

Honesty. Anthropic reports Opus 4.8 is roughly four times less likely than Opus 4.7 to let flaws in its own code pass unremarked, and it is more likely to flag uncertainty instead of making unsupported claims. It also improves long-horizon agentic coding, effort calibration, and tool triggering.

Does Claude Opus 4.8 cost more than Opus 4.7?

No. Pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens for regular usage, and $10 per million input / $50 per million output for fast mode.

What is the Claude Opus 4.8 API model ID?

The model ID is claude-opus-4-8. It supports a 1M-token context window by default on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry), 128k max output tokens, and adaptive thinking with the effort parameter.

Claude Opus 4.8: What Changed & Why It Matters

On May 28, 2026, Anthropic released Claude Opus 4.8, the newest version of its flagship Opus model. The headline is refreshingly simple: it's better across coding, agentic tasks, reasoning, and knowledge work — and it costs exactly the same as Opus 4.7. If you build or run AI agents, this is the low-risk upgrade you don't have to think hard about.

We build production AI agents for a living, so we read these releases through one lens: does it make agents more reliable on real work? Here's what changed in Opus 4.8, and why it matters.

The short version

Better, at the same price. Improvements across coding, agentic tasks, reasoning, and professional work — pricing unchanged from Opus 4.7.
Honesty is the marquee feature. Opus 4.8 is roughly 4× less likely than its predecessor to let flaws in its own code pass unremarked.
Built for long-running work. Better long-context handling, fewer compactions, and cleaner recovery when compaction happens.
New surface features: effort control on claude.ai, "dynamic workflows" in Claude Code, and a faster, cheaper fast mode.

Honesty as a measurable feature, not a slogan

The improvement Anthropic leads with isn't a benchmark number — it's honesty. A common failure mode for AI models is jumping to conclusions: confidently claiming progress when the evidence is thin, or quietly shipping code with bugs and calling it done.

Anthropic reports that Opus 4.8 is about four times less likely than Opus 4.7 to let flaws in code it wrote pass unremarked, and that it's more inclined to flag its own uncertainties instead of papering over them. Early testers describe a model that asks the right questions, catches its own mistakes, and pushes back when a plan isn't sound before making big changes.

For anyone handing real work to an agent, "tells you when it's unsure" is worth more than another point on a leaderboard. It directly changes how much you can trust an unattended agent's claim that a task is done.

Built for long-horizon, agentic work

Most of Opus 4.8's targeted improvements are about staying coherent over long, multi-step tasks — exactly where agents tend to drift:

Long-horizon agentic coding — better long-context handling, fewer compactions, and better recovery after a compaction so long traces stay on task.
Reasoning effort calibration — more reliable behavior at each effort level across different domains.
Tool triggering — fewer cases of skipping a tool call the task actually required, a pain point some users hit on 4.7.

Adaptive thinking remains the only thinking mode, and it's smarter about when to engage: Opus 4.8 reasons on hard, multi-step problems and answers directly on simple lookups — wasting fewer thinking tokens on mixed workloads at the same effort level.

What launched alongside the model

Opus 4.8 didn't ship alone. A few things landed the same day:

Effort control

On claude.ai and Cowork, there's now a control next to the model selector for how much effort Claude spends on a response. Higher effort means deeper, more frequent thinking and better answers; lower effort means faster replies that burn through rate limits more slowly. It's available on all plans, and Opus 4.8 defaults to high — with "extra" (xhigh in Claude Code) and "max" available for harder, long-running tasks.

Dynamic workflows in Claude Code

A research-preview feature that lets Claude plan a large task, then run hundreds of parallel subagents in a single session, verify its outputs, and report back. The flagship example: codebase-scale migrations across hundreds of thousands of lines, from kickoff to merge, with the existing test suite as the bar. Available in Claude Code for Enterprise, Team, and Max plans.

Faster, cheaper fast mode

Fast mode runs Opus 4.8 at up to 2.5× the output speed, and it's now three times cheaper than it was on previous models.

For developers: the API specifics

If you build on the Claude API, the model ID is claude-opus-4-8. The details that matter:

1M-token context window by default on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry); 128k max output tokens.
Mid-conversation system messages — the Messages API now accepts role: "system" entries inside the messages array. You can update instructions, permissions, token budgets, or environment context mid-task without breaking the prompt cache or routing through a user turn. No beta header required.
Lower prompt-cache minimum — now 1,024 tokens, so shorter prompts that couldn't be cached on 4.7 can be cached with no code changes.
Refusal stop details — the stop_details object on refusals is now publicly documented, making it easier to categorize declined requests and route users to the right next step.

Constraints carried over from Opus 4.7 (so existing code needs no changes): temperature, top_p, and top_k still return a 400 if set to non-default values, and extended thinking budgets aren't supported — use adaptive thinking plus the effort parameter instead.

Pricing

Unchanged from Opus 4.7:

Mode	Input (per 1M tokens)	Output (per 1M tokens)
Regular	$5	$25
Fast mode	$10	$50

Opus 4.8 is available everywhere today — claude.ai, Claude Code, the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.

What's next

Anthropic framed Opus 4.8 as a "modest but tangible" step up, and signposted two directions: cheaper models with much of Opus's capability, and a new, higher-intelligence class above Opus. Under Project Glasswing, a small number of organizations are already using Claude Mythos Preview for cybersecurity work; Anthropic says Mythos-class models need stronger cyber safeguards before a general release, which it expects "in the coming weeks."

Should you upgrade?

If you're on Opus 4.7, this is a low-risk move: same price, same API contract, better behavior — especially for long-running, agentic, or coding-heavy workloads. The honesty improvements alone change how much you can trust an unattended agent's "done." Point your model ID at claude-opus-4-8 and go.

How CleverHub thinks about model upgrades

We build custom AI agents, voice agents, and agentic workflows that are model-agnostic by design — so a release like Opus 4.8 is a config change and a round of evals, not a rebuild. If you want agents that improve as the models do, without rewiring everything each release, let's build it together.

Claude Opus 4.8 Is Here: What Changed and Why It Matters for Agents

The short version

Honesty as a measurable feature, not a slogan

Built for long-horizon, agentic work

What launched alongside the model

Effort control

Dynamic workflows in Claude Code

Faster, cheaper fast mode

For developers: the API specifics

Pricing

What's next

Should you upgrade?

How CleverHub thinks about model upgrades

FAQs

Ready to build your AI agent?

Related Articles

Google I/O 2026: What the "Agentic Era" Means for Your Business

The Future of AI Agents: What 2026 and Beyond Looks Like

How to Scope an AI Agent Project in 3 Weeks