Opus 4.7 is not generally a worse model than Opus 4.6, but there is a real downgrade: with Opus 4.7, the control over the thinking budget is now fully owned by Anthropic. This change matters in a way…
Let Claude dynamically determine when and how much to use extended thinking with adaptive thinking mode.
Our coding agent went from Top 30 to Top 5 on Terminal Bench 2.0. We only changed the harness. Here's our approach to harness engineering.
The post highlights constraints, mechanisms, and factors influencing Agentic Engineering, emphasizing the types of bottlenecks we’re hitting and how GPU shortages are driving product changes.
Practical guidance for developers building computer and browser use integrations with the Claude model family.
Claude Opus 4.7 reasoning-effort curve on 29 matched GraphQL-go-tools tasks: low, medium, high, xhigh, and max. Medium wins the behavioral metrics; more reasoning does not reliably buy better patches.
Claude Sonnet 4.6 is a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design.
We’re upgrading our smartest model. Across agentic coding, computer use, tool use, search, and finance, Opus 4.6 is an industry-leading model, often by a wide margin.