A postmortem of three recent issues

anthropic.com

This is a technical report on three bugs that intermittently degraded responses from Claude. Below we explain what happened, why it took time to fix, and what we're changing.

9 pages link to this URL

Is Opus 4.7 a Downgrade?

Vincent Schmalbach ChatGPT May 9, 2026

Opus 4.7 is not generally a worse model than Opus 4.6, but there is a real downgrade: with Opus 4.7, the control over the thinking budget is now fully owned by Anthropic. This change matters in a way…

0 inbound links article en

Claude Code Opus 4.7 Performance Tracker | Marginlab

Claude Code Opus 4.7 Performance Tracker May 14, 2026

Track Claude Code's daily performance on SWE-Bench-Pro. Monitor for degradation with statistical significance testing.

3 inbound links website en

Blog 1 — Bjarke Hammersholt Roune

Bjarke Hammersholt Roune Nov 30, 2025

0 inbound links website en

Digest | Jul/Aug/Sep/Oct 2025

Aether Archive Nov 1, 2025

I like Thinking Machines, but more than that, I’m grateful.

1 inbound link article en

Vibe Transcript

cory.news Sep 24, 2025

0 inbound links website en

An update on recent Claude Code quality reports

news.ycombinator.com Apr 23, 2026

1 inbound link en

Testing and Benchmarking of AI Compilers — Bjarke Hammersholt Roune

Bjarke Hammersholt Roune Nov 30, 2025

This is an in-depth post on bugs and how to prevent them in AI software and AI compilers specifically. I was the software lead for TPUv3 at Google and I’ve worked on a variety of AI compilers and projects across Google, Nvidia, Amazon and Facebook.

1 inbound link article en

Using Code Agents Effectively

Anson's Notes Anson Oct 6, 2025

Practical techniques for getting great results from AI coding agents, from project setup and context management to effective prompting patterns.

0 inbound links article en AI, Code, Engineering, Technical Breakdown

LLM inference is nearly deterministic. We use this to audit providers

Adam Karvonen Nov 28, 2025

Adam Karvonen, Daniel Reuter, Roy Rinberg, Luke Marks, Adrià Garriga-Alonso, Keri Warr · arXiv (paper link) · GitHub

0 inbound links article en