Claude Opus 4.7 reasoning-effort curve on 29 matched GraphQL-go-tools tasks: low, medium, high, xhigh, and max. Medium wins the behavioral metrics; more reasoning does not reliably buy better patches.
No pages have linked to this URL yet.
Log in or sign up to submit feeds.