GeistHaus

Claude Code Opus 4.7 Performance Tracker | Marginlab

marginlab.ai

Track Claude Code's daily performance on SWE-Bench-Pro. Monitor for degradation with statistical significance testing.

3 pages link to this URL

AI Big hype on this one: Set Up Clawdbot on a VPS in Minutes (no mac mini) & What Happens When Clawdbot Takes Over. ChatGPT is updating containers: ChatG...

0 inbound links article en
Flowers for Dry Claude

After a month of tight-loop collaboration with Claude on a computational math problem, I watched it lose its edge over 48 hours. The experience sparked a question: how do you measure the subtle capabilities that make Claude feel like Claude?

0 inbound links article en