
AI Agents: Engineering Over Intelligence | Marvin Zhang

marvinzhang.dev

When SWE-bench scores rose by roughly 50% in relative terms in just 14 months, from Claude 3.5 Sonnet's 49% in October 2024 to Claude 4.5 Opus's 74.4% in January 2026, you'd think AI agents had conquered software engineering. Yet companies deploying these agents at scale tell a different story. Triple Whale's CEO described their production journey: "GPT-5.2 unlocked a complete architecture shift for us. We collapsed a fragile, multi-agent system into a single mega-agent with 20+ tools... The mega-agent is faster, smarter, and 100x easier to maintain."
