GeistHaus

Claude Opus 4.6

anthropic.com

We’re upgrading our smartest model. Across agentic coding, computer use, tool use, search, and finance, Opus 4.6 is an industry-leading model, often by a wide margin.

54 pages link to this URL
daveverse

Opus 4.6 is much smarter than the other one. It feels like I’m working with someone from Bronx Science. I had been using Sonnet 4.6, which I switched to after reading somewhere that it costs …

0 inbound links article en
Simon Willison on pelican-riding-a-bicycle

114 posts tagged ‘pelican-riding-a-bicycle’. My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle". Here's my answer to what happens if AI labs train for pelicans riding bicycles?. "User …

0 inbound links website en ai 2025, generative-ai 1792, llms 1758, llm-release 199, llm 600, llm-reasoning 98, ai-in-china 95, llm-pricing 72, openai 419, google 407
Simon Willison on pelican-riding-a-bicycle

113 posts tagged ‘pelican-riding-a-bicycle’. My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle". Here's my answer to what happens if AI labs train for pelicans riding bicycles?. "User …

0 inbound links website en generative-ai 1791, ai 2024, llms 1757, llm-release 199, llm 600, llm-reasoning 98, ai-in-china 95, llm-pricing 72, openai 419, google 407
Extracting Xcode 26.3's Claude Code Prompt

A journey from TLS decryption and Frida patching to a surprisingly simple solution for extracting the system prompt that Xcode feeds to Claude Code.

0 inbound links article en Language Models, Xcode, Claude Code, Anthropic, MCP, Cloudflare AI Gateway, certificate pinning, Frida, system prompt, Cursor, iOS development
https://www.facebook.com/3GeeksandALawBlog/

3 Geeks and a Law Blog: A law blog addressing the foci of 3 intrepid law geeks, specializing in their respective fields of knowledge management, internet marketing and library sciences, melding together to form the Dynamic Trio.

0 inbound links website en
Home

Research notes on foundation models, evaluation, and ML in biology

0 inbound links website en
Simon Willison on pelican-riding-a-bicycle

113 posts tagged ‘pelican-riding-a-bicycle’. My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle". Here's my answer to what happens if AI labs train for pelicans riding bicycles?. "User …

0 inbound links website en generative-ai 1790, ai 2023, llms 1756, llm-release 199, llm 600, llm-reasoning 98, ai-in-china 95, llm-pricing 72, openai 419, google 407
Why AI Agents Remain Unreliable - Does it matter in AI?

(Yes, even with the just-released Claude Opus 4.6.) Generative AI is an amazing technology because it’s so…human. No, we should NOT anthropomorphize AI (that’s a subject for another post) but inevitably we will because…we’re human. It's much easier to understand something we don't understand when we can relate it to something that we do understand.

0 inbound links article en Uncategorized
You Can’t Stockpile AI: Military Advantage in the Age of Algorithmic Diffusion - Modern War Institute

Artificial intelligence will reshape how wars are fought, and the United States enters this era with genuine advantages. American companies build the most capable models in the world. US-based chip designers dominate the advanced semiconductor supply chain. Private investment in AI flows into American firms at a rate that dwarfs every other nation. These are

0 inbound links article en Commentary & Analysis, kyle.dotterrer
Friday Links 26-05

If you watch one thing this weekend, watch the video about being misled about renewable energy. Spoiler: It’s not just about renewable energy. I still haven’t decided what I think about coding with AI. It has been a great help and I see all the drawbacks. My links reflect this. Leadership Culture is built on ‘moments of truth’ [Podcast] - not a technology company, but still very applicable to all organisations.

0 inbound links article en post, friday links, fridaylinks, leadership, engineering, urbanism, environment
Uno: What I Learned Shaping LLMs into a 90s Comic Book AI

I wrestled current LLMs into behaving like my childhood AI hero: here’s what worked, what didn’t, and why. Uno is my favorite character from my favorite Italian childhood comic PKNA. He’s the friendly and sarcastic AI helping Donald Duck in adventures spanning 56 issues in the mid ’90s. My first attempt at re-creating him was a complicated Excel spreadsheet in my early teens with lots of nested IF functions. This is my second attempt.

0 inbound links article en posts ai
AI fears pummel software stocks: Is it 'illogical' panic or a SaaS apocalypse?

The software space is facing serious market concerns this week, after the release of new AI tools from AI triggered a market sell-off.

10 inbound links article en Technology, cnbc, Articles, NVIDIA Corp, Software, Technology, Breaking News: Technology, Thomson Reuters Corp, Thomson Reuters Corp., LegalZoom.com Inc, Salesforce Inc, Workday Inc, Tata Consultancy Services Ltd, Infosys Ltd, Arm Holdings PLC, Business, Software and Computer Services, Application Software, Systems Software, AI - Artificial Intelligence, Business News, source:tagname:CNBC Asia Source
Killer Context

TL;DR: I theorize that coding agent errors compound over time leading to increasingly worse outcomes as a session continues. I built Blackbird based on this theory, which restarts each task in a plan with a fresh context window, with a new task’s instructions as its starting point. This minimizes deviation from intention leading to better outcomes.

0 inbound links article en
opus 4.6 and two small tools

First impressions of Opus 4.6, and two small tools—an interview plugin and a markdown annotator—for staying engaged with your own work.

0 inbound links article en
2 ways to bet on a Trillion Dollar Market

I was listening to Dario Amodei’s interview with Dwarkesh Patel and found his insights into how Anthropic plans its capex investments and path to profitability quite fascinating. They need to decide how much compute to build two years in advance based on current demand, because data centers take two years to build. If they overestimate demand, they won’t have enough profit in the following years and will go bankrupt; if they underestimate it, they won’t be able to meet demand and will risk losing customers to competitors. This is what he calls their cone of uncertainty. This sentiment felt weird to me because OpenAI seems to be aggressively bullish on its capex investments; in fact, Sam Altman disclosed they will be spending $1 trillion on compute infrastructure across Microsoft, Oracle, Nvidia, and CoreWeave between 2025 and 2035 while also partnering with Cerebras. So why do these two AI companies have completely different capex investment strategies?

0 inbound links article en ai, business, strategy, anthropic, openai
February quick-takes

Recap of my short posts on LinkedIn in February. AI Slop in Content Writing: Dear bloggers, content writers, commentators and social medi...

0 inbound links article en
No Code by Hand · ashwch

CI dropped from 37 minutes to 9, at 35% lower cost. What we learned about mixing Claude and Codex, clearing the hidden queue, and where the real multiplier came from.

0 inbound links article en
Simon Willison on codex

47 posts tagged ‘codex’. OpenAI's coding agent tools: Codex CLI, Codex Desktop, Codex Cloud.

0 inbound links website en ai 2020, llms 1753, generative-ai 1787, openai 419, coding-agents 201, ai-assisted-programming 382, gpt 124, claude-code 112, ai-agents 111, llm-release 199
Simon Willison on llm-release

199 posts tagged ‘llm-release’. New releases of various LLMs.

0 inbound links website en LLMs, llms 1751, generative-ai 1785, ai 2016, pelican-riding-a-bicycle 113, llm 598, local-llms 156, llm-reasoning 98, ai-in-china 95, llm-pricing 72, gemini 185
FOSDEM 2026

Went to FOSDEM 2025, saw talks, met people, drank beer. Again.

Unsupervised Learning NO. 515

Opus 4.6 Finds Vulns the Way Human Testers Do, The SaaSpocalypse, Malicious OpenClaw Skills, New Urgency in Building, and more

0 inbound links website en artificial intelligence, cybersecurity, technology