Claude Opus 4.6 — GeistHaus

daveverse

Daveverse May 19, 2026

Opus 4.6 is much smarter than the other one. It feels like I’m working with someone from Bronx Science. I had been using Sonnet 4.6, which I switched to after reading somewhere that it costs …

0 inbound links article en

Simon Willison on pelican-riding-a-bicycle

Simon Willison’s Weblog Simon Willison Apr 20, 2026

114 posts tagged ‘pelican-riding-a-bicycle’. My benchmark for LLMs: "Generate an SVG of a pelican riding a bicycle". Here's my answer to what happens if AI labs train for pelicans riding bicycles?. "User …

0 inbound links website en ai 2025, generative-ai 1792, llms 1758, llm-release 199, llm 600, llm-reasoning 98, ai-in-china 95, llm-pricing 72, openai 419, google 407

SaaS-Bench: Can Computer-Use Agents Leverage Real-World SaaS to Solve Professional Workflows?

arxiv.org Anthropic Nov 19, 2024

1 inbound link en

Claude Ultraplan: Planning in the Cloud, Executing Wherever | Steve Kinney

Steve Kinney Steve Kinney Apr 7, 2026

Ultraplan hands the planning phase of a coding task off to a Claude Code on the web session running in plan mode, then lets you review it in the browser and decide where to execute. Here's what it actually changes about your workflow, what it costs, and where the sharp edges are.

0 inbound links article en

The gentle obsolescence

benn.substack Benn Stancil Feb 6, 2026

Are we expected to be keeping up?

2 inbound links article en

Product management on the AI exponential | Claude

Claude Mar 19, 2026

Claude Code’s Head of Product Cat Wu shares how teams should rethink their workflows and roadmaps in the face of rapidly evolving model intelligence.

1 inbound link website en

Enterprise AI Agents: Bringing LLMs to Enterprise Data

buzzsprout.com May 17, 2026

2 inbound links en

Extracting Xcode 26.3's Claude Code Prompt

Jack Pearce Jack Pearce Feb 8, 2026

A journey from TLS decryption and Frida patching to a surprisingly simple solution for extracting the system prompt that Xcode feeds to Claude Code.

0 inbound links article en Language Models, Xcode, Claude Code, Anthropic, MCP, Cloudflare AI Gateway, certificate pinning, Frida, system prompt, Cursor, iOS development

The Best AI Models So Far in 2026 | Design for Online®

Design for Online Ltd; Design for Online Design Feb 21, 2026

Gemini 3.1 Pro, Claude Sonnet 4.6, Grok 4.20 and more dropped in February 2026. We rank the best AI models, benchmarks & break down costs.

2 inbound links article en Artificial Intelligence, Blog, LLM benchmarks, Qwen 3.5

https://www.facebook.com/3GeeksandALawBlog/

3 Geeks And A Law Blog May 4, 2026

3 Geeks and a Law Blog: A law blog addressing the foci of 3 intrepid law geeks, specializing in their respective fields of knowledge management, internet marketing and library sciences, melding together to form the Dynamic Trio.

0 inbound links website en

AI flattened the engineering ladder

Ossama Chaib Ossama Chaib Feb 10, 2026

How agentic development compressed the SWE hierarchy overnight

0 inbound links article en

Kyle Cascade - I Love Talking to Claude; I Want to Shrinkwrap It

kyle.cascade.family Feb 12, 2003

0 inbound links article en

On Text and Language

joarvarndt.se Dec 29, 2025

0 inbound links article en

Home

Sparse Thoughts Gal Sapir May 16, 2026

Research notes on foundation models, evaluation, and ML in biology

0 inbound links website en

Weeknotes: Throwaway Tools, Sessions as Trees, and Building a Personal AI Assistant

Edd Mann Edd Mann Feb 8, 2026

Week three. It’s been another busy week with work and spending any free time I do have with these LLMs, exploring many different things along the way.

0 inbound links article en posts, Weeknotes, Agents, Personal-Software, Testing, Llm

Why AI Agents Remain Unreliable - Does it matter in AI?

Does it matter in AI? - Making sense of what matters - and why - for accurate generative AI, particularly in the workplace. Jeff Evernham Feb 6, 2026

(Yes, even with the just-released Claude Opus 4.6.) Generative AI is an amazing technology because it’s so…human. No, we should NOT anthropomorphize AI (that’s a subject for another post) but inevitably we will because…we’re human. It's much easier to understand something we don't understand when we can relate it to something that we do understand.

0 inbound links article en Uncategorized

Agent Swarms and Knowledge Graphs for Autonomous Software Development | TWIML - The Voice of Machine Learning & AI

TWIML Sam Charrington; Siddhant Pardeshi Mar 10, 2026

In this episode, Sid Pardeshi, co-founder and CTO of Blitzy, joins us to discuss building autonomous development systems able to deliver production-ready software at...

1 inbound link article en

You Can’t Stockpile AI: Military Advantage in the Age of Algorithmic Diffusion - Modern War Institute

Modern War Institute - Kyle Dotterrer Mar 17, 2026

Artificial intelligence will reshape how wars are fought, and the United States enters this era with genuine advantages. American companies build the most capable models in the world. US-based chip designers dominate the advanced semiconductor supply chain. Private investment in AI flows into American firms at a rate that dwarfs every other nation. These are

0 inbound links article en Commentary & Analysis, kyle.dotterrer

Friday Links 26-05

Christof Damian Christof Damian Feb 6, 2026

If you watch one thing this weekend, watch the video about being misled about renewable energy. Spoiler: It’s not just about renewable energy. I still haven’t decided what I think about coding with AI. It has been a great help and I see all the drawbacks. My links reflect this. Leadership Culture is built on ‘moments of truth’ [Podcast] - not a technology company, but still very applicable to all organisations.

0 inbound links article en post, friday links, fridaylinks, leadership, engineering, urbanism, environment

I Can Now Improve, Automate, Fix and Document Everything

cto4.ai Jan 26, 2026

Thanks to The Workflow and my AI pair, Claude Opus 4.6.

0 inbound links article en

Helping Claude See ... Diagrams

cto4.ai Jan 26, 2026

I built an MCP that lets Claude see its own Excalidraw diagrams, iterate on them, and save them to Obsidian.

0 inbound links article en

Uno: What I Learned Shaping LLMs into a 90s Comic Book AI

Mbrt Blog Michele Bertasi Feb 28, 2026

I wrestled current LLMs into behaving like my childhood AI hero: here’s what worked, what didn’t, and why. Uno is my favorite character from my favorite Italian childhood comic PKNA. He’s the friendly and sarcastic AI helping Donald Duck in adventures spanning 56 issues in the mid ’90s. My first attempt at re-creating him was a complicated Excel spreadsheet in my early teens with lots of nested IF functions. This is my second attempt.

0 inbound links article en posts ai

Kimi K2.6 vs Claude Opus 4.6 vs GPT-5.4: Agentic Coding Benchmarks

Verdent AI Rui Dai Apr 22, 2026

Three-way comparison of Kimi K2.6, Claude Opus 4.6, and GPT-5.4 on agentic coding benchmarks, cost, and real trade-offs for production teams.

1 inbound link article en

Agents as Bounty Hunters | Michał Prządka - Blog

blog.michalprzadka.com Michał Prządka Mar 9, 2026

I built a benchmark that pits coding agents against each other in a bug-finding treasure hunt.

0 inbound links article en

Building Claude Code with Boris Cherny

The Pragmatic Engineer Gergely Orosz Mar 4, 2026

Claude Code creator Boris Cherny on building AI-powered coding tools, parallel agents, and how the engineer's role is evolving in an AI-first world.

2 inbound links article en

Senko Rašić

Senko Rašić Senko Rašić Apr 2, 2026

Random thoughts | about

0 inbound links article en

Claude Code Just Got Confusing: Plan Mode vs Superpowers vs Agent Teams — A Practical Guide

RC’s blog Feb 6, 2026

Claude Opus 4.6 dropped yesterday. Agent Teams is in research preview. The Superpowers plugin now offers subagent-driven execution. Plan Mode still exists. A...

0 inbound links article en

AI fears pummel software stocks: Is it 'illogical' panic or a SaaS apocalypse?

CNBC Dylan Butts Feb 6, 2026

The software space is facing serious market concerns this week, after the release of new AI tools from AI triggered a market sell-off.

10 inbound links article en Technology, cnbc, Articles, NVIDIA Corp, Software, Technology, Breaking News: Technology, Thomson Reuters Corp, LegalZoom.com Inc, Salesforce Inc, Workday Inc, Tata Consultancy Services Ltd, Infosys Ltd, Arm Holdings PLC, Business, Software and Computer Services, Application Software, Systems Software, AI - Artificial Intelligence, Business News, source:tagname:CNBC Asia Source

The Latest AI Revolution Just Showed Up in Your Word Doc.

3 Geeks And A Law Blog Greg Lambert Apr 14, 2026

I'll be the first to admit it. Back on February 5th, when Anthropic dropped Claude Opus 4.6 and OpenAI fired back with GPT-5.3 Codex on the same day, I

0 inbound links article en Agentic AI, AI, Claude, legal drafting, Microsoft

What do you do when your agents go to work? - Ha Ja Ba Ra La

notes.kaushikc.org Feb 6, 2026

Era of FOMO - Flock of meandering oracles - is unaffordable for me

0 inbound links en agents, llm, artificial intelligence

Claude Code Found a Linux Vulnerability Hidden for 23 Years

Deliberatecoder Michael Lynch Apr 3, 2026

Claude Code has gotten extremely good at finding security vulnerabilities, and this is only the beginning.

41 inbound links article en security, llms, ai CC BY 4.0

I built a personal AI agent and then killed it

Mark Biek Javier Feb 18, 2026

Yeah, that title’s kind of sensationalist. “You won’t believe what happened next!” Since I’m a human and we pack-bond with and anthropomorphize everything, I actually …

0 inbound links article en

Killer Context

jack.bonatak.is Jack Bonatak Is Feb 8, 2026

TL;DR: I theorize that coding agent errors compound over time leading to increasingly worse outcomes as a session continues. I built Blackbird based on this theory, which restarts each task in a plan with a fresh context window, with a new task’s instructions as its starting point. This minimizes deviation from intention leading to better outcomes.

0 inbound links article en

System Card: Claude Mythos Preview [pdf]

news.ycombinator.com Apr 7, 2026

2 inbound links en

Release notes | Claude Help Center

Claude Help Center Apr 17, 2026

1 inbound link article en

opus 4.6 and two small tools

Sparse Thoughts Gal Sapir Feb 6, 2026

First impressions of Opus 4.6, and two small tools—an interview plugin and a markdown annotator—for staying engaged with your own work.

0 inbound links article en

Multi-Agent Systems and AI Transformations

Blogger Diego Pacheco Apr 2, 2026

AI everywhere, agents everywhere. We just finished the first quarter of 2026, and a lot happened in those first 3 months. It feels like 3 y...

0 inbound links BlogPosting en

2 ways to bet on a Trillion Dollar Market

Darshan Makwana Darshan Makwana Feb 17, 2026

I was listening to Dario Amodei’s interview with Dwarkesh Patel and found his insights into how Anthropic plans its capex investments and path to profitability quite fascinating. They need to decide how much compute to build two years in advance based on current demand, because data centers take two years to build. If they overestimate demand, they won’t have enough profit in the following years and will go bankrupt; if they underestimate it, they won’t be able to meet demand and will risk losing customers to competitors. This is what he calls their cone of uncertainty. This sentiment felt weird to me because OpenAI seems to be aggressively bullish on its capex investments. In fact, Sam Altman disclosed that they will be spending $1 trillion on compute infra across Microsoft, Oracle, Nvidia and CoreWeave between 2025 and 2035, while also partnering with Cerebras. So why do these two AI companies have completely different capex investment strategies?

0 inbound links article en ai, business, strategy, anthropic, openai

Creating a Minecraft clone in under a day

Tim Broddin Tim Broddin Feb 9, 2026

Last week, I built a multiplayer Minecraft-like game with my 8-year-old daughter. Not in months. Not in weeks. In under a day.

0 inbound links article en ai, games

February quick-takes

Senko Rašić Senko Rašić Mar 2, 2026

Recap of my short posts on LinkedIn in February AI Slop in Content Writing Dear bloggers, content writers, commentators and social medi...

0 inbound links article en

Claude Opus 4.6 just shipped agent teams. But can you trust them? — Better Than Good.

Betterthangoodx Iain Harper Feb 6, 2026

Claude Opus 4.6 introduces agent teams for parallel coding tasks. But multi-agent coordination creates new security vulnerabilities the industry hasn't solved.

0 inbound links article en

No Code by Hand · ashwch

Ashwch Ashwini Chaudhary Feb 28, 2026

CI dropped from 37 minutes to 9, at 35% lower cost. What we learned about mixing Claude and Codex, clearing the hidden queue, and where the real multiplier came from.

0 inbound links article en

Simon Willison on codex

Simon Willison’s Weblog Simon Willison Apr 23, 2026

47 posts tagged ‘codex’. OpenAI's coding agent tools: Codex CLI, Codex Desktop, Codex Cloud.

0 inbound links website en ai 2020, llms 1753, generative-ai 1787, openai 419, coding-agents 201, ai-assisted-programming 382, gpt 124, claude-code 112, ai-agents 111, llm-release 199

Simon Willison on llm-release

Simon Willison’s Weblog Simon Willison May 7, 2026

199 posts tagged ‘llm-release’. New releases of various LLMs.

0 inbound links website en LLMs, llms 1751, generative-ai 1785, ai 2016, pelican-riding-a-bicycle 113, llm 598, local-llms 156, llm-reasoning 98, ai-in-china 95, llm-pricing 72, gemini 185