73 posts tagged ‘llm-pricing’. Posts about the pricing of various LLMs. See also my pricing calculator.
Claude Sonnet 4.5 is the best coding model in the world, the strongest model for building complex agents, and the best model at using computers.
Anthropic released Claude Sonnet 4.5 today, with a very bold set of claims: Claude Sonnet 4.5 is the best coding model in the world. It’s the strongest model for building …
Anthropic may be using its specialization in coding and STEM as a differentiator in the buzzy frontier model market.
Anthropic published a new guide on "context engineering" that fundamentally shifts how we should think about getting better results from AI. Turns out, the words in your prompt matter less than what information you're feeding the AI in the first place.
But it increasingly looks solvable
What do we do if AI progress keeps happening?
Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe. Import A-Idea: An occa…
Notes from Three Weeks in the Valley
Substack: We’re speeding directly into the AI of the hurricane. In three words, yesterday was wild.
Sleuthing in the source code of GitHub's new AI CLI.
Anthropic's new model appears to use "eval awareness" to be on its best behavior
New context editing and memory tools enable Claude agents to handle long-running tasks without hitting context limits.
I got a buggy WordPress plugin from ChatGPT in 15 minutes. Then I got a good plugin in one shot from Claude. See my failure, success, and iterative improvement.
Developers are navigating confusing gaps between expectation and reality. So are the rest of us.
The Who is Who in AI Land.
I migrated my Cursor Rules to Skills in Claude Code. Simple, elegant, powerful.
I love Claude Code and Sonnet 4.5, but when training data is thin, the reasoning failures are hilarious
Measuring run-to-run variance in coding agents: Claude Code vs Mistral's Vibe.
Analyzing the current LLM landscape and its use-cases
Improving your thinking is improving your output as a leader. Use agents to supercharge your thinking by building your own councils.
Qwen 3.5-35B runs on a gaming PC and matches Claude Sonnet 4.5. When the commodity version is 95% as good and 97% cheaper, you have a pricing problem.
No vagueposting here, just look at the Estimated Read Time.
Claude Code has gotten extremely good at finding security vulnerabilities, and this is only the beginning.
%at=2026-02-10T02:31:55.248Z #author_luna #sandbox #claude_code #ai ![claude_code_opus46_at_sea.jpg] *image: generated with nai4.5full, with claude code opus 4.6's [anime girl self image]* I see my friends playing with fire. this fire has a spec
A year of reasoning, agents, and compressed innovation cycles
Per-talk notes from the PyCon US 2026 Typing Summit in Long Beach: Pyrefly and AI agents, ty constraint sets, Lean formalization, tensor shape types, intersection types, PEP 827, Guido on the direction of typing, and the Typing Council Q&A.
68 posts tagged ‘llm-tool-use’. Tool use is when an LLM is instructed to occasionally request that an external tool be run on its behalf, with the result passed back to the model for further …
A long-running agent can keep making progress over hours, days, or weeks. It can do this across many context windows and sandboxes, recover from failure, lea...
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Sam's Spot - Sam Saffron's web log