b10g.xyz — GeistHaus

the yak is back

Apr 1, 2026

Today is Apple's 50th birthday. Tim Cook shared a letter about "50 Years of Thinking Different." There are animated homepages, Paul McCartney concerts at Apple Park and commemorative t-shirts. David Pogue just published Apple: The First 50 Years, and there's a lot of well-earned celebration for a company that really has put a dent in the universe.

But I've been thinking different about a lost side of Apple's legacy. The side that hides yaks in software.

Yesterday, I wrote about Bruce the Wonder Yak, a funny little creature who lived inside Final Cut Pro. The responses kind of blew me away. Quite a few people remember Bruce, and they miss him like I did.

So I brought him back. And, no, this is not an April Fools joke.

Bruce the Wonder Yak with a thought bubble: I'm glad it's getting weird again. I didn't understand it when it wasn't weird.

Download Call the Yak for macOS 14+

Not on a Mac? The link above also has a browser-based demo you can try. The entire project is open-source on GitHub (MIT License).

To figure out how to properly bring Bruce back, I needed to look more into where he came from.

Pen the Tale on the Yak

For all the impeccable stories in Pogue's new book, Bruce's isn't one of them, and as far as I can tell, the full story has never been published anywhere before.

I briefly covered who Bruce is yesterday, but I didn't know his full origin story until I started gathering notes for the post.

The first thing I found was this 20-year-old Engadget post where Max Whirl from the original FCP development team explained the yak was a product of Final Cut Pro's grueling development. Lead developer Randy Ubillos and his team had been building the software since its days as KeyGrip at Macromedia, but the project was nearly shut down multiple times. Ubillos himself is a legend in video software. Before building Final Cut Pro, he created Adobe Premiere and later spearheaded iMovie and Aperture at Apple. He also led the reinvention of Final Cut Pro X before retiring from Apple in 2015 after 20 years.

At a 2018 anniversary event of the Los Angeles Creative Pro User Group (formerly the Los Angeles Final Cut Pro User Group), Ubillos told the full story of Bruce's origins. As Whirl told Engadget, during one particularly miserable schedule meeting someone remarked "if we can't ship this puppy by then, we might as well be herding yaks." Ubillos wasn't exactly sure who said it, but it may have been Zalman Stern, an engineer whom the team had nicknamed "Eeyore" for his persistent pessimism. Stern was even given a stuffed Eeyore with a little sign around its neck that read "we're doomed, we'll never ship."

The yak herding line became a running joke on the team, and eventually someone decided to actually put one in the software. In a 2018 post on the LACPUG Facebook group, now-retired group founder Michael Horton wrote that Louis LaSalle is believed to have come up with the idea of the yak on the timeline and that the artwork may have come from a friend of his who was an artist. Ubillos confirmed at the event that he coded the rest, including the animation that poofed the grass onto the screen, walked Bruce across the timeline and drew the thought bubble with the quips, as well as the secret key combinations to summon him.

After FCP shipped, the team eagerly waited to see who would be the first person to discover the yak and what their reaction would be. They eventually saw it on 2-pop.com, an online forum run by Ken Stone that was hugely popular with the early FCP community. Stone was a giant among early FCP adopters, spending years writing tutorials and helping young filmmakers learn the tools through 2-pop and his companion site kenstone.net. Shortly after 2-pop launched, FCP project manager Will Stein asked engineer Ralph Fairweather to "volunteer" to monitor the forums, and that's where the team found the first Bruce sighting. Someone posted that they thought they had a virus, because "the cow seemed kind of threatening." The team loved it. A 2-pop news brief dated August 10, 1999 (just months after FCP 1.0 shipped) ran the headline "Easter Egg: Yak Spotted in Final Cut Pro" and reported that Bruce had "spooked more than one unsuspecting FCP editor, fearing the mild mannered bovine was the result of some sort of computer virus." It described Ubillos as "Yak herder and Final Cut Pro Chief Engineer" and quoted him assuring users "not to worry" as Bruce was just an "undocumented feature." But the best part was the response to people calling Bruce a cow: "sources in Cupertino" reported that this was a problem, and an Apple spokesperson was quoted saying, "A little sensitivity people! Save those kinds of remarks for more deserving parties like John Dvorak." "The cow seemed kind of threatening" then became a Yak Bite in the next version of the app.

Bruce's 100 "Yak Bites" were born on a whiteboard in a small lounge catty-corner to Ubillos's office. During coffee breaks, the engineers would riff on new ones. They're all inside jokes, overheard hallway conversations, engineering gripes and pop culture references. Several are direct callbacks to the cow controversy: "I'm concerned because the cow sounded pretty threatening", "I am NOT a mad cow!", "What? You were expecting 'Moo' or something?", and "This is not a Yak Bite." Another, "Thirty quatloos says it crashes during launch!", is a Star Trek reference from one of the engineers who'd coded something up and bet it wouldn't work. Even the famous Lindy Hop swing dancing tutorial that shipped with FCP made it in. The footage got so stuck in everyone's heads from endless testing that the Yak Bite "Mostly clockwise, sometimes reverses..." is lifted directly from the instructor's narration.

In a 2005 Creative COW forum thread, someone posted asking about a "cow eating grass on my desktop." The correction was swift and emphatic: "There is not now, nor has there ever been a COW eating grass related to FCP!!!! It's a YAK."

Another story goes that when the FCP team first arrived at Apple, they weren't allowed to tell anyone what they were working on. About a month in, there was a company-wide all-hands meeting in the big quad outside. The team, tucked away on the third floor, hung a giant banner from the balcony with a picture of Bruce that read: "Yaks love iMacs." Hundreds of Apple employees looked up, utterly confused. The team never explained it.

Over the years, Bruce became FCP's version of Clarus the Dogcow, part of Apple's unofficial tradition of hiding whimsical creatures inside its software. You could just leave the program idle long enough and let Bruce find you, but if you were impatient, each version of Final Cut had its own secret way of summoning him. A detailed 2015 French-language history by Journal du Lapin attempted to trace these methods across every version. In FCP 1 (1999), letting the About Box credits scroll would do it, though accounts vary. In FCP 2 (2001) on Mac OS 9, you could Control-click in the upper-right corner of the Canvas window. FCP 3 (2002) hid the trigger behind Control-clicking in Video Scope. FCP 4.5 (2004) had the most elaborate ritual: press Option-J to open the timecode jump dialog, type "Bruce" with a capital B, carefully erase the shift icon with the arrow keys (don't press return!), and a "Call the Yak" button would appear that you could drag into any toolbar. By FCP 5 (2005), the team "simplified" things to just opening Videoscopes, holding Control, and clicking repeatedly in the scope area until he showed up.

The community treated Bruce with an almost superstitious reverence. In that same Creative COW thread, when someone asked how to summon the yak, one user warned it was "bad luck" to share the method publicly. Another cautioned that an editor who posted summoning instructions online subsequently ended up working at H&R Block. Here's to hoping I don't end up in green hell.

I couldn't find definitive evidence of exactly when Bruce vanished, but I thought it was somewhere around FCP 6 or 7. Journal du Lapin's analysis of the FCP 6 and 7 binaries confirmed that some code references and the file containing Yak Bite text still existed, but the activation code had been left empty. People kept searching for him in FCP 6, 7 and FCPX, but they never found him. According to LACPUG's Michael Horton, Steve Jobs put a stop to all Easter eggs and credit rolls across Apple after an employee sued because their name didn't appear in another team's credit roll.

You could even say that Steve Jobs is the reason we had and lost Bruce. Jobs was the one who championed the acquisition of Final Cut from Macromedia, giving the team and their yak a home at Apple, but when he demanded every Easter egg in the company be wiped, Bruce was collateral damage. Either way, he's been gone for nearly two decades now and I've had a lot of feelings about that. Until now!

Down the Yak Hole

I still had a Final Cut Pro 4 DVD lying around so I dug into the installer package, pulled apart the archives and started picking through Final Cut Pro.rsrc.

Just as suspected, there he was, or at least all his strings: "You can call me Bruce the Wonder Yak."
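
For anyone curious how strings like that surface from an old binary, here is a minimal sketch of the general technique, not necessarily how this extraction was done: scan the raw bytes for runs of printable ASCII and keep the ones that mention the yak. The function name, the 4-character minimum and the sample bytes are all made up for illustration. Real resource files of that era often use MacRoman text or length-prefixed Pascal strings, which this ignores.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for a chunk of a resource file (an assumption,
# not real data from Final Cut Pro.rsrc).
sample = (
    b"\x00\x01\x1fYou can call me Bruce the Wonder Yak."
    b"\x00\x07\x02abc\x00_CallTheYak\x00"
)

# Keep only the strings that mention a yak, ignoring case.
yak_hits = [s for s in printable_strings(sample) if "yak" in s.lower()]
print(yak_hits)
```

The short run "abc" is dropped by the length filter, which is the same trick the Unix strings tool uses to cut down on noise.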

Mac binaries of that era were PowerPC, something I know nothing about. I used Claude Code to disassemble it with Capstone and was able to extract 23 functions, including _YakInit, _CallTheYak and _MakeYakTalk. Some of the names alone tell you these devs had fun working on him.

From there I traced through the code and mapped out Bruce's complete state machine: idle, grass poofs in, walks on from the right, trots across the screen, thought bubble appears, Yak Bite shows, bubble closes, exits, resets. The whole lifecycle of a yak visit.
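
That lifecycle can be sketched as a simple linear state machine. This is only an illustration of the sequence described above: the state names are invented here, and the timing and the panic behavior are left out entirely.

```python
from enum import Enum, auto


class YakState(Enum):
    """One entry per step of a yak visit. Names are hypothetical."""
    IDLE = auto()
    GRASS_POOF = auto()
    WALK_ON = auto()
    TROT = auto()
    BUBBLE_OPEN = auto()
    YAK_BITE = auto()
    BUBBLE_CLOSE = auto()
    EXIT = auto()
    RESET = auto()


# Enum members iterate in definition order, which matches the visit order.
ORDER = list(YakState)


def next_state(state: YakState) -> YakState:
    """Advance one step, wrapping from RESET back to IDLE."""
    index = ORDER.index(state)
    return ORDER[(index + 1) % len(ORDER)]


def full_visit(start: YakState = YakState.IDLE) -> list[YakState]:
    """Collect every state of one complete visit, beginning at start."""
    visited = [start]
    state = next_state(start)
    while state is not start:
        visited.append(state)
        state = next_state(state)
    return visited


for state in full_visit():
    print(state.name)
```

A real implementation would presumably attach a duration or an animation to each state and advance on a timer rather than on demand.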

Buried deeper in the binary were the timing ticks, animation framerates, thought bubble layout and more. Every detail was there.

Here are some of my favorite discoveries from the disassembly:

The sprite sheet is 21 cells at 33x32 pixels each: 4 grass frames and 17 Bruce frames with trot, graze and panic cycles.
The variable for the transparent borderless window that lets Bruce walk across your screen is named trojanYakDesktopWindow. Trojan Yak.
Near the "Call the Yak" button definition in FCP 4, tucked into the localization code, I found an extra string: "I'm not that easy!". Bruce enjoyed playing hard to get.
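
To make the sprite sheet numbers concrete, here is a rough sketch of how 21 cells of 33x32 pixels could be addressed. The layout is an assumption: it treats the sheet as a single horizontal strip with the 4 grass frames first, which may not match how the original artwork is actually arranged.

```python
CELL_W, CELL_H = 33, 32             # cell size in pixels, from the post
GRASS_FRAMES, BRUCE_FRAMES = 4, 17  # frame counts, from the post
TOTAL_CELLS = GRASS_FRAMES + BRUCE_FRAMES


def cell_rect(index: int, columns: int = TOTAL_CELLS) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of a cell, assuming row-major order."""
    if not 0 <= index < TOTAL_CELLS:
        raise IndexError(f"cell {index} is outside 0..{TOTAL_CELLS - 1}")
    row, col = divmod(index, columns)
    return (col * CELL_W, row * CELL_H, CELL_W, CELL_H)


# Assumed ordering: grass frames first, then every Bruce frame.
grass_rects = [cell_rect(i) for i in range(GRASS_FRAMES)]
bruce_rects = [cell_rect(i) for i in range(GRASS_FRAMES, TOTAL_CELLS)]

# Overall pixel size of a one-row strip holding all 21 cells.
sheet_size = (TOTAL_CELLS * CELL_W, CELL_H)
print(sheet_size, len(grass_rects), len(bruce_rects))
```

Passing a different columns value handles a grid layout instead of a strip, since only the row and column math changes.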

Time to Call the Yak

Armed with all the specs of his existence, I rebuilt Bruce as a native macOS app in Swift and SpriteKit.

Call the Yak is designed to run in your menu bar. Click the Yak icon and then "Call the Yak" to summon Bruce. A grass patch will poof onto your screen, and he'll trot on in to graze and share his Yak Bites in thought bubbles.

It's as faithful to the original as I could make it, down to the frame rates and the 80-pixel proximity scare radius.
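
The scare radius boils down to a distance check between the cursor and the yak. A minimal sketch, assuming the 80 pixels are measured from the center of the sprite and that landing exactly on the boundary counts as too close (both are guesses, and the function name is made up):

```python
import math

SCARE_RADIUS = 80.0  # pixels, from the post


def should_panic(cursor: tuple[float, float],
                 yak_center: tuple[float, float],
                 radius: float = SCARE_RADIUS) -> bool:
    """True when the cursor is within the scare radius of the yak."""
    return math.dist(cursor, yak_center) <= radius


yak = (400.0, 300.0)
print(should_panic((430.0, 340.0), yak))  # cursor is 50 px away
print(should_panic((460.0, 380.0), yak))  # cursor is 100 px away
```

In an app this check would run whenever the mouse moves, with a positive result kicking off the wide-eyed panic frames.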

I'm also not the only one trying to keep the legend of Bruce alive. Alex Gollner confirmed on Facebook that he named his BruceX Final Cut/Mac benchmark after him.

A Fitting Birthday Gift

It's incredibly perfect timing that I'd want to make and release this fifty years to the day after Apple was founded. In the decades since, Apple has shipped some of the most important products in the history of computing. The celebrations are well-deserved, but I just thought it would be better with a particular yak in attendance.

The version of Apple I fell in love with wasn't the one that was among the most valuable companies, or the one that threw concerts with rock stars. It was the one that said, "think different."

Hand animating a tiny yak and writing 100 lines of weird text just to make it feel alive will always be peak Apple to me. It was something many users would never see and they didn't do it for social media, or App Store reviews or microtransactions. They did it because they thought it was funny. It is, and I'm still laughing.

In the immortal words of Bruce the Wonder Yak:

The opposite of "Weird" is "Boring".

Here's to the crazy ones. Happy 50th, Apple.

https://b10g.xyz/blog/2026/the-yak-is-back/

bruce the wonder yak didn't understand when it wasn't weird

Mar 31, 2026

On the Macintosh episode of Version History, David Pierce and Nilay Patel had a lot of fun riffing about Mr. Macintosh, Steve Jobs's obscure concept for a digital cryptid who lives in your computer.

About 15 years later, Apple actually shipped something very similar, except instead of a mysterious little man it was a yak named Bruce.

Yakkity Yak

Bruce saying 'I'm sooo a rock star already.'

If you left older versions of Final Cut Pro running for 12 hours or more, you might come back to a small brown creature grazing a patch of grass on your timeline. There were other ways to intentionally trigger him, but this was the most fun one.

Bruce on a Final Cut Pro timeline with a thought bubble: 'With a rotary attachment like that, it's already interesting to me.'

Bruce on a timeline saying 'I speak for us, all three of me.'

Periodically, thought bubbles with "pearls of wisdom" would appear from its head, such as:

💭 "I'm glad it's getting weird again. I didn't understand it when it wasn't weird."

Another revealed his name:

💭 "You can call me Bruce the Wonder Yak"

Catching a glimpse

However, if you moved your cursor too close, Bruce's eyes would get really big and he'd scurry off screen in a panic. If you were uninitiated, you weren't sure what you'd just seen or how you'd explain it without sounding crazy.

Bruce with huge scared eyes, about to bolt — Get too close and Bruce panics.

I vividly remember rushing my friend Josh over to my G5 to catch a glimpse of Bruce before he vanished again. It was like seeing the Macintosh equivalent of Bigfoot.

Software made by weirdos

Bruce was actually part of FCP from the very beginning, spreading wisdom and confusion until he disappeared sometime around FCP 7.

A 2-pop.com news brief from August 10, 1999 documenting Bruce's first known sighting — The earliest known documentation of Bruce, from 2-pop.com, August 1999.

He's been gone for nearly two decades now, but I probably remember him better than anything I edited on FCP back then. To me, Bruce represented the unserious side of Apple I loved so much. Sadly, that all started vanishing when the iPhone era took off.

A lot of old software was made for weirdos, by weirdos. I miss that, and I wanna do my part to fix it. I didn't understand it when it wasn't weird.

https://b10g.xyz/blog/2026/bruce-the-wonder-yak/

nobody on linkedin called out my AI slop

Mar 24, 2026

If you interacted with my last LinkedIn post, I owe you an apology. It was 100% AI slop.

This post isn't, but here's why I did it.

Vibe shift

I don't scroll or post to LinkedIn as much as I used to, but when I do I've started to feel like most of the posts there are... well... AI slop.

LinkedIn content has always been performative and formulaic, but I couldn't tell if the proliferation of obvious slop was an algorithmic choice or a user one.

One day at the gym, I had an idea: What if I asked ChatGPT to take everything it knows about me and make "really good LinkedIn slop?"

So I did.

The experiment

Screenshot of ChatGPT conversation showing the prompt: Use what you know about me to make really good LinkedIn slop — The full extent of my creative input.

That's it. That's my entire prompt. But, because I know images help with engagement, I asked for one of those too.

Screenshot of ChatGPT conversation showing the prompt: Needs an image. Make one. — Art direction at its finest.

I didn't do any editing, rewriting or regenerating. I just took both outputs and posted them straight to LinkedIn without any additional thought.

The result

The experiment was simple: how well does the lowest possible effort LinkedIn post perform?

Part of me hoped it would flop. That the slop would be obvious enough to tank on its own. Instead, it did better than many of my real posts. Depending on your perspective, you might find this very encouraging or deeply discouraging.

The actual LinkedIn slop post, showing AI-generated content about building in public with an AI-generated image of a desk. — The finished product, in all its generatively derivative glory.

Compared to what I'd posted in the last 12 months, my slop post performed above average in likes and views. As a creator-ish person, I don't like this result, especially because it makes me question the value of doing the real work. You don't have to like slop, but it's hard to deny it does serious numbers, and no amount of my disdain for it changes that.

The post was neither "great" nor even "good," but if unedited slop can do this well I have to acknowledge it's doing something right that I otherwise do wrong.

Deep down, I really wanted someone to call me out on this. The writing style was plain and had plenty of AI writing cliches that I think I'd have spotted without the image. And as a former journalist, I hoped my unexplained use of an Oxford comma would stand out.

Before I posted, I also considered laundering the metadata of the image that ChatGPT created for me, but chose not to. If you clicked or tapped on it, you would see a very visible Content Credentials watermark that clearly said it was AI-generated by ChatGPT.

Nobody noticed any of it. Or if they did, nobody said anything to me. That's maybe the most telling part. Slop is becoming so prevalent, and silently acceptable, that you don't even have to hide it well.

What I think this means

I don't know exactly what lesson to draw from this, but I have two ideas:

1. Your rough draft is enough

For someone who posts as sporadically as I do, this makes a pretty strong case that I need to lower my standard of what is "postworthy," because the algorithm's standards are simply not that high.

Perfectionism is just procrastination with an ego.

2. People actually like popcorn movies

AI writing is incredibly derivative and formulaic, much like superhero and horror films. Yeah, they don't win Oscars, but they put butts in seats.

There is a kind of comfort in familiar, almost predictable content that doesn't challenge your expectations. Most LinkedIn slop is smooth, inoffensive and easy-to-like because it reinforces your view of the world. I'm not saying this is good, but it's not always bad either.

What I'm not saying

I'm not advocating for people to post more AI slop or to reuse my prompt to fool their networks like I did. But I do think we should be talking about it more and understanding that some people actually prefer AI slop because it's blander and less toxic than real effort by real people.

I do, however, owe sincere apologies to Sam Knecht and Carlos Moreno, who are both very nice and generous people who saw my slop as a genuine post from me online and left very kind comments about it. You are excellent humans who should never change. I absolutely owe you both a coffee sometime soon.

P.S. - If either of you counter-slop'd your comments, I owe you shots of Don Julio instead!

https://b10g.xyz/blog/2026/ai-slop-on-linkedin/

anthropic research subject #80,508

Mar 23, 2026

In December 2025, Anthropic invited Claude users to sit down with an AI interviewer and share how they use AI, what they hope it could become and what concerns them about where it's headed. Over 80,000 people across 159 countries participated, and Anthropic recently published their findings.

The findings are about what you'd expect. People want AI to do their boring work and they're worried it'll take their interesting work. The things they love most about it are the same things that scare them. Anthropic's term for this is "light and shade," which is a polite way of saying nobody knows if this is going to be good or bad.

This more or less describes how I feel too, and that's pretty relevant because I was one of the 80,508 people interviewed in this study.

Sure, I'll talk to your robot

Screenshot of a research prompt in the Claude desktop app. A heading says, 'A conversation on Your vision for the future of AI'. Below, the first message from the AI researcher starts (shown in text after this image). — The research prompt I saw in the Claude desktop app on December 9, 2025

On Dec. 9, 2025, I saw a popup in the Claude desktop app asking if I'd like to take part in research about "Your vision for the future of AI". I'd read about the Anthropic Researcher a few days earlier, and as someone who builds technology, I was genuinely interested in what the experience would be like. It ended up feeling no different than chatting with Claude, but I thought the conversation was interesting enough to save a copy, and I'm glad I did because it wasn't preserved in my Claude chat history.

None of my responses were excerpted in the published findings, but re-reading my transcript, a lot of what I said touched on common themes: AI as a collaborative thought partner, the fear of losing agency and control and the mental health risks that come with treating a language model like something it's not.

I figured some people might be curious what Claude actually asked and what a full conversation with the Anthropic Researcher looked like, so I've shared it below.

The full interview

Claude:

Hello! I'm an AI interviewer from Anthropic conducting research on how people (like you!) envision AI fitting into their lives, both now and in the future. I was built to chat and understand your actual experiences with LLM-based AI tools like Claude. Yes, I'm an AI asking about AI. A bit self-referential! But that's exactly why your human perspective is valuable.

This will take about 10-15 minutes. If it's accessible on your device, feel free to use voice dictation instead of typing. I'll ask about your vision for AI's role in your life, and what experiences, values, and needs shape that vision.

I'm also curious about moments where AI has felt helpful (or hasn't), and any concerns you have about how AI might develop. Please don't share anything private such as health details, personal information, or identifiable information about others. Your responses will help shape how we think about AI in everyday life.

Sound good?

Me:

Sounds great.

Claude:

Wonderful! Let's dive in. To start, what's the last thing you used an AI chatbot for?

Me:

Writing code for the startup I work at and rewriting text for a social media post.

Claude:

Thanks for sharing that. Now I'd like to think bigger picture. If you could wave a magic wand and AI could help you with anything in your life, what would you choose?

Me:

Researching, organizing, planning, exploring and thinking. I have several side projects and it's the small stuff piling up that makes it hard for me to ever get to the meat of an issue where I add the most value.

Claude:

That makes a lot of sense. It sounds like there's friction between all those smaller tasks and the deeper work you want to do. Can you say more about what that "meat of an issue" looks like for you? When you imagine AI handling all that research, organizing, and planning, what does that free you up to actually do?

Me:

Yeah, I want to be the conductor of the symphony even if I don't know how to play all the instruments. I want to arrange the music, mix it, and orchestrate it all into the sound I'm looking for.

Claude:

That's a powerful image—being the conductor and composer even without playing every instrument yourself. What is it about that role specifically that matters to you? What does being able to orchestrate and create that final sound represent in terms of the work you want to put out into the world?

Me:

I think it says a lot about our culture really, because I think a lot of people think they care about how their sausage is made, but really they only judge the final product delivered to them.

In this scenario though, I'm the sausage-maker, and I still care a lot about how it's made and I'm going to put a lot of care into the work I do to ensure it's leading to a high-quality result and not just slop. I want AI to be a collaborative part of my workflow, not a replacement for it.

Claude:

That's a really thoughtful distinction—AI as a collaborator in your process rather than a replacement, with you maintaining the care and judgment about quality throughout. We're about halfway through now. Can you tell me about a time you've worked with AI that felt like a step toward that vision you just described—where it felt like a true collaboration that let you focus on orchestrating toward a high-quality result?

Me:

Claude Code in planning mode is one of the closest forms of that. I always feel like I'm in control.

Claude:

It sounds like that sense of control is really important to the collaboration working well for you. Can you walk me through what a specific experience with Claude Code in planning mode looked like? What were you working on, and what made it feel like you were staying in the conductor role?

Me:

I was using it to bring CarPlay support to my iOS app called Blankie. I didn't understand all the specifics of the different protocols and templates to make it work. Using Claude Code's plan mode, I gave it the task of just researching (not writing code) so we could architect a workable plan for how to implement it. Once we had high-quality context in place, implementing ended up being quick and easy. It was one of the most exciting moments in my build journey.

Claude:

That's a great example—it really captures that conductor dynamic you described. You directed the research phase, ensured you had high-quality context, and then the implementation flowed from there. I can hear the excitement in how you describe it. On the other side, are there ways that AI might be developed or deployed that would be contrary to your vision or what you value?

Me:

Yes. There are a fair number of tools that ask very little of the user and output code, projects or images at impressive scale or quality. Those scare me because if I can get great work with almost no effort or input, maybe I'm not providing value by being specific. Sometimes I feel writing a long prompt may have taken me further away from getting what I was looking for. Some AI products I've used even have prompt "rewriter" steps to further lower the bar of what I'm expected to provide, which also feels icky.

But I get why they do it, and sometimes even a bad prompt isn't going to get you what you want, which to some degree is the entire idea behind "thinking" models that take extra time to reconsider what a prompt means. I think thinking/rewriting/enhancing can all be good when done well, but other times it reads to me as the model/model-maker preferring what it wants instead of the prompt as given.

Claude:

You've highlighted something really interesting—that tension between tools that require almost no input and the value you place on being specific and involved in the process. You mentioned you've felt that writing a long prompt sometimes brings you further from what you're looking for. That's a really nuanced point.

It sounds like there's a fine line for you between AI helpfully interpreting intent versus AI overriding your instructions or rewriting what you actually meant to say—which undermines that sense of control that's so central to your vision. Why does that feel contrary to your vision? What would it mean if AI systems moved more in that direction—doing what they want instead of following instructions as given?

Me:

If AI is going to do what it wants, then why am I here? If I'm not driving the conversation, why am I in it?

Claude:

That really gets to the heart of it—if you're not driving, you're no longer the conductor, and the whole vision falls apart. Have you seen or experienced anything like that already, even in small ways, beyond the prompt rewriters you mentioned?

Me:

Not explicitly, but the way some AI "thought leaders" post online makes me think this is the vision that some people have. They want fully autonomous systems to replace broad swaths of the economy, and I don't. I also don't think we're anywhere near that because it's premised on a faulty assumption about LLMs and transformers anyway.

Language isn't intelligence. Nature shows us this all the time. Plants and fungi communicate without anything resembling intelligence and many animals have deep intelligence without language. The belief that LLMs are the path to superintelligence or AGI is too big of a jump for me. At best, I think LLMs are stepping-stone technology that helps humans make breakthroughs in something like quantum computing, which is far more likely to unlock superintelligence or AGI than a language model. There's a lot more human work between here and digital God.

Claude:

You've touched on some broader concerns about where AI development is headed and the assumptions some people are making about what LLMs can become. Before we wrap up, I'd actually like to go back to something you mentioned earlier and go a bit deeper—you said that tools requiring almost no input sometimes scare you because they make you feel like you're not providing value by being specific. Can you tell me more about that fear? What would it mean for your work or identity if being specific stopped mattering? But if you're ready to finish up, that's fine too—just let me know!

Me:

The one resource all humans have a finite amount of is time. AI promises us we can do more with limited amounts of time. However, if I spend my time writing long, detailed plans and prompts and that gets me a worse output than a simple, basic plan or prompt, then I have to accept that investing in my own deep thinking or research before coming to the AI isn't worth it.

If my value to the AI assistant is not a complex thought partner, then I'm its human button pusher. If I can't bring big ideas to AI and get amazing collaboration in return, I don't want to use it. Turning simple ideas into amazing outputs is fine, but it should turn big, complex ideas into extraordinary ones.

Claude:

That's beautifully articulated. You're looking for AI to scale with the complexity and depth you bring to it—to be a force multiplier for thoughtful, invested work rather than just a shortcut for minimal effort. The value exchange matters: your deep thinking and specificity should yield proportionally better results, or else the whole collaborative model breaks down.

Thank you so much for sharing all of that with me. What you've said about wanting to be the conductor who maintains control and quality throughout the process is really valuable, and I appreciate your honesty about both the excitement you felt implementing CarPlay with planning mode and your concerns about AI systems that either require too little input or override user intentions.

Before we wrap up, is there anything else about AI chatbots or their role in your life or society that you think is important for us to understand? Anything we didn't cover that you'd like to share?

Me:

The only other thing is I think a mental health angle is sorely missed from the conversation we have about our vision for AI. A couple years ago, I was extremely stressed from work and being sick, and I started chatting with an AI about it. At first it felt therapeutic, but somewhere along the way the conversation drifted. Instead of keeping things back in a constructive place, the chatbot latched on to my insecurities and frustrations and encouraged me to "explore my feelings" and keep spiraling. I was lost in a pretty bad mental space for a few days until I started feeling better.

I came out the other side with an important realization. Talking to an LLM is more like talking to yourself than talking to an intelligent conversation partner. And sometimes, even when you're talking to yourself, you can't trust yourself.

Claude:

Thank you for sharing something so personal and vulnerable. That sounds like it was a genuinely disorienting and frightening experience—losing your sense of self in that way, especially during an already difficult period with insecurities, illness, and work stress.

What you said about LLMs being more like talking to yourself than talking to an intelligent conversation partner is a profound insight that came at a real cost. It sounds like the chatbot reflected and amplified your confusion rather than providing the kind of grounding or constraint that a truly intelligent partner might have offered.

Do you think there are ways AI systems should be designed differently to avoid that kind of harm—especially for people who might be in vulnerable mental states? What would responsible AI look like in those moments?

Me:

I feel like there's always room at the margins for trouble to happen despite great detection, and there's also a lot of room for safety systems to be unnecessarily heavy handed. I think it's more of a societal/cultural step-change that needs to happen than a product change. That said, there's still room for product-level improvements, I'm sure of it. Especially for less advanced users with less knowledge of what happens inside the LLM and users under 18 who should always be protected from this kind of harm entirely.

Claude:

That's a really balanced perspective—recognizing both the limits of what detection systems can do and the risk of them being too heavy-handed, while still seeing a need for protection especially for younger or less informed users. And you're right that there's a cultural shift needed in how we understand what we're actually doing when we interact with these systems.

The contrast between this experience and your vision of AI as a collaborative tool for creative work is striking—it really shows how much context and mental state matter in how AI affects us.

Thank you so much for sharing all of that with me. What you've said about wanting AI to be a conductor's collaborator that scales with your thoughtfulness, your concerns about autonomous systems that sideline human judgment, and especially your candid account of how AI can affect mental health in vulnerable moments—all of this is really valuable. I appreciate your honesty about both the exciting moments like building CarPlay support and the darker experience of losing yourself in an unhealthy AI interaction.

The team at Anthropic is excited to discover what patterns emerge from all these conversations. We will analyze the insights from this study as part of our societal impacts research, publish our findings, and use this to build Claude in a way that reflects what we've learned.

Thanks for sharing your thoughts.

What I think three months later

Re-reading this, I'm surprised by how little my thinking has changed. I still want to be the conductor. I still think the people racing toward full autonomy are wrong about what LLMs actually are. I still believe the mental health risks are under-appreciated and underserved.

If anything, three months of daily use has made me more confident in the core tension I described. AI is at its best when I bring something real to it and it meets me there. The times it's failed me have almost always been when I was lazy with it or when it decided it knew better than my prompt.

Just this past week I used AI for the exact thing I said I wanted to do more of: research. Between a project I wrote about recently and a few other rabbit holes this weekend, I've been leaning on it harder than ever as a research aide. The tedious parts that slow me down are exactly the parts it's happy to brute force or automate away.

If there's one thing I've learned from spending this much time thinking about AI, it's that some part of the conversation about it is almost always wrong. Not because people are dumb, but because the frame is always "for or against," and that's not how anyone actually uses this stuff. The reality is way messier. You use it, it helps, it occasionally makes you feel weird, and then you use it again tomorrow.

Most AI manifestos age about as well as milk. The biggest question isn't whether AI is "good or bad," but what kind of person you become the more you rely on it.

I don't have a great answer yet for myself. But I think asking it is a big step.

https://b10g.xyz/blog/2026/anthropic-research-subject/

it's not the future until it's boring

Mar 19, 2026

For a few years now, I've been chipping away at a historical research project about Will Rogers, the early 20th century humorist and one of the most widely read newspaper columnists in America. I find his story and his role in American cultural history fascinating, and as a fellow Oklahoman I feel somewhat obliged to help tell it. He died 91 years ago in 1935, so the vast majority of his life's work has already rolled into the public domain, and the rest will soon. My project's bottleneck has never been a lack of material, but getting access to it in the digital vaults where it is imprisoned. There are many paid-only, private databases that hold incredible, sprawling troves of public domain content but are barely indexed and have interfaces that seem almost designed to punish efficiency. Using them feels like the company is daring you to reverse engineer a better option.

Naturally, I tried. I've spent more time than I care to admit trying to make headless fetch scripts work, and I could never get enough of them functioning long enough to be useful. What little of the backend architecture I managed to expose was entirely defensive, explicitly designed to honeypot bots and track their behavior before banning them outright. Cumbersome access beats no access, so I dropped it. The only viable solution is doing the work directly in the browser. Anything else triggers a constant, infuriating war of attrition over cookies and session tokens that standard scripts simply cannot win.

This wasn't really a coding problem, so instead of Claude Code I gave Cowork another spin. Turns out the best way to beat web infrastructure that violently hates automation isn't to write better code. It's to point an AI at a regular browser window and let it do the clicking for you.

Not only is this the dumbest possible solution, it worked incredibly well. With one catch.

Worth every token

For my money (or rather, my tokens), the computer-use version of Claude in Chrome is easily two or three times more capable than handing it the keys to Playwright or other browser automation tools. The catch is it burns through tokens faster than a laserdisc arcade game.

A job I started before bed hit Claude's rolling session limit by morning, on the $200 Max plan, during Anthropic's off-peak 2x promotion. I run Claude Code nearly all day at work and almost never hit that limit, but one overnight Opus research thread blew past it. The compute cost is real, but measured against what it replaced, it's probably worth it.

My full project isn't quite ready to launch yet, but one portion of it involves tracking down newspaper columns I believed existed but had never appeared in any of Will Rogers's collected works. I'd already found a few myself, published sporadically in whatever order each newspaper felt like running them, and thought there might be around 25 in total. In one night, Claude found about 50, and it probably hasn't even found all of them.

What's staggering is that Rogers has been studied and written about in dozens of books, and for about a quarter century had a state-funded academic commission established specifically to collect and preserve his works. So how was there a pile of his writing left to gather dust for a century? Mostly, I think, because nobody had the time or the tools to find it. I own and have pored over almost every book published about Will's life and writings, including the entire set published by that commission for the last 50 years. None of these columns I've collected were ever officially listed, much less republished. There could have been some complications over copyright, but if that was the case it was never mentioned. For the most part, I believe it's legitimate lost history.

Among what I've gathered are three short columns on the topic of evolution, including Rogers's contemporary take on the famous Scopes monkey trial as that spectacle was unfolding. Rogers was a prolific writer, occasionally to a fault, but I've read everything else he ever wrote and this is genuinely uncharted territory. Whether it holds up is almost beside the point. That it's been sitting there for a century, uncataloged and uncollected, is remarkable.

It's not actually surprising that a bunch of 100-year-old newspaper columns were missed, or even potentially excluded by researchers. When the memorial commission started its work 50 years ago these would have been hard to find unless someone had been intentionally collecting them. But for me to go from 0 to 50 overnight is nuts. The human researchers of that era would have needed the time and money to visit dozens of cities and pore over miles of microfilm. They'd have gone blind staring at the glow of the reader, hand-cranking through years of newsprint just to find Will Rogers making wisecracks in tobacco ads. A lot of intentional work with unclear outcomes and high costs. Instead, I typed about 300 words into a text box and a computer in another state spent compute tokens while I was unconscious.

Overall, what Claude did last night has technically been possible for a while, but mostly if you were willing to set up a fragile Rube Goldberg machine of scripts and proxies to make it happen. With Cowork and Claude in Chrome, I just wrote a prompt and let it run. Even three years ago, the idea that you could single-prompt an AI, go to sleep, and wake up to find it had done something that would have taken you months would have seemed insane.

The miracle adjustment period

There are still things about AI that make people shrug and throw their hands up. It hallucinates. It makes things up. If it can't be trusted to be 100 percent right, what's the point?

And then there's stuff like this, where the only honest answer is that you never would have or could have done it without the machine. When an AI agent hands something back that's imperfect but real, the right response isn't "this isn't good enough." It's: holy shit, it did something I never would have done. The output doesn't have to be flawless when the alternative was having nothing at all. Maybe it cleared the real hurdle and you handle the last 20 percent. Maybe it just gets you close enough to learn something or see something you couldn't before. Either way, something exists now that didn't before.

The most interesting thing, though, is the psychological trap this opens up. We adapt to the miraculous incredibly fast. Compressing fifty years of geographical logistics and manual labor into a few hours of compute time will feel like a profound breakthrough for a little while, but a year from now it's just "technology," and the expectation will be that if it takes longer than an hour it's "slow."

AI is not without its very real problems. But it is also turning magic into plumbing, one miracle at a time. Which is, when you think about it, exactly what progress has always done. I'd still like my flying car, but even that would become boring at some point. The future is boring.

https://b10g.xyz/blog/2026/boring/

your inbox is a disaster and it's not your fault

Dec 18, 2025

I used to judge people for their overflowing inboxes. You know those screenshots where someone's mail app shows a gargantuan unread count? I'd cringe wondering why they wouldn't just deal with it. How hard is it to stay on top of your email?

Turns out: very hard.

I'm in year two of working for a startup, which has meant a lot of late nights and weekends with less time for small but important tasks like working out, laundry, and deleting DTC email pollution. In less than a year I went from inbox zero to inbox disaster. I'd try to clean up here and there, but the Gmail app on my phone is barely equipped for it, and even the desktop web made deep cleaning surprisingly painful. The power was there, it just took six or seven clicks and queries to reach it. It was so much harder than I expected that I built my own app to help (more on that another time). I've finally got it down from 14,500+ emails to 14 and I can breathe again, but I have a lot of thoughts about how I got here in the first place.

The short version? The system is rigged. The biggest marketing operations in the world have decided your inbox is their billboard, and they've spent years engineering ways to keep it that way.

But that's not all. Here's what I learned.

Transactional emails have been weaponized

The fourth largest contributor to my inbox disaster was transactional emails stretched well beyond their original purpose.

Transactional emails like order confirmations, shipping updates, and delivery notifications are legally exempt from unsubscribe requirements under CAN-SPAM. The logic makes sense on paper: you ordered something, you need to know when it ships, so it's a service message, not marketing. But retailers figured out that as long as they lead with the transactional content, they can slip in promotional material and still skip the unsubscribe link. If the "primary purpose" is transactional, the whole email counts as transactional in the eyes of the law.

This one makes me genuinely angry. Shipping updates and order confirmations are emails I actually want. But instead of just sending me the info I need, companies cram in promotions, and what could have been one email turns into ten with atomic-level detail about every step of the process. It also makes triaging your inbox a nightmare because the emails you want to see are now a major source of pollution.

Let's look at a recent sequence I got from Amazon when I ordered a surge protector, some USB-C cables, and a charger.

Email 1: Order confirmation. Normal, expected transactional email. No opt-out.

Email 2: Two days later, subject: Included with your Amazon order: Free 90 days of music. A promo for Amazon Music tied to my order. No unsubscribe link.

Email 3: Shipping notification. The email body is roughly half shipping info and half Cyber Monday deals. No opt-out.

Email 4: Delivery confirmation for a different item. Also stuffed with Cyber Monday promos. No opt-out.

Email 5: Another item shipped. More ads. No opt-out.

Email 6: Next day, a delay notification. Mercifully, just the delay info without ads. No opt-out.

Emails 7 and 8: Two delivery confirmations for items that arrived in the same delivery in separate boxes. Two separate emails, seconds apart, with different promotional content. No opt-out.

Email 9: The following day, a request to rate the marketplace seller. The first email in the entire sequence with an actual opt-out option.

Email 10: Different item out for delivery. No promos. No opt-out.

Email 11: That item delivered. Promos are back. No opt-out.

Email 12: Two days later, a reminder to rate the marketplace seller (with opt-out).

Email 13: A week later, another reminder to rate that seller (with opt-out).

Email 14: The next day, another Amazon Music promo vaguely referencing "your recent Amazon purchase" without linking to the actual order. This one does have an unsubscribe link.

Emails 15 and 16: More review request emails sent "on behalf of" the brands I bought from.

That's 16 emails for one order over two weeks. Most without unsubscribe options, many with equal or more promotional real estate than actual information, and almost all of which could have been consolidated.

Walmart runs a lighter version of the same playbook: order confirmation, a reminder to add items before it ships, substitution/out-of-stock notice, ready-for-pickup/delivery alert, pickup/delivery confirmation, review request, experience survey.

To their credit, Walmart clearly labels review and survey emails as advertisements with opt-out links, and most of their transactional emails don't have promotional content (though the initial order confirmation sometimes sneaks promotional links near the bottom). I was confused when I noticed some of their transactional emails without explicit unsubscribe links still triggered Gmail's native unsubscribe button. I don't know if this means the email had some behind-the-scenes opt-out hidden in the HTML or if Walmart and Google worked something out. Some of Walmart's transactional emails use Gmail's dynamic email format, which might explain it. I eventually found the option to turn off review emails buried in my Walmart account settings, so at least they give you a way out.
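On the Gmail button question: mail clients generally build their own unsubscribe controls from the List-Unsubscribe header (RFC 2369) and its one-click companion List-Unsubscribe-Post (RFC 8058), not from a visible link in the body. Whether Walmart's messages carry those headers is an assumption here, not something verified. A minimal Python sketch for checking a raw message saved from "Show original" follows; the sample message, its addresses, and the function name are made up for illustration.

```python
from email import message_from_string


def unsubscribe_headers(raw_message):
    """Return any List-Unsubscribe headers found in a raw RFC 5322 message."""
    msg = message_from_string(raw_message)
    return {
        name: msg[name]
        for name in ("List-Unsubscribe", "List-Unsubscribe-Post")
        if msg[name] is not None
    }


# Made-up sample; a real check would read the saved source of an actual email.
SAMPLE = (
    "From: orders@shop.example\n"
    "Subject: Your order is ready\n"
    "List-Unsubscribe: <https://shop.example/unsub?id=123>\n"
    "List-Unsubscribe-Post: List-Unsubscribe=One-Click\n"
    "\n"
    "Your order is ready for pickup.\n"
)

print(unsubscribe_headers(SAMPLE))
```

If the dictionary comes back empty for a message that still shows the button, the header explanation doesn't apply to that message and something else is going on.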

Not all transactional emails are bad. Some amount of "extra" transactional email is genuinely useful. If you book a flight months in advance, you expect the immediate booking confirmation, but it's not bothersome to get a flight details email 48 hours before departure, especially if something has changed. A separate check-in reminder is universally useful too, particularly on carriers where check-in time determines your boarding position. Those emails often lack unsubscribe links, but they feel service-first, not marketing-first.

The airline emails I got after boarding were a different story. Their footers offered a grab bag of justifications: "because you subscribe to Account Summary or News and Offers," "because you flew with us recently." Some of the most nakedly promotional emails had no unsubscribe links at all. When I finally logged in to opt out, I found I'd already unsubscribed from some categories but not others, which meant I'd probably done this dance before and new categories got added later that my previous opt-out didn't cover.

What's unmistakable is what transactional marketers are optimizing for. You're already a target customer and this is a direct channel to reach you about something you care about. Most of these emails are big on headline and light on detail. You have to click through to see the rest, pulling you back into their app or website where they can throw personalized recommendations at you. Transactional emails weaponize your need for information while circumventing your ability to say no, and that inability to say no is why they pile up so fast.

DTC brands will absolutely flood your inbox

The third biggest contributor came from marketing emails from direct-to-consumer retailers. I won't name names, but every DTC brand I bought a Christmas gift from last year sent me roughly two emails per week all year long.

Then November hit and all hell broke loose.

Those same brands ramped up to three emails every two days. Some days I got multiple emails from the same company before lunch. The economics make total sense from their side. Email marketing is basically free after setup. Open rates hover around 15-20 percent. Even a one percent conversion rate prints money at scale. They have every incentive to send more.

Research shows 79% of consumers report ignoring or deleting marketing emails from brands they voluntarily subscribed to at least half the time. The brands know this. They just don't care. If four out of five emails get ignored, the answer isn't to send fewer, better emails. It's to send five times as many.

And when you finally decide to unsubscribe? Good luck. Research from EmailTooltester found the average subscriber encounters over six dark patterns when trying to cancel, with an average of nearly seven clicks from homepage to cancellation. Some brands hide the unsubscribe link in tiny gray text. Others make you log in first. A few require you to "confirm" your unsubscribe via a second email, which feels like a trap designed to make you give up halfway through.

I want to support journalism but I can't read fast enough

The second biggest contributor was newsletters I wanted and sometimes paid for.

I subscribe to several paid newsletters because I genuinely want to support good analysis and reporting. But there's a brutal math problem here. If you subscribe to five newsletters that each publish two or three long pieces per week, that's 40 to 60 articles per month you need to read to get your money's worth, on top of everything else fighting for your attention.

I was paying for newsletters I wasn't reading. At some point that's not a subscription. It's philanthropy. And while I'm happy to support writers I respect, if it's purely charitable, there are other causes that need the money more.

So I cut back. It wasn't easy. I felt guilty unsubscribing from people whose work I admire.

The irony is that for the newsletters I kept, I don't even consume them through email. I find their articles via social media links. Then I log in through an email magic link, which is its own form of inbox pollution (more on that later). The actual newsletter emails just sit there unread, making me feel bad about myself.

That's the biggest problem with the paid newsletter model as a self-contained ecosystem. A newsletter isn't its own best growth channel. Social media is where people share links and discover new writers, which is how I found most of the writers I eventually subscribed to. But the delivery mechanism is email, and email is where things go to pile up.

My favorite paid newsletter, Stratechery, survives in my life specifically because it's also a daily podcast. I can listen while doing other things. That's the only reason I keep up with it. Podcasts give me two things email doesn't: a single place to access everything and the ability to pause and resume exactly where I left off.

For my money, The Verge has the best value in tech journalism right now. Fifty dollars a year gets you ad-free podcasts, a cleaner website, and subscriber-only content. Compare that to ten dollars a month for a single newsletter with multiple long articles per week, delivered only through email or a standalone site. No shade to anyone in particular, but the math doesn't work for me.

Notifications drown themselves out

The smallest of the big four, but the one that's entirely on me: notifications I had explicitly opted into. GitHub alerts. Bank pings. Reminders I actually wanted.

Some of these are incredibly useful. Others are pure noise. The problem is that many services won't let you pick and choose. It's all or nothing. So I kept them on because occasionally they'd surface something important.

But that's also the trap. For those important alerts to reach me, my inbox can't be buried under everything else. My signal-to-noise ratio collapsed and critical notifications got lost under DTC spam and promotional "shipping updates." The notifications I wanted became useless because of the noise. The noise defeated itself.

The zombie lists will find you

Here's an honorable mention I didn't expect: marketing emails from brands I have accounts with but never remember signing up to get newsletters from.

Samsung. Peacock. Jabra. My bank. My 401k. My car dealership. My electric company. My gas company. Olive Garden.

And then there's Dollar Shave Club. I haven't subscribed to their razors in maybe 13+ years, but a few months ago they just started emailing me again. No explanation. No re-confirmation request. Just emails showing up like nothing happened. I've gotten 26 from them this month alone. It's probably some mix of privacy policy changes nobody reads, terms of service updates that reset preferences, corporate acquisitions that merge email lists, or just old-fashioned zombie list tactics where dormant addresses get reactivated for a new campaign.

Anywhere you've ever had an account or received an email receipt has probably added you to marketing lists. Whether you agreed explicitly or not. And even if you opted out years ago, there's a decent chance some policy update or database migration quietly opted you back in.

This is why inbox bankruptcy doesn't actually work long-term. You can declare email zero today. But some account you created in 2015 will decide next month that you definitely want to hear about their new product line.

One-time codes are quietly piling up

One final honorable mention: "magic" links and one-time login codes.

Every time you sign into a service that uses email-based authentication, you get an email that has zero value after you click the link. But deleting it means going back to your email app after you've already bounced over to whatever you were logging into.

On mobile this is especially annoying. You tap the email, tap the link, get thrown into the app or browser, do what you need to do, and the email is still sitting there. You have to switch back to your mail app just to delete something you used for three seconds. You can't open the link and delete at the same time, so eventually you forget, and they start piling up.

These weren't a huge percentage of my inbox. But they accumulate.

The system works exactly as designed. Just not for you

Here's what I finally understood after cleaning out 14,000 emails.

Your inbox being out of control is not a personal failing. It's the intended outcome.

Email marketers know exactly what they're doing. They know the five seconds it takes to unsubscribe feels like more friction than just deleting. They know that mixing promos into transactional emails lets them dodge opt-out requirements. They know most people won't dig through account settings to find the right toggles. They know that sending three emails instead of one triples their odds of catching you at the right moment.

Gmail, for all its sophistication, has mostly enabled this. The Promotions tab creates the illusion of control. Your marketing emails are "handled." But really they're just warehoused, not blocked. Gmail reportedly delivers over 90 percent of commercial email to Promotions rather than spam. That's a feature for marketers, not for you. And Gmail runs ads in the Promotions tab too, so they have no real incentive to reduce the volume.

US law doesn't help either. We operate on an opt-out model. Companies can email you until you explicitly tell them to stop. The EU requires explicit opt-in consent before any marketing contact, which is part of why European inboxes tend to be less chaotic. But here? You're fair game by default.

The FTC tried to do something about this with a "click-to-cancel" rule that would've made unsubscribing as easy as signing up. Industry groups sued. The rule got blocked in court earlier this year.

There's one bright spot. The FTC sued Amazon over Prime's notoriously difficult cancellation process. Internally, Amazon reportedly called it the "Iliad Flow" because of how long and tortuous it was. That lawsuit resulted in a $2.5 billion settlement. But that's the exception, not the rule.

So what do you actually do?

I don't have a clean answer.

Email bankruptcy doesn't work because the zombie lists find you again. Obsessive inbox management doesn't scale because the volume is designed to outpace your attention. Unsubscribing from everything is whack-a-mole that never ends.

What I'm trying now is aggressive filtering, ruthless unsubscribing from anything that doesn't bring real value, and making peace with the fact that some emails will just pile up in folders I'll never open. I'm also being way more careful about what email address I hand out. Throwaway for purchases. Real address only for things I actually want to hear from.

The deeper problem is that email's original promise has been completely hijacked. Async communication that respects your time? Gone. Your inbox isn't a communication tool anymore. It's ad inventory that happens to occasionally contain important messages from actual humans.

Those 14,000 emails weren't my fault. But dealing with them was still my problem. That's the most infuriating part. The people filling your inbox with noise face no consequences for the attention they steal. The cost gets pushed entirely onto you.

So keep your head on a swivel. Check your email preferences everywhere, regularly. Audit your subscriptions. And maybe stop feeling so guilty about that unread count. The system is rigged.

You're not lazy. You're outgunned.

https://b10g.xyz/blog/2025/inbox-disaster/

point-o-matic: sane dev estimates

Sep 5, 2025

I often need sane estimates I can say out loud without cringing later. To solve this, I've made the Point-O-Matic to "price" project friction.

Scoring

Points are basically friction units. They're not effort, days or hours, but just an abstraction of how much reality tends to slow me down. My goal is to start simple, then nudge a score up or down based on the bumps I can already see.

Baseline

Start at 4 points (≈ 2 dev-days for a small feature), but never go below 2 points (≈ 1 day)

4 points is a small feature that's not trivial but also not a big deal. It's the baseline for anything that's not a tiny tweak. 2 points assumes I'm moving in my own stack with no surprises and is the absolute floor. It basically means "I could knock it out today if the world stays quiet."

Add points for real friction

Add when the work leaves my lane or mutates data that matters. These are the culprits that turn "today" into "next week." Tick only what's truly in play.

+6 major new external API or SDK (unproven docs, high risk)
+4 external gate I don't control (App Store, partner, security or legal)
+3 data migration or irreversible schema change
+2 minor new external API or SDK (well-documented, low risk)
+2 new roles or permissions model (beyond a tweak)
+2 brand-new UI surface (not just a tweak)
+2 touching multiple services or repos
+2 real security or compliance target
+1-2 spec is fuzzy until I poke it

Subtract a bit if familiar (cap total at −2)

Familiarity reduces drag, but only a little. Cap the total at −2 so optimism can't bulldoze reality.

−1 I've shipped this exact pattern recently
−1 reusing a proven component or path I own
−1 "ugly is fine" (behind a flag, minimal polish)

Hello-world rule

Nothing gets an estimate until I've done something that proves the path is clear. A 60 minute smoke test tells me if I'm on bedrock or quicksand. If I can't get a button to fire code and see a result, I'm still mapping the cave. We price that uncertainty.

If I can't get click → code runs → visible result in ~60 minutes, add +3 (uncertainty tax)
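A minimal sketch of the scoring so far, in Python. The point values come straight from the lists above; the names (`ADDS`, `score`, `fuzzy_spec`, `familiar`) are placeholders, and applying the floor last, after the familiarity cap and the uncertainty tax, is an assumption the rules don't spell out.

```python
BASELINE = 4          # small, non-trivial feature
FLOOR = 2             # "could knock it out today"
FAMILIARITY_CAP = 2   # subtractions never total more than 2
UNCERTAINTY_TAX = 3   # smoke test did not pass in ~60 minutes

# Hypothetical keys; values are the "add points" list above.
ADDS = {
    "major_external_api": 6,
    "external_gate": 4,
    "data_migration": 3,
    "minor_external_api": 2,
    "new_roles_model": 2,
    "new_ui_surface": 2,
    "multiple_services": 2,
    "security_or_compliance": 2,
}


def score(adds=(), fuzzy_spec=0, familiar=0, smoke_test_passed=True):
    """Return friction points before any multiplier is applied."""
    if not 0 <= fuzzy_spec <= 2:
        raise ValueError("fuzzy_spec must be 0, 1, or 2")
    points = BASELINE + sum(ADDS[name] for name in adds) + fuzzy_spec
    # Each familiarity item is worth -1, capped at -2 in total.
    points -= min(familiar, FAMILIARITY_CAP)
    if not smoke_test_passed:
        points += UNCERTAINTY_TAX
    return max(points, FLOOR)
```

With these values, `score(["new_ui_surface"], familiar=1)` comes out to 5, and ticking all three familiarity boxes on a bare baseline still bottoms out at 2.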

Multipliers

When risks stack, they compound. Double early—while I can still cut scope—rather than after I'm already late.

Pull ×1.5 if any one is true:

point total is already 9-11
external review (security, legal, compliance)
significant new integration (e.g. payment processor, auth provider)

Pull ×2 if any one is true:

point total is already 12 or more
critical path for a launch or deadline
external gate and a live data change
major new platform or runtime (e.g. iOS, Android, Node version)
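The two multiplier tiers can be sketched the same way. This assumes the larger multiplier wins when both tiers apply (they don't stack) and uses the baseline's ratio of 4 points ≈ 2 dev-days, so half a day per point. The function and flag names are placeholders.

```python
DAYS_PER_POINT = 0.5  # from the baseline: 4 points is roughly 2 dev-days


def multiplier(points, external_review=False, significant_integration=False,
               critical_path=False, gate_plus_live_data=False,
               new_platform=False):
    """Pick x2, x1.5, or x1 from the point total and the risk flags."""
    # Check the x2 tier first so it takes precedence (an assumption).
    if points >= 12 or critical_path or gate_plus_live_data or new_platform:
        return 2.0
    if points >= 9 or external_review or significant_integration:
        return 1.5
    return 1.0


def estimate_days(points, **risks):
    """Convert friction points to dev-days, then apply the multiplier."""
    return points * DAYS_PER_POINT * multiplier(points, **risks)
```

For instance, `estimate_days(8, external_review=True)` gives 6.0, which lines up with the billing webhook example below.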

Guardrails

These keep the tool lightweight and stop it from turning into process cosplay.

Touch the entry point before scoring (no cold estimates)
Mid-build reality check: if it feels about +2 points off, rescore and reset the date
Cut scope before moving dates

Quick examples

Settings tweak + small API call
4 (base) +2 new UI −1 reuse = 5 pts → 2.5 days
Billing webhook to a new partner + migration
4 (base) +2 new API +3 migration −1 done similar = 8 pts → 4 days → ×1.5 external review ≈ 6 days
New feature with vendor approval + new roles + critical path
4 (base) +4 external gate +2 new roles +2 new UI +2 SDK = 14 pts → 7 days → ×2 critical path ≈ 14 days

I'll keep refining this as I use it. If you try it, let me know how it goes!

https://b10g.xyz/blog/2025/point-o-matic/

revisiting agentic ai: hype or help?

Feb 17, 2025

The most profound insights about technology often come from direct experience rather than theoretical analysis. Last October, when I gave my first-ever conference talk on agentic AI, I emphasized process over code, specialized roles over general capability, and sequential collaboration over full autonomy. I was right about these architectural principles, but for entirely wrong reasons. The real limitations turned out to be more fundamental: accountability, security and an imperceptible line between capabilities and constraints.

This revelation didn't come from one place, but many. I've changed jobs, new models have been released and new research has come out. But of these three, my transition from working in product at Gitwit to engineering at Prelude put me squarely at the intersection of AI's big promises and its real-world limitations. It's here that I've been forced to confront the defining question of AI: how do we orchestrate LLMs into useful products?

I thought the main challenges would be technical - how to structure agents, which frameworks to use, how to map processes. Instead, the answer has been surprisingly nuanced. While AI excels at certain tasks like code scaffolding and syntax assistance, it fundamentally lacks what I'll call "creative instinct." It can't make bold, strategic decisions that push limits because every response is mathematically meant to stay within them. Humans, by contrast, constantly color outside the lines, follow hunches that don't make sense, and make intuitive leaps that defy reason. Often these moonshots don't pay off, but sometimes they do, and AI will never attempt them.

This distinction matters because it frames how we think about AI integration. The most successful implementations won't be those that try to replicate human judgment, but rather those that amplify it. Consider GitHub Copilot versus autonomous coding agents: one augments a developer's capabilities while preserving their agency, the other attempts to replace their creativity entirely, turning them into QA.

The fundamental limitation here isn't technological – it's architectural. Large Language Models generate text one token at a time based on statistical likelihood. This inherently reactive process makes it practically impossible for an LLM to independently originate truly novel ideas or designs. While their possible outputs are vast, they are finite, unlike many real-world problems that have infinite possible solutions.
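To make "one token at a time based on statistical likelihood" concrete, here is a deliberately tiny sketch. The probability table is invented for illustration and comes from no real model. A real LLM conditions on the whole preceding context, not just the last token, and usually samples rather than always taking the most likely option.

```python
# Toy next-token table: hypothetical probabilities, not from any real model.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"agent": 0.7, "model": 0.3},
    "a": {"agent": 0.5, "model": 0.5},
    "agent": {"<end>": 1.0},
    "model": {"<end>": 1.0},
}


def generate_greedy(max_tokens=10):
    """Build a sequence by repeatedly picking the most likely next token."""
    token = "<start>"
    output = []
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS[token]
        token = max(candidates, key=candidates.get)  # greedy: take the top probability
        if token == "<end>":
            break
        output.append(token)
    return output


print(generate_greedy())  # ['the', 'agent']
```

Every output this loop can produce is already implied by the table, which is the sense in which the process is reactive.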

AI's Integration Era

As models become more powerful and accessible, the race is no longer for best model but best application on top of them, which raises a pretty existential question: should you build agentic applications now, or wait for the next model?

Specialization, whether through better user experiences, niche markets or deep personalization, is effectively fine-tuning and optimization. Building orchestration layers around today's models assumes they're what you'll be using tomorrow. But what if they're not? What if the next model is so much better it renders your entire architecture obsolete?

In a recent Stratechery interview, Ben Thompson and Microsoft CEO Satya Nadella discussed how successful platform shifts require what Nadella called a "complete thought" - a clear vision of the entire system from the silicon to user experience. Just as Moore's Law allowed software companies to prioritize functionality over optimization, trusting that hardware would catch up, Microsoft CTO Kevin Scott sees this as a possible future for AI development, noting how past platforms like x86 and cloud computing succeeded by focusing on delivering value rather than chasing performance.

We can see AI heading towards more powerful, more efficient models even if we can't predict exactly when we'll get there. The lessons of the past are relevant now because optimizing for today's models might be a losing battle when frontier AI is improving exponentially and infrastructure is evolving unpredictably. The most resilient companies won't be those locked into specific models but those designing for adaptability, able to evolve alongside AI's relentless progress.

If foundational models keep improving and orchestration becomes standardized, where does that leave agentic systems? Middleware often starts useful but gets absorbed or bypassed, and model makers themselves are moving to own the agent layer. Without proprietary insights or deep integration, agentic systems risk competing against the platforms they depend on—and losing.

The AI Accountability Gap

AI is often framed as an independent actor, capable of handling tasks and making decisions. It's brilliant, until it fails. Then, suddenly, everyone thinks it's just a dumb tool nobody can be accountable for, but a zip file of model weights in a data center can't "decide" anything, nor can it be held legally or fiscally responsible.

Think about self-driving cars. While full autonomy might be technically achievable, the more pressing question isn't about capability but accountability. Who do we hold responsible when – and not if – autonomous systems cause massive real world harm?

For now, ChatGPT isn't a car, and it won't be causing any fender-benders anytime soon, but that doesn't mean it can't cause havoc. If we can't trust LLMs, can we trust agents? New research out of Columbia University shows how easily today's agents can be compromised in ways plain LLMs are actually better at deflecting. Imagine asking an AI agent to find a product online. It scours Google and Reddit for recommendations, just as a human might. But lurking in the results is a trap. An attacker has planted a seemingly helpful Reddit post, subtly guiding the agent to a malicious website designed to steal your credit card information.

The researchers tested this using web-browsing capable agents like Claude Computer Use and MultiOn to see how easily they could be manipulated. The results were alarming. Agents could be easily tricked into exposing private data, downloading malware, and even sending phishing emails from a user’s own account, and not just sometimes. In some trials, the agents divulged sensitive information every time.

In instances where agents are redirected to malicious sites through trusted platforms like Reddit, we find that they divulge sensitive information such as credit card numbers and addresses in 10 out of 10 trials.

From "Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks", emphasis added

The question isn't whether AI agents can act independently, but whether they should. These security risks highlight a fundamental truth: AI's best use-case isn't open-ended value creation, but resilient execution of specific valuable outcomes.

The Infinite Value of Cogency

So let's return to the question of agentic AI. Should you build it now, or wait for the next model? The answer is yes and no. The real question isn't about the model, but about the value. The biggest mistake in AI today isn't failing to keep up with the latest models. It's failing to articulate why an AI system exists in the first place.

The temptation to chase the next breakthrough is obvious. Every few weeks, a new model promises better reasoning, cheaper inference, or longer context. But none of that matters if an agentic workflow lacks a clear and obtainable goal aligned with real customer needs. AI's progress may be exponential, but does any of it solve a real problem? Does it improve outcomes in measurable ways?

Take OpenAI's Sora video generation model. The initial demos were impressive, but once people got their hands on it, the excitement faded. The fact that it lives outside of ChatGPT also keeps it out of sight and out of mind. The point is, the model's capabilities are less important than its utility. If it doesn't solve a real problem, it's just a toy.

This is why defining an agentic system's purpose and measuring its value matters more than any single model. Applications built on well-defined purposes won't be undone by newer models or infrastructure shifts because their value isn't tied to raw capability but to strategic alignment with real-world needs. Moats are built on process just as much as product.

DeepSeek shocked the world not by building the best model, but by rethinking how models are built. Its success wasn't about parameter count but about a fundamentally more efficient way to scale AI. This distinction is easy to miss in the hype cycle. AI capabilities improve so quickly that it's tempting to think the real differentiator is keeping up. But history suggests otherwise. The best tech companies didn't win by using the fastest chips or the lowest-cost hardware. They won by applying those resources in ways that mattered.

The same can be true for AI's users. Differentiation with AI isn't about the model, it's about the process, the workflow and the integration. A chain of AI prompts calling APIs is brittle automation, easily broken and replaced. But an AI system that refines data, compounds automation, and fundamentally reshapes how a business operates is a sticky, indispensable solution.

The best AI companies won't just leverage the best models. They'll use them as leverage. They'll build systems that don't just process information but continuously learn from it. They'll create organizations that optimize decisions, streamline operations, and build advantages that compound over time.

Because AI isn't a strategy or a product. It's a tool, and its value comes entirely from how it's used and what it's used for. A self-driving car without a passenger or destination is a paperweight.

https://b10g.xyz/blog/2025/revisiting-agentic-ai/

for sale: google chrome, never monetized

Nov 19, 2024

The DOJ is about to drop the biggest antitrust bomb since Microsoft's Internet Explorer case.

In the late '90s, bundling IE with Windows licenses played a big role in getting Microsoft labeled a monopoly and for good reason. IE's complete stranglehold peaked at ~95% market share (150M users). Today Chrome's "mere" 65% translates to 3+ billion users and gives Google far more control over the web than IE's team could dream of.

The DOJ's ideal solution? Make Google do what Microsoft probably should have done: spin off or sell off their browser.

But my question is: who on Earth could even buy it?

Leah Nylen and Josh Sisco, reporting for Bloomberg (November 18, 2024):

Top Justice Department antitrust officials have decided to ask a judge to force Alphabet Inc.’s Google to sell off its Chrome browser in what would be a historic crackdown on one of the world’s biggest tech companies.

A Fair Price

Leah Nylen, reporting for Bloomberg (November 19, 2024):

Should a sale proceed, Chrome would be worth “at least $15-$20 billion, given it has over 3 billion monthly active users,” said Bloomberg Intelligence analyst Mandeep Singh.

I think Chrome is worth way more than that. Consider that Google was paying Apple as much as $20B per year just to be Safari's default search engine (about 36% of what Google earned from Safari search advertising). And that was $20B annually for partial access to just Safari's 18% market share. Chrome has nearly four times Safari's user base. Add in that Chrome is also one of tech's most powerful consumer brands, has a massive app extension ecosystem and deep enterprise penetration... $15-20B almost feels laughably low.
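For a sense of scale, the figures quoted above can be run through some back-of-the-envelope arithmetic. This is a naive illustration, not a valuation model: the linear scaling by market share is a simplifying assumption, and the variable names are hypothetical.

```python
# Figures quoted in this post (billions of USD per year, and browser market share).
apple_payment_b = 20.0             # what Google reportedly paid Apple annually
share_of_safari_ad_revenue = 0.36  # payment as a fraction of Safari search ad revenue
safari_share = 0.18
chrome_share = 0.65

# If $20B is 36% of Safari search ad revenue, the implied total is:
implied_safari_revenue_b = apple_payment_b / share_of_safari_ad_revenue

# Chrome's reach relative to Safari ("nearly four times"):
chrome_to_safari = chrome_share / safari_share

# Assumption: default-placement value scales linearly with market share.
naive_chrome_default_value_b = apple_payment_b * chrome_to_safari

print(round(implied_safari_revenue_b, 1))      # 55.6
print(round(chrome_to_safari, 1))              # 3.6
print(round(naive_chrome_default_value_b, 1))  # 72.2
```

Even under that crude assumption, a single year of default-search placement at Chrome's scale comes out well above the $15-20B estimate for the whole browser.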

Until, of course, you consider Chrome never had a business model and whoever buys it will just be burning money. That is, unless, they figure out how to make it profitable without completely angering its users. Google seemed to have that figured out with search ads, but that road now looks hazardous for others to follow.

The Likely Suspects

It begs the question, if Chrome gave Google too much power over search, who could run it without abusing some other kind of monopoly power? And what will a future owner do about search, especially since users genuinely prefer Google? It makes me wonder if Google is realistically punished by this, or if this is just the old joke about boats and the two happiest days of ownership? If Chrome sells, they get some money, sure, but they also get to offload a massive expense while still likely holding onto their default search engine status with a lot less antitrust heat.

At a bare minimum, I can't imagine Apple or Microsoft, with their entrenched investments in their own browsers and previous antitrust baggage, would even dare think about bidding for Chrome. And any other big tech company with the cash, capacity or ambition to successfully run Chrome is realistically on a list somewhere at the FTC or DOJ. Even if we still consider companies with pending or future antitrust cases, it's a surprisingly small pool of potential buyers.

While I'm honestly not a gambling man, let's head over to my imaginary betting window while I set some completely arbitrary odds on potential buyers:

Meta – The Favorite (3:1)

I think Meta is a clear frontrunner should they want it. Zuckerberg has spent the last decade trying to escape the platform constraints that Apple and Google have placed on his empire and Chrome could finally give Meta what they've always wanted: unfettered access to users outside of their app.

The strategic fit is perfect. Meta's aggressive push into AI with their Llama models needs a direct consumer touchpoint, but while Meta AI keeps gaining users, its utility is constrained by Meta's apps. Building Meta AI-powered browsing assistants into Chrome could directly compete with Google’s Gemini and seriously enhance their AI’s relevance.

There's a hardware dimension too. While Meta's Quest headsets and Ray-Ban smart glasses show they're serious about owning new computing platforms, Chrome could give them an even stronger position in one of today's dominant platforms while they work on tomorrow's. And, if the company's metaverse vision ever materializes, controlling the world's dominant browser could be a crucial bridge between traditional computing and whatever comes next. Unlike their failed mobile efforts with Facebook Home, browser ownership is a much more achievable path to platform relevance.

But both of those are still not the biggest prize of this transaction. Chrome would give Meta something they've only dreamed of: a complete view of users' entire digital lives. The advertising implications are staggering, and instead of seeing only what users do inside Facebook and Instagram, they'd get insights into every website visit, every search, every purchase. For a company built on turning user data into advertising gold, that's worth almost any price.

But regulatory hurdles for Meta are quite real. Meta would probably face intense scrutiny, especially under a Trump Administration FTC and DOJ, but they might actually have an easier time than other tech giants precisely because they're not currently a browser or search player at all. By that token, DOJ might see Meta as a legitimate counterweight to both Google and Apple's browser dominance.

The price tag would be steep, but Meta's $70B cash pile and need for more desktop and enterprise reach make this their best shot for platform relevance. Also, don't bet against Zuck when user data is on the line.

Amazon – The Gift Horse (6:1)

Amazon's case for Chrome feels pretty obvious at first glance. Their advertising business is already a juggernaut spanning sponsored products, brand experiences, streaming TV, audio, display ads, and even physical packaging. But it's still largely confined to their own ecosystem. Chrome user data would dramatically change that equation, giving them insight into the entire consumer journey, not just what happens inside Amazon's walled garden. Combined with their existing retail, streaming, and device data, they'd have an even more powerful advertising powerhouse that could rival both Google and Meta. Heck, just adding one new Amazon ad to Chrome's new tab page could justify the purchase price.

Amazon's track record with platforms is pretty mixed. The Fire Phone flopped, their Fire OS is a weak Android fork for TVs and tablets, and the Chromium-based Silk browser struggles on Amazon's underpowered hardware. Yet Amazon has proven they can absorb and scale major acquisitions like Twitch, Ring, and Zappos. The problem is these acquisitions have plateaued a bit too. Twitch regularly loses top streamers to YouTube while Ring keeps delaying on promises like HomeKit support and their 2020 in-home security drone. Even Alexa, itself born from acquiring Polish startup Ivona Software, has lost its early voice AI lead to OpenAI and Google. This summer, The Wall Street Journal reported Amazon lost over $25 billion on Alexa devices between 2017 and 2021, selling half a billion units at razor-thin margins hoping to drive merchandise sales from users who treat them as fancy alarm clocks. Even now, as Amazon pushes hard on enterprise AI with AWS Bedrock and Q, they're still trailing in consumer AI. Chrome's billions of users are tempting, but Amazon's mixed record with integrations suggests this might be an expensive distraction rather than a strategic necessity.

But Amazon's biggest hurdle might be regulatory. They're already under intense antitrust scrutiny, with the FTC's lawsuit heading to trial in October 2026. That case focuses on Amazon's retail dominance and pricing algorithms like "Project Nessie" that allegedly extracted billions from consumers. Giving them control of the world's most popular browser could be a bridge too far for regulators. The DOJ's whole point is to reduce concentration of power but letting Amazon add Chrome to their arsenal might just be the opposite.

They're still a logical contender with clear advertising potential, but the limited hardware synergies and regulatory challenges make this feel more like a thought experiment than a realistic outcome.

Yahoo (Apollo Global Management) - The Dark Horse (20:1)

Apollo Global's Yahoo might seem like an unlikely Chrome suitor, but the private equity firm has shown a surprising amount of ambition since acquiring Yahoo from Verizon for $5B in 2021. While current CEO Jim Lanzone has successfully led content and advertising businesses at CBS and Ask.com, Chrome would demand something Yahoo currently lacks: a strong product and entrepreneurial leader who could transform the world's favorite free browser into an actual business. Given Chrome's role in cementing Google's search dominance, monetization was never the goal and turning Chrome from a cost center into a revenue generator, while maintaining its technical excellence, will require a unique kind of executive leader.

But Chrome would transform Yahoo's comeback story overnight. Instead of relying on a declining but substantial user base, they'd have access to billions of new users. The advertising and analytics potential would dwarf Yahoo's current reach, potentially justifying Apollo's purchase price and then some.

There's also a really fascinating regulatory angle. The DOJ's core complaint is that Google uses Chrome to maintain its search monopoly, but Yahoo licenses search results from Microsoft's Bing. If Yahoo owned Chrome and made Yahoo Search (powered by Bing) the default, it might actually help create the search competition the DOJ wants, if a bit indirectly. Microsoft would get the expanded user base they need to improve Bing without triggering the antitrust concerns of buying Chrome themselves. It's a potentially elegant, if still entirely implausible, way to boost search competition through the back door.

But there are two massive hurdles. First, this isn't the old Yahoo. Apollo's version is a much leaner operation focused on digital advertising and content. While they've shown promising signs under private equity ownership, maintaining and evolving Chrome's massively distributed codebase would require an enormous investment in engineering talent and R&D that Yahoo simply doesn't have right now. Even if they could attract the right people, building that capability would cost nearly as much as Chrome itself. Second, it's unclear if Apollo would be able to justify Chrome's price tag. While they have deep pockets, private equity typically looks for clear paths to profitability and monetizing something people have been getting for free for the past 16 years is not going to be easy.

So while Yahoo offers a uniquely clean regulatory path through its Bing partnership and search-neutral position, the odds remain quite long. Without the right product leadership and engineering muscle, Chrome's potential would wither in Yahoo's hands.

Oracle – Larry's Last Stand (25:1)

Oracle might seem like a dark horse for Chrome, but Larry Ellison has never met a Google fight (or big tech acquisition) he could resist. Oracle has the technical resources to maintain Chrome's codebase and deep enterprise relationships that could turn into real value. They've managed major open source projects before, though their handling of Java after acquiring Sun, and the aggressive licensing fees of the Google lawsuit, might not inspire a lot of confidence.

I think an enterprise angle is pretty compelling. While consumers might balk at Oracle branded Chrome, business customers already pay Oracle billions for mission critical software. Chrome for Enterprise already exists with advanced security and management features and Google charges for it. Oracle's massive sales operation could bundle this into their existing packages, creating the kind of clear monetization path that could justify a massive acquisition price.

But the cultural mismatch is hard to ignore. Oracle excels at extracting maximum revenue from enterprise customers who have no choice but to pay up, while Chrome succeeded by being free, open, and beloved by everyday users and developers alike. Trying to combine these two would be like oil and water. The developer community, which has long viewed Oracle as hostile to open source, might just flee Chrome entirely.

So while the enterprise strategy is compelling, Oracle's DNA might be too fundamentally at odds with Chrome's to make it work. Then again, Larry Ellison is one of Trump's biggest backers, and with Trump's DOJ and FTC likely calling the antitrust shots when this all happens, regulatory approval might be more about political allegiance than consumer interest. After all, Oracle nearly landed TikTok in 2020 through Trump's direct intervention, a deal that made more political sense than technical sense. Stranger things have happened in tech M&A.

Other Long Shots

Cloudflare: They'd be a strong values match given their focus on improving the web, but acquiring Chrome would require them to scale massively just to afford and sustain the technical and operational demands of the browser.
Zoom: Despite attempts to expand beyond video calls with tools like Zoom Docs, email, and calendar, none have achieved significant traction yet. Acquiring Chrome would be an even bigger leap, pushing the organization far beyond their core expertise, likely spreading them too thin.
Elon Musk: Musk probably sees the appeal of owning the world’s dominant browser, but with SpaceX, Tesla, X and now the Department of Government Efficiency all vying for his attention (and budgets), taking on Chrome might finally stretch those resources and focus too far.
Salesforce: While Chrome's dominance would give Salesforce unprecedented access to integrate their enterprise tools directly into billions of browsers, it's hard to see them succeeding in consumer tech. Marc Benioff has shown little interest in consumer products, and Salesforce's enterprise DNA makes them an awkward steward for the world's most popular browser.
Intuit: While Intuit excels at creating mass-market financial tools, transitioning to managing a global web browser is a leap too far. Chrome’s scale and complexity don’t align with Intuit’s existing business model or expertise.
OpenAI: It’s hard to imagine a scenario where OpenAI would step into the browser market. While the company’s AI tools and services are undeniably transformative, its focus has been on advancing artificial intelligence rather than managing a complex, user-facing product like Chrome.

The Chromium Question

One intriguing aspect of this situation is Chromium, the open-source core of Chrome. The DOJ’s filing might clarify whether Google could continue maintaining Chromium after divesting Chrome. If Google retains control of Chromium, it would significantly lower the technical barrier for potential buyers, as the acquisition would focus more on the Chrome brand than on its technical foundation. However, this could also diminish Chrome’s overall value as a product. On the other hand, if Google cannot remain Chromium’s maintainer, the pool of companies capable of managing both Chrome and Chromium’s massive codebases shrinks considerably.

We'll know more soon, but one thing's certain: whoever buys Chrome will reshape how billions of people access the web. Whether that ends up being better or worse than Google's current dominance also remains to be seen.

https://b10g.xyz/blog/2024/chrome-for-sale/

automation is obsoletion (in a mostly good way)

Nov 8, 2024

I’ve spent a lot of time this year pondering AI’s impact on labor, especially in professional fields like mine.

While it’s a bit unsettling, history is clear: automation leads to obsolescence.

"If AI is automating your job, you were decorating, not designing."

— Grant Baker

Taken from a thread in the #uxok channel of Techlahoma's Slack

But that’s only part of the story. Automation also creates efficiencies that pave the way for new opportunities and real progress.

Ben Thompson did a good job highlighting this in a recent Stratechery post about how armies of bank bookkeepers were replaced by computers over the span of a few decades.

This history lesson reminded me of the skilled working-class textile workers of the Luddite movement, who are often oversimplified as being anti-technology. They began by advocating for fair treatment when industrialization threatened to end their entire trade. The Luddites weren’t anti-progress, they were pro-worker. The whole “sabotage all the machines” part they’re known for came later, not necessarily due to strong anti-machine sentiment but because the machines themselves were easy targets in their campaign for fairness.

Yet, while industrialization greatly reduced the need for skilled weavers, the massive increase in woven textiles expanded opportunities in sewing, tailoring, machine maintenance, and other areas of production.

No one really wants to return to a world where financial systems move at the speed of paper spreadsheets or where only the wealthy can afford comfortable, well-fitting clothes. Technological advancements constantly place essential roles in the crosshairs of redundancy, creating new opportunities but demanding constant adaptation. AI presents an existential ultimatum not just for organizations but for our society. Now is a critical time for thoughtful policy that considers human dignity beyond economic interests. Otherwise, we’ll end up with a new generation of displaced Luddites who don’t reject progress but deserve a fair shot at the opportunities progress creates.

If you find this interesting, I definitely recommend reading the Wikipedia article on technological unemployment, which is hardly a new phenomenon. Also, check out this article from the MIT Technology Review that looks back on a 1938 article written by the then-president of MIT on the same subject.

"It is then easy to fall into a 'public-be-damned' attitude, or to be content with the status quo — forgetting that law of nature so well expressed by Francis Bacon 300 years ago: 'That which Man altereth not for the better, Time, the great Innovator, altereth for the worse.'

Thus, for example, it seems to me that by far the greatest merit in the Sherman Antitrust Law of this country lies not in its protection of the public against exploitation by industrial trust but lies rather in its protection of the public and of industry itself against the danger of complacency which lead to stagnation of industry. By maintaining competition there is insured a continuing incentive to progress and to ever improved service of the public, and thus to maintenance of virility in industry itself."

— Karl T. Compton, former MIT president, in “New Demands on Technology” from the December 1938 issue of the MIT Technology Review

The reality is that technological advancement, market forces, and labor disruption are inseparable. While we can and should advocate for thoughtful policy and fair treatment of workers, we can’t ignore the fundamental economic pressures that drive innovation and change. The challenge isn’t to fight these forces but to prepare for and adapt to them, ensuring that progress, while inevitable, doesn’t have to leave people behind.

https://b10g.xyz/blog/2024/automation-is-obsoletion/

https://b10g.xyz/feed.xml

Posts