Test Pappy

Part of wordpress.com

Thoughts and brain-droppings of a software tester.

What Am I Actually Writing About? A Reflection After 42 Days
Uncategorized

I caught up with my friend Stu Crocker a couple of weeks ago. We talked about a lot of things. He asked about my blogging process, how the posts come together, where the ideas come from. And then he asked something I hadn’t fully thought about. Across all these posts, on very different topics, is there a core thread? A personality, an intent, something that runs through the whole thing?

I couldn’t tell him what it was.

After ending a 42-day streak of daily posts, and publishing nearly 60 posts over the last 8-9 weeks, it’s time to find out what it is.

How the Posts Actually Come Together

The process is simple. Ideas arrive on my morning or evening walks. Usually two minutes or one duck later, they’re gone, unless I capture them. So I started using Claude’s speech-to-text: I stammer my thoughts into it, and a small skill I “built” turns the verbal mess into a scaffold. When I get home, I take that scaffold into WordPress and write the post, right then or later. Claude helps with research, with reviewing for typos and grammar, sometimes with pushing back on a weak argument or bad structure.

Writing is where the actual thinking happens. The scaffold is never the final post. I restructure, cut, find what I actually meant. The post that ends up online is not the one I started with. That’s the point.

So when Stu mentioned this core thread, I realized I could actually check with the help of AI. I asked Claude to read my last 50 posts, February through April, and tell me what it saw.

What Came Back, the Generous Part

The summary was flattering in places.

A practitioner who doesn’t panic in either direction. Someone who uses AI daily and still writes carefully about its risks. Someone teaching the thinking rather than handing out conclusions. Craft as ethics, not as a metaphor I reach for. Systems thinking as the spine underneath almost everything. Still learning out loud, citing others generously, refusing the guru position.

Fine. Not sure about all of it. I’ll take it. But that’s the easy half.

What Came Back, the Uncomfortable Part

I’m repeating myself. Meadows’ bathtub shows up three times. The junior-developer-with-amnesia appears over and over. “Velocity without verification is just negligence with extra steps” got used twice, word for word.

I forgot I’d made the points already. Meadows’ bathtub is a classic in the first half of her book “Thinking in Systems: A Primer”. With reader numbers being somewhere between 15 and 100, the duplication probably doesn’t matter that much.

But I’m arguing with CEOs in posts read by testers. The rhetorical questions about “do you really want to lose your senior people” are aimed at people who aren’t in the room. My actual readers are already nodding. That’s not persuasion, that’s a choir rehearsal.

Yes, I agree. Since I also post on LinkedIn, who knows who gets it pushed into their feed. But I agree, the main audience is my kin.

Some posts reached outside my domain. Especially the macroeconomic ones. When I write about testing, teams, and codebases, I’m on solid ground. When I write about Nvidia chip supply or the AI bubble, I’m just another guy with opinions.

True again. But the current global situation is exactly the kind where systems thinking is very valuable. And I read and listen to a lot of news, so I have the information to connect.

And the big one. Of the last 50 posts, maybe three aren’t primarily about AI. A reader landing on the blog today would not think this is a blog about software testing or systems thinking. They’d think I’m an AI commentator.

That’s not what I set out to be.

The Mirror, Not the Answer

Stu didn’t hand me a conclusion. He handed me a mirror. Asking Claude to hold it up was a worthwhile experiment. The mirror showed me both sides. You don’t get to keep only the flattering half. Of course, it only gave me the good things at first. I had to explicitly ask it for the negative feedback.

Stepping Off the Treadmill

I’ve said what I wanted to say about AI for now. The warnings are on record. When I find myself writing the same warning over and over again, that’s not adding insight anymore. That’s rehashing. (I hope I used the right word.)

The blog run of the last 8 weeks or so started as a place to process thoughts on software testing and systems thinking. The rants on AI misuse just came along, and I tried to use them to teach systems thinking. Probably with limited success. Either rant or teach.

The systems thinking part is where I want to put my attention next. Fewer posts. Better posts. Scarcity over flood. If a thought is worth interrupting your day with, it should earn the time.

As one of my favorite woodworkers puts it: “Stay humble. Make something.” I even have it as a t-shirt. Three times.

testpappy
http://testpappy.wordpress.com/?p=1829
Glue or Process Band-Aid?
Uncategorized

You know that feeling when you’re the one holding things together? When you spot something missing in the process, and you take ownership to take care of it? When you do things that are not in your role description, but they help to improve your job?

There are two roles that look similar from the outside but are fundamentally different.

The Glue Person brings the right things together. You enable flow. You connect parts that need connecting. The interfaces exist, they just need someone to make them work smoothly. You’re creating lasting cohesion.

The Process Band-Aid stretches over holes. You’re rapidly patching gaps, compensating for missing structures, filling in process steps that should exist but don’t. You’re not enabling the system. You’re masking its deficits.

Both feel like helping. Both look like being indispensable. But one is sustainable, the other is not.

Here’s where it gets interesting. The real question isn’t about you at all. It’s about the system you’re operating in.

If you’re a Glue Person, that tells you something healthy about your organization. There are interfaces, handoffs, collaboration points that genuinely benefit from someone connecting them. The parts exist, they’re well-defined, they just need someone to facilitate the flow between them.

If you’re a Process Band-Aid, that’s diagnostic information too. It means there are structural holes. Missing processes. Undefined responsibilities. The system has gaps, and instead of closing them, the organization has learned to rely on you to stretch over them. You’re not solving a problem. You’re hiding it.

And here’s the uncomfortable test: What happens if you leave tomorrow?

If you’re glue, the connections you’ve built should hold. The parts you’ve brought together should be able to stand on their own, or at least know how to find each other. Your absence might slow things down, but it won’t break the system.

If you’re a band-aid, everything you’ve been covering up gets exposed. The wound is still there. It was always there. You were just preventing anyone from seeing it.

Now, should it break? That’s the real question. Because sometimes exposing the gap is exactly what the system needs to finally address it. As long as you’re compensating, there’s no pressure to fix the underlying problem. The pain is hidden. The symptom is managed. The cause remains.

So what do you do with this? I’m not saying stop helping. That’s not realistic, and honestly, sometimes being a band-aid is what keeps things running while you figure out the structural fix.

But start making it visible. Every time you stretch over a gap, name it. Document it. Say out loud: “I’m doing this because we don’t have a process or person for X.” Make the compensation explicit. Because the system can only learn if it sees its own gaps.

testpappy
http://testpappy.wordpress.com/?p=1824
Finding Your Way or Why My Dad Can’t Use Windows
Uncategorized

I recently switched companies. New team, new processes, new tools. After a decade of using Jira, I’m now working with Azure DevOps. Different interface, different terminology, different quirks. And yet, within a few days, I felt at home. Not because I’m particularly clever. But because I have heuristics.

My father doesn’t have these. He’s been using computers for decades, but every Windows or iOS update throws him off. If a button moves, he’s lost. If a menu looks different, he calls me. He learned to use his PC by memorizing, or rather writing down, exact sequences: click here, then there, then type this. It works, until it doesn’t. When Microsoft or Apple rearrange the furniture, his map is useless.

The difference between us isn’t talent or experience. It’s the question we ask when facing something unfamiliar.

My father asks: “Where is the button I need? HELP!”

I ask: “What are the parts of this system? What do I want to achieve? What could the way there look like?” Looking for clues that spark attention. “Follow me!”

This is a heuristic I have used for years. And recently I learned from the Drs. Cabrera about their DSRP framework and its cognitive jigs. One of those jigs is the S-to-P jig, short for System-to-Perspective. The idea is simple. You take an existing system you know and apply it to new perspectives. Don’t memorize the surface. Understand the structure.

Let me give you an example. When I opened Azure DevOps for the first time, I didn’t look for a tutorial. I asked myself: What does a work item tracking system need? It needs a way to manage work items. It needs links between items. It probably has different item types. There’s likely some kind of hierarchy or grouping. And hey, maybe even a test management module, as it seems. And those all work more or less the same, don’t they?

Then I went looking for those parts. Where does Azure DevOps list items? Ah, there. How do links work here? Slightly different from Jira, but the concept is the same. And the test management module? Yes, and it still looks surprisingly similar to what I remember from Team Foundation Server back in 2013.

The tool changed. The questions didn’t.

This is what a heuristic or S-to-P jig does. It gives you a starting point when you don’t know the territory. You’re not stumbling around hoping to recognize something. You’re actively exploring with a mental checklist: What are the parts? How do they relate? What’s missing?
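
That mental checklist can be sketched in a few lines of Python. This is only an illustration: the `WorkItem` model, the `EXPECTED_PARTS` list, and the `missing_parts` helper are hypothetical names invented here, not the schema or API of Jira, Azure DevOps, or any other tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class WorkItem:
    """Tool-agnostic picture of a tracked item (hypothetical, not a vendor schema)."""
    item_id: int
    item_type: str                       # e.g. "bug", "story", "test case"
    title: str
    parent_id: Optional[int] = None      # hierarchy or grouping
    linked_ids: List[int] = field(default_factory=list)  # links between items


# The parts the post expects any work item tracker to have.
EXPECTED_PARTS = ["item list", "item types", "links", "hierarchy", "test management"]


def missing_parts(found_in_tool: Set[str]) -> List[str]:
    """Return the expected parts not yet located in the new tool, in checklist order."""
    return [part for part in EXPECTED_PARTS if part not in found_in_tool]


# First look at an unfamiliar tool: two parts located so far.
story = WorkItem(item_id=1, item_type="story", title="Export report")
bug = WorkItem(item_id=2, item_type="bug", title="Export crashes", linked_ids=[story.item_id])
print(missing_parts({"item list", "links"}))
```

The model stays the same across tools. Only the set of parts you have already found changes, which is the point of carrying the structure rather than the button positions.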

When you know which problem you need to solve, there is usually a pattern you have seen before that can be applied as a jig. If the solution you are looking at is not radically new (spoiler alert: it rarely is), you will see common, recurring patterns. Once you understand that structure, you can navigate variations. Jira, Azure DevOps, Trello. They’re all rearranging the same furniture.

My father doesn’t see furniture. He sees a room that must look exactly like it did yesterday. And when it doesn’t, he’s stuck. Which is especially fun when he has printer problems. (Boy, I hate printers!)

If you already have a good understanding of how a system roughly works, then applying that understanding to new situations is very helpful.

testpappy
http://testpappy.wordpress.com/?p=1783
Looking Into the Crystal Ball
Uncategorized

I want to take a look in my slightly broken crystal ball, and try to explain what I see for the future of AI, based on the news, podcasts, social media, experience, and some heuristics.

I don’t do this because I have any special insight. I don’t. But writing things down is sharing. And when I look back at this article in half a year or three years, I can make fun of today’s Grumpy. So here it is, a snapshot of what I think is happening right now in the GenAI market, and where I expect it to go over the next few months, maybe years.

Let me start with the obvious. The last three years have been wild. ChatGPT opened the floodgates in late 2022, and since then we’ve had a cascade of models, companies, products, and demos. Everybody with a computer has used GenAI by now, if only as a search replacement, with those AI summary sections at the top. We got access to all sorts of GenAI. Image and video generation, voice clones, coding assistants, and so much more. The public got handed something genuinely new, and we’ve been poking at it with sticks to see what it does and can do.

And this is where it gets interesting. Because I think neither OpenAI, Anthropic, Google, nor any of the other big players actually knew what GenAI was good for when they shipped it. Not really. They knew it was powerful. They knew it could do impressive things in demos. But concrete, reliable, money-making applications? That was unclear.

So what did they do? They gave it away. Cheap or free, to everyone, all at once. And we, the public, became the world’s largest focus group. Millions of people running millions of experiments in parallel, figuring out what this stuff is actually useful for. Lawyers feeding it contracts. Students using it for homework. Developers pairing with it. Marketers cranking out copy. Kids making memes. Populists spreading fake news.

This wasn’t generosity. It was scouting. A resource was handed out with no clear application, and the job of finding the applications was crowdsourced to the public. That’s a fascinating inversion of how technology usually rolls out. Normally you know what a tool does before you sell it. With GenAI, the tool came first and the use cases had to be discovered.

And now, a few years in, the picture is starting to crystallize.

Look at where the big players are putting their real engineering effort. Coding, testing, designing. Claude Code, Cursor, GitHub Copilot, OpenAI’s Codex, countless startups building developer tools. That’s not a coincidence. Coding is the first place where the productivity gain is measurable, repeatable, and valuable enough that companies will write large cheques for it. A senior engineer costs real money. The ROI is clear, and the market is enormous.

Compare that to the consumer side. Yes, ChatGPT has hundreds of millions of users. Yes, a lot of people pay $20 a month for Plus subscriptions. But the economics of consumer subscriptions are brutal compared to enterprise contracts. A single Fortune 500 customer can be worth more than a million individual subscribers, with less churn and less support overhead. If you’re running one of these companies and you’re thinking about where to invest for the next quarter, the answer isn’t “more features for casual users”.

The pressure for this shift is only going to grow. These companies have taken enormous amounts of money from investors. They’ve built infrastructure that costs a fortune to run. And the big ones are on a path to the public markets. Once you IPO, or even once you’re clearly on the runway to IPO, patience for “we’re still figuring out the business model” disappears. Shareholders want revenue growth. Revenue lives in B2B contracts, not in keeping the free tier generous.

So here’s what I expect to see.

GenAI for end users won’t disappear, but it will stop being the product. It’ll become the loss leader. The thing that keeps the brand familiar, that feeds training data, that gets people comfortable with the technology so their employers will buy the enterprise version. Free tiers will get thinner. Rate limits will tighten. The most capable models will sit behind higher price tiers or, increasingly, behind enterprise contracts.

The actual products, the ones that will pay the bills, will be domain-specific. Legal AI that helps with analysis, research, and writing. Industrial solutions for running complex machines. Pharma research tools that help with drug discovery. Financial compliance systems that read regulatory filings. Software engineering tools that integrate with your codebase and your ticketing system. These are the places where someone will pay serious money because the value is clear and the alternative is expensive human labour.

You’ll see consolidation too. Not every company currently claiming to do GenAI will survive. The ones that did nothing but wrap a prompt around somebody else’s API are already struggling. The ones without a clear industry focus will get squeezed. The winners will either be the foundation model providers themselves or the companies that have deep domain expertise and real enterprise relationships. Everybody in the middle is in trouble.

For casual users like most of us, this means the next few years will feel different. The pace of “wow, look what it can do” demos aimed at the general public will slow down. The exciting releases will start being things nobody outside a specific industry cares about. A new model for legal document analysis. A breakthrough in protein folding. A coding assistant that can handle larger codebases. Impressive to specialists, invisible to everyone else.

That’s not a bad thing. It’s how technology normally matures. The sampling phase ends. The application phase begins. The market sorts itself into segments where real value gets created and real money gets paid.

GenAI is a resource, but nobody knew what to do with it. It took a gigantic social experiment to find that out.

Now it’s time to focus on the money.

testpappy
http://testpappy.wordpress.com/?p=1810
Hey Siri, is AI Fatigue a Thing?
Uncategorized

The other day I scrolled through my LinkedIn feed and noticed something. AI, AI, AI. Three out of four posts are about AI in one way or the other. New tools, new prompts, new ways to be more productive. And somewhere between the tenth and twentieth post, a word popped into my head: AI fatigue.

So I asked Bing: “Is AI fatigue already a thing?” The answer came back in capital letters: YES. Together with, of course, an AI-generated summary. The irony wasn’t lost on me.

Turns out, since 2025 there have been actual reports and studies about people suffering from AI fatigue. And AI fatigue is not just one thing. It’s a whole family of exhaustions. (1) There’s the constant pressure to adopt yet another tool, to keep up with updates and new features. (2) There’s the cognitive drain of reviewing AI output instead of creating, turning us from makers into judges. (3) There’s the endless flood of AI-generated “slop”, filling our feeds with hollow noise and fake news. (4) And there’s the sheer hype fatigue, the relentless drumbeat of AI everywhere you look. Four flavors of exhaustion, and most of us are tasting several of them at once.
Even though this post was triggered by number (4), it focuses on (1) and (2).

What triggered this reflection was a report from Anthropic titled “Estimating AI productivity gains.” The numbers are impressive and frightening. Current AI models could increase US labor productivity growth by 1.8% annually over the next decade. That would double the growth rate we’ve seen since 2019. The median task shows 80% time savings. Sounds like progress, right?

But here’s the question nobody seems to be asking. Progress for whom?

From news articles and podcasts, I’m hearing something different. Management has discovered a new instrument. And it’s not being used to make work easier or more meaningful. It’s being used to squeeze more out of people. Faster. More. More efficient. At any cost. There’s this constant pressure now, this unspoken threat hanging over every desk: AI could replace you. So you better speed up. You better adopt. You better produce.

It feels less like a tool and more like a whip, like a threat.

And if you look at it from a purely capitalistic perspective, it makes sense. Companies are not in the business of keeping humans healthy and happy. They’re in the business of generating value. And right now, AI promises to generate more value with less human involvement. Which means the humans can produce more and more output in the same time. They just need to focus on the parts that AI can’t solve yet, and use AI as efficiently as possible for the rest.

And the human cost? AI fatigue is real. People are being forced to use tools they don’t understand, don’t trust, or simply don’t like. Some companies even reward the employees with the highest token usage. The pressure mounts. The exhaustion grows. We’re optimizing productivity at the expense of wellbeing. We’re measuring output while ignoring the people producing it. And with tokenmaxxing you only guarantee two things. More income for AI providers and a warmer planet thanks to unnecessary resource use.

Proponents say that AI handles the stupid and easy jobs, so that humans can focus on the tough parts. But constantly working on the tough parts drains energy. And work models don’t account for that. Managers now expect you to work 40+ hours a week on the tough parts only.
In a good world the work days would become shorter, because people with the help of AI can now manage 40 hours of work in, let’s say, 25 hours. So why not cut the work week down to 30 hours or less? People sometimes need a few stupid tasks to regenerate mentally. Or they need other means. Like longer breaks.
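
To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. It takes the 40 and 25 hour figures from the paragraph above at face value and assumes the speed-up applies evenly to all work. The function name `old_hours_equivalent` is made up for this illustration.

```python
def old_hours_equivalent(new_week_hours: float,
                         old_hours: float = 40.0,
                         ai_hours: float = 25.0) -> float:
    """Express a shorter AI-assisted week in 'old' pre-AI working hours.

    Assumes (hypothetically) that every hour of work speeds up by the same
    factor: old_hours of work now fit into ai_hours.
    """
    return new_week_hours * old_hours / ai_hours


for week in (25, 30, 40):
    print(f"{week}h week with AI is worth about {old_hours_equivalent(week):.0f} old hours")
```

Under these assumptions a 30-hour week still delivers what used to take 48 hours, so shortening the week would not even cost the company output. It would only share the gain.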

Professional athletes don’t train the hard moves and tactics all day. They also train the simple things, the basics. Because the simple things are the foundation for the hard things. By outsourcing the foundation you will lose a vital part of the package.
It’s like an F1 driver using his F1 car to get to the train station or to run errands, driving at maximum speed the whole time.

I don’t have a neat answer here. But I do have a question: Where does this end? Is “more productivity” really the goal we should be chasing? Or do we need a different conversation entirely, one about what good work actually means, about sustainable pace, about treating people as humans rather than resources to be optimized?

I know that companies need to make money. But for the long run, maybe it’s time to stop asking “How can AI make us more productive?” and start asking “What kind of work do we actually want to do?” and “How can we balance work and life in these new times?”

testpappy
http://testpappy.wordpress.com/?p=1793
AI Doesn’t Understand the World (yet)
Uncategorized

There’s a video by woodworker John Malecki called “AI vs Man” where he tries to build a cutting board designed by AI. The image looks, well, interesting. Complex patterns, different wood tones. The only problem: it’s physically nearly impossible to assemble.

This is the design John settled on, the one that actually looked doable.

The AI generated an interesting looking result. But it has no idea how you’d actually make it. It doesn’t think about which pieces need to be glued first, how grain direction affects stability, or whether the geometry even allows for assembly. It just sees patterns in training data and predicts what a cutting board should look like. And that prediction can be completely disconnected from reality.

This isn’t just a woodworking problem. The craft community has been fighting this battle for a while now.

Etsy and Ravelry are flooded with AI-generated crochet and knitting patterns. They come with polished images of adorable amigurumi animals, intricate shawls, cozy socks. People pay for the patterns, start working, and discover: they don’t work. NBC News reported on frustrated buyers who found patterns with impossible stitch counts, garments that couldn’t physically fit together, instructions that contradicted themselves.

One Reddit user, a crocheter with over 15 years of experience, wrote that the AI images fooled them at first. It was only when they looked closer that they noticed “stitches that weren’t there” and designs that were simply impossible to create with yarn and a hook.

There’s a wonderful earlier experiment called SkyKnit, where researchers fed thousands of knitting patterns into a neural network and let it generate new ones. The knitters who volunteered to test them called it “Operation Hilarious Knitting Disaster.” The AI produced patterns requiring 6,395 stitches in a single row. Patterns that would immediately unravel. Patterns that turned into “long whiplike tentacles.” The knitters had to debug the AI output, because they actually understood how yarn behaves in the physical world.

Why does this happen?

AI, especially generative AI, is fundamentally a pattern-matching system. It predicts the next likely token, pixel, or word based on statistical patterns in its training data. It doesn’t simulate physics. It doesn’t understand that wood expands across the grain, that yarn has tension, and that knots must be attached to other knots.

Humans learn these things through their bodies. You cut wood and feel the resistance. You knit a row and notice when something’s off. You fail, adjust, and build intuition over thousands of repetitions. This is what cognitive scientists call embodied cognition. Knowledge that lives in your hands, your muscles, your sensory experience. Some call it “muscle memory”, but it’s more than that.

AI has none of that. It has never held a chisel. Never felt a thread snap. Never assembled a single physical object. It can mimic the appearance of expertise with impressive accuracy. But appearance is all it has.

And here’s the danger: people are selling these phantom plans. On Etsy, on random websites, often for real money. The images look professional. The descriptions sound confident. And if you don’t know enough about the craft to spot the problems, you might not realize you’ve been had until you’re halfway through a project that can’t be finished.

So what can you do?

If it looks too perfect, be suspicious. Check the reviews. Look for photos of actual finished projects, not just the listing images. Ask yourself: can I trace the steps from raw material to finished product? If the seller can’t show you that path, maybe they don’t know it either.

AI is a powerful tool. But it doesn’t understand the physical world. It mimics the appearance of understanding. And that difference can cost you time, money, and a lot of frustration.

Trust the people who have actually built things. They know what’s possible because they’ve done it with their hands.

testpappy
http://testpappy.wordpress.com/?p=1800
Talk to the Machine, or Why We Chat with AI But Not Each Other
Uncategorized

Lisa Crispin recently dropped a comment under one of my posts.

“It seems funny to me that people are willing to turn things over to an agent, but they don’t want to collaborate with other humans to do a better job of building quality in.”

A little sad, if you think about it.

With the rise of AI coding assistants, suddenly everyone is prompting. People spend hours iterating with ChatGPT or Claude. They refine their prompts, try different angles, patiently wait for the response, adjust, try again. They have entire conversations with the machine about architecture decisions, edge cases, test strategies.

And yet. Ask them to walk over to a colleague and have a five-minute conversation about the same thing? Too much effort. Too awkward. Takes too long. Don’t want to interrupt.

I find myself wondering, what’s the psychology here? The friction is lower, that is clear. But is the machine “safer” because it doesn’t judge? Because we stay in control of the conversation? Because asking a human might feel like admitting we don’t know something? I genuinely don’t have the answer. But the pattern is hard to ignore.

Here’s the thing though. AI doesn’t skip the conversation. It just delays it.

You can prompt your way to code. Lots of code. Fast code. But without talking to your teammates first, that code arrives without shared understanding. Nobody discussed what problem we’re actually solving. Nobody aligned on the approach. Nobody mentioned that other edge case, or that dependency, or that thing the customer said last week.

So the code lands. And then come the pull requests that miss the point. The bugs that could have been avoided. The incidents that escalate. The rework. The frustration. All the conversations you skipped? They happen anyway. Just later, angrier, and almost certainly more expensive.

This is where, in my opinion, the difference shows. Teams that talk first and then use AI tools to execute? They scale. The AI amplifies their shared understanding. It accelerates what they’ve already aligned on.

Teams that prompt instead of talking? They drown. In code nobody asked for. In PRs that go nowhere. In technical debt that piles up because nobody stopped to ask “wait, is this what we actually need?”

This echoes what Lisa Crispin and Janet Gregory have championed for years: the human side of building software – the talking, the collaborating, the shared understanding – isn’t overhead. It’s the work.

AI changes nothing about this. If anything, it makes it more obvious. The teams that were already good at communication? They’re getting better, faster. The teams that weren’t? They’re just producing more stuff that misses the mark. At higher speed.

So here’s my ask. Before you open that chat window with your AI assistant, try something radical. Talk to a human first. Ask a colleague what they think. Have the awkward conversation. Align on the problem before you generate the solution.

The AI will still be there when you’re done. And it’ll be a lot more useful once you actually know what you’re building.

No agent can save you from not talking to each other.

a robot holding a flower
testpappy
http://testpappy.wordpress.com/?p=1763
The Tech Radar is Blinking Red
Uncategorized

ThoughtWorks just dropped Volume 34 of their Tech Radar, and it reads less like a technology map and more like a warning letter. Several signals on the same screen, all pointing the same way. If you’ve been following my posts, none of them will surprise you. What’s new is that one of the most respected consultancies in our industry is now saying it out loud. I have picked out a few topics to focus on, as at least half of the Caution blips have come up in my recent blog posts.

Cognitive debt

Teams shipping code faster than they can understand it. The Radar now names it explicitly: codebase cognitive debt. As AI generates more code, the gap between the humans on the team and the software they supposedly own keeps growing.

“Weaker system understanding also reduces developers’ ability to guide AI effectively, making it harder to anticipate edge cases and steer agents away from architectural pitfalls. Left unmanaged, teams reach a tipping point where small changes trigger unexpected failures, fixes introduce regressions and cleanup efforts increase risk instead of reducing it.” What sounds like one of my recent blog posts is actually from blip 36 of the Tech Radar.

Measuring the wrong thing

Blip 38 of the Radar also sits in the Caution ring, and it should make every engineering leader uncomfortable. It’s called coding throughput as a measure of productivity. The warning is simple. If you measure AI success by lines of code generated or pull requests opened, you will get exactly that. A flood of it. And not much else. It’s age-old metric gaming, and obviously AI is good at it as well.

This is a classic systems thinking trap. You optimize for the metric, not the outcome. Goodhart’s Law in a hoodie. The Radar points out what actually happens. Reviewers drown in barely-reviewed AI output, cycle times go up instead of down, and the effort to adapt generated code to the team’s architecture and conventions gets hidden from the numbers. The dashboard looks great. The deliveries do not.

ThoughtWorks suggests a better signal: first-pass acceptance rate. How often does the AI output get used with minimal rework? Optimize for that metric and it will actually do you some good. That’s a number that means something, and it connects directly to the DORA metrics. If the acceptance rate is low, change failure rate goes up and lead time stretches. If it’s high, you’re genuinely getting faster. Throughput alone tells you nothing.
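
To make that number a bit more tangible, here is a minimal Python sketch of how a team might calculate it. Everything in it is my assumption, not the Radar's definition: the `AiChange` record, its fields, and especially the 10% cut-off for what counts as "minimal rework" are hypothetical and would need to be agreed on by the team.

```python
from dataclasses import dataclass


@dataclass
class AiChange:
    """One AI-generated change that went through review (hypothetical record)."""
    lines_generated: int
    lines_reworked: int  # lines a human had to edit before merge
    merged: bool


def first_pass_acceptance_rate(changes, max_rework_ratio=0.1):
    """Share of changes merged with at most `max_rework_ratio` of lines reworked.

    The 0.1 default is an assumed threshold for "minimal rework".
    """
    if not changes:
        return 0.0
    accepted = sum(
        1
        for c in changes
        if c.merged
        and c.lines_generated > 0
        and c.lines_reworked / c.lines_generated <= max_rework_ratio
    )
    return accepted / len(changes)


# Made-up example data, purely for illustration.
changes = [
    AiChange(lines_generated=100, lines_reworked=5, merged=True),    # accepted
    AiChange(lines_generated=200, lines_reworked=120, merged=True),  # too much rework
    AiChange(lines_generated=50, lines_reworked=0, merged=False),    # never merged
    AiChange(lines_generated=80, lines_reworked=4, merged=True),     # accepted
]
rate = first_pass_acceptance_rate(changes)
print(f"First-pass acceptance rate: {rate:.0%}")
```

The interesting part is not the arithmetic. It is the conversation the team has to have to agree on what "minimal rework" means for them.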

Semantic diffusion and the flood of novelty

In a companion blog post, ThoughtWorks describes two things that are really the same problem wearing different clothes.

The first is semantic diffusion. New AI terms are proliferating faster than shared meaning. ThoughtWorks admits that while compiling the Radar, they had to spend serious time just clarifying what words actually meant. Everyone uses them slightly differently. If the authors of the Radar can’t agree on definitions, what chance does your delivery team have? You end up in meetings where everyone nods and leaves with different conclusions. And I was writing about problems with semantics well before the advent of GenAI.

The second is scale. For this edition, ThoughtWorks received more than 300 proposed technologies. Many were only weeks old. Some had barely any contributors, and one of those contributors was often a coding agent itself. Read that again. We’re now in a world where AI is producing the tools that AI is supposed to help us evaluate.

Both come back to the same thing. Novelty is outpacing understanding. The map can’t keep up with the territory, and the territory is partly being drawn by the very thing we’re trying to map. How do you make informed decisions in that environment? You probably don’t. You guess, or you follow whoever is loudest on LinkedIn.

So what

ThoughtWorks’ own conclusion is blunt: the current cognitive demand isn’t sustainable and will likely undermine the very gains AI is supposed to deliver. Let that land for a second. The productivity paradise we keep being sold is already eating itself.

This isn’t a tool problem. You can’t buy your way out of it. It’s a technique problem, and underneath that, a human problem. It’s about whether the humans on the team still understand their software, still share a language, still operate as a team, and can still make sense of the landscape they’re working in.

None of this means AI is bad. But the Radar has now confirmed, with data and client experience behind it, what a lot of us have been saying from the sidelines. If you build faster than you understand, you are not winning. You are accumulating a debt somebody will have to pay.

Slow down. Read your code. Talk to your colleagues. Agree on what the words mean. And when the next “revolutionary” agent framework drops this week, maybe sit that one out.

The Radar is blinking red. Look at it.

PS: If you haven’t read the Radar yourself yet, do. It’s free. And unlike most vendor content, it’s written by people who actually have to clean up the mess afterwards.

testpappy
http://testpappy.wordpress.com/?p=1817
The Convenient Scapegoat?
Uncategorized

When it takes a team two weeks to add a button, the problem isn’t the button. And the solution isn’t AI.

I’ve been there. Literally. A simple button on a screen. Two calendar weeks. And if you’re on the outside looking in, you think: what are these people doing all day? Are they incompetent? Lazy? You see a button. You think it’s a button problem. But what you don’t see is the teams that need to coordinate, the approval stages, the deployment that only happens fortnightly, and the legacy system nobody dares to touch. The button is easy. The organization around it is not.

And this is where it gets interesting. Because when leadership looks at that two-week button story, they don’t see organizational complexity. They see inefficiency. They see cost. And now, conveniently, they see a solution: AI.

Here’s what I’ve been observing. The IT industry has been under pressure for a while. Long before anyone typed a prompt into ChatGPT. Companies grew their tech teams aggressively for years. Salaries went up. Headcounts went up. But did the output go up proportionally? Honestly, in many places, it didn’t. The structures became heavy. Diligent, well-trained developers who truly think about what they’re building have always been rare. And surrounding them with layers of process and coordination didn’t make things faster. It made things expensive.

So the pressure to consolidate was already there. The uncomfortable truth that “we hired too many people and built too much complexity” was sitting in the room. But nobody wanted to say it out loud. Restructuring is a hard sell. It means admitting mistakes. It means difficult conversations.

Then AI arrived as the perfect narrative. “We’re not cutting people because we over-hired. We’re cutting people because AI makes us more efficient.” That’s a much easier story to tell. To the board. To investors. To the press. AI becomes the scapegoat for a correction that was coming anyway. It’s not a lie, exactly. But it’s not the whole truth either. Studies are still not consistently showing an actual increase in productivity.

And here’s what worries me. I see highly talented people, folks that companies would have fought over just two or three years ago, struggling to find new roles. Not because they’re not good enough. But because the calculus has shifted. Cheaper people plus AI looks like a solution on a spreadsheet. Leaders and investors see AI as a performance multiplier for everyone, so why pay top dollar for senior expertise when you can pay less and sprinkle some AI on top?

But let me ask you this. When you replace the people who actually understood why the button took two weeks, what do you think happens? The button won’t take two weeks anymore, sure. It might not get built at all. Or it gets built fast, in the wrong place, solving the wrong problem, breaking something downstream that nobody on the new, leaner team even knows exists. Speed without understanding is just faster failure.

So which one is the bubble? Is it AI, with all its promises and hype and investor fever dreams? Or is it IT itself, with decades of inflated teams and salaries that didn’t always match the value delivered? Maybe both. I honestly don’t know. But I do know that pretending one neatly fixes the other is dangerous. That’s not strategy. That’s wishful thinking with a narrative attached.

The consolidation is real. AI is real. But using one to justify the other without actually thinking it through? That’s how you lose the people who understood your systems. The ones who knew why the button took two weeks and, more importantly, knew how to make it take two days. And that kind of knowledge doesn’t show up on any dashboard. By the time you notice it’s gone, it’s too late. And more AI won’t fix it.

photo of a white goat with long horns
testpappy
http://testpappy.wordpress.com/?p=1770
Are We Building the Right Thing, or How GenAI Made Us Forget to Ask
Uncategorized

I was listening to the A/B Testing podcast the other day, episode 229, where Alan Page had my friend Chris Armstrong on as a guest. At one point they were talking about Verification and Validation. Two concepts that have been around forever. Two simple questions, really. “Are we building the thing right?” That’s verification. “Are we building the right thing?” That’s validation.

And it hit me. People with GenAI in their hands are not asking these questions anymore.

I wrote about this a few weeks ago, about how AI has killed the art of leaving things out. How the cost of adding has dropped so low that we stopped asking whether we should add. But it’s worse than that. We’re not just adding without thinking. We’re building without asking if we should build at all.

There’s a narrative I keep seeing on LinkedIn and in YouTube videos. The assumption that AI is just a tool to implement what you were going to build anyway. Same scope, less time. You had a plan, AI helps you execute faster. Sounds reasonable.

But that’s not what’s happening.

What I see is people using AI to produce more. Not faster delivery of the same thing. More things. More features. More code. More documents. Because now they can. The barrier is gone, so why not add another feature? Why not generate another module? Why not ship something extra while you’re at it?

It reminds me of “Flooding the zone with shit.” We lose sight of what’s actually important, because there’s just so much going on.

Think about it. You have an idea. You prompt it. A minute or two later, there’s code. Or a document. Or a feature. The friction is gone. And with the friction, the pause. The moment where you used to sit back and ask yourself, does this actually need to exist? Is this solving a real problem, or am I just building because I can? I often have this moment while writing the code or the document.

My friend Stu Crocker says, “Quality is the absence of unnecessary friction!” In this case the friction is necessary for the quality!

Verification and Validation are not new. They’ve been part of engineering disciplines for decades. And yet, somehow, the shiny new tool in our hands made us forget the most basic discipline of all. Ask before you build.

I see it everywhere. Teams shipping features nobody asked for. Code generated and merged without anyone questioning if it belongs in the product. Documents created because the AI made it easy, not because anyone needed them.

Here’s the thing. Validation comes first. Not verification. You don’t ask “are we building it right” until you’ve answered “should we build it at all.” GenAI flips this on its head. It makes building so effortless that we skip straight to the how. We never stop at the why.

And that’s dangerous. Because now we’re not just accumulating features, like I wrote before. We’re accumulating entire products, entire solutions, that nobody validated. We’re building faster than ever, and asking less than ever.

So here’s my call to action. Before you type that next prompt, stop. Ask yourself the two questions. In order.

First: Are we building the right thing? Is this needed? Does this solve a real problem? Will anyone actually use this, or want this, or benefit from this?

Only then: Are we building it right? Is the implementation sound? Is the code clean? Does it fit into the system?

The questions have been there for decades. GenAI didn’t make them obsolete. It made them more important than ever.

Start asking again.

a group of people sitting down and one man is holding his head
testpappy
http://testpappy.wordpress.com/?p=1772