GeistHaus
log in · sign up

Annotations

Part of leaflet.pub

Exploring the overlaps between design, people, product, science, and tech — one messy idea at a time.

stories
The value is in the difficulty
On building software, vibe coding, and what happens to market prices when the barriers fall
Show full content

Anomalocaris canadensis. Credit: Paleozoo

I got no sleep last night, which means I'm awake early thinking through things I can't fix. One haunted topic is how software creation is going through a value collapse and despite all the vibe coding poasts on the subject, I'm still not sure most people in the industry have yet worked out what that means and what comes next.

Pop economics gives a tidy answer to what makes something valuable. It's scarce or hard to produce (or both). Price roughly reflects the cost of making something, plus whatever premium scarcity allows. When barriers to production are high, supply stays limited. A limited supply X real demand = higher value.

This value holds until something disrupts the barriers. When production suddenly becomes cheap or easy, supply floods in. When that supply reaches a saturation point, then prices must slide toward the marginal costs. When the costs approaches zero, inevitably so does the price.

Even if buyers can't evaluate quality directly, the fact that something was hard and expensive to make is a useful proxy. A handmade suit is probably better quality than a fast-fashion one. Software that took a team of expert engineers a year to build probably solves a problem thoroughly. This labour signal communicates some value before you've experienced the product. Are scarcity and labour signals are about to stop applying to software, and how can they not?

Before language models, writing software required significant and genuine expertise, time and capital. Most users didn't need to understand how code worked to understand that it was bloody hard. You'd hire a developer, or teams of them, pay them a boat load of money, and they'd spend months working on something hard. That effort was visible and it priced the market.

There was a real scarcity of people who could write code well. Years of training, accumulated pattern recognition, architectural instincts. That stuff took time to develop. You couldn't shortcut it to the level that shipped reliable, scalable AAA software. That constraint is now going, or gone.

Vibe coding collapses all these the barriers and the labour signal will be the first to break. Anyone with taste, a bit of domain knowledge, a decent prompt, and a tokens budget can now ship something functional in days. A weekend vibe-coder can produce something that looks and behaves comparably to months of professional work. Maybe not enterprise-grade distributed systems perhaps (well, not yet), but something functional. Something good enough to solve a problem. Something good enough to sell (well, for now).

Buyers now know this (or they're learning this). Once that's more-widely understood, the implicit premium attached to "software exists, therefore someone worked really hard on it" disappears, never to return.

We've seen this arc before, and music is the richest analogy.

Pre-internet, making and releasing music was genuinely expensive. Recording, pressing, distribution, promotion all cost real money and required real experts and gatekeepers. Fewer artists got through, but of those who did, a meaningful proportion could make a half-decent living. Then DAWs, MP3s, P2P sharing, and MySpace arrived almost all at once. Anyone could record and release a professional-sounding track for next to nothing.

The filters collapsed almost overnight.

The result was more musicians, more music, more creativity, more voices than ever before. Genuinely wonderful in many ways. And proportionally, far fewer artists able to make a sustainable living from their work.

Skip forward, and the long tail is now very long and very thin. Most musicians make music despite the economics. The joy is real; the money usually isn't. The entities making money are the streaming platforms, the ticket vendors, the merchandisers, the distributors. As in every gold rush, it's the shovel sellers that make the money.

What's interesting is that the music industry didn't flatten into equality when the barriers dropped. It sorted into three distinct tiers:

At the top, a tiny number of artists enjoy massive commercial success, but even that gets propped up by brand deals, sync licensing, touring. Streaming royalties alone rarely sustain even the very famous ones.

In the middle, a hollowed-out tier survives on direct-to-consumer relationships: Bandcamp sales, Patreon subscriptions, merch drops, ad reads. Less glamorous, but honest work. The model is: find the people who value what you make enough to pay you directly, and sustain that relationship.

Then the low end long tail—the vast majority—where the day job pays the bills and nights and weekends go to the thing they love. No illusions about going full-time. It's maybe the most-honest relationship between a maker and their craft. They make music because making something matters, regardless of whether it pays—and it usually doesn't. Software is heading for the same stratification.

After a couple of too many years working in product development, I know that the most valuable thing about software isn't always in the code itself.

Network effects work when value lives in who uses the product. WhatsApp isn't valuable because it's incredibly hard to build. It's valuable because everyone you want to message is already on it. A technically equivalent replacement doesn't matter. But genuinely networked products are rare as hen's teeth.

Distribution keeps attention scarce even when code is free. Getting software in front of users still requires marketing, trust, and often real money. But I wonder does control of distribution sustain value now, or does it just delay the erosion? This moat may be shallower than it appears. A well-funded competitor with a vibe-coded version can appear overnight and undercut immediately. The foundational AI companies are now expanding their product portfolios and sherlocking is a thing.

Deep domain expertise is the strongest argument. Knowing what to build for a genuinely complex problem—e.g. medical, legal, or industrial use cases—still requires knowledge language models can't (yet) replicate cheaply. The code is "easy". The insight about what the code needs to do isn't. But again, this protects only a narrow-but-lucrative slice of the overall software products market. Expect industries to try solve their problems internally now rather than pay external experts.

But broadly, are network effects, distribution and domain knowledge enough for most products? Are they enough for the flood of vibe-coded apps that are coming?

What I think actually happens is that value concentrates at the top. A small number of products with genuine moats capture most of the market. Everything else slides toward marginal costs. The middle market is where it may get really uncomfortable.

Those products existed because software was hard to build. The difficulty was the value. Remove the difficulty, and the rationale for their existence—and certainly their price—weakens considerably.

What follows is a Cambrian explosion of software products: vast in number, lacking in real differentiation, with near-zero individual value. Many useful things get built. Most generate no revenue. The long tail of content creation, now applied to software.

And here's the part that workers in the industry should sit with for a moment: Cui Bono? Who wins in this scenario? The Anthropics, the Lovables, the Vercels: the infrastructure layer. The shovel sellers that profit regardless of whether the builders succeed or not.

A diverse software ecosystem—lots of builders, lots of products, lots of different solutions to lots of different problems—has been a net benefit to everyone for many years.

But if the music industry example holds true, the picture that emerges isn't "zero software value". Its a power law that says a handful of big winners, a hollowed middle sustained by direct relationships and goodwill, and a long tail of useful things with a handful of users built by people who love building (without expectation of financial return).

That's not nothing. The long tail may produce real value for real users. But it's a very different world from the one the software industry has operated in for the last 30+ years. It's one where many in the industry may need to start looking for other forms of income.

https://renderghost.leaflet.pub/3mlsbz7j5rc2z
Interaction Costs: The High Price of Low Usability
How Balsamiq's recent redesign broke rapid wire-framing.
Show full content

Photo by Gabriel Rodrigues

For fifteen years, I've used Balsamiq on almost every UX job I've ever had (I checked my 1Password earlier and I have 35 unique logins for different projects and clients). I loved Balsamiq because it was the fastest wireframing tool on the market.

Not because it had more features than its competitors, but because it had fewer. The scratchy pencil sketch aesthetic, the Comic Sans typography, the deliberately constrained component library: all of it was designed to make one thing blazing fast: iterating on information architecture before we started discussing visual design.

Then they redesigned the interface. And in doing so, they forgot the fundamental principle that made their product so valuable in the first place: interaction cost.

What Is Interaction Cost?

Interaction cost is the total mental and physical effort users spend to reach their goals when using a website or app.

A direct measure of usability. Every successful product spends enormous resources minimising it. Ideally, you want almost-zero interaction costs: a user thinks of a goal, and it just happens, but that's not realistic, because most products do many things. So instead, users look around, read, scroll, click, wait for pages to load, remember information from one screen to apply it on another and so on. All of these physical and mental actions add up to interaction cost.

Highly usable products minimise this cost across multiple dimensions:

Mental Interaction Cost (Cognitive Load)

The brain power you spend learning, reading, remembering and figuring things out. It increases when tasks require more attention and memory to complete.

Your working memory is limited. Psychologist George Miller famously demonstrated that the average person can hold only 5 items (plus or minus 2) in working memory at one time. Good interfaces work with this constraint through techniques like chunking information into meaningful units, using short paragraphs, clear visual hierarchy, and grouping related items together.

Recognising things is easier than remembering things. It's far easier to recognise something when you see it than to recall it from memory. This is why labelled buttons outperform icon-only interfaces—unless you're using those icons so frequently that they become second nature. Making field labels visible, providing step-by-step tutorials in context rather than forcing users through documentation, and using familiar design patterns all reduce the mental cost.

Physical Interaction Cost

Clicking, scrolling, typing, and moving between interface elements.

Fitts's Law describes how some aspects of this works: the time required to move and click a target relates to distance-to-the-target divided by the size-of-the-target. Longer distances and smaller targets both increase interaction time.

This is why mobile interfaces need larger buttons, and why toolbars with persistent locations, especially pinned to sides and corners, are faster to use than context menus that appear in unpredictable positions.

Measuring Interaction Cost

You can measure interaction cost through task analysis—the process of observing users to understand how they perform tasks and achieve goals. A task-analysis can map the mental and physical steps required to complete a goal, revealing:

  • Total number of tasks: Are there opportunities to remove steps?

  • Frequency of tasks: Which operations are repetitive?

  • Cognitive complexity: How much mental load does each step generate?

  • Physical requirements: Could these affect performance or accessibility?

  • Time taken: Which steps add unnecessary duration?

Why Interaction Cost Matters
Expected utility = Expected benefits - Expected interaction costs

That formula governs every product decision users make, including HOW they use your product and IF they will use your product.

People naturally want to maximise expected utility. When there are multiple ways to reach the same goal with similar benefits, they'll choose the path of least resistance. Designers call this a desire path.

But when the interaction cost of using your site or app becomes too high, people will simply choose a different product. This is why interaction cost isn't an academic concern—it's an existential one. The more effort you ask of them, the greater reward they expect.

If the benefits stay the same but the costs go up, the expected utility goes down. The formula is unforgiving: A product that increases interaction cost is a product that's decreasing its own value.

Balsamiq's Forgotten Fundamentals

Balsamiq launched in 2008 as a wireframing tool that understood one thing perfectly: time-to-results matters more than sophistication, and their value proposition was explicit about speed and how that impacts outcomes:

Wireframe your way to faster, better product decisions.
Turn product ideas into clear visuals in seconds....

What is Wireframing?

Although the definition has crept in recent years, wireframing is traditionally an exercise in information architecture, not interface or interaction design.

It's the practice of creating sketch drawings in low fidelity to allow fast iteration on content before committing to hierarchies and flows across a website or application.

The value of working in low fidelity is the separation of concerns: you focus on content and layout, not pixels and polish. This reduces cognitive load and lets everyone in the design process discuss content and structure without getting derailed by visual design decisions.

Balsamiq seemed to understand this deeply. The scratchy aesthetic and Comic Sans typography weren't just stylistic choices—they were functional constraints. You couldn't do visual design in Balsamiq. That was its power. You could only focus on information architecture, which is exactly what wireframing is supposed to be about.

You could make something just tidy enough to understand.

The Old Interface: Designed for Speed

In the old Balsamiq interface, selecting a component on the canvas revealed all available properties and actions in a persistent sidebar. The properties were:

  • Immediately visible: Zero clicks required to see what you could configure

  • Clearly labelled: No need to memorise icon meanings

  • Consistently located: The sidebar didn't move, disappear, or change position based on canvas navigation

  • Logically organised: Properties were grouped in a sensible order

Apart from occasionally scrolling within the sidebar, you could configure components almost as fast as you could think. One click to select a component, immediate access to all properties.

That's minimal interaction cost by design.

The New Interface: Design Pattern Cargo-Culting

The new interface replaces the sidebar with a minimal floating 'selection toolbar'.

  • Appears above the selected component: Position varies based on where the component is on the canvas

  • Uses unlabelled icons: You must memorise what each icon means

  • Hides properties in nested submenus: Some properties require clicking through two or three levels of navigation

  • Scrolls off-viewport: If you navigate the canvas while a component is selected, the toolbar disappears

  • Changes between components: The same property uses different icons depending on which component type you're working with

Comparing Old & New

Lets think about a simple everyday task for a wireframer: configuring a button component with custom text, a colour, and a link.

In the old interface, I might:

That's 4 actions, taking about 10 seconds with the minimal cognitive load.

In the new interface, I must:

That means more actions, more clicks, more time and more mental effort for a simple and frequent operation.

Every Dimension of Interaction Cost IncreasedRecognition Replaced with Recall

Nielsen's heuristic #6 is "recognition rather than recall".

Systems should minimise the information users must remember. Labels let you recognise the option you need. Icon-only interfaces force you to recall what each icon means. Some icons are so universal, they're immediately recognisable. Like the B and I that signal bold and italic text. But most, must be learned.

The old sidebar used recognition. The new floating toolbar uses recall. And worse: the icons aren't even consistent between component types. The same property (such as auto-resize) uses different symbols depending on the selected component. Now, you're not just building a single icon vocabulary, you're memorising multiple icons and forced to account for inconsistency.

Can you feel your brain getting tired?

Fitts's Law Violations

The floating toolbar violates Fitts's Law in at least three ways:

The old sidebar had a fixed location and was constantly visibility. These are optimal conditions for building muscle memory and minimising physical effort. The floating toolbar destroyed both of these advantages.

Working Memory Overload

The old sidebar required minimal working memory: scan visible labels, click the one you need. Nothing was hidden in sub-menus. The new toolbar requires you to maintain a mental map of:

  • What each unlabelled icon means

  • Which properties are in which submenus

  • How the icon meanings change between component types

  • Whether the toolbar is currently visible or has scrolled off-screen

That's 4 additional cognitive burdens for every single component configuration! Multiply across 50 components in a wireframe. Multiply across a day's work. The accumulated cost is potentially exhausting.

How Did This Happen?

The floating toolbar is a design pattern borrowed from visual design tools. It makes sense in those contexts because.

  • Canvas real estate is precious for pixel-perfect visual work

  • Users configure fewer properties more deeply (extensive colour adjustments, precise positioning, complex layer effects)

  • Visual designers expect to memorise complex tool palettes as part of professional expertise

  • The pace of work is slower because each component receives detailed styling attention

But wireframing is the opposite use case:

  • Canvas real estate management is less relevant because you're sketching, not pixel-pushing

  • Users configure many properties shallowly (lots of components, basic configurations, rapid iteration)

  • Speed users need fast menus with immediate labelled access because memorising icon vocabularies is pure overhead that doesn't contribute to information architecture thinking

  • The pace of work must be fast because the entire value proposition is rapid iteration

This is design pattern cargo-culting: copying an interaction pattern from a different product category without understanding why it works there or whether it solves an actual problem in your context.

The floating toolbar is a solution looking for a problem. And by implementing it, Balsamiq created a much worse problem: they made rapid wireframing slow.

A Category Error

Balsamiq succeeded precisely because it wasn't a visual design tool. It was the anti-Figma. The constrained component library, the deliberately 'ugly' aesthetic, the focus on content over form—all of it positioned Balsamiq as the tool you used before you opened Figma.

Are Balsamiq trying to compete with Figma? That's a category error. They built a genuinely great product by understanding the use case and optimising ruthlessly. This redesign suggests they've forgotten their primary proposition and values:

  • Simplicity beats complexity
    The best tools get out of your way (!!!)
  • Clarity leads to better products

The Solution?

Interaction cost is a measurable usability metric and clicks count, especially the wasted ones. Consider the mental load. Compare alternatives quantitatively. Task analysis exists for a reason. Use it. Map the steps required to complete core workflows. Count actions. Measure timing. Identify mental overheads.

Do not cargo-cult design patterns from other product categories without understanding whether they actually solve the problem in your context.

The best design is not 'less on screen'. The best design helps me to achieve my goals and gain benefits without paying high interaction costs to achieve it.

Dear Balsamiq

You built a truly great product by understanding that rapid iteration requires minimising interaction costs. This redesign broke that fundamental principle. Without a rollback, your power users (people like me who've been using Balsamiq for more than a decade) will haemorrhage productivity.

The formula is simple. The expected utility of Balsamiq has decreased because the interaction costs have increased the burden of use while the benefits stays the same. The solution is obvious.

Restore the sidebar! (and remember to count the cost of design changes)

🙏

https://renderghost.leaflet.pub/3mhd6nfyhtc2z
I work, I think?
How AI may quietly dismantle the feedback loop that turns inexperienced people into competent ones, and why my work matters to me.
Show full content

I don't know exactly when I started noticing it. Somewhere in the past year, in meetings, at conferences, on feeds. A particular quality of confidence in people who have no business being confident. A smoothness. Outputs that look considered but aren't. Decks that have the right shape but none of the right thinking. And underneath it, something that took me a while to name: these people aren't embarrassed. They have no idea anything is missing.

That's the part that stays with me.

There's a version of the AI-and-work conversation that's about jobs. Will there be enough of them. Who gets displaced. Whether the gains get distributed or hoarded. That's a real conversation and other people are having it better than I can. This isn't that conversation.

This is about something that's already happening, that doesn't show up in employment figures: the quiet destruction of the feedback loop that turns inexperienced people into competent ones. The process by which you get something wrong, feel it, understand why, and become slightly less wrong next time. It's unglamorous and it's slow and it's the only way it's ever worked.

AI short-circuits that learning completely. Not maliciously. Just structurally. When you can generate something that looks right without doing the thinking, you will (most people, most people being me, will, most of the time, under pressure, with a deadline) and the muscle that thinking would have built never develops.

You don't notice the deficit because the outputs keep coming. Everything looks fine. Everything is fine, by every metric anyone is measuring. Until it isn't, and by then the people who would have known the difference have either left or stopped saying anything.

I work, I think?

My high school principal always said "Never choose a job based on what you want to be. Choose a job for what you want to do every day".

Marx wrote about this in the Economic and Philosophic Manuscripts of 1844, where he describes what work actually does for a person, beyond just producing income. His argument is that the thing you spend your hours doing is the thing through which you come to understand yourself and your relationship to the world. The thinking, the trying, the making, the solving — that isn't separate from you. It is you.

Estrange a person from their work, make it something they merely perform rather than genuinely do, and you don't just take their time. You take something harder to name and harder to get back.

I used to think this was the romantic reading of labour. Now I think it's the precise one. Because what I'm watching, in slow motion, across an industry full of smart and well-intentioned people, is exactly that estrangement. Not forced on anyone. Chosen, incrementally, under entirely rational local pressures, by people who would probably object to the description.

The Luddites have been on my mind.

Skilled textile workers in early nineteenth century England who understood exactly what was happening to them and responded with coordinated, targeted action. They weren't against machines. They were against machines being used to hollow out skilled work, drive down wages, and concentrate the gains in the hands of mill owners who had no stake in what the workers lost. They saw the mechanism clearly. They named it. They fought it, briefly, before being crushed.

What they understood and what I think we're struggling to hold onto right now is that the question was never really about the technology. It was about power. Who controls the conditions of work. Who benefits from changes to those conditions. Who bears the cost.

We've been through this already, quietly, in the background, for decades. Property made inaccessible. Pensions made inadequate. The basic cost of a stable life ratcheted upward, steadily, by mechanisms abstract enough that it was hard to name a culprit.

Efficiency! was the word used. Optimisation!

The people making those decisions live somewhere that consequences don't reach.

Now it's the work itself. The thinking. The last thing that can actually actually be mine.

Looming dread

The mill owners didn't build the looms. My peers. Designers. Engineers. Product managers. Founders. We're constructing, with genuine intelligence and often genuine good intentions, the systems that are doing this to us, and everyone else.

The incentives are there, the pressure is real, the alternative is being left behind, and nobody wants to get caught with their pants down while everyone else moves. The logic of inevitability is coherent at the individual level. At the collective level it's a mechanism for producing outcomes nobody really wanted, with no obvious brake.

"We can't just un-invent it!" they tell me.

I watch people I respect evangelise tools that will commoditise their own expertise. I watch founders sell products they know, at some level, will make the case that even founders aren't necessary. I watch the whole devaluation apparatus accelerate and I understand why each individual component is accelerating.

Speeding around this bend towards the event horizon, I can't see where it stops. Can you?

The heat death of doing things.

I don't have a programme. But I have a strong preference for staying on the side of this where the thinking is still real and present. Not because it's a moral victory, or a winning career move (probably not), but because the alternative isn't neutral.

You don't stay still if you stop doing the work. You drift. And the drift is invisible from the inside. You don't feel yourself becoming someone who can't tell the difference anymore. You just gradually stop noticing that you used to be able to. The thing you've been separated from doesn't haunt you. It just stops being imaginable.

Some of us can feel the temperature dropping. Some of us are turning up the cooling ourselves. Most of us, if we're being straight about it, are doing some of both.

I'm not happy or willing to let that happen to me. I'm also not sure that I can stop it. But at the scale of my practice, my person, my work? That much I can try to hold onto. And I'm already failing. I have to use AI every day now, not because I want to, but because I need to earn income. The honest thing is to admit it rather than pretend I'm outside the problem I'm describing.

But bit by bit, I feel my beloved work become something I perform and not do.

https://renderghost.leaflet.pub/3mgigxmbx4k2z
Stop asking users what they want — and start watching what they do.
People's opinions about themselves and the things they use rarely match real behaviour.
Show full content
The Harbormaster by Shondon

If you're involved in creating software products, you’ve probably done the reasonable, common-sense thing many times: you asked people what they wanted.

Maybe you sent a survey, asked in an interview, or dug through a pile of feature requests. It feels natural and intuitive. Who better to tell you what to build than the people who will use it?

But decades of UX research, behavioural science, and product practice point to a surprising truth: people's opinions about themselves and the things they use rarely match real behaviours.

We confidently and frequently declare tastes we don't have and predict behaviours we'll never actually carry out.

Relying on user opinions to make product decisions leads to strategic drift, fragmented user experiences, wasted resources, user churn, and competitive risk.

Why asking people for opinions feels right but doesn't work well.

The instinct to ask people for opinions isn’t misguided exactly. It’s just incomplete, and an over-reliance on requests and opinions leads many a well-intentioned team quietly off course.

Asking people what they want is one of the most natural instincts in product work. Surveys, interviews, and feature wish lists feel accessible, social, and collaborative. They open channels to understand and empathise with the user base. They help teams feel closer to the people they serve. For teams under pressure, a stack of opinions can feel like solid data.

But this breaks when we compare what users say to what they actually do (say-do gap).

We all want to present ourselves a certain way. We want to seem more competent than confused (social desirability bias). Our memories can be fuzzy, especially about routine tasks (recall bias). Standards for what feels “easy” or “intuitive” can vary wildly between people (reference bias).

Most human decision making happens somewhat automatically, outside of our conscious awareness, shaped by our moods and habits. This makes accurately articulating our needs, the problems we face, and the motivations guiding our choices really hard.

The say-do gap isn't a bug, but a universal feature of human cognition. We don't have introspective access to the underlying forces that shape our actions, which makes opinions unreliable.

There's another trap baked into opinion-based research: when we ask users what they want, they naturally jump to imagined solutions that usually reflect personal taste, current workarounds, or the narrow scope of their own workflow problems.

“Add a button for X” or “let me export Y as a CSV” can be genuine signals of friction—but they’re still just guesses. Taken as-is, they can steer teams toward incremental tweaks and scattered features instead of shared problems that matter to all the user base.

This is how roadmaps fill with "faster horses", "low hanging fruit" and "quick wins": the low-impact clutter that feels like insight but rarely leads to scalable sustainable products producing great outcomes for both companies and customers.

All this points toward a simple principle: treat opinions as clues, not conclusions. They're useful for understanding context, language, expectations, and emotional textures but they cannot (and must not) stand alone as a source of truth.

To understand what people actually need—and to build products that actually solve their problems—we have to look beyond what they say and pay attention to what they do.

The quiet consequences of taking opinions at face value

Opinions connect to past experiences. People struggle to imagine new possibilities. Designing good things is actually quite hard!

Breakthrough products—from cars to Walkmans to iPhones—didn’t emerge from asking users “What should we build?” but from deeply understanding people, problems, technological opportunities, and patterns of context that customers don't know how to articulate.

When product makers anchor decisions on self-reported opinions, some predictable anti-patterns emerge:

Strategic drift

The roadmap becomes reactive, not proactive. Everything feels "validated by users", so saying no gets harder. You lose sight of outcomes while chasing outputs. Meanwhile markets shift and you're still optimising last year's workflow.

Fragmented user experience

Feature requests represent well-intentioned guesses: someone's best attempt at describing a fix from their limited perspective. Ship these literally and you accumulate scattered tweaks instead of elegant, system-level solutions. The interface bloats. The value proposition blurs.

Treat symptoms, not causes, and users inherit the complexity.

Resource waste

Research budgets fund work that misleads. Design and engineering time goes to features nobody uses after launch. Teams build what was asked for, ship it, then watch the metrics stay flat. The cost isn't just wasted effort—it's the opportunity cost of not building what would have mattered.

Measurement problems

Success becomes "we shipped what was requested", measuring outputs and mistaking them for outcomes. There's no way to learn what actually works or delivers value. Over time, costs compound, progress slows, impact becomes less clear, and the product gets shaped by noise instead of evidence.

Competitive risk

Competitors using observation-based methods ship better products. They solve deeper problems while you're still adding buttons. They innovate while you iterate. The gap widens slowly, then very fast, and by the time it's visible, you're behind.

“How did you go bankrupt?” Bill asked.
“Two ways”, Mike said. “Gradually and then suddenly”
— Ernest Hemingway, “The Sun Also Rises”

What to do instead: observe, measure, and uncover the real problems.

The alternative isn't to stop listening to users. Watch what people do, measure what matters, and use what they say to add context.

Start with observation

Watch people use your product or their current workaround. Look for hesitation, what they miss or ignore, what they repeat, what they invent to get the job done.

Five short sessions will reveal more than 100 survey responses to uncover needs that users can't articulate simply because they don't consciously notice them.

Measure real behaviour

Analytics show what people do, not what they remember doing.

Track where users actually drop off in funnels, which features they actually use, how long tasks actually take. A/B test changes against behaviour—completion rates, error rates, return visits—not opinions. Cohort analysis shows who sticks around.

Bypass bias entirely.

Use self-reporting to understand perceptions and build context, not collect feature requests

Interviews work better when anchored in specific recent behaviour, not hypotheticals.

Ask "Tell me about the last time you tried to do X" over "Would you use a feature that does Y?". Ask "Why does this matter to you?" over "Do you prefer option A or B?". Ask “Why?” to dig past symptoms to root causes.

Listen for the job-to-be-done, the friction they face, and the outcomes they want. And always always always ask why.

Treat feature requests as problem signals

Every request contains useful information, but you have to learn to look past the instruction.

Someone asking for "an export button" is telling you they need to get data elsewhere. Someone asking for "dark mode" is telling you they're using your product late at night or for long stretches and it's physically uncomfortable. Someone asking to "hide completed items" is telling you visual clutter is making it harder to focus on what still needs their attention.

Find out why they need it. Find out what they'll do with it. Find out if there's a better way to serve that need. If there's only one solution, you haven't understood the real problem yet.

Learn to say no!

Maintain a structured backlog. Prioritise based on observed patterns across many users, validated problems, and measured impact—not vote counts or volume. Synthesise what you learn into systemic solutions, not scattered features. You're designing for the crowd, not the individual.

Anchor your decisions in outcomes

Define what success looks like before you build. Will this change behaviour? Will it improve the metrics that matter? Build lightweight tests. Ship, measure, learn. If behaviour doesn't shift, the problem wasn't real or the solution didn't work. Either way, you learn.

Observe first, ask second

The goal here is not to reject user opinions.

It's to understand users better than they understand themselves—which means watching what they do, not just hearing what they say. When you make this shift, clarity follows close behind.

Your roadmap stops being a wish list and becomes a strategic tool. You build fewer features but solve bigger problems. You stop reacting to requests and start driving outcomes.

And you understand your users not because they told you what they wanted, but because you saw what they needed.

https://renderghost.leaflet.pub/3m75kss5jfs2z
The point is to understand
Machines can’t make sense of the world for us. Generating content that looks like knowledge isn't the same as producing it. We lose everything by confusing the two.
Show full content

This recent post by Mel Andrews got me thinking this morning about AI in academia, and the risk of diminishing human thought.

The growing fascination with 'human-out-of-the-loop' science — the idea that machines could autonomously generate new knowledge without human intervention — rests on a fundamental category error: confusing the production of content with the production of knowledge and understanding.

The difference is not semantic; it’s ontological.

Without a knower, there is no knowledge.

After receiving the Nobel Prize, Max Planck toured Germany giving the same lecture over and over. Eventually his chauffeur, who had heard it delivered many times, joked that he could give the talk himself.

Amused, Planck agreed, and they swapped places. The chauffeur delivered the lecture flawlessly, until a professor in the audience asked a challenging question and Planck was needed back on stage.

Imitations work fine until reality intervenes. When understanding and wisdom is required so is a human expert. The chauffeur could repeat the knowledge, but he couldn’t apply it.

This feels very much like where we are today.

We live in a world obsessed with producing and automating knowledge — or rather content that looks like knowledge.

We’ve built systems that can copy and simulate expertise so convincingly that it’s easy to forget there’s no real understanding or wisdom underneath. Machines can generate data and information, even texts that sound convincingly knowledgable. But regurgitating content and actual understanding are simply not the same thing.

As Albert Einstein put it, 'Any fool can know. The point is to understand.'

Credit: Matthew.viel, CC BY-SA 4.0

Data, Information, Knowledge & Wisdom

The DIKW pyramid (Data, Information, Knowledge, Wisdom) is a simple way to see how understanding develops from facts.

Data is a raw unrefined signals: The number 37.2 on its own is just a value. Turning data into information means giving it context: 37.2°C is my current body temperature here on this cold autumnal morning. Knowledge adds relevance: This is slightly above normal, and it may indicate I'm getting a cold. Wisdom guides how I might act: I should probably stay in bed today.

Climbing to the peak of this pyramid is a uniquely human act. Only people can make that leap from information to understanding. From knowing what, to knowing how, to asking and finding out the essential why.

Machines can easily process data and organise information but they don’t and can't know what’s worth understanding or why something matters or what ‘better’ even means. They can simulate all of this of course, but they don’t have stakes, goals, values or experiences simply because they are not alive.

And that’s why the upper layers of this pyramid — knowledge and wisdom — belong to us. They are where meaning lives, and meaning is why we move forward.

The point is to understand.

For people, understanding has a purpose. We learn so that we can survive and grow, together. It’s how and why we evolve as individuals, as societies, and as a species. To move from our current state to a better one. To imagine and build the next more-desirable reality.

Confusing data knowledge means mistaking accumulation for progress, and so we stop climbing the pyramid. Progress only happens when meaning enters the equation. When someone asks 'why?' and 'what next?'.

The goal of knowing is not to store facts but to create change. It’s how we transcend what we are — through understanding, curiosity, and intention.

Machines might be able to mimic understanding, just as Planck’s chauffeur mimicked his lecture. But if we ever forget the difference, we risk becoming chauffeurs ourselves: performing wisdom without truly possessing it.

https://renderghost.leaflet.pub/3m3cld6zlcc2l
Fidelity scales inline with certainty
Fidelity in design isn’t about how finished something looks — it’s about knowing when something is ready to be finished.
Show full content

'The Evolution of the PlayStation Controller - Den of Geek

Fidelity in product development is often misunderstood as a visual property: the polish of a mock-up, the smoothness of an animation, the details of a micro-interaction.

But at its core, fidelity is about specificity: How much we know, how clearly we can express it, and how precisely we can make decisions based on it.

At the earliest stages, we have hunches, vague goals, and open questions. Rather than forcing clarity, we sketch. We wave our hands. We write scenarios. We put glue to paper. We storyboard.

These simplest of artefacts are to expose what we don’t know. They create shared reference points, provoke questions, and make gaps more visible.

Crucially, low fidelity helps reduce risk through learning. It’s not about doing less — it’s about doing the kind of work that makes uncertainty visible and change cheap. When we work loosely, we’re free to ask: what’s missing here? what else could this mean?

This is a space where real progress can happen.

And as our confidence grows, then so too can fidelity.

But the jump to higher fidelity should always be earned — it reflects decisions made, constraints understood, risks addressed and knowledge learned. A wireframe becomes an interactive prototype when we’re confident in the flow of information. That prototype can become a coded feature when we’ve estimated the value of a concept.

Fidelity should increase when we have more to say about it — not because we want things to look more finished.

This principle applies far beyond designing visuals. Fidelity, in the broadest sense, is a cultural tool for leaders and workers.

Product managers sketch assumptions before committing to roadmaps. Engineers sketch architecture diagrams on whiteboards before committing to production code. Sales teams soft-pitch ideas in pre-sales calls, listening for objections and excitement before a formal offers reach a buyer. Marketers test resonance with exploratory posts before investing in campaigns.

These low-fidelity moves aren’t shortcuts. They’re strategies to make confident decisions. They signal intent, communicate confidence, and create space to test, learn, and adapt before cost and complexity set in.

Fidelity isn’t just how things look — it’s how well we know what we’re doing. When we present something in high fidelity, we’re not just showing detail — we’re implying more certainty.

That visual sharpness says: “we know what this is”. That’s a powerful statement that can clarify but can also mislead. And so we need to use it deliberately.

In low-maturity cultures, where output is valued over understanding, rough work is easily dismissed. But in adaptive, learning-oriented teams, lo-fi work generates high-understanding discussions.

They show that we’re willing to stay honest and open, explore multiple paths, and repeatedly state "we don't know yet".

Getting it wrong early to get to more-right faster and at less expense. So we scale fidelity with certainty. Not to slow down, but to learn faster. Not to hide ambiguity, but to expose it.

Fidelity in design isn’t about how finished something looks — it’s about knowing when something is ready to be finished.

https://renderghost.leaflet.pub/3m2yaaylaes2c
The AI recruitment ouroboros
AI is breaking the simple act of getting a job.
Show full content

Photo by Aleksandr Popov

A job interview used to be a conversation between people.

The purpose was simple: candidates tried to show who they were and what they could do, and employers tried to judge whether they were a good fit for the role, the team, and the work. It was a human exchange, shaped by instinct, ambiguity, and judgement.

Today, that’s no longer the case.

The recruitment process has turned in on itself. Every stage is influenced by AI. Job descriptions, CVs, interview questions, answers, feedback, and final decisions can all be generated, filtered, and processed by machines.

Hiring has become an automated loop where machines speak to other machines through people. Candidates use AI to shape their performance, and employers uses AI to define and evaluate that performance.

It begins with the job advert.

The hiring manager enters a few bullet points into the recruitment platform. An AI writing tool expands these into a complete job description, adjusting tone to match brand guidelines and optimising for search engines.

The language includes precise, inclusive, and generic terms: 'collaborative environment', 'data-driven decision making', 'growth mindset'. The job post is automatically distributed across job boards and hiring platforms.

Candidates respond with AI-adjusted CVs and cover letters, tuned to match the posting. Each applicant instructs their AI to 'optimise for keywords' and 'mirror the tone'. AI identifies the phrasing patterns most likely to pass automated filters and adjusts accordingly.

The company receives a batch of polished applications, structured to pass the AI-powered screening tool. This system parses each document, extracts key data points, and scores candidates based on how closely they match the job description.

Hiring managers are sent a ranked shortlist automatically.

Candidates use AI to generate likely interview questions based on the role, industry, and seniority level. Suggested answers are also provided, tailored to the job description.

Hiring teams prepare using the same AI tools. Both sides relying on near-identical prompts and models.

Candidates rehearse with AI-powered coaching software which tracks filler words, tone, pace, and confidence, and gives feedback for improvement.

During the video interview, a virtual assistant joins the call. It records, transcribes, and processes the conversation in real time, analysing sentiment and topic coverage.

Afterwards, interviewers use AI to clean up their notes and structure their feedback. The recruitment platform merges all inputs into a single report. It summarises impressions, identifies patterns, flags concerns, and suggests next steps.

The platform compares candidates based on interview scores, screening data, and profile matches. Hiring managers review the recommendations and confirm the decision.

Finally a concise, legally-compliant, and empathetic rejection letter is drafted, referencing 'appreciate you taking the time', 'a strong field of applicants' and 'alignment with current business priorities'.

Emails are sent in bulk through the platform. No manual edits or reviews are made before dispatch.

The whole point of a job interview is to find out who someone is, how they think, and whether they could work well with others. It's a moment of messy, subjective, and essential human judgement.

Now, it seems AI writes the roles, screens the people, and scores the conversations. Candidates perform for systems designed to read patterns, not people. Employers review summaries that reduce personalities to matches, metrics and tone scores.

When every step becomes automated, workers stop being seen. The ones who can’t or won’t tailor themselves to the system are filtered out before a person ever reads their name. Skill gives way to prompt literacy. Personality becomes noise.

For companies, the result is no better. They hire mirrors — candidates optimised to reflect their own data back at them. Teams become narrower, less adaptive, less human. Decisions feel efficient but produce sameness, fragility, and mediocrity.

When no one is really choosing, no one is really responsible.

A process still runs, but the purpose has collapsed in on itself.

https://renderghost.leaflet.pub/3m2vr4zm3xk2n
The worst designer I've ever worked with was also the most productive
When companies mistake volume for value, and incentivise outputs over outcomes.
Show full content
A photo of a classic conveyor belt from a factory assembly line.
Credit: AutoSysConveyors

This ambitious young designer pumped out more screens in a day than more-experienced designers would in a week. Leadership adored them. They saw the speed, the volume, the 'beast mode' energy — and mistook it for value creation. Under pressure to sign a lifesaving contract with a major customer, management praised the increased output without understanding the value or assessing the risks.

By rewarding speed and volume over quality and necessity, leadership created a dangerous feedback loop: the more this designer produced, the more praise they received, the more resistant they became to a rational user-centred design process.

What those leaders failed to see was a growing pile of unaudited, untested, and misaligned design work: a ticking time-bomb of incoherence, technical debt and costly rework that would eventually cost the team far more than it saved.

Flooding the zone

This eager young designer produced so much work that we ultimately lost the ability to evaluate it.

What began as impressive momentum became an unmanageable flood. Screens appeared in the backlog faster than anyone could review them. No critique, no usability testing, no standards. Most of it bypassed the design team entirely: no alignment, no time to check whether any of it could solve a real problem, no space to assess if it worked in our design system.

This work poured into the backlog and created confusion across teams. Impossible to manage, impossible to question, it drowned the design process in sheer volume. Work went straight from Figma to Jira, bloating the pipeline with decisions no one had the time or space to challenge, creating an illusion of progress.

The pile grew taller and the thinking grew thinner.

This is what political agitators call 'flooding the zone': Overwhelm a space with noise until discernment and critical thought collapses. It’s a powerful tactic to kill scrutiny and accountability. In design, pure productivity is just as corrosive. Inundate a team with outputs and artefacts and eventually, no one can tell what’s good anymore.

Build the right thing. Build the thing right.

John Maeda defines power as doing less to get more. But this ambitious young designer made themselves so busy, they lost their power, along with any grip on what good design looks like.

They didn’t question the product documents. Didn’t read user research findings. Didn’t run design reviews or usability tests. Ironically they complained they were was 'too busy' to follow our processes, as if a design process was a luxury, and not their actual job. Their first ideas became final outputs. Over and over again.

Design is not just production. It’s about discernment. Design is a discipline of questioning, testing, refining and making hard choices.

As Marty Cagan put it, great teams build the right thing, and build it right. That means not just making something, but making the right thing, at the right time, for the right reasons.

Something that works for real customer's needs at a fair price.

Flooding the zone is easy. It looks impressive. It gets promotions. But that's not progress: It’s just output. Without critical thinking, output becomes theatre — things that look nice but solve nothing.

Good designers slow down when it matters. To clarify the problem. To align with their team. To create outcomes that matter.

When someone brags about how much they shipped, don’t applaud. Ask what impact they had. Ask who they helped. Ask what behaviour they changed for the better. Ask what would’ve happened if they’d simply done nothing.

That’s the measure of good design. Everything else is waste.

https://renderghost.leaflet.pub/3m2qmhm7kcc2l
Bring squiggle birds to your workshops
A simple silly and powerful drawing exercise that unlocks creative thinking instantly in workshops about complex things
Show full content
A photo of a collection of squiggle birds from a service mapping workshop.

A collection of squiggle birds from a service mapping workshop.

Most every workshop begins with a tense mood.

Sometimes it’s curiosity, sometimes it’s anxiety, and sometimes it’s the silent hum of busy people wondering why they’ve been pulled into yet another meeting.

I’ve learned that the first 10 minutes determine everything that follows. If people start tight, self-conscious, or stuck in “work mode”, collaboration becomes mechanical and the outcomes are less exciting.

Creativity demands a certain looseness — social, mental, and emotional. So before we do anything serious, I like to ask everyone to make something completely unserious: a Squiggle Bird.

It’s fast, silly, and absurd — but it works every single time by resetting the energy in the room, unlocking creative thinking, and reminding people that good ideas rarely start polished.

A birdy intervention

Squiggle Birds is a fast fun drawing exercise I love to use at the start of workshops with stakeholders that requires creative thinking.

Each participant gets a pen and a post-it note and I ask them to quickly draw a loose and fluid squiggle. Then they must rotate the post-it note, looking for a form that feels bird-like, and then quickly add three simple elements: eyes, a beak, and legs. Nothing more.

Finally, they give their bird a name and a short backstory — what kind of bird it is and what its special ability or unique trait might be.

Once everyone’s finished, we go around the room and share them.

The whole thing takes about 10 minutes.

The simplicity is deliberate. No skill required, no right answers. Just some silly and unexpected play.

Creativity is pattern recognition

Squiggle birds taps into a psychological phenomenon called pareidolia: our brain’s tendency to find patterns and familiar forms in noise, like seeing faces in clouds.

By looking for “the bird” within a random squiggle, participants engage the part of our minds that recognises patterns, metaphors, and stories.

The constraint of only adding eyes, a beak, and legs prevents overthinking and encourages playfulness. Everyone knows what birds look like and they're surprisingly easy to find in squiggles.

Rotating the post-it is crucial. It forces people to view something from multiple perspectives, breaking rigid thought patterns and gently opening people up to reinterpretation, curiosity, and flexibility — exactly the mindset needed for true creative work.

What begins as doodling becomes a small rehearsal for recognising possibility in what (at first) appears to be chaos.

Play as social glue and hierarchy leveller

Workshops often bring together people from every level of an organisation — from workers to managers to C-suite execs.

Within minutes, Squiggle Birds has people laughing at their misshapen bird drawings, comical names, and silly stories. Atmospheres shift from polite formality to shared amusement.

That collective silliness is very powerful. In environments where people are used to performing competence, it gives permission to be curious, imperfect, and human. It resets the room.

Before any structured activity begins, the group has already built the foundations of some psychological safety and trust — the real and very necessary conditions for creativity to thrive.

Imperfection as a principle

Many of my workshops involve sketching wireframes, mapping flows and journeys, and visualising concepts. Squiggle Birds quietly sets the tone for that work: we’re not here to make art; we’re here to communicate with clarity. It provides that essential reminder that creativity is about communicating intent not precision.

The constraint (only eyes, a beak, and legs) lowers the bar on “good enough”, so that people stop polishing and start expressing.

In 10 minutes it teaches a few non-negotiables:

  • Speed over neatness: we’ll move quickly, iterate, and refine together; perfection isn’t the goal.

  • Clarity over polish: a rough mark that others can understand beats a neat drawing that says nothing.

  • Visual beats verbal: short text is easily misinterpreted; a simple sketch anchors a shared understanding.

  • Safety to show unfinished work: the silly birds normalise imperfection: sketches feel low-stakes and collaborative.

Practically, this means that for the rest of the workshop people are comfortable making rough sketches, labelling parts, drawing arrows, and talking through intent. The bar is simple and low: can someone else grasp the idea well enough to discuss or build on it?

More about Birds

Read this simple guide to run Squiggle Birds at your next workshop, or even better, get in touch with me and we can do it together.

https://renderghost.leaflet.pub/3m2oiebmouk26
The product designer's paradox
How UX design’s quest for more influence led to its quiet subordination
Show full content

Power of Nature By Ash Wright

Over the past decade or so, the shift from UX designer to product designer was heralded by many as proof that design was maturing — integrating with business strategy, creating efficiencies, and driving measurable outcomes.

In theory, product designers would balance user needs with business goals, directly shaping strategic direction. The new title promised founders and teams a broader skillset and a more-informed seat at the decision-making table. Design functions would become more-impactful value generation tools.

In practice, many organisations moved the opposite direction. The rebrand that promised evolution delivered marginalisation. Rather than elevating design, they consolidated power within product management.

What was once an even partnership became hierarchical. In most organisations, product now sets direction and design executes.

If design lost its seat at the table, was the product designer rebrand a genuine evolution — or a quiet surrender?

The new reality

The triumvirate of design, product, and engineering has now tilted heavily toward product. Designers and product managers now compete for the same strategic space with different priorities, incentives, and levels of authority. The result is often a lack of clarity, duplicated effort, and blurred ownership.

Product owns the roadmap and metrics. Design discovery has for many narrowed to brief quantitative validation rather than qualitative exploration. User research mostly confirms existing assumptions instead of uncovering new opportunities. The generative research that once revealed unexpected user behaviours and unmet needs has been replaced by A/B tests that optimise within existing paradigms.

Design as a result drifted downstream, increasingly informed rather than consulted. Teams now often expect designers to polish predetermined solutions rather than question whether those solutions address the right problems. The discipline that once championed user needs and fought for quality now struggles to influence even basic product decisions, let alone meaningful outcomes.

The cost of disconnection

When design is disconnected from strategy, products become more measurable but less meaningful. Teams chase conversion rates and engagement spikes at the expense of coherence and long-term value. Features proliferate without purpose, creating bloated products that serve metrics dashboards better than they serve humans.

When empathy and coherence fall out of the equation, startups lose trust and loyalty, and users inevitably disengage.

Businesses see it in rising acquisition costs, lower retention, and weaker differentiation. Users feel it in products that function but fail to connect — efficient yet forgettable at best, victims of enshittification at worst. They navigate interfaces optimised for value extraction rather than experience, where every interaction feels transactional.

This isn't product's fault or responsibility.

As designers retreated into craft and avoided the messy realities of revenue pressures, technical constraints, and organisational politics, Product naturally filled the vacuum. While Designers honed their processes and craft, product managers mastered influence and accountability.

Product managers speak the language of business—OKRs, attribution models, unit economics—while Design struggled to quantify its own value beyond satisfaction scores and usability metrics.

In organisations that valued certainty over costly explorations, its clear which skillsets are more valued, trusted and rewarded.

The next movement

The consolidation under product management probably serves lean startups better.

Technical product managers who can navigate both Figma and finances offer founders immediate measurable value in today's efficiency-focused market. They can ship faster, iterate based on data, and directly connect features to revenue.

But optimisation alone doesn't create lasting differentiation. As markets mature and products commoditise, winners will balance operational excellence with real experiential innovation. The companies that endure create products people choose even when cheaper alternatives exist — products that inspire loyalty through delight, not just utility through features.

The path forward isn't about design reclaiming lost territory or product ceding control. It's about both disciplines evolving toward genuine collaboration: product managers who understand the compounding value of coherent experience and can champion quality even when it's hard to measure; designers who engage with metrics, embrace trade-offs, and can articulate decisions in business terms rather than retreating into theory and pixels.

The next evolution won't come from new job titles or reshuffled responsibilities. It will come from teams that dissolve the artificial boundaries between strategy and execution, between measurable and meaningful, between a business goals and human needs. It requires designers brave enough to own outcomes, not just outputs, and Product Managers secure enough to share the strategic vision.

The question isn't whether designers or product managers should own the strategy — it's whether we're able to properly share it.

https://renderghost.leaflet.pub/3m2mel6p4qc2l
Business leaders don’t care about design (and that’s a good thing)
Or why your boss was probably right not to support your design ideas.
Show full content

SETI by OndrejHrdina

Have you ever asked yourself: “Why doesn’t this company care about great design?” or “How can design get a seat at the table?”

It’s frustrating when work doesn’t resonate with decision makers. To feel stakeholders disengage during discussions about the fine details of an interaction or a design language. We care deeply and we want others to care too.

The easy assumption is that leaders just don’t get it, and that’s why they don’t back our ideas. But true or not, assuming ignorance doesn’t help persuade anyone about the merits of a solution.

They don't care like you do.

Take a deep breath. This may be hard to hear. Business leaders don’t really care about design, and that’s a good thing.

Their job is to make strategic decisions that grow and sustain the business, not to obsess over the design like you do. Once they understand the context, they need to trust you with the details.

Very few companies exist to do great design for its own sake.

Design exists to serve business goals — it’s a means to an end, an enabler of success and growth.

So if we want support, we can’t lean on design theory, best-practice diagrams, or a Nielsen Norman blog post because that’s not what helps them to make decisions.

We have to connect the work to the choices leaders actually face.

So, what do they actually CARES about?

Even the most design-conscious leader will look at proposals through a few simple lenses:

  • How much will it cost?

  • How much effort will it take?

  • Why this, and not something else?

  • How will it help our business?

All of these questions boil down to simple leavers of commercial success: Cost, Acquisition, Retention, Engagement and Satisfaction — or CARES.

Risk-averse decision makers won’t back design changes unless they clearly move one or more of these levers. This is why designers are hired: to help serve and keep as many happy customers as possible — solving real problems at the lowest cost, to grow the business.

Show that you CARES

If we want stakeholders to back new features or UX improvements, our proposals should lead with how they move metrics that matter:

  • "Usability testing uncovered three blockers in checkout. Fixing them now could save ~€50k in support costs per year."

  • "Simplifying sign-up from six steps to three could lift conversion rates by 12%, based on patterns we’ve seen in our funnel analysis."

  • "Customers who complete onboarding within the first week are 40% more likely to renew. We can redesign the flow to shorten time-to-value by half."

  • “A scalable design system would cut duplicate design and dev work in half, reducing build costs by ~15% and helping us ship new features weeks faster.”

  • "Introducing personalised recommendations is projected to increase repeat visits by 20%."

  • "Accessibility improvements in navigation will open the product to an estimated 10-15% more of our market and reduce frustrations for existing users."

Even rough estimates like these help connect design changes directly to success metrics and company goals. That’s the common ground where priorities, scope, and resources get negotiated.

Grounding your proposals in real business assumptions makes your case far more compelling. Designers don’t always have these numbers to hand — but your partners in product, data, sales, or finance probably does or can help you find out.

Get to persuasive

Focus on outcomes first. That’s where you’ll find common ground with decision makers — at the intersection of user needs and business goals.

Don’t try to go it alone. Ask your manager, product partners, sales, finance, or data colleagues. Be clear about what you want to know, and why it matters.

  • How does our company make money?

  • What are this quarter’s priorities, and what’s coming next?

  • Which metrics matter most right now?

  • How are we tracking them?

  • What are the biggest business problems to solve?

  • Why are they critical?

If no one can answer, that’s a signal. Point out that clarity on goals helps you do better design, and push to get it written down somewhere and socialised with the teams if you can.

What will leaders want to see and hear the next time you need their support? Start your conversations, documents, and decks with those points.

Get uncomfortable

Commercial topics may not feel natural for designers, but they’re essential for building trust, earning influence, and having real impact.

It takes time and patience to learn this new language, but every conversation is a chance to practise. Start by reframing your work in terms of CARES, and back it up with insights from colleagues who know the numbers.

The next time you present a design proposal, don’t just show the craft — show the impact. That’s how design gets heard, supported, and acted on.

https://renderghost.leaflet.pub/3m2dzi6sgis2l
Incentives eat your strategy for lunch
How short-term sales culture destroys product strategy, and what founders can do to stop it.
Show full content

Photo credit: FaithMag

Nature abhors a vacuum

In a typical B2B SaaS startup, product managers obsess over customer goals, strategic roadmaps, and viable business outcomes. Designers prioritise user needs, problem clarity, and tested sustainable solutions. Engineers focus on feasibility, quality, and scalability. Salespeople care about turning leads into revenue by closing deals.

These groups overlap in mission but not always in incentives.

We’re all attempting to build something people need and will pay for, but the daily levers we pull to get there are different. Product orgs have built strong tools and processes for product management 🗘 design 🗘 engineering collabs, but sales usually don't get the same treatment, despite being just as critical.

That gap creates a lot of risk. Commission-driven salespeople without the right support default to desperation and improvisation.

Overselling features that don’t exist. Promising one-off builds that won’t scale. Signing contracts with unrealistic terms. The result is ever increasing off-roadmap requests with impossible timelines.

They may win lucrative contracts in the short term, but this quickly drains engineering capacity, erodes cross-team trust, sets customers on a path to disappointment and churn, and quietly hijacks the roadmap, tilting strategy toward the loudest deals on the table.

Sales off the rails

Salespeople, in my experience, fall on a spectrum.

At one end are the deep learners: they work hard to understand the product, sell to the right customers, and represent it with integrity. With enough knowledge, and armed with great sales collateral, they become powerful allies for the product org—closing the right deals and providing relevant timely feedback from target customers.

At the other end are the say-anything closers. They move fast, bring in big revenue, and hit targets. But they also overpromise, push for custom builds, and sign terms that can easily derail a roadmap. Managed well, they can be a highly effective sales force. Left unchecked, they’re extremely dangerous to progress.

Most sales teams have both types. Without being set up for success, neither operates at their best. The reckless create chaos. The more capable are left uninformed, underpowered, and unmotivated.

The outcome isn’t just a messy sales pipeline—it’s organisational dysfunction: shaky deals, confused messaging, conflicting narratives, teams drained by unplanned work, and a product strategy going off the rails.

The problem isn’t the sales team. It’s the absence of strong structures that align sales incentives with strategic goals. Without them, the gap widens—and costs compound fast.

Incentives eat strategy

The root cause is incentives.

Product orgs optimise for long-term value, but sales is paid on short-term revenue gains. Non-technical founders often oversell a vision they don’t fully understand in the details. Add VC pressure for hyper-growth, and the gravitational pull towards easy sales and quick wins becomes overwhelming.

Information gaps only deepen the problem.

Product knows what’s real, what’s in progress, and what’s unlikely to ever happen. Even well-intentioned salespeople—rarely hired as technical experts—can struggle to keep up with the rate of change in a product development cycle and so are left to improvise. A rational response to a dysfunctional system.

When clarity and enablement are missing, improvisation is not negligence; it’s often the only option available.

Staff turnover rates further compound issues.

Sales roles can churn quickly, and with each departure institutional knowledge evaporates. Good salespeople burn out and leave. The reckless ones thrive unchecked.

Without solid onboarding and collateral, teams waste energy re-explaining the basics and watching the same mistakes repeat.

The failure lies in the sales culture and systems leadership has chosen—or neglected—to build. Unless those systems change, the outcomes are inevitable: wasted cycles, broken trust, customer churn, and 'success' dominated by short-term gains.

Making sales work for strategy, not against it

The solution isn’t to blame sales—it’s to redesign the interactions between the product org (product management, design, and engineering) and the sales org.

Design 🗘 engineering gets attention because it’s visible. But product org 🗘 sales is where strategy collides with market realities.

Get it wrong, you bleed money and trust, and bend the roadmap to the loudest customers. Get it right, and sales becomes a real force multiplier—your strongest ally in turning vision into reality.

Treat sales as a partner, not an afterthought

Help to create playbooks: clear collateral, usable demos, and plain-language documentation that explains what the product does and why it matters to customers.

Make communication two-way

Push updates downstream, but also pull feedback upstream. Filter aggressively so one-off customer demands don’t hijack the roadmap.

Frame in Jobs to Be Done

Customers don’t care about technical stories; they care about outcomes. Give sales a language rooted in real customer goals, not product backlog items.

Provide roadmap clarity

Be explicit: what’s available now, what’s coming soon, what’s future-but-uncertain, and what’s never. Remove ambiguity before it becomes false promises. Be prepared to negotiate, but not compromise for quick cash.

Enable the good, contain the reckless

Reward salespeople who invest in understanding the product. Protect strategy from those who’ll say anything to close. This isn’t about being “nice”—it’s about defending the mission.

Hire and reward wisely.

Don’t bring in or incentivise salespeople who chase quick deals without understanding SaaS dynamics or scalable products. They’ll trade your strategy for commission every time.

Above all: build for sustainability, not sugar highs.

Short-term wins burn trust, drain teams, and derail the roadmap. Long-term sales culture fuels compounding, sustainable success.

https://renderghost.leaflet.pub/3m2bwbvmnw22o
Equity Compensation: A Lottery Ticket with Terrible Odds
Exits are low probability events and don't offer real value to workers.
Show full content

Photo by Carl Raw

Tech startups routinely promise equity as part of a total compensation package. The pitch is seductive: 'work hard now, reap big rewards later'.

75% of startup workers say equity was an important reason to join their company. But for most, that upside never comes.

Startup equity is sold as a golden ticket to wealth. In truth, it’s a machine that makes a handful of rich people richer and runs on everyone else’s labour.

90 % of all startups fail

The odds are stacked from the start.

Most startups never make it to a rewarding exit. In fact, most VC-backed companies fail outright or sell at a loss.

Exits are low probability events

Most startups that survive don’t reach a meaningful exit and when they do, it’s more likely to be an acquisition than an IPO — meaning less transparency, less liquidity, and less upside for workers.

IPOs have slowed to a crawl.

Many companies continue postponing public listings, and many VC‑backed IPOs trade below earlier valuations.

In 2023, the US saw ~100 IPOs raising ~$19 billion: far below the highs of past years. Meanwhile, 2024 saw ~1,000 M&A deals involving VC-backed firms, confirming that acquisitions remain the dominant exit path.

Most value is now concentrated in a small handful of deals and large exits ($500M+) are now extremely rare.

Down rounds are increasingly common: in 2024, about 30% of global VC deals were flat or down. In Q1 2025, down rounds made up ~14.7% of U.S. venture deals—the highest in a decade.

That means fewer and fewer 'big wins' where employee equity might actually pay off for workers.

Rare wins and frequent disappointments

For most workers, equity never delivers on the promise.

Many don't exercise their options — blocked by high costs, unclear tax treatment, or short exercise windows. Even when they do, gains are often wiped out by dilution, bad terms, or taxes.

Only early, senior workers with strong negotiating leverage tend to benefit. Companies like Airbnb or Dropbox are exceptions — the exits were big, dilution was limited, and employee equity terms were unusually fair.

But most tech startups face down rounds, layoffs, or go bankrupt altogether, slashing any value for workers.

Equity is the promise of upside, without the delivery.

Waterfall payouts: Who gets paid first?

Even when an exit happens, the path to payout for workers is fraught with challenges. The typical order is:

  • 1. Creditors (Banks & Lenders) Debts always get paid first.

  • 2. Preferred Shareholders (VC Investors) Investors with preferred shares get liquidation preferences, meaning they get paid before everyone else.

  • 3. Regular Shareholders (Founders + Early Team) Equity holders without preferences get paid after debts and investors.

  • 4. Options/VSOPs Holders (Workers) Workers are always last in line. Payouts happen if there’s money left over — and that’s rare in anything but high‑value exits.

Each layer eats into the exit value pie, meaning there’s often little or nothing left for workers.

Equity is less reliable than before

Startups increasingly protect capital, not workers.

Investors now get stronger terms: liquidation preferences, participation rights, guaranteed payouts and workers sit at the bottom of the stack.

Rising valuations have increased dilution and raised the bar for meaningful exits. Equity grants may look generous on paper, but rarely survive the path to liquidity and evaporate under market pressures.

With IPO markets stalled, companies stay private longer. And when liquidity appears, it's often through private share sales open only to insiders and workers inevitably find themselves locked out. As flat and down rounds spread, even modest gains are erased.

Equity still exists — but more of the upside flows to investors and founders than to the people doing the work to create that value.

What does this all mean?

Equity can work — but mostly for those who join early, sit high, and get lucky. For workers, it’s a risky bet dressed as salary.

If you're offered equity, treat it like a scratch card. It might pay off, but odds are it won't. Ask the awkward questions, push for clarity, and don’t base your future on paper promises.

The myth is that startup equity makes everyone rich. The reality? It makes a few people rich — and uses the rest of us get there.

https://renderghost.leaflet.pub/3m24fqoisbs2n
Founders: The best predictor of your SaaS failing is visualising your future too soon
Why high-fidelity mockups create cognitive lock-in that prevents real growth
Show full content

Tarot cards By ten-dril

The best predictor that a startup will struggle to achieve its vision isn’t lack of ambition or ideas: it's attempting make the vision look real before it’s ready.

Every founder recognises the moment in front of an investor or an early customer and the question comes

So what does it look like?

A perfectly reasonable question. And so, the design or marketing team is asked to produce some high-fidelity mockups. Now suddenly we have something concrete to show that feels like progress.

In reality, it’s often the start of a bigger problem that quietly suffocates many early-stage SaaS products.

Why we reach for visuals too soon
Prediction is very difficult, especially if it's about the future.
— Niels Bohr

Startups run on uncertainty, and understandably, that can make people feel quite uneasy. Founders want confidence that progress is being made. Investors want proof their money is building something tangible. Early Adopter Customers need something to react to. They want a glimpse of the thing they're committing to.

Visuals scratch this psychological itch. High fidelity mockups make the abstract appear concrete. Shiny prototypes create a sense of control. We can point at it, share it, and tell a story around it.

The problem is that, at the earliest stages, these are not truths. They are guesses presented as forecasts.

The high cost of premature visualisation

When immature designs gain too much polish, they stop being sketches and start becoming specifications. They stick in people's minds. First impressions harden into expectations.

High-fidelity mockups trigger an anchoring effect — a bias where early impressions distort later judgements. In product development this becomes cognitive lock-in: once people see something, it feels real, and reversing course becomes harder and much more expensive.

Exploration are mistaken for commitments. Thought experiments are mistaken for decisions. The drawings define the product more than evidence from the market.

Conversations shift from fundamental questions about user needs and workflows to surface debates about placement and colour. Instead of exploring options, teams defend decisions made before the problem was understood.

Discoveries that only emerge through iteration, usage and feedback are overlooked.

Lock-in spreads.

  • Founders and sales teams pitch from decks that depict products not yet built.

  • Investors interpret early visuals as implicit commitments, shaping expectations around features and timelines before the product is validated.

  • Customers remember polished versions and measure reality against an idealised promise.

Visuals that help. Visuals that harm.

Now, I’m not saying “never draw anything” but fidelity is a cost best spent on refined working software, not presentations, mockups, and prototypes.

Visualisation is an essential element of any design process — but only when used intentionally and appropriately. They can clarify thinking, align teams, and spark critical conversations.

The question isn’t whether to visualise — it’s when and by how much. Early visuals should be deliberately crude. They’re props for discussion, not blueprints for delivery. The moment something looks too finished, it stops being a question and starts being an answer.

Fidelity should only ever increase inline with increased understanding or reduced risk.

Learning to live with ambiguity

The best SaaS products grow out of ambiguity, and in the early days it can feel uncomfortable, sometimes excruciating.

Living with ambiguity is not weakness. It is a strategic advantage and a survival strategy.

Success depends on tolerating that ambiguity longer than feels comfortable. Anchoring around problems, outcomes, and key metrics creates the space to discover what a valuable product should truly be.

Focus on hypotheses, not polished visuals. Run experiments, observe behaviour, measure results, and test priority assumptions. Move quickly, save resources, and iterate repeatedly. Let data and feedback guide decisions, not mockups.

Investors and customers are not buying pretty pixels — they are buying confidence that your team can learn, adapt, and deliver real value faster than competitors.

Evidence of that capability is more compelling than any perfect picture.

https://renderghost.leaflet.pub/3lzlf5lrlps2f
Ghost kitchens in academia: How AI wrapper products threaten research integrity
Behind the glossy branding and false authority, wrappers hide the same flawed systems — eroding trust, distorting methods, and putting academic knowledge at risk.
Show full content

Change of Fortune By GimmeBamba

In the world of home food delivery, 'ghost kitchens' have mastered the art of culinary deception: A single warehouse operation can present itself as many different restaurants on popular delivery apps. Tony's Italian Kitchen. Seoul Street Tacos. Brooklyn Burger Bar. Each with its own unique branding, product shots, and carefully crafted identity.

Order from any of them, and the same industrial kitchen prepares your meal to be delivered by the same riders, but wrapped in different cuisine-appropriate packaging to match your expectations.

A strikingly similar phenomenon is quietly taking over academic publishing, except instead of rebranding the same kitchen, entrepreneurs are rebranding the same AI.

What most don't realise is that many of these "breakthrough" tools are nothing more than more-expensive facades wrapped around the same foundational models (ChatGPT, Claude, etc) that everyone else is using.

The confidence trick

These products are called 'wrappers', and understanding what they are, and why they can be so problematic, is crucial for anyone serious about research integrity.

Think of a wrapper as a fancy storefront for someone else's products. When you visit a tool that promises to 'revolutionise academic writing' or 'analyse research data with advanced AI,' you're often not encountering new artificial intelligence at all.

Imagine we use 'Super Science Reviewer AI Plus' a new product which uses unique cutting-edge AI technology to review our manuscript before submission, give expert tailored advice, and get us published in our journal of choice. All for just $25 per review.

That's quite a bargain, even if it takes 2-3 turns to get it right.

So we upload our manuscript, which is quietly combined with hidden custom instructions, and sent to OpenAI, before a response bounces back via 'Super Science Reviewer AI Plus' presented as their own advice and insights.

This is nothing but a sleight of hand.

'Super Science Reviewer AI Plus' contributes zero value or intelligence to this exchange. It's a middle-man transaction with some pretty UI and clever marketing copy, charging premium rates to access the same AI system we could use directly for a fraction of the price.

The gold rush: why AI wrappers are suddenly appearing everywhere.

Today OpenAI OpenAI charge ~$1.25 to process 1m tokens: roughly 750,000 words. That's about $0.0000016 per word.

Our 5,000 word manuscript costs less than a cent to process and yet we paid $25 per check. A 300,000x markup seems outrageous, bordering on immoral to me.

Figure: Understanding LLM Token Counts

Unlike developing genuine AI systems—which demands years of research and deep expertise—a wrapper can go from napkin sketch to market-ready product in a matter of days.

The business model writes itself: subscribe to OpenAI's API for cents per transaction, build a simple web interface, craft domain-specific custom instructions, add branding, deploy on cheap hosting.

Creating a wrapper requires no real expertise, no breakthrough research, no specialised knowledge beyond basic web development.

AI-powered academic tools are appearing daily, each claiming revolutionary capabilities. The dirty secret is they all using the same AI models.

The academic publishing problem

For academic publishing and research integrity, wrappers create problems extending far beyond simple economics. Entrepreneurs have minimal control over 3rd party foundational models, yet researchers risk unknowingly surrender control of their works:

Research theft risk

Unpublished findings shared with companies lacking strong intellectual property protections create scooping opportunities.

Black box methodology

Hidden prompts are proprietary, making results challenging to verify or replicate—which risks breaking some academic fundamentals.

Disclosure violations

Researchers may unknowingly violate journal AI-usage requirements when wrappers disguise ChatGPT as specialised academic tools.

Data security gaps

Sensitive research data gets processed by general-purpose AI systems that have standard commercial terms without academic protections.

False authority

Tools masquerading as domain experts encourage unwarranted trust in the outputs. Tools like our 'Super Science Reviewer AI Plus' carry the exact same limitations, biases, and potential for error as their underlying AI models—they're just packaged to appear more authoritative.

Finding a path forward

Wrappers aren’t just overpriced middlemen. They’re a transparency crisis for academia — distorting trust, obscuring methods, and encouraging researchers to outsource critical judgement to tools that add nothing of their own.

They launder generic AI outputs into something that looks specialised, credible, and safe. But behind the glossy branding, it’s still the same unaccountable, biased systems with no real understanding of science.

The real danger is how easily wrappers pass as legitimate. Once they slip into research workflows, they can normalise bad practice: extortionate markups, black-box methods treated as credible, and the handing over of sensitive work to untrustworthy companies.

In science, the cost of this deception isn’t just wasted money: It’s compromised knowledge and integrity.

https://renderghost.leaflet.pub/3lzg6eptz5c2l
The AI Bubble: Efficiency Theatre at Scale
Investors embraced AI not for its genius, but for its promise to cut people, cut costs, and boost portfolios.
Show full content

Photo Credit: Mick Haupt

AI is booming because of a narrative that it can deliver more with less.

Investors and founders alike have fully embraced the idea that AI lowers costs, accelerates growth, and signals efficiency at a time when capital is harder and more expensive to secure.

The claim is that startups can ship products with fewer people, burning less capital on salaries, and still be disruptive in a capital-constrained market.

In today's venture capital environment, shaped by rising interest rates and valuation collapses, that promise is potent and powerful.

And yet, despite the hype, the startup failure rate remains as predictable and steady as ever.

2017: The Spark

Google publishes 'Attention Is All You Need' introducing the 'Transformer' model architecture that made training large language models (LLMs) feasible and viable.

2018-2020: Scaling up

OpenAI builds on this breakthrough, releasing successive GPT models, and LLMs move from academic novelty to viable commercial tech.

2022: The Cultural Moment

ChatGPT launches and goes viral, reaching 1 million users in just five days — the fastest adoption of any consumer product in history.

Generative AI bursts into the public consciousness, reshaping expectations and creating an entirely new product category overnight.

2022-2023: The SaaSacre

Central banks raise interest rates to curb post-pandemic inflation, driving up the cost of capital and shrinking investor appetite for risk.

Venture funding drops by 70% as capital tightens, while average SaaS revenue multiples fall from 17x in 2021 to around 6x — the lowest level in a decade.

High-growth companies once buoyed by cheap credit and future potential see valuations cut by as much as 75%.

In response, VC funds press their portfolios for leaner, more capital-efficient business models.

The SaaSacre of 2022

Public cloud stocks have taken a drubbing in 2022. But not all cloud companies have been impacted equally. Why is this happening, how severe is the compression, & where are the areas of resilience?

The Three Valuation Lows in SaaS: 2013, 2016 … and 2022-2024

The SaaS Capital Index from SaaS Capital had a nice summary this week of the 3 low points we’ve seen in SaaS multiples, and this chart puts things in great context: As you can see above, in t…

SaaS follow-on rounds see a slowdown, but it won't last forever | TechCrunch

The exit market for SaaS dried up in the second half of 2022, which saw the lowest exit activity since 2016.

2023 — Investor Logic

With payroll consuming ~75% of startup operating costs, human labour becomes an obvious target for efficiency. Nowhere is this more visible than in tech, where workers are among the highest-paid in the world, and a perceived culture of frivolous perks came to symbolise the excesses of venture-backed growth.

The 'correction' begins in late 2022, when Elon Musk’s takeover of Twitter leads to a near 80% workforce reduction. Boards and investors signal that the era of unchecked headcount expansion is over. By the end of 2023, hundreds of thousands of tech workers across the industry have been laid off, with Meta, Amazon, Google, and Microsoft all cutting deeply.

In this environment, AI emerges as the perfect narrative: automate more, hire fewer, scale faster.

Investors bet more on startups that promise automation-driven margins, even when the business cases are thin. Founders quickly realise that without an 'AI angle', funding becomes harder if not impossible to secure.

AI shifts from exciting differentiator to prerequisite for fundraising.

2024 – A Digital Workforce

By 2024, the narrative shifts from 'AI as a feature' to 'AI as labour substitute'.

OpenAI rolls out ChatGPT Enterprise with premium tiers — including $200 per-user plans — positioning LLMs directly against the cost of hiring knowledge workers.

For the first time, enterprises can weigh a digital employee at SaaS prices against a human salary. With average entry-level tech salaries in the US between $80,000–$120,000, the comparison to a $2,400 annual licence is sobering.

The framing lands as companies continue cost-cutting after the mass layoffs of 2023, and some begin testing the logic in the open.

Klarna report AI assistants doing more and more work in customer service and marketing, the equivalent workload many hundreds of workers, saving an estimated ~$50 million annually.

Hiring freezes in many roles reinforce the message: AI is being treated not just as a productivity tool, but as a headcount reduction strategy.

This becomes the first large-scale price test of human productivity in dollar terms. Investors and executives alike watch closely for signals that digital workers can shift operating margins.

Adoption remains experimental, but the signal is clear: AI is no longer just another product category — it has become a benchmark against which the value of human labour is measured.

Klarna using GenAI to cut marketing costs by $10 mln annually

Fintech firm Klarna, one of the early adopters of generative AI (GenAI), said on Tuesday it is using AI for purposes such as running marketing campaigns and generating images, saving about $10 million in costs annually.

AI replaces 700 customer service reps at fintech startup Klarna - Tech Startups

In a clear sign of things to come with the rise of AI, Klarna, Europe’s most valuable fintech startup, just released a post highlighting the success of its OpenAI-powered customer service chatbot. According to Klarna, the chatbot managed an impressive 2.3 million conversations in the last month alone. That's not all. The company claims that

2025: A reckoning

Today in 2025, AI capabilities have become almost ubiquitous in product launches. Equally, existing products and platforms scramble to retrofit AI features to stay competitive.

And yet Gartner places generative AI squarely in the 'trough of disillusionment' where initial excitement gives way to harsh realities as implementations fail to meet overhyped expectations.

Figure: Gartner Hype Cycle for Artificial Intelligence 2025

The disconnect between hype and reality is stark to say the least. Companies are discovering the hard way that AI is not the magic bullet they were sold.

A recent report from McKinsey & Company found that 71% of companies reported using AI, and more than 80% reported no 'tangible impact' on earnings. A recent MIT study reveals that 95% of generative AI pilots failed to impact profits. Klarna's ambitious AI-driven customer service overhaul exemplifies this downturn: a disastrous decline in service quality forced the company to rehire human workers and pivot their strategy at great cost.

Startup economics remain relatively unchanged: failure rates persist at a rocksteady 70–90%, demonstrating that AI adoption has yet to alter the fundamental dynamics of venture success.

Seizing the agentic AI advantage

Discover how the GenAI paradox shapes AI agents in both vertical and horizontal use cases, highlighting the potential of agentic AI.

MIT Finds 95% Of GenAI Pilots Fail Because Companies Avoid Friction

Success comes when enterprises embrace friction — human, organizational, and technical — to turn GenAI into transformation. Here's why.

Company Regrets Replacing All Those Pesky Human Workers With AI, Just Wants Its Humans Back

Years after outsourcing marketing and customer service gigs to AI, the Swedish company Klarna is looking to hire its humans back.

What comes next?

The AI boom (pronounced 'bubble') was never just about intelligence, it was about economics: capturing and retaining maximum value with minimum reward for workers.

In a time of rising costs and scarcer capital, AI became the perfect story for VCs and founders alike: do more with less, automate the expensive parts, scale faster than the competition and achieve the holy grail of VC-backed software: drive marginal costs to zero.

But narratives can't change market realities: building sustainable businesses still requires people, time, patient capital and well-built products that solve real problems that customers will pay for. The current wave of AI enthusiasm is just another bubble primed for correction.

The question isn't whether AI will transform business. It's whether this generation of 'AI-first' AKA 'people-second' companies can survive long enough to find sustainable business models before the hype and capital runs out.

https://renderghost.leaflet.pub/3lzdifns2ac2e
AI in Peer Review: Heuristics for Academic Publishers
A draft evaluation framework to assess risks, trade-offs, and practices in scholarly communication.
Show full content
1. Technical Architecture & Performance1.1 Model Transparency

Can the AI system's decision-making process be inspected, explained, and validated by domain experts?

1.2 Performance Boundaries

Are the system's capabilities and limitations explicitly defined and empirically validated for peer review tasks?

1.3 Reliability Metrics

What quantitative evidence demonstrates consistent performance across diverse manuscript types, disciplines, and edge cases?

1.4 Interoperability

How and where does the system integrate with existing editorial management systems and workflows?

2. Data Governance & Privacy2.1 Confidentiality Protocols

How is manuscript content isolated from model training pipelines, external data repositories, and other 3rd party purposes?

2.2 Intellectual Property Protection

What contractual and technical safeguards prevent unauthorised use or exposure of research data?

2.3 Data Retention & Deletion

What policies govern the storage, access, and disposal of processed manuscripts?

2.4 Compliance Framework

How does the system comply with frameworks such as privacy legislations, institutional review board policies, and disciplinary data standards?

3. Risk Management & Quality Assurance3.1 Adversarial Resilience

What defences exist against prompt injection, model manipulation, and systematic gaming?

3.2 Hallucination Prevention

How are fabricated references, false claims, and spurious correlations detected and prevented?

3.3 Bias Mitigation

What processes identify and correct for demographic, geographic, institutional, and methodological biases?

3.4 Error Recovery

What protocols exist for identifying, documenting, and rectifying AI-generated errors post-publication?

4. Governance & Accountability Structure4.1 Liability Framework

Who bears legal and ethical responsibility for AI-generated review content?

4.2 Decision Authority

At what point in a process is human judgment mandatory versus optional?

4.3 Audit Infrastructure

What documentation trail enables post-hoc review of AI involvement in editorial decisions?

4.4 Escalation Pathways

How are disputes, appeals, and concerns about AI use handled?

5. Stakeholder Impact & Communication5.1 Disclosure Standards

How is AI involvement communicated to authors, reviewers, readers, and indexing services?

5.2 Consent Mechanisms

What opt-in or opt-out options exist for authors and reviewers?

5.3 Feedback Loops

How is stakeholder experience regularly captured and incorporated into system improvements?

6. Economic & Strategic Considerations6.1 Cost-Benefit Analysis

Beyond efficiency gains, what value does AI add to review quality and journal reputation?

6.2 Vendor Dependencies

What contingencies exist if an AI provider discontinues service, changes terms, increases prices, or alters service quality?

6.3 Competitive Positioning

How does AI adoption affect the journal's reputational standing, relative to peers?

6.4 Resource Allocation

What human and technical resources are required for responsible well-governed implementation?

7. Scholarly Ecosystem & Long-term Sustainability7.1 Reviewer Development

How might AI use affect early-career researcher training and expertise development?

7.2 Community Trust

What evidence demonstrates that AI enhances rather than undermines scholarly credibility?

7.3 Knowledge Evolution

How does the system adapt to emerging methodologies, interdisciplinary work, and paradigm shifts?

7.4 Cultural Preservation

What measures ensure AI augments rather than replaces collegial discourse and mentorship?

8. Validation & Continuous Improvement8.1 Independent Verification

Has the system undergone third-party evaluation specific to peer review contexts?

8.2 Performance Monitoring

What metrics track accuracy, fairness, and stakeholder satisfaction?

8.3 Update Protocols

How are model improvements tested and deployed without disrupting ongoing publishing and peer review processes?

8.4 Sunset Criteria

What triggers would necessitate discontinuing or fundamentally restructuring AI use?


About this Document

AI in academic publishing is often sold as a story of speed and efficiency. But behind the hype, there are major gaps: few practical tools help publishers weigh the risks, trade-offs, and cultural implications of using AI in peer review.

This document is an sketch framework of heuristics — not a finished guide, but a set of questions drawn from my experience and observations that highlight where careful scrutiny is most needed.

I hope you find it useful.

How to use it

Treat this less as a checklist and more as a set of provocations.

Each heuristic is framed as a question for publishers and institutions. Some will apply directly; others may spark new ideas and directions. The aim isn’t to settle the debate, but to widen it.

Current state

This is an early draft, incomplete and imperfect. I’m sharing it now to test it in the open, rather than waiting for a “finished” version that may never come.

How to contribute

This framework, if useful can eventually live somewhere more structured — a shared space where others can contribute, adapt, or challenge it. For now, I’m releasing it in this rough form to start the conversation.

If you’ve got thoughts, critiques, or additions, you can comment on this document, or can reach me directly on LinkedIn or Bluesky. I'm eager to hear from you how these heuristics resonate, where they fall short, and what’s missing.

— Barry


License

AI in Peer Review: Heuristics for Academic Publishers © 2025 by Barry Prendergast is licensed under CC BY-NC-SA 4.0

https://renderghost.leaflet.pub/3lz7q5vfxas2u