The Programmer's Paradox

Security

Paul W. Homer May 14, 2026 Updated May 14, 2026

Programmers hate adding security to their systems.

First, it is a huge amount of work, and second, since it is so often left to the end, it is very ugly and disruptive work. A patchwork of hacks.

But it’s misdirected. Without enough security, the system they built is useless, well, worse than useless. If people use it, it could severely screw them over. Nobody would intentionally use something that helps criminals more than it helps them. Even if it is in a walled garden, you can never really be sure that someone isn’t motivated to take a peek.

It’s worth noting that I am not a security expert, and although I’ve had to deal with it a lot in my career, my practice might not be as strong as the experts would like. That being said, I’ll continue.

There are only a few things you need to worry about in security. First is actually identifying any ‘users’. You always have to know who they are and have enough confidence in that decision that you don’t make a mistake.

Then the other part of it is that you want to protect both the data and the code from anyone who isn’t supposed to see or activate it. It’s not enough to protect just the data or just the code. You need both.

In that sense, security isn’t that hard if it is your concern from day one. There are a bunch of entry points that people will use to get to the features and functionality. First, you identify the caller, then you check to see if they can access the given functionality. If they can, then lower down, you check to see if they can access all of the data input into that functionality. If they can, then they can see the output. Simple 🙂

Here’s where the trouble starts. First, there should be no anonymous endpoints. People love adding them, but they open the door to leaks or denial-of-service attacks. If you have none, though, all of that goes away. If you can’t quickly identify someone right at the top, punt them immediately and send a log to some administrator. They might have to block the incoming address or put up some firewalls to stop botnets and other nasty things. You always need to flag a punted user as a serious problem.
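As a rough sketch of the "identify or punt" rule, here is what the top of an entry point might look like in Python. The session table, `identify`, `handle_request`, and `PuntedRequest` are all hypothetical names made up for illustration; a real system would plug in its own authentication and alerting.

```python
import logging

# Hypothetical in-memory session table: token -> user id.
# A real system would verify a signed token or call an identity service.
SESSIONS = {"token-abc": "alice"}

security_log = logging.getLogger("security")


class PuntedRequest(Exception):
    """Raised when a caller cannot be identified at the entry point."""


def identify(token, source_address):
    """Return the user id for a token, or punt the request and flag it."""
    user = SESSIONS.get(token)
    if user is None:
        # Treat every punt as a serious event that an administrator should see.
        security_log.error("punted unidentified request from %s", source_address)
        raise PuntedRequest(source_address)
    return user


def handle_request(token, source_address, action):
    """Every entry point identifies the caller before doing anything else."""
    user = identify(token, source_address)
    return action(user)
```

Because `handle_request` is the only way in, there is no code path that reaches `action` without an identity attached.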

Second is databases. For capitalist reasons, they charge by users, so the system users are not the same as the database users. That sucks, and it has always sucked. Life would be pretty easy if a person’s identity propagated all the way down to the metal. It should.

If there was a necessity for a group or functional account shared by a bunch of people, then the group is the identity, and that identity goes all the way down.

If your database or its license makes that impossible, then you need to wrap it. You need to wrap it thoroughly enough that pretty much nobody can get to it in any way without first passing an identity check. So, not just in the backend code, but also on the machine in the scripts, with the OS, etc. Everywhere.

Wrap the database. It won’t make it convenient, since that is the opposite of secure, but you need to do it.

Now, at the top, after you have checked identity, you take a quick look at whatever functionality is called. Are they allowed to use it? In some extreme cases, that is a messy lookup table, and it needs to be managed by data admins. It’s annoying, but in a large organization, that really should be a distinct piece that is shared by a whole bunch of systems. You just check with it: user X wants to call foo, is that okay?

If that’s good, then as the code executes and hits the wrapped database, the second check will trigger on the data. If it’s good, then it is all done. If you always reuse both the high and lower levels, then the security will be everywhere, and you don’t have to lie awake at night worrying about it failing.
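The two levels of checking can be sketched in a few lines of Python. This is only an illustration under assumptions: the grant tables, `may_call`, `WrappedDatabase`, and `read_report` are invented names, and the in-memory dictionaries stand in for a shared permission service and a real database.

```python
# Hypothetical lookup tables; in a large organization these would live in a
# shared service managed by data admins.
FUNCTION_GRANTS = {("alice", "read_report"), ("bob", "read_report")}
DATA_GRANTS = {("alice", "report-1"), ("bob", "report-2")}

# The raw store; nothing outside WrappedDatabase should touch it directly.
_RAW_ROWS = {"report-1": "Q1 numbers", "report-2": "Q2 numbers"}


class AccessDenied(Exception):
    """Raised when either the functionality check or the data check fails."""


def may_call(user, function_name):
    """High-level check: may this user call this functionality?"""
    return (user, function_name) in FUNCTION_GRANTS


class WrappedDatabase:
    """Low-level check: every read carries the identity of the caller."""

    def read(self, user, key):
        if (user, key) not in DATA_GRANTS:
            raise AccessDenied(f"{user} may not read {key}")
        return _RAW_ROWS[key]


def read_report(user, key, db):
    """An entry point that reuses both checks."""
    if not may_call(user, "read_report"):
        raise AccessDenied(f"{user} may not call read_report")
    return db.read(user, key)
```

Here `alice` can call `read_report` and read `report-1`, but asking for `report-2` fails at the wrapped database even though the functionality check passed.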

The only other part is that if a user ever sends you ‘code’, you laugh and reject it. If you want some cool dynamic execution feature, great, but there have to be two paths, not one. The code comes in from somewhere else, having been fully and completely vetted, and then the user later asks for it to execute dynamically. That keeps it really simple, and sets you up with some external means for this uber dangerous code to be properly managed, vetted, and approved. That in itself is a huge task; you can’t just ignore it and hope for the best. Dynamic code can never be ‘open’ dynamic code; it has to be closed and come from a reliable source that actually has to be more reliable than just reliable.
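One way to picture the two paths is a registry that only an approval process can write to, while users can only ask for entries by name. This is a sketch under assumptions: `VETTED`, `approve`, `run_vetted`, and `to_upper` are hypothetical, and the actual vetting (the hard part) is outside the code.

```python
# Hypothetical registry of vetted code. Entries are added only by a separate
# approval process, never by the request path.
VETTED = {}


def approve(name, func):
    """Path one: called by the (external) vetting process, not by users."""
    VETTED[name] = func


def run_vetted(name, *args):
    """Path two: a user asks for already-approved code to run, by name."""
    if name not in VETTED:
        raise PermissionError(f"{name!r} is not vetted code")
    return VETTED[name](*args)


def to_upper(text):
    # Stand-in for some piece of code that went through review.
    return text.upper()


approve("to_upper", to_upper)
```

A user can request `run_vetted("to_upper", "hello")`, but anything they send that is not already a registered name is rejected rather than executed.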

So, in the end, if you wrap the database, always identify everyone, manage a lookup table or two, and punt anything that could ever be possibly executed by any downstream party or library, then you’re done. All of this code is reusable; you just need to do it once, at the beginning of the project, then leverage it for success and glory.

Goodness

Paul W. Homer May 7, 2026 Updated May 7, 2026

It really feels like the world has sunk to its lowest point in my lifetime. And it does not seem likely to improve anytime soon. We’re sliding downhill.

When I first started playing with computers, way back in the 80s, I felt like they had huge potential to help humanity. To lift us up, but it seems like they did the opposite. First, they trapped us; now they are forcing us to regress.

It’s not the computers themselves, but the types of monsters that latch onto them in order to make money, grab power, and manipulate people. Sadly, they find it too easy to use software to do this.

It seems like software developers made a rather tragic mistake in not preventing this earlier. We were just too eager to build stuff; we didn’t ask enough questions.

But the good news is that we can still do something about it now. We can build all new stuff that is empathy-driven and meant to really help people, not just pick their pockets or force their behaviour.

The trick is not to get hyper-focused on the technology itself; it doesn’t really matter. Instead, we put ourselves into the shoes of the users. Software without empathy is just a weapon waiting to be exploited.

The problem has always been that empathy-driven software is extraordinarily hard to write. It’s not just getting the technology to dance, or flooding it with domain data; you have to integrate all that very carefully into the full context of a user's life in order to shave off any of the hard spikes. It’s not just code and data; it is code and data that deliberately help people. It all emanates from their perspective, not from the builders or the operators.

To get us going, I’d suggest that everyone just start throwing any “non-monetizable” ideas they have out there. Pick a problem you know, write up a dream solution, and publish it. It doesn’t have to be practical; it doesn’t even have to be possible. It’s not about technology, but about seeing the world from the user’s perspective and making suggestions about how to really improve it. Too often, we first focus on technology, and then we try to shove it back into the solution space. That doesn’t work very well.

Once we have ideas, we can figure out how to implement these as solutions in ways that can’t be subverted by monsters. That, of course, is the difficult part. Serious software is still very expensive to build and run, and getting it funded comes with a lot of painful strings that, as we’ve seen over and over again, are used to pull the efforts off in very bad directions.

If we can figure that out, then we just need to find a way to swap these technologies with the mess we’ve got right now.

I’ve occasionally dumped out some raw ideas:

https://theprogrammersparadox.blogspot.com/2024/05/identitynet.html

https://theprogrammersparadox.blogspot.com/2015/08/digital-discussions.html

https://theprogrammersparadox.blogspot.com/2009/04/end-of-coding-as-we-know-it.html

They were mostly unfundable, and since I needed to pay the bills, they were beyond my ability to take further. But I’ve always thought it would be cool if someone was inspired to do something similar, so I wrote them up.

Other areas that desperately need our attention:

Source of Truth. I appreciate and admire Wikipedia, but I really want something more structured, like an ontology built on graphs or even hypergraphs, that contains all of human knowledge or at least as much as we can capture right now. It would assign a probability to any “knowledge”. For instance, a known mathematical proof would be 100% correct, but most other things we think are true are at best 99ish. And the myths and falsehoods are really low, maybe even 0. If there were multiple competing opinions, they would all exist in the data, but with some percentage of likely truth (as of today). It would be worldwide and not controllable by any country or dictator. Untouchable by monsters. A perfect use case for decentralization.

Privacy. We need to protect any facts about individuals, but still provide some (difficult) means of external verification. This would extend to group conversations as well. Some part of it would only allow retrospective external access if and only if the case made it to a territorial court accepted by all of the individuals. That is, they can’t spy on you, but if you did something bad in some jurisdiction that you have accepted, the information could be retrieved if there were actual court proceedings. It’s the notion that they have to do the policing legwork to catch you, but once you are in trouble, the whole truth will come out.

Time/Complexity Simulations. Being able to list out the consequences of a given decision over a complex circumstance. Lots of moving parts. You could throw together some approximate complexity for something, then play with any possible decisions to see how they fare in the long run. We need this, as too many people can’t see beyond extremely short horizons. Even if it is crude, it would help people think about more than just tomorrow, or next month, or next quarter. If you could come back with a 48.2% chance that “doing that” would turn the profits negative in the next seven years, it would be harder for someone to just forge ahead blindly. Or it could show that there really are “century” events that we do need to protect against, like pandemics.

Consolidation. It sucks having to rely on dozens of different, widely inconsistent apps. Their collective value is eroded by the combined cognitive load. I’d want something simpler than a spreadsheet that brings together the common data and can trigger code in all sorts of remote places. A customized gateway that makes it easy to leverage the power of a computer, but just for you. The trick is to breach the complexity limits that so often hold us back. The abstraction that holds it together can’t be too abstract but still needs to be powerful enough that it is all-encompassing. If I could configure it for all of the repeatable parts of my life, like a crazy, distributed, super-integrated to-do list, with behaviours and data shareable for a wide range of scenarios, it would be my first point of contact on all my computers. It would have some deep way of reorganizing itself as I keep adding more to it. The key, though, is that it isn’t a remote service; you don’t rent it. You own it, it is yours, it can be seamlessly upgraded over the decades of your life, and it is fully private. The costs are trivial, but it will consume your time. There are parts you can share if you want, but there would never be a way to make money off your contributions; all you get for your efforts is a better life.

Guardrails. Lots of awful stuff happens on the web. Why? Why can’t we keep the good qualities of the Internet, without continuously opening doors for the bad ones? My guess is that capitalism drives an unquenchable thirst for monetization, so making that safe is just too costly. Eats into the profits. So we get half-baked stuff that eventually the monsters figure out how to leverage. From that perspective, it seems like we could put up some types of walls and fences that would protect this weak code from being exploited. Protect private data from going anywhere. You shouldn’t be subject to an attack unless you explicitly lowered your guard. It shouldn’t be possible to trick you into lowering your guard. All of the angst from this not being true today piles on the friction that devalues the capabilities of the computers. Finding a way to stop that is huge.

I’m sure there are a million more issues and ideas out there. Now is the time to flood the world with them, and then maybe we can figure out how to bring the best of them to life. If you do this, odds are you won’t get much credit, and it definitely won’t make you rich, but it is still a good thing to do, so it is worth the effort.

What we ultimately want is for computers to make our lives easier and more meaningful. To take away some of the drudgeries and difficulties of reality, but not numb us into a coma or stupor. Sure, we’ll still turn to the machines for assistance, but we won’t get caught in negative incentive loops like doom scrolling. We will live life in reality, not digitally.

To get this, we need to stop the people who are financially motivated from bending all of the technologies against us. They only see the bad potential, realize their use in carving off profits, and then find ways to slip these into our lives. They’re tearing us apart so they can own mansions, sports cars, jets, and yachts. We have to stop allowing this.

Shortcuts and Makework

Paul W. Homer Apr 30, 2026 Updated Apr 30, 2026

On the face of it, shortcuts and makework may seem like they are opposites.

A shortcut is a faster way to do something that effectively pushes out the consequences down the road. You take the quick and easy way now, only to pay for it later.

Makework, on the other hand, is anything that you are made to do that does not directly or indirectly contribute to the work at hand. For instance, you fill out a complicated form with copious details that is ultimately ignored forever.

Makework is usually some people trying to control or throttle others, often an abuse of power, or a justification of their value.

In bureaucratic organizations, the centralized control over poorly understood aspects of the company is usually thick with makework. There are plenty of administrators trying to control things that they do not understand. Thus, the rules and processes get weird and form the basis for lots of politics.

But the two are oddly related. Where you see one, you usually see the other.

The root cause is time. There is a project that has a tight timeline. But the people working keep losing huge blocks of time to makework. However, as makework is an integral part of the organization, blaming it for being late is not allowed. So, in order to try to catch up, they resort to a lot of shortcuts. The long-term consequences don’t matter if, in the short term, you will get in trouble for being late. The context of the project forces mistakes and panic.

It gets triggered the other way, too. Some people just take shortcuts out of habit; the project looks initially crazy successful. But as the long-term consequences come due, it collapses. In the downfall, lots of unrelated people jump in to “help”, like bureaucrats and generic management. Since they don’t understand and they don’t know why a once successful effort suddenly flipped, they propose a lot of work that they believe will fix the problem. More tracking, more documentation, more sign-offs, more meetings, etc. But all of this is effectively makework, and the real problem of replacing the shortcuts causing all of the grief gets ignored. The project ends up under the microscope, which amplifies its problems and does not correct them.

Mostly, though, the best approach is to be rigorously practical. Minimize both shortcuts and makework. Carefully assess any and all effort with respect to both of those categories. If it smells like one or the other, don’t do it. Get the core work done as best as possible.

The other part is to tackle the hardest parts first, don’t leave them for later, and don’t rush through them. While that gives the appearance of being late right from the get-go, it provides two valuable properties.

First, if you get stuck, you can raise a late flag early, rather than later, which tends to mitigate some of the bureaucrats coming out of the woodwork and drowning you with makework.

Second is that a shortcut on the hard stuff is way more destructive than a shortcut on the easy stuff. If time forces you to take shortcuts, then the ones with minimal consequences are far better. They are less costly to repair later. If the foundations are solid, you have a better shot at recovery when late. In many organizations, you are already late long before you even realize that you have work to do. It’s normal, so you need to adopt habits that mitigate it.

The biggest problem, though, is that coming up with shortcuts or makework is often a lot easier than doing things properly. It’s the easy path for both the workers and management. But it is an unsuccessful path too. It distracts from the things that really need to get done.

Ultimately, there is some work that needs to be finished with at least enough quality to keep the detractors at bay. Do that work, don’t get lost by trying to avoid it.

Users

Paul W. Homer Apr 23, 2026 Updated Apr 23, 2026

A common misconception in software development comes from not understanding users.

For any piece of software, there is a bunch of primary users who are using it to solve their problems. This ranges from commercial product usages all the way to large enterprise usages.

If the software is large and has been evolving for quite some time, these usages are often partitioned into different subgroups. Some use one feature set, others use a different one. Some users are tied directly to their group, and others will overlap between a bunch of groups.

There is usually a second set of users, whose primary tasks are system management, usually related to the data in the software, but sometimes configuration, access, or feature capabilities. Occasionally referred to as data administrators.

They most often sit outside of the primary users and do not use the software to solve those problems; they are just involved in making sure the software itself is capable of allowing the other users to get their work done.

Totally ignored, there is actually a third set of users. These users take the software, install it on the fundamental hardware, and interact with it when there are problems. Sometimes they don’t even know what the software does, but they are still responsible for providing the platform for the software to exist and fulfil its role.

It used to be that this was classified as an Operations Department, but over the decades, there have often been a lot of Software Developers directly involved as well. These are users, too. They should have their own ops, test, or development accounts; they have complex access issues, and sometimes they have to be able to get into the software to determine that it is working or malfunctioning in a specific way, but most of the time they should not be able to see what the data administrators can see.

Traditionally, people have not designated operations as users, which is a common mistake. They do “use” the software, and getting their work done is also dependent on it. It’s just that they are not focused on using the specific features or managing the data.

If you take a step back, then user requirements, and even user stories, should cover all three groups, not just the first or second one.

But it’s also true that if an enterprise has hundreds of software packages running, from an operations perspective, all of them share a large number of common requirements. They all need to be installed and upgraded, they all need to be monitored, and they all need some form of smoke tests for emergencies. An operations dept could put out a list of mandatory common user requirements long before any specific software project was a twinkle in someone’s eye.

What’s also true is that for the most part, these types of requirements do not change significantly with most tech stacks. The specifics may change a bit, but the base requirement is locked in stone.

This is a rather classic mistake that happens with new tech stacks or operations environments. Because they are new, everyone thinks they can start over again from scratch and ignore any previous round of complexities. However, once things get going, those same complexities come back to haunt the effort, and because they were ignored, they get handled extra poorly.

So, we see people put up software systems without any adequate monitoring, for example, and are surprised when the users complain about the system being down. Pushing the monitoring back onto the first and second user groups is common now, but it still makes the effort look rather amateur. Operations should be the first to know about a crash; they just can’t detect more subtle bugs buried deep under big features.
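A common operations requirement like "some form of smoke tests" does not need much code to get started. The following is a minimal sketch, not a monitoring product: `run_smoke_tests` and the two probes are hypothetical, and real probes would ping the service, the database, and so on.

```python
def run_smoke_tests(checks):
    """Run each named check; return the names of the ones that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures


# Hypothetical probes for illustration only.
def service_responds():
    return True


def disk_has_space():
    raise OSError("disk full")  # simulate a failing probe


failed = run_smoke_tests({"service": service_responds, "disk": disk_has_space})
```

If something like this runs on a schedule and alerts on a non-empty `failed` list, operations hears about a crash before the other two groups of users do.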

The users of software are anybody who interacts directly with that software, in any way. Non-users, while they still may be “stakeholders” in the effort, will never run the software, test it, log into it, or trigger some of the features.

Someone may be responsible for making sure the software project gets done on time, but if they do not interact with the software, they are not a user.

User requirements should have special priority above almost all other aspects of the work. They would only take a back seat when there are overriding cost or time constraints. But it really should be written down somewhere that the users did not get the feature or functionality they needed due to the enforcement of these constraints. At minimum, that builds up a wonderful list of future enhancements that should be considered as early in the effort as possible. The key point, though, is knowing that what the users need in order to actually solve their problems is different from the specifics of a technical solution implementation.

Strange Loops

Paul W. Homer Apr 16, 2026 Updated Apr 16, 2026

I remember when I was in school; we got a difficult programming assignment that I struggled with. The algorithm, if I remember correctly, was to fill an oddly shaped virtual swimming pool at different levels, and then calculate the volume of water. The bottom of the pool was wavy curves.

The most natural way to write the code to simulate adding little bits of water at a time to the pool was with recursion. No doubt, you could unroll that into a loop and maybe a stack or two, but we were supposed to use recursion; it was the key to the assignment.
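To show what "recursion versus a loop and a stack" looks like, here is a much simplified stand-in for that kind of problem. It is not the original assignment: the grid, the cell-counting notion of volume, and both function names are assumptions made for illustration. Each function counts the empty cells that water poured in at one spot could reach.

```python
# '#' is pool wall or floor, '.' is empty space that water can occupy.
POOL = [
    "#......#",
    "#..#...#",
    "#.###..#",
    "########",
]

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def fill_recursive(grid, row, col, seen):
    """Count the empty cells reachable from (row, col), recursively."""
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        return 0
    if grid[row][col] == "#" or (row, col) in seen:
        return 0
    seen.add((row, col))
    return 1 + sum(
        fill_recursive(grid, row + dr, col + dc, seen) for dr, dc in NEIGHBOURS
    )


def fill_with_stack(grid, row, col):
    """The same count, unrolled into a loop with an explicit stack."""
    seen = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            continue
        if grid[r][c] == "#" or (r, c) in seen:
            continue
        seen.add((r, c))
        stack.extend((r + dr, c + dc) for dr, dc in NEIGHBOURS)
    return len(seen)
```

Both versions visit the same 14 cells of `POOL` when started at `(0, 1)`; the recursive one just lets the call stack do the bookkeeping that the second one does by hand.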

I had never encountered recursion before, and I found it mystifying. A function calls itself? That just seemed rather odd and crazy. Totally non-intuitive. I coded it up, but it never really worked properly. I got poor marks on that assignment.

Many months later, the light went on. It came out of nowhere, and suddenly, not only did recursion make sense, but it also seemed trivial. Over the decades, I’ve crafted some pretty intense recursive algorithms for all sorts of complex problems. It now comes intuitively, and I find it far easier to express the answer recursively than to unroll it.

Recursion is, I think, a simple version of what Douglas Hofstadter refers to in GEB as a ‘strange loop’; a paradox is a more advanced variety. For me, it is a conceptual understanding that sort of acts like a cliff. When you are at the bottom, it makes no sense; it seems like a wall, but if you manage to climb up, it becomes obvious.

There are all sorts of strange loops in programming. Problems where the obvious first try is guaranteed to fail, yet there is another simpler way to encode the logic that always works.

A good example is parsing. The world is littered with what I call string-split parsers. They are the most intuitive way to decompose some complex data. You just start chopping the data into pieces, then you look at those pieces and react. For very simple data that works fine, but if you fall into programming languages, or pretty much anywhere where there are some inherent ambiguities, it will fail miserably.
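A small example makes the failure concrete. The sketch below uses a toy arithmetic grammar chosen for illustration (integers, `+`, `*`, and parentheses); `tokenize` and `parse` are hypothetical names. Splitting on `+` cuts straight through the parentheses, while a tiny recursive descent parser, with one function per grammar rule, handles the nesting.

```python
def tokenize(text):
    """Split text into integer and single-character operator tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(int(text[i:j]))
            i = j
        elif ch in "+*()":
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


def parse(text):
    """Evaluate the expression using one function per grammar rule."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():  # expr := term ('+' term)*
        value = term()
        while peek() == "+":
            advance()
            value += term()
        return value

    def term():  # term := factor ('*' factor)*
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():  # factor := NUMBER | '(' expr ')'
        token = peek()
        if isinstance(token, int):
            advance()
            return token
        if token == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        raise ValueError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


# Naive splitting on '+' cuts through the parentheses.
naive_pieces = "2*(3+4)+1".split("+")
```

`naive_pieces` comes out as `['2*(3', '4)', '1']`, which no amount of reacting to the pieces will repair, while `parse("2*(3+4)+1")` returns 15.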

But all you really need to do to climb this cliff is read the Green Dragon book. It gives you all of the practical knowledge you need, not only to implement a parser, but also to understand any of the complicated parser generators, like ANTLR or Yacc.

I guess the cool part is that once you have encountered and conquered a particular strange loop, that knowledge is so fundamental that it transcends tech stacks. If you can write a solid parser in C, you can write it in any language. If you understand how to write parsers, you can learn any programming language way faster. And nothing about that knowledge will radically change over the course of your career. You’ll jump around from different domains and stacks, but you always find that the same strange loops are waiting underneath.

A slight change over the decades was that more and more of the systems programming aspects of building software ended up in libraries or components. While that means that you’ll have fewer opportunities to implement strange loops yourself for actual production systems, you really still do need to understand how they work inside the components to leverage them properly. Being able to pound out a parser and an AST does help you understand some of the weirdness of SQL, for example. You intuitively get what a query plan is, and with a bit of understanding of relational algebra and indexing, how it is applied to the tables to satisfy your request. You’ll probably never have enough time in your life to write your own full database, but at least now you can leverage existing ones really well.

I’m not sure it’s technically a strange loop, but there is a group of mathematical solutions that I’ve often encountered that I refer to as ‘basis swaps’ that are similar. Essentially, you have a problem with respect to one basis, but in order to solve it, you need to find an isomorphism to some other basis, swap it, partially solve it there, then swap all or part of it back to the original basis. This happens in linear programming and exponential curve fitting, and it seemed to be the basis of Andrew Wiles’s proof of Fermat’s Last Theorem. But I’ve also played around with this for purely technical formal systems, such as device-independent document rendering. I guess ASTs and cross-compilers are there too, as are language-specific VMs like the JVM.
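Exponential curve fitting is probably the easiest of these to show in code. The sketch below is one common way to do it, offered as an illustration rather than as the specific method meant above: to fit y = a * exp(b * x), take logs so the curve becomes a straight line, solve the line with ordinary least squares, then map the intercept back. The function name and the synthetic data are assumptions, and the approach needs every y to be positive.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by swapping to log space and back."""
    # Swap: taking logs turns the curve into a line, ln(y) = ln(a) + b * x.
    log_ys = [math.log(y) for y in ys]

    # Solve there: ordinary least squares for a straight line.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx
    intercept = mean_ly - b * mean_x

    # Swap back: the intercept is ln(a) in the original basis.
    return math.exp(intercept), b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]  # synthetic, noise-free data
a, b = fit_exponential(xs, ys)
```

On this noise-free data the fit recovers a close to 2.0 and b close to 0.5. With noisy data, the log-space fit minimizes error in ln(y) rather than in y, which is part of why only some of the answer gets swapped back unchanged.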

What I’ve seen in practice is that some programmers, when confronted with strange-loop type problems, go looking for shortcuts instead of diving in and trying to understand the actual problem. I do understand this. There is so much to learn in basic software development that you really don’t want to have to keep going off and reading huge textbooks. But I also know that trying to cheat a solution to a strange loop is a massive time waste. You’re always just another bug away from making it work correctly, but it will never work correctly. The best choice if you don’t have the time to do it properly is always not to do it. Instead, find someone else who knows or use something else where it already exists and is of pretty decent quality.

Mostly, there are very few strange loops in most applications programming, although there are knowledge swamps like schema normalization that cause similar grief. Once you stumble into system programming, even if it is just a simple cache, a wee bit of locking, or the necessity of transactional integrity, you run into all sorts of sticky problems that really do require existing knowledge to resolve them permanently.

Strange loops are worth learning. They don’t change with the trends or stacks, and they’ll enable you to write, use, or leverage any of the software components or tools floating around out there. Sure, they slow you down a bit when you first encounter them, but if you bother to jump those hurdles, you’re lightning fast when you get older.

The Quality Bars

Paul W. Homer Apr 9, 2026 Updated Apr 9, 2026

For any given software development, there are a bunch of ‘bars’ that you have to jump over, which relate to quality.

At the bottom, there is a minimum quality bar. If the code behaves worse than this, the project will be immediately cancelled. Someone who is watching the money will write the whole thing off as incompetence. To survive, you need to do better.

A little higher is the acceptable quality bar. That is where both management and the users may not be happy about the quality, but the project will definitely keep going. It may face increased scrutiny, and there will probably be lots of drama.

Above that is the reasonable quality bar. The code does what it needs to, in the way people expect it to behave. There are bugs, of course, but none of them are particularly embarrassing. Most of them exist for a short time, then are corrected. The total number of the known long-term outstanding ones is one or two digits. There are probably several places in the code where people think “we should have ...”

Then we get into the good quality bar. Bugs are rare; there are very few regrets. People like using the code; it will stay around for a long, long time. Its weakness isn’t what’s already there; it is making sure future changes don’t negate that value.

There is a great quality bar too. The code is solid, dependable, and can be used as a backbone for all sorts of other stuff. It’s crafted with a level of sophistication that keeps making it useful even for surprising circumstances. People can browse the code and get an immediate sense of what it does and why it works so well.

Above that, there is an excellent quality bar, where the code literally has no known defects. It was meticulously crafted for a very clear purpose and is nearly guaranteed to do exactly that, and nothing but that. It’s the type of code that lives can depend on.

There is a theoretical ‘perfect’ quality bar, too, but it is unreachable. It’s asymptotic.

Getting to the next bar is usually at least 10x more work than getting to the one below it; the scale is clearly exponential. If it costs 1 just to get to minimum, then it’s 10 for acceptable, and 100 for reasonable. Roughly. This occurs because the higher bars need people to continually revisit and review the effort and aggressively refactor it, over and over again. Code that you splat out in a few hours is usually just minimum quality. Maybe if you’ve written the same thing a few times already, you can start at a higher bar, but that’s unreliable. To get up to those really high bars means having more than one author; it has to be a group of people with an intense focus, all working in sync with up-to-date collective knowledge. Excellent code has a guarantee that it will not deviate from expectations, one that you can rely on, so it’s far more than just a few lines of code.
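The arithmetic is easy to extend up the scale. This is a back-of-the-envelope sketch: the flat factor of 10 between every pair of bars is an assumption carried over from the "at least 10x" estimate, and `BARS` and `relative_cost` are names invented here.

```python
BARS = ["minimum", "acceptable", "reasonable", "good", "great", "excellent"]


def relative_cost(bar, factor=10):
    """Cost of reaching a bar, relative to a cost of 1 for the minimum bar."""
    return factor ** BARS.index(bar)


costs = {bar: relative_cost(bar) for bar in BARS}
```

Under that assumption, good comes out at 1,000 times the cost of minimum, and excellent at 100,000 times, which is a lower bound if each step is really "at least" 10x.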

A great deal of the code out there in libraries and frameworks falls far short of being reasonable. You might not get affected by that, as it’s often code that is sitting idle in little-used features. Still, you have to see them as a landmine waiting to go off when someone tries to push the boundaries of its usage. Code that has been battle tested for decades can generally get near the good bar, but there is always a chance that some future version will fall way, way back.

The overall quality of a codebase is really its lowest bar. So if someone splats some junk into an excellent project and it’s ever triggered, it can pull everything else down below acceptable. This is the Achilles heel of plugins, as a few poor ones getting popular can cause a lot of damage to perceived quality.
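
The weakest-link idea can be sketched as a simple minimum over the parts that actually get triggered. The bar names, their order, and the example project are all hypothetical, made up only to illustrate the point.

```python
# The bar names and their order are an assumption pieced together from the text.
BARS = ["minimum", "acceptable", "reasonable", "good", "great", "excellent"]


def overall_quality(triggered_parts: dict[str, str]) -> str:
    """The codebase is only as good as the worst part that actually runs."""
    return min(triggered_parts.values(), key=BARS.index)


# Hypothetical project: a carefully built core plus one popular, sloppy plugin.
project = {"core": "excellent", "storage": "great", "popular_plugin": "minimum"}
print(overall_quality(project))
```

Leave the plugin out of the dictionary, and the same project rates as great; include it, and everything drops to the plugin's level.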

Outlines

Paul W. Homer Apr 2, 2026 Updated Apr 2, 2026

A software system is a finite resource.

For some people, this might be a surprising statement. Since a system can store a massive amount of data and talk to any other system in the world, it might feel a lot closer to infinite.

At any given time, there is a specific quantity of hardware, wires, and electricity. If more of these resources are available than are currently being consumed, they are still finite. It’s space to grow, but still limited.

Anything that operates within this finite boundary is in itself finite. Sure, it is always changing, usually growing, but despite its massive and somewhat unimaginable size, it is still finite.

If, even in its immense size and complexity, all software is finite, then any one given system within this is also finite.

A software system has fixed boundaries. It does exactly one set of things. Parts of the code may be dynamic, so they have huge expressive capabilities, but there are still very fixed boundaries on exactly how large those are. It may be permutations greater than all of the particles in the known universe, but it still has a limit.

Time might appear to change that, but the period of time for which any given piece of software will be able to run is also finite. Someday it will come to an end. The hardware will disintegrate, the sun may supernova, or humanity may just blow itself up. More likely, though, it will just get upgraded and essentially become something else.

Given all of this, any given software system in existence, or as imagined, has a very sharp boundary that defines it. In that sense, since it is composed of code and configurations, those precisely dictate what it can and cannot do.

You can go outside of this boundary and write tests that confirm 100% of these lines. It’s just that, given the ability to have dynamic code, it may take far, far longer to precisely test those behaviours than the lifetime of the system itself. Still, even though it is vague, due to time and complexity, the tests form an encapsulating boundary on the outside of the system.
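
A back-of-the-envelope sketch shows how quickly exhaustive testing outruns any realistic lifetime. The case here is hypothetical: one function taking two 64-bit integers, checked at an assumed rate of a billion tests per second.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_test_exhaustively(input_bits: int, tests_per_second: int) -> float:
    """Years needed to try every possible input of the given total width."""
    total_inputs = 2 ** input_bits  # finite, but enormous
    return total_inputs / tests_per_second / SECONDS_PER_YEAR


# Hypothetical case: two 64-bit arguments (128 bits) at a billion tests a second.
years = years_to_test_exhaustively(128, 10**9)
print(f"{years:.1e} years")
```

The input space is strictly finite, yet covering it would take many orders of magnitude longer than the age of the universe, and that is for a single small function.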

The same is true for specifications. You could precisely specify any and all behaviours within the software system, but to get it entirely precise, the specifications would have to have a direct one-to-one correspondence with the lines of code and configurations. That would effectively make the specifications an Nth generation language that is directly compilable into a 3rd-generation one, or even lower. Because of this, some people equate precise specifications to the code itself, seeing the code as just a specification for the runtime instances of the software.

An exact specification is almost a proof of correctness. I suspect that proofs need a bit more in that they are driven by the larger context of the expectations. They are also generally only applied to algorithms, not the system as a whole.

So, all of this gives us a bunch of different ways to draw the outlines of a system.

On top of this, there are plenty of vague ways to draw or imagine them as well. Requirements and user stories are popular means. As well, one could perceive the system by its dual, which is the arrangement and flow of its data. You can more easily describe an inventory system, for example, by the data it holds, the way that data is interacted with, and how it flows from and to other systems. While the dual in this case is more abstract, it is also considerably simpler than specifying the functionality or the code.

Another way is to see the system by how people interact with it, essentially as a set of features that people use to solve their problems. If those features are effectively normalized, it too is a simpler representation, but if they have instead been arbitrarily evolved over years of releases, they probably have become convoluted.

One key point with all of these different outline types is that some are much better at describing certain parts of the expectations for the behaviour than others. You might need a proof of correctness for some tiny critical parts, but a rather vague outline is suitable for everything else.

No one representation fits everything perfectly, which even applies to the code. If the code was constructed over a long period of time, by people without strong habits for keeping it organized, it too has degenerated into spaghetti, and isn’t easily relatable to the other outlines. People may have changed their expectations and grown accustomed to the behaviours, but that still doesn’t make it the best possible representation for what they needed.

In practice, the best choice is often to half-do a bunch of different types of outlines, then set it all running and repair the obvious deficiencies. While this obviously doesn’t ensure high quality or rigorous mapping to the expectations, it is likely the cheapest form of creating complex software systems. It’s just that it is also extremely high-risk and prone to failure.

Cogtastic

Paul W. Homer Mar 26, 2026 Updated Mar 26, 2026

Since I started programming decades ago, there has been one seriously annoying trend that just does not seem to want to go away.

If you work for an enterprise, banging away at their internal systems, the management above you really, really, really wants you to just sit there, do your job, and not cause any problems. They want you to be an obedient little cog. Just a part of the machine that they direct to satisfy their goals.

The utter failure with that is that you are in the trenches. And to build even halfway decent stuff, you have to understand a huge amount about the technology and the domain. If you mix in some empathy for the actual users, then the results are usually pretty good. You’ve solved real problems for real people, and they are usually thankful.

But the managers sitting way up high on the hill are disconnected from all of that. They don’t care about users or technology; they care about budgets, politics, and promotions. That’s not a surprise; that is the world they are forced to live in.

But it is a fly in the ointment, since their games are entirely disconnected from the users' lives. And you’re caught in the middle.

In the best circumstances, management enables you to find an appropriate balance between all sides and still keep mostly to some crazy artificial schedule. They trust you, and they listen to your concerns. You’re not a cog, you’re a critical part of the construction process.

But there are very few higher-up managers who actually subscribe to that perspective. Most, instead, believe that they were anointed as the boss and that their will supersedes all other concerns. These are the people who want you to be a quiet, obedient cog. “Just shut up and do your job.”

From my direct experience, it has always been a disaster. It was what was failing in the 70s, 80s, 90s, and early turn of the century with software development. It wasn’t “Waterfall” that deviated the work from where it needed to be; it was the people in charge who mindlessly went off in the wrong direction. If they had been paying attention, they would be course correcting as needed, following whatever chaotic changes tumbled out of the fray. In those days when things failed, they failed at the top, but somehow it was the weight of the process that got blamed.

From where I have often sat, the “root problem” has and will always be a lack of understanding. The people driving and working on the development project get disconnected from the people who will ultimately use the output. If you don’t know what the user’s problem really is, how can you build any type of solution that will help them? You have to dig into that; it’s not optional.

The desire for the development teams to just be mindless cogs comes directly from that cogtastic viewpoint. There is some type of made-up schedule, but the programmers keep surfacing ignored or forgotten issues. If management indulges, then the schedule gets blown. Instead of blaming analysis or a lack of understanding, it is just easier to suppress the feedback and keep on going as planned, hoping that some good luck happens somewhere along the way.

I’m sure there are lots of examples out there of out-of-control projects getting saved at the last minute by some Wesley who manages to Save the Ship and thus avoid impending disaster. That’s great, but highly unreliable, and the people who do this never get any of the credit they deserve for avoiding the original doomed fate. It’s only when they are gone that people realize that they were quietly avoiding the cliffs, while staying nearly on course.

So, you get this situation where someone is “in charge,” and they want their will to be manifested as and how they decide, but they do not truly have the right objectives to achieve success.

This takes us back to AI. Instead of being some cool new tool to lift the quality of the work we're doing, it seeks to turn the developers themselves into those disconnected managers. So they generate whackloads of code which they don’t understand, and don’t care about, because they are trying to impress the higher-ups with their “velocity”. What is actually happening gets ignored, and what is really needed gets ignored, too. Now, instead of them being the cog, it is some AI agent who fills that role, and the whole cycle repeats.

The act of engineering some large complex system involves both understanding how it works and why it is necessary. Nothing can escape that. In the Waterfall days, people blamed the heavy processes, but making them super light and hugely reactive did nothing to fix the problems. Now, we’ll see the same effect play out again. The developers will just be agent managers, and the auto-cogs below them will pile up more of the wrong stuff. You still have the wrong stuff even if you get it faster, and there is a lot more of it to add to the mudball, which is worse.

The moral of the story is that people just don’t seem to be learning the same lesson that keeps rearing its ugly head over and over again. A vague notion of what you want is not enough to build it. The real work is in taking that loose idea and understanding it at a very deep level so that you can then actuate it into the large number of parts you need that all work together as expected.

The people in the middle who pull off this feat are not and will never be cogs. What’s in their heads, and what they know, is the essence of being able to pull this thing out of the ether and into reality. Their job is to understand it all and then find a way to turn that understanding into physical bits. They are good at it if what is produced matches everyone’s expectations: the users, management, technologists, etc. There is this codebase that, when built and deployed, becomes a real solution to all of these problems. That codebase, and its boundaries as code, a specification, tests, or proofs, is just a manifestation of the software developers' actual understanding. If, in the middle of doing that, it clicks in that part of the ask is too ambiguous, way off the mark, impossible, or just crazy, it’s the developer's understanding that is the mission-critical part of success. Address that, and it probably works. Ignore that, and it definitely fails.

Interviews

Paul W. Homer Mar 19, 2026 Updated Mar 19, 2026

I was crafted by the Waterloo co-op program in the late eighties. Part of that experience was a crazy large number of interviews, so I got pretty good at them. Since then, I’ve worked for over a dozen companies.

For one interview (for my all-time favourite job), I was really young, so I got asked rapid tech questions and had to bring printouts of my older code with me. That was fine; it was a systems programming job and really competitive, and lots of people wanted it.

For another interview, at the turn of the century, I just went for dinner and drinks. Definitely my favourite interview.

I’ve had a couple of interviews in coffee shops; they were usually successful. The casual atmosphere really helps to connect.

Once I got ganged up on. I think it was six of them crammed into a little office, but the questions were product, process, and feature-related, not coding, so it was fine.

Often, I was at the interview because a friend I had worked with in the past was trying to pull me in. That generally made them go pretty easy on me.

I’ve had some bad interviews, though. Usually, when applying for advertised jobs.

Once, when I showed up, they put me in a big room with a bunch of other people and gave us a written test. I took it, sat down, and just signed my name. Then I got up, handed it in and left. No way I was going to work for them.

Another time, I was grilled on tech that I told them I hadn’t used for twenty years. The kid interviewing me was annoyed that I couldn’t remember some esoterica. Seriously?

One time, they gave me an online coding test. An editor embedded in an online chat. The first question was okay, but for the second question, they wanted me to correctly code something huge. I explained the actual theory behind it and why it wasn’t trivial, but they didn’t understand and told me just to grind it out in a little bit of ‘approximate’ code. I hemmed and hawed for a while, then said I wasn’t going to finish. They told me to try anyway, so I sat there quietly until the interview timed out. I was hoping awkward silence made my point.

After one interview, one of the executives proudly told me that I would have to look after his “hobby” system. I turned that down without a second thought.

For another, it was going well, but then I started making jokes about live locks. Turns out the interviewer did his master's thesis on that topic. Oops.

One time, they said my take-home coding test had too many functions. I just laughed.

Another time, the interviewer started in with tricky little puzzle questions. I had seen them all before, so I could have answered, but I was already having a bad day, so I blew up. I got really angry, and the interviewer tried to calm me down. We agreed to meet in person, which went really well. I was sent across the country for a round of second interviews, but I was told I had a bit too much personality for them. It was still fun, and they paid my expenses.

One time, one of my co-workers showed me the tests he was going to use for interviews. He said to find the one problem; I pointed out a whole bunch of them, and he got mad at me. Lol.

I often interviewed candidates, too. If we were multiple interviewers, I’d get the others to ask the tech questions, and I’d focus on personality. I like to see that people are curious and keen to learn. If they had that and I could get them talking about something that excited them, I generally accepted them. That had a pretty good track record of finding good people.

I usually expected a long ramp-up time and the need for training, so I was rarely looking for prefab employees. I saw them as longer-term bets, which tended to pay off better.

For one company, since the code was brutal, I used a technical question that I was pretty sure the candidates couldn't answer. I just wanted them to try to work through the problem. I would help, but not give it all away. If they were stumped and started pitching ideas, it was perfect. I had some pretty good hires from that.

For that round of interviews, one of the senior candidates got insulted and said he wouldn’t take the test. I obviously sympathized, but for that type of work, it wasn’t optional; it was our daily grind.

Overall, my hiring track record is iffy. Some great hires, but also some duds. Usually, though, the duds were caused by scarcity and/or my not trusting my own instincts. Sometimes the candidates are too limited; there is not much you can do about it.

I really hate the big, long, stupid interviews, particularly when the questions are way out of whack with the actual work. Seems like an ego problem if they’ll test you on stuff that you’d never have to do. They’re trying too hard to be cool, and I really hate the taste of Kool-Aid. If the interview makes you uncomfortable, the actual job is probably worse. I never felt bad when I just walked away, but I usually wasn't desperate either. That helps.

Only once did I really have to take a job that I didn’t want. I stayed for a while, but some days were hard. The irony was that many of my later jobs were with people I had met at that early one. The job wasn’t great, but the contacts turned out to be awesome. That's why it's important to keep a positive attitude even in a negative situation.

I’ve always figured that if you got the world’s ten greatest programmers all together on a project, they would spend their days fighting with each other and nothing practical would get built. A less impressive team that works together really well is always better. It’s too bad modern interview practices don’t reflect that.

If you’re interviewing these days, be patient, stay strong. It’s a numbers game. Win a few, lose a lot.

Functions

Paul W. Homer Mar 12, 2026 Updated Mar 12, 2026

Over the decades, I’ve seen the common practices around creating functions change quite a bit.

When I first started coding, functions had come out of the procedural paradigm. I guess, long ago, in maybe assembler, a program was just one giant list of instructions. That would be a little crippling, so one of the early attempts to help was to break it up into smaller functions. An added benefit was that you could reuse them.

By the time I started coding, the better practice was to break up the code along the lines of similarity. Code that is similar is clumped together.

As the data structures and object-oriented paradigms started taking hold, the practices switched to being targeted. For instance, you’d write a lot of little ‘atomic’ primitive functions for each action you did against the structure, like create, add, or traverse. Indirectly, that gave rise to the notion of a function just having a single responsibility.

In data structures, you might end up coding up a whole bunch of structures, then stacking them one on top of the other, mostly trying to get to one rather giant data structure for the whole program. That was excellent in building up sophistication from reusable parts, but a lot of programmers just saw it as excessive layering, not one big interactive structure. People kept wanting to decompose, without ever reassembling.

Object-oriented followed suit, but seemed to get lost on that notion of building up application objects. There were often dozens at the top. It also renamed functions to ‘methods’, but I’ll skip that. It was initially a very successful paradigm change, but later people started objecting to the feeling of layering, and to the idea that the entry points were somehow a ‘god’ object.

The very early Smalltalk object-oriented code had lots of functions, many of which were just one-liners. My first encounter surprised me. There were so many functions...

I guess, as a later reaction to the success of those earlier styles, the common practices moved back towards procedural. Almost no layers and very huge functions. Giant ones. This reopened the door for more brute force practices, which had been pushed away by those earlier paradigms.

I’ve always known that huge functions were a nightmare. Too much stuff all tangled together, it is unreadable and hard to follow. But any of the earlier attempts to limit function size were too restrictive and too fragile. You can’t just say that all functions must be less than 10 lines of code, for example. The attempts to categorize it as a single responsibility were pretty good, but because of the intense dislike of layers, they didn’t get across the notion that it is one thing at just one level. So, it would be coded as one thing, but with all of the raw instructions below that, as far down as the coder could go, all intertwined together. If you have messy low-level stuff to do, it should be hidden below in more functions, effectively a layer, but not really. For example, string fiddling in the middle of business logic is distracting, quickly killing off the readability.
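
A minimal sketch of "one thing at just one level", with the string fiddling pushed down into its own function. The shipping rule, the constants, and all of the names are hypothetical, invented purely for illustration.

```python
# All names and values here are hypothetical, chosen only for illustration.
FREE_SHIPPING_THRESHOLD = 100.0
LOCAL_FEE = 5.0
STANDARD_FEE = 12.0
LOCAL_PREFIXES = {"N2L", "N2J"}


def normalize_postal_code(raw: str) -> str:
    """Low-level string fiddling, kept out of the business logic."""
    return "".join(raw.split()).upper()


def is_local_delivery(postal_code: str) -> bool:
    """True when the first three characters match a local prefix."""
    return postal_code[:3] in LOCAL_PREFIXES


def shipping_fee(raw_postal_code: str, order_total: float) -> float:
    """The business rule, readable at a single level."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return 0.0
    if is_local_delivery(normalize_postal_code(raw_postal_code)):
        return LOCAL_FEE
    return STANDARD_FEE


print(shipping_fee(" n2l 3g1 ", 20.0))
```

Reading `shipping_fee` does not require knowing how whitespace and case are cleaned up; that detail sits one level down, behind a name that says what it does.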

Layers, as they first came out, were really architectural lines, not stacked data structures. For example, you cut a hard line between code that messes with persistence and code that computes derived objects, so you don’t mix and match. The derived stuff then sits in a layer above the persistence.

The point of those types of lines is to make the code super easy to debug later. If you know it is a derived calculation problem, you can skip the other code.
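
One way to picture such a line, as a sketch only: every SQL statement lives below it, and every derived calculation lives above it. The table layout and function names are assumptions for the example, using Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3

# --- persistence layer: the only code that touches the database ---

def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE line_items (order_id INTEGER, price_cents INTEGER, quantity INTEGER)"
    )
    return conn


def save_line_item(conn: sqlite3.Connection, order_id: int, price_cents: int, quantity: int) -> None:
    conn.execute(
        "INSERT INTO line_items VALUES (?, ?, ?)", (order_id, price_cents, quantity)
    )


def load_line_items(conn: sqlite3.Connection, order_id: int) -> list[tuple[int, int]]:
    cursor = conn.execute(
        "SELECT price_cents, quantity FROM line_items WHERE order_id = ?", (order_id,)
    )
    return cursor.fetchall()


# --- derived layer: pure calculations, no SQL anywhere ---

def order_total_cents(line_items: list[tuple[int, int]]) -> int:
    return sum(price_cents * quantity for price_cents, quantity in line_items)


conn = open_store()
save_line_item(conn, 1, 250, 2)
save_line_item(conn, 1, 1000, 1)
print(order_total_cents(load_line_items(conn, 1)))
```

If a total comes out wrong, the first place to look is `order_total_cents`; if rows are missing, the problem is below the line, and the calculation code can be skipped entirely.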

Overall, I’d say that there can never be too many functions. Each one is a chance to attach some self-describing name to a chunk of code. Think of it as concise language-embedded comments. If you were tight about coding with some of the older paradigms, then the data structures or even objects are pretty much all named with nouns, and the functions are all verbs. Using that, you can implement the code with the same vocabulary that you might use to describe it verbally to a friend or another programmer. The closer that implementation is to descriptive paragraphs of what it does, the more you will be able to verify its behaviour on sight. That doesn’t absolve you of testing, or even for some code, creating a real formal proof of correctness, but it does cut down a lot of the work later when catching bugs.

If you also put a hard split between the domain logic and the technical necessities, you can usually just jump right to the incorrect block of code. When someone describes the bug, you know immediately where it is located. Since diagnostics and debugging eat way more time than coding, any sort of practice to reduce friction for them will really help with scheduling and reducing stress. Code that you can fix effortlessly is worth far more than code that you can write quickly.

For me, I think we should return to that notion of stacking up data structures and objects in order to build up sophistication. The best code I’ve seen does this, and has crazy long shelf lives. Its strength is that it encapsulates really well and makes it easy for reuse. It is also quite defensive, and it helps to zero in quickly on bugs. Realistically, it isn’t layering; it is a form of stacking, and those two should not be confused with each other. A layer is a line in the architecture you should have; stacking is just depth in the call chain. If the stacking is really encapsulated, programmers don’t have to go down a rabbit hole to understand what is happening in the higher levels. Entangling that all together is worse.
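
Stacking might look something like the sketch below, where each structure is built only from the public operations of the one beneath it. The sensor-monitoring domain and every name in it are hypothetical; the point is just the shape.

```python
from collections import deque


class RecentValues:
    """Lowest level: remembers only the last `size` values."""

    def __init__(self, size: int) -> None:
        self._values: deque[float] = deque(maxlen=size)

    def add(self, value: float) -> None:
        self._values.append(value)

    def average(self) -> float:
        return sum(self._values) / len(self._values) if self._values else 0.0


class Sensor:
    """Middle level: a named sensor, stacked on RecentValues."""

    def __init__(self, name: str, window: int = 3) -> None:
        self.name = name
        self._recent = RecentValues(window)

    def record(self, reading: float) -> None:
        self._recent.add(reading)

    def smoothed(self) -> float:
        return self._recent.average()


class Plant:
    """Top level: one structure for the whole program, stacked on Sensors."""

    def __init__(self) -> None:
        self._sensors: dict[str, Sensor] = {}

    def record(self, sensor_name: str, reading: float) -> None:
        if sensor_name not in self._sensors:
            self._sensors[sensor_name] = Sensor(sensor_name)
        self._sensors[sensor_name].record(reading)

    def overheated(self, limit: float) -> list[str]:
        return sorted(
            name for name, sensor in self._sensors.items() if sensor.smoothed() > limit
        )


plant = Plant()
for reading in (90.0, 100.0, 110.0, 120.0):
    plant.record("boiler", reading)
plant.record("pump", 40.0)
print(plant.overheated(105.0))
```

Someone reading `Plant.overheated` never needs to know that a `deque` sits two levels down; the depth is in the call chain, not in the reader's head.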

You can always get a sense of the code quality by quickly looking at the functions. Big, bloated functions with ambiguous, convoluted, or vague names are just nurseries for bugs. If you can skim the code and mostly know what it should do, then it is readable. If you have to pick over it line by line, it is a cognitive nightmare. If the function says DoX, and the code looks like it might actually do X, then it is pretty good.

Stress

Paul W. Homer Mar 5, 2026 Updated Mar 5, 2026

Being a software developer is difficult and stressful.

In the early days, there is an uncontrollable fear that you cannot build what you were asked to build.

The industry is awash with too many unknown unknowns, and few programmers receive adequate training. Newbies are often just pushed into the deep end with a brick tied to their ankle and expected to figure out how to swim.

Worse, the industry discourse is erratic. Some people claim one thing works correctly, while lots of people contradict them. Everyone argues, so there is usually never a consensus. It’s super trendy and plagued with myths and misunderstandings. Over the decades, this has gotten far worse. It’s a turbulent sea of opinions and amnesia.

At some point, if you survive long enough, you figure out how to build the things they ask you to build.

Well, almost. Each time around, the thing they ask for is larger and far more complicated. That seems to never end. Most programmers believe, as a result of this, that fundamentally everything you do is new, but oddly, most things you do will have been done before by hundreds, if not millions, of other people. Real greenfield software is a rarity, even if it’s a newly evolving domain. The basics that underlie the development have been around for a very long time, and haven’t really changed all that much over the decades, even if the dependencies and stacks are different.

After you’ve sort of managed to get your feet solidly on the ground -- for instance, you can build most different types of common applications -- your problems get worse.

It is inevitable that as you outgrow the work of coding, you find yourself entangled in all sorts of other industry issues, like management, planning, usability, architecture, design, domain knowledge, etc. Once you are no longer an inexpensive kid, you find that you need to dip your toes into these other issues in order to justify your higher salary. You probably don’t want to, which is why you focused on coding instead. Still, you quickly learn that the more you bring to the table, the more people will be willing to put up with your demands.

That is when software development gets really hard.

The more knowledge you acquire, and the more experiences you survive, the more likely you will find yourself in a situation where you can anticipate a big problem, know absolutely how to avoid it, but are not taken seriously enough to be allowed to do that. So, you’ve shed the creation stress only to be pummeled by tactical or strategic stress. You’re expected to code, but are not supposed to control things. “They” just want you to be a cog in their machine. That is often the low point.

That is the time that you have endless discussions with people about how too little time will grind the quality far below usable, or how throwing in extra bodies will only slow things down, not speed them up. That’s when someone recommends a technology that you know is completely unreliable, or they push a change that is inherently destructive, even if it seems to work on their machine. You end up sitting through meetings about design, where the most popular options are awful and wasteful, and practices that you know will work have been deemed to be too old school.

The biggest skill you end up learning is to pick your battles carefully. Maybe the code is too messy, but the interface is better suited to what the users actually need. Maybe you rush through a throw-away feature in order to get enough time to do some mission-critical core work. As you get more and more experience, you find yourself higher up in the ranks, but if your fingers are still in the code, it is hard to be taken seriously. That irony, where the work is mostly controlled by people who are the most clueless about the nature of the work, starts to haunt you.

Some people give in at this point; others push through the pain. If you push through, you find yourself staring at yet another development effort that is one tiny step away from being a death march, and there is a huge wind trying to push it off the cliff. Sometimes you just have to shrug it off and walk away.

That’s kinda when you change. At first, you thought the priorities were technical engineering. That the code should be as good as the code can get. Then you switched to understanding that helping users through their problems is more important, even if the code gets dinged because of it. Now, though, you wake up and realize that building stuff is stupidly expensive, and what really matters is managing all of those strings tied to the money that you need to continue.

If you’re metaphysical, you’ve moved from being concerned about the code to being concerned about the data and the code. Then you were concerned about the users and whether they were happy or not. Then you’re concerned about the development shop itself. Is it functioning properly?

You get to a point where you're no longer trying to build software; now you are trying to build organizations that can build good software.

And if you wander past that, then you are concerned about creating organizations that can collect enough funds to be able to set up a development shop that can create software worth using.

Basically, the horizons of what you are trying to build just keep expanding farther and farther afield.

The irony is that the stresses of the early days are looking somewhat pedestrian at that point. You miss just being obsessed with creating good code; it all seemed so much more innocent in those days.

The case for most developers is that they start stressed, and as they conquer those stresses, even bigger, less manageable ones replace them. Just interacting with a computer was fun; interacting with people, politics, strings, and agendas is not. But if you want to keep on building bigger and more sophisticated things, you have to keep getting broader in your focus.

On the other side of the coin, if you picked a place where eventually you get to a point where there is little or no stress, then you’ll start to get stressed by the upcoming fate of your own obsolescence. That is, any path to avoid the stress will lead you to the stress of being too expensive, too far behind, or easily replaceable.

Stress, it seems, in programming, is unavoidable. At best, you can try to pick the types you are willing to put up with.

When the Bubble Bursts

Paul W. Homer Feb 26, 2026 Updated Feb 26, 2026

I’ve been deep into software since the mid-eighties, obsessively following the industry while I slog through its muddy trenches.

The benefit of having survived so long is that you get the repeated pleasure of seeing the next annoying hype cycle explode.

The pattern is always the same. Something almost newish comes along. It’s okay, but not that big of a deal. Still, it gets exposed to way more people than before. That fuels the adrenaline, which twists into a hype machine detached from reality. As it grows, its growth adds more fuel, until it has been so watered down that it is far beyond irrational. Eventually reality hits, and it goes *pop*.

AI, which started in the sixties, almost hit that point in the eighties. But now it’s returned with a vengeance, this time reaching stratospheric heights and causing untold damage to the world.

To be clear, it is cute. LLMs will survive, and eventually be relegated to the same bucket as full-text search or command line completion. Something that is useful for some people, but not significant and definitely not monetizable. A throwaway feature used by a few people, but not vital.

Not good enough to make profits and definitely not good enough to replace employees. If the world were sane, we would have barely noticed it and just shoved it into the ‘not worth the resources it consumes’ category.

But that’s not what happened. Instead, some tech bros are making suicidal bets on profits, while executroids foolishly believe it will liberate them from payroll woes. Neither will happen, but a lot of people will burn because of these delusions. Again.

The Web was similar. Yes, it survived the dotcom bomb, and gradually ate the world, but the initial gold rush turned out to mostly mine huge chunks of pyrite.

Technology takes a long time to mature. If you rely on it too early, it will bite you. Nothing ever changes that. Not well-written books, management theories, nor aggressive marketing. Immature technology might be fun to play with, but it is not yet industrial strength. It will collapse under any sort of weight.

LLMs play a clever trick with finding paths of tokens through a huge tensor space. That’s all they do. Nothing else. If you anthropomorphize those paths as being anything other than a random ant trail through intertwined data, you are being fooled. Sure, it looks pretty good sometimes. But “sometimes” isn’t even close to good enough.

You wouldn’t replace your employees with Furbies; LLMs are only marginally better. They are no threat to intelligence, even if the lack of it has been triggered by them.

But that isn’t even the real problem.

The technology sets resources on fire. It is an all-consuming flame of computation. So stupidly expensive that even our fabulous modern hardware can barely keep up. So stupidly expensive that its value is not even close to its costs.

Someday in the future, when our computers are thousands of times more powerful than today and have finally been optimized to use minimal electricity, that value may be there. But not today. Not next week, next year, and probably not for at least a decade.

Nothing short of scientific simulations or extreme mathematics eats that amount. Burning that much on a massive scale isn’t viable. And any sort of value is clearly not worth it. There are no profits to be made here, at this point in time.

As an added benefit, the technology obliterates security and opens the door for outlandish surveillance. Since it is too expensive and too flaky to run locally, people have leaped in to help. You’re literally sending all of your IP and process knowledge to these unvetted third parties in the hopes they won’t betray you.

What’s consistent about the 21st Century is that eventually that information will become valuable enough for them to seek profits. And there is absolutely nothing out there to stop them. So, as we have seen over and over again, they’ll go whole hog into monetizing your secrets. Their impending financial crisis will be so large that they won’t even have a choice. There will be data buffets springing up on every corner, hawking your appetizers.

I’m old enough that I don’t even need to predict the burst. It will happen, it always does. And someday in the future, at most interactive text bars, you’ll be able to get stale gobbledygook generated locally from a decrepit model that hasn’t been retrained for years. It won’t be as good as now, but it won’t be that much worse either.

As for programmers and the panic setting into the industry, don’t worry. You get paid to know things; code is just what you do with that knowledge. You won’t be replaced by a mechanical procedure that doesn't actually understand anything. Bounce that noise between a thousand models, and it will still fail eventually. And when it does, unless it's been constantly retrained on its own slop, it will be clueless and unable to save the day. Sooner or later, management will wake up to the fact that they are exfiltrating their own information in an epic breach and put a stop to it. If some of that generated code is nearly usable today, when the resource excesses stop, the quality will plummet past hopelessness. Any development that isn’t entirely local is far too dangerous to be allowed to continue. This too shall pass.

Data Collection

Paul W. Homer Feb 19, 2026 Updated Feb 19, 2026

One of the strongest abilities of any software is data collection. Computers are stupid, but they can remember things that are useful.

It’s not enough to just have some widgets display it on a screen. To collect data means that it has to be persisted for the long term. The data survives the program being run and rerun, over and over again.

But it’s more than that. If you collect data that you don’t need, it is a waste of resources. If you don’t collect the data that you need, it is a bug. If you keep multiple copies of the same data, it is a mistake. The software is most useful when it always collects just what it needs to collect.

And it matters how you represent that data. Each individual piece of data needs to be properly decomposed. That is, if it is two different pieces of information, it needs to be collected into two separate data fields. You don’t want to over-decompose, and you don’t want to clump a bunch of things together.

Decomposition is key because it allows the data to be properly typed. You don’t want to collect an integer as a string; it could be misinterpreted. You don’t want a bunch of fields clumped together as unstructured text. Data in the wrong format opens up the door for it to be misinterpreted as information, causing bugs. You don’t want mystery data; each datum should have a self-describing label that is unambiguous. If you collect data that you cannot interpret correctly, then you have not collected that information.

If you have the data format correct, then you can discard invalid junk as you are collecting it. Filling a database with junk is collecting data you don’t need, and if you did that instead of getting the data you did need, it is also a bug.
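
As a rough illustration of decomposing, typing, and discarding junk at collection time, here is a minimal Python sketch. The `Contact` record, its field names, and the `collect_contact` function are all hypothetical, made up for this example, and it assumes the raw input arrives as a dictionary of strings.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Contact:
    """Hypothetical record: one field per piece of information, each with its own type."""
    given_name: str
    family_name: str
    birth_date: date   # a real date, not free text
    dependents: int    # an integer, not a string


def collect_contact(raw: dict) -> Optional[Contact]:
    """Convert raw string input into a typed record, or return None if it is junk."""
    try:
        contact = Contact(
            given_name=raw["given_name"].strip(),
            family_name=raw["family_name"].strip(),
            birth_date=date.fromisoformat(raw["birth_date"]),
            dependents=int(raw["dependents"]),
        )
    except (KeyError, ValueError):
        # A missing field or a value in the wrong format: discard it now.
        return None
    if not contact.given_name or not contact.family_name or contact.dependents < 0:
        return None
    return contact
```

Anything that comes back as `None` never reaches the database, so the junk is stopped at the door instead of being cleaned up later.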

No datum is ever independent. You need to collect data, and that data has a structure that binds together all of the underlying datums correctly. If you downgrade that structure, you have lost the information about it. If you put the data into a broader structure, you have opened up the possibility of it getting filled with junk data. For example, if the relationship between the data is a hierarchical tree, then the data needs to be collected as a tree; neither a list nor a graph is a valid collection.
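
To make the tree example concrete, here is a small Python sketch of a structure that can only ever hold a tree. The `Node` class and its `add_child` method are hypothetical names for this illustration. A flat list would lose the parent links, and a general graph would let in multiple parents or cycles, so the sketch refuses both.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical tree node: at most one parent, any number of children."""
    label: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        # A second parent would turn the tree into a general graph.
        if child.parent is not None:
            raise ValueError(f"{child.label} already has a parent")
        # Walk up from this node; if we meet the child, linking would form a cycle.
        ancestor = self
        while ancestor is not None:
            if ancestor is child:
                raise ValueError(f"{child.label} is an ancestor of {self.label}")
            ancestor = ancestor.parent
        child.parent = self
        self.children.append(child)
```

Because the structure itself rejects anything that is not a tree, that kind of junk cannot be collected in the first place.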

In most software, most of the data is intertwined with other values. If you started with one specific piece of data, you should be able to quickly navigate to any of the others. That means that you have collected all of the structures and interconnections properly, and you have not lost any of them. There should only be one way to navigate, or you have collected redundant connections.

As such, if you have collected all of the data you need, then you can validate it. There won’t be data that is missing, there won’t be data that is junk. You can write simple validations that will ensure that the software is working properly, as expected. If the validations are difficult, then there is a problem with the data collection.
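
A simple validation of this kind might look like the Python sketch below. The customer and order layout, the field names, and the `validate_orders` function are assumptions invented for the example. It checks that every order points at a known customer and has a sensible quantity.

```python
def validate_orders(customers: dict, orders: list) -> list:
    """Return a list of human-readable problems; an empty list means the data is clean.

    Hypothetical layout: `customers` maps customer_id -> name, and each order is a
    dict with `order_id`, `customer_id`, and `quantity` keys.
    """
    problems = []
    for order in orders:
        if order["customer_id"] not in customers:
            problems.append(
                f"order {order['order_id']}: unknown customer {order['customer_id']}"
            )
        if order["quantity"] <= 0:
            problems.append(f"order {order['order_id']}: quantity must be positive")
    return problems
```

If a check like this needs pages of special cases, that is usually a sign that the problem is in how the data was collected, not in the validation.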

If you collect all of the data you need for the software correctly, then writing the code on top of it is way simpler and far easier to properly structure. The core software gets the data from persistence, then passes it out to some form of display. It may come back with some edits, which need to be updated in the persistence. There may be some data that you did not collect, but the data you did collect is enough to be able to derive it from a computation. There may be tricky technical issues that are necessary to support scaling, but those are independent from the collection and flow of data.
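
The point about deriving data instead of collecting it can be sketched in a few lines of Python. The `LineItem` record and its fields are hypothetical. The total is never stored, so there is no second copy that can drift out of sync with the quantity and price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """Hypothetical persisted record: only the collected values are stored."""
    quantity: int
    unit_price_cents: int

    @property
    def total_cents(self) -> int:
        # Derived on demand from collected data; never persisted separately.
        return self.quantity * self.unit_price_cents
```

For example, `LineItem(quantity=3, unit_price_cents=250).total_cents` evaluates to 750.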

Collecting data is the foundation of almost all software. If you get it right, you will be able to grow the software to gradually cover larger parts of the problem domain. If you make a mess out of it, the code will get really ugly, and the software will be unreliable.

Blockers

Paul W. Homer Feb 12, 2026 Updated Feb 12, 2026

Some days the coding goes really smoothly. You know what you need; you lay out a draft version, which happens nicely. It kinda works. You pass over it a bunch of times to bang it properly into position. A quick last pass to enhance its readability for later, and then out the door it goes.

Sometimes, there is ‘friction’. You start coding, but you have to keep waiting on other things. So, it’s code a bit, set it aside, code a bit, etc. The delays can be small, but they add up and interfere with the concentration and sense of accomplishment.

Some friction comes from missing analysis. There was something you should have known, but it fell through the cracks. Some comes from interactions with others. You need something from your teammates, or you need it from some other external group.

With some issues for external groups, it will take lots of time to escalate it, arrange introductory meetings, get to the issue, and then finally come to a resolution. You can kinda fake the code a little in the meantime, but that is usually throw-away work, so you’d prefer to minimize it. If you are patient, it will eventually get done.

Occasionally, though, there is a ‘blocker’. It is impassable. You started to work on something, but it was shut down. You are no longer able to work on it. It’s a dead end.

One type of blocker is that someone else is doing the same work. You were going to write something, but it turns out they got there first or have some type of priority. In some cases, that is fine, but sometimes you feel that you could have done a much better job at the effort, which is frustrating. Their code is limiting.

Another type is knowledge-based. You need something, but it is far too complex or time-consuming for others to let you write it.

Some code is straightforward. But some code requires buckets of very specific knowledge first, or the code will become a time sink. People might stop you from writing systems programming components like persistence, or domain-specific languages, or synchronization, for example. Often, that morphs into a buy-versus-build decision. So something similar exists; you feel you could do it yourself, but they purchase it instead, and the effort to integrate it is ugly. If you don’t already have that knowledge, you dodged a bullet, but if you do have it, it can be very frustrating to watch a lesser component get added into the mix when it could have been avoided with just a bit of time.

There are fear-based blockers as well. People get worried that doing something a particular way may just be another time sink, so they stop it quickly. That is often the justification for brute force style coding, for example. They’d rather run hard and pound it all out as a mess than step back and work through it in a smart way. In some shops, the only allowable code is glue, since they are terrified of turnover.

In that sense, blockers are usually about code. You have it, you need it, where is it going to come from? Are you allowed to write something or not? With knowledge, you can usually do the work to figure it out, or at least approximate it. There could be some secret knowledge that you really need to move forward but are fully blocked from getting, although that is extremely rare.

If you flip that around, when you're building a medium-sized or larger system, the big issue is where is the code for it going to come from? In that sense, building software is the work of getting all of the code you need together in one organized place. Some of it exists already, some of it you have to create yourself.

In the past, the biggest concern about pre-existing code was always ‘support’. You don’t want to build on some complex component only to have it crumble on you, and there is nothing you can do about it. That is an expensive mistake. So, if you aren’t going to write it yourself, then who is going to support it, and how good is that support?

If you follow that, then you generally come to understand that as you build up all of this code, support is crucial. It’s not optional, and it is foolish to assume the code is bug-free and will always work as expected.

It’s why old programmers like to pound out a lot of stuff themselves; they know when doing that, they can support their own code, and they know that that doesn’t waver until they leave the project. The support issue is resolved.

It’s also why most wise programmers don’t just add in any old library. They’ve had issues with little dodgy libraries that were poorly supported in the past, so they have learned to avoid them. Big, necessary components are unavoidable, but the little odd ones are not. If you can’t find a legitimate version of something, doing it yourself is a much better choice.

Which brings us all of the way around to vibe coding. If you’ve been around a while, then nothing seems like a worse idea than having the ability to dynamically generate unsupported code. Tonnes of it.

Particularly if it is complex and somewhat unlimited in depth.

A whack load of boilerplate might be okay; at least you can read and modify it, although a debugger would still likely be necessary to highlight the problem, so it can mean a lot of work recreating the issue. So, it might only be a short-term time saver, but a nasty landmine waiting for later. Supportable, but costly.

But it would be heartbreaking to generate 100K in code, which is almost usable but entirely unsupportable. If you did it in a week, you’d probably just have to live with the flaws or spend years trying to pound out the bugs.

Not surprisingly, people tried this often in the past. They built sophisticated generators, hit the button and got full, ready-to-go applications. You don’t see any of these around anymore, since the support black holes they formed consumed them and everything else around them, so they essentially eradicated the evidence of their existence. It was tried, and it failed miserably.

But even more interesting was that those older application generators were at least deterministic. You could run them ten times, and mostly get back the same code. With vibe coding, each run is a random turkey shoot. You’ll get something different. So, extra unsupportable, and extra crazy.

If you are going to build a big system to solve a complex problem, then you need to avoid any and all blockers that get in your way. Friction can slow you down, but a blocker is often fatal.

These days, you’re not really ‘writing’ the system, so much as you are ‘assembling it’. If you do that from too many unsupportable subparts, then the whole will obviously be unsupportable. Inevitably, if you put something into a production environment, you either have to be prepared to support it somehow or move on to the next gig. But if too much unsupportable crud gets out there, that next gig may be even worse than the one that you tried to flee from.

Systems Thinking

Paul W. Homer Feb 5, 2026 Updated Feb 5, 2026

There are two main schools of thought in software development about how to build really big, complicated stuff.

The most prevalent one, these days, is that you gradually evolve the complexity over time. You start small and keep adding to it.

The other school is that you lay out a huge specification that would fully work through all of the complexity in advance, then build it.

In a sense, it is the difference between the way an entrepreneur might approach doing a startup versus how we build modern skyscrapers. Evolution versus Engineering.

I was working in a large company a while ago, and I stumbled on the fact that they had well over 3000 active systems that were covering dozens of lines of business and all of the internal departments. It had evolved this way over fifty years, and included lots of different tech stacks, as well as countless vendors. Viewed as ‘one’ thing, it was a pretty shaky house of cards.

It’s not hard to see that if they had a few really big systems, then a great number of their problems would disappear. The inconsistencies between data, security, operations, quality, and access were huge across all of those disconnected projects. Some systems were up-to-date, some were ancient. Some worked well, some were barely functional. With way fewer systems, a lot of these self-inflicted problems would just go away.

It’s not that you could cut the combined complexity in half, but more likely that you could bring it down to at least one-tenth of what it is today, if not even better. It would function better, be more reliable, and would be far more resilient to change. It would likely cost far less and require fewer employees as well. All sorts of ugly problems that they have now would just not exist.

The core difference between the different schools really centers around how to deal with dependencies.

If you had thousands of little blobs of complexity that were all entirely independent, then getting finished is just a matter of banging out each one by itself until they are all completed. That’s the dream.

But in practice, very few things in a big ecosystem are actually independent. That’s the problem.

If you are going to evolve a system, then you ignore these dependencies. Sort them out afterwards, as the complexity grows. It’s faster, and you can get started right away.

If you were going to design a big system, then these dependencies dictate that design. You have to go through each one and understand them all right away. They change everything from the architecture all the way down to the idioms and style in the code.
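
One small way to see how dependencies dictate a design is to write them down and let them order the work. The Python sketch below uses the standard library's `graphlib`; the component names and the `build_order` helper are hypothetical, and a real system would have far more entries.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical components; each maps to the set of components it depends on.
DEPENDS_ON = {
    "reporting": {"billing", "accounts"},
    "billing": {"accounts", "persistence"},
    "accounts": {"persistence"},
    "persistence": set(),
}


def build_order(graph: dict) -> list:
    """Return the components in an order where every dependency comes first."""
    return list(TopologicalSorter(graph).static_order())
```

With these entries, `build_order(DEPENDS_ON)` puts persistence first and reporting last. If two components depend on each other, `graphlib` raises `CycleError`, which is the kind of problem you would rather find in the design than in production.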

But that means that all of the people working to build up this big system have to interact with each other. Coordinate and communicate. That is a lot of friction that management and the programmers don’t want. They tend to feel like it would all get done faster if they could just go off on their own. And it will, in the short-term.

If you ignore a dependency and try to fix it later, it will be more expensive. More time, more effort, more thinking. And it will require the same level of coordination that you tried to avoid initially. Slightly worse, in that the time pressures of doing it correctly generally give way to just getting it done quickly, which pumps up the overall artificial complexity. The more hacks you throw at it, the more hacks you will need to hold it together. It spirals out of control. You lose big in the long-term.

One of the big speed bumps preventing big up-front designs is a general lack of knowledge. Since the foundations like tech stacks, frameworks, and libraries are always changing rapidly these days, there are few accepted best practices, and most issues are incorrectly believed to be subjective. They’re not, of course, but it takes a lot of repeated experience to see that.

The career path of most application programmers is fairly short. In most enterprises, the majority have five years or less of real in-depth experience, and battle-scarred twenty-year+ vets are rare. Mostly, these novices are struggling through early career experiences, not ready yet to deal with the unbounded, massive complexity present in a big design.

Also, the other side of it is that evolutionary projects are just more fun. I’ve preferred them. You’re not loaded down with all those messy dependencies. Way fewer meetings, so you can just get into the work and see how it goes. Endlessly arguing about fiddly details in a giant spec is draining, made worse if the experience around you is weak.

Evolutionary projects go very badly sometimes. The larger they grow, the more likely they will derail. And the fun gives way to really bad stress. That severe last-minute panic that comes from knowing that the code doesn't really work as it should, and probably never will. And the longer-term dissatisfaction of having done all that work to ultimately just contribute to the problem, not actually fix it.

Big up-front designs are often better from a stress perspective. A little slow to start and sometimes slow in the middle, they mostly smooth out the overall development process. You’ve got a lot of work to do, but you’ve also got enough time to do it correctly. So you grind through it, piece by piece, being as attentive to the details as possible. Along the way, you actively look for smarter approaches to compress the work. Reuse, for instance, can shave a ton of code off the table, cut down on testing, and provide stronger certainty that the code will do the right thing in production.

The fear that big projects will end up producing the wrong thing is often overstated. It’s true for a startup, but entirely untrue for some large business application for a market that’s been around forever. You don’t need to burn a lot of extra time, breaking the work up into tiny fragments, unless you really don’t have a clue what you are building. If you're replacing some other existing system, not only do you have a clue, you usually have a really solid long-term roadmap. Replace the original work and fix its deficiencies.

There should be some balanced path in the middle somewhere, but I haven’t stumbled across a formal version of it after all these decades.

We could go first to the dependencies, then come up with reasons why they can be temporarily ignored. You can evolve the next release, but still have a vague big design as a long-term plan. You can refactor the design as you come across new, unexpected dependencies. Change your mind, over and over again, to try to get the evolved works to converge on a solid grand design. Start fast, slow right down, speed up, slow down again, and so forth. The goal is one big giant system to rule them all, but it may just take a while to get there.

The other point is that the size of the iterations matters, a whole lot. If they are tiny, it is because you are blindly stumbling forward. If you are not blindly stumbling forward, they should be longer, as it is more effective. They don’t have to all be the same size. And you really should stop and take stock after each iteration. The faster people code, the more cleanup is required. The longer you avoid cleaning it up, the worse it gets, on basically an exponential scale. If you run forward like crazy and never stop, the working environment will be such a swamp that it will all grind to an abrupt stop. This is true in building anything, or even cooking in a restaurant. Speed is a tradeoff.

Evolution is the way to avoid getting bogged down in engineering, but engineering is the way to ensure that the thing you build really does what it is supposed to do. Engineering is slow, but spinning way out of control is a heck of a lot slower. Evolution is obviously more dynamic, but it is also more chaotic, and you have to continually accept that you’ve gone down a bad path and need to backtrack. That is hard to admit sometimes. For most systems, there are parts that really need to be engineered, and parts that can just be allowed to evolve. The more random the evolutionary path, the more stuff you need to throw away and redo. Wobbling is always expensive. Nature gets away with this by having millions of species, but we really only have one development project, so it isn’t particularly convenient.

Reap What You Sow

Paul W. Homer Jan 29, 2026 Updated Jan 29, 2026

When I first started programming, some thirty-five years ago, it was a somewhat quiet, if not shy, profession. It had already been around for a while, but wasn’t visible to the public. Most people had never even seen a serious computer, just the overly expensive toys sold at department stores.

Back then, to get to an intermediate position took about 5 yrs. That would enable someone to build components on their own. They’d get to senior around 10 yrs, where they might be expected to create a medium-sized system by themselves, from scratch. 20 yrs would open up lead developer positions for large or huge projects, but only if they had the actual experience to back it up. Even then, a single programmer might spend years to get their code size up to medium, so building larger systems required a team.

Not only did the dot-com era rip programming from the shadows, but the job positions also exploded. Suddenly, everyone used computers, everyone needed programmers, and they needed lots of them. That disrupted the old order, but never really replaced it with anything sane.

So, we’d see odd things like someone getting promoted to a senior position after just 2 or 3 years of working; small teams of newbie programmers getting tasked with building big, complex systems with zero guidance; various people with no significant coding experience hypothesising about how to properly build stuff. It’s been a real mess.

For most computer systems, to build them from scratch takes an unreasonably large amount of knowledge and skill, not only about the technical aspects, but also the domain problems and the operational setups, too.

The fastest and best way to gain that knowledge is from mentoring. Courses are good for the basics, but the practice of keeping a big system moving forward is often quite non-intuitive, and once you mix in politics and bureaucracy, it is downright crazy.

If you spend years in the trenches with people who really get what they are doing, you come up to speed a whole lot faster.

We have a lot of stuff documented in books and articles, and some loose notions of best practices, but that knowledge is often polluted with trendy advice, so people bend it incorrectly to help them monetize stuff.

There has always been a big difference between what people say should be done and what they actually do successfully.

Not surprisingly, after a couple of decades of struggling to put stuff together, the process knowledge is nearly muscle memory and somewhat ugly. It’s hard to communicate, but you’ve learned what has really worked versus what just sounds good. That knowledge is passed mouth to mouth, and it’s that knowledge that you really want to learn from a mentor. It ain’t pretty, but you need it.

As a consequence, it is no surprise that the strongest software development shops have always had a good mix of experience. It is important. Kids and hardened vets, with lots of people in the middle. It builds up a good environment and follows loosely from the notion that ‘birds of a feather flock together’. That type of experience diversity is critical, and when it comes together, the shop can smoothly build any type of software it needs to build. Talent attracts talent.

That’s why when we see it crack, and there is a staffing landslide, where a bunch of experienced devs all leave at the same time, it often takes years or decades to recover. Without a strong culture of learning and engineering, it’s hard to attract and keep good people; it’s understaffed, and the turnover is crazy high.

There are always more programming jobs than qualified programmers; it seems that never really changes.

Given that has been an ongoing problem in the industry for half a century, we can see how AI may make it far worse. If companies stop hiring juniors because their intermediates are using AI to whack out that junior-level code, that handoff of knowledge will die. As the older generations leave without passing on any process knowledge, it will eventually be the same as only hiring a bunch of kids with no guidance. AI won’t help prevent that, and its output will be degraded from training on those fast-declining standards.

We’ve seen that before. One of the effects of the dot-com era was that the lifespan of code shrank noticeably. The older code was meant to run for decades; the new stuff is often just replaced within a few years after it was written. That’s part of why we suddenly needed more programmers, but also why the cost of programming got worse. It was offset somewhat by having more libraries and frameworks available, but because they kept changing so fast, they also helped shorten the lifespan. Coding went from being slowly engineered to far more of a performance art. The costs went way up; the quality declined.

If we were sane, we’d actually see the industry go the other way.

If we assume that AI is here to stay for coders, then the most rational thing to do would be to hire way more juniors, and let them spend lots of time experimenting and building up good ways to utilize these new AI tools, while also getting a chance to learn and integrate that messy process knowledge from the other generations. So instead of junior positions shrinking, we’d see an explosion of new junior positions. And we’d see vets get even more expensive.

That we are not seeing this indicates either a myopic management effect or that AI itself really isn’t that useful right now. What seems to be happening is that management is cutting back on payroll long before the intermediates have successfully discovered how to reliably leverage this new toolset. They are jumping the gun, so to speak, and wiping out their own dev shops as an accidental consequence. It will be a while before they notice.

This has happened before; software development often has a serious case of amnesia and tends to forget its own checkered history. If it follows the older patterns, there will be a few years of decreasing jobs and lower salaries, followed by an explosion of jobs and huge salary increases. They’ll be desperate to undo the damage.

People incorrectly tend to think of software development as one-off projects instead of continually running IT shops. They’ll do all sorts of short-term damage to squeeze value, while losing big overall.

Having lived through the end of programming as we know it a few dozen times already, I am usually very wary of any of these hype cycles. AI will eventually find its usefulness in a few limited areas of development, but it won’t happen until it has become far more deterministic. Essentially, random tools are useless tools. Software development is never a one-off project, even if that delusion keeps persisting. If you can’t reliably move forward, then you are not moving forward. At some point, the ground just drops out below you, sending you back to square one.

The important point at the high level is that you set up and run a shop to produce and maintain the software you need to support your organization’s goals. The health of that shop is vital, since if it is broken, you can’t really keep anything working properly. When the toolset changes, it would be good if the shop can leverage it, but it is up to the people working there to figure it out, not management.

Dirty Little Secret

Paul W. Homer Jan 22, 2026 Updated Jan 22, 2026

At the beginning of this Century, an incredibly successful software executive turned to me and said, “The dirty little secret of the software industry is that none of this stuff really works.”

I was horrified.

"Sure, some of that really ancient stuff didn’t work very well, but that is why we are going to replace it all with all of our flashy new technologies. Our new stuff will definitely fix that and work properly," I replied.

I’ve revisited this conversation in my head dozens of times over the decades. That new flashy stuff of my youth is now the ancient crusty stuff for the kids. It didn’t work either.

Well, to be fair, some parts of it worked just enough, and a lot of it stayed around hidden deep below. But it's weird and eclectic, and people complain about it often, try hard to avoid it, and still dream of replacing it.

History, it seems, did a great job of proving that the executive was correct. Each new generation piles another mess on top of all of the previous messes, with the dream of getting it better, this time.

It’s compounded by the fact that people rush through the work so much faster now.

In those days, we worked on something for ‘months’, now the expectation is ‘weeks’.

Libraries and frameworks were few and far between, but that actually afforded us a chance to gain more knowledge and be more attentive to the little details. The practice of coding software keeps degrading as the breadth of technologies explodes.

The bigger problem is that even though the hardware has advanced by orders and orders of magnitude, the effectiveness of the software has not. It was screens full of awkward widgets back then; it is still the same. Modern GUIs have more graphics, but behave worse than before. You can do more with computers these days, but it is far more cognitively demanding to get it done now. We didn’t improve the technology; we just started churning it out faster.

Another dirty little secret is that there is probably a much better way to code things, one that was more commonly known when it was first discovered in the 70s or the 80s. Most programmers prefer to learn from scratch, forcing them to re-solve the same problems people had decades ago. If we keep reinventing the same crude mechanics, it is no surprise that we haven’t advanced at all. We keep writing the same things over and over again while telling ourselves that this time it is really different.

I keep thinking back to all of those times, in so many meetings, where someone was enthusiastically expounding the virtues of some brand new, super trendy, uber cool technology, and essentially claiming “this time, we got it right”, while I knew that if I waited another five years, the tides would turn and a new generation would be claiming that the old stuff doesn’t work.

“None of this stuff really works” got stuck in my head way back then, and it keeps proving itself correct.

This Week's Turbo Encabulator

Paul W. Homer Jan 15, 2026 Updated Jan 15, 2026

Sometimes, the software industry tries to sell products that people don’t actually need and won’t solve any real problems. They are very profitable and need minimal support.

The classic example is the set of products that falls loosely under the “data warehouse” category.

The problem some people think they have is that if they did not collect data when they needed it, they don’t have access to it now. In a lot of cases, you can’t simply go back and reconstruct it or get it from another source; it is lost forever.

So, people started coming up with a long series of products that would let you capture massive amounts of unstructured data, then later, when you need it, you could apply some structure on top and use it.

That makes sense, sort of. Collect absolutely everything and dump it into a giant warehouse, then use it later.

The first flaw, though, is that collecting data and keeping it around for a long time is expensive. You need some storage devices, but you also need a way to tell when the data has expired. Consumer data from thirty years ago may be of interest to a historian, but is probably useless for most businesses. So, you have all of this unstructured data, when do you get rid of it? If you don’t know what the data really is, then sadly, the answer is ‘never’, which is insanely expensive.

The other obvious problem is that some of the data you have captured is ‘raw’, and some of it is ‘derived’. It is a waste of money to persist that derived stuff, since you can reconstruct it. But again, if you don’t know what you have captured, you cannot tell the two apart.

The bigger problem, though, is that you now have this sea of unstructured data, and thanks to various changes over time, it does not fit together nicely. So that act of putting a structure on top is non-trivial. In fact, it is orders of magnitude more complicated now than if you had just sorted out carefully what you needed first.

Changes make it hard to stitch together, and bugs and glitches pad it out with lots and lots of garbage. The noise overwhelms the signal.

It’s so difficult to meticulously pick through it and turn it into usable information that it is highly unlikely that anyone will ever bother doing that. Or if they did it for a while, eventually they’d stop doing it. If they don’t have the time and patience to do it earlier, then why would that change later?

So, you're burning all of these resources to prevent a problem you really shouldn’t have, in the unlikely case that someone may go through Herculean effort to get value out of it later.

If there is a line of business, then there are fundamental core data entities that underpin it. Mostly, they change slowly, but in all cases, you will always have ongoing work to keep up with any changes. You can’t escape that. If you did reasonable analysis and modelled the data correctly, then you could set up partitioned entry points for all of the data and keep teams around to stay synced to any gradual changes. In that case, you know what data is out there, you know how it is structured and what it means, and you have the appropriate systems in place to capture it. Your IT department is organized and effective.

The derived variations of this core data may go through all sorts of weird gyrations, but the fundamentals are easy enough to understand and capture. So, if you are organized and functioning correctly, you wouldn’t need an insurance technology to double up the capture ‘just in case’.

Flipped around, if you think you “need” a data warehouse in order to capture stuff that you worried you might have missed, your actual problem is that your IT department is a disorganized disaster area. You still don’t need a warehouse; you need to fix your IT department. So someone selling you a warehouse as a solution to your IT problems is selling you snake oil. It ain’t going to help.

Now it is possible that there is a lag between ‘changes’ and the ability to update the collection of the data. So, you think that to solve that, you need a warehouse, but the same argument applies.

The changes aren’t frequent enough and really aren’t surprises, so the lag is caused by other internal issues. If you have an important line of business, and it changes every so often, then it would make sense (and is cheaper) if you just have a team waiting to jump into action to keep up with those changes. If they are not fully occupied in between changes, that is not a flaw or waste of money; they are just firefighters and need to be on standby. Sometimes you need firefighters, or a little spare capacity, or some insurance. That is reasonable resource management. Don’t place and build your house on the beach based on low tides; do it based on at least king tides or even tsunamis.

There are plenty of other products sold to enterprises that are similar. If you look at what they do, and you ask reasonable questions about why they exist, you’ll often find that the answers don’t make any real sense. The industry prefers to solve these easy problems on a rotating basis.

There will be wave after wave of questionable solutions to secondary problems that ultimately just compound the whole mess. They make it worse. Then, as people realize that they don’t work very well, a whole new wave of uselessness will hit. So long as everyone is distracted and chasing that latest wave, they will be too busy to question the sanity of what they are implementing.

Against the Grain

Paul W. Homer Jan 8, 2026 Updated Jan 23, 2026

When I was really young and first started coding, I hated relational databases.

They didn’t teach us much about them in university, but they were entirely dominant in enterprise development for the 80s and 90s. If you needed persistence, only an RDBMS could be considered. Any other choice, like rolling it yourself, files, or lighter dbs, caches, was considered inappropriate. People would get mad if you used them.

My first experiences with an RDBMS were somewhat painful. The notion of declarative programming felt a bit foreign to me, and those earlier databases were cruder in their syntax.

But eventually I figured them out, and even came to appreciate all of the primary and secondary capabilities. You don’t just get reliable queries; you can use them for dynamic behaviour and distributed locking issues as well. Optimizations can be a little tiring, and you have to stay tight to normal forms to avoid piling up severe technical debt, but with practice, they are a good, strong, solid foundation. You just have to use them correctly to get the most out of them.

If you need reliable persistence (and you do) and the computations aren’t exotic (and they mostly aren’t), then relying on a good RDBMS is generally the best choice. You just have to learn a lot about the technology and use it properly, but then it works as expected.

If you try to use it incorrectly, you are doing what I like to call “going against the grain”. It’s an ancient woodworking expression, but it is highly fitting for technology. With the grain, you are using it as the originators intended, and against the grain, you are trying to get it to do something clever, awkward, or funny.

Sometimes people think they are clever by trying to force technology to do something unexpected. But that is always a recipe for failure. Even if you could get it to work with minimal side effects, the technology evolves, so the tricks will turn ugly.

Once you’ve matured as a programmer, you realize that clever is just asking for trouble, and usually for no good reason. Most code, most of the time, is pretty basic. At least 90% of it. Usually, it only needs to do the same things that people have been doing for at least half a century. The problem isn’t coming up with some crazy, clever new approach, but rather finding a very reasonable one in an overly short period of time, in a way that you can keep moving it forward over a long series of upgrades.

We build things, but they are not art; they are industrial-strength machinery. They need to be clean, strong, and withstand the ravages of the future. That is quality code; anything else is just a distraction.

Now, if you are pushing the state of the art for some reason, then you would have to go outside of the standard components and their usages. So, I wasn’t surprised that NoSQL came into existence, and I have had a few occasions where I both needed it and really appreciated it. ORMs are similar.

It’s just that I would not have been able to leverage these technologies properly if I didn’t already understand how to get the most out of an RDBMS. I needed to hit the limits first to gain an understanding.

So, when I saw a lot of people using NoSQL to skip learning about RDBMSes, I knew right away that it was a tragic mistake. They failed to understand that their usage was rather stock and just wanted to add cool new technologies to their resumes. That is the absolute worst reason to use any technology, ever. Or as I like to say, experiment on your own time, take your day job seriously.

In that sense, using an RDBMS for something weird is going against the grain, but skipping it for some other eclectic technology is also going against the grain. Two variations on the same problem. If you need to build something that is reliable, then you have to learn what reliable means and use that to make stronger decisions about which components to pile into the foundations. Maybe the best choice is old, and not great for your resume, but that is fine. Doing a good job is always more important.

This applies, of course, to all technologies, not just RDBMSes. My first instinct is to minimize using any external components, but if I have to, then I am looking for the good, reliable, industrial-strength options. Some super-cool, trendy, new component automatically makes me suspicious. Old, crusty, battle-scarred stuff may not look as sweet, but in most cases, it is usually a lot more reliable. And the main quality that I am looking for is reliability.

But even after you decide on the tech, you still have to find the grain and go with it. If you pick some reasonable library, but then try to make it jump around in unreasonable ways, it will not end well. In the worst case, you incorrectly convince yourself that it is doing something you need, but it isn’t. Swapping out a big component at the last minute before a release is always a huge failure and tends to result in really painful circumstances. A hole that big could take years to recover from.

So, it plays back to the minimization. If we have to use a component, then we have to know how to use it properly, so it isn’t that much of a time saving, unless it is doing something sophisticated enough that learning all of that from scratch is way out of the time budget. If you just toss in components for a tiny fraction of their functionality, the code degenerates into a huge chaotic mess. You lose that connection to knowing what it will really do, and that is always fatal. Mystery code is not something you ever want to support; it will just burn time, and time is always in short supply.

In general, if you have to add a component, then you want to try to use all of its core features in a way that the authors expected you to use them. And you never want to have two contradictory components in the same system; that is really bad. Use it fully, use it properly, and get the most out of the effort it took to integrate it fully. That will keep things sane. Overall, beware of any components you rely on; they will not save you time; they may just minimize some of the learning you should have done, but they are never free.

New Year

Paul W. Homer Jan 1, 2026 Updated Jan 26, 2026

I was addicted from the moment I bought my first computer: a heavily used Apple ][+ clone. Computers hadn’t significantly altered our world yet, but I saw immense potential in that machine.

None of us, back in those days, could have predicted how much these machines would damage our world. We only saw the good.

And there has been lots of good; I can’t live without GPS in the car, and online shopping is often handy. I can communicate with all sorts of people I would not have been able to meet before.

But there have also been a massive number of negative effects, from surveillance to endless lies and social divisions.

Tools are inherently neutral, so they have both good and bad uses; that is their nature. We have these incredibly powerful machines, but what have we done with them? The world is far more chaotic, way less fair, and highly polluted now. We could have used the machines to lift ourselves up, but instead we’ve let a dodgy minority use them to squeeze more money out of us. Stupid.

I’m hoping that we can turn a corner for 2026 and get back to leveraging these machines to make the world a better place. That we ignore those ambitious weasels who only care about monetizing everything and instead start to use software to really solve our rapidly growing set of nasty problems. Sure, it is not profitable, but who cares anymore? Having a lot of money while living on a burning planet isn’t really great. Less money on a happy planet is a big improvement.

The worst problem for the software industry is always trying to rush through the work, only solving redundant, trivial problems. We need to switch focus. Go slow, build up knowledge and sophistication, and ignore those people shouting at us to go faster. Good programming is slow. Slow is good. Take your time, concentrate on getting better code, and pay close attention to all of the little details. Programming is pedantic; we seem to have forgotten this.

The other thing is that we need to be far more careful about what software we write. Just say no to writing sleazy code. Not all code should exist. If they find someone else, that is not your problem, but doing questionable work because someone else might is a sad excuse. As more of us refuse, it will get a lot harder for them to achieve their goals. We can’t stop them, but at least we can slow them down a little.

The final thing to do is to forget about the twisted, messed-up history of software, at least for a moment. Think big, think grand. We have these powerhouse intellectual tools; we should be using them to lift humanity, not just for lame data entry. We need to build up stable, strong complexity that we can leverage to solve larger and larger problems. A rewrite of some crude approach with another crude approach just isn’t leveraging any of the capabilities of software. Rearranging the screens and using different widgets is only going sideways, not up. Software can remember the full context and help us make way better decisions. That is its power; we just need to start building things that truly leverage it.

Given the declining state of the world these days, it makes sense that we use this moment to shift focus. If computers got us into this mess, then they can also get us out of it. It’s been clear for quite a while that things are not going very well, but many of the people leveraging that sentiment are only doing so in order to make things worse. It's time we change that.

A Manifestation of my Understanding

Paul W. Homer Dec 18, 2025 Updated Feb 3, 2026

When I code, I only code what I know. How could it be otherwise?

If I were writing out mysterious instructions that I am clueless about, I’d never be able to tell if they did or did not do what people wanted them to do. I could blindly follow some loose specification, but the inevitable typos and inherent vagueness would force the code to drift far from what is desired. I’d have no sense of direction on how to get it back on track.

So, before I code, I learn. I learn about the tech I am using, and I learn about the problems that people need solving. I try to learn about the state of the art and the different ways other people have solved similar problems.

I take all of this learning and use it to craft the instructions. I put it out there, first in testing, and use any feedback to fix both unintentional problems and learning ones. Sometimes, I just didn't fully get all of the ramifications of some combination of the topics. Oops.

In that way, any and all of the code I lay out in editors, for the various different internal and external parts of the software that I will release, are very much manifestations of what I know, who I am, and what I believe.

Which is to say that the work is not coding, it is learning. Code is the result of that knowledge I acquired, not some arbitrary arrangement of instructions.

This is no different than writers for magazines and newspapers. They do some amount of research and investigative journalism, then they craft articles intended to communicate the things that they, too, have learned.

They might be constrained by style guides and ethical concerns to write out what they know in a certain way so that it fits in with the larger publication, but their writing is still their expression of what they know. Their personality is still imprinted on it. It might be edited quite a bit by others, which may pull it away somewhat from their original effort or focus, but ultimately, in the end, it is still mostly a manifestation of them.

People who try to disconnect code from the thinking of the programmers do so because they don’t like having to accommodate the programmers' needs. They want programmers to be mindless clerks who blindly grind out endless code. We see all sorts of movements in the software industry that express this view, but it is not and will never be the case. If you have a huge series of instructions that is intended to do something specific, then someone has to understand what those instructions are. They are too precise and pedantic for people flying at higher levels to belt out in pictures or general descriptions. To work as needed, they need that deep level of correctness to be there. You can’t skip past that, so you can’t skip past programmers, and at least you should respect them enough to understand that their works are actually manifestations of themselves.

You can’t unbind code from its authors. It just isn’t possible, until maybe AGI. If you need some complex code that does something precise, then you need some people to fully, completely, and totally understand exactly what those instructions do, and why they are correct and precise.

In that sense, everything that I coded from scratch is an extension of my personality. It is written all over the work. Maybe if I do some little hacks to someone else’s code, there is little trace of my personality there, but if it is a large, unified block of code that I wrote, not only is it me that wrote it, but if you looked at it carefully and you knew me, you’d know that I was the one who wrote it. We leave a lot of ourselves behind in our code; we have no choice in that.

I get that it is inconvenient to some managers who want to claim all of the credit for our work, but they are just self-absorbed. They can’t hide stuff from me; I have to know it all in order for the code to live up to its usefulness. I might not care why we are solving a given set of problems, but I cannot do even a reasonable job if I am blindfolded. They can’t just chuck me away afterwards and think that it will all continue on as normal. Replacing me with a kid who doesn’t know anything yet is cheaper, but also grossly ineffective. It will take a long time for the kid to learn what I did over the decades I spent doing it. Then it will be their code and a manifestation of who they are, so the problem persists.

The Value of Data

Paul W. Homer Dec 11, 2025 Updated Dec 11, 2025

According to Commodore Grace Hopper, back in 1985, the flow is: data -> information -> knowledge.

https://www.youtube.com/watch?v=ZR0ujwlvbkQ

I really like this perspective.

Working through that, data is the raw bits and bytes that we are collecting, in various different ‘data types’ (formats, encodings, representations). Data also has a structure, which is very important.

Information is really what we are presenting to people. Mostly these days via GUIs, but there are other, older mediums, like print. The data might be an encoded Julian date, and the information is a readable printed string in one of the nicer date formats.
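
To make the gap between data and information a bit more concrete, here is a minimal sketch in Python. It assumes the ‘Julian date’ is stored as a four-digit year followed by a three-digit day of the year, which is only one of several encodings that go by that name; the function name is made up for the example.

```python
from datetime import datetime

def julian_to_information(encoded: str) -> str:
    """Turn an encoded ordinal ('Julian') date into a readable string.

    Assumption: the data is a four-digit year followed by a
    three-digit day of the year, e.g. '2025345'.
    """
    parsed = datetime.strptime(encoded, "%Y%j")   # data: compact, machine-friendly
    return parsed.strftime("%B %d, %Y")           # information: what a person reads

print(julian_to_information("2025345"))
```

The stored value never changes; only the presentation does, which is why the same data can feed a screen, a report, or a printout.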

Knowledge, then, is when someone absorbs this information, and it leads them to a specific understanding. They use this knowledge to make decisions. The decisions are the result of the collection of data as it relates to the physical world.

A part of what she is saying is that collecting data that is wrong or useless has no value. It is a waste of resources. But we did not know back then how to value data, and 40 years later, we still do not know how to do this.

I think the central problem with this is ambiguity. If we collect data on something, and some or part of it is missing, it is ambiguous as to what happened. We just don’t know.

We could, for instance, get a list of all of the employees for a company, but without some type of higher structure, like a tree or a DAG, we do not know who reported to whom. We can flatten that structure and embed it directly into the list, as, say, a column called ‘boss’, which would allow us to reconstruct the hierarchy later.

So, this falls into the difference between data and derived data. The column boss is a relative reference to the structural reporting organization. If we use it to rebuild the whole structure, then we could see all of the employees below a given person. The information may then allow someone to be able to see the current corporate hierarchy, and the knowledge might be that it is inconsistent and needs to be reorganized somehow. So, the decision is to move around different employees to fix the internal inconsistencies and hopefully strengthen the organization.
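
Rebuilding the structure from a ‘boss’ column can be sketched in a few lines of Python. The employee names and the helper functions below are invented for illustration, and the sketch assumes the reporting data has no cycles.

```python
from collections import defaultdict

# Hypothetical flat employee list: (name, boss). The top of the tree has no boss.
EMPLOYEES = [
    ("Ada", None),
    ("Bob", "Ada"),
    ("Cy", "Ada"),
    ("Dee", "Bob"),
    ("Eli", "Dee"),
]

def build_reports(rows):
    """Map each boss to the list of people who report directly to them."""
    reports = defaultdict(list)
    for name, boss in rows:
        if boss is not None:
            reports[boss].append(name)
    return reports

def everyone_below(person, reports):
    """Return all direct and indirect reports of a person, depth first."""
    below = []
    for direct in reports.get(person, []):
        below.append(direct)
        below.extend(everyone_below(direct, reports))
    return below

reports = build_reports(EMPLOYEES)
print(everyone_below("Ada", reports))
print(everyone_below("Bob", reports))
```

If a single row is missing or points at the wrong boss, whole branches of the rebuilt tree are wrong, which is exactly the ambiguity problem.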

In that sense, this does set the value somewhat. You can make the correct decision if you have all of the employees, none are missing, none of them are incorrect in an overall harmful way, and you have a reference to their boss. The list is full, complete, and up-to-date, and the structural references are correct.

So, what you need to collect is not only the current list of employees and who they report to, but also any sort of changes that happen later when people are hired, or they leave, or change bosses. A snapshot and a stream of deltas that is kept up-to-date. That is all you need to persist in order to make decisions based on the organization of the employees.
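
A snapshot plus a stream of deltas might look something like the following Python sketch. The event names ("hire", "leave", "move") and the Delta type are assumptions made up for the example, not a standard format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Delta:
    kind: str                    # "hire", "leave", or "move" (hypothetical event names)
    name: str
    boss: Optional[str] = None   # new boss for "hire" and "move"

def apply_deltas(snapshot, deltas):
    """Replay deltas in order over a {name: boss} snapshot and return the new state."""
    current = dict(snapshot)     # copy, so the original snapshot stays untouched
    for d in deltas:
        if d.kind in ("hire", "move"):
            current[d.name] = d.boss
        elif d.kind == "leave":
            current.pop(d.name, None)
        else:
            raise ValueError(f"unknown delta kind: {d.kind}")
    return current

snapshot = {"Ada": None, "Bob": "Ada", "Cy": "Ada"}
deltas = [
    Delta("hire", "Dee", "Bob"),
    Delta("move", "Cy", "Bob"),
    Delta("leave", "Dee"),
]
print(apply_deltas(snapshot, deltas))
```

This sketch deliberately ignores what happens to the reports of someone who leaves; a real system would have to decide that, and then collect the data needed to make the decision.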

Pulling back a bit, if we work backwards, we can see that there are possibly millions of little decisions that need to be made, and we need to collect and persist all of the relevant individual pieces of data, and any related structural relationships as well.

We have done this correctly if and only if we can present the information necessary without any sort of ambiguity. That is, if we don't have a needed date and time for an event, we at least have other time markers such that we can correctly calculate the needed date and time.

But that is a common, often subtle bug in a lot of modern systems. They might know when something starts, for instance, and then keep track of the number of days since the start when another event occurred. That’s correct for the date, but any sort of calculated time is nonsense. If you did that, the information you present should be the date only, but if you look at a lot of systems out there, you see bad data, like fake times on the screens. That is incorrect derived information, caused by an ambiguity that comes from not collecting a required piece of data, or at the very least, from not presenting the actual collected and derived data on the screen correctly. It’s an overly simple example, but it is way too common for interfaces to lie about some of the information that they show people.
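
The days-since-start bug is easy to reproduce. In this Python sketch, the start time and the day count are invented values; the point is only that the derived time of day is the start time carried over, not something that was ever collected.

```python
from datetime import datetime, timedelta

start = datetime(2025, 12, 1, 9, 30)   # known start, with a real time of day
days_since_start = 10                   # all that was recorded for the later event

derived = start + timedelta(days=days_since_start)

# The date part is trustworthy; the time part is just the start time carried over.
honest = derived.date().isoformat()
misleading = derived.isoformat(sep=" ", timespec="minutes")

print(honest)       # safe to show on a screen
print(misleading)   # shows a time of day that nobody ever measured
```

Showing the second value is how an interface ends up lying: the 09:30 looks precise, but it is an artifact of the arithmetic.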

The corollary to all of this is that it seems unwise to blindly collect as much data as possible and just throw it into a data swamp, so that you can sort it out later. That never made any real sense to me.

The costs of modelling it correctly so it can be used to present information are far cheaper if you do it closer to when you collect the data. But people don’t want to put in the effort to figure out how to model the data, and they are also worried about missing data that they think they should have collected, so they collect it all and insist that they’ll sort it out later. Maybe later comes, sometimes, but rarely, so it doesn’t seem like a good use of resources. The data in the swamp has almost no real value, and is far more likely to never have any real value.

But all of that tells us that we need to think in terms of: decision -> knowledge -> information -> data.

Tell me what decisions you need to make, and I can tell you what data we have to collect.

If you don’t know, you can at least express it in general terms.

The business may need to react in terms of changes to the customer spending, for example. So, we need a system that shows at a high level and all of the way down, how the customers are spending on the products and services. And we need it to be historic, so that we can look at changes over time, say last year or five years ago. It can be more specific if the line of business is mature and you have someone whose expertise in that line is incredibly deep, but otherwise, it is general.

It works outwardly as well. You decide to put up a commercial product to help users with a very specific problem. You figure out what decisions they need to make while navigating through that problem, then you know what data you need to collect, and what structure you need to understand.

They are shopping for the best deals. You’d want to collect all of the things they have seen so far and rank them somehow. The overall list of all deals possible might get them going, but the actual problem is enabling them to make a decision based on what they’ve seen, not to overwhelm them with too much information.

The corollary to this is what effectively bugs me about a lot of the lesser web apps out there. They claim to solve a problem for the users, but then they just go and push back great swaths of the problem to the users instead. They’re too busy throwing up widgets onto the screen to care about whether the information in the widgets is useful or not, and they’ve organized the web app based on their own convenience, not the users' need to make a decision. That forces the users to bounce all over the place, copying and pasting the information elsewhere to refine the knowledge. It’s not solving the problem, but just getting in the way. A bad gateway to slow down access to the necessary information.

I’ve blogged about data modelling a lot, but Grace Hopper’s take on this helps me refine the first part. I’ve always known that you have to carefully and correctly model the data before you waste a lot of time building code on top.

I’ve often said that if you have made mistakes in the modelling, you go down as low as you can to fix them as early as you can. Waiting just compounds the mistake.

I’ve intuitively known when building stuff to figure out the major entities first, then fill in the secondary ones as the system grows. But the notion that you can figure out all of the data for your solution by examining the decisions that get made as people work through their problems really helps in scoping the work.

Take any sort of system, write out all of the decisions you expect people to make as a result of using it, and then you have your schema for the database. You can prioritize the decisions based on how you are justifying, funding, or growing the system.

Following that, first you decide on the problem you want to solve. You figure out which major decisions the users would need to make using your solution, then you craft a schema. From there, you can start adding features, implementing the functionality they need to make it happen. You still have some sense of which decisions you can’t deal with right away, so you get a roadmap as well.

Software essentially grows from a starting point in a problem space; if we envision that as being fields of related decisions, then it helps shape how the whole thing will evolve.

For example, if you want to help the users decide what’s for dinner tonight, you need data about what’s in the fridge, which recipe books they have, what kitchen equipment, and what stores are accessible to them. You let them add to that context, then you can provide an ordered list of the best options, shopping lists, and recipes. If you do that, you have solved their ‘dinner problem’; if you only do a little bit of that, the app is useless. Starting with the decision that they need help making clarifies the rest of it.
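
As a rough illustration of starting from the decision, here is a Python sketch that ranks dinner options by how little still has to be bought. The recipes, the fridge contents, and the ranking rule are all assumptions invented for the example; a real app would also weigh equipment, stores, and preferences.

```python
def rank_recipes(recipes, fridge):
    """Order recipes by how few ingredients still have to be bought."""
    ranked = []
    for name, ingredients in recipes.items():
        missing = sorted(set(ingredients) - set(fridge))
        ranked.append((len(missing), name, missing))
    ranked.sort()   # fewest missing ingredients first, then by name
    return [(name, missing) for _, name, missing in ranked]

# Hypothetical context the user has entered.
recipes = {
    "omelette": ["eggs", "cheese", "butter"],
    "pasta": ["pasta", "tomatoes", "garlic"],
    "salad": ["lettuce", "cucumber", "tomatoes"],
}
fridge = ["eggs", "cheese", "butter", "tomatoes", "garlic"]

for name, missing in rank_recipes(recipes, fridge):
    print(name, "needs", missing or "nothing")
```

The list of missing ingredients doubles as the shopping list, so one piece of collected data serves both the decision and the follow-up work.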

As I have often said, software is all about data; code is just the way you move it around. If you want to build sophisticated systems, you need to collect the right data and present it in the right way. Garbage data interferes with that. If you minimize the other resource usages like CPU, that is a plus, but it is secondary.

Expressive Power

Paul W. Homer Dec 4, 2025 Updated Dec 4, 2025

You can think about code as just being a means to take different inputs and then deliver a range of related outputs.

In a relative sense, we can look at the size of that code (as the number of lines) and the range of its outputs. We can do this from a higher system perspective.

So, say we have a basic inventory system. It collects data about some physical stuff, lets people explore it a bit, then exports the data downstream to other systems. Without worrying about the specific features or functionality, let's say we were able to get this built with 100k lines of code.

If someone could come along and write the exact same system with 50k lines of the same type of code, it is clear that their code has more ‘expressive power’ than our codebase. Both do the same thing, take the same inputs, generate the same range of outputs, and use the same technologies, but one is half the amount of code.

We want to amplify expressive power because, ultimately, it is less work to initially build it, usually a lot less work to test it, and it is far easier to extend it over its lifetime.

The code is half the size, so it is half of the typing work. Bugs loosely correlate to code size, so there are roughly half as many bugs. If the code reductions were not just cute tricks and syntactic sugar, it would require a bit more cognitive effort to code, and bug fixing would be a little harder, but not twice as hard, so there are still significant savings. It’s just less brute force code.

Usually, the strongest way to kick up expressive power is code reuse with a touch of generalization.

Most systems have reams of redundant code; it’s all pretty much the same type of similar work. Get data from the database, put it on a screen, and put it back into the database again. With a few pipes in and a couple out, that is the bulk of the underlying mechanics.

If you can shrink the database interaction code and screen widget layout code, you can often get orders of magnitude code reductions.

But the other way to kick up expressive power is to produce a much larger range of outputs from the inputs. That tends to come from adding lots of different entities into the application model, some abstraction, and leveraging polymorphism everywhere. More stuff handled more generally.

For instance, instead of hard-coding a few different special sets of users, you put in the ability to group any of them for any reason. One generic group mechanism lets you track as many sets as you need, so it’s fewer screens, fewer specific entities, but a wider range of capabilities. A bump up in expressive power.

The biggest point about understanding and paying attention to expressive power comes from the amount of time it saves. We’re often asked to grind out medium-sized systems super quickly, but a side effect of that is that the specifications are highly reactive, so they change all of the time. If you build in strong expressive power early, then any of those arbitrary changes later become way less work, sometimes trivial.

If, from the above example, you had hardcoded sets, adding a new one is a major pain. If you had arbitrary groups, it would be trivial.
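As a rough illustration of the arbitrary-groups idea, here is a minimal sketch. The class name, method names, and group names are all made up for the example; a real system would persist this in the database rather than in memory.

```python
from __future__ import annotations

from collections import defaultdict


class GroupRegistry:
    """One generic mechanism for grouping users for any reason (hypothetical)."""

    def __init__(self) -> None:
        # group name -> set of user ids; a group comes into existence on first use
        self._members: dict[str, set[str]] = defaultdict(set)

    def add(self, group: str, user: str) -> None:
        self._members[group].add(user)

    def remove(self, group: str, user: str) -> None:
        self._members[group].discard(user)

    def members(self, group: str) -> set[str]:
        # .get avoids creating an empty group as a side effect of a lookup
        return set(self._members.get(group, set()))

    def groups_for(self, user: str) -> set[str]:
        return {name for name, users in self._members.items() if user in users}


registry = GroupRegistry()
registry.add("admins", "alice")
registry.add("night-shift", "alice")  # a brand-new set: no new entity, no new screen
registry.add("night-shift", "bob")
```

With hardcoded sets, "night-shift" would have meant a new entity, new screens, and new code paths. Here it is just another string.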

Brute force code is too rigid and fragile, so over time, it counts as dead weight. It keeps you from getting ahead of the game, which keeps you from having enough time to do a good job. You’re scrambling too hard to catch up.

We see that more dramatically if we write 1M lines of code, when we just needed 50K. 1M lines of code is a beast, so any sort of change or extension to it goes at a snail's pace. And adding new subsystems into something that brute-forced is the same work as doing it from scratch, so there is no real ability to leverage any of the earlier efforts. The code becomes a trap that kills almost all momentum. Development grinds to a halt.

But if you have some solid code with strong expressive power, you can use it over and over again. Sometimes you’ll have to ratchet it up to a new level of expressiveness, but it is a fraction of the work of coding it from scratch. Redeploying your own battle-hardened code a whole bunch of times is far superior to writing it from scratch. Less work, less learning, and way fewer bugs.

Since time is often the biggest development problem and the source of most problems, anything to save lots of time will always make projects go a whole lot smoother. To keep from getting swamped, we always need to get way more out of any work. That is the only way to keep it sane.

Software Failures

Paul W. Homer Nov 27, 2025 Updated Nov 27, 2025

Way back, maybe 15ish years ago, when I was writing that software projects failed just as often then as they did back in the Waterfall era, lots and lots of people said I was wrong.

They insisted that the newer lightweight reactive methodologies had “fixed” all of the issues. But if you understand what was going wrong with these projects, you’d know that was impossible.

So it’s nice to see a modern perspective confirm what I said then:

https://spectrum.ieee.org/it-management-software-failures

The only part of that article that I could disagree with is that it was a bit too positive towards Agile and DevOps. It did mitigate itself at the end of the paragraph, but it still has an overly positive marketing vibe to the writing. “Proved successfully” should have been lowered to “claimed successfully”, which is a bit different and a lot more realistic.

If you lumped in all of the software that doesn’t even pay for itself, you’d see that the situation is much worse in most enterprises. Millions of lines of fragile code that kinda do just enough that it convolutes everything else. It’s pretty ugly out there. A big digital data blender.

From my perspective, the chief problem of our industry is expectations. When non-technical people moved in to take control of development projects, they misprioritized things so badly that the work commonly spins out of control.

If a development project is out of control, the best case is that it will produce a lame system; the worst case is that it will be a total outright failure. It is hard to come back from either consequence.

If we want to fix this, we have to change the way we are approaching the work.

First, we have to accept that software development is extremely, extremely slow. There are no silver bullets, no shortcuts, no vibing, no easy or cheap ways out. It is a lot of work, it is tedious work, and it needs to be done carefully and slowly.

Over the 35 years of my career, development just keeps getting faster, but with every jump up, the quality keeps getting worse. Since you need a minimal level of quality for it to not be lame or a failure, you need a minimal amount of time to get there. If you try to skimp on that, it falls apart.

Hacking might be a performance art, but programming is never that. It is a slow, intense slog bordering on engineering. It takes time.

Time to design, time to ramp up, time to train, time to learn, time to code, time to test. Time, time, time.

If the problem is that you are trying to race through the work in order to keep the budget under control, the problem is that you are racing through the work. So, slow it down. Simple fix.

For any serious system, it takes years and years for it to reach maturity. Trying to slam part of it out in six months, then, is more than a bit crazy. Libraries and frameworks don’t save you. SaaS products don’t save you. Being overly reactive and loose with lightweight methodologies doesn’t save you either, and can actually fuel the problems, making them worse, not better.

If you want your software to work properly, you have to put in the effort to make it work properly. That takes time.

The other big issue is that the group of people you assemble to build a big system matters a whole lot. Huge amount.

Programmers are not easily replaceable cogs. An all-junior team that is barely functional may be far less expensive, but it is also a massive risk. The resulting system is already in big trouble before any code is written.

The people you put in charge of the development work really matter. They need to be skilled, experienced, and understand how to navigate some pretty complicated and difficult tradeoffs. Without that type of background, the work gets lost and then spins out of control.

It’s very common to see too much focus dumped on trivial visible interface issues while the underlying mechanics are hopelessly broken. It’s like worrying about the cup holder in your car when the engine block is cracked. You need someone who knows this, has lived this, and can avoid this.

As well, enough of the team needs to have significant experience too. Experience is what keeps us from making a mess, and a mess is the easiest thing to create with software. So, a gifted team of developers, mixed in experience with both juniors and seniors, led by experience, is pretty much a prerequisite to keeping the risks under control.

Software development has always been about people, what they know, and what they can build. They are the most important resource in building and running large software systems. If you don’t have enough skilled people, nothing else matters. No methodology, process, or paperwork can save you. You lack the talent to get it done. Simple.

Those are mostly the roots of our problems. Not enough time, and not taking the staffing issues seriously enough. Fix those two, and most development gets back on the rails. Nurture them, and most things built out of a strong shop are pretty good. From there, you can decide how high the quality should be, or how to streamline the work, or strategize about direction, but without that concrete foundation, you are lucky if any of it runs at all, and, if it does, that the crashes aren’t too epic.

Integrations

Paul W. Homer Nov 20, 2025 Updated Nov 20, 2025

There are two primary ways to integrate independent software components; we’ll call them ‘on-top’ and ‘underneath’.

On top means that the output from the first piece of software goes in from above to trigger the functionality of the second piece of software.

This works really well if there is a generalized third piece of software that acts as a medium.

This is the strength of the Unix philosophy. There is a ‘shell’ on top, which is used to call a lot of smaller, well-refined commands underneath. The output of any one command goes up to the shell, and then down again to any other command. The format is unstructured or at least semi-structured text. Each command CLI takes its input from stdin and command line arguments, then puts its output to stdout, and splits off any errors into stderr. The ‘integration’ between these commands is ‘on top’ of the CLI in the shell. They can all pipe data to each other.

This proved to be an extremely powerful and relatively consistent way of integrating all of these small parts together in a flexible way.
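To make the on-top shape concrete, here is a minimal sketch in Python. It is not how a real shell works internally (a shell streams data through pipes, while this runs each step to completion), and the two "commands" are hypothetical one-liners standing in for real programs. The point is only that the commands know nothing about each other; the medium on top does all the wiring.

```python
import subprocess
import sys

# Two stand-in "commands". Each reads stdin and writes stdout, and neither
# knows the other exists. (Hypothetical; real ones would be separate programs.)
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
COUNT_LINES = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.read().splitlines()))"]


def run_pipeline(commands, text):
    """Act as the medium: hand each command's stdout to the next one's stdin."""
    log = []
    for step, command in enumerate(commands, start=1):
        result = subprocess.run(command, input=text, capture_output=True,
                                text=True, check=True)
        # One high-level record per hop, all kept in a single place.
        log.append(f"step {step}: {len(text)} chars in, {len(result.stdout)} chars out")
        text = result.stdout
    return text, log


output, log = run_pipeline([UPPERCASE, COUNT_LINES], "apples\npears\n")
```

Swapping, reordering, or adding a command is a change to the list passed to `run_pipeline`, not to any of the commands.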

Underneath integrations are the opposite. The first piece of software keeps its own configuration data for the second piece and calls it directly. There may be no third party, although some implementations of this ironically spin up a shell underneath, which then spins up the other command. Sockets are also commonly used to communicate, but they depend on the second command already being up and running and listening on the necessary port, so they are less deterministic.

A lot of modern software prefers the second type of integration, mostly because it is easier for programmers to implement it. They just keep an arbitrary collection of data in the configuration, and then start or call the other software with that configuration.

The problem is that even if the configuration itself is flexible, this is still a ‘hardwired’ integration. The first software must include enough specific code in order to call the second one. The second one might have a generic API or CLI. If it needs the output, the first software needs to parse the output it gets back.
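A minimal sketch of that hardwired shape, with everything in it assumed for illustration: the second program is faked with a Python one-liner, and the config keys and the `rows` field are invented. Notice how much the caller has to know about the callee.

```python
import json
import subprocess
import sys

# Configuration the *first* program keeps about the second one (hypothetical).
REPORTER_CONFIG = {
    "command": [sys.executable, "-c",
                "import json; print(json.dumps({'rows': 3, 'status': 'ok'}))"],
    "timeout_seconds": 10,
}


def fetch_row_count(config):
    """Call the second program directly and parse whatever it prints."""
    result = subprocess.run(
        config["command"],
        capture_output=True,
        text=True,
        timeout=config["timeout_seconds"],
        check=True,
    )
    # Hardwired knowledge: the caller must know the output is JSON
    # and that it carries a 'rows' key.
    payload = json.loads(result.stdout)
    return payload["rows"]


row_count = fetch_row_count(REPORTER_CONFIG)
```

The configuration is flexible, but the parsing code is not. If the second program changes its output format, the first one breaks, and nothing above the two of them records what happened.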

If the interaction is bi-directional and a long-running protocol, this makes a lot of sense. Two programs can establish a connection, get agreement on the specifics, and then communicate back and forth as needed. The downside is that both programs need to be modified, and they need to stay in sync. Communication protocols can be a little tricky to write, but are very well understood.

But this makes a lot less sense if the first program just needs to occasionally trigger some ‘functionality’ in the second one. It’s a lot of work for an infrequent and often time-insensitive handoff. It is better to get the results out of the program and back up into some other medium, where it can be viewed and tracked.

The on-top approach is considerably more flexible, and depending on the third party, it makes problems far easier to diagnose. You can get a high-level log of the interaction, instead of having to stitch together parts of a bunch of scattered logs. Identifying where a problem originated in a bunch of underneath integrations is a real nightmare.

Messaging backbones act as third parties as well. If they are transaction-oriented and bi-directional, then they are a powerful medium for different software to integrate. They usually define a standard format for communication and data. Unfortunately, they are often vendor-specific, very expensive, locked in, and can have short lifespans.

On-top integrations can be a little more expensive when using resources. They are slower, use more CPU, and it is costly to format to a common format and then parse back to the specifics. So they are not preferred for large-scale high-performance systems. But they are better for low or infrequent interactions.

However, on-top integrations also require a lot more cognitive effort and pre-planning. You have to carefully craft the mechanics to fit well into the medium. You essentially need a ‘philosophy’, then a bunch of implementations. You don’t just randomly evolve your way into them.

Underneath integrations can be quite fragile. When there are a lot of them, they are heavily fragmented; the configurations are scattered all over the place. If there are more than 3 of them chained together, it can get quite hairy to set them up and keep them running. Without some intensive tracking, unnoticed tiny changes in one place can manifest later as larger mysterious issues. It is also quite a bit harder to reason about how the entire thing works, which causes unhappy surprises. Equally problematic is that each integration is very different, and all of these inconsistencies and different idioms increase the likelihood of bugs.

As an industry, we should generally prefer on-top integrations. They proved to be powerful and reliable for decades for Unix systems. It’s just that we need more effort in finding expressive generalized data passing mechanisms. Most of the existing data formats are far too optimized for limited sub-cases or are too awkward to implement correctly. There are hundreds of failed attempts. If we are going to continue to build tiny independent, distributed pieces, we have to work really hard to avoid fragmentation if we want them to be reliable. Otherwise, they are just complexity bombs waiting to go off.

We’ll still need underneath integrations too, but really only for bi-directional, extensive, high-speed, or very specific communication. These should be the exception -- an optimization -- rather than the rule. They are easier to implement, but they are also less effective and are a dangerous complexity multiplier.

Unknown unknowns

Paul W. Homer Nov 13, 2025 Updated Nov 13, 2025

If I decided to build a house all on my own, I am pretty sure I would face lots of unexpected problems.

I am comfortable building something like a fence or a deck, but those skills and the knowledge I gained using them are nowhere close to what it takes to build a house.

What does it take to build a house? I have no clue. I can look at already built houses, and I can watch videos of people doing some of the work, but that isn’t even close to enough information to empower me to just go off and do it on my own.

If I tried, I would surely mess up.

That might be fine if I were building a little shed out back to store gardening tools. It’s likely that whatever mess I created would probably not result in injuries to people. It’s a very slim possibility.

But knowing that there are a huge number of unknown unknowns out there, I would be more than foolish to start advertising myself as a house builder, and even sillier to take contracts to build houses.

If a building came tumbling down late one night, it could very likely kill its occupants. That is a lot of unnecessary death and mayhem.

Fortunately, for my part of the world, there are plenty of regulators out there with building codes that would prevent me from making such a dangerous mistake.

The building codes are usually specific in how to do things, but they were initially derived from real issues that explain why they are necessary.

If I were to carefully go through the codes, I am sure that their existence -- if I pondered hard enough -- would shed light on some of those unknown unknowns that I am missing.

There might be something specific about roof construction that was driven by roofs needing to withstand a crazy amount of weight from snow. The code mandates the fix, but the reason for seemingly going overboard on the tolerances could be inferred from the existence of the code itself. “Sometimes there is a lot of extra weight on roofs”.

Rain, wind, snow, earthquakes, tsunamis, etc. There are a series of low-frequency events that need to be factored into any construction. They don’t occur often, but the roof needs to survive if and when they manifest themselves.

Obviously, it took a long time and a lot of effort over decades, if not centuries, to build up these building codes. But their existence is important. In a sense, they separate out the novices from the experts.

If I tried to build a house without reading or understanding them, it would be obvious to anyone with a deeper understanding that I was just not paying attention to the right areas of work. The foundations are floppy or missing, the walls can’t hold up even a basic roof, and the roof will cave in under the lightest of loads. The nails are too thin; they’ll snap when the building is sheared. It would be endless, really, and since I don’t know how to build a house, I certainly don’t know all of the forces and situations that would cause my work to fail.

I’ve always thought that it was pretty obvious that software needs building codes as well.

I can’t count the number of times that I dipped into some existing software project only to find that problems that I find very obvious, given my experiences, were completely and totally ignored. And that, once the impending disasters manifested themselves, everybody around me just said “Hey, that is unexpected”, when it was totally expected. I’ve been around the block; I knew it was coming.

Worse is that whenever I tried to forewarn them, they usually didn’t want to listen. Treated me as some old paranoid dude, and went happily right over the cliff.

It gets so boring having to say “I told you so”, that at some point in my career, I just stopped doing it. I stuck with “you can lead a horse to water, but you can not make it drink” instead.

And that is where building codes for software come in. As a new developer in an existing project, I often don’t carry much weight, but if there were an official reference for building codes that covered the exact same thing, these problems would be easy to prevent. “You’ve violated code 4.3.2, it will cause a severe outage one day”, is better than me trying to explain why the novice blog posts they read that said it was a good idea are so horribly wrong.

Software development is choked with so many myths and inaccuracies that wherever you turn, you bump into something false, like trying to run quickly through a papier-mâché maze without destroying it.

We kinda did this in the past with “best practices”, but it was informal and often got co-opted by dubious people with questionable agendas. I think we need to try again. This time, it is a bunch of “specific building codes” that are tightly versioned. They start by listing out strict ‘must’ rules, then maybe some situationally optional ones, and an appendix with the justifications.

It’s oddly very hard to write, and harder to keep it stack, vendor, and paradigm neutral. We should probably start by being very specific, then gradually consolidate those codes into broader, more general ones.

It would look kinda like:

1.1.1 All variable names must be self-describing and synchronized with any relevant outside domain or technical terminology. They must not include any cryptic or encoded components.

1.2.1 All function names must be self-describing and must clearly indicate the intent and usability of the code that they encapsulate. They must not include any cryptic or encoded components, unless mandated by the language or usage paradigm.

That way, if you came across a function called FooHandler12Ptr, you could easily just say it was an obvious violation of 1.2.1 and add it as a bug or a code review fix.
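Part of a code like 1.2.1 could even be checked mechanically. The sketch below is only a narrow stand-in: the regular expression (digits, plus a few encoded suffixes like `Ptr`) is my own assumption about what "cryptic or encoded" might mean, not a real definition, and it says nothing about whether a name is truly self-describing. It uses Python's standard `ast` module to find function definitions.

```python
import ast
import re

# Hypothetical proxy for code 1.2.1: flag digits and a few encoded suffixes.
CRYPTIC = re.compile(r"\d|(Ptr|Impl|Tmp)$")


def find_violations(source):
    """Return (line, name) for each function whose name matches the proxy rule."""
    tree = ast.parse(source)
    return [
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and CRYPTIC.search(node.name)
    ]


SAMPLE = """
def FooHandler12Ptr(x):
    return x

def generate_progress_report(batch):
    return batch
"""

violations = find_violations(SAMPLE)
for line, name in violations:
    print(f"line {line}: '{name}' violates code 1.2.1")
```

A check like this only catches the most blatant cases; the rest still needs a human in a code review, with the code number to point at.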

In the past, I have worked for a few organizations that tried to do this. Some were successful, some failed miserably. But I think that in all cases, there was too much personality and opinion buried in their efforts. So, the key part here is that each and every code is truly objective. Almost in a mathematical sense, they are all ‘obviously true’ and don’t need to be broken down any further.

I do know that, given the nature of humanity, there is at least one programmer out there in this wide world who currently believes that ‘FooHandler12Ptr’ isn’t just a good name, it should also be considered best practice. For each code, I think they need an appendix, and that is where the arguments and justifications should rest. It is for those people adventurous enough to want to pursue arguing against the rules. There are plenty of romanticized opinions and variations on goals; our technical discussions quickly get lost in very non-objective rationales. That should be expected, and the remedy is for people with esoteric views to simply produce their own esoteric building codes. The more, the merrier.

Of course, if we do this, it will eat up some time, both to write up the codes and to enforce them. The one ever-present truth of most programming is that there isn’t even close to enough time to spare, and most managements are chronically impatient. So, we sell adherence to the codes as a ‘plus’, particularly for commercial products or services. “We are XXX 3.2.1 compliant” becomes a means of really asserting that the software is actually good enough for its intended usage. In an age where most software isn’t good enough, at some point, this will become a competitive advantage and a bit later a necessity. Just need a few products to go there first, and the rest will have to follow.

Intent

Paul W. Homer Nov 6, 2025 Updated Nov 6, 2025

Recently, I was reading some AI-generated code.

At the high level, it looked like what I would expect for the work that it was trying to do, but once I dug into the details, it was kinda bizarre.

It was code that one might expect from a novice programmer who was struggling with programming. Its intent was muddled. Its author was clearly confused.

It does help, though, for a discussion of readability.

Really good code just shows you what it is going to do, really easily. It makes it obvious. You don’t need to think too hard or work through the specifics. The code says it is going to do X, and the code does that X, as you would expect. Straightforward, and totally boring, as all good code should be.

In thinking about that, it all comes down to intentions. “What did the programmer intend the code to do?” Is it some application code that moves data from the database and back to the GUI again? Does it calculate some domain metric? Is it trying to span some computation over a large number of resources?

To make it readable, the programmer has to make their intent clear.

The most obvious thing is that if there is a function called GetDataFromFile, which gets data from the database, you know the stated intent is wrong, or obscured, or messed up. Shouldn’t the data come from a file? Why is it going to a database underneath? Did they set it up one way and duct-tape it later, without bothering to properly update the name?

If the code is lying about what it intends to do, it is not readable. That’s an easy point, in that if you have to expend some cognitive effort to remember all of the places where the code is lying to you, that is just unnecessary friction getting in your way. Get enough misdirection in the code, and it is totally useless, even if it compiles. Classic spaghetti.

Programming is construction, but it is also, unfortunately, a performance art. It isn’t enough to just get the code in place; you also have to keep it going, release after release.

Intent also clarifies single-responsibility issues.

If the intent of a single function is to “get data from the db”, “... AND to reformat it”, “... AND to check it for bad data”, “... AND to ....” then it’s clear that the function is not doing just one thing, it is doing a bunch of them.

You’d need a higher function that calls all of the steps: “get”, “reformat”, “validate”, etc., as functions. Its name would indicate that it is both grabbing the data and applying some cleanup and/or verification to it. Raw work and derived work are very different beasts.
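A minimal sketch of that layering. The function names and the order records are invented for the example, and the ‘database’ is just an in-memory list so the code can run on its own.

```python
def get_raw_orders(rows):
    """Raw work: fetch only. (The 'database' is an in-memory list here.)"""
    return list(rows)


def reformat_orders(orders):
    """Derived work: normalize field formats, nothing else."""
    return [{"id": o["id"], "total": round(float(o["total"]), 2)} for o in orders]


def validate_orders(orders):
    """Derived work: drop records with bad data, nothing else."""
    return [o for o in orders if o["total"] >= 0]


def get_clean_orders(rows):
    """The higher function: its name says it both grabs and cleans the data."""
    return validate_orders(reformat_orders(get_raw_orders(rows)))


FAKE_TABLE = [{"id": 1, "total": "19.999"}, {"id": 2, "total": "-5"}]
clean = get_clean_orders(FAKE_TABLE)
```

Each lower function does one thing and its name says so. Only `get_clean_orders` admits to doing several, and you can read its three steps without descending into any of them.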

Programmers hate layering these days, but muddling a bunch of different things into one giant function not only increases the likelihood of bugs, but also makes it hard for anyone else to understand what is happening. Nobody ever means to write DoHalfOfTheWorkInSomeBizarreInterlacedOrder, but that should really be a far more common function name in a lot of codebases out there. The intent of the coder was to avoid typing in functions to avoid having to name them. What the code itself was doing was forgotten.

If you decompose the code nicely into decent bite-sized chunks and give each chunk a rational, descriptive name, it is pretty easy to follow what the code is trying to do. If it is layered, then you only need to descend into the depths if there is an underlying bug or problem. You can quickly assert that GenerateProgressReport does the 5 steps that you’d expect, and move on. That is readable, easy to understand, and you probably don’t need to go any deeper. You now know what those 5 steps are. You need that capability in huge systems; there is often more code there than you can read in a lifetime. If you always have to see it with all of the high and low steps intertwined together, it cripples your ability to write complex or correct code.

In OO languages, you can get it even nicer: progress.report.generate() is nearly self-documenting. The nouns are stacked in the way you’d expect them, even though “progress” is really a nounified verb. If the system were running overnight batches, and the users occasionally wanted to check in on how it was going, that is where you’d expect to find the steps involved. So, if there was a glitch in the progress report, you pretty much know where that has to be located.
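For what it’s worth, the object structure behind a call like that can be tiny. This is only a sketch; the class names, the batch records, and the wording of the report are all assumptions made up for the example.

```python
class Report:
    """Generates the progress report for overnight batches (hypothetical)."""

    def __init__(self, batches):
        self._batches = batches

    def generate(self):
        finished = sum(1 for batch in self._batches if batch["finished"])
        return f"{finished} of {len(self._batches)} batches finished"


class Progress:
    """Groups everything about 'how is the run going?' in one place."""

    def __init__(self, batches):
        self.report = Report(batches)


progress = Progress([
    {"name": "billing", "finished": True},
    {"name": "exports", "finished": False},
])
summary = progress.report.generate()
```

If the report ever shows the wrong count, there is only one small method where that bug can live.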

A long, long time ago, in the Precambrian days of OO, I remember watching one extremely gifted coder in action. He was working with extremely complicated graphic visualizations. As he’d see a problem on the screen while testing, he had structured his code so well that he pretty much knew exactly which line in it was wrong. That kind of code is super readable, and the readability has an extra property making it highly debuggable as well. The bug nicely tells you exactly where in the code you have a problem. That is a very desirable property.

His intent was to render this complicated stuff; his code was written in a way that made it easy to know if it was doing the right thing or not. This let him quickly move forward with his work. If his code had been badly named spaghetti, it would have taken several lifetimes to knock out the bugs. Anybody who does not think those readability and debuggability properties are necessary doesn’t realize how much more work they’ve turned it into, how much time they are wasting.

If the intent of the code is obscured or muddled, it limits the value of the code. That’s why we have comments, for adding in extra commentary that the code itself cannot express. Run-once code, even if it works, is too expensive in time to ever allow it to pay for itself. You don’t want to keep writing nearly identical pieces of code each time if you can just solve it once in a slightly generalized fashion and then move on to other, larger pieces of work.

It takes a bit of skill to let intent shine through properly. It isn’t something intuitive. The more code from other people you read, the more you learn which mistakes hurt the readability. It’s not always obvious, and it certainly changes a bit as you get more and more experience.

You might, for example, put in a clear idiom that you have seen in the past often, but it could confuse less experienced readers. That implies that you have to stick to the idioms and conventions that match the type of code you are writing. Simple application idioms for the application’s code, and more intricate system programming idioms for the complex or low-level stuff. If you shove some obscure functional programming idiom into some basic application code, it will be hard for other people to read. There is a more ‘application coding’ way of expressing the code.

It is a lot like writing. You have to know who you are coding for and consider your audience. That gives you the most readable code for the work you are doing. It is necessary these days because most reasonably sized coding projects are a team sport now.

It’s worth noting that deliberately hiding your intent and the functionality of the code in an effort to get ‘job security’ is generally unethical. Maybe if it’s some code that you and a small group of people will absolutely maintain for its entire lifetime, then it might make sense, but that is almost never actually the case. More often, we write stuff, then move on to writing other stuff.
