
Thoughts, by tssm

Part of neocities.org

Vibers hold language models wrong

As the weeks went by and I found myself using language models more and more at work, I had to admit that the scope of my hate had narrowed.

In particular I started to use them more often to debug or to find bugs I’m not aware of, and as a result I’ve experienced first-hand the awe of reading something about my code that I didn’t know, but should have known, and that must be fixed. So it’s easier for me to understand why people are excited about these things to the point where they make exaggerated claims about their capabilities.

I still believe there’s nothing like understanding going on here, not in the sense that I previously used the word, yet it feels natural to talk in such terms after writing a prompt that lightly describes the problem I’m seeing, and getting back either an explanation, a solution, or both. And sometimes, even well argued bullshit. Uncanny, some people would say.

I’ve realised that the latter case, the random crap, is way more subtle than before, due to models improving. You have to be more wary of what you are getting because something may be wrong in a non-obvious way, and because it’s non-obvious it’s harder to catch. To avoid shipping a rushed solution that seems to work under certain circumstances or that makes the code surface grow without reason, I still type everything myself. I still believe that writing is a rigorous way of thinking, and that slowing down is good.

But when I’m done, I ask the model to make suggestions on the code I’ve worked on. Last time I got six, for a non-trivial piece of UI and behaviour. One was completely useless even if technically correct, because the thing really doesn’t know what I’m doing. Two were good and I ended up learning something, after manually testing the behaviour and checking the docs. The final three looked OK, but didn’t convince me, so I made some clarifications, tested every option, and ended up changing the code in a different way than suggested, fixing the issues at their source instead of applying what I now know were ugly work-arounds.

Then I prompt it to find the bugs in the whole project. This is just a scaled-up version of the above, so it will take longer and the output will often be bigger. As a result, analysing it takes more time, yet I’m starting to feel more confident about what I ship. You have to consider that I’ve been working mostly alone and no review has been performed on my code for the last four years. Every new release makes me feel a bit anxious. I wish I had a colleague to back me up, but as stated by my employer, the idea here is that the current team becomes more efficient. Efficiency was never defined, although «productivity» was suggested. I reject that idea and interpret efficiency as deploying something that will require less maintenance and less fixing, even if a bit late. And if I have to, it will be easier to perform both.

So, I still hate so-called generative AI and I strongly oppose «generating» anything with it. I do the thinking, because I like to think, and I like to do it in programming languages. Yet language models sneaked into my toolbox (and whatever comes after could do the same) in the form of AI, short for augmented intelligence. This is obviously not my term; there was a tradition of people thinking about computers in such a way [1, 2, 3]. It’s time for a renaissance, and it looks nothing like delegating either thoughts or techniques. Vibers are holding language models the wrong way.

https://tssm.neocities.org/vibers-hold-language-models-wrong
Programming is writing is programming, for reals

This is of course a reference to a paper by Felienne Hermans et al. The thing is, while the authors make a fair comparison between both activities, to me it barely scratches the surface.

I’ve been calling myself a software writer for longer than I’ve been living in Norway. I don’t remember when I made the connection, but the more code I wrote in different paradigms, the clearer it became that the mental work performed while typing a blog post and while typing a program is quite similar.

In The programmer’s brain Hermans quotes research whose results suggest that natural language processing and programming activate the same regions of the brain, which is why I dare to write the following.

Tabs vs spaces

In Spanish the rule is to use em dashes to mark out dialogues. In English most authors use quotes. Then there’s Sally Rooney, who uses no sign, just the flow of her text. I love it. And then there’s Mariana Enríquez, who in the same novel uses both em dashes and no sign at all.

I personally prefer tabs over spaces because I can configure their width according to my mood without changing the code, which is a technical argument; but it also forces me to break lines often to avoid alignment issues, which is a stylistic one. Yet most people use spaces, getting rid of the flexibility provided by tabs, favouring what leads to a purely aesthetic question: 2 or 4 spaces?

Programming languages are expressive

When you write a text, whether fiction or not, you have innumerable ways to say exactly the same thing. A writer may choose the clearest, least ambiguous one, or the boldest, the most brutal, or (why not?) the most confusing.

While programming you have a smaller set of options to pick from (after all, programming languages are formal ones), yet the set is still big enough that you can question whether a certain way is the most “readable”.

I’m a fan of higher order functions for iterating over collections regardless of the language, but especially in object oriented ones, where the benefit of composition also gives you the benefit of readability: each operation is a function call, clearly labeled, no comments needed, where to modify things becomes obvious at a glance. Except that if someone doesn’t have experience working with higher order functions, a loop isn’t only more readable, but probably looks prettier.
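To make the contrast concrete, here is a minimal sketch in Python of the same computation written both ways. The `Order` type, the helper names, and the sample data are all hypothetical, invented only for illustration; they don't come from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical record type, used only for this illustration.
    customer: str
    total: float
    paid: bool


def is_paid(order: Order) -> bool:
    return order.paid


def total_of(order: Order) -> float:
    return order.total


def paid_revenue_pipeline(orders: list[Order]) -> float:
    # Each step is a named function call: filter, then map, then sum.
    return sum(map(total_of, filter(is_paid, orders)))


def paid_revenue_loop(orders: list[Order]) -> float:
    # The same computation as an explicit loop with an accumulator.
    revenue = 0.0
    for order in orders:
        if order.paid:
            revenue += order.total
    return revenue


orders = [
    Order("ana", 10.0, True),
    Order("bo", 25.5, False),
    Order("cy", 4.5, True),
]
```

Both functions return the same number for the same input. Which one reads better depends, as argued above, on whether the reader has already spent time with `map` and `filter`.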

Which reminds me of reading. I started as a reader of pop-science magazines. I could read them for hours, no distraction. But literature was very hard. Getting into the flow of reading different genres requires being exposed to each one for long enough. Reading exponents of stream of consciousness in school was boring and confusing; reading them now is easy, just because I got prepared by reading enough slipstream.

So “readable” depends on what the audience is familiar with, and sometimes what’s familiar is the wrong choice.

Consider the time a colleague wrote some computation in PyTorch and compiled it to CoreML, just because Python is what he knew; there was no ML here. Its performance varied a lot depending on the device, being terribly slow on older ones. At some point it was obvious the Python implementation had to go, and we spent several weeks porting it to SIMD operations. The speed gain justified not only the effort, but also going from “here you have a nice representation of matrices, with types” to “now everything is an array full of floats”. There’s room for boldness, brutality, and confusion in programming too.
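A rough sketch of what is lost in that trade, in plain Python. This is not the code from that project and it says nothing about SIMD or speed; the `Matrix` type and the helper functions are hypothetical, and the only point is that a flat array of floats no longer carries its own shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Matrix:
    # Hypothetical typed representation: the shape travels with the data.
    rows: int
    cols: int
    values: tuple[tuple[float, ...], ...]  # values[row][col]

    def at(self, row: int, col: int) -> float:
        return self.values[row][col]


def flatten(matrix: Matrix) -> list[float]:
    # Row-major layout: all of row 0, then all of row 1, and so on.
    return [value for row in matrix.values for value in row]


def flat_at(data: list[float], cols: int, row: int, col: int) -> float:
    # The shape is no longer part of the data; the caller must carry `cols`.
    return data[row * cols + col]


m = Matrix(rows=2, cols=3, values=((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
flat = flatten(m)
```

With the right column count, `flat_at(flat, m.cols, 1, 2)` agrees with `m.at(1, 2)`. With the wrong one, `flat_at(flat, 2, 1, 2)` quietly returns a different element instead of failing, which is the kind of confusion the flat representation makes possible.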

Paradigms are movements

When we say that instead of modifying state, programs should be written as the composition of mathematical functions, we are talking philosophy.
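As a small illustration of the two stances, here is the same text transformation written once as a composition of functions and once as a series of state modifications. It is a sketch only; `compose`, `slugify`, and `SlugBuilder` are hypothetical names made up for this example.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]


def compose(*steps: Step) -> Step:
    # compose(f, g)(x) == g(f(x)): steps are applied left to right.
    return lambda text: reduce(lambda acc, step: step(acc), steps, text)


def trim(text: str) -> str:
    return text.strip()


def lowercase(text: str) -> str:
    return text.lower()


def hyphenate(text: str) -> str:
    return "-".join(text.split())


# Composition: the program is a new function built from smaller ones.
slugify = compose(trim, lowercase, hyphenate)


class SlugBuilder:
    # State modification: the same steps, each one overwriting a field.
    def __init__(self, text: str) -> None:
        self.text = text

    def run(self) -> str:
        self.text = self.text.strip()
        self.text = self.text.lower()
        self.text = "-".join(self.text.split())
        return self.text
```

Both produce the same result for the same input. The difference is that `slugify` leaves its argument alone, while `SlugBuilder.run` changes what `self.text` holds along the way.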

I posit that “prefer composition over inheritance” summarises why prototype-based object orientation is better than class-based, even when the latter won. And that we can see how programming isn’t immune to the forces of marketing.

Where and when and how we catch errors is irrelevant once the program is running, as long as we catch them. I believe the right place is compile time. Some people believe it’s during testing, after all you need tests anyways. Others just let it crash.

When we argue about these things we really argue about how we believe things ought to be, and we theorise about the ways we want to shape the discipline. None is ultimately right or wrong; all of them have their time and place.

Programming is a rigorous form of thinking

The zeitgeist is that code doesn’t matter. On one hand “code is a liability”, and on the other “we won’t need to read it anymore”. Both make my blood boil because what Jane Jacobs said about writing surely applies to programming. We may have an architecture scheme, we gather requirements. We believe that once everything is correctly designed the code will flow line by line. A machine will produce it. It is like constructing buildings, and formal methods will fix it.

Yet the truth is that when we are writing code, things happen in our brains. We realise we chose the wrong abstraction. We notice things may be too slow. In horror we discover that we don’t understand the domain as much as we thought because suddenly modelling it is hard. We jump back to look at what we wrote before. We edit and edit ad nauseam. We ask for feedback. We have references of what we would like to achieve. We may have a working prototype and not be satisfied by it even when it does what it’s supposed to do.

Jacobs’s quote reads: “I often try writing at an early stage because writing is, for me, a rigorous form of thinking. When you put things down on those blank sheets of paper you find the holes in what you suppose. I do a lot of drafts, and a lot of discarding”. I refuse to believe I’m the only programmer out there who can relate.

Code matters, of course it does. As long as we have computers around and as long as we want to execute it. But more importantly, code matters because writing it makes us think.

https://tssm.neocities.org/programming-is-writing-is-programming-for-reals
LLMs made me an even slower programmer

Don’t tell my boss, but I think it’s for the better. It’s also his fault anyway: the day when I was politely nudged to use the Claude subscription provided by my employer finally came. But nobody told me how to use it, so I started to “experiment” with both Claude Code and OpenCode, on my own terms. Devil emoji.

I used scare quotes because, despite whatever the fandom told me, there isn’t really much room for experimentation here. Both are TUIs that, by running on your computer, have access to your files and tools. So whatever you already know how to do can now be automated in a natural language, that’s it. Plus some randomness sprinkled here and there. Of course, if you don’t know how to do much with the command line to begin with, these tools surely feel like game-changing magic.

As I do know how to use the command line, as I do have personal Makefiles as well as a shell history that’s searchable, and more importantly, as I do enjoy programming a lot, I immediately restricted all permissions in both UIs to read only. This is of course at odds with the whole purpose of these things, but I couldn’t care less. I’ll keep holding them wrong. I have some sort of programming philosophy, and I previously described what’s an acceptable LLM workflow according to it. I’m the one that will drive the whole creative process, I’m the one that’s going to think! And writing (including writing code) is in my view a tool to get into rigorous thinking. So how can an LLM help me?

Well, I’m a drop out and a dumb ass, and as a consequence understanding and implementing algorithms that everyone else implemented in college is a bottleneck. Turns out an LLM, for obvious reasons, is perfectly capable of vomiting a most likely correct implementation of any algorithm ever taught to CS undergrads.

So I’ve been asking for examples of X, Y, and Z. I’ve read the whole explanations, and (touch) typed the code, while refactoring it, to make it suit my style, and more importantly, to understand it. This is of course nothing new. It is exactly what I have been doing since I learned HTML and CSS and looked at other people’s code. It is what I have been doing since I discovered StackOverflow. This is even recommended as a technique to gain understanding in The programmer’s brain by Felienne Hermans. And here is the first way in which having a TUI for an LLM has made me slower: it’s so easy to get most likely functional alternatives of stuff I don’t know, that I can iterate over them, retype them, jump to the web (often Wikipedia or the official docs) to check and double check the new knowledge I’m acquiring, and suddenly it is time to get lunch. Before I had access on demand to a bunch of alternatives I could study, if something wasn’t clear I would just leave an “I don’t get why this works but it does” comment and move on. That seems impossible now, even irresponsible.

Weizenbaum observed people getting hooked on his primitive chat interface in the late 60s, so it shouldn’t be a surprise that I am. The natural language interface is a leap somewhere and does make certain things more convenient. Having feedback by typing “look at line X in file Y and tell me if something is off” is addictive. And if something is off I will fix it! All I can say in my defense here is that no matter how great Claude thinks my questions and ideas are, I don’t see it as my boyfriend or as sentient.

The thing is though, that the James Randi foundation made me a sceptic a long time ago. And once you learn some things by heart you can’t just unlearn them. Since a colleague showed us ChatGPT, I’ve experienced the rage of being bullshitted by an LLM at least once per day in a wide range of topics, including literature and music. Fun fact: if you ask them about stuff you know, you quickly get a sense of their unreliability, and that should be a reminder that you should never trust their output. That’s the other way in which I’m slower now: I’ve become more rigorous when it comes to checking the provided information, and the more I check, the more bullshit I find, and the more bullshit I find, the more I want to check. I now have screenshots and saved sessions of the useless stuff I’ve gotten thrown in my face, just because I know I’ll have to justify why I don’t “vibe” at some point.

In fact, there was one instance at the beginning of the year when I blindly trusted an API transformation provided by Claude, from Apple’s Combine to AsyncSequence. It seemed to work, but the QA wasn’t thorough enough because I believed it should have been equivalent. It wasn’t, and I broke our app for several customers. That definitely made me even more wary. As the dumb ass I am, I like when tools prevent me from releasing broken stuff. I prefer to deliver slower but with fewer defects and a decent understanding, than to deliver fast and have to stay at work fixing a mess while the customers complain. I’ve burned out in the past, fixing dynamically typed stuff past midnight. I refuse to burn out because I release stuff hoping, instead of knowing, that it will work.

The final way in which I’m slower now is that I can refactor endlessly. I often get ideas of how to improve what I just did when I’m finishing it, but time pressure makes me add a to-do note and move to the next thing. Now the illusion of speed encourages me to try immediately. “Show me what this would look like if I extracted the gestures implementation as an extension”, I type, and after Claude produces a plan that seems reasonable, I have to carry it out while fixing whatever seems off. So whatever time I saved by letting the tool generate a prototype of the modification is “lost” anyway by me caring “too much” about code quality. For my own definition of quality.

https://tssm.neocities.org/llms-made-me-an-even-slower-programmer