Bèr ‘berkes’ Kessels

Ignore Rust's target build directories in Deja-Dup

Aug 9, 2024 Updated Aug 9, 2024

I use Deja-Dup to back up my Ubuntu machines. And just like Shimo describes in this blog post, I too want to avoid backing up gigabytes of node_modules directories[1].

But I also want to avoid backing up gigabytes of target directories from building Rust projects with Cargo. I have accumulated some 50+ Rust codebases scattered throughout my project directories. Their combined target directories take up about 9GB at time of writing. And all of that is reproducible, cache-like data. It doesn’t need to be backed up.

So, I adapted Shimo’s crontab to add .deja-dup-ignore to all:

All directories named target
Where the parent directory has a Cargo.toml file

In a crontab:

05 10    *   *   *   find ~/ -type d -name target -exec bash -c 'if [ -f "$(dirname {})/Cargo.toml" ]; then touch "{}/.deja-dup-ignore"; fi' \; 2>&1

The command runs find, which will execute a bash command on every directory called target. This bash command then looks up the parent dir - $(dirname {}) - of this target directory. It will then check if this parent dir has a Cargo.toml file - if [ -f "$(dirname {})/Cargo.toml" ]; - and if so, add a .deja-dup-ignore file in this target dir.
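For readers who prefer to see the two conditions spelled out, here is a minimal Rust sketch of the same rule. It is not what my crontab runs: the function name mark_cargo_targets is made up for this example, it stops at the first I/O error (find would print it and carry on), and unlike find it does not descend into a target directory once it has marked it.

```rust
use std::ffi::OsStr;
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Recursively walks `dir` and creates an empty `.deja-dup-ignore` file in
/// every directory named `target` whose parent contains a `Cargo.toml`.
/// Returns the number of directories that were marked.
/// (Hypothetical helper, only meant to illustrate the rule.)
fn mark_cargo_targets(dir: &Path) -> io::Result<usize> {
    let mut marked = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Only real directories count; symlinks are skipped, like `find -type d`.
        if !path.symlink_metadata()?.is_dir() {
            continue;
        }
        // Condition 1: the directory is called `target`.
        // Condition 2: its parent (`dir`) has a Cargo.toml file.
        let is_cargo_target = path.file_name() == Some(OsStr::new("target"))
            && dir.join("Cargo.toml").is_file();
        if is_cargo_target {
            File::create(path.join(".deja-dup-ignore"))?;
            marked += 1;
        } else {
            marked += mark_cargo_targets(&path)?;
        }
    }
    Ok(marked)
}
```

Calling mark_cargo_targets on a projects directory marks every Cargo build directory below it and leaves other directories named target alone.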

I don’t just want to ignore any target directory. I have at least one legit directory named target (containing a business target). There will be more, there probably are already. Hence the added check for the Cargo.toml file.

This speeds up my backup from over an hour a week to under 20 minutes a week. And it saves me some 20GB in backup space. Almost a quarter of the space of my incremental backups was node_modules and target directories.

[1] Unfortunately a feature request to add some pattern ignore system to Deja-Dup has been ignored for years. Personally, I’d love Deja-Dup to ignore any patterns found in any .gitignore file it encounters. Same as ripgrep and fd and some more modern CLI tools do. For me that’d be the opinionated simplicity that I seek in software. But alas.

https://berk.es/2024/08/10/ignore-rust-target-build-directories-in-deja-dup

The Fediverse never Forgets

Dec 22, 2022 Updated Dec 22, 2022

Deleting content, reliably, from a decentralised network is just not possible. The fediverse (Mastodon, Pleroma, PixelFed, etc.) in theory, never forgets. The internet never forgets either.

When I helped design a decentralized application[1], and later again an Event-Sourced setup, we applied cryptography to solve this: encrypt all data that should be deletable, then throw away the key(s) to “delete” the data. ActivityPub relies very little on cryptography and certainly not as a way to “delete” stuff. The common solution to removing stuff from a distributed network isn’t possible on the Fediverse.

So, any content that has been published should, in theory, be considered out of your hands, with no way to remove it. In that sense, the fediverse adds nothing new. People can (and will) have screenshots, proxies can (and will) keep copies. Archivers have copies, search engines have it indexed, data-collectors have it collected, AI companies have it embedded in their models, and so on.

Deleting content from a centralized service (e.g. a tweet from Twitter) doesn’t guarantee that it is deleted from all those places; at most, it will prevent distribution (forwarding, copying) of stuff that hasn’t been distributed (forwarded, copied) yet. Same with deleting a blog-post on Medium, a repository on GitHub, an image on Instagram, or a comment on Reddit. If your data hit the internet, anyone might have stored a copy. But there’s only one central authority that has the content, and in all other cases everyone else must ask them for that content.

The fediverse, however, amplifies the distribution. Where a centralized service operates on the idea of “at any time, just ask us, and we’ll send you a copy if you have access” (on-demand), the fediverse operates on the idea “we’ll send you a copy because you have access” (push). “Storing copies” is the modus operandi of the fediverse.

In order for your Mastodon-post to reach hundreds of people, it is copied many times[3]. Aside from this being rather inefficient, it means your Right to be forgotten (GDPR) is technically impossible to uphold on the fediverse. For a privacy-touting network, this is a problem. Or at least something to be very aware of. In practice, it is not a very big problem, though.

Mastodon implements ActivityPub rather well. This is what ActivityPub says about deleting (Emphasis mine):

6.4 Delete Activity

The Delete activity is used to delete an already existing object. The side effect of this is that the server MAY replace the object with a Tombstone of the object that will be displayed in activities which reference the deleted object.

For those unfamiliar with the words MAY, MUST etc in RFCs:

MAY This word, or the adjective “OPTIONAL”, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item.

So even those that implement ActivityPub properly and in good faith could omit deleting entirely. Essentially, the ActivityPub standard does not adhere to the GDPR’s right to be forgotten (IANAL).
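To make that concrete, here is a toy model in Rust of a receiving server. This is not Mastodon or ActivityPub code: the Server type, the DeletePolicy names and the in-memory map are all made up for illustration. The point is only that, under the reading of MAY above, the sender cannot tell which of these three branches a recipient runs.

```rust
use std::collections::HashMap;

/// What a receiving server holds for a given object id.
#[derive(Debug, Clone, PartialEq)]
enum Stored {
    Note(String),
    /// Placeholder left behind after a delete.
    Tombstone,
}

/// Hypothetical ways a server might react to an incoming Delete.
#[derive(Debug, Clone, Copy)]
enum DeletePolicy {
    Remove,
    ReplaceWithTombstone,
    KeepAnyway,
}

struct Server {
    policy: DeletePolicy,
    objects: HashMap<String, Stored>,
}

impl Server {
    fn new(policy: DeletePolicy) -> Self {
        Server { policy, objects: HashMap::new() }
    }

    /// A post arrives: the server stores its own copy.
    fn receive_note(&mut self, id: &str, text: &str) {
        self.objects.insert(id.to_string(), Stored::Note(text.to_string()));
    }

    /// A Delete arrives: what happens depends entirely on the receiver.
    fn receive_delete(&mut self, id: &str) {
        match self.policy {
            DeletePolicy::Remove => {
                self.objects.remove(id);
            }
            DeletePolicy::ReplaceWithTombstone => {
                if let Some(object) = self.objects.get_mut(id) {
                    *object = Stored::Tombstone;
                }
            }
            // The copy simply stays where it is.
            DeletePolicy::KeepAnyway => {}
        }
    }

    fn lookup(&self, id: &str) -> Option<&Stored> {
        self.objects.get(id)
    }
}
```

From the outside all three servers accept the same Delete; only a later lookup on each of them would show the difference, and the author of the post has no way to do that lookup everywhere.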

And that leaves out all those that aren’t acting in good faith, or are deliberately careless: those that index the fediverse to mine advertising data, to build indexes for bullying, or those that collect security-sensitive data for phishing or whatnot. Malicious participants will most likely not adhere to the standard anyway. Neither will instances that oppose the GDPR for ideological reasons, or live by the idea that nothing should ever be “(self-)censored”. So even if the standard would enforce deleting, anyone can just choose to not follow the standard.

While this may sound paranoid, it’s not very far-fetched. Especially because on this fediverse, many minorities and privacy-sensitive people congregate; those with a higher risk of being bullied, phished, stalked, tracked and so on. It’s for that reason that Mastodon has automatic deletion built right in!

This “delete your content to increase privacy” is an often heard mantra; it even gets its own feature in Mastodon, the most used software. But this feature promises much more than it can ever deliver.

What happens under the hood, when you delete a post, is that it sends out a Delete request for that post to all servers that this post has reached. This is a best effort action. Your instance cannot know for certain who really received the content: a boost will have distributed it further, as will a reply with a mention, or even a relay passing it along. It tries, though. It tries to follow all the boosts, mentions, relays. But obviously it only knows about those that reported this boost, mention or forwarding successfully. And this ignores an actual issue: if your post was distributed to instance-foo, and that instance then blocks yours, or your instance now blocks them[2], the delete request won’t be delivered. You could see this as someone sending a second email in which the sender asks everyone to please delete the previous embarrassing email: not a great privacy model.

All the servers that then receive that Delete may choose to remove the content (or replace it with a placeholder: a so-called Tombstone). But even those acting in good faith may fail, because a glitch might prevent this: maybe the server is overloaded, maybe it is down, maybe it hits an error, anything, really. In a centralized service, such an event would probably be noticed and re-handled later. In a distributed setup, we cannot ever know if the Delete request was honored. And even if this is handled, it only gets removed from that instance’s database (and potentially from its Elasticsearch cluster), but not from its backups or logs. Your instance does have backups, right? Right?

The same goes for deleting your profile. Or for editing a post. Or editing your profile. Any data, any images, thoughts, biographies, links, names or avatars that you put out there publicly should be considered distributed, and therefore out of your hands: impossible to delete.

On the fediverse, it is best to consider anything that you published, to be out there. Forever.

But that is also where the good news is: you can protect yourself, your data, and your content from “getting out there” in the first place.

You can set a profile to protected. This means only people that you grant access, will be able to read your profile’s details and, more importantly, the content you write. Only your followers will get a copy of your posts. Though any follower (or their server) might still choose to keep the data; so keep that in mind when granting follow-requests.
You can limit the reach of a post. This means only a selected group will get a copy. And they -at least in theory- aren’t able to copy it to other people: boosting should be disabled (by their server). The only threat is now very similar to centralised social media: a receiver taking a screenshot, or copying it from screen.

Content that hasn’t been made public hasn’t been distributed beyond your reach. So even if any of those you sent it to cannot or will not delete it, at least you know who might still have it (and whether or not you trust them to keep it safe). It’s like sending an email to a limited number of people. I know not to include that gossipy aunt. And I know that if any of the recipients uses Gmail, Google might mine my email for advertisers: even though I, the sender, am not using Gmail, Google still gets a copy. But I can control this: if I truly distrust Gmail I could choose to never send emails to anyone on Gmail. It’s in my hands. With Mastodon, you can, if you want, choose to never deliver your content to certain servers, and it’s far easier than it would be for that hypothetical Gmail-avoider.

The “threat” (if it can be called that at all - maybe limitation is a better word?), is at the moment mostly theoretical. And while that might change, it hasn’t yet.

While Mastodon, and, as far as I know, also Pleroma, PixelFed, PeerTube and probably others, do try their best to actually delete your content when you request it to be deleted, you cannot be certain that a recipient is running an (unmodified) Mastodon. Or that they secure their servers, or that their instance won’t be sold to some advertising company. Or that they won’t run into technical problems and fail to follow up on such a delete request.

Right now, a delete request, in practice, will get through and will be followed up. Your deleted content is unlikely to end up in public places. And you have the tools to protect yourself from actors that don’t follow up on deletion requests. Deleting content will, quite certainly, reduce the places where a copy is kept. Maybe even reduce it to zero!

But, as I posted earlier:

In the #fediverse, deleting content is really just a suggestion. Servers can, and will ignore it entirely. Keeping the content that you deleted.

@berkes@bitcoinhackers.org

[1] Blockchain stuff. Defi and all. Obviously never went anywhere.
[2] I’m not entirely sure of this. Reading through the code makes me think that a mastodon instance will actually still deliver delete-requests to instances it blocks. But that seems weird, and it does leak information (though only the fact that a post with id X has ever existed). I might be wrong.
[3] The ActivityPub protocol allows two modes here: the email-concept, where a server will copy it to the final recipient, and so might hold on to a hundred identical copies if a message was delivered to a hundred users on their server. The other is where the server just keeps one version and serves that to each recipient.

https://berk.es/2022/12/23/fediverse-never-forgets

New Project: Fedetivity - Automate your Mastodon Account

Dec 12, 2022 Updated Dec 12, 2022

A few months ago, we shelved Plannel, on which I worked as co-founder for about a year. There was a need, there was a problem, but the actual solution turned out to not be tech (to improve project communication between clients and web/creative agencies), but human (really just a “project/account manager”, not more tools). It’s a hard lesson, but a good one.

After trying to join a few existing startups as CTO or co-founder, I decided to try to fix a problem that I’ve been having for a while. Those are the best problems to solve. And to hopefully turn into a product and business.

The project is Fedetivity. A solution to easily automate your Mastodon account. But first, how I got to this idea.

I’m still working a few hours a week on Flockingbird which wants to become the Fediverse version of “LinkedIn”: a professional social network. Flockingbird, however much I love it, has no viable business-model; at least not one that I see or believe in. So I’ve made it my side-project.

Within Flockingbird, I’ve developed a bot that hunts the fediverse for Job-postings, then indexes those in a job-board. This is search.flockingbird.social, the bot is @hunter2. The bot is flaky and unreliable: it often crashes, loses connection, misses replies. To solve this, I need in-depth knowledge of websockets (in Mastodon), multithreading, federation, ActivityPub, HTML scraping, mastodon APIs, mastodon permission scopes, and so forth. Writing an interactive bot (that doesn’t overload its mastodon instance) turns out to be hard. But I’m solving that for hunter2!

So, why not share it?

And that’s what I’m going to work on: taking the most difficult part of writing bots and other automation for Mastodon or the Fediverse, and turning that into a product. SaaS and Open Source. Hence, Fedetivity.

For those wondering: the fediverse is a name for a set of decentralized, interconnected social media platforms. It’s seen an enormous influx and has been on the rise ever since Elon Musk took over Twitter. But it was already steadily growing after every other scandal by one of the giant social media companies.

In the coming months, I want to ask as many questions as I can of anyone who has a need to automate stuff on their mastodon or fediverse account. Or already does so. I’ve blocked time in my agenda to offer free consultancy on mastodon automation, in the hope that helping people to solve their problems teaches me about those problems.

If you have a bot, want to move to the fediverse, have moved to the fediverse, just want to learn more about the fediverse, need to automate or integrate anything there, just let me know! I can help you with resources, ideas, tools, anything, really. I can even program these automations for you, if there’s a need.

I’m sure I can help, while helping myself. At the very least help you to understand how the fediverse looks towards automation (and not have you immediately banned, or blocked). So, please feel free to book a free consult!

My vision for Fedetivity is that it becomes the If-This-Then-That, CircleCI or Travis of fediverse automation. Where tech folks can write their own automation pipelines and non-techies can build those through a friendly UI. And where anyone has the freedom to run the software on their own, but where my hosted version is free for small users and paid for the bigger ones.

But, first, I’ll be writing software to solve my own immediate problem: a tool that listens to mentions, DMs, and as much of the fediverse’s status-updates as possible. Then filters it, and forwards that to a bot, hunter2, so that this bot can focus on its domain (job postings) rather than waste time on juggling flaky websockets, (de)serializing, authorization, API calls, and other annoying, time-consuming details.

It’s one of those problems where I really wished it was solved already, and I could just pay someone for a hosted product. Even when Flockingbird isn’t making any money, I’d gladly pay to be able to focus more on finding and filtering great job-postings. I think those kinds of problems are the best to solve. If I would pay someone for this product, then surely someone else will? But even if no-one else feels this problem, or no-one else wants to pay for it, I do need to solve it anyway, since I’m stuck on this problem today.

https://berk.es/2022/12/13/new-project-fedetivity---automate-your-mastodon-account

The Fediverse is Inefficient (but that's a good trade-off)

Nov 7, 2022 Updated Nov 7, 2022

Let’s address the mammoth in the room: the fediverse, the network of mastodon servers, is very inefficient.

In this post I’ll show why it is inefficient and why that isn’t a problem.

A great analogy to explain this with is growing food.

Gardening: making my own food, decentralized

I spend a lot of my time in my vegetable garden, our communal orchard, chicken coop, and keeping bees. A lot of my food is produced by myself, decentralized.

Compare this to one farmer who produces thousands of tonnes of potatoes, another one who produces tonnes of tomatoes, another single farmer keeping 500.000 chickens, or a beekeeper with a huge truck with several millions of bees travelling from nectar flow to nectar flow. There really is no comparison. I’ve grown two thirds of my food myself[1], but my tomatoes would’ve cost me roughly €35 per kilo, if I calculated everything including a minimum hourly wage. My honey would cost over €10 a jar if I’d pay for my time. And my neighbors do the same in their garden. It’s terribly inefficient.[2]

But that doesn’t matter, because it’s a trade-off.

I get a lot of good things back. The labor is good. I’ve solved many software bugs while weeding a patch of garden[3]. I’ve come up with some of the best system architectures while going through one of my hives. I’ve had the best fun making jam, drying fruits or conserving harvest. Much better than those guilt-ridden nights wasting hours on another addictive Netflix series.

The incentives are different. I don’t have shareholders demanding more tomatoes, have no loans to pay for the most efficient chicken-coop. I don’t need to out-compete my neighbors. No pressure to sell my produce for the right price. The inefficiency doesn’t matter!

There is connection. By growing my own, I feel connected to the thing that literally keeps me alive: my food.

There is reward. I’m convinced it’s purely psychological, but a self-made sauerkraut, a pasta from my own (carefully selected) tomatoes or fries from these special potatoes just taste so much better than anything wrapped in plastic from a supermarket.

There are different efficiencies. My food isn’t shipped across oceans (in airplanes). There isn’t a single gram of waste in my “supply-chain”: everything ends as compost or chicken food (manure!)[4].

There is resilience. If my potatoes fail, my neighbors’ probably won’t. And even if theirs fail, I still have pumpkins, corn, spinach and much more to cover the loss. When our chickens get the chicken-flu, they won’t all die (we’ve selected them for strength, not optimal-egg-laying-efficiency). And even if they do, it’s “just” 8 chickens dead, not 100.000s. Monoculture is a terrible risk.

With decentralized systems like the Fediverse, like Mastodon, we see very similar trade-offs. We see the same kinds of efficiencies, but also the same kinds of inefficiencies.

Fediverse Technical Inefficiencies

The obvious inefficiency is all the tens of thousands of servers running mastodon. All of them run databases, storage, workers, webservers and so on. A single post with an image may reach 32.1k people and all their contacts. If these people are spread across thousands of servers, that image is now stored on all those thousands of servers. The (meta) data is duplicated across thousands of databases, and thousands of servers spend time validating and processing the post[5].

Any distributed system is inefficient, for one, because it lacks the economy of scale.

And so is the case with the fediverse. The six million users would “easily” fit on a mastodon setup of under thirty (virtual) servers, a very few large PostgreSQL database servers and a single file-server/storage. I know, because I’ve built and grown such Rails systems, with millions of users (on AWS). Certainly not thousands of servers. Definitely not thousands of database-servers.

Even if Mastodon were to be rewritten in Rust, tuned, and changed into a backend that can host thousands of users on a single Raspberry-Pi running on solar power, it still is inefficient. For one, because that backend would be even more efficient when employed in a centralized setup. And secondly because there is a lot of network overhead.

All the servers also need to communicate and distribute the posts, media, profiles and metadata. Over the internet. ActivityPub, the underlying protocol, is very chatty, necessarily. And it does this over HTTP, hence TCP/IP. All of which is chatty and relatively inefficient. And because every server is communicating with every other server, all require extra CPU time, storage, memory and other precious resources just to distribute a post.
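How quickly “every server talks to every other server” adds up can be shown with a few lines of Rust. This is a worst-case back-of-the-envelope sketch, not a measurement: the function name is made up, 10,000 stands in for “tens of thousands” of servers, and 30 is the centralized setup of under thirty servers mentioned above.

```rust
/// Number of distinct server pairs when each of `servers` instances
/// may need to talk to every other one: n * (n - 1) / 2.
/// (Hypothetical helper; worst case, not observed traffic.)
fn server_pairs(servers: u64) -> u64 {
    // n * (n - 1) is always even, so the integer division is exact.
    servers * servers.saturating_sub(1) / 2
}

// server_pairs(30)     == 435
// server_pairs(10_000) == 49_995_000
```

Going from thirty to ten thousand servers multiplies the number of servers by a bit over 300, but the number of possible server-to-server relations by more than a hundred thousand.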

Fediverse Human Inefficiencies

All the mastodon (and Pleroma, Pixelfed, Misskey and such) servers are managed by humans. Some humans will manage several servers, and some servers are managed by several humans. In any case, this causes a lot of duplicated work. Done mostly by volunteers. Unpaid.

All posts, accounts, media are moderated by humans. Moderators must wade through spam, racism, hate, (child)porn and abuse. Daily. And then block it. A single spam-post might -worst case- have to be dealt with by tens-of-thousands of moderators. Each one reviewing it and potentially taking action. There’s, again, a lot of duplicate work[6].

Gardening in the Fediverse

Each server on the fediverse is like a personal vegetable garden, or a neighborhood orchard. Lots of overhead, lots of duplicate work, lots of inefficiencies that could be solved by handing it over to a few huge-scale “farmers” instead. Instead of tuning thousands of databases over and over, just “buy” a single big database from something like AWS RDS: their economy of scale makes it far more efficient than all the small ones combined.

But like with gardening, it doesn’t really matter that it’s inefficient. No shareholders to demand duplicate work to be eliminated. No competition to out-compete. No growth required to keep relevant. As long as two people can communicate over “the fediverse” it’s a full-on success. Nothing else matters.

All the extra work, done by humans, creates an immense feeling of connection. When a participant reports troublesome content, they know the humans who handle the report. Trust them. This human connection is, and has always been, a crucial factor in creating healthy communities.

While moderating a fediverse server is hard work, it’s also rewarding. The same way getting your email-inbox empty is. But also because there’s a direct, positive impact on the humans that signed up on your server. Often many thank an admin in person. People trusted this human with their social network and communication. Whenever a server-admin asks for donations, support starts streaming in: never enough to cover all hours of work and often not even enough to cover the full costs. But people gladly pay a few dollars a month when they feel connected to the ones they’re paying it to. They gladly pay humans for their hard work, yet reluctantly pay a fee for an automated “validation icon”.

A fediverse-like network sees different kinds of efficiencies pop up. People develop many alternative types of software to handle different needs and niches. A furry-fetish group can tune their moderation and setup very differently from a group who wants to build a safe-space-community for victims of cyber-bullying. No central setup could attune itself to all these different and dynamic needs and wants efficiently. Validation of participants becomes so much simpler and banning bullies and hate is a single click away.

The original internet was set up decentralized, with the goal to be resilient to failing parts and to attacks. A lot of this property has been undone by re-centralisation: if AWS has an outage, half the internet goes down. When Twitter is run into the ground by some Billionaire Chaos Monkey, the whole world sees its journalist and government communication break down. A (lifted) ban on Twitter has actual effect on democracy and society.

Re-decentralising is, therefore, very much needed to make our society resilient. We cannot afford to have a single software bug break all airplanes, cars or thermostats. Or to have the communication of a country be dependent on the unpredictable whims of a billionaire. Or to have the education of our kids be governed by advertising algorithms. We cannot afford to be held hostage by monopolies when it comes to our infrastructure, communication and information.

As a European, I clearly see how monoculture is spreading over the world. American monoculture. Renewed prudishness (imagine that a child sees a nipple! gasp!), replaced traditions (we now all have this ridiculous festival of consumerism, called “Black Friday”, without ever having Thanksgiving. We’re almost done replacing our weird Dutch Santa Claus -5th of December- with this American Coca-Cola version).

Plurality is key to evolution. It is what makes us as humans progress. Monoculture enlarges the risks, and halts progress. Both in biology and in tech.

The only alternatives would be either to embrace a communist-like centralized, monopolized governance by entities that are governed only by a few shareholders or billionaires, or to forego all the internet, automation and digitization of the last century and move back to the early 1900s.

Neither is an acceptable alternative for me. So I’ll keep working on my own digital garden. And on my physical garden.

[1] I don’t eat meat, which makes this easier. Though I did learn to butcher a hare this year - roadkill near my home. I ate that: first meat after some 15 years.
[2] There’s a long-standing argument about whether we’d be able to “feed the world” like this. I am convinced, by studies and papers, not just feelings, that we can. But only if we change our consumption patterns. Seasonal crops, local crops, no meat (or it becoming extremely exceptional, like caviar or so), no fish etc.
[3] Which begs another question: should my clients pay me for the hours tending to my bees, for weeding my garden, or just for the hours that I’m actually staring at a screen?
[4] Admitted: if I count the crops eaten by animals, the honey I leave in for the bees, and the fruit I didn’t manage to preserve, I did lose a lot. Almost a third, I guess. It was all eaten, just not by me. And it did contribute to a healthy ecosystem.
[5] Ironically all using electricity and therefore (indirectly) emitting CO2.
[6] There are good tools, shared lists to block, and mechanisms to block large amounts at once. Still, each server needs humans to do this on their particular server.

https://berk.es/2022/11/08/fediverse-inefficiencies

Blog comments on a static site via social networks

Oct 11, 2022 Updated Oct 11, 2022

Most blogs have a comment section below each article. A relic from the days that blogs themselves were the bulk of “social media”. Before Slashdot and Digg, there simply was no place to discuss the article other than in the comments below that article. It’s still a core feature of Drupal, WordPress, and other blog-software.

A static site, however, cannot have comments in the same way. The blog-software generates the HTML, but cannot handle incoming comments or anything dynamic really. One can add a form and/or embed a service like Disqus or isso for that, but aside from privacy concerns, it somewhat defeats the purpose of a static site.

But also, today, discussions are hardly held on the blog itself anymore, but rather on platforms such as Reddit, Hackernews or Lobsters. And obviously Facebook, Twitter, Mastodon and even LinkedIn.

I ditched the common solution to this, Disqus, long ago for privacy concerns. And never even considered an alternative. The signal-to-spam ratio was far too low: for every on-topic comment, I had to moderate tens, sometimes hundreds of nonsense comments. Even with some solid spam-protection in place[1]. At some point, I spent more time moderating comments than writing articles.

I found that the discussions on Hackernews and even Reddit, are far better than what happened in the early days on this blog (back when it was Drupal 3, still) in the comments. So why not leverage that?

Turns out there is discu.eu, a fantastic service that you provide with a URL and that then gives back results from various social networks: places where that URL is discussed. Hackernews, Reddit, and/or Lobsters.

It has an API that I can call with some JavaScript, inserting the results in the page. I’ve added that to this blog. But I’ve also extracted it into its own repository: github.com/berkes/discueu.js. I tried to make it easy to copy[2] into your own static site or blog.

Check out a demo here.

We can then implement this on our site as follows:

It consists of four parts: the Discussion, a DiscussionsRepo, a Renderer and a set of templates in HTML.

The Discussion is the main class: a controller so to say. It gets passed in (dependency injection) a Renderer and a DiscussionsRepo. The idea behind this setup is that it’s easy to test and easy to swap out for other implementations. Maybe you want a Repo that sorts differently, or has additional filters (only return stuff with a ranking over 10, e.g.). Or maybe a renderer that builds its HTML in JavaScript rather than from <template> tags[3].

The DiscussionsRepo gets the URL for discussions passed in, a list of networks to filter by (maybe you only want reddit?) and needs to return either an empty list (no errors, but nothing found), a list with discussion objects, or a null (an error).

The Renderer is then handed this list and builds a DOM from <template> pieces found in the HTML. It renders either a list of discussions found around the web, or the message that nothing was found, or an error.
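The actual code is JavaScript and lives in the repository. Purely to illustrate the structure, here is the same dependency-injection design transliterated to Rust. Everything in this sketch is an assumption made for the example (the trait and method names, the DiscussionLink fields, the FixedRepo stub and the plain-text renderer); it does not call discu.eu and does not mirror the real API.

```rust
/// One place on the web where the URL is being discussed (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct DiscussionLink {
    network: String,
    title: String,
    url: String,
}

/// Looks up discussions for a URL. `None` signals an error,
/// `Some(vec![])` means the lookup worked but found nothing.
trait DiscussionsRepo {
    fn find(&self, url: &str, networks: &[&str]) -> Option<Vec<DiscussionLink>>;
}

/// Turns the repo's answer into something displayable.
trait Renderer {
    fn render(&self, result: Option<&[DiscussionLink]>) -> String;
}

/// The controller: both collaborators are injected, so either can be swapped.
struct Discussion<R: DiscussionsRepo, V: Renderer> {
    repo: R,
    renderer: V,
}

impl<R: DiscussionsRepo, V: Renderer> Discussion<R, V> {
    fn show(&self, url: &str, networks: &[&str]) -> String {
        let found = self.repo.find(url, networks);
        self.renderer.render(found.as_deref())
    }
}

/// Stub repo for tests: returns canned data, filtered by network.
struct FixedRepo(Option<Vec<DiscussionLink>>);

impl DiscussionsRepo for FixedRepo {
    fn find(&self, _url: &str, networks: &[&str]) -> Option<Vec<DiscussionLink>> {
        let all = self.0.clone()?;
        Some(
            all.into_iter()
                .filter(|d| networks.iter().any(|n| *n == d.network.as_str()))
                .collect(),
        )
    }
}

/// Renders plain text instead of building a DOM from <template> tags.
struct TextRenderer;

impl Renderer for TextRenderer {
    fn render(&self, result: Option<&[DiscussionLink]>) -> String {
        match result {
            None => "Could not load discussions.".to_string(),
            Some([]) => "No discussions found.".to_string(),
            Some(links) => links
                .iter()
                .map(|d| format!("{}: {} <{}>", d.network, d.title, d.url))
                .collect::<Vec<_>>()
                .join("\n"),
        }
    }
}
```

Because the controller only knows the two traits, a test can inject the stub repo, and a site can replace the renderer without touching the lookup.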

You will need to register at discu.eu to get a proper API key, but the test key works for testing.

Please let me know at the repo issues or in a PR, if you see any improvements (I do!).

And big thanks to the work done by Alexandru Cojocaru on discu.eu. I’m merely adding a few lines of code to his hard work.

[1] One reason for that is that static site blogs, by nature, have fairly good SEO, so such blogs are a rewarding target to spam.
[2] It’s not a proper library yet, and I doubt I’ll make it one. The use-cases are diverse and the actual code so simple, that abstracting this as a library that can handle most common use-cases, makes it way more complex than needed. Sometimes it’s just best to copy the code from a library into your project, tweak it, and own it.
[3] I’m aware this is somewhat atypical JavaScript, but I like Dependency Injection, and I like composition and encapsulation, which I find cumbersome and hard to achieve in functional JavaScript style.

https://berk.es/2022/10/12/blog-comments-on-a-static-site-via-social-networks

Value Objects in Rust

Sep 27, 2022 Updated Sep 27, 2022

When writing software in an OOP language, Value Objects are an invaluable (pun intended) tool in my toolbelt.

The idea of Value Objects arose from Domain Driven Design, but they are useful in and of themselves, so also outside of DDD.

Examples of Value Objects are things like numbers, dates, monies and strings. Usually, they are small objects which are used quite widely. Their identity is based on their state rather than on their object identity. This way, you can have multiple copies of the same conceptual Value Object. Every $5 note has its own identity (thanks to its serial number), but the cash economy relies on every $5 note having the same value as every other $5 note.

C2 Wiki

Value Objects must adhere to three rules. I like to add two.

It has no identity.
It is immutable.
It is comparable.
It has meaning to the business.
It is guaranteed to be valid according to the business.

Usually, only the first three are mentioned. I always add the last two, to include why we want this. In other words: if there is no need for a business meaning, or there is no validation, a Value Object may be overkill. Sometimes something really is just an integer or string.

In most software, we use Primitives all over the place. Need a URL? Just pass a string. A price? Decimal, or Int (or, heaven forbid: a float). A from- and to date for some report? Send in two Dates. And so on. This is also known as the Primitive Obsession Code-smell.

This causes a lot of issues. Most important issue being: it spreads your business rules all over the place. Or simply leaves you with no business rules at all. And secondly: Meaning is lost. A price as decimal may be unambiguous enough, but a let weight = 3.76; certainly is ambiguous.

Also, primitives aren’t immutable by definition (but in Rust at least they are by default) and while often they can be compared, and lack identity, they most certainly lack any meaning to the business. And we cannot guarantee their validity. They have no place in the Domain model. So they tick some boxes, but not all.

A contrived example of “software” that emails your boss a report of recent requests to a website. It uses primitives: strings, integers, vectors, etc.
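A minimal sketch of what such a primitives-only version could look like. The names PageRequest, make_and_send_report, url, size and response_code follow the text; the integer widths, the naive is_valid_email check and returning the report body instead of really sending mail are assumptions made for illustration.

```rust
// Sketch only: field names follow the article, the types are assumed.
pub struct PageRequest {
    pub url: String,        // any string is accepted, valid URL or not
    pub size: usize,        // kilobytes, but nothing in the type says so
    pub response_code: u16, // presumably an HTTP status code
}

/// The one encapsulated business rule (deliberately naive).
pub fn is_valid_email(address: &str) -> bool {
    address.contains('@')
}

/// Builds the report and "sends" it: two responsibilities in one function.
/// Assumption: returns the report body instead of talking to a mail server.
pub fn make_and_send_report(email: String, requests: Vec<PageRequest>) -> Result<String, String> {
    if !is_valid_email(&email) {
        return Err(format!("{} is not a valid email address", email));
    }
    let mut body = format!("To: {}\n", email);
    for request in &requests {
        body.push_str(&format!(
            "{} {} {}\n",
            request.response_code, request.size, request.url
        ));
    }
    Ok(body)
}
```

Nothing stops a caller from passing a size of 0, a response code of 65000 or a url that is not a URL at all.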

Our API is both PageRequest and make_and_send_report. Aside from the obvious issue that the function has multiple responsibilities, this API leaves a lot to be desired. The users (colleagues, future-me, today-me, contractors etc) need additional information that the code does not convey.

Our business logic of what is a valid email is encapsulated: good! But we must now remember to use that encapsulated rule everywhere we want to do stuff with an email-address[1].
What is the size? Apparently when diving deep into the code, it represents kilobytes.
Can a size be negative? 18 quintillion kilobytes? Zero?
What is the response_code? Maybe a HTTP code? If so, what is zero? Or 65000?

Please note that PageRequest is not a value object here. It has identity (though no explicit ID field), it makes no sense to compare on its content and it would probably be something that comes from a database or other source in reality. It would be either an Aggregate or Entity. But that’s for another discussion.

Value Objects to the rescue

First, a Value Object wraps one, or multiple things. An example of multiple-things would be a 2D coordinate or a “list of things”. My contrived example lacks such Value Objects that wrap multiple primitives, but I’ll touch on this later.

For now, let’s turn the email address into a Value Object:
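A minimal sketch of such an EmailAddress, assuming a String error type and the same deliberately naive "contains an @" rule. The field name value follows the struct literal used further down in the text; everything else is a placeholder.

```rust
use std::fmt;

// Sketch: wraps a single String. The validation rule is an assumption.
#[derive(Debug)]
pub struct EmailAddress {
    value: String,
}

impl EmailAddress {
    /// The constructor is the single place where the business rule lives.
    pub fn new(value: &str) -> Result<Self, String> {
        if value.contains('@') {
            Ok(Self { value: value.to_string() })
        } else {
            Err(format!("{} is not a valid email address", value))
        }
    }
}

impl fmt::Display for EmailAddress {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.value)
    }
}
```

Calling EmailAddress::new("boss@example.com") yields an Ok that can be displayed; EmailAddress::new("invalid-email") yields an Err at runtime.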

A HttpResponseCode and PageSize would follow the same pattern.

We then use them as:

An intermediate version where all this is implemented can be found in this playground.

This pattern is:

We define a simple struct. It wraps the value.
We implement a constructor. In that constructor, we place any business rules. So that we can be sure that any such Value Object created through this constructor is valid, by our business-rules.
When the constructor gets an invalid value, it will return an error in runtime.
We implement the Display trait, so that our users can display it.

This is far from ideal, still. But we already have a few boxes ticked:

It has no identity.
It is immutable.
It has meaning to the business.

Still missing:

It is guaranteed to be valid according to the business.
It is comparable.

For the first, we need to enforce that users go through our constructors. For the second, we can implement the PartialEq trait.

And for the impatient: yes, we will clean up the API and make it ergonomic later on. The code, with its Results, Displays and whatnots has arguably become worse now. But first some other issues to solve.

To prevent someone from simply calling EmailAddress { value: String::from("invalid-email") }, we can place our Value Objects in a module and make only the struct and its constructor public, keeping the field private.
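A sketch of that module layout. The module name value_objects, the as_str getter and the parse_recipient caller are hypothetical; the point is only that the field stays private while the struct and constructor are public.

```rust
mod value_objects {
    pub struct EmailAddress {
        // Private field: code outside this module cannot build the struct
        // with a struct literal, only through `new`.
        value: String,
    }

    impl EmailAddress {
        pub fn new(value: &str) -> Result<Self, String> {
            if value.contains('@') {
                Ok(Self { value: value.to_string() })
            } else {
                Err(format!("{} is not a valid email address", value))
            }
        }

        pub fn as_str(&self) -> &str {
            &self.value
        }
    }
}

pub use value_objects::EmailAddress;

// Outside the module, this line would be rejected by the compiler because
// `value` is private:
// let bad = EmailAddress { value: String::from("invalid-email") };

/// Hypothetical caller: the only way in is the validating constructor.
pub fn parse_recipient(raw: &str) -> Result<EmailAddress, String> {
    EmailAddress::new(raw)
}
```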

Implementing the PartialEq trait is trivial too. Or, even simpler, with derive: Most value-objects will be simple enough for derive.

Which we can then use throughout our code when checking equality. E.g. a business rule that ensures we never ever email john:
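A sketch of the derive variant and such a rule. The function may_receive_report and john's address are made up for illustration.

```rust
// Deriving PartialEq compares all fields, which here is just the address.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmailAddress {
    value: String,
}

impl EmailAddress {
    pub fn new(value: &str) -> Result<Self, String> {
        if value.contains('@') {
            Ok(Self { value: value.to_string() })
        } else {
            Err(format!("{} is not a valid email address", value))
        }
    }
}

/// Hypothetical business rule: john must never receive the report.
pub fn may_receive_report(recipient: &EmailAddress) -> bool {
    let john = EmailAddress::new("john@example.com").expect("hard-coded address is valid");
    *recipient != john
}
```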

Whether or not to use the derive or the impl version depends mostly on the business needs. Typically, Value Objects are simple, so the derive version will almost always be good: our value structs hardly ever will have fields that aren’t relevant to the comparison.

But maybe our business says: “bob+one@example.com and bob+two@example.com are the same”. Or maybe our HttpResponseCode 488 should be the same as 400.
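For a rule like the first one, a hand-written PartialEq could look like the sketch below. The canonical helper and its "drop everything after the first + in the local part" rule are assumptions about what the business means by "the same".

```rust
#[derive(Debug, Clone)]
pub struct EmailAddress {
    value: String,
}

impl EmailAddress {
    pub fn new(value: &str) -> Result<Self, String> {
        if value.contains('@') {
            Ok(Self { value: value.to_string() })
        } else {
            Err(format!("{} is not a valid email address", value))
        }
    }

    /// The address with any "+tag" removed from the local part.
    fn canonical(&self) -> String {
        match self.value.split_once('@') {
            Some((local, domain)) => {
                let base = local.split('+').next().unwrap_or(local);
                format!("{}@{}", base, domain)
            }
            None => self.value.clone(),
        }
    }
}

// Equality follows the business rule, not the raw string.
impl PartialEq for EmailAddress {
    fn eq(&self, other: &Self) -> bool {
        self.canonical() == other.canonical()
    }
}
```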

Now everything is ticked:

It has no identity. - Simply don’t add some id: field.
It is immutable. - Default in Rust
It is comparable. - Derive or implement PartialEq
It has meaning to the business. - Give them proper names
It is guaranteed to be valid according to the business. - Enforce constructors.

But we can improve a lot still. I won’t go through all the improvements that make it more proper rust-ish, but will discuss some improvements that are specific to Value Objects. And one makes them a lot nicer to work with.

Most important, is an issue with our PageSize. It is still unclear in what unit this is. We could rename it to PageSizeBytes, implement some From traits to convert between PageSizeBytes, PageSizeKiloBytes and so on. Or we could improve the API of this object. E.g. a ::new_from_kilobytes() constructor, and then some fancy .as_megabytes(), etc. getters.

When to choose what, depends on the domain and requirements.

We can lean on the type checker for cases where a specific unit is a requirement. E.g. maybe a function must operate on bytes, so binding the function parameter to PageSizeBytes then helps: fn is_outlier(size: PageSizeBytes).

We can use contracts (traits) for cases where we want to be sure that we can read a specific unit from the Value Object, with, say, an as_kilobytes(). In that case, a generic over a trait, fn bw_used<T: DataSize>(size: T), might work best, where trait DataSize enforces an as_kilobytes() interface and the type system ensures that whatever we pass in has this method.

And in cases where all this doesn’t matter, where only the business meaning and validation are of importance, we can bind to a more generic Value Object type. In our example, the HttpResponseCode or EmailAddress.
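The first two options can be sketched as follows. The names is_outlier, bw_used, DataSize, PageSizeBytes and PageSizeKiloBytes come from the text; the tuple-struct layout, the 1000-bytes-per-kilobyte convention and the outlier threshold are assumptions.

```rust
/// Contract: anything that can report its size in kilobytes.
pub trait DataSize {
    fn as_kilobytes(&self) -> f64;
}

pub struct PageSizeBytes(pub u64);
pub struct PageSizeKiloBytes(pub u64);

impl DataSize for PageSizeBytes {
    fn as_kilobytes(&self) -> f64 {
        // Assumption: 1 kilobyte = 1000 bytes.
        self.0 as f64 / 1000.0
    }
}

impl DataSize for PageSizeKiloBytes {
    fn as_kilobytes(&self) -> f64 {
        self.0 as f64
    }
}

/// Bound to one concrete unit: passing kilobytes here is a compile error.
pub fn is_outlier(size: PageSizeBytes) -> bool {
    size.0 > 5_000_000 // hypothetical threshold
}

/// Generic over the contract: any unit works, as long as it can be read
/// as kilobytes.
pub fn bw_used<T: DataSize>(size: T) -> f64 {
    size.as_kilobytes()
}
```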

In the example, I would go for a PageSize, but clarify the unit in the constructor and a reader:
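A sketch of such a PageSize, including a unit test. The constructor and getter names follow the ones mentioned earlier; rejecting a size of zero and using 1000 kilobytes per megabyte are assumptions, not rules from the original code.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageSize {
    kilobytes: usize,
}

impl PageSize {
    /// The unit is part of the constructor name, so callers cannot guess wrong.
    pub fn new_from_kilobytes(kilobytes: usize) -> Result<Self, String> {
        if kilobytes == 0 {
            // Assumed business rule: an empty page is not a valid page.
            return Err(String::from("a page size cannot be zero"));
        }
        Ok(Self { kilobytes })
    }

    pub fn as_kilobytes(&self) -> usize {
        self.kilobytes
    }

    pub fn as_megabytes(&self) -> f64 {
        self.kilobytes as f64 / 1000.0
    }
}

#[cfg(test)]
mod page_size_tests {
    use super::PageSize;

    #[test]
    fn converts_kilobytes_to_megabytes() {
        let size = PageSize::new_from_kilobytes(1500).unwrap();
        assert_eq!(size.as_megabytes(), 1.5);
    }
}
```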

This shows another benefit of Value Objects: they can be unit-tested, and unit-testing them is dead-easy.

In the first version, had I wanted to test anything related to converting kilo-, mega- etc. bytes, I would probably have had to test that the email being sent contains the correct strings. But now, I can unit-test the getters and setters on PageSize, which is the easiest (and fastest) kind of test.

A last improvement is to make the Value Objects easier to use. In the intermediate version, we see that all our Value Objects must get the impl Display so that we can display them. The usize, or String we had before, can easily be displayed. What’s worse: we cannot use the common operators on our Value Objects: they cannot be added, summed, divided, etc. What if we want to add “average page size” to our report? We then need to extract the value. In this we’re lucky, because we already have the as_kilobytes() getters. But another example could be a ViewCount which we may want to sum up. This then needs another getter. More boilerplate code, more calls:

Sometimes this isn’t an issue. Like when our getters are already there and required to disambiguate. Or when operations on the values don’t make sense: what would it even mean to have the average HTTP status code? Or to sum them up? Yet in the new example above, we’d be helped if we could forward any method calls, or operators, to the wrapped value. Deref can be of help:
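A sketch of Deref on two single-value wrappers. ViewCount is named in the text; ReportTitle, the constructors and the summary function are hypothetical.

```rust
use std::ops::Deref;

pub struct ViewCount(usize);
pub struct ReportTitle(String);

impl ViewCount {
    pub fn new(count: usize) -> Self {
        Self(count)
    }
}

impl ReportTitle {
    pub fn new(title: &str) -> Self {
        Self(title.to_string())
    }
}

impl Deref for ViewCount {
    type Target = usize;
    fn deref(&self) -> &usize {
        &self.0
    }
}

impl Deref for ReportTitle {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}

pub fn summary(report_title: ReportTitle, counts: Vec<ViewCount>) -> String {
    // `*vc` derefs a ViewCount into the usize it wraps, so `sum` just works.
    let total: usize = counts.into_iter().map(|vc| *vc).sum();
    // `*report_title` gives the wrapped str, with all of its methods.
    format!("{}: {} views", (*report_title).to_uppercase(), total)
}
```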

We still have to deref any value before using it, but that is rather simple. In the example these are the *vc and the *report_title. This is considered an antipattern though. But, like always, “it depends”. With a very simple value object, deref makes sense, even if it is not 100% semantically correct: deref is meant for custom pointer types, and a simple value object can be considered such a pointer, but not entirely. For value objects that wrap multiple primitives, deref won’t work. And when we add semantics, like the as_kilobytes(), it isn’t needed, and would only add confusion. So use with care and be aware of the downsides. Such as:

Using this pattern gives subtly different semantics from most OO languages with regards to self. Usually it remains a reference to the sub-class, with this pattern it will be the ‘class’ where the method is defined.

I personally don’t use deref that often. Only early on, but I find that when a value object improves and solidifies over time, I almost always remove the deref at some point in favor of semantic getters[2].

So, to sum up:

If the wrapped value makes sense as primitive, and we want to allow any operations or methods to be called directly on it, Deref can be of help.
If the wrapped value is ambiguous, named getters rather than a generic one, allow us to return a Primitive on which we can operate as we want.
If the wrapped value makes no business sense as Primitive (e.g. our status code), we should prohibit getting to this primitive.
If the wrapped value is made up of multiple primitives, operations don’t make sense: getting to the underlying primitives should be prohibited.

The last one needs some extra attention:

Multiple values

Often a Value Object is made up from multiple values. An example would be a Point, coordinates in a 2d plane:
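A minimal sketch of such a Point. The field names lat and lon and their usize type follow the text; the constructor and getters are assumed.

```rust
// Two values that only make sense together, compared by value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Point {
    lat: usize,
    lon: usize,
}

impl Point {
    /// Both coordinates are required: a half-filled Point cannot exist.
    pub fn new(lat: usize, lon: usize) -> Self {
        Self { lat, lon }
    }

    pub fn lat(&self) -> usize {
        self.lat
    }

    pub fn lon(&self) -> usize {
        self.lon
    }
}
```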

This is another great example where Value Objects make sense. We certainly don’t want to pass the two lat: usize and lon: usize around all over the place. Not only is that a code smell, it is prone to errors (you will swap lat and lon somewhere, you will set one but not the other if you make them Optional, etc.).

A use-case that I come across more often, though, is a from: DateTime, to: DateTime. There is some obvious DateRange waiting to be implemented. This DateRange can ensure the from is never later than the to. It can get extra fancy helpers so that a user can ask date_range.intersects(other_range), date_range.includes(date) or date_range.length_in_seconds() or such.
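A sketch of such a DateRange, assuming std::time::SystemTime as a stand-in for DateTime so that no date crate is needed. The helper names follow the text; the inclusive bounds and the String error are assumptions.

```rust
use std::time::SystemTime;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DateRange {
    from: SystemTime,
    to: SystemTime,
}

impl DateRange {
    /// Guarantees that `from` is never later than `to`.
    pub fn new(from: SystemTime, to: SystemTime) -> Result<Self, String> {
        if from > to {
            return Err(String::from("from must not be later than to"));
        }
        Ok(Self { from, to })
    }

    /// Both bounds are treated as inclusive.
    pub fn includes(&self, date: SystemTime) -> bool {
        self.from <= date && date <= self.to
    }

    pub fn intersects(&self, other: &DateRange) -> bool {
        self.from <= other.to && other.from <= self.to
    }

    pub fn length_in_seconds(&self) -> u64 {
        // Cannot fail in practice: the constructor ensures from <= to.
        self.to
            .duration_since(self.from)
            .map(|duration| duration.as_secs())
            .unwrap_or(0)
    }
}
```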

A third value object would be a collection. A list, queue, vector, array, etc. Most collections are fine as primitive. But quite often they lack the business-validation. E.g. a list of “todo tasks” cannot ever contain a Task that is “done”. Or a top-5 Songs cannot ever contain 6 entries. But how to wrap collections ergonomically, is an entire post in itself.

Usage of Libraries

In the first example, way above, we have a url: String. A URL is not a String, just as a date isn’t a string, or a credit-card number isn’t a string. It’s a value with meaning, validation, helpers and so on. My name is Bèr is not a valid URL, yet our program accepted this as URL just fine. When dealing with URLs, you often need to extract a hostname, path, protocol and so on, also lacking in our example.

And rather than writing our own Url Value Object, we can leverage one of the many crates. For example url.

We could use such struct provided by a library directly. Which is often Good Enough. But to limit our coupling, we could wrap it in our own struct; Value Objects are a great place to do this. They double as Anti Corruption Layer in a convenient place.

In addition, wrapping it with our own version allows different business rules. Maybe what url, the crate, deems valid, isn’t valid for our domain. Maybe we can only accept URLs that are https, or only ever for our own example.com hostname. In that case wrapping, and then Derefing, a url in our own value object is simple and makes the API (and errors and such) consistent.

That leaves us with the final implementation of the Value Objects:

You can play around with this here

This leaves a lot to be desired and improved, still. But a lot of implicit errors that were in the first version were fixed. And it clearly shows some of the neat tricks that Rust allows us to employ. Even though Rust isn’t Object Oriented in the traditional sense, doesn’t have objects, we can still use the Value Object Pattern (if you may call it that) to put business rules, -logic and meaning in our Rust programs.

Conclusion

Value Objects are a great tool to bring business-meaning and -rules into our code. They allow us to fix a lot of common code-smells. And they remove ambiguity and often make our code much more readable and therefore much easier to maintain. Value Objects are a great place to add some nifty helpers and converters.

In Rust, even though there are no Objects we can leverage structs, methods, traits, modules and the type checker to get Value Objects that make business sense, are valid within our domain, are ergonomic and require rather little boilerplate.

[1] And, yes, this isn’t any sort of email-validation. It’s an example!
[2] In Ruby, where I use Value Objects a lot too, and which is fully OOP, I find the same goes. I never just blanket forward all calls to the wrapped primitive. Because that leads to coupling with the primitive, which is one of the reasons for using value objects in the first place. Primitives also come with a very large interface (at least in Rust and Ruby they do), many of which don’t make sense in the domain meaning. What is status_code.is_odd() or status_code.len()?

https://berk.es/2022/09/28/value-objects-in-rust

"How do I test X" is almost always answered with "by controlling X"

Sep 18, 2022 Updated Sep 18, 2022

Last week I stumbled upon a StackOverflow answer, where Shepmaster wrote a great quote about software testing:

“How do I test X” is almost always answered with “by controlling X”

This is simple, and may sound trivial. But it has some interesting consequences.

Ability to control X

First, there is the question: “can I control X?”. Because if you cannot, testing it becomes impossible. When X is some external service, or tool, for example a payment provider or email-server, and there is no way to control it, you cannot (and therefore should not) test it.

How to control X

Second is the question “how can I control X”. For once, this can be answered differently than with “it depends”, because we can control X by ensuring that X is ours: by keeping it simple, and by ensuring it relies only on things that we control. That is not easy. But entire books have been written on architectural patterns that allow us to easily control X, so that we can easily test X.

Ease of controlling

So third is the question “how easily can I control X”. The StackOverflow question that Shepmaster was answering was about environment variables (env vars). They are reasonably easy to control in most tests (in most languages and frameworks). But harder when you run tests in parallel because all running tests will re-use the same shared env-vars. If test 1 sets env var “URL” to “http://localhost:3000” and test 2 sets it to “http://example.com”, there will be conflicts. Other difficulties are that a service you are testing may need to be restarted to pick up a change to an environment variable, or that environment variables are enforced or overridden by your OS, CI, or hosting.

Environment variables are harder to control than stuff that we designed to be controlled by us.

An example of such an architectural pattern would be a “config-repository-adapter”. Some adapter that you can swap out. In production, dev and CI it may use the EnvVarConfig, and in test a MemoryConfig. Don’t let the word “adapter” put you off. This works just as well for software that isn’t following some Java EnterpriseAdapterEnvVarConfigFactoryDecorator-“pattern”.

We can easily build MemoryConfig so that we own it, and can easily control it.

In the same thread, Simon Whitehead shows a great example in rust. If you are more familiar with Ruby, an example would be:

class Config
  def get(name)
    raise NotImplementedError
  end
end

class EnvVarConfig < Config
  def get(name)
    ENV[name]
  end
end

class MemoryConfig < Config
  def get(name)
    @config ||= {}
    @config[name]
  end

  def set(name, value)
    @config ||= {}
    @config[name] = value
  end
end

Trivial, right? Yet testability goes way up, because we control X. Accidental benefit, is that we can easily swap out EnvVarConfig for a FileConfig, EncryptedVaultConfig, or CommandlineArgsConfig if we need. Another example of the common statement that easier to test software is better software.
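The same idea as a Rust sketch, with a trait in place of the base class. The consumer base_url and its fallback value are hypothetical; this is not Simon Whitehead's code, only an illustration in the same spirit.

```rust
use std::collections::HashMap;
use std::env;

pub trait Config {
    fn get(&self, name: &str) -> Option<String>;
}

/// Reads from the process environment: shared state we barely control.
pub struct EnvVarConfig;

impl Config for EnvVarConfig {
    fn get(&self, name: &str) -> Option<String> {
        env::var(name).ok()
    }
}

/// Lives in memory: every test can own its private instance.
#[derive(Default)]
pub struct MemoryConfig {
    values: HashMap<String, String>,
}

impl MemoryConfig {
    pub fn set(&mut self, name: &str, value: &str) {
        self.values.insert(name.to_string(), value.to_string());
    }
}

impl Config for MemoryConfig {
    fn get(&self, name: &str) -> Option<String> {
        self.values.get(name).cloned()
    }
}

/// Hypothetical consumer: it only knows the Config contract.
pub fn base_url(config: &dyn Config) -> String {
    config
        .get("URL")
        .unwrap_or_else(|| String::from("http://localhost:3000"))
}
```

Two tests running in parallel can each build their own MemoryConfig, so the conflict over a shared “URL” env var disappears.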

In unit-tests

Because everybody has a different view of what “units” are, I too prefer the term “class test”.

When we test a class X, how do we control the class X?

By sending messages to it (calling methods or functions). Or by passing in stuff that we control.

We cannot control private methods or private state. So we cannot test that. This is nothing new, but somehow too often forgotten.

We often don’t control the dependencies of that class, stuff the class X depends on, and when we cannot control them, we cannot test X. But we can ensure that if class X depends on Y, we control Y. Dependency injection is the most common solution.

To illustrate:

user_repo = MemoryUserRepo.new().insert(username: "berkes", password: "hunter2")
sut = AuthenticationService.new(user_repo: user_repo)
assert_equal(sut.authenticate("berkes", "hunter3").error, "invalid password")

AuthenticationService somewhere calls find(username) on the user_repo it got passed in, then checks the password using ComplexCryptography that is of no concern to the outer world. So all MemoryUserRepo needs is to provide a find that returns a user[1] similar to how an ActiveDirectoryClusterUserRepo would return a user from its 120-server big active-directory-cluster. Yet where it’s really tough (if not impossible) to control that giant cluster, controlling a list in memory is trivial. Hell, it could even be hardcoded in a HardcodedUser.find() method if we only ever need it in this test.

Through dependency injection, when we want to test X, we can control X, because we control all the dependencies of X.
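A Rust sketch of the same test setup. The UserRepo trait, the User struct and the plain string comparison that stands in for the real cryptography are assumptions; only the names AuthenticationService, MemoryUserRepo, find and authenticate come from the text.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub username: String,
    pub password: String,
}

/// The contract AuthenticationService depends on.
pub trait UserRepo {
    fn find(&self, username: &str) -> Option<User>;
}

/// The implementation our tests control: a map in memory.
#[derive(Default)]
pub struct MemoryUserRepo {
    users: HashMap<String, User>,
}

impl MemoryUserRepo {
    pub fn insert(mut self, username: &str, password: &str) -> Self {
        let user = User {
            username: username.to_string(),
            password: password.to_string(),
        };
        self.users.insert(username.to_string(), user);
        self
    }
}

impl UserRepo for MemoryUserRepo {
    fn find(&self, username: &str) -> Option<User> {
        self.users.get(username).cloned()
    }
}

pub struct AuthenticationService<R: UserRepo> {
    user_repo: R,
}

impl<R: UserRepo> AuthenticationService<R> {
    pub fn new(user_repo: R) -> Self {
        Self { user_repo }
    }

    pub fn authenticate(&self, username: &str, password: &str) -> Result<User, String> {
        match self.user_repo.find(username) {
            None => Err(String::from("unknown user")),
            // Plain comparison stands in for the real password check.
            Some(user) if user.password == password => Ok(user),
            Some(_) => Err(String::from("invalid password")),
        }
    }
}
```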

In integration tests

When we test a group of classes within some boundary X, how do we control that boundary X?

First, by ensuring that everything in the boundary stays in that boundary. By ensuring that classes in the group only interact with each other, we can easily control the entire group through its public interface. But when elements (classes) in that boundary depend on external systems like databases, env-vars, servers or worse: stuff in boundary Y, we need to control all those.

This is really another way of saying that tight coupling is bad (for testability).

The solution, like above, is to ensure that everything inside the boundary depends on “stuff outside” through simple, easy and controllable interfaces. E.g. a ports-and-adapters style. Or just simple decorators, services or whatever pattern fits best: as long as we can swap it out when testing, it’s fine. Obviously: the fewer of those we have, the better. So “everything within our boundary” should stay within that boundary as much as possible.

To illustrate:

mail_handler = MemoryMailQueue.new()
payment_server = StripeMock.new()
sut = Reimbursement.new(order_id: 1337, payment_server: payment_server, mail_handler: mail_handler)
assert(sut.call(), "Reimbursement failed")
assert_equal(payment_server.requests.first.body.parse().type, "reimburse")
assert_equal(mail_handler.sent.first.subject, "You are reimbursed for order 1337")

Again simple dependency injection. We control the mail_handler, so we can test the mail_handler. We control the payment_server, so we can test the payment_server.

In end-to-end tests

In order to test the entire application, we must control our entire application.

This is where things get muddy. Because what our users consider “the entire application” almost certainly includes things that we cannot control (in our tests).

For example, we want to ensure that a notification mail is delivered, stripe is called, and what more, when reimbursing.

As a logged-in admin, when a client paid for order 1337, and on the admin-orders page, I hit the reimburse button on order 1337. Then the money should be reimbursed, and a notification mail sent out to the client.

We cannot control Stripe. Nor can we control the mailserver. But we can replace them with services that act nearly the same and that we can control. For email-servers, there is e.g. testmail.app. Many larger mail-delivery services have such features built in, e.g. Sendgrid allows you to check if a mail was delivered by checking that an email was sent through their API. Or Stripe allows you to interact with their API in testing mode.

This is complex, slow and fragile. But that is expected for end-to-end tests. Which is why the testing pyramid puts them at the top: you want least of these. You want to depend least on these. Exactly because Controlling X, when X is the outside world, is tough. And so that makes testing X tough.

But in this case, not clear from the use-case, another important thing to control is the application state. As an admin implies that there somehow are admins, and that we are logged in as one. Hitting the reimburse button on order 1337 implies that order 1337 exists, and is in a state that it can be reimbursed.

When all this is a single database, controlling that isn’t too hard. We could just poke around in that database from our tests and generate the correct records. It becomes harder when this database often changes. It becomes even harder if some of this state lives externally. The admin in this case mightn’t notice that the authorization is done on an external service or that the order was filled from some event-stream, rather than a relational database. So I think that our tests here shouldn’t deal with those details either.

I, therefore, prefer to drive all this “state” through the public UI. There must be some place where we can add admins, or where admins can log in, or where clients can place orders. “Just” walk through all these screens from the test and you should end up with a state where you can start testing the actual feature. It’s a quite extreme form of “only use the public interface when testing”, so use whatever works for you. But, for end-to-end tests, which test the public interface, the only real public interface is, well, the public interface[4].

If we must test the entire application, we must control not just the application, but all its external dependencies. Which is impossible in practice. But we can get closer. The fewer of those we have, the easier it becomes. So if we declare “the database”, or “the single-sign-on authentication service” as not some external application, but as part of the application, we need not control them directly: we can control them through their public interface!

Mocking and stubbing.

Stubbing means that we control X by replacing parts of it with something that we control. But it is subtly different from dependency injection: if we stub e.g. the random() in Math.random(), we replace only that function with one that we control (and that always returns 42, for example). We aren’t really controlling Math, we are really poking around and replacing behaviour at runtime (this won’t work for many statically typed languages, for good reasons too). Yet when we inject the subject-under-test with our own implementation of Math, in which we control what the function random() does, we use dependency injection.[2]

In order to test X, when that depends on Y, we want to completely control Y, not just one detail in Y
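A sketch of the injected variant in Rust. The RandomSource trait, the FixedRandom double and the Dice subject are all hypothetical names; they only illustrate controlling the whole dependency instead of patching one function at runtime.

```rust
/// The whole dependency, expressed as a contract.
pub trait RandomSource {
    fn random(&mut self) -> u32;
}

/// Our own implementation for tests: always returns the same number.
pub struct FixedRandom(pub u32);

impl RandomSource for FixedRandom {
    fn random(&mut self) -> u32 {
        self.0
    }
}

/// Hypothetical subject under test: it gets its randomness injected.
pub struct Dice<R: RandomSource> {
    source: R,
}

impl<R: RandomSource> Dice<R> {
    pub fn new(source: R) -> Self {
        Self { source }
    }

    /// Maps any random number onto 1..=6.
    pub fn roll(&mut self) -> u32 {
        self.source.random() % 6 + 1
    }
}
```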

Coincidentally, this leads to looser coupling.

Mocking means that we inject our own implementations of Y, which is what the whole dependency injection is about. But mocking comes in many flavours. For one, it is often misused as alias for stubbing. More correctly, though, it means creating objects that simulate the behaviour of a real object. Yet to simulate is vague. Again, Shepmaster’s quote can help: A mock is something that the tests control, to replace something that the test would otherwise not control.

The “adapters” mentioned above are therefore very simple mocks. The more complex mocks need to be, the harder it becomes to control them. This implies another issue, though: the more complex mocks need to be, the harder it is to test. Which is a sign that the code we are testing, probably isn’t good.

To turn it around: When a subject under test, X, depends on Y, and you need complex behaviour to simulate Y, then X is depending too much on Y. Either Y should be part of the “unit” (bounded context, module, whatever you name your compartments), or Y and X should be decoupled more.[3]

Coincidentally, this leads to the software design principle of increased cohesion and looser coupling.

Conclusion

The phrase

“How do I test X” is almost always answered with “by controlling X”

is provoking, simple and has some interesting consequences. But I think it’s not entirely complete. I miss the dependencies. My version would therefore be:

“How do I test X” is almost always answered with “by controlling X and everything that X depends on”

[1] Turtles-all-the-way-down, though: this user, obviously, should not depend on a specific database implementation, ideally it would be an immutable, flat, simple struct (e.g. a value-object). But certainly not some model that itself relies on availability of database-servers, event-streams, or other subsystems to fill itself. Or more precise: the adapter that our tests control should return such a simple version. If the “actual” adapter returns some complex, wired-up object: fine. As long as our AuthenticationService does not rely on all that complexity and wiring, we are fine: we reward ourselves with loose-coupling by being lazy!
[2] A stub, however, can be a mock. I know… I use stub here in the sense it is commonly used in testing: to replace a single method. Not to replace a module, class or subsystem with one that we control: that would be mocking.
[3] Which is why I, and many people with me, often say we dislike mocking. I like dependency injection. And the things I inject can be mocks, but I prefer them to be the real thing. Yet when the real dependency cannot be controlled easily, then that is a sign of trouble. In other words: if you need to mock, you probably have a deeper problem. Maybe that cannot be solved, in which case a mock is a band-aid. And when you consider “test adapters” as mocks, then sure, mocking is proper. But only in certain layers and use-cases.
[4] Or I sincerely do hope that clients in your e-commerce system cannot place orders by writing records to your database directly… There are many downsides to this setup. But also many upsides. A topic for another post.

https://berk.es/2022/09/19/test-x-by-controlling-x

Using a Framework will harm the maintenance of your software

Sep 5, 2022 Updated Sep 5, 2022

In this article I’m putting together my quotes, thoughts and notes on the idea that Frameworks harm the maintainability of the software you build in that framework. I’m proposing that Frameworks:

are harming maintainability, but not deliberately.
have different goals than you or your team.
make trade-offs that harm maintainability of the projects built in them.
are designed to take your project hostage.
offer some of their benefits, and don’t harm maintainability, when used in a decoupled fashion.

What is a Framework and what isn’t.

In this article when talking about a framework, I mean a narrow definition. Not just any third party code used, and not just a methodology or architecture:

[…] a software framework is an abstraction in which software, providing generic functionality, can be selectively changed by additional user-written code, thus providing application-specific software. [..] Frameworks have key distinguishing features that separate them from normal libraries:

inversion of control: In a framework, unlike in libraries or in standard user applications, the overall program’s flow of control is not dictated by the caller, but by the framework. This is usually achieved with the Template Method Pattern.

[…]

extensibility: A user can extend the framework – usually by selective overriding – or programmers can add specialized user code to provide specific functionality. […]

non-modifiable framework code: The framework code, in general, is not supposed to be modified, while accepting user-implemented extensions. In other words, users can extend the framework, but cannot modify its code.

Emphasis mine, from wikipedia

Frameworks, by definition then, offer functionality, behaviour, flow and defaults, all of which are built in, and some of which are unchangeable or dictated. They allow a user to add code, rather than change the external code.

The maintenance-problems that frameworks can introduce, apply to all software-frameworks, but my experience is limited to frameworks for web services (API, backends, full-stack), commandline and GUI. The examples limit to web-frameworks only. Because in 2022, more and more software is on the web, or moving there.

People use frameworks, because it is supposed to make software-development more standardised, faster, easier, more secure, better scaleable, more consistent or more fun. Ironically, the Wikipedia lemma doesn’t provide any benefits for using Frameworks, only downsides.

The idea behind the Standardisation is that developers are forced to work in a predefined way. That organisation of code is enforced and that APIs and logic become recognisable across projects that use the same framework. Yet the only scientific-ish proof I found hints at the opposite. The State Of the DevOps report indicates that technology, such as the use of frameworks, matters little for success. And enforcing those, even less so.

Companies that have..

A team that defines the standards, processes, practices, frameworks or architectures that other teams must follow.

…are amongst the lowest performers. Reversed: companies that lack this, amongst the high performers.

In other words: enforced standardising the tech, doesn’t pay off.

This makes sense: if everyone in a company is forced to use, say, Django, for any project, regardless, there will be a lot of projects where Django is a very poor choice.

That still allows a framework to offer other benefits within a project or within a team though. But the standardisation (and consistency) benefit seems to not exist, and even be opposite.

The attributes of speed (of development), more fun and ease very much depend on what phase a project is in. A framework tool that generates the code for the models saves you writing initial code[1]. It might save half an hour. But on the scale of seven people developing that software for over a decade, that half an hour is insignificant. Especially because over such a long period of time maybe hundreds of such models can be generated, yet all the other tens of thousands of hours are spent modifying and maintaining existing code. Below I’ll show how this “speed of development” actually counters the speed of development over longer periods of time: harming maintenance.

And the attributes of security and performance are very context-dependent. Frameworks, by definition, add a lot of code to a project. Which at best isn’t in the way, but at worst offers a very big attack surface and a giant amount of overhead. Below I will show how these attributes can be achieved, easier even, by not using a framework.

What is “harming maintenance.”

When we launch our software, and it becomes a success, in the sense that it is used, we will maintain it (or we should). Maintenance is typically categorized:

Corrective Software Maintenance - AKA bugfixes
Preventative Software Maintenance - AKA preventing the bugs, stability improvements
Perfective Software Maintenance - AKA finishing touch.
Adaptive Software Maintenance - AKA continuous development.

In this article, though, I consider any change to software after it is put to use to be maintenance.

Anything that slows down such continued development over any period of time, I consider harming it. So if the use of a framework slows down shipping of new features today, it is causing harm.

And when the use of a framework allows shipping features fast early on, at the cost of slowing down the shipping of new features or changes later on, that is harming maintenance.

A third type of harm is when the framework diverts resources into work that has nothing to do with delivering value to your customers. Work like upgrading, deprecation, education and information ingestion (learning about new features, e.g.). Those take away valuable (and often scarce) resources. All the hours you spend upgrading your stack are not spent delivering new features that users or market want.

And a last type of harm, is when a framework that was once a good fit for a project, is no longer a good fit. Because either the framework moved in a different direction, or because the software built in that framework moved in a direction that no longer fits the ideas or ideal use-case of the framework.

Frameworks have different goals than you or your team.

And yet despite the huge commitment you’ve made to the framework, the framework has made no reciprocal commitment to you at all. That framework is free to evolve in any direction that pleases its author. When it does, you realize that you are simply going to have to follow along like a faithful puppy.

Here’s what DHH apparently has to say about your concerns over the direction he is taking his framework. Most framework authors aren’t that hostile, though. They honestly care about their users, I’m certain. DHH probably does care about how happy you are when using Rails too, even when he doesn’t express that as clearly. But even then, all these authors care more about onboarding people and keeping people on board than about whether you can still continue to deliver value in fifteen, twenty years.

When looking at the marketing of a series of popular web frameworks, we see that of Django, Rails, Spring, Gatsby, and Symfony, only the last mentions maintenance or maintainability (emphasis mine):

Speed up the creation and maintenance of your PHP web applications. End repetitive coding tasks and enjoy the power of controlling your code.

Though how they provide this, is handwavy:

and the use of Best Practices guarantees the stability, maintainability and upgrade-ability of the applications you develop.

Slightly less provocative than DHH above is the more official stance of Rails on how it supports you over a longer period of time:

When a release series is no longer supported, it’s your own responsibility to deal with bugs and security issues. We may provide backports of the fixes and publish them to git, however there will be no new versions released. If you are not comfortable maintaining your own versions, you should upgrade to a supported version. – https://rubyonrails.org/maintenance

At least it’s honest and clear: the framework won’t support you over time. You’ll have to divert resources into updating and migrating your project to keep it on recent versions of Rails.

But even when the frameworks’ goals and yours perfectly align, you cannot predict the future. Especially at the start of a project, this future is unsure. Will the product always be a web-app? Are we certain we will only ever release this application for Windows desktops? Do we know for sure that a relational-database is the best storage in future? Do we need that scalability? Are we certain we don’t need that scalability? Will there be javascript PWAs in ten years? Twenty?

Yet, when building your product in a framework, you choose, very early on, to marry it. Forever together. At a moment that you have the least information to make that commitment, you are making it.

Frameworks make trade-offs that harm maintainability of the projects built in them.

Like with any software, the authors of frameworks must make trade-offs. For example, reading the taglines on their websites, we can clearly see that all popular frameworks value speed of development and scalability most of all.

Yet, both those traits are perpendicular to maintainability. At worst they hamper maintainability, at best they don’t improve it.

Speed of development is sometimes achieved through code generation (boilerplate), but more often through inheritance. When the framework generates code for you, it creates the code, but doesn’t maintain it. E.g. tools like react-boilerplate or create-react-app work this way. They are really “just” fancy code generators. They produce code that must be maintained, or else it will degrade: it will accumulate duplication, inconsistencies, incompatibilities, etc. So-called “code rot”.

The other method is where the framework solves this code-rot problem. By sticking all that code in superclasses (or reusable functions), it offers such “code” in a single, often logical place instead of spread out. As a user (a developer using the framework), you inherit a class, or mix in a class, module, or function. The framework is injected and through this injection offers an API for you to use.

E.g. with Rails, the default single inheritance for what is considered “A model”, adds a mind-boggling public surface to your objects. E.g. a Post, having three fields in the database:

class Post < ActiveRecord::Base; end

This gives you no less than 767 public class methods and 487 public instance methods: you inherit over 1200 methods by subclassing[2]!
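To make the mechanism concrete without needing Rails installed, here is a minimal plain-Ruby sketch. `FrameworkBase` and its methods are hypothetical stand-ins for a framework superclass, not ActiveRecord's API; the point is only that a class with an empty body can still expose a public surface it never wrote.

```ruby
# Hypothetical stand-in for a framework superclass; this is NOT ActiveRecord.
class FrameworkBase
  def save; end
  def destroy; end
  def reload; end

  def self.find(_id); end
  def self.where(_conditions); end
end

# Our model: we wrote zero methods ourselves.
class Post < FrameworkBase; end

# Methods defined directly in Post's own class body (none).
OWN_INSTANCE_METHODS = Post.public_instance_methods(false)

# Public surface Post exposes beyond what every plain Ruby object or class has.
INHERITED_INSTANCE_METHODS = Post.public_instance_methods - Object.public_instance_methods
INHERITED_CLASS_METHODS = Post.public_methods - Object.public_methods

puts "written by us:    #{OWN_INSTANCE_METHODS.size}"
puts "instance methods: #{INHERITED_INSTANCE_METHODS.sort.inspect}"
puts "class methods:    #{INHERITED_CLASS_METHODS.sort.inspect}"
```

In a Rails console, the same kind of subtraction against a real model is one way to arrive at counts like the ones above; the exact numbers will differ per Rails version and loaded gems.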

Because your projects’ Post class now provides these, you are responsible for maintaining them. After all: your class offers them to its users. These methods live on your class, your instances.

But they live deep down in the code of the framework. You are responsible, but lack the ability or power to maintain them. It’s in the definition of what makes it a framework that you cannot change this, cannot own it.

The framework might decide at some point that a method like title_came_from_user? is deprecated or changed. We now offer a large public interface by mixing the framework into our project; we will use that throughout our code but hold no power over it. We are at the mercy of updates, backwards-compatibility guarantees, or the goodwill and availability of the framework authors. Some of whom may dismiss your worries with an R-rated slide. Most of whom, though, are more friendly but lack the resources to keep all this API stable forever. A framework such as Drupal (if one may call it a framework) comes with an upgrade that is so immense that in practice it means a complete rewrite of the project. Every few years! Others are friendlier and try to remain backwards compatible, or offer much smaller upgrade steps. But all, each and every one, have updates. That we must follow. On which we must act. Which occasionally require us to change existing code.

Many frameworks aren’t as extreme as Rails with its public interface of over 1200 methods. But all offer an API, functions, classes, to be used by the user of the framework: it’s the whole point of the framework to offer this!

We will use this code. We will then, over time, couple our code ever more tightly to the framework. Up to a point where it is completely dependent on that framework.

This is also why people generally say that one develops in a framework, not with a framework. You build your project in the framework.

And the performance or scalability they guarantee is performance compared to other similar frameworks. Software is made more performant and more scalable through architectural choices, low-level optimizations and, above all, less code. Not more. And it is somewhat of a false-flag operation: frameworks have often been in the news because they were the source of visible performance issues for projects built in them. Rails became known for its bad performance through Twitter’s Fail Whale, and then through Twitter announcing it rewrote its Rails codebase in Java. It’s a statement to divert from the fact that most frameworks add significant performance overhead to a project.

All of the common solutions to scaling and performance issues (architectural choices, low-level optimizations, and “less code”) require the freedom to change our code when we find we have performance issues. These are choices and optimizations we can only make later on, when we have information. If anything, frameworks harm our scalability, because they make it harder to move to other frameworks, architectures or set-ups that fit our usage profiles better. When you get Fail Whales, you want to optimize problematic code, not rewrite everything in Java (or Rust, it’s 2022 after all).

Frameworks are designed to take your project hostage.

By mixing in framework code, and by using it, a project becomes tightly bound to that framework. Every time we write a belongs_to(:author) in Rails or a models.ForeignKey("Band") in Django we are binding our project more tightly to the framework.

Good, maintainable binding is when we have a tiny surface where we bind our domain to the framework. Bad, hard-to-maintain binding is when this surface is big, fuzzy or completely absent. Even worse, even harder to maintain, is when our domain and business logic get intermixed with framework code. When high-level business concepts mix up with low-level delivery mechanisms. When business logic gets spread throughout those delivery mechanisms and we must read through controllers, views, models, factories, services, configuration files, libraries, framework code, just to understand why a User may be created in case A, but not in case B.

Frameworks abstract away many of the technical details. They most often provide an ORM that abstracts away how we deal with a database, sometimes to a point that developers don’t have to know they are using a database at all. Just call a model.save or a User.find_by(email: "example.com") and it will save or fetch data regardless of whether it lives in PostgreSQL, sqlite or even MongoDB. The trade-off here is that you now don’t bind to a specific database, but bind to the ORM and Framework. You get freedom to use any database, at the cost of the freedom to use another ORM and framework.

Delivery mechanisms like HTTP, storage (like databases), event buses, logging, messaging: all of these are details. They are irrelevant to your business logic, your domain.

The architecture of an accounting app should scream “accounting” not Spring & Hibernate. @unclebobmartin

But frameworks don’t just put themselves up front, they encourage mixing up this logic. They offer their API, classes, functions for us to use throughout our business-logic. So not only is our code then tightly coupled to framework code, it gets mixed up. Even worse, they often encourage us to spread this business-logic all through these “details”. In a web-MVC the M is the storage, the V the template and the C the http layer. There is no single, logical place to keep your domain code: the framework actively encourages you to just drop it wherever it happens to be easiest. Rather than where it will keep your code most maintainable.

It is not uncommon for a project in a framework to look something like:

def create
  @user = User.new(user_params)

  if User.exists?(email: params[:email])
    # Business rule (duplicate check) living in the HTTP layer
    render :new, status: :conflict
  elsif @user.save
    flash[:success] = flash_message_for(@user, :successfully_created)
    redirect_to edit_admin_user_path(@user)
  else
    render :new, status: :unprocessable_entity
  end
end

def user_params
  params.require(:user).permit(permitted_user_attributes |
                               [:use_billing,
                                role_ids: [],
                                ship_address_attributes: permitted_address_attributes,
                                bill_address_attributes: permitted_address_attributes])
end

When we read closely through this, we experience a rollercoaster ride. There is hardly any cohesion, and we jump from domain logic, via framework APIs, to delivery-mechanism details, to security details, to business logic and back. Up and down the stack. There is quite some business logic in what should be purely an HTTP layer.

In a clean, or screaming, or hexagonal, or any layered architecture, we separate those concerns and avoid mixing them up, while keeping the business-logic contained to a single place.
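As a rough sketch of what that separation could look like, the duplicate-email rule can live in a plain-Ruby use case while the HTTP layer only translates outcomes into status codes. Every name here (`RegisterUser`, `InMemoryUsers`, `handle_create`, the result symbols) is hypothetical and not part of Rails; it is an illustration under those assumptions, not a prescribed structure.

```ruby
# Hypothetical domain use case: no HTTP, no params hash, no rendering.
class RegisterUser
  def initialize(users)
    @users = users # any object responding to exists? and add
  end

  # Returns a symbol in domain language; the caller decides how to present it.
  def call(email:)
    return :invalid if email.to_s.strip.empty?
    return :already_registered if @users.exists?(email)

    @users.add(email)
    :registered
  end
end

# Hypothetical in-memory storage adapter, just enough to run the example.
class InMemoryUsers
  def initialize
    @emails = []
  end

  def exists?(email)
    @emails.include?(email)
  end

  def add(email)
    @emails << email
  end
end

# The HTTP layer only maps domain outcomes to status codes.
HTTP_STATUS = {
  registered: 201,
  already_registered: 409,
  invalid: 422
}.freeze

def handle_create(use_case, params)
  HTTP_STATUS.fetch(use_case.call(email: params[:email]))
end
```

The business rule can now be read, tested and changed without touching anything that knows about requests or responses.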

Frameworks don’t play an important role there; that’s the whole point of a self-contained, isolated domain area (or layer). Such domain code doesn’t rely on details such as how to deserialize JSON, HTTP headers, database transactions, connection pools, etc. Such a domain merely cares about its domain language: it calls an abstract posts_repository.create(post).

Such systems are far more maintainable, because the role of all code is clear. It is isolated and has great cohesion. If ever you change anything about how Posts are stored in the repository (you move away from that MongoDB, and choose to just write out markdown files on disk), this change is isolated to the PostsRepository and that alone. Nothing in your business logic needs to be touched.
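A minimal sketch of such a repository boundary, assuming two interchangeable adapters. The class names, the `find` method and the markdown file layout are my own illustration and not taken from any library; only `create(post)` comes from the text above.

```ruby
require "tmpdir"

Post = Struct.new(:slug, :title, :body)

# Adapter 1: keeps posts in memory (handy in tests).
class InMemoryPostsRepository
  def initialize
    @posts = {}
  end

  def create(post)
    @posts[post.slug] = post
  end

  def find(slug)
    @posts[slug]
  end
end

# Adapter 2: writes each post as a markdown file on disk.
class MarkdownPostsRepository
  def initialize(dir)
    @dir = dir
  end

  def create(post)
    File.write(path_for(post.slug), "# #{post.title}\n\n#{post.body}\n")
    post
  end

  def find(slug)
    path = path_for(slug)
    return nil unless File.exist?(path)

    title_line, _blank, *body = File.read(path).lines(chomp: true)
    Post.new(slug, title_line.sub(/\A# /, ""), body.join("\n"))
  end

  private

  def path_for(slug)
    File.join(@dir, "#{slug}.md")
  end
end

# Domain code: knows only that a repository can create a post.
def publish(posts_repository, slug, title, body)
  posts_repository.create(Post.new(slug, title, body))
end
```

Swapping storage means passing a different repository into `publish`; the domain code itself stays untouched.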

When you move such details to the side, into isolated, bound layers only, the software becomes more maintainable, because changes are isolated. And with such an architecture, a framework, if used at all, is moved aside as well. It lives in the fringes of your application. It can be easily replaced one piece at a time.

Frameworks offer some of their benefits, and don’t harm maintainability, when used in a decoupled fashion.

Many people will argue that not using a framework means writing everything yourself. This is a false dichotomy. We can use libraries and frameworks just fine. We should reuse code. We should avoid writing the same logic over and over. We should depend on (security) experts for security-critical code. We should not write our own cryptography or password handling if it can be avoided. We should use off-the-shelf libraries for this.

But we should give them a clear and well-isolated place in our project. The code that routes HTTP paths to method calls (or commands, or whatever your domain offers at its boundary) lives aside, in an HTTP layer; it’s a detail. And it certainly never deals with business logic. The more isolation, the more maintainable. The code that handles e.g. token authentication should not be written by us, but be included in a single, well-contained, bounded area. One that encapsulates this and translates it into domain language, preferably. E.g. behind an authentication.is_known_as_admin(request.token) rather than sprinkled throughout our controllers, command-line interfaces, scripts, or async jobs.
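A small sketch of such a bounded area, under the assumption that the real token handling is delegated to an off-the-shelf library. Here a plain Hash stands in for that library, and the `Authentication` class and `delete_project` function are hypothetical.

```ruby
# Hypothetical boundary around token authentication. In a real project the
# lookup would delegate to an off-the-shelf library; a Hash stands in here.
class Authentication
  def initialize(token_roles)
    @token_roles = token_roles # e.g. { "some-token" => :admin }
  end

  # Domain-language question; callers never see how tokens are verified.
  def is_known_as_admin(token)
    @token_roles[token] == :admin
  end
end

# Any entry point (controller, CLI, async job) asks the same single question.
def delete_project(authentication, token)
  return :forbidden unless authentication.is_known_as_admin(token)

  :deleted
end
```

If the token mechanism changes, only the inside of `Authentication` changes; every caller keeps asking the same question.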

The code that sends out messages is called by your domain as a simple messenger.deliver(recipient, body). Behind that method call might be an entire message-delivery framework, one that does exponential retries, buffering, smart routing, can handle both push notifications and emails, etc.

The code that persists the Expenses, is called by your domain as a simple expenses_repository.add(expense). And might use the most complex distributed-database framework on earth. Or it might use a fancy framework to push expenses into an online accounting tool.

The point is not to never use frameworks, but to isolate them. To call them from a single place. One that we own. That we are responsible for and that we limit very much in what it can touch.

Yet most frameworks come with all those details up-front and mixed up. They often make it very hard, if not impossible, to be isolated. To be contained to a single interface. And when they do, they stop being a framework and start becoming a library very fast.

But why are there no frameworks that offer this?

First, that would defeat the purpose. Because the idea is to have independence of a framework. And building a framework with the sole purpose of not using that framework is rather counterproductive.

Secondly, well-maintainable software can evolve over time, to match changing needs.

When you migrate from HTTP to an event bus as delivery mechanism, you no longer have need for an HTTP framework, obviously. When you move from a web-based service to one using only native mobile apps, you no longer need all the HTML/CSS/asset stuff, but do need a way to serialize and handle JSON requests. Maintainability requires your software to evolve with you. An HTTP framework will offer HTTP. Some MVC framework will offer an ORM that uses a relational database. But when your needs change and you no longer need all the HTTP, or templating, it is still there. When you decide it fits the business best not to store data in a database, but rather to distribute it as JSON files, an ORM framework will become obsolete. Yet it is still there.

Third, one often hardly needs a framework to do stuff. For example, an architecture like CQRS is really mostly a simple if(is_command) { command(params) } else { query(params) }. You don’t need a framework to do an if/else.
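Spelled out in Ruby, that if/else could look like the sketch below. The `Expenses` class and the message shape (`is_command`, `params`) are hypothetical, and this shows only the routing idea, not a complete CQRS implementation with separate read and write models.

```ruby
# Hypothetical handlers: commands change state, queries only read it.
class Expenses
  def initialize
    @amounts = []
  end

  def command(params)
    @amounts << params.fetch(:amount)
    :ok
  end

  def query(_params)
    @amounts.sum
  end
end

# The whole "framework": route each message to the write side or the read side.
def dispatch(expenses, message)
  if message[:is_command]
    expenses.command(message[:params])
  else
    expenses.query(message[:params])
  end
end
```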

And last, maintenance isn’t about using specific tools or frameworks. As Symfony aptly pointed out:

and the use of Best Practices guarantees the stability, maintainability and upgrade-ability of the applications you develop.

One of those “Best Practices” is to not let a framework rule your project!

[1] Something that is probably much better left to either standalone tools, IDEs, or IDEs leveraging those standalone tools. E.g. any editor or IDE worth its disk-space has some form of “snippets” or “templates” in 2022.
[2] Ruby, the language Rails is written in, comes with quite a lot of methods already. But even if you remove those from this list, you have over a thousand methods offered by Rails.

https://berk.es/2022/09/06/frameworks-harm-maintenance

Exponential compound interest on Technical Debt. And how I avoided it.

Aug 29, 2022 Updated Aug 29, 2022

In this post I’ll try to explain why not all technical debt is crippling. And what I learned, dealing with technical debt. I’ll try to show that:

Technical Debt isn’t always bad.
It is bad, dangerous even, when it starts to compound.
The interest must be calculated into operations.
We must deliberately and actively choose when and where to take on technical debt.
One trick to take on technical debt but keep the interest low, is to decouple our code.

During my career, I’ve worked on many codebases. One thing that differed a lot between projects is the effect of “technical debt”. A few projects were completely crippled by it: “paying the debt” took the entire company’s engineering “budget”, leaving it hardly able to react to the market, to deliver feature requests, to keep users happy. All effort “wasted” on just keeping the software barely running.

Yet other projects, often much larger than the ones above, or with far smaller teams, managed to keep delivering for years, decades even. They managed to maintain a high velocity. Continuously.

I’ve been pondering why that is, for years. Until I read Debt: The First 5,000 Years by David Graeber. And until I started to seriously research companies (stocks) to invest in. I then realized that in finance, not all debts are equal. That having big debts is, in itself, not bad. But that the types of debt are what make them dangerous. And I learned to appreciate the effect of compounding interest.

3% interest on debt: Simple vs Compound

When we talk about Technical Debt we usually don’t categorize it. After all, the term was coined as just an analogy, not something scientific:

Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite… The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise. — Ward Cunningham, 1992

But the idea isn’t new. In 1980 already, Computer Scientist Meir Manny Lehman saw a clear cause and effect. And stressed that it is important we continually keep paying off those debts:

As an evolving program is continually changed, its complexity, reflecting deteriorating structure, increases unless work is done to maintain or reduce it. — Meir Manny Lehman, 1980

Technical debt will cause our project to deteriorate over time. The accumulation is what makes it such a powerful and dangerous concept. As DHH and Jason Fried point out: it gets worse over time, and you’ll often only notice that when it is too late:

Promises pile up like debt, and they accrue interest, too. The longer you wait to fulfill them, the more they cost to pay off and the worse the regret. When it’s time to do the work, you realize just how expensive that really was. — Jason Fried, David Heinemeier Hansson in It Doesn’t Have to Be Crazy at Work

When we don’t pay back the interest, over time it will not just accumulate, it will compound. Compounding interest means that if we don’t pay off interest, that interest itself becomes debt too. Which increases the debt, which increases the interest. Ad infinitum, exponentially.

Consider the graph above, in which I show a debt of $1000 with 3% interest. One (blue) line is where interest is paid off. The other is where interest isn’t paid off and is added to the debt. After 100 periods, one has paid a total of $3000 in interest. The other now has a debt of over $19,000.
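The two lines in such a graph follow from two small formulas. The sketch below assumes interest is charged once per period at a fixed rate; the method names are mine, and the printed periods are just sample points.

```ruby
PRINCIPAL = 1000.0
RATE = 0.03

# Total interest paid when each period's interest is paid off right away,
# so the debt itself stays at the principal (simple interest).
def simple_interest_paid(principal, rate, periods)
  principal * rate * periods
end

# Outstanding debt when unpaid interest is added to the debt every period.
def compounded_debt(principal, rate, periods)
  principal * (1 + rate)**periods
end

[1, 10, 50, 100].each do |periods|
  paid = simple_interest_paid(PRINCIPAL, RATE, periods)
  owed = compounded_debt(PRINCIPAL, RATE, periods)
  puts format("%3d periods: %9.2f interest paid vs %9.2f owed", periods, paid, owed)
end
```

The first grows by the same amount every period; the second grows by a percentage of an ever larger debt, which is what makes it exponential.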

Compound interest is the eighth wonder of the world. He who understands it, earns it … he who doesn’t … pays it. — Albert Einstein

Yet, many companies take on debt. Many of the highest-valued companies in the world have large debt/equity ratios: not just large debts, but large debts compared to their equity. E.g. Apple has a debt ratio of over 100%: it has more debt than equity. And Apple has a mind-boggling lot of equity, and therefore a mind-boggling lot of debt.

Screenshot showing the amount of debt that Apple has (screenshot from SeekingAlpha)

Like with such financial debts, I have experienced projects that took on technical debt in software, safely. Without grinding to a halt or going bankrupt, without turning into a Phoenix project.

When is technical debt safe? Where is it safe? For that, we must first take a detour, to get a common misconception out of the way: that we, engineers, programmers, aren’t the ones responsible for the debt.

David Graeber points out in his provocative book that financial debt isn’t equally distributed. Or, more precisely: who has to pay it back isn’t equally distributed:

As it turns out, we don’t “all” have to pay our debts. Only some of us do. ― David Graeber, Debt: The First 5,000 Years

The typical story we engineers like to tell ourselves is that “management forces us” to take the shortcuts[1]. That “management” is taking on the debts and that “we” then have to pay them back. I’m convinced this is a misconception. A story “we” tell ourselves to shirk our engineering responsibility.

A Medium post on this topic confirms that misconception when explaining the concept of technical debt with an example:

Of course, you thought that after shipping your app, you would have time to go back, get rid of the hack you made, redesign the architecture, implement it, and make your code look good again. In other words, get rid of your technical debt. […] Those same stakeholders come to you and say that the app needs to get that new feature as soon as possible. You tell them that you are just trying to make the code be in good shape again. And you hear back that there is simply no time for that now, you can take care of that later. […] In fact, it has to go through some loops and hoops in order to make it work.

A junior, that I once mentored, explained why his code was becoming more and more messy:

“Management needs me to churn out new features, rather than work on stuff that improves the software”. (paraphrased)

I have fallen for this same fallacy many times. But I, and this junior, were making a few mistakes here:

I did not quantify or clarify the debt and the interest clearly. The whole business should be worried by this debt, not just a (few) programmer(s).
I made it look like “taking on debt” and “paying back debt” are entirely different projects, which we plan in succession, rather than something that we (also) do as part of our daily work.
I am not taking “the business” seriously enough. The part that actually pays for my beer. I wrongly consider “engineering” and “business” as opposites, rather than as the enablers of each other that they are.
I assumed that taking on technical debt can be avoided, when it is a fact of life of a software project.

And, above all, I did not design the code in such a way that it can safely take on the inevitable debt. Or even embrace it.

With a monetary debt, there are two things to do to avoid bankruptcy through compound interest: keep paying off the interest and debts, or keep the interest rates low.

This is an and/and, not an either/or: with low interest rates, continuously paying off the debt is easy, which is why people will lend money at low interest. It’s a positive feedback loop!

With high interests, it is hard to pay off debts. And when there’s a high risk of not paying off debts continuously, people will charge much higher interest, making it harder to keep paying them off. A negative feedback loop!

This is the same for technical debt. High interest means we cannot pay it off: we have so many hacks and so much ugly code that we forever remain behind on actually fixing all that. All available time is wasted on “emergencies” and on “fixing bugs” caused by the debt. But that then also means that the interest remains high.

With technical debt, the common “answer to technical debt” is to just keep reserving time to improve the codebase. Always, continuously, refactor. Keep paying off your debts. Something-something Lannister.

But the second part of that negative feedback loop is often ignored: we should keep the interest low. With low interest, having a large debt is fine.

Keep the interest rates low

For example, if we borrow $1000, over 200 weeks, with a weekly interest rate of 2, 6, or 12%, and don’t pay off anything, this is the difference:

loan         2%          6%          12%
$1,000.00    $2,997.74   $5,601.05   $10,607.13

With a low interest rate, 2%, the loan has grown to almost three times its size. With 12%, the loan grew to over ten times its size. In technical debt, we cannot calculate it this precisely.

But we could take the ratios: if we take on technical debt with a low interest (2%), what we built today with two FTE will cost us the work of 6 FTE in future to “pay back”. But with a high interest (12%) it would cost us 20 FTE in that same future. The increased compounding effect caused by the height of the interest is real and visible.

So, how do we avoid paying a big interest? How can we safely take on debt, like e.g. the company Apple does with money, without going bankrupt?

One thing I saw between the projects described in the first paragraph is that the problematic projects had a high interest. Not so much a high amount of debt. And that this was always caused by tight coupling: a lack of orthogonality.

In computing, the term has come to signify a kind of independence or decoupling. Two or more things are orthogonal if changes in one do not affect any of the others. — David Thomas and Andrew Hunt in The Pragmatic Programmer

In a tightly coupled system, a shortcut, a quick hack, will easily spread everywhere. It will touch everything that you have today, but everything that is introduced in the future will be influenced as well. This is a compounding effect. But worse is that tearing it out in future is hard: after all, the chance that “paying back the debt” has unintended side effects grows over time. This is another compounding effect.

Tight Coupling

By decoupling our system, the debt we take on can be kept local. Isolated. The effect, the interest, is not affected by future work on other parts. It is still there, but only affects our work when we touch the isolated part. And ripping it out is much easier. If, for some reason, we leave the debt to grow unmanageable, we don’t need The Big Rewrite™ of the entire system; at most, we might need to rewrite the isolated subsystem.

A typical SAAS web app may look something like below[2].

Network diagram of a tightly coupled software system

We see at least two important problems. And both caused real issues down the line. Both turned out to be crippling technical debt. In different projects, though. The first one I saw repeated in three different projects, all three struggling with the debt it brought. The second is when the refactoring (paying off the debt) is done wrong, or at the wrong moment. Where the refactoring itself becomes the technical debt.

We can see that everything is coupled to “role” and “user”. The infamous User-becomes-a-God-model issue. E.g. to answer “can a user create a task in this project?”, the developers coupled almost every class to every other class. The subscription Plan, together with the Role, determines what actions a User can take on a Task in a certain Project. To answer that question, we are going through six of the ten classes. We have tightly coupled over half of our codebase. Obvious problems arose: e.g. these codebases had (unit-ish) tests that would take over a minute, and hundreds of lines of code, to set up a database, just so the test could then create a task on which it could test stuff. Even if the test was entirely unrelated to “creating a task”, it had to create subscriptions, plans, roles, users, projects and tasks (in a database!) just to test that “Given a task with status completed, when I edit it, I can only change it to status new”. Such tests ran slowly, but also took hours to write.
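For contrast, here is a sketch of how small that last test can be when the status rule lives in an object that knows nothing about users, plans, roles or databases. The `Task` class is hypothetical, and what non-completed tasks may do is my own assumption, since the rule quoted above only covers completed tasks.

```ruby
# Hypothetical, framework-free Task: the status rule lives in the object itself.
class Task
  attr_reader :status

  def initialize(status:)
    @status = status
  end

  # A completed task may only be reopened as :new.
  # Assumption: tasks in any other status may move to any status.
  def change_status(new_status)
    if @status == :completed && new_status != :new
      raise ArgumentError, "a completed task can only change to :new"
    end

    @status = new_status
  end
end

# The whole "test setup" is one line, and no database is involved.
task = Task.new(status: :completed)
task.change_status(:new)
puts task.status
```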

In both codebases, this technical debt was once, a long time ago, “quickly added” through a library. Not very well thought through. Technical debt not taken on deliberately and within constraints. At the time it probably wasn’t even seen as technical debt, not even as a potential problem, but the accumulating interest wasn’t tackled and compounded into a situation where literally a majority of classes had references to nearly all other classes.

When, for example, I introduced a sorting feature on a table on the dashboard, this accidentally broke the unrelated PDF reporting feature. Textbook tight coupling.

The solution implemented, however, in both these projects, was to increase the amount of manual testing (man-hours, interest-related costs!), to spend way more on code reviews than warranted (fear, also an interest-related cost), to over-design new features, and to simply accept downtime, crashes, performance issues and timeouts every few months. I.e. to not pay off the debt, but instead continue paying just the interest. And to allow this interest to remain high.

The second example, from another codebase, is what happens when someone takes “DRY, don’t repeat yourself” too literally. The code had several upload-an-image-and-attach-that-to-X classes. An engineer saw that duplication and refactored it into a single class that “handled all uploads”. This looks like good refactoring: paying off some technical debt! But alas, if we look closely (and with hindsight), it was accidental duplication. We, unknowingly, took on debt instead of paying it off. If we had had a modularized architecture, and enforced it, we probably wouldn’t even have considered this refactoring.

Loose coupling

In the projects that did not struggle with technical debt (but, again, some had large amounts of debt, they just didn’t suffer from it), the architecture looked different. There were clear subsystems: modules. Not microservices[3], but just well defined boundaries. In some systems classes could, technically, reach into a boundary and poke around there: it was just discipline that kept the team from actually doing that.

I’ve taken the diagram above and, without moving the “classes”, introduced boundaries: I placed them in modules. And I reversed the “attachment”-DRY refactoring by giving each class its own dedicated attachment back. I did not move any components to make it more readable, though: I just added containers.

Network diagram of a decoupled software system

What we see now is a modularized, compartmentalized architecture. There is an “Authorization subsystem”, which will answer the question “can user with this ID perform action X on a thing with this ID” in isolation, on its own, using data only it owns etc.

If we now take on technical debt in this authorization subsystem (business needs a log-in-with-github-button yesterday), it doesn’t touch anything but this subsystem. The interest doesn’t go up exponentially, it goes up linearly. And if we take on technical debt in the billing subsystem (Sales needs a free trial which can access all features), it only touches the finance subsystem.

The projects which didn’t suffer from the technical debt, were compartmentalized like this. The subsystems isolated. Communication and coupling minimised to well-defined ports (interfaces, methods etc).
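As a rough illustration of such a port, here is a minimal Rust sketch of an authorization subsystem that only exposes a trait to the rest of the codebase. All names in it (`authorization`, `Authorizer`, `GrantTable`, `edit_button`) are hypothetical and not taken from any of the projects described here; it only shows the shape of the boundary.

```rust
// Hypothetical authorization subsystem; all names and rules are illustrative.
mod authorization {
    use std::collections::HashSet;

    #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
    pub enum Action {
        Read,
        Update,
    }

    // The port: the only thing other subsystems are allowed to depend on.
    pub trait Authorizer {
        fn can(&self, user_id: u64, action: Action, thing_id: u64) -> bool;
    }

    // The implementation owns its data; the field is private to this module.
    pub struct GrantTable {
        grants: HashSet<(u64, Action, u64)>,
    }

    impl GrantTable {
        pub fn new() -> Self {
            GrantTable { grants: HashSet::new() }
        }

        pub fn grant(&mut self, user_id: u64, action: Action, thing_id: u64) {
            self.grants.insert((user_id, action, thing_id));
        }
    }

    impl Authorizer for GrantTable {
        fn can(&self, user_id: u64, action: Action, thing_id: u64) -> bool {
            self.grants.contains(&(user_id, action, thing_id))
        }
    }
}

use authorization::{Action, Authorizer, GrantTable};

// Code in another subsystem (say, reporting) talks to the trait only.
fn edit_button(auth: &dyn Authorizer, user_id: u64, report_id: u64) -> &'static str {
    if auth.can(user_id, Action::Update, report_id) {
        "[Edit]"
    } else {
        ""
    }
}

fn main() {
    let mut grants = GrantTable::new();
    grants.grant(1, Action::Update, 42);
    grants.grant(2, Action::Read, 42);

    println!("user 1 sees: '{}'", edit_button(&grants, 1, 42));
    println!("user 2 sees: '{}'", edit_button(&grants, 2, 42));
}
```

If the inside of `GrantTable` turns into a mess, the damage stays behind the `Authorizer` trait: callers such as `edit_button` would not need to change when it gets rewritten.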

In these projects the teams were confident to take on technical debt. But only if it remained compartmentalized. Never if the “technical debt” meant introducing coupling. Never if it meant to compromise the boundaries, break the isolation. Because debt that breaks the boundaries, will compound quickly into a spiral of compounding interest that can never be paid off.

Rewrite from Scratch

As long as it is isolated, we also have a clear worst-case-scenario: being forced to rewrite the entire subsystem. The worst possible interest of our technical debt is known: rewrite subsystem X.

I’ve actually paid off technical debt this way, a few times[4]. One of those was an (auto-generated) avatar subsystem, which took just two days to rewrite from scratch and turn into a microservice, because, unlike in the first diagram above, we always kept the classes and algorithms for the avatars isolated from all other stuff that had to do with “Users”. Yet, internally, within the boundaries of this “avatar subsystem” the code was a terrible mess, performed horribly, was tightly coupled to a specific storage implementation, and kept the entire codebase from scaling up horizontally - it was the last part that could not run in parallel.

Two days, one FTE: that was all it took to pay off our biggest chunk of technical debt within the company. And we always knew the size of the debt, and therefore did not shy away from taking on even more debt: because its effect was always isolated. Bounded.

Conclusion

With proper compartments, a decoupled architecture, technical debt can be contained. This causes its interest, the effort required to pay it off, to grow linearly rather than exponentially. That can still lead to a crippling debt. Still lead to compounding interest. But it keeps it far more manageable.

Technical debt, ugly code, allows us to deliver a feature today, but it will cost us, with interest, much more effort to solve in a year than we save today. But sometimes that is what is needed to keep the business afloat. And often that is fine. Provided we keep the interest rates low.

It is up to the engineers to make this visible. And it is up to the engineers to keep the interest low through software architecture and discipline. And it is up to the entire business to keep paying off the debt, or at the very least paying the interest, because the compounding effect will otherwise grind the project to a halt. Or even cause bankruptcy.

Discussion and comments are welcome over at this hackernews post

[1] But in my experience those places hardly exist. Probably because they won’t survive competition where engineers can take responsibility. And because in reality, management and C-level will listen to engineers, but might choose to ignore their warnings. Either because engineers have shown a history of “crying wolf”, or because other issues press harder on the business as a whole. Most often the last. And to engineers who lack this perspective for whatever reason, it may look like management is “forcing technical debt” upon them. Without seeing that it’s really an “Us”, not a “them vs me” thing.
[2] The actual domain of both the cases where I encountered this was not “project management”. There weren’t literal “tasks” nor literal “projects”. The real code was far more complex, the reality was far worse than what I describe here.
[3] Though microservices are the culmination of decoupling, when done right, there are many steps between a “highly coupled system” and “a highly decoupled microservice system”. The most obvious being a monolith with properly decoupled, internal modules. And with rigidly enforced (domain) boundaries.
[4] In general, I severely dislike “rewrite from scratch”. Books have been filled with reasons why this is a bad idea, and I agree. But if the rewrite is isolated, small, and in a well-tested (integration test) system, it might prove the most efficient way to get rid of technical debt. Sometimes.

https://berk.es/2022/08/30/exponential-compound-interest-on-technical-debt-and-how-i-avoided-it

My Reasons For Using Rust (as a Ruby developer)

Aug 22, 2022 Updated Aug 22, 2022

By now, there are thousands of articles explaining why Rust is a good option/the fastest/cleanest/nicest/whateverest. While I was reading about objective attributes that speak for or against a programming language, a talk by Owen Synge stuck with me.

He explains why he likes Rust. Why he chose it. And what he values in a language. I took this idea for my Rust presentation. And now I want to explain it in some more detail.

I like this perspective, because it shows whether the person promoting Rust, values the same attributes; it shows why someone chose a language, framework or architecture with relevant context.

For me, the three most important attributes of a framework or language are that:

Software should work for decades.
A system has Good Defaults™: Being Lazy leads to good software; To make “bad” software you must employ extra effort.
It is Simpler (not necessarily easier) to maintain, host, test, and deliver.

I am, primarily, a web developer. Lots of backend work, with DevOps and infrastructure automation and occasionally frontend development. But all the software that I write is to support mobile- and web-apps. This means that attributes like being able to run on this 20-year-old ACME controller, or even the performance, are less important or even irrelevant. For me.

I’ve been developing in Ruby and Rails since 2005 and full-time since 2013.

I try not to advertise myself as an “X developer”, but use the language that fits the use-case best. On a typical day, I’ll write some Ruby, some Python, kludge around with JavaScript, do some Typescript and write some Rust.

While I think it’s important to train myself to be a [Polyglot](https://en.wikipedia.org/wiki/Polyglot_(computing)) programmer, I’ll always have one Go-to language. One language that I know much deeper, for which my environment is tuned, which I have much more experience in, and so on. Currently that is Ruby, but I’m quickly shifting that to Rust.

Because Rust ticks those boxes.

Software should work for decades

Most of my software, over 90%, is archived, obsolete, unused, dead. But for the rare 10%, I want to be sure that it can be maintained over a longer time. Yet it is impossible to guess up-front what this 10% is. There is software running which I quickly hacked up late at night in an emergency, and which has been provisioning thousands of WordPress servers for almost 10 years now (it got improved later, though). Yet a very cleanly architected ORM for a REST backend, with all sorts of Design Patterns, ran for a few months in a staging environment and was then thrown out because we decided PHP/Drupal really wasn’t going to cut it for that project.

Which is why I at least try to architect all code in a way that it can be improved upon. It doesn’t have to be good at the start, but it must be possible to make it good later.

The programming language, the framework and its ecosystem solidify a lot of options early on. More than The Architecture, in my experience. It’s relatively easy to re-architect something: incremental refactoring. But it’s not easy to swap out a framework or language for another. That almost always requires a big rewrite, which, as we all probably know, is a company-killer. A wrong choice of language or framework leads almost inevitably to a “painted yourself into a corner” problem.

And those solidified options should support maintenance over long time. This is where software architecture plays the biggest role, but language has an important say in this. As does the culture of the ecosystem.

In Rails, for example, it is made explicitly clear that doing “stupid things” is OK. While I applaud the idea that it is possible, I have seen this lead to problems in every Rails project that I worked on. Where maintaining some “smart” hack, fancy DSL or full-featured library over time required recurring and significant cost. It even ground entire teams to a halt, unable to react to the market. But even when, with great discipline, you manage to keep your application clean and maintainable, there are technical fundamentals of the language that make maintenance harder or easier.

Ruby, like Python, for example, requires complex runtime environments and versioning. Rbenv, Gems, Bundler, whatnot. All of these move all the time. Every week at least one of my “shelved” Ruby projects, or JavaScript dependencies, needs some critical security update. Any Ruby project that I haven’t touched for months requires significant effort to just get running locally. If at all. I have a tiny Rust plaything which I wrote, is finished, and keeps running without me ever looking at it for months. As a Ruby developer, this is a new experience!

This is less visible when you have a team working on one project day in day out. But I can assure you, the total time spent on upgrading gems, pips, npm, linter, CI, tests just to keep it running month-to-month is significant. All this is effort I cannot spend on writing the software (that makes me money).

It is more visible in agencies that have many PHP, WordPress, Drupal, Rails or Django projects. Where a client comes back after 20 months to ask for that X integration, which then, most often turns into a full-blown rewrite of the entire project[1].

Rust was built with stability in mind. One of its core principles is to be fully backwards compatible and to support versions forever.

The release of Rust 1.0 established “stability without stagnation” as a core Rust deliverable. Ever since the 1.0 release, the rule for Rust has been that once a feature has been released on stable, we are committed to supporting that feature for all future releases.

Rust invented a smart combination of versions, editions and releases, with which it can continue rapid improvement to avoid stagnation, yet ensure that the project you write today will compile and run in ten years. It helps you upgrade, if you want to, but never forces you to.

The crates system (libraries, with dependency management) supports this idea too. You’ll continue to have access to old libraries, can easily keep them locally, or even on a self-hosted library-server. But where all ruby-gems, pips, or npm-packages must run on the same environment, with Rust you can mix and match old and new libraries just fine.

Any future compiler will compile your old Rust code, so you can pick up an old project and work on that, alongside any now-modern Rust projects. You’ll miss features, you might be confused by how stuff was done back then, but it will compile. It will compile any old dependencies alongside any newer. And it will run.

A system has Good Defaults™

I’m a lazy programmer, or at least try to be. And working test driven (TDD), this means cutting corners, writing the quickest hack that works, before refactoring and cleaning up. But often a rough, hard-coded or duplicated piece of code will make it to production: I also don’t like to waste time on cleaning up stuff that the customers don’t need. And even if I, and my team, were always diligent, and industrious, there will be a deadline, or emergency, where corners must be cut. Where code is rushed out.

I want my frameworks and programming languages to acknowledge this.

To make Doing the Proper Thing easy, and doing the bad thing possible, but harder. That way, I, the lazy programmer, build secure, clean, maintainable software when I’m being lazy. I will forget a private marker. I will forget to make a variable immutable. I will make a typo in a filename of a source file. I will forget to mark a dependency as a requirement for this one module.

Rust was built with this in mind. And it trickles down into many of the frameworks, libraries and working-groups.

The most visible is how all variables are immutable, unless you explicitly make them mutable. How definitions, attributes and variables are private, unless you mark them public. How no dependencies are available until explicitly included. How all code is memory safe unless you explicitly mark a chunk as unsafe. And so on.

You’ll have to write extra keywords and insert deliberate markers to “do the worse thing”. Not that mutable variables are always bad, but immutable, if possible, is better. So the language makes me stop and think: hmm, do I really need this to be mutable? What if instead I just….
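A small sketch of what those defaults look like in practice. The names `bank`, `Account` and `deposit` are made up for this illustration; the commented-out lines are the ones the compiler would reject.

```rust
// Hypothetical module; `bank`, `Account` and `deposit` are illustrative names.
mod bank {
    // Private by default: the struct needs an explicit `pub` to be usable
    // outside this module, and the field stays private.
    pub struct Account {
        balance_cents: i64,
    }

    impl Account {
        pub fn new() -> Self {
            Account { balance_cents: 0 }
        }

        // Mutation has to be requested explicitly with `&mut self`.
        pub fn deposit(&mut self, cents: i64) {
            self.balance_cents += cents;
        }

        pub fn balance_cents(&self) -> i64 {
            self.balance_cents
        }
    }
}

fn main() {
    let first_deposit = 40; // immutable by default
    // first_deposit += 2;  // would not compile: the binding is not `mut`

    let mut account = bank::Account::new(); // `mut` is an explicit opt-in
    account.deposit(first_deposit);
    account.deposit(2);
    // account.balance_cents = 0; // would not compile: the field is private

    println!("balance: {} cents", account.balance_cents());
}
```

The “worse” options are all still available; each one just costs an extra keyword (`mut`, `pub`) that shows up in code review.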

Slightly less visible is how the Rust compiler forces you to handle all exceptions and errors explicitly. You’ll never see your Rust program crash because some file could not be read, without first explicitly allowing that crash in the code. And therefore think about that case. This goes for anything that could go wrong. From parsing a CSV, to missing commandline arguments, to form-fields being empty: if it crashes, you explicitly allowed that crash (which is as simple as a .expect("Reticulating Splines")-call though).
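A minimal sketch of that idea, using only the standard library. The helper `count_lines` and the file name are hypothetical. Strictly speaking, ignoring a `Result` only produces a compiler warning, but there is no way to get at the value inside without handling the error case or explicitly opting into a crash.

```rust
use std::fs;
use std::io;

// Hypothetical helper: count the lines in a file. The `Result` return type
// makes the "file could not be read" case part of the signature.
fn count_lines(path: &str) -> Result<usize, io::Error> {
    let contents = fs::read_to_string(path)?; // `?` hands the error to the caller
    Ok(contents.lines().count())
}

fn main() {
    // Handling the error explicitly:
    match count_lines("does-not-exist.csv") {
        Ok(n) => println!("{} lines", n),
        Err(e) => println!("could not read file: {}", e),
    }

    // Explicitly allowing the crash instead would look like this:
    // let n = count_lines("does-not-exist.csv").expect("Reticulating Splines");
}
```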

Cargo, the default Rust toolkit, comes with a formatter that will format my code according to Community Best Practices. It comes with a test framework and runner. No need to set that up one day. It’s there, at the first commit. It comes with a linter. The Rust compiler famously won’t compile if it sees errors. All that is available on my laptop, CI and build system, without any configuration or set-up.

Cargo assumes a well-documented layout of the code, that makes any Rust project recognisable, but mostly avoids me having to spend time on decisions on my directory and file-layout: it follows the code-layout. It’s clear where tests go, it’s clear how modules are split, how to name directories.

I can configure or override all this. I can change the linting and formatting rules (I know at least two former co-workers who would immediately spend days tuning all this….). But being lazy, I’d rather leave them at the defaults. And those happen to be extremely well thought out.

It is Simpler to maintain, host, test, and deliver.

Rust -by default- statically compiles its runtime and dependencies in the resulting binary. This comes at the cost of rather big binaries, but it means that I can just plonk the compiled binary for Linux, onto a Linux machine and run it.

All dependency management is done at compile time. This makes deployment as easy as an rsync or scp and a restart.

It makes a CI workflow ridiculously simple: a mere cargo check or cargo build --release. No need to set up rbenv, or compile the right version of node or python. No need to add test frameworks or set up linters. It’s all there.

The built-in test-runner will run your tests in parallel and will find the tests based on naming conventions. It will build the project in test-mode and will report in one of many popular test-report formats. And so on. All without configuration or set-up.

In a typical Ruby project, almost half my tests are to catch cases that a proper type system would catch. Things like what if the email is null or what to do when the file cannot be read. With Rust, I trust the compiler for all this. My tests can then focus only on business-logic. I write and run far fewer tests.
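To make the “what if the email is null” case concrete, here is a small sketch with a hypothetical `User` type. The possible absence of an email is part of the type (`Option<String>`), so code that forgets about it does not compile, and no test is needed for that case.

```rust
// Hypothetical `User` type: the email may be absent, and the type says so.
struct User {
    name: String,
    email: Option<String>,
}

// Returns the address to notify, or `None` when there is no email on file.
// Treating `user.email` as a plain `String` here would be a compile error.
fn notification_address(user: &User) -> Option<String> {
    user.email
        .as_ref()
        .map(|email| format!("{} <{}>", user.name, email))
}

fn main() {
    let with_email = User {
        name: "Ada".to_string(),
        email: Some("ada@example.com".to_string()),
    };
    let without_email = User {
        name: "Bob".to_string(),
        email: None,
    };

    println!("{:?}", notification_address(&with_email));
    println!("{:?}", notification_address(&without_email));
}
```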

Delivering the software is about as easy as deployment. Just email someone the binary, or .exe and have them run it. No need to unzip jars, install runtimes, check for versions, DLLs or .so files. The binary for your architecture will most likely just run.

And when maintaining software, complexity is the enemy. Simplicity is the antidote, but getting there is not easy.

Rust doesn’t offer direct tools to help make software simpler. I’m afraid that’s more of an art. But it does offer a well-designed, large standard library. Every week I find a chunk of code that I can replace with a single call to something Rust offers out of the box. Often removing tens or even hundreds of lines of code. And more often than not, you don’t need third party libraries or even frameworks to build a feature.

Rust promotes commandline tools in its tutorials and books. Rather than desktop, HTML or GUI applications. It makes it easy to rapidly crank out a new project, rather than shoehorning a feature into the existing project (make the “good option” the “easy option”!).

Any downsides?

Rust is certainly not perfect for me, though.

The laziness argument has another side, for example. I don’t always want to think about a potential error like, say, “the CSV file not being UTF8”, but Rust forces me to deal with that edge-case. Even if this is a tool to run over a single CSV file once and then get archived. Developing in Rust is certainly slower for me than in Ruby. Part of that is down to experience. But a large part is down to how Rust forces me to deal with all sorts of edge-cases always, all the time.

While I appreciate that Rust has no class inheritance, and relies only on composition, its lack of classes and objects is unfamiliar to me. It requires me to re-learn a lot of design-patterns. To design setups that I commonly write without even thinking about. I expect this to fade over time, as I gain experience with traits and structs, but I do miss it often.

I typically try out ideas outside of my codebase. A quick, isolated proof of concept or mock. Isolation being the key. With Python, JavaScript or Ruby that is a mere python3, node or irb away. Rust has an online playground, but I miss being able to do this locally. A quick cargo init trial works, but I often find it to still be too big a hurdle, and continue hacking in my actual project. And inevitably get distracted by some incompatibility or unrelated error, and then fail the PoC. It doesn’t have to be a REPL, just a really fast scratchpad or temporary workspace.

But even when I work in Python, or Ruby, Bash, JavaScript, I apply the lessons that my newest senior peer-programmer has taught me. The lessons that the Rust-compiler taught me. So even when I don’t write Rust, Rust has made me a better programmer. Or less-bad, maybe?

[1] I consider this a terrible business model; I know from my own experience that rewrites or upgrades to newer versions are a significant source of revenue for Drupal agencies. I dislike this very much.
[2] It is perfectly possible, though, to compile a binary that dynamically links, or requires external runtimes or other dependencies. But that requires extra flags, config and work. Another example of the “good defaults”.

https://berk.es/2022/08/23/my-reasons-for-using-rust

Event Source your Spreadsheets for Flexibility and Maintainability

Aug 15, 2022 Updated Aug 15, 2022

Event Sourcing (ES) is one of those software paradigms that I wish I had known about much earlier. In hindsight several of my projects and companies would have benefited greatly from an Event Sourced Architecture.

Event Sourcing is, by no means, a magic bullet. It’s an architecture that has a place and use. But for many spreadsheets it turns out to be an excellent fit!

Spreadsheets are notorious in how they evolve and accumulate logic over time, so applying some best-practices and even architectural patterns up front, helps a lot in keeping the spreadsheet maintainable and extensible.

(Whether or not you should use a spreadsheet in the first place is a whole different discussion. The answer is quite often “nope”, in my experience. But let’s assume we checked the alternatives and found: yes, this is a good one for a spreadsheet, because [insert reasons].)

For example, I’m tracking my investment portfolio in Google Sheets. Or maybe you need a quick dashboard summarizing the sales you make in various e-commerce platforms.

In such cases, it really helps to set up the sheet “Event Sourced”.

Event Source What?

Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes.

Martin Fowler

That’s it! That is everything there is to know about Event Sourcing: all the rest is implementation detail. Concepts like Domain Aggregates, CQRS, Commands, Command Handlers, Snapshots, Projections, and so forth, which you’ll often see when reading about Event Sourcing, are implementation details. Necessary details when implementing Event Sourced architectures and to keep code maintainable, solve common problems or avoid some downsides.

But for the sake of understanding why Event Sourcing is useful, the only important part is “application state is stored in a series of events”.

To explain with an example: the balance on your bank-account isn’t a column in a database table called balances, it’s the sum of the amounts on each payment and deposit ever done on that bank-account: the events are the source, your balance is just a derivative thereof.
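In code, the bank-account example could look roughly like the following sketch. The `AccountEvent` type and its variants are hypothetical; amounts are in whole cents to sidestep floating-point rounding.

```rust
// Hypothetical event type for a bank account; amounts are in cents.
#[derive(Debug, Clone, PartialEq)]
enum AccountEvent {
    Deposited { cents: i64 },
    Paid { cents: i64 },
}

// The balance is not stored anywhere: it is derived by replaying the events.
fn balance(events: &[AccountEvent]) -> i64 {
    events.iter().fold(0, |sum, event| match event {
        AccountEvent::Deposited { cents } => sum + cents,
        AccountEvent::Paid { cents } => sum - cents,
    })
}

fn main() {
    let log = vec![
        AccountEvent::Deposited { cents: 10_000 },
        AccountEvent::Paid { cents: 2_550 },
        AccountEvent::Paid { cents: 450 },
    ];

    println!("current balance: {} cents", balance(&log));
    // Replaying only a prefix of the log reconstructs a past state.
    println!("balance after the first event: {} cents", balance(&log[..1]));
}
```

Because the log is the source, a past balance comes for free: replay fewer events.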

Event Sourcing is nothing new, it’s actually really old, probably thousands of years. Bookkeeping has been done by writing a log of events for centuries, for as long as it has existed, really. Domains like shipping work like this too; for ages, the captain has kept a “captain’s log”.

Captain's Log on Monitor from Star Trek

One important detail often used in Event Sourcing, is the so-called projection. As said, this is an implementation detail, but one that we can use for the spreadsheet version very well too.

A common scenario in this context is taking events created in the write model […] and calculating a read model view (e.g. an order summary containing the paid invoice number, outstanding invoices items, due date status, etc.). This type of object can be stored in a different database and used for queries.

Event Source introduction by EventStoreDB

In other words: projections are read-only tables containing summaries of, or conversions from, (subsets of) the events. The balance mentioned above might very well be stored in such a projection, called customer_balances, in reality. So as to avoid having to loop through all the 8000+ transactions you made in your bank account on each tap in your banking-app. Sometimes these are called “caches”, since they very much act as a cache: they can be re-built from all events at any time, and might be out of date (eventually consistent) at moments.
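A rough sketch of such a projection, with hypothetical names (`Transaction`, `project_balances`, `apply`). It shows the two properties mentioned above: the projection can be rebuilt from the full log at any time, and it can be kept up to date by applying one new event instead of looping over everything again.

```rust
use std::collections::HashMap;

// Hypothetical transaction event. Positive amounts are deposits,
// negative amounts are payments; amounts are in cents.
struct Transaction {
    customer: String,
    cents: i64,
}

// Rebuild the `customer_balances` projection from scratch by replaying the log.
fn project_balances(events: &[Transaction]) -> HashMap<String, i64> {
    let mut balances = HashMap::new();
    for event in events {
        apply(&mut balances, event);
    }
    balances
}

// Applying a single new event updates the projection incrementally.
fn apply(balances: &mut HashMap<String, i64>, event: &Transaction) {
    *balances.entry(event.customer.clone()).or_insert(0) += event.cents;
}

fn main() {
    let log = vec![
        Transaction { customer: "alice".to_string(), cents: 10_000 },
        Transaction { customer: "bob".to_string(), cents: 5_000 },
        Transaction { customer: "alice".to_string(), cents: -2_500 },
    ];

    let mut balances = project_balances(&log);
    println!("alice: {:?}", balances.get("alice"));

    let new_event = Transaction { customer: "bob".to_string(), cents: -1_000 };
    apply(&mut balances, &new_event);
    println!("bob: {:?}", balances.get("bob"));
}
```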

Spreadsheets

When designing a spreadsheet, it really pays off to follow some best practices. One is to have only one single, obvious, place where to insert data.

In practice, such data most often comes from CSV files, exports, or, indeed, logs. Banks, Brokers, E-commerce, Bookkeeping, Project-management: most such systems have an export. And with almost all, those exports are some form of “list of things that happened”: a list of events!

So. When we insert data, it makes a lot of sense to make the entry a literal copy-paste of that export[1].

If we then agree this sheet is “append only”[2], we are done: we have an Event Source! We now have a sheet that holds the history of the things that we wish to track. We can add new entries to the bottom[3] and we can calculate all the numbers and summaries that we want from here.

Dashboards, cross-references, enrichments, all can be done by referencing and querying this one sheet.

Downsides

Apart from the fact that spreadsheets in themselves are quite often the wrong tool for the job, this setup has some clear disadvantages though.

For one, we need quite complex lookups. The formulas to calculate a sum quickly become quite complex when they need to filter certain events only, or group by certain event attributes. But we’ll look at some solutions below.

Spreadsheets are very good at 2-dimensional data, but bad at handling more complex data. If, say, an order contains multiple products, it is often represented in a clumsy way: for example one row per product but each row repeats the order-number, customer-data and so on.

| Date  | order# | Price | Product | Customer |
|-------|--------|-------|---------|----------|
| 02-12 | R1337  | 1.99  | Banana  | b_@e_com |
| 02-12 | R1337  | 0.99  | Apple   | b_@e_com |
| 04-12 | R8008  | 0.99  | Apple   | d_@e_org |

Or worse, the repeated rows leave those fields empty, and only the first row holds them.

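| Date  | order# | Price | Product | Customer | Total |
|-------|--------|-------|---------|----------|-------|
| 02-12 | R1337  |       |         | b_@e_com | 2.98  |
| 02-12 |        | 1.99  | Banana  |          |       |
| 02-12 |        | 0.99  | Apple   |          |       |
| 04-12 | R8008  |       |         | d_@e_org | 0.99  |
| 04-12 |        | 0.99  | Apple   |          |       |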

In both cases, we really have data in more dimensions: one order has many line-items (products). In both cases, the event really is only the placement of the order, (order placed event) and not the rows describing the contents of that order.

This mismatch is close to impossible to solve easily; it’s a limitation of spreadsheets. But it becomes more apparent when we wish that the copy-pasted-data-entries really were the events, and not “events and then some contents of the events”.

And then that “Transactions2010020114:32.CSV” that I download from my creditcard supplier, or the “Account%20History.csv” that my stock-broker provides often includes several types of events in one table. In those cases, I need even more complex queries over the events, as every query needs to filter out specific events only.

When there are many events, thousands or more, things get slow. Or even impossible[4].

As with the dimensions of data, Spreadsheet software often shows that it isn’t really designed for such a setup. It often lacks details or requires hacks and manual work. For example, when enriching data, you’ll probably need a complex IF(ISBLANK(), 0, THEACTUALFORMULA) just to avoid pesky “#N/A” errors, which will trickle through everywhere. Or a hard-coded range (you will insert an A2:A1001, for certain) that now gives the wrong data because you just inserted event 1002.

Those limitations are there even if you don’t set your sheets up Event Sourced; they are just more annoying when you try to make the setup flexible, future-proof and maintainable.

Demo!

The details, some best practices and more background can best be explained with an example. As demo, I prepared a Google Sheet, and shared it. (feel free to leave your comments in this sheet if you see improvements)

Dashboard Screenshot

This demo is a “dashboard” for a small e-commerce business. It contains a sample data-set. I deliberately kept the list short, so that we can see and cross-reference what happens. The data-set is modelled (very loosely) after how an e-commerce platform offers their exports. They are sales and refunds. This would be a table into which we can append, by copy-pasting, our monthly downloaded CSV entries from this platform.

The Events

One tab is called “events”. This is the place where we append new entries as they come in a download. Our demo has one source of events, but often you’ll find that you want to include multiple sources. For example, sales made on Etsy and sales made on Shopify. The exports from these places will not be compatible. It is then a bad idea to mix these up in one “events” tab. Instead, it is far cleaner to have multiple “event entry tabs”, e.g. “etsy_events” and “shopify_events”. This trick also works for when exports’ formats change in future. A provider may (and you can be certain at least one will!) change the export format without being backwards compatible. Adding or removing columns, changing the format etc. In that case I’ll just add a new tab “v202203_events” or so.

There is no deduplication nor an easy way to deduplicate. So, if on, say February 5th we download the entries for “this year”, copy paste them, and then in April we download the entries for Q1, we will have all events from Jan 1st to Feb 5th double.

The obvious solution is to just not enter double events. But if we need a solution for this, I’ll touch on one further below. Some exports don’t allow you to provide a date-range, there the same goes: just be sure to copy-paste only the new stuff.

A sample from the data to describe some of the Domain Logic around these events

| DateTime | SKU | order_id | customer_name | customer_email | cost_basis | sale_price | costs | currency | type |
|----------|-----|----------|---------------|----------------|------------|------------|-------|----------|------|
| 2021-08-02T03:00:50Z | KJC_559 | R71218 | Jameson Mayhew | jmayhewkv@usgs.gov | 1.46 | 2.03 | 0.73 | EUR | sale |
| 2021-08-02T09:34:36Z | AGY_380 | R71218 | Jameson Mayhew | jmayhewkv@usgs.gov | 1.21 | 2.04 | 0.87 | EUR | sale |
| 2021-08-02T14:52:44Z | HCZ_519 | R68804 | Ignaz Vallerine | ivallerine50@cornell.edu | 0.35 | 3.51 | 0.22 | USD | sale |
| 2021-08-03T03:19:36Z | UNP_232 | R68804 | Ignaz Vallerine | ivallerine50@cornell.edu | 1.6 | 1.53 | 0.13 | USD | sale |
| 2021-08-03T11:29:12Z | DHQ_687 | R68804 | Ignaz Vallerine | ivallerine50@cornell.edu | 1.37 | 2.29 | 0.68 | USD | sale |
| 2021-08-03T13:59:03Z | SVQ_548 | R73703 | Lorens Gulleford | lgullefordl5@instagram.com | 0.86 | 3.41 | 0.08 | EUR | sale |
| 2021-08-06T12:15:00Z | UNP_232 | R68804 | Ignaz Vallerine | ivallerine50@cornell.edu | 0 | -1.53 | 0.15 | EUR | refund |
| 2021-08-06T07:24:36Z | RON_531 | R19813 | Roz Cossar | rcossarc2@smh.com.au | 0.35 | 2.38 | 0.79 | EUR | sale |

The data is duplicated, a unique OrderId describes an order, but additional rows describe additional products in that order. We have a type column which describes the kind of event.

In domain-language, we see Jameson having placed an order for two products for a total of 4.07 (this amount is not included!). The product has some costs attached for us (dropshipping fee, the costs to make it, or to buy it). We see that some products were bought in EUR, some in USD. And we see that Ignaz got a refund for one of the 3 products they ordered three days earlier.
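The “total of 4.07” that the export does not include can be derived from the events. The following sketch does that outside the spreadsheet, purely to illustrate the logic: it groups the sample rows by order_id and sums the sale_price of the sale events. The `Row` struct and function names are hypothetical, prices are converted to cents, and each total is in the order’s own currency (no conversion is attempted).

```rust
use std::collections::BTreeMap;

// One row of the export, reduced to the columns needed here.
struct Row {
    order_id: &'static str,
    sale_price_cents: i64,
    kind: &'static str, // the `type` column; `type` is a reserved word in Rust
}

// The sample rows from the table above, in the same order.
fn sample_rows() -> Vec<Row> {
    vec![
        Row { order_id: "R71218", sale_price_cents: 203, kind: "sale" },
        Row { order_id: "R71218", sale_price_cents: 204, kind: "sale" },
        Row { order_id: "R68804", sale_price_cents: 351, kind: "sale" },
        Row { order_id: "R68804", sale_price_cents: 153, kind: "sale" },
        Row { order_id: "R68804", sale_price_cents: 229, kind: "sale" },
        Row { order_id: "R73703", sale_price_cents: 341, kind: "sale" },
        Row { order_id: "R68804", sale_price_cents: -153, kind: "refund" },
        Row { order_id: "R19813", sale_price_cents: 238, kind: "sale" },
    ]
}

// Sum the sale prices per order, ignoring every event that is not a sale.
fn order_totals(rows: &[Row]) -> BTreeMap<&'static str, i64> {
    let mut totals = BTreeMap::new();
    for row in rows.iter().filter(|r| r.kind == "sale") {
        *totals.entry(row.order_id).or_insert(0) += row.sale_price_cents;
    }
    totals
}

fn main() {
    for (order_id, cents) in order_totals(&sample_rows()) {
        println!("{}: {}.{:02}", order_id, cents / 100, cents % 100);
    }
}
```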

A common export will probably contain many more columns, but for the sake of clarity and demonstration, I’ve kept it small.

Dashboard.

Now that we have the data, we should start with the most important: what do we want to show or calculate.

Spreadsheet experts often call this Start at the End.

To fill the dashboard shown above, we need some chart showing daily sales, we need to sum up specific events, to get to sales and cost totals and we need to sum all refunds (in the dataset just one, though). We need to group some data to count how many unique entries there were and we need to find the best selling products.

We now have the source - the events - and a target - the dashboard. Next step is to add the intermediate calculations.

Lookup tables.

Lookup tables, or, in Event Source lingo Projections, have two purposes in our spreadsheet.

The first is to move all calculations that enrich, parse, or reformat events away from the events table. So that this table contains only the data exactly as it appears in our exports. The second is to make summaries, and -yes! There they are- pivot tables.

Our source data has several problems. One is that the timestamps use an ISO standard, which Google Sheets cannot parse (yes! Google Sheets cannot parse the one and only official international standard way to write dates. Really). The other is that prices are in different currencies. And we want to calculate with the exchange rates as they were on the day of the transaction. We certainly cannot just sum a price in EUR with one in USD. Often you’ll need to parse, enrich or add much more data.

Again, it is unwise to do this in the “events” table. That makes copy-pasting very clumsy, it increases the chance of accidentally overwriting data and it spreads your code (formulas) all over. Let’s keep the events tab “append only”: no calculations there!

Using QUERY we can pull the data into another tab. This one I called lookup_sales_events. I prefix all my “lookup tables” with lookup_ to clarify their purpose. And to make clear that this is not a place to insert data. Lookup tables too are “read only”.

The query here is =QUERY(events!A1:J1001, "where J = 'sale'"), which selects all events whose type is sale, with all their columns from the events.

A similar tab is added for refund.

On the right of this tab I’ve added some enrichments. One is an ugly hack to parse the ISO dates, another calculates the EUR version of any USD-priced event, using GOOGLEFINANCE currency conversions.

One of the mentioned technical issues surfaces here: if we were to add events, the columns on the right won’t grow with them automatically. We would miss this data without seeing an error! The solution adds complexity to the demo, but would include additional logic in the formulas like IF(ISBLANK(), "", [the current formula]). And even then, it only goes as far as we’ve dragged the right-hand range. If we dragged it to row 500 and we insert the 501st (actually, the 498th, we have two header rows) “sale” event, we will miss it and might not notice it! Stay vigilant!

Such projections are also the ideal place to deduplicate any events, in the exceptional case that it is impossible to avoid duplicating during entering.

For my overview I like to format this table a little, add some lines, headers, colors etc.

I’m personally a big fan of adding aggregates[5] of a column, such as sums, averages etc above the table. This makes it much easier to let the table grow down (which it will if we add more events) and always reference to the same cell for an aggregate. Other people prefer to put such aggregates in their own sheet. Whatever rocks your boat.

Some common formulas used to build such lookup tables are QUERY(), FILTER(), UNIQUE() and IMPORTRANGE(). The last one is great when you want your Event Source to be a separate sheet or even CSV file stored somewhere else.

For the sums, averages, etc the common formulas are COUNTIF, SUMIF or SUM(FILTER()) etc.

Another common type of lookup table can be found in the demo under lookup_products, which counts the number of products sold. For this we use COUNTIF and UNIQUE to count occurrences. I chose to build this lookup table from another lookup table. Generally I try to avoid such indirection and try to have each lookup table use only the Event Source as source. But that would complicate and duplicate the formulas to filter only on “sale” events. It’s a trade-off.

And a third type of lookup is a so-called pivot table. Such a table groups data by one column, calculates aggregates over the groups and presents that in another table. Under lookup_pivot_daily_sales we group by day, and calculate the total amount of sales for each day. This is then used to generate the chart on the Dashboard. To make this pivot table a tad simpler, I added a date column to the sales_events. This can be done in the pivot table too, but that would make the already difficult to grasp pivot table even harder to grasp. Like above, I chose to use another lookup table as source. Same trade-off.
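For readers who think in code rather than formulas, the same two projections can be sketched in a few lines of Ruby. This is only an illustration of the idea, not part of the demo sheet: the events and their field names (order_id, at, price_eur) are made up, and the constants SALES and DAILY_SALES merely mirror what lookup_sales_events and lookup_pivot_daily_sales do.

```ruby
require "date"

# Hypothetical events; the field names are illustrative, not the demo's columns.
EVENTS = [
  { order_id: "A1", type: "sale",   at: "2022-08-01T09:15:00Z", price_eur: 2.50 },
  { order_id: "A1", type: "sale",   at: "2022-08-01T09:15:00Z", price_eur: 1.57 },
  { order_id: "B2", type: "sale",   at: "2022-08-02T14:00:00Z", price_eur: 3.00 },
  { order_id: "B2", type: "refund", at: "2022-08-05T10:30:00Z", price_eur: 3.00 }
].freeze

# Projection 1: comparable to lookup_sales_events, only the "sale" events.
SALES = EVENTS.select { |event| event[:type] == "sale" }

# Projection 2: comparable to lookup_pivot_daily_sales, grouped by day and summed.
DAILY_SALES = SALES
  .group_by { |event| Date.parse(event[:at]) }
  .transform_values { |events| events.sum { |event| event[:price_eur] }.round(2) }

DAILY_SALES.each { |day, total| puts "#{day}: #{total}" }
```

The events list is never modified; every derived number lives in a projection that can be thrown away and rebuilt.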

Dashboard again

From here we can go back to the dashboard, fill in all the data, add a chart, and QUERY() the top grossing products. My rule of thumb is to avoid any logic and calculations on the dashboard: just presentation, nothing more. I violated this a little, by summing the income and costs and negating the costs. Normally I would make another sheet where I calculate this, and then only reference this from the dashboard. But that would complicate the demo. So I left it for a next round of refactoring.

Future

With the events as data-source, we should have all data to calculate any possible projection onto our dashboard. Just add one or more lookup-tables to prepare the data, then refer to the outcome in that projection in a dashboard.

Need a monthly summary? Follow the pattern that we already have for our daily sales.

Need a date-range? Add a form (just one or two cells with a different color, or a dropdown) to the dashboard and use that to select a subset of the events, and either fill another projection with that data, or simply use the value in that field as parameter in a formula.

Sometimes, however, we will need data that we don’t yet have and that isn’t in our events. These cases basically fall in three categories:

Events from now on have an additional field holding this data.
Events that we don’t yet have.
We need another lookup table or external data source to enrich this data.

For example we may want to add to our dashboard from which country a sale was made and our events lack a country, IP-address or something that can be used to provide this country. If we can get the provider to now add that field, we will have a list of events without that data, and a list of events with it. Even if the provider fills our previous events with this data in the export, it is probably not smart to start overwriting the existing events. In Event sourced software this generally is not possible and most often “forbidden” even: events are read-only, the list is append-only. We should never change past events. We should never delete past events.

In our spreadsheet it is possible, but still unwise, as a lot of projections or calculations may (and will) break the moment our source table suddenly has columns shifted or changed. It is far easier to add a new tab with e.g. v3_events or whatever versioning you want. Then add a projection sheet which combines all source events tabs and normalizes them, using complex QUERY() formulas. The benefit is that all normalization, filtering and parsing, again, happens in the projections and nowhere else.

A second example may be when we need to extend our dashboard with data on how fast orders are delivered to the doorstep. For that we will need additional events. Events that our delivery service or postal service may have. Many will provide a CSV export that we could use as “events” and from which we can then project and calculate the time of delivery.

The third example may be when we do have said IP-address in the export but up to now never used it. And we want to get the country from that. Via some GeoIP lookup service (some of which even have addons for popular spreadsheet software), we could then fetch the country for an IP address in a lookup table.

So, any future extension is done with either more fields on existing events, additional events, or with more projections.

Need the number of days that we didn’t sell? A projection. Average order size? Projection. Recurring customers? Projection. And so on. Once the structure is established, extending it is mostly following a pattern.

Conclusion

Spreadsheets can be kept clean, clear, extendable, by organizing them with ideas from Event Sourcing.

This is by no means fully or even properly Event Sourced software. It is, however, an architecture to clearly distinguish logic from input from presentation in a spreadsheet. And to make extending it in future predictable and relatively easy.

It keeps the logic in a simple place. It keeps the number of indirections (references to other cells and other calculations) minimal and flat. And it gives you a clear place to copy and paste new entries.

[1] More sophisticated, but omitted from the demo here because of complexity, is to import, or link to, such exports. The “Event Source” then becomes a CSV file, or list of files in a directory somewhere. This greatly eases the “data inserts” as all you need to do, is download the export and plonk it in the right directory with the right name.
[2] I’m not enough of an Excel guru to know if you can actually make a sheet “append only”, but just documenting this is enough, in my experience. If you don’t trust the users of the sheet to follow the guidelines, a spreadsheet really is the wrong tool to begin with.
[3] Appending at the bottom is much easier than inserting rows at the top. The oldest entry then sits at the top, and newer ones get added below. But inserting at the top is certainly possible, just more work and more chances of accidentally overwriting data.
[4] There are solutions to this, though. One is a common bookkeeping practice to “close the books”. In Event Sourcing, this is solved with snapshots. Both, in essence, insert an additional event which says “all events up to here can be summarized with the following data:” and then includes that data. With a spreadsheet, I generally copy the sheet every year, insert one or more events that set all the opening balances copied from the previous sheet, and I’m good to go.
[5] For clarity: in DDD, which is often mentioned alongside of ES, an Aggregate has a different meaning. Here an Aggregate is just a calculated value from a column. Such as the sum of all prices or an average or a count.

https://berk.es/2022/08/16/event-source-your-spreadsheets-for-flexibility-and-maintainability

It's not Ruby that's slow, it's your database

Aug 8, 2022 Updated Aug 8, 2022

Many people keep repeating that Ruby is slow. It is. But that doesn’t matter, because your database is so much slower that it is the bottleneck. So, an alternative title would be “Ruby is slow, but that doesn’t matter for you.”

While writing a gem that offers key-value storage in your existing Postgresql database, and benchmarking it, my old mantra kept popping up: Ruby isn’t slow, the database is slow. So much so that I decided to collect the benchmarks and back up that mantra for myself.

In the industry, this is called I/O-bound, as opposed to CPU-bound performance. Most Ruby performance issues that I helped solve fell in the first category. The slowness of Ruby wasn’t causing any problems.

Ruby is slow, but…

Let’s be clear: Ruby is slow. The garbage collector, JIT compiler, its highly dynamic nature, the ability to change the code at runtime and so on, all add up to a sluggish language.

However, when people say “Ruby is slow,” this critique, on closer inspection, often falls in one of three categories:

Yes, Ruby is slow, and that is a problem for our use-case.
Yes, Ruby is slow, but in practice this does not matter for us.
Yes, the Ruby application is slow, but really, it’s the stack, not just the language.

I want to dive deeper into the last one, but let’s get the first two out of the way, first.

Ruby is becoming faster year over year, and while that is very welcome, it probably won’t matter in the bigger picture:

I also think that speed isn’t a major factor slowing down Ruby adoption. Most people who use Ruby don’t need it to be faster. They like the extra free speed, sure. But they weren’t avoiding Ruby for speed reasons. – https://www.fastruby.io/blog/ruby/performance/why-wasnt-ruby-3-faster.html

Because performance really is very dependent on context:

[…] how fast does your system need to be? And how fast is it now? If you can test its current performance and know what good performance looks like, then you should feel confident in making a change. Sometimes making one thing slower in exchange for other things is the right thing to do, especially if slower is still perfectly acceptable. – Sam Newman in Building Microservices

So, often it hardly matters that it is slow, because your use-case does not need the scale, speed, or throughput that Ruby chokes on. Or because the trade-offs are worth it: often the quicker development, cheaper development, faster time-to-market etc. is worth the extra resources (servers, hardware, SaaS) you must throw at your app to keep it performing acceptably.

Not always, but often.

A quick benchmark

To re-confirm how badly Ruby performs, I made a quick benchmark comparing Ruby and Rust for a (simplified) real-world job that I recently ran into: parse a CSV, grab a number from a column, then bucket-count the results. The simplified version does this for a smaller CSV (my actual version did this for a CSV ten times as big as the example used here). The example counts how many votes a movie has and groups those counts: between 0 and 10 votes, between 10 and 100 votes etc.
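To make the job concrete, here is a minimal sketch of such a bucket count in Ruby. It is not the code from the benchmark: the inline sample data, the column name vote_count and the helper bucket_counts are assumptions for illustration, and boundary values simply land in the first bucket that covers them.

```ruby
require "csv"

# Hypothetical sample data; the real benchmark reads a much larger file.
CSV_DATA = <<~DATA
  title,vote_count
  Movie A,3
  Movie B,42
  Movie C,420
  Movie D,4200
  Movie E,42000
  Movie F,7
DATA

# Buckets by order of magnitude; the last range is endless.
BUCKETS = [0..10, 10..100, 100..1000, 1000..10_000, (10_000..)].freeze

# Parses the CSV text and counts how many rows fall into each bucket.
def bucket_counts(csv_text)
  counts = Hash.new(0)
  CSV.parse(csv_text, headers: true) do |row|
    votes = Integer(row["vote_count"])
    bucket = BUCKETS.find { |range| range.cover?(votes) }
    counts[bucket] += 1 if bucket
  end
  counts
end

COUNTS = bucket_counts(CSV_DATA)
BUCKETS.each do |range|
  puts "#{range}: #{'#' * COUNTS[range]} - #{COUNTS[range]}"
end
```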

To compare somewhat honestly, I tried to create a version in Rust and Ruby that are internally as similar as possible. The result is both ugly Ruby and ugly Rust. Buggy even. And neither is optimized for performance. I’m certain the Ruby and Rust versions can be improved (but even as a Ruby expert and Rust novice, I already know the Rust version is easier to optimize further than the Ruby version).

All benchmark code is found in an accompanying GitHub repo.

This is not Proper Science, but it shows the obvious: Ruby is slower[1]. Rust:

ber@berkes:db_benchmarks ⌁ time ./target/release/movie_ratings 
Some(0..=10): ###################### - 445
Some(10..=100): ############################################################ - 1208
Some(100..=1000): ############################################################################################################### - 2229
Some(1000..=10000): ############################################# - 914
Some(10000..=18446744073709551615):  - 7

real	0m0,162s
user	0m0,146s
sys	0m0,016s

Ruby:

ber@berkes:db_benchmarks ⌁ time ruby movie_ratings.rb 
10000..:  - 7
1000..10000: ############################################# - 914
100..1000: ############################################################################################################### - 2229
10..100: ############################################################ - 1208
0..10: ###################### - 445

real	0m1,491s
user	0m1,389s
sys	0m0,103s

The Rust version is about 10x as fast as the Ruby version. Relatively, this is huge! With larger files this speed difference does not increase linearly, though, but erratically. A large part of this time is startup time (hard to measure in this use-case) and the JIT compiler. Another part is the GC in Ruby “arbitrarily kicking in” and halting all progress until it is finished: when dealing with large datasets, this becomes a real and annoying problem.

But how about the absolute difference here? The Ruby version is about 1.3 seconds slower. Enough to be a little annoying while testing and developing. When you run this over and over (in automated tests, I hope!) this gobbles up minutes in a day. But a difference of 1.3 seconds on a script that is run about 20 times while developing and then maybe weekly in a cron? Not at all: you wasted about half a minute while developing, and then roughly 5 seconds per month.
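The arithmetic behind that estimate, as a small sketch. The two timings are the "real" values from the runs shown earlier; the usage pattern (20 development runs, 4 cron runs per month) is an assumption taken from the rough description in the text.

```ruby
# Wall-clock ("real") timings from the two runs shown earlier, in seconds.
RUST_SECONDS = 0.162
RUBY_SECONDS = 1.491
EXTRA_PER_RUN = RUBY_SECONDS - RUST_SECONDS

# Assumed usage pattern: about 20 runs while developing, then a weekly cron.
DEV_RUNS = 20
CRON_RUNS_PER_MONTH = 4

EXTRA_WHILE_DEVELOPING = EXTRA_PER_RUN * DEV_RUNS
EXTRA_PER_MONTH = EXTRA_PER_RUN * CRON_RUNS_PER_MONTH

puts format("extra per run:          %.2fs", EXTRA_PER_RUN)
puts format("extra while developing: %.1fs", EXTRA_WHILE_DEVELOPING)
puts format("extra per month (cron): %.1fs", EXTRA_PER_MONTH)
```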

I’m only focusing on CPU here, but memory is just as big an issue, yet far less visible in the typical use-case of modern software: customers, interacting with server software, will experience slowness, but won’t directly experience memory use. The main reason, however, not to dive too deep into this, is that benchmarking memory is rather complicated.

So, yes. Ruby is slow and uses a lot of resources. It makes trade-offs, so maybe the overall cost, including development, is lower. It depends on your case; it is never an absolute.

It’s the stack, that makes it slow, not just the language.

Now, let’s address the elephant in the room: Rails. While there are some Ruby projects that don’t deal with Rails, the majority of Ruby code running in production is running Ruby on Rails. I personally write most of my code in Ruby, but hardly ever write Rails (I don’t like Rails very much; another post, another time), but I’m also aware that I’m an exception to the rule: Ruby development is almost always “web-development in Rails.”

One issue with Rails (or, arguably, a benefit?) is that it is highly coupled to the database. Rails is all about The Sacred Dictating Database. Without a database, Rails is pretty useless, and gets in your way more than it helps[2]. Furthermore, Rails is about the web. You can do non-web-stuff in Rails, but that really makes no sense at all: Rails is for HTTP. And Rails is big, massively so[3]. And often chooses ergonomics (developer friendliness) over performance, as does Ruby, the language. This is fine! But this means that in Rails, even more than in Ruby, performance is a problem.

So, the “stack” means “Ruby on Rails using a database.” And because Rails is web, doing only[4] HTTP request-responses we’ll be looking at Ruby in context of web-services only.

To dissect the issue, I’ll be comparing some non-Rails, non-HTTP Ruby scripts.

Ruby isn’t particularly good at juggling significant amounts of data, yet this is what, deep down, webservices are all about. To illustrate the relative performance, we compare writing and reading a million records to various sources: memory, a SQLite database in memory and a Postgresql database.

Obviously, and unsurprisingly, memory is magnitudes faster than anything else[7]. Postgresql, here, is a docker container that gets only one CPU, and isn’t tuned at all. This is not about the absolute numbers, so it doesn’t matter that much what the exact Postgresql setup is. What matters is the magnitude of difference.

ber@berkes:db_benchmarks ⌁ ruby ruby_slow.rb 
                           user     system      total        real
Mem write              0.005277   0.000000   0.005277 (  0.005271)
Sqlite mem write       0.080462   0.000000   0.080462 (  0.080464)
Postgres write         0.665662   0.151700   0.817362 (  3.068891)
Mem read               0.002772   0.000000   0.002772 (  0.002767)
Sqlite mem read       10.323161   0.021355  10.344516 ( 10.345039)
Postgres read          8.296689   0.041118   8.337807 (  8.682667)
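The in-memory rows of such a table can be reproduced with the Benchmark module from Ruby's standard library. The sketch below is not the code from the accompanying repo: the record count, the labels and the plain Hash used as store are assumptions, and the database variants (which need extra gems and a running server) are left out.

```ruby
require "benchmark"

# Assumed record count; kept small so the sketch runs quickly.
RECORDS = 100_000

# A plain Hash stands in for "memory" as the storage backend.
STORE = {}

Benchmark.bm(12) do |x|
  x.report("Mem write") { RECORDS.times { |i| STORE[i] = "value-#{i}" } }
  x.report("Mem read")  { RECORDS.times { |i| STORE.fetch(i) } }
end
```

Adding a database variant comes down to another x.report block that performs the same writes and reads through the database client.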

Writing into a database is slow. And it gets even slower when you have tuned the database for read speed with indexes, and/or when that database is under load.

The above timings do need more dissecting, though. They don’t tell what makes it slow. And, surprisingly, this is another part of the stack: the ORM. I used sequel, because it is simpler, so we can dissect it more easily.

Looking at two flamegraphs, we see that when inserting, Postgresql really is the bottleneck. This makes sense, because a database has quite some work to do on insert. And that while our table is simple and has only one index, of the lightest type at that.

Writing to a database is slow. So slow that all other timings become insignificant.

flamegraph of the Postgresql insertion variant

flamegraph of the Postgresql read variant

On reading, Postgresql is less of a bottleneck. This is partly due to the extremely simple lookup (no joins, using one index, very little data to fetch, etc). The parsing (juggling of data) takes the majority of time: DateTime::parse. Or, reversed: DateTime::parse is such a performance hog that it makes the time spent in the database insignificant.
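To get a feel for the cost of that parsing step on your own machine, a sketch like the following times DateTime.parse in isolation. Comparing it with Time.iso8601 is my own choice of alternative for illustration; nothing here claims this is what sequel or ActiveRecord do internally, and the timestamp and iteration count are made up.

```ruby
require "benchmark"
require "date"
require "time"

# Hypothetical timestamp in the shape a database driver might hand back.
TIMESTAMP = "2022-08-08T12:34:56+00:00"
ITERATIONS = 50_000

PARSE_RESULTS = Benchmark.bm(16) do |x|
  # Heuristic parser that has to guess the format on every call.
  x.report("DateTime.parse") { ITERATIONS.times { DateTime.parse(TIMESTAMP) } }
  # Parser for one known format (ISO 8601 / XML Schema datetime).
  x.report("Time.iso8601")   { ITERATIONS.times { Time.iso8601(TIMESTAMP) } }
end
```

Both calls yield the same instant for this input, so the difference in the report is purely the cost of parsing.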

We have now identified two performance-problems in our stack: Postgresql, and the ORM.

To be clear: this isn’t to show that sequel is slow, or that DateTime::parse is problematic[8]. This is to show that the more tooling we add to our stack, the worse the performance becomes. Again: obvious and unsurprising. But worth reiterating.

Before pulling all of Rails into the benchmarks, let’s isolate the ORM in Rails: ActiveRecord. Again, very much simplified and again, the queries don’t fetch anything complex, so the time spent in the database is very little.

                           user     system      total        real
Postgres Sequel write  0.679423   0.112094   0.791517 (  2.963639)
Postgres Sequel read   8.798584   0.011155   8.809739 (  9.194935)
Postgres AR write      1.741980   0.189130   1.931110 (  4.404335)
Postgres AR read       1.551020   0.040676   1.591696 (  1.922000)

Writing Through ActiveRecord: Flamegraph AR Write

Reading Through ActiveRecord: Flamegraph AR Read

Reading Through Sequel: Flamegraph Sequel Read

Writing Through Sequel: Flamegraph Sequel Write

We can clearly see that the DateTime::parse from sequel remains a problem. I suspect ActiveRecord uses a much better performing method to hoist the datetime from Postgresql into a native DateTime.

Yet relatively speaking, this poor performance of Ruby hardly matters. It hardly matters that Ruby halts all code for 15ms to do garbage collection, if the fastest database query takes 150ms. The entire overhead of JIT, all the bazillion layers of Rack and Rails HTTP parsing and forwarding, is nothing beside a 190ms insert query writing to the database.

The example fetches a single record from a single table. It uses none of the things that a relational database is so good at, but which also cause actual performance issues in that database: no joins, sorting, filtering, calculations etc.

So, even with a very poor performing ORM, the Database remains the primary time consumer.

Scaling Up

We’ve all been there: our Ruby/Rails code grows so convoluted, becomes so poorly set up, that the stack (or your custom code) really is the bottleneck. This is easy to solve! Just add extra servers. A single request won’t become much faster, but at least the load on the server no longer brings down the performance for all other users. Your app won’t become faster, but will be able to grow to more users.

You can easily do this, until the database becomes the bottleneck again. Writing to a relational database is always a centralised problem: we can only scale that up vertically, with a larger database server. For the query (read) side, we can add complexity to solve it: read-copies (formerly known as “slaves”). Almost all common relational database servers allow this. It’s not trivial, because it introduces “eventual consistency” to a setup/framework that was never designed to be eventually consistent, but it’s doable. Writing (create, insert, update, delete etc) isn’t: the database will, at some point, be your bottleneck. Unless it isn’t ever: but then performance never was a problem to begin with.

Solving performance issues in Ruby code is easy: just throw more servers at it. Solving database performance issues isn’t that easy because scaling up a relational database is hard or even impossible at some point.

Another conclusion here would be that to keep your code scalable you should keep as much logic, transformations, etc in code[9]. By pushing business logic, constraints, validations and calculations into the database, you lose the simplest, and often cheapest, means of performance gains: “Throw More Servers At It.”

Rails

As mentioned several times, Rails’ complexity does cause real and hard to solve performance-issues. So we need to look at that too.

To throw a quote by DHH back at Rails:

“All of the fancy optimizations are optimizations to get you closer to the performance you would’ve gotten if you just hadn’t used so much technology” ☝️ https://macwright.com/2020/05/10/spa-fatigue.html

https://twitter.com/dhh/status/1259644085322670080

Rails’ internal complexity has two profound effects on performance. One is that it has tons of abstractions, critically referred to as “Black Magick.” The other is that your data passes through all these layers and all this complexity before the request-response of a typical HTTP cycle is finished.

With Ruby being a relatively slow language when handling data (see above), the more code your data has to pass through, the slower the result becomes. This is true for all software, but amplified with Ruby. Rails’ 163500 lines of Ruby (as of 01-04-2022) certainly don’t help speed this up.

“Lines of code” are not a metric for performance, but they are an indication. Don’t forget that even the smallest rails project still boots hundreds of thousands of lines of code, even if your data passes through only a fraction thereof.

Benchmarking Rails has been done over and over. Instead of continuing with the “benchmarks” and flamegraphs for the entire stack, including Rails, I’ll now get a bit more meta. Less on numbers, more conceptual. Because with Rails, I’m convinced the performance problems are conceptual. Technical performance problems, as shown above, are caused by Ruby, not Rails.

ActiveRecord (the implementation in Rails, not the pattern per se) is an abstraction over a system (a relational database) that requires a lot of detailed knowledge to keep performant. ActiveRecord (the pattern) not only is a leaky abstraction, it mostly is an abstraction that hides details that cannot be hidden and should not be hidden.

More practically: That one User.active.includes(:roles) I threw in to fix an N+1 query years ago, dynamically chooses what it thinks you need. It may “suddenly, magically, dynamically” start to build other JOINS and queries and degrade your performance. (Okay, not runtime from one minute to the next, but over small changes).

I’ve had a database server cluster go down on a multi-million-users app because of this: a simple, unrelated change in an unrelated controller caused Rails to switch over to an outer join with a humongous materialized view that was never meant to be joined this way (it was for reporting). But Rails’ magic decided that from now on, it was going to use that. Every pageload suddenly did a ~2s database query, gobbling up all CPU and IO on the database server. Ouch.

A stupid mistake, certainly. One that we did not see because in development and test, the performance never decreased. But one that we should’ve noticed.

I do, however, blame “Rails” for making it so easy to make that mistake. In this specific case, we found and solved it fast. But I’ve worked on codebases that were riddled with such mistakes. The only reason those projects kept running was because of the huge Heroku server (@$1200/month) keeping it afloat for a few hundred visitors. A day. Codebases where such mistakes don’t bring down database clusters, but incrementally accumulate to an expensive, and terribly performing app. A 20ms slowdown is hardly measurable. A hundred 20ms slowdowns, added one by one over months, do make the response unacceptably (it depends, I know) slow. And, worst of all, where these “mistakes” were labelled as “Done The Rails Way” by the team.

Rails is full of such footguns (which Rails calls sharp knives). Most of which are harmless on their own. But which compound. It is easy to join up tables in suboptimal ways, to sort or filter on un-indexed columns. Active-record is filled with tools that make it easy to abuse the database horrendously, without warnings. The number of Rails apps that I worked on with some form of .sort(params[:sort_by]) is astounding: in 2021 alone, I worked on three separate Rails apps, all of which could DoS the database by firing requests with ?sort=some_unindexed_field. While this example is extreme and may count as a security issue, it illustrates how easy it is to make your app’s performance horrible.

The sorting-by-un-indexed-field example illustrates how Rails’ coupling to the database makes many of its performance issues database issues.
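One way to defuse this particular footgun is to allow sorting only on columns that are known to be indexed. The sketch below is plain Ruby without Rails; the column names and the helper safe_sort_column are hypothetical, and in a real app the allow-list would have to match the indexes that actually exist.

```ruby
# Hypothetical allow-list: only columns that have an index may be sorted on.
SORTABLE_COLUMNS = %w[created_at name].freeze
DEFAULT_SORT = "created_at"

# Returns a column name that is safe to hand to an ORDER BY clause.
# Anything not on the allow-list (including nil) falls back to the default.
def safe_sort_column(requested)
  SORTABLE_COLUMNS.include?(requested) ? requested : DEFAULT_SORT
end

puts safe_sort_column("name")
puts safe_sort_column("some_unindexed_field")
```

The value from the request then only ever selects from the list, instead of being passed straight through to the query.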

In my experience, performance issues in Rails are always:

N+1 queries. Easy to detect. Hard to fix (without introducing massive coupling issues).
Unoptimized joins. It’s far too easy to add a simple has_many which allows developers to fire queries at the database that are way too heavy. This is near impossible to fix once introduced and spread through the app. There’s always code that, in the end, runs something like User.with_access_to(project).notifications.last.sent_to. And which happens to query over five join tables and joins on at least one index that wasn’t meant for this. Causing some 800ms query. On each pageload.
Unoptimized where, group and order calls. Using columns that are hard or poorly optimized to filter, group, or order on. Using non-indexed columns.

My rule of thumb is that each added or removed where, has_many, group or any such active-record method must be accompanied by a database migration. There are only two cases where you don’t need to optimize the database for this new way of querying it. One is when you already had indexes that you weren’t using before (meaning it was poorly optimized before). The other is when you re-use existing indices, in which case you most probably should refactor to move the querying to a single responsibility (e.g. a named scope).

With Rails’ human-friendly active-record API, it is easy to forget that you are still just querying a complex relational database. One that needs nudging and tuning and tweaking just to keep serving you the data within reasonable time.

With Rails it is easy to stack up many tiny mistakes that make your Database the bottleneck. But even if you have all that under control, a high performing database-call is still a magnitude slower than many other calls.

It is still a factor thousand or more faster to fill some array from memory and code, than to fill that array from a database. As I showed in the first paragraph.

So? What should I do about it?

Some rules of thumb that I employ are:

Don’t use the database when avoidable. Which is always more often than I think. I don’t need to store the 195 countries of the world in a database and join when showing a country dropdown. Just hardcode it or put it in config that is read on boot. Hell, maybe your entire product catalogue of the e-commerce site can be a single YAML read on boot? This goes for many more objects than I often think.
Keep all logic out of the database. It already is the slowest point. And hardest to scale up.
Be vigilant of any sort(), where(), joins(), etc calls. They must be accompanied by a migration to tune at least the indexes, if added (or removed).
Keep all database-calls simple. As few joins as possible, as few filters and sorts as possible. Databases in general can optimize much easier for this. This also keeps the app decoupled from the actual database details.
N+1 Queries are not always bad. Sometimes even preferred. Because they enable business-logic to remain in code. And keep the logic of what to fetch in a single place, allowing performance optimization there.
Remain aware of where the actual performance issues fall. Proactively scale up according to whether the performance is IO-bound or computational. And pray for it to be computational.
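The first rule of thumb, keeping reference data out of the database, can be as small as this sketch. The YAML content, the constant names and the helper country_options are assumptions for illustration; in a real app the YAML would live in a config file that is read once on boot.

```ruby
require "yaml"

# Hypothetical config; inlined here so the sketch is self-contained.
COUNTRIES_YAML = <<~YAML
  NL: Netherlands
  DE: Germany
  BE: Belgium
YAML

# Parsed once at boot and frozen; no query or join needed afterwards.
COUNTRIES = YAML.safe_load(COUNTRIES_YAML).freeze

# Returns [label, value] pairs for a dropdown, sorted by country name.
def country_options
  COUNTRIES.sort_by { |_code, name| name }.map { |code, name| [name, code] }
end

country_options.each { |name, code| puts "#{code}: #{name}" }
```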

[1] To push the counterpoint though: as a Rust noob, I spent over an hour writing the Rust version and as a Ruby senior (10+ years), less than 10 minutes. I’d need to run both versions way over 2000 times before the extra time I spent developing the Rust version starts paying off in extra time waiting for it to run.

[2] I am certain that you can show me a project where you run Rails without a database and where that makes sense. Those cases exist. Some that I came across are: “I already know Rails, but not Sinatra,” or “management requires us to run everything on a similar codebase.” Actually, scratch that last one. Most are valid reasons, except the last one: that is a horrible reason to choose Rails.

[3] A quick grep: Over 9000 classes, over 33000 methods; excluding all the magic dynamic methods like the ones that wrap around your database model. This excludes the 70-something dependencies that rails itself comes with.

[4] A common Rails app will be sending email, probably generating PDFs, ingesting CSVs or exporting CSVs, but all interaction, typically, goes through HTTP. I’m aware of the exceptions (I worked on them) where Rails is used only for running cron-jobs, ETL-pipelines or even media-encoding, but those are really that: exceptions.

[5] Ironically, the performance issue becomes less pronounced in this non-HTTP, non-Rails context, yet in these cases people generally dismiss Ruby as an option because of its performance issues. Which, catch-22, is one of the reasons Ruby is hardly used outside of Rails (and/or Web).

[7] What might be surprising, is that lookups from the SQLite in memory are slower than lookups from the database. But this illustrates another important issue: the database runs in a separate thread, maybe even on separate hardware. So load gets distributed: with SQLite, and our memory example, one single Ruby thread is doing all the filtering, fetching and hoisting. With an external database, this is offloaded. Depending on your setup, the Ruby thread might even continue work while the database does its lookups. In this case, Postgresql, which is optimized to filter and fetch data, can do this faster than SQLite-inside-ruby. In a typical production setup, Postgresql is even better suited for this.

[8] Please do note that while DateTime::parse is slow, this function is written in C. The slowness is not because it’s written in Ruby, but probably because parsing such complex texts is slow. It would probably be just as slow for a feature-comparable version in Rust.

[9] There are many more reasons why this is a better idea. The most obvious one is that you can never put all business logic in the database, even if you wanted to. So you will have business logic in multiple places without any structure of what goes where. So the obvious solution to keep it in one place is… to keep it in one place. The only place where it all can be kept: your application.

https://berk.es/2022/08/09/ruby-slow-database-slow

The Waning of Ruby and Rails

Mar 7, 2022 Updated Mar 7, 2022

Almost 12 years ago I answered the StackOverflow question “Is Ruby on Rails (or at least the community) dying?” [closed] with “[there is] still [a] very active community around it”. Today, in 2022, I see this declining more rapidly than before.

The most obvious (and unscientific) place to look is Google Trends. For both Ruby, and Ruby on Rails, we see a clear downward trend for years.

Google Trends

First a sharp bump, then it settles into a sideways pattern, but after 2016 decline sets in. I don’t have an explanation for the sharp decline in 2020 and presume it is a glitch in Google’s data. Yet the trend remains clear: downwards.

Back in 2010, I answered:

Ruby on Rails was a Hype. That means a lot of people jumped on the bandwagon because that is what they do: jumping on bandwagons (for a living).

After that hype, many communities popped up, in various languages that mimic Rails. Or try to. Or just took the good ideas and applied them to their community. Now you have gazillion halfbaked PHP-frameworks, and a few actually good ones. You have Django (python), Zend, Symfony (PHP) and even in Ruby, some alternative frameworks. That has spread the attention. There used to be only One Good Framework (sic.) now there are many.

That said, Rails 3 has just been released. Rails 3 is cutting-edge again. It has all the ingredients for noSQL (the one-but-latest Hype) HTML5 (the latest Hype) and many javascript-frameworks and interactions (the next-to-be Hype).

That said, Rails is not just Hypes. It is actually a fantastic framework. With a still very active community around it. Just look at github, and visit the trending repo’s there once in a while and you will see a Great Rails Thing there every week.

If you want to keep up to date, I would advice:

http://www.rubyinside.com a blog all about Ruby.

http://5by5.tv/rubyshow a podcast with (most of) all the news in Rails and Ruby land.

First the last: dedicated podcasts or ruby news-sites have all disappeared, many haven’t been replaced. There’s a fantastic weekly newsletter, but that’s about it.

This highlights a problem that is a feedback loop: without good information (and tutorials) the influx of new devs will dry up. And without a flow of new developers, there is less demand for this information (and tutorials). If we look at e.g. Udemy, as of today (March 2022) there are a meagre 109 courses on Ruby (on Rails). Compare that to over 10,000 each for Python, Java, or JavaScript. One of the best Rails courses had its last public update in 2020. Other services, such as go-rails, do offer courses, so it may just be that the landscape is changing, not per se becoming worse.

With a still very active community around it. Just look at github, and visit the trending repo’s there once in a while and you will see a Great Rails Thing there every week.

This is no longer the case. Support, gems, and developers working in the open are waning. As an example, let’s look at the gems for a service that wasn’t around during the initial hype, and one that’s not tied to Rails: Azure.

The support for Azure is in a bad state. Many gems are unmaintained, with hardly any activity in the last few years and a lot of unresolved issues. For example, the official library by Azure itself has 22 issues open, amongst which are dependency issues caused by depending on very old versions of other libraries (Nokogiri). I know, this is N=1, but I picked this as an example, not as proof.

I’ve recently started working on a (Ruby, not Rails) project where we need a lot of integrations: payment service providers, cloud storage, project management and so on. Of the modern SAAS companies, those started in the last decade, almost all lack official Ruby clients or SDKs for their APIs. Yet they offer them for Java, JavaScript, Python or even Rust.

Slack has no official clients or SDKs for Ruby (but does for other languages), nor does Dropbox. Azure, as linked above, is hardly maintained. Of all HubSpot API clients, the Ruby version is the least popular (based on stars and forks) and the least frequently updated. Modern project management tools like Monday, Teamleader or Notion lack any reference to Ruby at all. Do note that these are examples of popular SAAS that don’t have primary Ruby support. Others do offer it: from AWS to Square, there are top-notch, well-maintained gems for them.

I should run an actual data analysis over Ruby gems, their repos, open issues and so on, but glancing over the numbers already shows worrisome trends. We can see that, if we grab a handful of SAAS services, Ruby support is lacking. Back in 2010, when I answered that SO question, this was the exact opposite: the most prominent SDKs or API clients were the Ruby ones. One obvious reason is that, back then, the teams developing the APIs and SAAS did this in Ruby themselves. Companies that were around then often have good Ruby clients. Companies from the last few years often do not. And the community versions of those are often missing entirely or poorly maintained.

If we look at large SAAS or software companies, we can see that the ones running on Ruby (on Rails) are all from the early days. I have a hard time finding any successful SAAS that started building their products on Rails after 2010. GitHub: 2008, Shopify: 2006, Twitter: 2006, Groupon: 2008, Zendesk: 2007, Airbnb: 2008, Fiverr: 2010. The only larger companies running Ruby or Rails and started after 2010 that I could find are Stripe (2011) and GitLab (2014). Discourse and Mastodon are the only recent popular Ruby-based open-source projects that I am aware of.

Obviously there’s a strong survivor-bias and skewed correlation here: successful companies take decades to become that. So naturally successful SAAS will be older, regardless of using Rails.

popped up, in various languages that mimic Rails. […] Now you have gazillion halfbaked PHP-frameworks, and a few actually good ones. You have Django (python), Zend, Symfony (PHP) and even in Ruby, some alternative frameworks.

My point, back in 2010, was that Rapid Application Development (RAD) frameworks using a Model View Controller (MVC) architecture were in demand, fueled by the success of Rails. I have a separate post planned on “RAD web frameworks”, MVC and ActiveRecord, but it is safe to say that such frameworks, amongst which Rails, have found their niche, yet are by no means a silver bullet. It took us some iterations to find that these architectures are not well suited for a large group of problems and domains, and to develop and find alternatives. So Rails’ decline in popularity may be fueled by a generic decline in MVC and RAD, regardless of the language.

When we look at some other sources, the 2021 StackOverflow survey results are telling too: Ruby and Rails dangle in the bottom quadrant of all lists. Ruby is “dreaded” and “loved” in equal measure. Unfortunately, they don’t publish accessible trends, but StackOverflow has a separate tool based on activity on StackOverflow by tags: there too, Ruby is both waning and in the lower quadrant, and has been for years. This is confirmed by the Tiobe index, where Ruby is declining slowly, year after year, both relative to other languages and in absolute terms.

So, purely anecdotally, mostly by glancing over published numbers, and based on a gut feeling, I would revise my statement from 2010: yes, Ruby, Ruby on Rails and the community are clearly waning. Not dying, though! Just like Pascal and COBOL (and Perl) never died, if only because of legacy, Ruby will remain around. I would certainly not say it is a ship going down, but it is a ship slowing down month after month.

Popularity, however, says nothing about quality. If popularity were a measure of quality, then Internet Explorer 6 would have been the best browser ever (note for millennials: it wasn’t). Ruby still gives the fantastic development experience it gave when it was released in 2005. It only became better. Rails still is a great way to get a prototype, demo, or minimum viable product online in days with the least surprises.

Does that mean learning Ruby or Rails is a poor career choice? Certainly not! The demand for Rails and Ruby developers is as high as ever, if only because general demand for developers is ever increasing. And all that SAAS, built since 2008, needs developers for the coming decades. But be aware of the risk of becoming the COBOL or Perl developer maintaining some dusty legacy if the downward trend for Ruby continues as it did over the last decade.

https://berk.es/2022/03/08/the-waning-of-ruby-and-rails

Dutch government killing Post-a-Coin, my crypto startup

Jan 9, 2020 Updated Jan 9, 2020

Last summer I started a sideproject. Post-a-Coin. A simple and straightforward idea, really: you can buy high quality postcards with Bitcoin preloaded to give as a gift. For the crypto-nerds: a paper wallet, really^1.

I envisioned it as a small side-project: a tiny company that makes some revenue to keep itself running, but which, above all, helps to put bitcoin into more hands.

An example of a card I'm selling

This autumn, I decided to stop rolling it out. To kill it off.

The reason is the draconian Dutch implementation of the European AMLD5 directive, which is certainly going to solve the “rampant problem with money laundering with cryptocurrencies /s”.

This law is the Dutch interpretation of a European guideline, but the Dutch version goes much further than what the European directive requires.

Here, the Dutch Central Bank will supervise all companies dealing with cryptocurrencies: custodial wallets, exchanges, services, etc. This law was supposed to take effect today, January 10th 2020, but has been postponed for bureaucratic reasons: not because it is a bad law, but because of some technicalities around a debate in the House of Representatives being cancelled last year.

Don’t get me wrong, I consider it a good thing when governments step up and try to protect the society they represent, with laws. I consider it Good For Bitcoin, when our government decides to supervise the companies that deal with bitcoin (or any other cryptocurrency). If recognition and supervision by a government can fix the bad reputation of Cryptocurrencies, then by all means: Implement it!

Whenever I mention Post-a-Coin to family, friends or others unfamiliar with the crypto-world, I get a reaction that includes the words “Criminals”, “Paedophiles” or “Ransomware”. Sometimes all three. Not having to fight that perception would help Post-a-Coin a great deal.

If that supervision is implemented well, it will show the world, and Joe-average, that those supervised cryptocurrency-companies are serious, well-run, law-abiding companies. And not some dark-web-criminals putting ransomware on your PC or selling childporn.

The problem lies in the “implemented well” part, though. Because the Dutch supervision will not be implemented well. At all! (Dutch twitter thread highlighting all the democratic and legal issues with this implementation).

I’m aware of at least one other company stopping its business due to this law.

For me, for Post-a-Coin, the problems with the law are five-fold: It is expensive, complex, unfair, will result in monopolies and be wholly ineffective:

It is expensive. How much each supervised company has to pay is unclear, but just the fee one has to pay to be supervised is going to cost a lot. Numbers like €150,000 or €77,000 per company, per year, have been going around. What is certain is that “the supervised group of companies” has to pay the entire cost of that supervision itself.

My company, selling fifty-odd Bitcoin postcards a month, will certainly not be able to pay such fees. I would need to turn it into a company that rivals the likes of Hallmark, just to be able to pay those kinds of fees. And even if that were realistic, I don’t want to grow into the CEO of one of the largest postcard companies in the world: I want it to remain a nice sideproject.

Edit: the Central Bank gave a final cost specification. I missed that: even though I’m on a mailing list AND asked them about this several times, they failed to contact me about this final spec. In any case: €5000 for a request for registration (and it looks like you have to pay regardless of whether they register you in the end). The costs for the following years are unknown and expected to be higher, because that is when “the niche” needs to cover the entire cost of all the registrations itself.

It is complex. I’m running this company from a shared office space. It is just a Sole Trader entity (Eenmanszaak in Dutch). As a tiny startup, it makes no sense at all to set up an LLC entity (B.V. in Dutch) with a board of directors and such. Yet this is required. Not by the letter of this law, but everything in the legal documents assumes a large company with HR divisions, a Director of Risk and Compliance, boards of directors and so on.

At a meeting with the Dutch Central Bank, I asked them about the procedure by which they will review the people on the board of directors, a requirement of that supervision.

The answer was telling: “Your HR department will need to write a document about each director of your company”. I could buy myself a box of different hats, one with “HR” written on it, and then write those documents myself. It shows how disconnected an institution like the Central Bank is from the world of small companies.

They simply cannot envision companies being small. Understandable: up to a few years ago, all they dealt with, was supervising a handful of large banks. Supervising 20-odd of the biggest multinationals in The Netherlands is entirely different from supervising several hundreds of startups who are inventing weird stuff with new and hard-to-grasp cryptography in monetary systems. Like postcards that hold Bitcoin.

In order to comply, I would, basically, need to enlist Risk and Analysis people just to comply with the regulations. Not because the (growing) business needs them. Complying is actually really easy here. I know the risks, and am 99% certain that I’ll catch a terrorist trying to send a bitcoin-loaded postcard somewhere. I’ll know immediately when you are using this service to launder money (and will think you are a really dumb money-launderer for using such an inefficient mechanism to launder your money).

It is unfair. A larger company can very easily just move outside of The Netherlands. Technically, any company operating in The Netherlands has to register, but if they don’t, there’s nothing in place to force them.

So any company with enough legal personnel on board, will just continue operating from Malta, Panama, Ukraine or whatever letterbox-setup they can afford. But moving my entire business (Me, a laptop and an offline printer) to Panama is a ridiculously heavy move, just to sell some postcards.

In other words: if you can pay lawyers to set up some foreign construction, you don’t need to comply. You can basically choose to pay a lot to comply, or to pay a lot to not have to comply.

It won’t achieve what it is trying to achieve: those looking to set up exchanges or other crypto products in a shady or gray area of operation will still be able to do so, as explained above. Only those with good intentions are hit by this supervision. I assume and hope that the benefits for those participating outweigh the downsides, because otherwise this law will certainly achieve the exact opposite. The well-intended players will be outperformed by those who circumvent these laws. Because those who evade the laws operate far more cheaply.

It creates monopolies: all of the above shows that after the law takes effect, we’ll have a large barrier to entry for new companies. I don’t blame the current Dutch exchanges or hold anything against them.

But they are the ones benefiting from this in several ways, while it becomes far harder for their competition to grow and accelerate. The existing exchanges benefit from a shakeout and a high barrier to entry, if only because it gives them less competition to worry about.

The Dutch Central Bank benefits from fewer participants too: they get their money either way, but with fewer participants they have to do less work. So there are only incentives in place to reduce the number of companies in cryptocurrency.

I really liked the idea of Post-a-Coin.

But with such laws coming into force, the Dutch government forces me to close my company. Before I’m legally required to apply for registration at the Central Bank, I’ll close down the shop.

Another startup killed by bureaucrats trying to control a niche with rules and laws.

^1: And yes, dear crypto-nerds, I know this is “insecure”. I’ve seen your private key on that card, because I will print it on the card and cover it with a neat scratch-off-sticker. You only have my dearest and sincerest promise that I’ll delete them. Which, in my opinion is good enough security for a $25 gift. Which is also an important reason that this product cannot and will not be used to launder money: it is technically completely unfit for that.

https://berk.es/2020/01/10/dutch-government-killing-crypto-startups

Using Bitwarden to store Ansible-vault password

Oct 18, 2019 Updated Oct 18, 2019

When provisioning servers with Ansible, managing the secrets is quite a hassle, even with ansible-vault helping there. A typical ansible-vault contains all sorts of critical secrets, such as API keys, root (sudo) passwords of sysadmins, database passwords and so on. And it is protected with a single password. Anyone running Ansible needs that password; otherwise Ansible won’t provision properly.

Storing that password in plain text on your hard drive is a definite no-go. That is insecure, and with more people involved requires syncing it across all your colleagues’ computers.

The solution is to use a password manager that allows sharing a password with a team. And then hook that up to ansible, so ansible can read it from there.

Ansible has a feature that allows users to define a script which returns the vault password: --vault-password-file.

Ansible itself has an example script, which uses your OS keyring as source for this vault password. But, alas, I’m not very comfortable with gnome-keyring and prefer Bitwarden. Most of these OS-keyrings also don’t allow sharing with other users.

Bitwarden is my preferred password manager because it is the only truly open-source password manager with sharing and syncing built in. Everything in and around Bitwarden is open source, even the server handling the syncing. You can even host it yourself (on premise). I trust their servers and service so far, but love the idea of being able to bail out and host my own if they ever break that trust: when it comes to passwords you’ll really want to avoid any risk of vendor lock-in. Open source allows for the adage “Don’t trust, verify!”. User-friendly, free, open-source and cross-platform; I really don’t understand why it is not more popular, actually.

I’ve made a tiny bash script that interacts with bitwarden. It requires the bitwarden-cli to be installed, which is available for most OSes.

The script requires you to log in to Bitwarden first. This can probably be added to the script as well, using bw unlock --check, but that, too, adds unnecessary complexity: the errors shown when not logged in are clear enough, I think. Bitwarden (all the apps, so the CLI app too) has a staged login: you log in with your email, password, and optional 2FA. This is set as the default on your device. Then you unlock the database, using only the password, in order to access the passwords.

Introducing a tiny bash script called ansible-vault-pass.sh, which handles the unlocking and then looks up and returns the vault password is simple:

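#!/bin/bash

# Name of the Bitwarden entry that holds the ansible-vault password.
_BW_VAULT_ENTRY_ID="ansible-vault"
# Unlock Bitwarden; this prompts for the master password and yields a session key.
_bw_session="$(bw unlock --raw)"
# Print the vault password on stdout, which is where Ansible reads it from.
bw get password "${_BW_VAULT_ENTRY_ID}" --session "${_bw_session}" --raw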

It has the entry-id hardcoded; for me, that is “good enough”. Especially since it avoids a lot of complexity.
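
The login check mentioned earlier could look roughly like the variant below. It is a minimal sketch under assumptions: it relies on bw login --check, which, as far as I know, exits with a non-zero status when no account is logged in on the device, and the function name print_vault_password is made up for illustration.

```shell
#!/bin/bash
# Sketch: fail early with a clear message when the Bitwarden CLI is not logged in.
# "ansible-vault" is the entry name hardcoded in the original script.
BW_VAULT_ENTRY_ID="ansible-vault"

print_vault_password() {
  # Assumption: `bw login --check` returns non-zero when nobody is logged in.
  if ! bw login --check >/dev/null 2>&1; then
    echo "Not logged in to Bitwarden; run 'bw login' first." >&2
    return 1
  fi

  # Unlocking prompts for the master password and prints a session key.
  local session
  session="$(bw unlock --raw)" || return 1

  # Print only the password, so Ansible can read it from stdout.
  bw get password "${BW_VAULT_ENTRY_ID}" --session "${session}" --raw
}

# When saving this as ansible-vault-pass.sh, end the file with the call:
# print_vault_password
```

Wrapping the steps in a function keeps the failure path in one place: a failed check or a failed unlock makes the script exit non-zero, so Ansible stops instead of trying an empty password.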

Now, running ansible should be very easy and only prompt for your bitwarden password:

ansible-playbook webservers.yml --vault-password-file=/path/to/ansible-vault-pass.sh

The --vault-password-file option can be given a default in the main Ansible configuration (commonly ~/.ansible.cfg):

[defaults]
vault_password_file=/path/to/ansible-vault-pass.sh

The latter, somehow, must be an absolute path. No idea why: Ansible is often weirdly picky about these things.

Now, you can run any ansible command without the --vault-password-file option. For example:

ansible all -a "hostname" -f 10

Result

The vault password can be stored securely in a properly encrypted Bitwarden database.
The vault password can be shared amongst colleagues using Bitwarden’s built-in syncing, sharing and access system.
The ansible-vault itself can now be used to store all the secrets needed for provisioning.
We only need to type in the Bitwarden password on every run, which allows for long, difficult, and easily rotatable vault passwords.
Every colleague has her own bitwarden login, so everyone uses their own password instead of having to remember a shared one.

Two security notes, though:

Ansible-vault, as storage for secrets, is secure, but can be brute-forced. So using a hard, long and often changing password is probably a requirement. Especially if employees rotate often.
Everyone with access to the ansible-vault password in their Bitwarden account can read the entire ansible-vault. Treat all the secrets in there as such. E.g. if a colleague leaves, consider rotating all the secrets stored in that vault; just revoking the access in Bitwarden is not enough, nor is changing the ansible-vault password stored in Bitwarden. Anyone with access could easily have dumped all the secrets onto their hard drive at some point.

https://berk.es/2019/10/19/ansible-vault-password-with-bitwarden

Depositing General Terms and Conditions with the Bitcoin Blockchain and IPFS

Sep 26, 2019 Updated Sep 26, 2019

In 2019, depositing general terms and conditions with the KVK (the Dutch Chamber of Commerce) can, fortunately, be done digitally. But that is about all that can be said for it. The KVK’s instructions read:

“Put your company name on the document (this company name must match the name under which you are registered in the Handelsregister, the trade register) and send it to …. The date on which your email or letter is received is used as the deposit date. As soon as the deposit has been processed you will receive a confirmation letter with the deposit date and an invoice.”

This is cumbersome, error-prone and therefore expensive. An alternative, usually even more expensive and often just as cumbersome, is to deposit your general terms and conditions with a court.

When you deposit a document, the KVK and the courts provide two important functions:

An external party (someone other than you or your customer) holds a copy of your general terms and conditions
A trusted external party puts a date stamp on your document: proof that you had document D on date T.

As it happens, these are exactly two functions that a decentralized database (blockchain) is extremely well suited for!

So at Nobleton we came up with a deposit system in which we tie together modern technology and new, open standards to provide those two functions. For now, only for depositing general terms and conditions.

The nice thing, in my opinion at least, is that anyone can replicate and repeat this. With us it is automated and, we think, super user-friendly. But all the technology is free, open and public, to use and to verify yourself. With some research, installing some software, and stringing together checksums and digital tools, you can do it yourself just fine. For free, and independent of us, the KVK or other companies. We take those steps and that research off your hands.

So how does it work, roughly? (As an aside: I am aware of the oversimplification and the lame analogies; theoretically it indeed isn’t perfectly right. My aim is to explain the principles, not mathematical or cryptographic correctness.)

OpenTimestamps

This is both a protocol and a freely usable service. The service is OpenTimestamps.org, where you can easily upload documents and receive a cryptographic proof that the document existed at the time of upload. This proof uses the Bitcoin blockchain.

OpenTimestamps, the protocol, is a list of agreements and standards that software must comply with if it wants to use this stamping service, or if it wants to become a stamping service itself. So it is not a company or a piece of software; you are not tied to one service, one piece of software or one party.

Broadly speaking, the agreements say that a timestamp is made as follows:

Make a checksum of a document. A checksum is a cryptographic concept, too complex to explain in detail here, but for now it suffices to think of it as a “fingerprint”: every unique document has a unique fingerprint. If a single letter or byte changes, you get an entirely new fingerprint.
We store this checksum in a Merkle tree. You can think of this as a cryptographic database in which hundreds or thousands of such checksums can be “summarized” into one checksum, and in which that one checksum proves that all those hundreds were in there. An OpenTimestamps server collects documents for minutes or hours and then turns them into one such “summary”. You can also think of it as “the fingerprint of hundreds of fingerprints”.
This single checksum is stored in the Bitcoin blockchain: a transaction of €0.00 in which, as it were, the summarized fingerprint of hundreds of documents is included as the “payment reference”. Stamping only happens once enough documents have been collected, but the wait is never longer than a few hours, and usually only minutes.
A reference to the transaction now serves as the “timestamp”. You now have proof that you had the document at the time of this transaction. That reference is also a kind of “proof of ownership” showing that you had it stamped, and not your competitor.
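
The first two steps, the fingerprint and the “fingerprint of hundreds of fingerprints”, can be sketched in a few lines of bash. This is an illustration of the principle only, not the OpenTimestamps implementation: it assumes GNU sha256sum is available, the function names fingerprint and merkle_root are made up, and the hashes are combined as hex text, whereas a real Merkle tree hashes raw bytes.

```shell
#!/bin/bash
# Sketch of a "fingerprint" and a "fingerprint of fingerprints" using SHA-256.
# Simplification: hashes are combined as hex text, not as raw bytes.

# Print the SHA-256 fingerprint (64 hex characters) of the text given as $1.
fingerprint() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Reduce a list of fingerprints to a single root by hashing them pair by pair.
merkle_root() {
  local level=("$@") next i
  while (( ${#level[@]} > 1 )); do
    next=()
    for (( i = 0; i < ${#level[@]}; i += 2 )); do
      # An odd fingerprint at the end of a level is paired with itself.
      next+=("$(fingerprint "${level[i]}${level[i+1]:-${level[i]}}")")
    done
    level=("${next[@]}")
  done
  printf '%s\n' "${level[0]}"
}

doc_a="$(fingerprint 'General terms, version 1')"
doc_b="$(fingerprint 'General terms, version 2')"
doc_c="$(fingerprint 'Some other document')"

echo "root: $(merkle_root "$doc_a" "$doc_b" "$doc_c")"
```

Changing a single character in any of the three documents changes its fingerprint and therefore the root, which is why one stored root can vouch for every document that went into it.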

Bitcoin

We use the Bitcoin blockchain for these timestamps. Firstly, because the first, best-known and most widely used stamping protocol, OpenTimestamps, works with it. And secondly, because Bitcoin is by far the “safest” and most durable blockchain.

Bitcoin is the most widely used blockchain, with the largest number of miners and participants. That makes it a blockchain in which changing something after the fact is so enormously expensive that it is considered impossible, now and in the future.

In short: once the “fingerprint” of your general terms and conditions is in there, nobody can change it anymore.

With central parties such as the KVK that is possible. Through a hack, an outage, human error, coercion, or theoretically even malice, the KVK could well change deposits after the fact. On the Bitcoin blockchain that is impossible.

IPFS

Recording that a document existed on date T is one thing. That you had it in your hands on date T is a second.

But just as important is being able to retrieve deposited documents. All your customers must be able to retrieve your general terms and conditions, regardless of whether your company still exists, or whether you are (still) willing to hand them over to that annoying customer.

For this we use a decentralized file system: IPFS. You can compare IPFS to an enormous “online disk” in which everyone who takes part makes small pieces of their disk available, which, strung together, form one enormous, independent disk.

A downside of this technique is that a file that is rarely used becomes unavailable over time. The reasons for this are fairly technical, but for now we solve it by continuing to “seed” the files: by actively continuing to distribute them across the network ourselves. We are still looking for other parties who want to do this with us. So it is not yet 100% independent. For those interested: the companies behind IPFS are working on this as well, with Filecoin among other things.

Independent here means that, in principle, it does not matter whether Nobleton still has the documents, or whether a central hosting party keeps the servers online: as long as the IPFS network of “linked disks” stays online, the files can be shared. It is also not possible to change or delete files. Once version 13.37 of your general terms and conditions has been shared, it is available under the fingerprint of that document. If you change a single letter in it and upload that, it has a different fingerprint: it is a new file. This new file now sits next to the old one. Every fingerprint is at once also a URL (a link) from which the document can be downloaded. No special software is needed for that. For publishing, that is, uploading, it is.
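
The idea that a fingerprint doubles as an address, and that a changed file lands next to the old one instead of replacing it, can be mimicked locally. This is only a sketch of the principle: plain SHA-256 hex stands in for a real IPFS content identifier, a temporary directory stands in for the network, and the function name deposit is made up.

```shell
#!/bin/bash
# Sketch of content addressing: a file is stored under its own fingerprint.
# Assumptions: SHA-256 hex instead of an IPFS CID, a temp dir instead of a network.
store="$(mktemp -d)"

# Copy a file into the store under its fingerprint and print that fingerprint.
deposit() {
  local address
  address="$(sha256sum "$1" | cut -d' ' -f1)"
  cp "$1" "${store}/${address}"
  printf '%s\n' "$address"
}

terms="$(mktemp)"
echo "Payment within 30 days." > "$terms"
v1="$(deposit "$terms")"

# Changing a single character yields a new address; the old version stays put.
echo "Payment within 31 days." > "$terms"
v2="$(deposit "$terms")"

ls "$store"   # one entry per version, each named after its fingerprint
```

Because the address is derived from the content, nobody can swap the content behind an address without the mismatch showing up when the fingerprint is recomputed.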

These are great properties for depositing documents and making them available. After all, you do not depend on the cooperation of companies, parties or persons to get at those companies’ general terms and conditions. You can also independently check whether a version is exactly right: nobody can change documents, accidentally or on the sly.

At Nobleton this IPFS fingerprint (and thus the link to download the document from the IPFS network) is also the deposit number. Simple and effective.

The future of “depositing”

We believe that modern techniques such as Bitcoin, blockchain and IPFS make it possible to deposit documents much more cheaply and without third or central parties. Right now this is still very complicated, but we think we can make it much simpler. Simpler even than depositing with “classic” parties such as the KVK or a court.

Our next step is to also be able to encrypt files, so that you can deposit secret documents with this system as well. We are still researching and testing this: after all, we have to make sure that everything is, and stays, well and strongly encrypted, now and in a future with much more powerful computers.

In addition, we are working on a search engine for the deposited general terms and conditions. It too has to run fully independently: regardless of whether we still exist ten years from now, you can find the deposited documents!

Stay Tuned.

https://berk.es/2019/09/27/algemene-voorwaarden-deponeren-met-bitcoin-blockchain-en-ipfs

The Private Blockchain Fallacy

Sep 18, 2018 Updated Sep 18, 2018

In all the blockchain hype, a simple, yet self-refuting idea keeps popping up: a Private Blockchain.

In order to understand why a private blockchain is nonsense, we must first define what a block chain is and what it is not. Since Nakamoto coined (PDF) the term, let’s see if his description helps:

a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions

It does highlight some important traits of “a block chain” (Nakamoto used a space between both words, I use the current popular term blockchain):

peer-to-peer: implying distribution; at least ruling out a central authority
computational proof: implying it to be verifiable
timestamp-server/chronological ordering: its goal, but also implying permanence

Marco Iansiti and Karim R Lakhani have a more accessible explanation:

an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable and permanent way.

open (often called permissionless)
distributed (often called decentralized)
verifiable
permanent (often called immutable)

In other words: anything that does not match those criteria, is by definition, not a blockchain. It may be something similar, or even something using similar techniques, but not a blockchain. This is important.

We can also look at it from the other side: if a blockchain offers these traits, when do we need a blockchain?:

(Image: a logical graph to determine if one needs a blockchain.)

(I downloaded this image years ago and forgot to note the author. Sorry.)

This has the same traits, but turns the argument around: if your requirements don’t fit the exact things that a blockchain offers, you don’t need a blockchain. In the case of a private blockchain: if it is not public, you don’t need a blockchain. We could stop here.

Reasons for wanting a private blockchain

So, why do people come up with private blockchains in the first place? We can group the arguments into three main arguments. I’ll add a fourth.

So we can store private data or other kinds of data that should not be public. (We scrap the open and verifiable parts for privacy)
So we can control who has read/write access (We scrap the open part for control)
So we can scale better (We scrap the distributed part for speed).
Because we are not yet confident enough about the security and exact traits to release on a public platform (We scrap the permanent part for flexibility)

The fourth is the only valid argument to ever use a private blockchain: as a temporary phase because you are still developing and finding out the ins and outs of your blockchain application. It is very similar to how you first release an app to your five friends rather than uploading it to an app store immediately.

All other reasons are pure nonsense, because they don’t need a blockchain to get the exact same outcome. Or because deploying that way, by definition, makes it not a blockchain. One could say that by deploying a blockchain in a private environment, it stops being a blockchain.

But there is more to this than just a play on words. Stay tuned.

Private data: public.

A blockchain has to be public in order to ensure permanence or immutability.

Let’s compare a blockchain to a household book. When I have such a book, in a drawer in my office, I’m the only one who can read it and change it.

So, when for some reason I bought a pair of expensive shoes, and a week later want to hide that fact, I can simply remove that line from the book. Or overwrite it with “pair of scissors”. And boom. I never bought shoes (according to the book).

But if that book is accessible amongst family members, for everyone to review, or copy, it becomes a little more “immutable”. I can still remove the “expensive shoes” line, but people might know, and family-members can call me out on it. With even more people who can access the book (why not mail a weekly copy to all family and friends?) the harder it becomes to change stuff afterwards.

A blockchain, however, has an additional trait: through applied cryptography, it allows one to easily detect when a line is changed. So all people who have a copy can detect with certainty and easily that something has been changed that should not have been changed. But this trait is only useful if other stakeholders can get copies. In a private blockchain it is really easy to change something, even when that change has a big, clearly visible effect. Simply because there is no-one to witness that clearly visible effect!
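A minimal sketch of that detection, using the household book as data. It assumes a simple hash chain in which every entry stores a SHA-256 hash over the previous hash plus its own line; the function names are illustrative and this is not how any particular blockchain serializes its data.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the very first entry


def entry_hash(previous_hash: str, line: str) -> str:
    """Hash a line together with the hash of the entry before it."""
    return hashlib.sha256((previous_hash + line).encode("utf-8")).hexdigest()


def build_book(lines):
    """Return a list of (line, hash) pairs where each hash chains to the previous one."""
    book, previous = [], GENESIS
    for line in lines:
        previous = entry_hash(previous, line)
        book.append((line, previous))
    return book


def first_tampered_entry(book):
    """Return the index of the first entry whose stored hash does not match, or None."""
    previous = GENESIS
    for index, (line, stored_hash) in enumerate(book):
        if entry_hash(previous, line) != stored_hash:
            return index
        previous = stored_hash
    return None


book = build_book(["groceries", "expensive shoes", "train ticket"])

# Overwrite one line in my own copy, leaving the stored hash untouched.
my_copy = list(book)
my_copy[1] = ("pair of scissors", my_copy[1][1])
print(first_tampered_entry(book))     # None
print(first_tampered_entry(my_copy))  # 1

# Rewriting the whole book produces a consistent chain again...
rewritten = build_book(["groceries", "pair of scissors", "train ticket"])
print(first_tampered_entry(rewritten))   # None
# ...but its last hash no longer matches the one that witnesses hold.
print(rewritten[-1][1] == book[-1][1])   # False
```

The last two lines show why the copies matter: a book that is rewritten from scratch is internally consistent, and only someone holding the old hash can tell the difference.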

The public part is what ensures the permanency. Not because it is impossible to change something (it is perfectly doable), but because that change, or its cryptographic effects, are seen by other participants.

Bitcoin, for example, is not permanent either, in a theoretical sense. If enough participants decide that the 0.01 BTC that you paid to John a year ago was sent to Mallory instead, that is what will happen: history will be rewritten. Bitcoin’s immutability stems from the fact that anyone can detect such a change, and that a vast majority then needs to acknowledge that change.

The public part furthermore ensures that those participants will get hurt when making such a retrospective change: they have a stake in the (Bitcoin) blockchain, which will decrease in value because they just have proved that this blockchain can be retrospectively mutated.

In a blockchain, public or private, immutability is more of an agreement amongst all participants that we won’t change history. Promised! It is not some magical “thing that comes from cryptography”. In a private blockchain that promise is easy to make. And even easier to enforce: if Mallory changes some old record, just throw him out!

With that in mind: immutability is an agreement, so you can just choose any database, even a shared Excel sheet, and then promise each other never to change anything. All you need is detection that something was changed, which most database systems (or logging) will offer. You can even add some cryptography to it to make tampering more obvious or easier to revert or deal with.

Permissionless because we need control.

A blockchain has to be permissionless in order to ensure permanence.

This is very simple actually. If you can control who has access, you control who decides what the truth is. Therefore, you hold the keys to change history in any way that you want.

In the analogy of the household book, the person who controls who may write in the book and who may read from it, is, by extension, the person who controls what will get in the book and what not.

That person is, by extension, also the one in full control of rewriting the history. Because when the access-controller wants something changed, he can simply remove anyone opposing that change from write (and read) access and then promote himself (or a proxy) to be able to write. And then change it.

If other people can still read, they might notice that history is rewritten, but will lack all power to do something about it. The truth is what is in the blockchain, and they cannot change that, because of lack of permissions. It matters little how clumsy or detectable such a retrospective change is: the one with access control can, through that power, make these changes.

Immutability in a blockchain is not something magical that comes from a fancy technology. It comes from the conditions a blockchain requires: the environment, or “world”, in which it runs makes it immutable.

The irony is that, actually, there is a magic technology that ensures immutability. And that technology is called blockchain. And hence has to be a public blockchain. We’ve come full-circle.

No Distribution because of scaling or speed.

A blockchain has to be distributed in order to ensure a verifiable truth.

The opposite of distribution is centralization. To follow the “household book” analogy, distribution would mean that several times a day (maybe even after every change) copies are made and exchanged amongst people. More important, however, is that none of these copies is considered “The Truth”.

If there is only ever one copy of the household book, changing stuff in there (like ripping out a page) is trivial and -if done well- undetectable.

The obvious solution would be to photocopy every page (say, daily) and keep those copies in multiple places. This is distribution. It is inefficient, slow and expensive compared to having one copy and simply trusting that. But it is, in essence, what a blockchain does, and for exactly the reason of trust: so that we don’t have to trust one copyholder to tell us the truth. This is the trustless nature of a blockchain.

When all the copies at all moments can be considered the truth any difference between any copy will be detectable and should trigger an alarm. We then need a conflict resolution: which one is the truth.

In Bitcoin, they simply say: what the majority of copy-holders have is the truth. So if your book is copied amongst five people and two have in there “berkes bought expensive shoes” but three don’t, berkes did not buy expensive shoes.
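The five-copies example can be written down as a small majority vote. This follows the simplification used in the text, where every copy-holder counts once; it is a sketch of that rule only, not of how Bitcoin actually weighs competing histories, and `majority_truth` is a made-up name.

```python
from collections import Counter


def majority_truth(copies):
    """Return the version held by more than half of the copy-holders, or None."""
    counts = Counter(tuple(copy) for copy in copies)
    version, votes = counts.most_common(1)[0]
    return list(version) if votes > len(copies) / 2 else None


with_shoes = ["groceries", "berkes bought expensive shoes"]
without_shoes = ["groceries"]

# Five copy-holders: two have the shoes in their book, three do not.
copies = [with_shoes, with_shoes, without_shoes, without_shoes, without_shoes]

print(majority_truth(copies))  # ['groceries']
```

With two holders who disagree there is no majority, and the function returns None: a conflict that still needs resolving.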

If only one book is elected as The Truth, and all others are simply copies that could be used to prove that something was there at some point, that really proves nothing: who is to say the person holding the copy did not forge some records in there?

Distribution amongst all participants is crucial, because any central authority can otherwise change history at will and without detection: Verification is not possible and trust in an authority, or even a small group of authorities, is required.

To turn it around: when you don’t need that trustlessness, when you can trust a central authority, you don’t need a blockchain. To turn it around again: if it requires trust, has an authority, by nature, it is not a blockchain.

But what about my favorite coin?

If it has a “private” or “permissioned” blockchain, it is not using a blockchain. It might still be a cryptographically backed database, a solid cryptocurrency or a very modern database, but it simply is not using a blockchain. And because it is not using one, it lacks one or more of the crucial traits.

Which means a single company could run away with your coins at any moment (yes, Ripple, looking at you). Or which means it is fundamentally insecure. Or it may offer very interesting solutions nonetheless.

I can probably build a very useful and interesting monetary exchange system based on trust and a Google spreadsheet. But it ain’t a blockchain. It’s “just” a spreadsheet on a Google server (with a really novel idea around it).

“You say tomato, I say tomato”

Sure. If you want to call any database a blockchain, go ahead. If you insist on calling a Google spreadsheet a blockchain: fine. It’s not some protected or trademarked term. You are even allowed to call my bicycle a blockchain. It has a chain after all.

But this helps no-one. Because you are either reinventing a wheel, or you are using an extremely inefficient technology in a place where far better fitting, more efficient technologies would help you far better.

Some private blockchains are not blockchains because they are simply other technologies.

But unfortunately most private blockchains are really just inefficient, insecure or simply stupid implementations in which a public blockchain-setup is used behind closed doors. This offers none of the benefits of that public version, but all of the downsides.

Take git: git is not a blockchain; git is a DAG; so don’t call it a blockchain. A DAG is a very elegant, efficient data structure that could even be used for distributed data storage within a small group of participants who need some cryptographically secure tamper-detection of their data. I’m mentioning git because there are some cryptocurrencies that do exactly this: using a DAG to share truth amongst a private set of participants.

Or take BitTorrent: BitTorrent is not a blockchain; so don’t call it that. BitTorrent is a very neat implementation of Merkle trees, and really brilliant for distributing large amounts of data over a large amount of participants. Not a blockchain. I’m mentioning it, because this technology is often enough if you need to ensure that all participants have a copy of the data.

There are literally thousands of types of databases out there, some allow ridiculously easy syncing, others allow insane throughput, again others allow really thorough consistency checks. Or cryptographically signed logs. Or encryption of the data, or sharing of parts of data, or … I could go on. We don’t call all these “blockchains” because of a hype. They are not: They are proper solutions to a specific use-case. A blockchain is one such solution, and like most technology, it applies to a very small subset of use-cases. A private version, however, can never be such a subset.

Positive feedback, as well as images of cats, calling me names, for hating on your beloved altcoins, or other comments are very welcome at my twitter or on reddit.

This post was translated to Dutch by bitcoinspot.nl. Thanks!

https://berk.es/2018/09/19/the-private-blockchain-fallacy

Making a case for JavaScript, in-browser Mining

Feb 14, 2018 Updated Feb 14, 2018

So, today, Brendan Eich spoke up against mining on your browser.

Eich is a co-founder of Mozilla, the inventor of JavaScript, and the current CEO of the ground-breaking browser Brave.

On twitter, he says:

Brave now blocks the abusive-even-when-opt-in (1.75 cores on my MBP which became too hot to call a laptop) 1st party Monero mining script at Salon, thanks to fast work by @lukemulks.

I have several problems with this tweet, and with this new block-them-all feature in the Brave browser and most adblockers, which I’ll explain below.

First, some background on Brave: Brave has its own cryptocurrency, BAT, which is a payment system that allows you, the user of Brave, to pay a website or publisher. A very nice alternative for ads, so to say.

I love this idea, on a high level: to let me, a user, decide to pay a site that I peruse. I’ve run Brave ever since they came up with this idea and a built-in Bitcoin wallet to leverage it (which they have now moved to a BAT system).

Now, some background on browser-mining. Quite a few cryptocurrencies use a mining algorithm for which making specialised hardware is impossible (currently infeasible, actually). Monero is one of them, and the most famous. So Monero, for example, is mined on “normal” CPUs. In order to mine more Monero, you need more CPUs. And because of how the mining works, it can be done in your browser, using a miner that is written in JavaScript: a browser miner. And when I want to “mine more Monero”, where can I “obtain” more CPU? Right: by running that miner on the computers of a website’s visitors.
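To give a feel for why mining is nothing more than CPU work, here is a toy proof-of-work loop. It is a sketch under assumptions: it uses SHA-256 as a stand-in (Monero uses a different hash function, one chosen to run well on ordinary CPUs), it is written in Python rather than JavaScript, and the difficulty is kept tiny so it finishes quickly.

```python
import hashlib


def digest_for(block_data: str, nonce: int) -> str:
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()


def is_valid(block_data: str, nonce: int, difficulty: int) -> bool:
    """A nonce is valid when the digest starts with `difficulty` zero hex digits."""
    return digest_for(block_data, nonce).startswith("0" * difficulty)


def mine(block_data: str, difficulty: int) -> int:
    """Try nonces one by one until a valid one is found; this loop is the 'work'."""
    nonce = 0
    while not is_valid(block_data, nonce, difficulty):
        nonce += 1
    return nonce


nonce = mine("payment for a coffee", difficulty=3)
print(nonce, digest_for("payment for a coffee", nonce))
```

Finding the nonce takes many attempts, while checking it takes a single hash. More CPUs simply means more attempts per second.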

I’ve spent 3 hours writing this article, spent ฿130 on coffee (That is Thai Baht, not Bitcoin) and considerable research on monero and coinhive to get this to you. If you like my work, consider pressing the button below, to run a miner on your computer, and pay me with a few Watts of your electricity. Or to see how it works, without paying me, see the demo on Coinhive.

Note that there is a big chance that your adblocker is blocking it, so head over to a standalone page instead, if you wish. Or disable the adblocker: I’m running nothing but my own Piwik, pinky promise.

The first problem with Brave’s implementation is that it is default-on and an all-or-nothing thing. An alternative would be a popup like “Google Maps wants to access your location. [Allow] [Never]”. You’ve probably seen it. Below is an example from Chrome (in Dutch, sorry):

Confirmation dialog example

berk.es wants to
  mine Cryptocurrency on your laptop.

                     [Block] [Allow]

Brave could have gone for this established UI-pattern, but instead they chose to “block it all”. A pity. A missed opportunity.

Imagine the possibilities:

Donate to Wikipedia by mining a few minutes, instead of dealing with their annoying beggar-popups
Use Netflix for free, by running a miner on one CPU core when watching series.
Avoid the privacy-abusive advertising industry by running JavaScript on your computer, rather than tracking users all over the place.
CAPTCHAs, DDoS protection, all by running a few seconds of a miner on your laptop.
Have “outgoing affiliate links” that don’t work in secret behind the scenes with trackers, cookies and whatnot, but with a clear [Mine and Redirect] or [Skip mining] page. A step-up from a link like this (NOTE: following that link does pay me a few grains of coffee actually, thanks!)

The second problem I have with Brave is the fact that “mining cryptocurrencies to donate to a website” is a direct competitor of their own platform, where you donate to a website by obtaining their own currency and paying with that.

Competition is always good, don’t get me wrong. It’s just that this direct competition gives Brave incentive to block their competitors in their browser. Or to present the competition as “inherently evil” and so on. All under the pretence of “protecting the users”. Which, in some way, they actually do.

I’m not pretending to know that Brave or Brendan Eich are implementing their “mining blocking” because of this. But I do know that their own incentives are opposed to those of initiatives like Coinhive. That should make you at least wary of what they say about each other: a competitor will never say about their competition that it is a good alternative.

This could be a reason that the Brave developers chose not to white-list “JavaScript mining initiatives” which try to implement an actually friendly, opt-in system. Or not to work together with those that try to be friendly, honest and clear about the mining on a website.

The third problem I have is how this “block all mining”, and the noise around it, drowns out any proper discussion about it. The current stance, fueled by Brave and Brendan Eich, is that “Browser Mining Equals Evil”. I fear the discussion is closed and an otherwise interesting technology is killed off.

This certainly is the fault of all the “Evil” miners out there, though. They are abusing ad-networks, hacking websites, routers and CDNs, for some free Monero. The overwhelming majority of browser-mining is evil and done without your consent. That is bad. And should be blocked.

But the solution to work towards would be to block the evil ones and work together with companies like Coinhive, to improve the initiatives for honest and friendly-use-cases.

As a sidenote: One of the biggest “threats” for CPU-mined cryptocurrency like Monero, currently, are all these evil players. The miners operating botnets, hacking CDNs to inject Browser miners, stealing AWS credentials, are rewarded much higher, while honest miners, with a laptop (and some cheap solar energy) stand no chance. I would not be surprised to learn that a vast majority of mining power in Monero comes from “stolen” CPU-cycles. But that is a different discussion.

Which brings me to the “energy use of cryptocurrency”-debate. A lot of people dislike all cryptocurrency because it is “uselessly” burning up electricity. A stance that I find just as strange as someone being against neon-ads, or public television screens. All these screens on city-streets, all these huge Coca-Cola signs on top of buildings combined -a wild guess, I could not find any proper figures to back this up- probably use far more energy than Bitcoin or Monero does. And it is just as “useless”.

But even if true: a proper, opt-in, consensual miner on a website allows someone who dislikes this whole crypto-crap to just ignore it. If you don’t like it: don’t participate. If you do see the value in it, think it is not “useless”, then it is a nice way to make a micro-payment to that site.

But, back to that evil network of JavaScript miners: they have messed up an otherwise fantastic opportunity.

You can see this opportunity in action above: a single click of a button and you can pay me for writing this (again, I need to stress this to make the point clear: you can simply also ignore it!)

Clicked it? You just paid me!

You did not need to register with PayPal, or pay fees to a credit card-company. You did not have to buy some token or currency on an exchange. Did not have to install a special browser to donate, there were no wallets to top-up, no copies of ID-cards to send to some server (that will be hacked in some future) because of KYC-money-laundering-regulations. No secret keys to back-up, no wallets that can get hacked, or twelve-words-backup-phrases to securely store away in that safe you still need to buy.

Just a press of a button and you are rewarding my work. And paying for another coffee.

It is privacy-friendly: had I embedded some ads from Bing or Google, they, and hundreds of other data-miners out there, would be tracking you, my precious visitor (the only tracking this site has is my self-run, self-hosted, privacy-enhanced Piwik analytics). Ads are the truly evil, deceptive and abusive alternative here: they mine your data, sell it, trade with it, and use it to present more ads. Most of the time all this “stealing of your data” happens without your consent and without you knowing. That is why I tend to call them thieves: they steal data from you that you never agreed on giving away.

It is very clear and honest, when opt-in: I’m not running anything on your computer. I’m not selling your data to trackers or ad-networks. I’m offering you a choice to donate to me. Had I run ads, I would be selling the privacy of you, my visitors, without your consent, to large companies that live off gathering (stealing) and mining that data. Which is far worse. Of all the alternatives, an opt-in JavaScript miner is the most honest one, since it’s abundantly clear what I’m offering. And you are fully within your rights to simply ignore my request.

Now, if you don’t feel like clicking that mining-thing above, feel free to “pay” me, by sharing this story, or dropping me a line on @berkes, my twitter feed.

https://berk.es/2018/02/15/making-a-case-for-javascript-mining

Bitcoin turning into a multi layered system is the most interesting thing in crypto in 2018

Feb 8, 2018 Updated Feb 8, 2018

When you use Tinder, and you swipe someone, you probably don’t sit there thinking “Let’s create some TCP packets and send them over IP, hoping they reach the phone of that nice looking fellow there”. You probably just think in terms of “let’s swipe this nice fellow, Leo”.

I’m bringing Tinder into this story to show the power of a layered architecture. You can swipe Leo because the Internet is made out of layers that You, Tinder, your phone, apps, your browser, can use for “free”. Disclaimer: I don’t actually have a Tinder, so I actually don’t know if “swiping” is the right term. But, well, this is a story about Bitcoin.

So, TCP/IP is made up out of four layers: Link layer, Internet Layer, Transport Layer and Application Layer. For this story, only the last two are interesting. On the internet, data is transported in the transport layer. Applications such as your browser, Tinder, your email-client or even the security camera at your front-door, use the transport layer to transport data. The power of this design becomes apparent if you turn it around: Applications don’t need to invent, maintain or run their own network, cables, or protocols. They can just tell the Transport layer “Hey, I’ve got a swipe, for Leo, can you deliver it to the Tinder servers? (so they can send it along to Leo)”.

Now, back to Bitcoin. The Bitcoin community is rolling out this thing called “Lightning Network”. It is a layer on top of Bitcoin, in which value can be transported between people (aka “make payments”). It is one of the possible layers that can run on top of Bitcoin, but it is the first, and an important one: making payments is one of the most important features of Bitcoin today, so it is logical that this is the first thing to be moved into an application layer.

This Lightning Network can be used today. Sure, Leo needs to have a Lightning Network enabled client as do you, you might need to compile some stuff, might need to run your own server and so on, but it is possible. Today.

Essentially Lightning Network is the birth of an application layer on top of Bitcoin. This might seem uneventful, but the birth of this second layer gives Bitcoin a new purpose: it “degrades” Bitcoin to a mere transport layer for value. This is not some “Pop! And we’re done” event, but a long process. Right now, Bitcoin is that transport layer, but is also, still an application layer: you can buy pizza, or buy beekeeping-gear through this transport layer, just fine. So it isn’t very layered yet.

This new layer is going to be so much better at this “paying” thing, that it will take an important feature “away” from Bitcoin: payments. But, before you get all angry: like with TCP/IP, one can use a layer directly, if you wish. You can just skip the transport layer, and deliver data directly over one of the lower layers, if you insist. Your application can skip all the application layer stuff and interact with the transport layer directly, which happens a lot, actually. You’ve probably seen these “Use TCP/IP | use UDP” toggles in the settings of some app. Here an app can bypass, say, HTTP, TCP and so on, and use a much more raw way of delivering. You can still interact with Bitcoin, in order to transfer or manage funds, just fine. It’s just that with this new application layer, it will become much easier to just use that instead.

If you want to buy takeaway, or beekeeping gear, today, both you and the receiving party interact with the transport layer directly. Tomorrow, we both will interact with an application layer, probably the Lightning Network, to settle that payment instead.

There will be more layers on top of Bitcoin, there will be layers on top of layers on top of layers, but deep down below, Bitcoin is the Layer that ensures value is transferred from you to Leo.

To me, this proves, again, that Bitcoin, as a project, “Gets It”. Bitcoin does not need to be everything: it only needs to be a system to store and transfer value. Nothing more!

It does not need to invent, develop and maintain all the layers, just like Tinder does not need to maintain and invent everything from cables to how-to-get-a-swipe-to-Leo-protocols. Bitcoin needs to be a very secure, very solid, very stable layer to maintain these funds for all the layers on top of it. And Bitcoin is just that.

We should note, though, that a layered architecture was not envisioned by the inventor and early adopters of Bitcoin. They envisioned it more as a monolith: a single piece of software that handles all the possible use-cases and features in itself. At least, that is how I read the whitepaper: nowhere was there a mention of “Application layers” or even “layers”.

Second layers can choose different models, use-cases, or different parameters. Lightning Network is complex but also (very) secure. It is decentralised, albeit maybe (time will tell) less so than Bitcoin itself. Other networks might opt for less security. Or even more centralisation. Or tweak other parameters.

If, for example, all you need to register is “I still owe you a beer”, there could very well be a layer that maintains “all the beers owed by everyone” in a central database (or its own blockchain) and which registers a daily “state of the beer” on the Bitcoin layer. The possibilities are endless.
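Such a beer layer could look roughly like the sketch below. Everything in it is hypothetical: the `BeerLedger` class, its method names and the choice of a SHA-256 digest as the daily “state of the beer”. How that digest would then be recorded in a Bitcoin transaction is deliberately left out.

```python
import hashlib
import json
from collections import defaultdict


class BeerLedger:
    """Hypothetical second layer: a central tally of who owes whom how many beers."""

    def __init__(self):
        self.owed = defaultdict(int)  # (debtor, creditor) -> number of beers

    def owe(self, debtor: str, creditor: str, beers: int = 1) -> None:
        """Register that debtor owes creditor some beers; this never touches the base layer."""
        self.owed[(debtor, creditor)] += beers

    def state_of_the_beer(self) -> str:
        """Return one small digest of the full tally, suitable for a daily registration."""
        entries = sorted(
            [debtor, creditor, beers]
            for (debtor, creditor), beers in self.owed.items()
        )
        return hashlib.sha256(json.dumps(entries).encode("utf-8")).hexdigest()


ledger = BeerLedger()
ledger.owe("berkes", "leo")
ledger.owe("berkes", "leo")
ledger.owe("leo", "mark")

# Many cheap updates on the upper layer, one fixed-size value for the base layer.
commitment = ledger.state_of_the_beer()
print(commitment)
```

The design choice is the trade-off the text describes: the tally itself is fast and centralised, and only the daily digest needs the slow, secure layer underneath.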

A lot of altcoins (or their advocates) did not design a layered system either. So many of these altcoins offer some “feature”, like “speed”, or “programmability”, or “the ability to track bananas” in their core. They often present those built-in features as “the Bitcoin killer”, but frankly, most of them have implemented these features in the wrong place: as a core part of their entire system, rather than as additional layers on top of standard value-transport-layers.

When you start looking at Bitcoin as “merely” the transport layer for value, you might start to see the opportunities for other layers on top. And you might see a missing feature as good design, rather than as a missed opportunity, or as a sign that Bitcoin is doomed.

You don’t need “instant transactions, zero-fee” in your transport-layer, you need that in your application layer. So saying that “Ripple is better because it can scale up to Visa-Scale” is nonsense, because you should also mention the trade-off: Ripple has chosen to give away a lot of security, maybe even all of it, in order to gain speed. And yes, I’m picking out Ripple because I consider that the biggest scam of the 21st century (closely followed by the Roger Ver Coin, by the way). Also, I’m not saying that it is a zero-sum game: that you can choose either speed or security. But making trade-offs is part of the game. Bitcoin does not make trade-offs if that hurts the decentralisation-property, or if it hurts security.

TCP/IP is not a very efficient system. A lot of resources are spent to ensure your “swipe for Leo” ends up at Leo’s phone and not at Mark’s phone, or even your current boyfriend’s phone. In some cases this overhead can be “ridiculous”: sometimes far more data is sent around ensuring that your swipe arrives at the right place, than the actual content of, say, the swipe itself. I mean: TCP/IP is brilliant, but it needs a lot of trade-offs to be fault-tolerant, decentralised, secure and stable. Sometimes systems choose different protocols because TCP/IP is just not fast enough: you don’t connect your computer-screen over the network to your computer, you use HDMI, or VGA: some other protocol that is much better at delivering pixels to your screen.

Bitcoin’s function is similar: it needs to be solid and secure. It must be slow and clunky, if that is what is needed to be solid and secure. Its sole function is to guarantee that your funds are secure, that transactions are valid and that there is no single party that can take over the network or your funds.

As such, Bitcoin does not include a “programming language”, like Ethereum does (Note: I actually do like Ethereum but for different reasons), because Bitcoin chooses security over “fancy” new features like programming languages. It leaves things like “smart contracts” or “programmability” to another layer. Instead of including it in the base layer. Note, though that such a smart-contract-layer does not (really) exist yet, but nothing fundamental stops it from being rolled out.

Nor does Bitcoin offer very good privacy (compared to e.g. Monero or Dash). But there could very well be an application layer, some alternative to Lightning Network that enhances privacy. So, rather than building it into the base layer, it leaves increased privacy to the application layer.

Bitcoin does not offer an exchange in its base layer either (like e.g. Stellar does). Nor does it offer file-storage, computing power or tracking of bananas in its base layer.

By not implementing features, by choosing to be conservative, Bitcoin remains the most secure, most solid, and most predictable Transport Layer for transporting value. Ever. Exactly the features you want from such a basic layer.

As a closing note, I’d like to stress that there certainly are altcoin-projects that are completely layered by design. Quite a few “cryptocurrency projects” are actually an application layer on top of another transport layer: a vast majority of altcoins are basically tokens on Ethereum: they are the application layer on top of Ethereum! So: I’m not saying that all altcoins are wrong and only Bitcoin gets it right. I’m only offering an alternative way to view Bitcoin: not as a polished, finished, fancy project to be downloaded from the iTunes store, but as a single, technical layer. An important component in a vast and rapidly changing new field: managing value online.

Positive feedback, as well as images of cats, calling me literally hitler for hating on your beloved altcoins, or other comments are very welcome at my twitter or on reddit.

https://berk.es/2018/02/09/bitcoin-turning-into-a-multi-layered-system-is-the-most-interesting-thing-in-crypto-in-2018

F/Rite Air ICO

Nov 12, 2017 Updated Nov 12, 2017

F/Rite Air was an April Fools’ joke by IEX in 2005. With it they showed that people, without any knowledge of the matter and without any research, bought shares in a tech company that sold “hot air” (in Dutch “gebakken lucht”, literally “fried air”). That was the high point (or low point) of the dot-com bubble.

Now there is the blockchain bubble, most visible in the form of ICOs. These ICOs have been all over the news in recent days.

ICOs, initial coin offerings, are a new form of funding for companies, mainly companies that “do something with blockchain”[1]. After Bitcoin came many new cryptocurrencies (altcoins), one of which deserves special attention here: Ethereum. Because besides a coin (the Ether) with which you can pay each other (almost nowhere, haha), Ethereum contains a system in which smart contracts can be written. That sounds vague, so here is some explanation:

Een smart contract is een klein programma’tje dat verspreid wordt naar alle deelnemers op het Ethereum netwerk en welke dan door computer(s) op dat netwerk kan worden gedraaid. Het is dus openbaar, onafhankelijk en “van iedereen”. Een voorbeeld:

A smart contract containing a small address list of 10 “persons”. At the start of the month, Bas deposits an amount into this smart contract. And at the end of the month, each of these 10 persons may collect 1/10th of the amount.

A very simple form of “payroll administration” as a smart contract: payroll on the blockchain; and we call it BlockWage.

The purpose and benefit are not immediately obvious, but they are there:

During the month, the “amount” is locked: boss Bas has already transferred it and cannot take it back. The ten employees know for certain that Bas cannot run off with the pot, even if these 10 people live all over the world and Bas is known as someone who goes “bankrupt” rather often and quickly. They know for certain that this month’s wages will come in: after all, the money is already waiting in the smart contract, something anyone can verify; it can even be proven mathematically. Without anyone knowing Bas, trusting him, or even knowing exactly who he is, everyone knows for certain that 10 designated persons can collect an amount at the end of the month.
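To make the mechanics concrete, here is a toy, in-memory Ruby model of the BlockWage idea. It is only a sketch: real Ethereum contracts are not written like this, and the class name, method names and integer amounts are all assumptions made for illustration. What it tries to show is the rule set: only the employer deposits, each listed payee collects one equal share after month end, and there is simply no code path that gives the money back to the employer.

```ruby
# Toy in-memory model of the BlockWage idea. Not real contract code:
# the class, its methods and the integer amounts are hypothetical.
class BlockWage
  attr_reader :balance

  def initialize(employer:, payees:)
    @employer = employer
    @payees = payees
    @deposited = 0
    @balance = 0
    @collected = []
  end

  # The employer pays the month's wages in up front.
  def deposit(sender, amount)
    raise ArgumentError, 'only the employer can deposit' unless sender == @employer
    raise ArgumentError, 'payout already started' unless @collected.empty?

    @deposited += amount
    @balance += amount
  end

  # Each listed payee may collect an equal share, once, after month end.
  # Note what is missing: no method lets the employer take money back out.
  def collect(payee, month_ended:)
    raise ArgumentError, 'month has not ended yet' unless month_ended
    raise ArgumentError, 'not on the address list' unless @payees.include?(payee)
    raise ArgumentError, 'share already collected' if @collected.include?(payee)

    share = @deposited / @payees.size
    @collected << payee
    @balance -= share
    share
  end
end

payees = (1..10).map { |i| "employee-#{i}" }
contract = BlockWage.new(employer: 'Bas', payees: payees)
contract.deposit('Bas', 10_000)
first_payout = contract.collect('employee-1', month_ended: true)
```

In this model the guarantee comes from the absence of a refund method; on a real blockchain the same guarantee would come from the published contract code, which anyone can inspect.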

You can make this as complicated as you like. With time tracking that determines the distribution, or a democratic voting system in yet other smart contracts that determines which persons are on the address list and what the distribution key is, and so on. Let your imagination run free; you can make it as elaborate as you want, with built-in savings schemes, tip jars, bonuses, etcetera. And everything is governed by a “democratic” voting system.

Everyone may stake “tokens” to vote. One token is one vote. And those tokens, which you again manage with smart contracts, can be exchanged between people and so on: a “currency for use within one company”, so to speak. Of course you can receive part of your wages in these tokens. We call the system “BlockWage” and those tokens are WageCoins.

In short, a lot can be done with those smart contracts. Property rights, rental contracts, CO2 certificates, landing rights, a city park with its own budget, to be paid out to everyone who maintains the park. And so on.

And this is going to change the world. What exactly, nobody knows. Existing industries will be replaced by smart contracts. A huge number of “smart contract” companies will flop, and companies with very unexpected applications will become gigantically rich. An enormous amount of money will be made and lost around such companies. And that makes people greedy.

Back to the ICOs. When someone comes up with a whole set of smart contracts to overturn an existing system, they often need money. And because a “loan from the bank” is expensive, difficult and slow, you might as well attract investors very early: crowdfunding. And because an IPO is expensive, difficult and slow, and so is setting up a share structure, you might as well use those same smart contracts. “The first 20% of the WageCoins are sold to early participants. The next 80% are used to pay out wages”. Add a PDF in which you use mathematics, statistics and some economics to show that it will change the whole world, pay some celebrity a million dollars for a tweet about your BlockWage/WageCoin, and bam, the money comes pouring in.

And then BlockWage indeed turns out to be a smashing success in agriculture, because employees there do not trust the employers, temp agencies charge enormous margins, and workers pick tomatoes at Farmer Tomas one day and peel bulbs at Farmer Pelle the next, which makes paying out wages rather complicated.

Which makes those WageCoins valuable. The farm workers hold on to the 2% WageCoins themselves or transfer them to their union representative, because together these give them more than half of the say over the company that determines and pays their wages. As a result, the 20% that is traded among shareholders becomes ever more expensive. Profit! Wealth! Millions!

Or BlockWage simply turns out to be a very bad idea, and after selling the first “20%” for one and a half million euros, Mr. Mallory, the inventor, leaves for the Bahamas with the pot.

Because people are greedy and hear almost every month on TV, on DWDD or some other programme for laypeople, about people who became filthy rich with Bitcoins or other “blockchain things”, they want that too. A great many people are gutted that “they weren’t in on time”.

And scammers gladly take advantage of that too. By pretending that this is “the new Bitcoin” and creating hype, they make sure that (tens of) thousands of people are at the front of the queue, hoping that this time they are “in on time”. The bitter part is that some of those “tokens” do become worth a great deal for a short while: these “shares” shoot up in price in the first days, so the first to get in can make an enormous amount of money. And those stories only make the hype around the next ICO even bigger. Take, for example, The world’s first 100% honest Ethereum ICO. The F/Rite Air of 2017: ironic, but with a clear message: never send money to a random person on the internet. A message that, unfortunately, cannot be repeated often enough.

So among a few very fine ideas, a few great entrepreneurs and a few companies with gigantic potential, there are a whole lot of scammers.

The message we should take from the current attention for ICOs is above all that “informing yourself well” is crucial. Do you know the people behind it? Do you trust them? Do you trust the business model? The plan?

Does it sound too good to be true? Then it probably is. Is it a well-substantiated plan by someone you know, to whom you would gladly give some money to work out their plan? Then it might well work.

[1] When I talk about “Blockchain” here, I mainly mean the technical meaning: a decentralised ledger:

A blockchain […] is a distributed database that maintains a steadily growing list of data items that are hardened against manipulation and forgery. Even the operator of nodes cannot falsify this data. […] This is done by means of consensus. A blockchain can ensure that no third party is needed to guarantee the reliability of a transaction. Wikipedia

[2] I don’t see payroll on the blockchain, BlockWage that is, standing much of a chance, at most in (international) employment relationships where people do not know or trust each other at all. But that is not the point here.

https://berk.es/2017/11/13/f-rite-air-ico

Simple time logging on top of git flow

Apr 5, 2016 Updated Apr 5, 2016

My current team found out that we should have tracked some time over the last year. Extracting time logs in retrospect is not fun. Git helps a lot; combined with chat logs from Slack and Google Calendars, it gives a good basis. A day of grep, sed, and awk, and you have some time logs.

I decided that from now on, I want to track what I start and finish working on in a basic log. And I am using git with git-flow by Peter van der Does, which is what you get when you apt-get install git-flow. This allows special git-flow hooks.

I want this to write logs to a simple text file, but also to have a place where I can call external APIs to insert tracking data into external trackers, when my team uses these.

The result is certainly not a replacement for actual time tracking, but a log that will aid in answering “when did you work on what?”.

[2016-04-06T06:43:13Z] /home/ber/Documenten/BLG_blog STARTED article-git-flow-logging
[2016-04-06T06:43:16Z] /home/ber/tmp/flowtest STARTED a-feature
[2016-04-06T06:43:47Z] /home/ber/tmp/flowtest STARTED another-feature
[2016-04-06T07:12:10Z] /home/ber/Documenten/BLG_blog FINISHED article-git-flow-logging
[2016-04-06T07:43:52Z] /home/ber/tmp/flowtest FINISHED another-feature

These will be written out when using

git flow feature start some-feature
git flow feature finish some-feature
## or the short alternative
git flow feature finish

It requires you to work with git-flow and use feature branches for everything. But you should use topic branches anyway.
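Once the log exists, answering “when did you work on what?” is mostly a matter of pairing STARTED and FINISHED lines. The following Ruby sketch does that for the log format shown above. It is not part of the original setup: the method name feature_durations is made up for this example, and it assumes that directories and feature names contain no spaces.

```ruby
require 'time'

# One log line: "[timestamp] directory ACTION feature".
# Assumption: directory and feature name contain no whitespace.
LOG_LINE = /\A\[(?<at>[^\]]+)\] (?<dir>\S+) (?<action>STARTED|FINISHED) (?<feature>\S+)\z/

# Returns [directory, feature, seconds] for every STARTED entry that has
# a later FINISHED entry for the same directory and feature.
def feature_durations(log_text)
  started_at = {}
  durations = []
  log_text.each_line do |line|
    match = LOG_LINE.match(line.chomp)
    next unless match # skip lines that do not look like log entries

    key = [match[:dir], match[:feature]]
    at = Time.iso8601(match[:at])
    if match[:action] == 'STARTED'
      started_at[key] = at
    elsif (start = started_at.delete(key))
      durations << [*key, (at - start).to_i]
    end
  end
  durations
end

SAMPLE_LOG = <<~LOG
  [2016-04-06T06:43:13Z] /home/ber/Documenten/BLG_blog STARTED article-git-flow-logging
  [2016-04-06T06:43:16Z] /home/ber/tmp/flowtest STARTED a-feature
  [2016-04-06T06:43:47Z] /home/ber/tmp/flowtest STARTED another-feature
  [2016-04-06T07:12:10Z] /home/ber/Documenten/BLG_blog FINISHED article-git-flow-logging
  [2016-04-06T07:43:52Z] /home/ber/tmp/flowtest FINISHED another-feature
LOG

durations = feature_durations(SAMPLE_LOG)
```

On the sample above this yields two entries; a-feature is left out because it was never finished. To run it on the real log, pass in File.read(File.expand_path('~/.git-flow-feature.log')).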

Git-flow triggers its own hooks. So just create a simple utility script that is executable and logs an activity, or calls an external API or whatever you are using. Then call that script from the git-flow hooks.

Note that, as far as I can tell, the upstream git-flow by nvie himself does not have its own hooks. Peter van der Does’ fork has them. That fork is also the source used for the Debian package (so also for Ubuntu).

#!/bin/bash
set -e

working_dir=$(pwd)
feature=$2
action=$1
now=$(date -u +"%Y-%m-%dT%H:%M:%SZ") # ISO8601

echo "[$now] $working_dir $action $feature" >> ~/.git-flow-feature.log

Write that to e.g. ~/bin/log-git-flow-feature and make executable with chmod +x ~/bin/log-git-flow-feature.

Note: when you put an executable named git-foo on your PATH, git makes a subcommand git foo available. You probably don’t want to name this script git-flow-log-feature or similar, to prevent git flow log from becoming a command.

Now just add two hooks and make them executable. This will add hooks to a specific git repo:

echo 'log-git-flow-feature STARTED "$@"' >> /path/to/project/.git/hooks/pre-flow-feature-start
chmod +x /path/to/project/.git/hooks/pre-flow-feature-start
echo 'log-git-flow-feature FINISHED "$@"' >> /path/to/project/.git/hooks/pre-flow-feature-finish
chmod +x /path/to/project/.git/hooks/pre-flow-feature-finish

When I need to call an external time-tracker, the ~/bin/log-git-flow-feature script is the place to do this. An example:

#...
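# -d sends the request body (-D would dump response headers to a file instead)
curl -X POST -H 'Content-Type: application/json' -d "{ \"note\": \"$feature in $working_dir\" }" http://timetracker.io/api/entry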

I’ve created a gist with the contents of the files, so if you want to enhance it, feel free to fork it!

There is a lot of room for improvement:

make this work with “generic” git-hooks instead of relying on git-flow. Should probably match against patterns in branches that are created, merged, rebased etc.
map directories with projects, log name of the project.
don’t use “pwd” but determine the actual working copy of git instead, to allow this to work with fancy setups or when working from within subdirectories.
log git flow feature checkout as well, to log switching between (long)running branches.
fall back on generic git-hooks

https://berk.es/2016/04/06/simple-time-logging-on-top-of-git-flow

Dear Nu.nl. We think this is a pity.

Dec 13, 2015 Updated Dec 13, 2015

Dear Nu.nl,

We see that you allow a whole bunch of trackers, analytics and other aggressive, privacy-violating companies on your site.

We think this is a pity, because NU.nl apparently makes its money by selling its visitors to all kinds of shady companies.

We therefore do not want to make an exception for NU.nl and will continue to block advertisements.

Nu.nl checks whether your browser blocks advertisements. More and more sites do that. The revenue model of NU.nl and similar sites is, you see, starting to collapse. “No ads, no free articles”, they cry.

NU.nl asks you to turn off the adblocker

But the arguments are disingenuous. Advertisements are not the problem. A simple ad from a health insurer next to an article about expensive medicines will bother no one. Especially not if it is a relevant text in which the health insurer promises to keep reimbursing expensive medicines. Request a quote now.

However, NU.nl does not simply sell space to advertise. They sell a spot from which to spy on us, the visitors. A spot where hundreds of companies collect very precise, sometimes downright creepy data in order to sell it to literally anyone who offers money for it.

A random page on NU.nl, on 14 December 2015, has more than ten trackers:

Trackers reported by Privacy Badger

ad.360yield.com
rtax.criteo.com
cdns.gigya.com
sat.sanoma.fi
b.scorecardresearch.com
js-agent.newrelic.com
www.google-analytics.com
rc.bt.ilsemedia.nl
cts.snmmd.nl
zoeken.startpagina.nl

A few of these are present for technical reasons, but the majority is simply commercial: the “collect and sell data” revenue model. You give those a platform. And those, in turn, are what adblockers and privacy tools block. To protect our privacy a little.

Turn those things off, NU.nl. Refuse to sell your visitors to all those big ad brokers. Just place decent ads, as DuckDuckGo does for example: ads that do not track and store everything about the person viewing them. But that simply contain a line of text or a nice picture and link to the advertiser’s site. Ads that do not check how old I am, which fonts I have installed, which sites I have visited, what my mother’s name is and whether I read fast enough. Simply: ads that you get paid to place, and that bring the advertisers visitors. Ads that do not pass my browsing behaviour on to hundreds of companies.

Otherwise you will go under. It takes some effort, but anyone can block your ads. And more and more people will do so: we are simply sick and tired of that junk.

Think about it, NU.nl: why would so many people go to so much trouble to block ads? Look into that, will you?

Consider ads that do not sell my entire heart and soul to more than ten different companies. Then I will gladly turn off my ad blocker for you.

https://berk.es/2015/12/14/beste-nunl-dit-vinden-wij-jammer

Add sorting to your product page in Spree

May 8, 2015 Updated May 8, 2015

For a crafts e-commerce shop, built with Spree, I wanted to add an option to sort the products on the products page.

In this blog post, I’ll describe how to build such a feature, mostly to create some understanding of how such customisations can be developed in Spree. Spree has everything in place to add this, but it can be a bit daunting to find all the bits and pieces that could and should be overridden, which I hope to clarify a bit. In order to build this feature, we re-open some helpers and controllers from Spree and inject some HTML using Deface.

You could write this as a spree extension, but that requires even more moving parts to be in place. And I think it is a better, more efficient way to first write the customisation in your main app and then, later on, when things have settled, extract it into a spree extension.

You could write this TDD, and you should really write at least some tests to spec out your changes. But testing overrides of methods, controller actions and so on, is really daunting in Spree: you’ll be stubbing and mocking a gazillion of unrelated before-filters, finders and scopes. Just to spec that your method adds one other scope, you might need over 100 lines of setup. This is a problem, but not one that I want to address in this post.

Let’s get rolling. First, override the ProductsController#index action. Add a file app/controllers/spree/products_controller_decorator.rb. The idea is to apply a sorting scope that is already available on Product. The ordering scopes I want to add are ascend_by_updated_at, ascend_by_master_price and descend_by_master_price. Once this is implemented, you can add sorting=ascend_by_updated_at, or one of the other sorting scopes, to the URL of the app (e.g. http://localhost:3000/products?sorting=ascend_by_updated_at). This way we can finish the controller and then move on to the view.

Controller

module Spree
  ProductsController.class_eval do
    def index
      @searcher = build_searcher(params.merge(include_images: true))
      sorting_scope = params[:sorting].try(:to_sym) || :ascend_by_updated_at
      @products = @searcher.retrieve_products.send(sorting_scope)
      @taxonomies = Spree::Taxonomy.includes(root: :children)
    end
  end
end

Now that this works, it needs to be secured and cleaned up:

module Spree
  ProductsController.class_eval do
    helper_method :sorting_param
    alias_method :old_index, :index

    def index
      old_index # Like calling super: http://stackoverflow.com/a/13806783/73673
      @products = @products.send(sorting_scope)
    end

    def sorting_param
      params[:sorting].try(:to_sym) || default_sorting
    end

    private

    def sorting_scope
      allowed_sortings.include?(sorting_param) ? sorting_param : default_sorting
    end

    def default_sorting
      :ascend_by_updated_at
    end

    def allowed_sortings
      [:descend_by_master_price, :ascend_by_master_price, :ascend_by_updated_at]
    end
  end
end

There is a lot going on here, but the general idea is to use method aliasing to extend the original index. Furthermore, we have split the scope selection into several methods. This allows setting a default and whitelisting allowed scopes: you don’t want people to pass in just any string and have it called as a method on our model (e.g. sorting=delete_all). I’ve declared sorting_param as a helper method because we’ll need it later on (I know, not very YAGNI, but for the sake of brevity, let’s already implement it here).
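The whitelisting step can be tried outside Rails. The following is a minimal plain-Ruby sketch of the same idea, not Spree code: the constant names and the safe_sorting method are made up for this example. It compares the raw parameter as a string, so arbitrary user input is never turned into a symbol before it has been accepted.

```ruby
# Scope names we are willing to call; everything else falls back to the default.
ALLOWED_SORTINGS = %w[descend_by_master_price ascend_by_master_price ascend_by_updated_at].freeze
DEFAULT_SORTING = 'ascend_by_updated_at'

# Returns a scope name that is safe to hand to `send`.
# `raw` is whatever arrived in the query string, possibly nil.
def safe_sorting(raw)
  candidate = raw.to_s
  (ALLOWED_SORTINGS.include?(candidate) ? candidate : DEFAULT_SORTING).to_sym
end

allowed  = safe_sorting('descend_by_master_price') # a whitelisted scope passes through
rejected = safe_sorting('delete_all')              # anything else becomes the default
missing  = safe_sorting(nil)                       # so does a missing param
```

A request with sorting=delete_all therefore sorts by the default scope instead of calling delete_all on the products relation.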

If you want to find out where a view, controller or model is defining something, you can either run bundle open spree_core (or spree_frontend or spree_backend) or simply clone the stable branch into a directory and browse or search the code there. However, make very sure you have the correct version. For example, when using Github to browse the code, you have a good chance of copying the wrong (too old or too new) versions of a method into your project in order to override.

View and helpers

The first iteration is to copy over Spree’s products partial to app/views/spree/shared/_products.html.erb. We add the sorting links there. In a later iteration we’ll rely on Deface to inject our code, rather than duplicating the partial. But, for now, Rails will simply pick our file instead of Spree’s version. In it, we add our links:

...
<% if products.any? %>
<div class="row sorting"><div class="col-sm-6 col-sm-offset-6">
  Sort by:
  <ul class="list-inline">
    <li><%= link_to "Newest", params.merge(sorting: :ascend_by_updated_at) %></li>
    <li><%= link_to "Lowest price", params.merge(sorting: :ascend_by_master_price) %></li>
    <li><%= link_to "Highest price", params.merge(sorting: :descend_by_master_price) %></li>
  </ul>
</div></div>
<div id="products" data-hook>
...

For the links, we use params.merge(...) in order to persist any search, paging or filter params.

On your development server, this will work, but when you look at products/index.html.erb, you can see cache(cache_key_for_products). It uses cache_key_for_products (https://github.com/spree/spree/blob/3-0-stable/core/app/helpers/spree/products_helper.rb#L54). The list of products will be cached, which is good, because the queries can be very heavy. But that cache disregards our ordering and the active state of our links. We need to add the sorting to the cache key.

In order to override it, we have to add it to a file called app/helpers/spree/products_helper_decorator.rb. Because we are changing a lot inside the method, we can’t really use the aliasing as used earlier, it won’t help us a lot. Instead, we simply override the entire method. And document our changes.

module Spree
  ProductsHelper.module_eval do
    ##
    # Override cache_key_for_products to add caching per sort param.
    def cache_key_for_products
      count = @products.count
      # Instead of default max_updated_at, we look at the first product in the list
      # And we add sorting, so that we get a product-cache per sorting param
      # try(:id) avoids a NoMethodError when the result set is empty
      first_id = @products.first.try(:id)
      sorting = params[:sorting]
      "#{I18n.locale}/#{current_currency}/spree/products/all-#{params[:page]}-#{first_id}-#{sorting}-#{count}"
    end
  end
end
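To see why the sorting has to be part of the key, here is a small plain-Ruby sketch that builds keys in the same shape as the helper above. The method name products_cache_key and all the values passed in are made up for illustration.

```ruby
# Builds a key in the same shape as the overridden helper, but from
# explicit arguments so it can run without Rails or Spree.
def products_cache_key(locale:, currency:, page:, first_id:, sorting:, count:)
  "#{locale}/#{currency}/spree/products/all-#{page}-#{first_id}-#{sorting}-#{count}"
end

# Hypothetical request state shared by both sort orders.
common = { locale: :en, currency: 'EUR', page: 1, count: 12 }

cheapest_first = products_cache_key(**common, first_id: 7, sorting: 'ascend_by_master_price')
priciest_first = products_cache_key(**common, first_id: 42, sorting: 'descend_by_master_price')
```

The two keys differ, so each sort order gets its own cached fragment. Without the sorting (and first id) in the key, both requests would share one fragment and the second visitor would see the first visitor’s ordering.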

We now have a working sorting feature, but it needs improvement.

Deface

One thing we should clean up is our override of the partial file. Unless you want to change how a file behaves, or want to alter its HTML structure, you should avoid copying it over. That makes upgrading a lot harder. And it can break a lot of add-ons that want to inject HTML into the views.

Deface is made for this, so let’s use it. Create a file app/overrides/sorting_links_in_products.rb (And possibly restart the rails server, I’ve found that deface sometimes does not pick up new files otherwise):

Deface::Override.new(:virtual_path  => "spree/shared/_products",
                     :insert_before => "#products[data-hook]",
                     :partial       => "spree/shared/sorting_links",
                     :name          => "sorting_links_in_products")

And a partial app/views/spree/shared/_sorting_links.html.erb

<% if products.any? && params[:controller] == 'spree/products' %>
<div class="row sorting"><div class="col-sm-6 col-sm-offset-6">
  <%= t(:sort_by) %>
  <ul class="list-inline">
    <li><%= link_to_unless current_sorting?(:ascend_by_updated_at), t(:newest), params.merge(sorting: :ascend_by_updated_at) %></li>
    <li><%= link_to_unless current_sorting?(:ascend_by_master_price), t(:lowest_price), params.merge(sorting: :ascend_by_master_price) %></li>
    <li><%= link_to_unless current_sorting?(:descend_by_master_price), t(:highest_price), params.merge(sorting: :descend_by_master_price) %></li>
  </ul>
</div></div>
<% end %>

Some changes to the first iteration of this erb-code, are that we now use the locales to render strings, and that we only render the sorting-link as a link when it is not active.

I’ve also added an additional condition where I check for the controller. This is not the cleanest, but since the products partial can be reused (it is a shared partial), and we inject into this partial regardless of who is requesting it, we should only add the sorting links when we are sure the partial is being rendered via the controller that can handle the sorting. For example, the partial is also used when rendering the products in a taxonomy. There, the products are already sorted, so we don’t want to sort them again. Moreover, the controller rendering them there will ignore the sorting param and won’t have the helper methods we use.

Note that we could clean this up even more by DRY-ing up the link_to_unless current_sorting... repetition with e.g. another helper, or partials, or both, but IMHO that is overengineering. Some repetition in views is fine: it gives us the freedom to place some icons, override a text and so on, much more easily and cleanly. We can now remove our version of the products partial, since it is no longer needed to override the Spree version.

One cleanup was to use a current_sorting? helper. Which does not exist, so let’s create that with the familiar decorator/monkey-patching. Add it to app/helpers/spree/products_helper_decorator.rb:

module Spree
  ProductsHelper.module_eval do
    #... cache_key_for_products

    def current_sorting?(key)
      sorting_param == key.to_sym
    end
  end
end

In this helper, sorting_param is requested from the controller, which falls back to the default when no sorting is given. That’s why we made it a helper method earlier.

We now have a few sorting links that are implemented without hacking Spree, and without copying over entire classes or files. We can still lean on Spree and upgrade it quite safely.

Oh, and I’ll leave the CSS and declaration of localised strings as homework.

Do you often override these Spree core items? Do you know any tips and tricks on how to manage these during Spree upgrades?

https://berk.es/2015/05/09/add-sorting-to-your-product-page-in-spree

When FactoryGirl leads to bad habits

Feb 18, 2015 Updated Feb 18, 2015

FactoryGirl is a solution for cleaning up the repetitive task of setting up test data, used in many Rails projects.

I think FactoryGirl encourages bad design by hiding design problems.

The need for repetitive or complex test-data is an indication that your code and its API need improvement. Instead of improving the application, FactoryGirl enables you to only improve this in your test suite: that is not fixing at all, it is hiding the problem.

Building up a state before we can test

We have an app that measures fuel usage. The feature, or integration, test for inserting a “measurement” could look something like this. Without FactoryGirl, we can immediately see a problem. (The testing framework is irrelevant here; I am using RSpec because it lets me be concise.)

scenario 'secretary on a project with cars adds new fuel-usage entries and sees them' do
  project = Project.create(name: 'new bridge in sahara', starts_on: Date.today.beginning_of_year, ends_on: Date.today.end_of_year)
  user = User.create(email: 'speedy-joe@example.com', password: 'secret', password_confirmation: 'secret')
  user.roles.create(name: 'secretary', on: project)
  # Persist the car and keep a reference: the visit below needs car.id
  car = Resource.create(type: :car, slug: '13-37-42', name: 'Joes Truck', project: project)
  sign_in user

  visit("projects/#{project.id}/resources/#{car.id}/entries/new")
  fill_in('Measurement', with: 12)
  click_button('Create Measurement')
  expect(page).to have_content("#{Date.today}: 12 liter")
end

If you consider that only the last four lines are relevant for this test, the first block is just noisy set-up. Now, we could refactor this into a before block in RSpec, or into a helper method create_project_with_secretary_and_car. But that is what FactoryGirl does for us, with a much nicer interface than such ad-hoc helpers. With FactoryGirl the test could look like:

scenario 'secretary on a project with cars adds new fuel-usage entries and sees them' do
  user = create(:user_on_project, role_name: 'secretary')
  project = user.project
  car = create(:car, project: project)
  sign_in user

  visit("projects/#{project.id}/resources/#{car.id}/entries/new")
  fill_in('Measurement', with: 12)
  click_button('Create Measurement')
  expect(page).to have_content("#{Date.today}: 12 liter")
end

The first benefit becomes clear because we see far fewer lines of irrelevant objects being created. The second benefit of FactoryGirl immediately becomes clear too: we no longer need to provide irrelevant attributes such as the project name.

We have, however, not fixed our application: the implementation is still clumsy and our application’s API is still unfriendly. A rake task, an “import” job or a REST client still needs to walk through (or consider the state of) all the other records.

If you consider your tests to be one (important) user of your API, a client, the problem is clear: we need to fix the application, not the client using that API.

scenario 'secretary on a project with cars adds new fuel-usage entries and sees them' do
  car = Car.new('Joes Truck')
  project = Project.create(name: 'New Bridge', resources: [car])
  user = Secretary.create(email: 'speedy-joe@example.com', password: 'secret', project: project)
  sign_in user

  visit("projects/#{project.id}/resources/#{car.id}/entries/new")
  fill_in('Measurement', with: 12)
  click_button('Create Measurement')
  expect(page).to have_content("#{Date.today}: 12 liter")
end

We have added code to our application that is used by one of the clients: the tests. We have abstracted some of the details away: a client does not need to know how we have implemented roles. For example, we can use Car to build a resource, which takes care of creating a slug and setting other defaults. If we had instead created some “car” trait in FactoryGirl, that effort would not have benefited our app. Worse: we would still maintain knowledge about how “resource” deals with cars in our test suite. That knowledge is merely centralised in a FactoryGirl file, but still lives in the client: the test suite. Worse again: if your trait happens to be “helpful” and creates a Project too, the factories become unpredictable. I’ve spent many hours debugging tests only to find out that some factory “helpfully” created its own relations, leaving me with “why the hell are there three Projects, all in a different state, when I thought I created only one?”
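What such a Car could look like is sketched below in plain Ruby, without ActiveRecord. This is an illustration under assumptions, not the application’s actual code: in particular, deriving the slug from the name is a guess at what “taking care of creating a slug” might mean.

```ruby
# Plain-Ruby stand-in for the generic resource (hypothetical, no database).
class Resource
  attr_reader :type, :name, :slug

  def initialize(type:, name:, slug: nil)
    @type = type
    @name = name
    # Assumption: when no slug is given, derive one from the name.
    @slug = slug || name.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
  end
end

# Car knows its own type and defaults, so callers only supply a name.
class Car < Resource
  def initialize(name, slug: nil)
    super(type: :car, name: name, slug: slug)
  end
end

car = Car.new('Joes Truck')
plated = Car.new('Joes Truck', slug: '13-37-42')
```

The point of the design is that the defaults now live in the application, where controllers, rake tasks and workers can use them too, instead of in a factory file that only the tests can see.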

As with any good API-design, you want your clients to remain as dumb as possible. You certainly don’t want a client having to know how projects, roles, users, resources and so on are related to one another.

In the last example, we have improved our application. Introduced useful defaults, additional, useful models all of which are immediately used by at least one user of the code: the tests.

Other users can start using these Secretary or Car models and the automatic defaults on Projects too, now. They allow us to clean up code that used the clumsy interface: controllers, rake-tasks, async-workers and so on.

Quite often, I’ve found that by removing the usage of a FactoryGirl.create in a test, I could improve not only the code under test, but a lot of other places in the application too. Suddenly you realize that the “case params[:resource_type]” in your controller can be completely removed and replaced with a few smart calls to Car.new, Boat.new and so on. Quite often it even resulted in improvements to the UX: instead of presenting the user with a “starts on is a required field”, we set a sane default and save it. All this because we learned from our tests that the code had some issues. All this we would not have learned if we had instead cleaned up the tests with FactoryGirl.

Unit testing

I’d argue that with Unit tests there is even more reason not to use FactoryGirl. I often go as far as to forbid the usage of FactoryGirl in unit-tests.

First of all, there is the “unit” part of the unit test: a test for Role applies to Role alone. If you somehow need to store a User with every Role in order to test that Role, you have tightly coupled code. When your Role cannot live without a User (or vice versa), you don’t have a Role and a User, but a UserWithRole unit. Since we want decoupled code (we want decoupled code, don’t we?), getting into this mess is something to avoid. FactoryGirl does not help you avoid this; it even encourages this design somewhat:

describe User do
  describe '#function' do
    it 'includes the name of the role and its project' do
      user = create(:user, :secretary_on_project)
      # The method under test is #function; eq compares by value, be by identity
      expect(user.function).to eq 'Secretary for New Bridge'
    end
  end
end

class User
  def function
    roles.map { |r| "#{r.name} for #{r.project.name}" }.join(',')
  end
end

This looks like a nice, clean test. But only because we have hidden away all the complexity of coming to a state in which the user has a role and a project and both have a name. And because the test looks nice and clean, we might think we are done. We are not: One line of code with four issues (and probably actual bugs):

It cannot deal with a user that has no roles.
It cannot deal with a user with roles which have no project.
It will render strange texts when a role or a project has no name.
The User is dealing with attributes on roles, and even attributes on that roles’ attributes (Law of Demeter).
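The first three issues are easy to reproduce with plain Ruby objects. The sketch below uses Struct-based stand-ins for Role and Project (they are not the application’s ActiveRecord models) together with the same one-line function as above.

```ruby
# Hypothetical stand-ins for the real models; no database involved.
Project = Struct.new(:name)
Role = Struct.new(:name, :project)

class User
  attr_reader :roles

  def initialize(roles = [])
    @roles = roles
  end

  # The same one-liner as in the example above.
  def function
    roles.map { |r| "#{r.name} for #{r.project.name}" }.join(',')
  end
end

# 1. No roles: we silently get an empty string.
without_roles = User.new.function

# 2. A role without a project: calling .name on nil raises.
orphan_error =
  begin
    User.new([Role.new('Secretary', nil)]).function
    nil
  rescue NoMethodError => e
    e.class
  end

# 3. A role without a name: the text starts with a dangling " for".
nameless_role = User.new([Role.new(nil, Project.new('New Bridge'))]).function
```

None of these cases show up when every test starts from a fully populated factory; they only appear when the object under test is built by hand, starting from empty.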

Sidenote: a method like function should, IMO, not be part of the model; it is presentation and should live either in the views or in some decorator or presenter such as UserPresenter. For the sake of the example, let’s leave it here. Just note that our tests are telling us this too: we are creating a presentable name in an object that deals with database state, so we are violating the Single Responsibility Principle.

When I recently encountered several of these exact issues in one of my projects’ codebase, the developers argued that the solution was to add more “validation”. “Yea, but our models require a role and each role requires a project. And we are only rendering stored users, so this should never happen”.

It will happen: when you render the unstored role back on the screen because there are validation issues with storing the user or its role. Or when you delete a project and do not deal correctly with deleting the associated roles and users.

When you test with a clean, predefined set of test data, you won’t notice such edge cases. But when you test with the simplest and most extreme edge case, an unstored, empty User, most if not all of these problems will be avoided in the design, rather than in validations, delete hooks, checking for nil in views and so on.

If we had tested without FactoryGirl, it would have looked something like this:

describe '#function' do
  # A bare, unsaved User; inside `describe User`, RSpec's `subject` returns the same thing.
  before { @user = User.new }

  context 'when user has roles' do
    # Struct.new(...) builds a throwaway class; .new(...) creates the stand-in role.
    before { @user.roles = [Struct.new(:name, :project).new('Secretary')] }
    it 'includes the name of the role' do
      expect(@user.function).to include 'Secretary'
    end

    context 'when role has project' do
      before do
        project = Struct.new(:name).new('New Bridge')
        @user.roles = [Struct.new(:name, :project).new('Secretary', project)]
      end
      it 'includes the name of the project' do
        expect(@user.function).to include 'New Bridge'
      end
    end
  end

  context 'when user has no roles' do
    it 'is "no function"' do
      # eq compares values; be would compare object identity and fail for strings.
      expect(@user.function).to eq 'no function'
    end
  end
end

Now our tests tell us about the problems:

We explicitly have to deal with the situation when a user has no roles.
We see clearly that the “function” is reaching too deep, through roles into the project.

Because it is made clear, we can now fix that:

describe '#function' do
  context 'when user has roles' do
    # The stand-in role only needs to respond to as_function.
    before { subject.roles = [Struct.new(:as_function).new('Secretary on New Bridge')] }
    it 'includes role as a function' do
      expect(subject.function).to eq 'Secretary on New Bridge'
    end
  end
  context 'when user has no roles' do
    it { expect(subject.function).to eq 'no function' }
  end
end

The tests are clean and explain very well how we deal with edge cases, because we have minimized the issues that appear in edge cases and delegated the methods to the right objects.

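class User
  def function
    # Covers both an unset (nil) and an empty list of roles.
    return 'no function' if roles.nil? || roles.empty?
    roles.map(&:as_function).join(', ')
  end
end
class Role
  def as_function
    # The role itself decides how to present a missing project.
    return 'no project' unless project
    "#{name} on #{project.name}"
  end
end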

We have now clearly improved our code, not just the tests. Bonus: since we no longer have to store stuff in the database, our unit-test is much faster now.

Sidenote: subject.roles = [Struct.new] or subject.roles = [double(:role)] is a way of mocking the Role: neither a User nor its unit test needs to know exactly how Role works, and making that explicit in a test is often considered good practice. Yet mocking and stubbing have a lot of downsides too. That is beyond this post. One important thing, though: this code will not work with ActiveRecord relations, since AR does not allow roles to be assigned anything other than a list of Roles: passing in a double or Struct will cause an exception. I use mock_model() to fix this.
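To make the stand-in idea concrete outside of RSpec and ActiveRecord, here is a minimal sketch in plain Ruby. The User class below is a simplified re-implementation for illustration, and FakeRole is a hypothetical name; neither is code from the project discussed here.

```ruby
# Hypothetical plain-Ruby User, without ActiveRecord, so the example runs on its own.
class User
  attr_accessor :roles

  def function
    return 'no function' if roles.nil? || roles.empty?
    roles.map(&:as_function).join(', ')
  end
end

# Stand-in for Role: User only needs something that responds to as_function.
FakeRole = Struct.new(:as_function)

user = User.new
user.roles = [FakeRole.new('Secretary on New Bridge'), FakeRole.new('Treasurer on Old Tunnel')]

puts user.function      # joins the stand-in roles with ', '
puts User.new.function  # a user without any roles
```

The point of the sketch is that User never learns what a Role really is: anything that answers as_function will do.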

Whether you consider a controller a unit and test it as such, or whether you test it in a more “integration” style, makes little difference for the usage of FactoryGirl. A controller tested as a unit has no interaction with the database at all, so it really needs no FactoryGirl. When you test a controller and let it go through the database, that database needs state. But the above “feature test” examples apply here even more: if a certain controller action needs a lot of records in the database before it “can work”, the problem lies there, not in your tests which create all these records.

Don’t hide problems, fix them.

FactoryGirl is a solution for hiding problems with your application and its API. When a test tells you that the API is clumsy, or dirty, instead of fixing that API, FactoryGirl encourages you to hide this problem.

I’d argue that when some test needs 20+ records created before it can run, it is far better to leave these 20+ lines of code creating these records in your tests. At least you then admit and document the problem you have. Refactoring them with FactoryGirl makes it appear as if there is no problem with your application code. This is even more true for unit-tests that use FactoryGirl.

When to use FactoryGirl?

Is it completely useless, and should you never use it then? No, it has a very good use case: when your API does demand complex state, or lots of repetitive attributes. When your Order requires a ShippingAddress, it is poor practice to hide this away in factories. Even more so in the unit tests of Order. Instead, consider introducing a ShippedOrder model, an OrderBuilder service object, or even an Order.create_with_address() helper. Improve the API of your application.
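A minimal sketch of that last suggestion, assuming plain Ruby objects rather than ActiveRecord models. Only the names Order, ShippingAddress and create_with_address come from the text; the attributes and keyword arguments are hypothetical.

```ruby
# Hypothetical plain-Ruby models; in a Rails app these would be ActiveRecord classes.
ShippingAddress = Struct.new(:street, :city, :postal_code, keyword_init: true)

class Order
  attr_reader :shipping_address, :items

  def initialize(shipping_address:, items: [])
    raise ArgumentError, 'an order needs a shipping address' if shipping_address.nil?

    @shipping_address = shipping_address
    @items = items
  end

  # One call that builds the order together with the address it requires,
  # so neither clients nor tests have to assemble the pieces by hand.
  def self.create_with_address(street:, city:, postal_code:, items: [])
    address = ShippingAddress.new(street: street, city: city, postal_code: postal_code)
    new(shipping_address: address, items: items)
  end
end

order = Order.create_with_address(street: 'Diagon Alley 13a', city: 'London', postal_code: '1337')
puts order.shipping_address.city
```

With a helper like this in the application itself, the requirement is visible in the API instead of being tucked away in a factory definition.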

I also want to give a shout-out to the developers of FactoryGirl: I’ve used it with great pleasure to replace the even more clumsy fixtures from Rails core. Please don’t read this as an attack on FactoryGirl and its developers; these are personal observations on using a “test data framework”.

But when Address requires you to type your address over and over in your tests, then FactoryGirl is a nice DSL to put the “generating a street, city and postal code” in a central place. After all, generating this is part of the task of the client, not the API. FactoryGirl is a fancy solution for creating such data. But since FactoryGirl is very tightly coupled to the database and the models that represent that database, it is way more than that. When you need “address data” in your tests, you could just as well add a small helper:

class LoremIzer
  def address
    { street: 'Diagon Alley',
      number: '13a',
      city: 'London',
      postal_code: '1337'}
  end
end

It is only when such helpers grow large and unwieldy, when they start to turn into a framework, that FactoryGirl (or another data generation gem) has a good place in your app.

https://berk.es/2015/02/19/when-factorygirl-leads-to-bad-habits

Selecting a Good WYSIWYG for Rails: it's all about use-cases

Feb 2, 2015 Updated Feb 2, 2015

There is no such thing as “a good WYSIWYG” without looking closely at the case you are trying to solve and the preconditions you want to set. WYSIWYG is, unfortunately, a label that is applied to far too many solutions for varying problems.

When people ask for “A WYSIWYG-editor” they are often asking for a solution to one of the following three problems:

Needing to add markup to body texts.
Needing more freedom when laying out pages, without having to go through the development team and/or releases.
Needing to edit copy without having to go through the development team and/or releases.

Unfortunately, many WYSIWYG-editors start off in one such use-case and then try to incorporate the other two, fail, and grow unwieldy. I consider “a good WYSIWYG-editor” one that stays close to its use-case and implements that well. But even then, WYSIWYG-editors far too often add more problems than they solve. Sometimes you don’t need a WYSIWYG-editor at all, so the first thing you should do as a developer, before the need or request for such an editor comes up, is to look closely at the editing workflow and see where you can improve it. Quite often a request for “a WYSIWYG-editor” is actually a sign that the user has different needs than what you built into the CMS. And when it appears as a requirement in the specifications, it is often a sign that the person drafting these requirements did not think the editing workflow through very well.

Of course there are really good reasons to need such editors. But in that case, it is very important to know what it is used for. And then pick one (or more) to solve these problems. “A WYSIWYG” is by no means a magic potion that you can drop on your CMS to solve all the editor’s problems.

Marking up body-texts.

This is your typical use-case. But note that such an editor is not very good at layout. Nor should it be: layout is a different use-case.

An editor for such a use-case must allow setting some inline styles, editing inline text and assets, and adding styles (mostly classes) that will affect the layout. A good WYSIWYG-editor will visualise this process and apply the styles for these classes inside the editor. E.g. if you mark something as “aside quote” it will be styled as “a quote that is pulled aside”.

The most obvious place where you will see this going wrong is with responsive designs. HTML was always meant as a “suggestion” to the browser on how to render it, with CSS to instruct the browser on how to style, place and lay it out (and yes, CSS has been rather poor at this too, but that is a different issue). Your texts (the body-texts) are best treated that way too: the WYSIWYG-editor adds markup which the “design”, most likely through CSS, can then leverage to style, design and lay out.

Storing the layout-information with(in) the content is a guarantee for disaster when you decide to change the designs. I’ve done several jobs where I helped people into the “new mobile age” by writing large, ugly parsers that clean body-texts of any layout that breaks mobile, but was stored in the database. Believe me, you don’t want that markup in your database.

Having such “content” in your database…

<p style="float:right; border: 2px solid #eee;">
  <b>Computers are useless. They can only give you answers.</b></br>
  <b><i>— Pablo Picasso</i></b>
</p>

…is awful (and this is, by far, not the worst a WYSIWYG-editor can and will output. Divs. Divs everywhere).

<blockquote class="aside">
  Computers are useless. They can only give you answers.
  <footer>— Pablo Picasso</footer>
</blockquote>

…is better.

My personal favorite is WYMeditor, though I am looking for a more modern looking alternative.

When the audience allows for it (most important: if I can train/instruct the users), Markdown with a live preview is preferred. My go-to is epiceditor.

Markdown has several huge benefits: you are storing a simplified, reduced markup in the database, so there is no need for ugly HTML-parsing or regexp replacement of body-texts when changing the layout. And depending on where you want to send the text, you can compile it to the appropriate markup. HTML for the web, a different HTML through JSON for your app, PDF, epub, and so on. All of that is really only possible with very clean HTML. And allowing “only clean” HTML without using a different markup-language is very, very hard.

With something like:

> Computers are useless. They can only give you answers.
> — Pablo Picasso

… the error tolerance is large; where before you had to allow class="aside" but not class="button btn-round", you now no longer need to deal with this. When you allow a "classic" WYSIWYG you will allow all sorts of (broken) HTML too.

One thing is certain: prepare for support-hell. You’ll be debugging broken HTML from the moment you deploy a classic WYSIWYG-editor. “The mobile site is broken since the last deploy”. “No, it’s not. It’s broken because you placed a left-floating image in a title in a table, probably not on purpose, but you did nonetheless.”

Layouting Pages.

If you really want your users to do this, build a CMS for this. This is certainly not something a WYSIWYG-editor can solve for you. You might need to build something that allows users to move widgets around, e.g. through Apotomo. If you consider that even a professional web developer with a good understanding of CSS, a nice grid-framework at her disposal and clean HTML5 boilerplate has a hard time building nice layouts, you’ll understand that a tool to create such layouts without touching all that code is nearly impossible. It is certainly not something you’ll drop on your project and release for your interns to add content to. So this is really not about WYSIWYG-editors anymore, but all about building an intuitive CMS. Which is a completely different topic.

But, you could assign a small part of the page to be “layouted”. In that case, often the “layout possibilities” are very limited and the aforementioned Marking up body-texts applies. This is not about choosing an editor, but about defining the exact possibilities and areas that need to be edited and layouted.

Alternatively, you can assign small areas that can be edited, in combination with a tool to choose from pre-defined layouts. This is what e.g. cmsimple does. This works very well when you have a site with only a few pages and little dynamic content. The moment you need to mix this concept with, say, an upcoming-events calendar, it becomes muddy and hard. Your users now need to edit everything around this calendar in one place (the WYSIWYG-editor for the layout) and the other pieces somewhere else (e.g. in an events-CRUD-list). This is a hard problem, but it can be solved by mixing in copy-editing tools.

Copy-editing.

Often, in-between all the dynamic parts, you want to change the copy-writing. Case at hand: a list of “upcoming events”. It shows a dynamic list of events that will start in the next three weeks. The text and title above that list should be editable. You could use Mercury (which is what cmsimple leverages), but in that case you’d need to build a backend to capture and save/update the content. Worse is that you’ll have two different CMS-es: one to edit the copy-writing and one to manage the events.

For this, there is phrasing: it has all that in place and works nicely. The copy-writing can be kept down to the bare minimum: something that is done by a different person, or at the very least in a different workflow. The moment that someone has to toggle between managing content and editing the copy-writing within one task is the moment you’ll need to refine the workflow. Most probably by moving some “copy” out of the copy-writing and into its own management. Say someone edits the subtitle on the homepage on a daily basis to contain the latest headline of the news-section: that user really needs “highlight-management” within the content-management for the news-items.

Note that regardless of the solution, performance could become a bottleneck when there is a lot of copy-writing, because the texts now no longer come from a simple template-file but from a database. This is solvable with simple caching. And note that regardless of the solution, intermixing this with multilingual copy-writing will become a hell that you cannot ever hope to get out of.
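The “simple caching” mentioned above could look roughly like this. It is only a sketch: CopyCache and its loader block are hypothetical names, and the block stands in for whatever database lookup your copy-editing tool performs.

```ruby
# Hypothetical in-memory cache for editable copy.
class CopyCache
  # The block is called with a key whenever that key is not cached yet.
  def initialize(&loader)
    @loader = loader
    @cache = {}
  end

  # Returns the cached text, loading and storing it on the first request.
  def fetch(key)
    @cache.fetch(key) { @cache[key] = @loader.call(key) }
  end

  # Call this when an editor saves new copy, so the next fetch reloads it.
  def expire(key)
    @cache.delete(key)
  end
end

queries = 0
copy = CopyCache.new do |key|
  queries += 1 # stands in for a database query
  "Text for #{key}"
end

copy.fetch('events.title')
copy.fetch('events.title')
puts queries # the loader ran only once for two fetches
```

In a Rails app you would more likely reach for the framework's own cache store; the sketch only shows why the database is hit once per text rather than once per page view.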

But then again. Adding a WYSIWYG-editor in itself means entering a hell that you cannot ever hope to get out of.

This post started as a comment on Reddit in r/rails and was then edited into a blog post.

https://berk.es/2015/02/03/selecting-a-good-wysiwyg-for-rails-its-all-about-usecases

Publishers, don’t blame me when I don’t buy my ebooks from you!

Feb 3, 2014 Updated Feb 3, 2014

Writers and publishers are scared. People no longer buy their books. Gosh. Back in 2010 I already wrote that it is much easier to get illegal books than legal ones.

Nothing has changed since. (Incidentally, in 2013 I bought almost €200 worth of books for my e-reader, but that aside.)

For me, as an avid reader, the options are roughly the following, in order of difficulty:

Not reading the book at all.
Just buying the paper version instead.
Downloading the book illegally.
Buying the book.

Here is why: I have no Windows computers at home. So all books from bol.com and other large ebook sellers are automatically ruled out: they require extremely user-unfriendly, ugly software. Software that provides the DRM: it prevents you from lending the book to your mother once you have finished it. And this software does not run on Linux.

If you categorically exclude groups of people from your shop, you should not be surprised when those groups no longer want to buy anything from you.

Some of you are now seeking salvation in what is called on-device purchase: buying a book directly on my e-reader. Unfortunately that also comes with DRM, so lending the purchased book to my wife is not possible here either (something I can of course simply do with an illegal book; with a paper book too, for that matter). It is, however, a small improvement over not being able to buy at all, because now I am at least allowed to pay. But only a very small one: if Kobo disappears (and, as you know very well, booksellers disappear rather often), or if I decide that my next e-reader should be a Frobo, I lose all the books I bought there for good. Gone. That prospect makes me think five times before buying my books there. And you will surely understand that when I have my Frobo in a few years, I will of course not buy these books all over again; by then the piratebay really is the better alternative.

And among your fellow publishers of IT books there are some very enlightened minds who sell DRM-free downloads. Go and talk to them. There I can support the author by paying for the book. Paying is very easy, even. Pragprog and O’Reilly earn a lot from me. To be honest, I would not even know whether their books can be found illegally.

But the worst part, dear publishers, is that downloading illegally is not that easy at all. The user experience is downright bad. If you manage to find your way through the jungle of torrent sites, torrent proxies, download-share sites and encrypted rar files, the downloaded books are often of very poor quality. Many books have been converted very badly, with ugly hyphenation, full of spelling errors and with missing images, and you often have to search for a long time before you find a specific title that can actually be downloaded. That experience can really be made much better without large investments. People like me gladly pay a few euros for that; believe me, Netflix and Spotify proved that long ago. Lessons have been learned there: do something with them.

But much worse still is that many books are not published “digitally” at all. Are you afraid of digital, perhaps? It is not as if the whole production process of a book still runs on typewriters and type cases: you really do have them in digital form. On top of that, the market turns out to be so desperate for ebooks of books that you do not publish as ebooks that people simply make them themselves: scanning, correcting, retyping. What an effort, what an amount of energy. But this community is so enormous that virtually all relatively popular books can by now be found this way. Where you, out of fear of pirates, did not publish books digitally, you have by now even been made redundant by those same pirates.

So stop whining about piracy immediately. There is only one guilty party to point at, and that is you. You have failed to offer a good, easy way for people to buy your books at all. Often you have even neglected to sell books digitally altogether. If you now point at me, your customer, for not buying enough books, then besides being stupid and customer-unfriendly you are simply terrible entrepreneurs who deserve to go bankrupt. First get your own shit in order, adapt to your customers and their wishes, and stop blaming others for your own mistakes.

https://berk.es/2014/02/04/new-post

Olaf van den Heuvel is a maximal nitwit

Nov 27, 2013 Updated Nov 27, 2013

One of the nice things about a disruptive concept like Bitcoin is that “the establishment” does not know what to make of it. That is not a problem: partly it simply takes time, and partly it leads to market renewal (as in: those who keep insisting that a horse and cart really is a better means of transport than a car or bicycle go bankrupt).

But it is even nicer to watch the Dr. Clavans put their incompetence on display.

Today: Olaf van den Heuvel. There is not much wrong with his claim that “Bitcoin will not be with us for long”: it is his opinion and a good warning. Watch out: risk, bubble behaviour!

But then he “substantiates” his claim. And shows that he really has no clue what he is talking about.

«As long as I don’t have more transparency about what determines the scarcity of Bitcoin, I find it very hard to see it as a reliable means of payment».

He opens his story with this; the transcription above is from when he repeats it. Bitcoin. No transparency.

If there is one thing that sets Bitcoin apart from all payment systems and economies we have known so far, it is precisely that transparency!

First, there is the fully open protocol. The papers, all the (academic) discussion about them, all the economic models and so on are as transparent as can be. Anyone who spends more than five minutes reading about Bitcoin knows that there can never be more than 21 million of them, and that this is mathematically fixed. Anyone who reads on for more than twenty minutes understands that all economic models, not only of the protocol but of all its side effects, are discussed ad nauseam in full transparency.
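The 21 million cap can be checked with a few lines of arithmetic. The sketch below uses the publicly documented protocol parameters, which are not stated in this post: a starting reward of 50 BTC per block, halved every 210,000 blocks, counted in whole satoshis (one hundred-millionth of a Bitcoin). The constant names are chosen here for illustration.

```ruby
# Protocol parameters as publicly documented for Bitcoin (not taken from this post).
SATOSHI_PER_BTC  = 100_000_000
INITIAL_REWARD   = 50 * SATOSHI_PER_BTC # reward per block at launch, in satoshis
HALVING_INTERVAL = 210_000              # number of blocks between two halvings

total  = 0
reward = INITIAL_REWARD
while reward > 0
  total += reward * HALVING_INTERVAL
  reward /= 2 # integer division: fractions of a satoshi are dropped
end

puts total.to_f / SATOSHI_PER_BTC # total supply in BTC, just under 21 million
```

Because each halving adds half of what the previous period added, the sum is a geometric series that approaches, but never reaches, 21 million.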

Then there is the almost religious open source mantra: Mr van den Heuvel’s employer is free to develop its own Bitcoin program, app or site. In fact, the whole community would warmly applaud that. I have no idea what exactly is inside my random reader, the device with which I confirm all my payments. But I know fairly precisely how every Bitcoin is created, distributed, confirmed and secured, by reading specs, whitepapers and source code. That is possible because all the software is transparent: open source.

And then there is the blockchain: if you know who or what is behind an account number, anyone can trace where the money goes. It simply cannot get more transparent.

Mr van den Heuvel, you are an economist with a good position. Economics desks call you for your knowledge and opinion on technical matters. I therefore expect you to know what you are talking about. But when you put forward (the lack of) transparency as an argument for why Bitcoin is a bad idea, you not only show that you have no clue what you are talking about, you above all admit that you are a nitwit.

https://berk.es/2013/11/28/olaf-van-den-heuvel-is-een-maximum-flapdrol

How do I get Bitcoin, and a Bitcoin account?

Nov 26, 2013 Updated Nov 26, 2013

I have been asked this question several times over the past few weeks. Let me start with the familiar warning I gave before: know what you are doing. Realise that there is a big chance you will lose everything. The chance that you lose everything may well be bigger than the chance that anything remains of your bitcoin adventure. So this is step 1: inform yourself! And cover yourself: do not move your entire savings account, only money you can afford to lose.

A Bitcoin account.

People often ask me how they can open an account. Or remark that this is difficult. It is anything but difficult! The question mostly rests on a common misunderstanding: you do not need to get a bitcoin account from a site, company or organisation!

Download an app. That is all; you now have an account. On your phone, so do not put large amounts on it!
Download a program on your PC: BitcoinQT, MultiBit or Bitcoin Armory. That is all; you now have an account. On a computer that you share with all kinds of viruses, family members, colleagues and so on. So here too: do not put large amounts on it!
Open one of the many web-based wallets. You do not manage these yourself, which makes them only partly trustworthy. Again the lesson is not to put too much on them: you are not protected, and chances are high that the service where you store your money will be broken into at some point and you will lose everything. Blockchain.info is the best known.

Exchanging Bitcoins.

The easiest way is to buy them for euros. At an exchange office. There are many of these, in all shapes and sizes. Many beginners immediately try to open an account at the biggest and best-known trading platform, MtGox. That is like going to the Amsterdam stock exchange to exchange Thai baht for your summer holiday. You do not do that on a trading floor, but simply at a GWK exchange office around the corner. MtGox and the European Bitcoin-central are such trading floors. But there are plenty of GWKs in the Netherlands. And plenty of scammers.

Some well-regarded offices are:

Bitonic; pay with iDeal (or use this link to Bitonic, for which I receive a commission).
Bitcoin.de; pay by bank transfer.

More can be found at How to Buy Bitcoins in The Netherlands.

If you want to exchange a lot of money, the rule of thumb is to test the waters first: exchange a tenner, then exchange it back. That way you build trust and get some insight into the procedures, duration and reliability of the office. After all, you would not walk into the call shop/launderette/phone shop around the corner with all your holiday money in a briefcase to exchange it for Brazilian reals without trusting the owner at least a little, would you?

Or look for someone who is willing to help you in person. On localbitcoins you can often find people you can visit to exchange a fifty-euro note. Of course the same warning applies here: do not drive over with two hundred thousand euros in a briefcase :).

Mining is a different story altogether and requires a lot of knowledge, patience and expensive, specialised hardware. For someone who only wants to buy a pizza or two, this is by far the hardest option.

It is very easy.

Despite all the complicated concepts and the cryptographic and economic models behind it, using it is very simple.

You do not have to sign up, identify yourself or register anywhere.
You do not have to buy complicated mining computers to get some (fractions of) Bitcoins.
You do not have to go through any registration or other complex procedures to exchange some euros for Bitcoins.

An app on your phone and a contact on localbitcoins who exchanges your euros on the spot and transfers them to your phone is enough. Or an iDeal payment at Bitonic, transferring from there to the account on your PC, is sufficient as well.

And spend them.

I am not an economist, but the principle is so simple that even I can understand it: a Bitcoin is worth as much as what you can buy with it, whether goods or other money. So its value is largely determined by what you can buy with it. If nobody buys with it, nobody sells products for Bitcoin either. And then that value is low. The best “investment” is therefore to (also) spend it!

https://berk.es/2013/11/27/hoe-kom-ik-aan-bitcoin-en-een-bitcoin-rekening

I am a webdeveloper (using Ruby, Rails and other Open Source)

Oct 5, 2013 Updated Oct 5, 2013

Over the last few weeks, I had to explain quite a few times what it is I am doing now. Time for a summary.

I am not a WordPress Developer.

The source of the confusion is Savvii, a brand-new WordPress hoster where I work. Although Floor Drees forbade me to use the word Senior, I am the Senior Developer at Savvii.

And I develop in Ruby. And Rails. And then some more.

I also don’t develop with or for Drupal.

After my decision to stop all Drupal work I’ve honed my Rails skills. I improved my understanding of OO patterns and software architecture, came to love TDD and BDD, and hardly ever regretted my decision.

I am a webdeveloper. Developing in Ruby, Rails. And using other tools.

And then, an offer to help set-up a startup in my own town, a WordPress hoster, came along. Last Wednesday we launched our MVP (yay, we launched!). And the custom parts of our infra are mostly Ruby.

We do dissect our WordPress sites. To find why the heck something is (still) slow. And I oversee some WordPress plugin development: our Hosting plugin that makes WordPress talk to our, yes, Ruby backends. But I am not a WordPress Developer. And neither did I leave Drupal for WordPress as some seem to think.

A webdeveloper at an online bookstore needs to know about the world of selling books. And when you’re a developer for an online 3D-printing-service you will visit many 3D-printing conferences. That is what confused some folks: suddenly they see me tweeting about WordCamp and assume I’m now building WordPress sites. It’s merely one of the best sides of being a software-developer: you get to know so many diverse industries, because you need to capture that industry in software. I’ve now been given the chance to capture the industry of managing many, varying WordPress-sites in a fine piece of software. Which is about all the WordPress development I’ll be doing.

(And next up are Chef, fine-tuning our Sinatra, Slim and Rails apps, building a nifty logging-platform, a RESTful (probably composer-based) update-system for plugins, core and themes, a RESTful backup-service and a giant load of other infrastructural projects.)

https://berk.es/2013/10/06/i-am-a-webdeveloper-using-ruby-rails-and-other-open-source

Do one thing and do it well

Oct 2, 2013 Updated Oct 2, 2013

This article appeared earlier on the blog of Savvii, the WordPress hoster where I currently work.

As a hoster we naturally want to automate as much as possible, and for that we have to develop all kinds of components: from handling orders to updating the WordPress sites. For us it is important that this backend:

can be adapted quickly to new requirements, wishes and insights
copes well with errors and outages
is very secure

That is why we decided very early on to build everything as very small, focused, separate components. We draw a lot of inspiration from the Unix philosophy:

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Rule of Modularity: Write simple parts connected by clean interfaces.

In our case we also want to be completely independent of the location where something runs. We chose to have all components communicate with each other through RESTful JSON. That means all communication goes over HTTP, with JSON being sent back and forth.
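To give an idea of what such a JSON exchange might look like, here is a small sketch using only Ruby's json standard library. The /backups path and the site_id and status fields are invented for illustration; they are not Savvii's actual API, and the HTTP transport itself is left out.

```ruby
require 'json'

# Hypothetical request: what one component would send to another over HTTP.
def backup_request(site_id)
  { method: 'POST', path: '/backups', body: JSON.generate({ site_id: site_id }) }
end

# Hypothetical receiving side: it only needs to understand the JSON it gets,
# not who sent it or where the sender runs.
def handle_backup(raw_body)
  payload = JSON.parse(raw_body)
  JSON.generate({ status: 'accepted', site_id: payload.fetch('site_id') })
end

request  = backup_request(42)
response = JSON.parse(handle_backup(request[:body]))
puts response['status']
```

Since both sides only agree on the shape of the JSON, either of them can be rewritten or moved to another machine without the other noticing.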

Rule of Extensibility: Design for the future, because it will be here sooner than you think.

We have a few concrete starting points:

All components will be developed further or replaced.
All components will at some point fail, go down or show errors.
The market and the community around WordPress are very changeable.

The beauty of a modular design with a standard way of communicating is that we can replace any component with another one at any time. If, for example, the backup solution no longer scales, we replace it with another one without the whole system having to go down!

Because the components are also independent of each other’s availability, they can fail or break at any time without the rest of the system toppling over with them. If, for example, the invoicing environment goes down (because of an upgrade, say), that has no effect whatsoever on the place where, for instance, the backups are made. In a so-called monolithic setup everything sits in one big application, and an outage always takes down the whole application.

And because we can work on the various parts in parallel, we can anticipate changes or new insights much faster: when, for example, a new WordPress version has an entirely new upgrade mechanism, we simply bring in a new update robot!

How we make sure we can release continuously and super fast is for a future blog post.

Rule of Simplicity: Design for simplicity; add complexity only where you must.

To keep all components as dumb and simple as possible, Evvii, our API, sits in the “middle”. This is an application that receives commands and requests for information (over RESTful HTTP, that is), and returns that information as JSON.

The outside world cannot reach Evvii, but the admin environment in which you manage your sites, among other things, can. This admin environment (we call it Wallii) can therefore communicate with Evvii, but does not need any business logic of its own: Evvii takes care of that.

An added benefit is that this increases security: the services that run “publicly” on the internet are very simple, and for that reason alone easier to secure. But these web services cannot reach anything themselves, except Evvii. That gives us, as it were, an intermediate layer that can be secured further. Of course security is much more than this, and we will cover it extensively in future blog posts as well.

Evvii drives all the separate components and answers all requests for information. The separate components know nothing about the other components. This makes it easier for us to replace, remove or change components. That does make Evvii the critical part: the bottleneck. How we solve that is for a following blog post.

That also makes Evvii the most complex part, yet still fairly simple in itself: Evvii does not know how a backup is made, only whom she can tell to make one! So Evvii can still be kept fairly simple: she only needs to know when to ask whom which question, nothing more. How we carry out all these tasks and try to keep everything scalable and fast is also for a later blog post.
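To make the "only knows whom to ask" idea concrete, here is a minimal sketch in plain Ruby. It is not Evvii's code: the registry, the task names and the `dispatch` method are all hypothetical, and lambdas stand in for what would be separate HTTP services.

```ruby
require "json"

# Hypothetical registry: the hub only knows which component handles which task.
# In the setup described above each entry would be an HTTP endpoint; lambdas
# stand in for those services here so the sketch runs on its own.
COMPONENTS = {
  "backup"  => ->(site) { { "component" => "backup-robot", "site" => site, "status" => "queued" } },
  "billing" => ->(_site) { raise "billing is down for an upgrade" }
}.freeze

# Receives a JSON request, hands the task to the right component and answers
# in JSON. A failing component yields an error response instead of taking the
# hub down with it.
def dispatch(request_json)
  request = JSON.parse(request_json)
  handler = COMPONENTS.fetch(request["task"]) do
    return JSON.generate({ "error" => "unknown task" })
  end
  JSON.generate(handler.call(request["site"]))
rescue StandardError => e
  JSON.generate({ "error" => e.message })
end

puts dispatch('{"task":"backup","site":"example.com"}')
puts dispatch('{"task":"billing","site":"example.com"}')
```

The hub holds no knowledge of how a backup is made; swapping the backup component means changing one registry entry.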

Rule of Composition: Design programs to be connected with other programs.

Another advantage of building things this way is that we can use all kinds of tools and programming languages. After all, they communicate over a protocol, in a format, that virtually every language and a great many services simply understand. The tools that don't get a simple wrapper so they can be operated through a web service. Your WordPress sites, for example, cannot always be controlled over the web with JSON: that would be unsafe too. So we control them on the server with wp-cli. But that is a command-line tool and not a web service, so we have to build a web service for it. Once we have it stable and secure, we will of course release it as Open Source Software.

So we build small web services around everything. We do that mainly in Rails, Sinatra and Slim. And the tools we build ourselves are web services from the start.

https://berk.es/2013/10/03/doe-n-ding-en-doe-dat-goed

So, I am starting as Ruby Developer at Savvii

Jul 30, 2013 Updated Jul 30, 2013

On Monday (the 5th of August) I am starting at the startup Savvii, to be part of the team that wants to bring you the very best WordPress hosting in the Netherlands.

I am going to develop the software for the back end: program the robots that do all the magic behind-the-scenes work.

Because I am a Ruby developer, I’ll do this in Ruby. We need some technical things like long-running threads and asynchronous workers, which makes Ruby (and Rails) one of the best candidates. I’ll be blogging more about the technical stuff on our yet-to-go-online blog. And probably here too.

WordPress, for those who are not too much into this stuff, is the most used of the three most popular CMSes. It is the software behind many famous and popular websites and is commonly known as a blogging system, rather than a generic CMS.

WAT!? You be doing WordPress-development?

Probably a little, but my task is to get the back end-software running, which is building stuff like CRMs, billing gateways, provisioning APIs and whatnot. My fellow-programmer is the main PHP-and WordPress-developer. But I will certainly dive into WordPress now and again to make it play nicely with all the Ruby back ends.

But didn’t you say you hate WordPress, Drupal and all that other PHP-stuff?

Nope. Apparently a lot of people seem to think that I am “against” Drupal, WordPress or even PHP in general. On the contrary: I like them a lot. And I very often advise people to use this software for their websites.

I merely think that such CMSes have their place and cases, but are often abused in cases (and projects) where they fit poorly: the right tool for the job. Apparently many people read that as “haters gonna hate”. I don’t. Hate.

What about your own company?

I’ll be putting all my focus on Savvii, on making it a great hoster. But I expect to go back to some small side-projects once we are moving. Not sure how it will combine, but time will tell. Berk.es will continue to exist, albeit dormant for a while.

Savvii? WordPress hoster?

We are just starting, so all the details are still unclear. But in short, translated from the placeholder website:

Savvii is the new managed WordPress platform for the Netherlands. That means speed (caching, CDNs, tweaks), security (scans, free fixes, automatic updates) and service!

Savvii is part of the family of companies around bliXem internet, which is where I’ll be working. Yes, that is Nijmegen, so I can go to work on my bike.

I am really grateful to become part of such a good team and to get the opportunity to help get such a promising startup going!

https://berk.es/2013/07/31/starting-as-ruby-developer-at-savvii

On the anonymity of Bitcoin

May 22, 2013 Updated May 22, 2013

The company that processes debit card (PIN) transactions in the Netherlands, Equens, announced that it will sell our payment behaviour to interested marketers and companies. They claim these transactions are anonymous. That is nonsense: debit card transactions (or iDeal, credit card, PayPal or any other digital transaction) simply cannot be anonymous; your account number is, after all, directly linked to you as an identified person. Ever tried opening a bank account in the name of Jelle Snikkelsma, Kerkstraat 14, Grashuizen?

Bitcoin, you often hear, is anonymous. Bitcoin payments can indeed not be traced back to a person if you don't want them to be. That is almost comparable to paying with cash. If I hand you a five-euro note and then tell you my whole life story, the payment is not anonymous. But if I send you five euros by mail, it is (almost) impossible to find out that it came from me.

This anonymity is quite exceptional for a digital system. It sounds even stranger when you consider that Bitcoin is in fact one enormous public ledger in which every transaction is recorded.

That is because Bitcoin works with pseudonyms. Cryptographic pseudonyms: your account is nothing more than a public and private key pair, your wallet. That is why Bitcoin works without registration: anyone who can create a public and private (secret) key pair immediately has a working Bitcoin account. The bank-account analogy would be: anyone who can make up a unique account number and matching PIN, and can remember them, has a bank account, without the banking system ever needing to know about it. The cryptography guarantees that only you have access to that account (that the PIN is exactly the right one to transfer money from the account). And it ensures that no two people ever make up the same account (“never” in the practical sense, not the theoretical one).

The wallet, and in particular your public key, must not be traceable to you, however. In practice, that means you should not send your public key from an IP address that leads back to you as a person. And you send it with every transaction. On top of that, you can very easily create one or more new wallets and transfer all your money there. If you want to keep your pseudonym secret, you can, and that account is then not traceable to you in any way.

In addition, your wallet has an unlimited number of account numbers, called Bitcoin addresses, with which money can be transferred to your wallet, and from which you can send money. In practice you will therefore create a new, throwaway address for every transaction. So even if one such address can be traced back to you (because you ordered a book, for example, and had it delivered to your home), the next payment is still virtually unconnected to it. So when one payment you receive is for an invoice that has your full name on it, only that address can be traced back to you, and it cannot be linked to all the other payments you receive.

Here too, it mostly depends on how much privacy you want; but unlike with that debit card in your pocket, you at least have the option to keep your identity hidden when ordering, paying and even receiving money.

That, dear Mr. Rietveld of Equens, is privacy. Selling figures to Ohra about how often a group of account holders buys a can of Grolsch at Utrecht station (and a pack of aspirin at the DA drugstore at Nijmegen station the next morning) is not!

https://berk.es/2013/05/23/over-de-anonimiteit-van-bitcoin

Buying Bitcoins? Know what you are doing.

Mar 31, 2013 Updated Mar 31, 2013

Over the past few days I have answered quite a few questions about Bitcoin. Hence this general summary, in the form of a warning.

Bitcoin is all over the news. BBC, Fox and today the NOS eight o'clock news as well. The reason is the enormous rise in the exchange rate. This afternoon the price of 1 Bitcoin passed the $100 mark: so you pay $100 for one Bitcoin. At the start of this year that was still less than $20.

Let me state up front that I expect a great future for Bitcoin. I believe in it, and have done so for almost two years.

Suddenly everyone wants Bitcoin. Below are three reasons why you may be better off not doing that right now. Or not at all.

Guarantees, security and knowledge required.

Bitcoin is decentralized, so nobody is responsible for your money. Bitcoin does have value, and there are companies and even banks where you can store your Bitcoins.

But most people will simply keep their savings on their computer. If you already struggle with virus scanners, remembering passwords or encrypting your hard drive, this is simply not an option. Yes, you can keep that one Bitcoin on your Android phone. But not those couple of thousand euros. Chances are fairly high that someone will steal them. And then there is no big bank that vouches for you or even makes an effort to protect you.

You can store your Bitcoin with one of the many online banks for this, but the chance that such a bank gets robbed is real too, and then it is uncertain whether you will get your money back. Usually you won't.

Know what you are doing, or hire someone for it. Otherwise, better stay away.

Believers, users and traders

In the Bitcoin economy quite a few companies are being started, more and more merchants accept it as payment, and a lot of money really does go around.

But there are also a great many day traders. These are people who do not (necessarily) believe in the system at all, but are only attracted by the enormous price increases. These people contribute only a little to the real value of this currency. That value is mainly determined by what you can buy with it. That is why all entrepreneurs who accept Bitcoin, or who develop services around Bitcoin, put far more value into the economy.

The fewer people put such value into the system, the more unstable the economy. If you only plan to get rich quickly and convert those riches straight into euros, you are of course free to do so. But if too many people are in Bitcoin in this way, everything will of course simply collapse again.

So either step in for a very short time, but know that you then need to know a lot about currency trading and day trading. As a layperson, it only makes sense to get into Bitcoin for the long term. And in that case you are better off waiting until it is clearer whether the current price is a bubble (far too high, that is) or whether it is stabilising again. Otherwise you invest your savings at a far too high price.

Personally, I believe that in a few years, four or five, Bitcoin (or something Bitcoin-like) will be much, much bigger. Long term, that is.

The price is (far too) high.

At least, that is what most people in the Bitcoin community who do know about economics say. I am merely repeating them here. The real value is more likely below $20 per Bitcoin, certainly not $100.

At the moment the price is so high mainly because of the enormous demand. That demand in turn comes from the sudden visibility in all the media.

So there is a very large chance of a sharp drop in value in the coming weeks. I say that, again, as a layperson in economics. But the chance is large because there are relatively (too) many day traders in the economy (compared to people who offer real value in the form of goods and services), and because this large group is anticipating a price increase. Call them the “opportunists”. At the slightest drop these people will get out, because on the one hand they do not believe in the long term (or do not believe in it at all) and on the other hand they only got in for an increase in value. If all these people get out, the price corrects back to the real value. What that value is exactly, nobody knows. But if you look at some long-term charts, you can see that from mid-September to the end of January the value was around $14 and hardly fluctuated.

Experts warn that the price is too high. According to them, chances are high right now that the value will fall again in the coming weeks. Take this risk into account when you decide.

https://berk.es/2013/04/01/bitcoins-kopen-weet-wat-je-doet

Please, Ruby devs, join() your paths

Mar 20, 2013 Updated Mar 20, 2013

Like in most programming languages, when you write paths in Ruby, e.g. to open a file, you pass in a string:

filename = "bar.txt"
File.open("/home/foo/"+ filename)

This is a serious smell for several reasons. Not, as people often believe, just to cater to the few Ruby developers on Windows (Windows nowadays follows “/foo/bar/” paths just as well as “\foo\bar”).

But mostly because this does not scale and gets convoluted real quick. Like so:

config_dir = "config/"
File.dirname(__FILE__) + "/../" + config_dir + "/environment.rb"
#=> ./../config//environment.rb

Ruby offers a great File.join() class method for this. It simply uses File::SEPARATOR to join the strings.

config_dir = "config/"
File.join(File.dirname(__FILE__), "..", config_dir, "environment.rb")
#=> ./../config/environment.rb

As you may notice, double slashes are eliminated.
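The difference is easy to check side by side. A small sketch, using "." in place of File.dirname(__FILE__) so it runs anywhere:

```ruby
# Naive concatenation keeps every slash it is given.
config_dir = "config/"
concatenated = "." + "/../" + config_dir + "/environment.rb"

# File.join inserts a separator between the parts and collapses duplicates
# at the boundaries, whether or not a part already carries a slash.
joined = File.join(".", "..", config_dir, "environment.rb")

puts concatenated # ./../config//environment.rb
puts joined       # ./../config/environment.rb
```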

You also get this behaviour from Pathname, which is what Rails.root is.

config_dir = "config/"
Rails.root.join(config_dir, "environment.rb")
#=> /path/to/rails/project/config/environment.rb

Rolling your own is very beneficial, and simple too.

require "pathname"

class MyConfig
  # Returns a Pathname, so callers can keep joining onto it.
  def dir
    Pathname.new(File.join("/", "etc", "myapp"))
  end
end

mc = MyConfig.new
mc.dir.join("templates", "example.html")
#=> #<Pathname:/etc/myapp/templates/example.html>
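One thing to keep in mind when switching between the two: Pathname#join returns a Pathname rather than a String, and it treats an absolute argument differently from File.join. A short sketch of both points, reusing the /etc/myapp path from above:

```ruby
require "pathname"

base = Pathname.new("/etc/myapp")
template = base.join("templates", "example.html")

puts template.class # Pathname
puts template.to_s  # /etc/myapp/templates/example.html

# With Pathname#join an absolute argument replaces everything before it,
# while File.join simply glues the parts together.
puts base.join("/tmp", "example.html").to_s          # /tmp/example.html
puts File.join("/etc/myapp", "/tmp", "example.html") # /etc/myapp/tmp/example.html
```

Call to_s (or to_path) when an API insists on a plain String; File.open and friends accept a Pathname directly.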

There really is no reason to fiddle with strings, concatenate slashes and whatnot, to build paths. Join is so much easier, more powerful and above all, cleaner and more portable.

https://berk.es/2013/03/21/please-ruby-devs-join-your-paths

Drupal Imagecache security vulnerability with DDOS attack explained

Mar 3, 2013 Updated Mar 3, 2013

Nearly a year ago, long before I decided to move out of Drupal work entirely, I reported a security vulnerability in imagecache in Drupal 7 core. Since imagecache is used on most Drupal 6 instances, the problem occurs there too. I had the draft for this post tucked away on an offline disk (security details should not live “online” or in “the cloud”, ever); and, obviously, the day I arrived in Thailand for a vacation, Drupal released the CVE.

I made a proof of concept, and a tool to test it. A screencast explaining the issue is found below:

The issue itself is really simple, the solution is hard; because Imagecache was designed “wrong” in the first place. Let me explain.

You have a really basic Drupal 7 site on http://example.com, with a content type story that has an image field, using three imagecache styles: “medium”, “large” and “thumbnail”.

Imagecache works by creating new images from an original, on demand, when a particular url is requested:

<img src="http://example.com/sites/default/files/styles/medium/public/field/image/news.jpg" />

Dissecting that url, we see:

http://example.com/sites/default/files/ is where uploaded files are stored. This can also be something like http://acme.com/sites/acme-is-evil.org/files/ in case of multisite.
/styles/ is the directory under which images are cached.
/medium/ is the style applied to this image
/public/ the “driver”, usually either “private” or “public”.
/field/image/news.jpg where the image is stored. The original can therefore be found at http://example.com/sites/default/files/field/image/news.jpg
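The same dissection can be written down as a few lines of Ruby. This is only a sketch: dissect_style_url is a made-up helper, and it assumes the default layout <files>/styles/<style>/<driver>/<path> described above.

```ruby
require "uri"

# Splits an image-style URL into the parts listed above.
# Hypothetical helper; assumes <files>/styles/<style>/<driver>/<path>.
def dissect_style_url(url)
  path = URI.parse(url).path
  match = path.match(%r{\A(?<files>.*?)/styles/(?<style>[^/]+)/(?<driver>[^/]+)/(?<file>.+)\z})
  return nil unless match

  {
    style: match[:style],
    driver: match[:driver],
    file: match[:file],
    original: "#{match[:files]}/#{match[:file]}"
  }
end

parts = dissect_style_url("http://example.com/sites/default/files/styles/medium/public/field/image/news.jpg")
puts parts[:style]    # medium
puts parts[:original] # /sites/default/files/field/image/news.jpg
```

Note that the style and the original file are independent parts of the URL; nothing in the path ties a given image to a given style.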

In this case, a derivative called medium is created. Because creating images is heavy, they are stored on disk, so a next time, the webserver can serve this image right-away.

Let me repeat that: Because creating images is heavy, they are stored on disk.

The idea is as simple as it is wrong: The first time (when the image is created) a full Drupal is booted up, that Drupal-instance applies the various image-manipulations you have configured for that style, and then serves and saves the image.

“But why is that Wrong?”, you ask?

Because you never know when the heavy stuff will be invoked. It is unpredictable.

And because the heavy stuff is initialized by your visitors. People from the evil, outside world. They can fire up your image-creating just by visiting urls.

DDOS

This is a typical DDOS vector: making a server do heavy stuff by throwing something at it from outside. Typically in an orchestrated attack that involves many people from many places throwing stuff at it.

The actual issues: mixing images and styles

Everything above is not a large problem, because 90%, or more, of the images used in img-tags on your site, are already created and cached on disk. An attacker will need to find the last 10% and request these urls. This is limited.

But there are more, far more, possible images than those you use in the img-tags.

We have two images. A frontpage-banner and a user-avatar. They are usually used with two imagecache styles:

http://example.com/sites/default/files/styles/avatar-thumb/public/users/123.jpg
http://example.com/sites/default/files/styles/front-banner/public/field/image/fancy_banner.png

I could just swap the styles and create a front-banner from the user-avatar, and an avatar from the banner, like so:

http://example.com/sites/default/files/styles/front-banner/public/users/123.jpg
http://example.com/sites/default/files/styles/avatar-thumb/public/field/image/fancy_banner.png

And what is worse, you can pull any image in your files directory through imagecache. Including that huge 7MB hi-res upload you forgot there. And if you consider that ImageMagick (often used as the engine to convert the images) can actually handle PDFs, HTML and many other files you probably have lying around in your files directory, you know how much your system can be hurt.

This all gets worse with the size of the images that can be abused and the heaviness of the imagecache styles you have defined. Adding watermarks, smartcropping, overlays, rounded corners and whatnot makes the generation of a derivative much heavier than merely resizing an image.

The other issue: recursiveness

When we look above, we can see that imagecache will gladly pick up any file, pull it through the image styles you have defined, using the toolkits at hand, and then write out a file to disk.

Guess where? Yup, in the files directory. Adding another file that can be pulled through imagecache. So, you can imagecache already-imagecached files. [insert inception jokes here].

This is where the attackers have the opportunity to fill up your server's hard drive. By simply generating image styles by mixing up images and styles, you can create a huge amount of unexpected images. You then pull these through imagecache again, to duplicate that huge amount. And again. And again. Until URLs grow so large that the webserver refuses them. Apache’s limit lies around 4000 characters.

A site with only one 0.1MB image and two styles can gain several thousand directories and nearly five hundred copies of imagecache derivatives, making a total of ~50MB of new images. All an attacker needs to do is send 500 HEAD requests to your server, doable in a fraction of a second.

A site with thousands of images and five imagecache styles will get terabytes of new images in mere minutes, obviously depending on the speed of the server and how many (HEAD) requests the server allows simultaneously. Or in days: doing only a few hundred requests each day, yet filling up your disk slowly but surely, after which your average server will either start crashing, or your hoster will send you large bills for extra storage and so on.

Also note that one does not need to download the to-be-generated file. Just requesting it, with a HEAD is enough.

The proof of concept tool

Find it on github.

Please note that the tool is made for investigative use only; but be aware that others might not heed this notice and either build such a tool themselves (it is really simple) or use it to bring down your site.

Because of this, I chose to cripple it a little. The tool cannot detect whether you have applied the security patch or not, or if you have different measures in place. I have also removed the crawling and parallel parts, limiting it to images and imagecache styles found on the page you insert manually.

The tool was made to investigate when and how a system would crash or choke using these attacks. Please investigate and learn about the CMS and the modules you are using.

Prepare your system for Ruby.

$ sudo apt-get install ruby rubygems # OSX and most Linuxes already have these
$ sudo gem install bundler

Clone the tool and install the dependencies.

$ git clone https://github.com/berkes/canhaz.git
$ cd canhaz
$ bundle install

Run!

$ ./canhaz # Shows all tasks
$ ./canhaz hit http://example.com 20 # generates max 20 imagecache
                                     # derivatives, by investigating
                                     # example.com

Lessons learned

Don’t do on-demand generation of things that require heavy work. In this case, derivatives needed for a user avatar should be created when a user uploads that avatar. Even better is to let a worker queue deal with the actual generation; that way dedicated machines can deal with the heavy lifting, and users don’t have to wait in front of a loading page while you are making images. For PHP the standard tool, Gearman, has worked well for me; just don’t expect it to be like Resque, Sidekiq or Python's RQ (yet).

Magic “handiness” like allowing any image to be “imagecached” is useful in development, but not in production. So, while on your development environment you may want imagecached images to be generated on the fly (and probably not cached, damn you, drush cc-all), you certainly don’t want this flexibility on a production server. You probably want to call some build task while deploying to regenerate all your images there. Once. Before deploying.

And for Drupal 8: get rid of imagecache and implement a much simpler on-submit image builder. It should create the derivatives when a File is created and passes validations. This not only solves any such “unpredictable load” issues, it allows for much easier CDNs, static file servers, caching and more. The on-demand architecture has too many downsides to warrant its only upside: flexibility.
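As a rough illustration of "generate on upload, via a worker queue", here is a sketch in plain Ruby using the standard library's Queue. All names are hypothetical, the "resize" is just a string standing in for the real image work, and a real setup would use a proper job system rather than one in-process thread.

```ruby
# Hypothetical style names; a real site would read these from configuration.
STYLES = %w[thumbnail medium large].freeze

jobs = Queue.new        # uploads waiting for their derivatives
derivatives = Queue.new # stands in for files written to disk

# One worker takes uploads off the queue and builds every style up front,
# so no visitor request ever triggers image generation.
worker = Thread.new do
  while (upload = jobs.pop) != :done
    STYLES.each { |style| derivatives << "styles/#{style}/#{upload}" }
  end
end

# The upload handler only enqueues; it never does the heavy work itself.
jobs << "users/123.jpg"
jobs << :done
worker.join

puts derivatives.size # 3
```

The set of derivatives is now bounded by uploads times styles, and the load arrives when you decide it does, not when a visitor requests a URL.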

https://berk.es/2013/03/04/drupal-imagecache-security-vulnarability-with-ddos-attack-explained

Tagadelic: TDD, OOP and seeking maintainer

Feb 9, 2013 Updated Feb 9, 2013

TL;DR: Tagadelic is close to a Drupal 7 release, with an easy upgrade path to Drupal 8. It is completely rewritten, Object Oriented and Test Driven. Since I don’t do any Drupal anymore, I am looking for someone who can maintain a clean, OOP and TDD-module, to take it over from me.

There is only so much you can do when porting an age-old module again and again. Tagadelic has been around since mid-2005 and has been ported over and over again. Mind you: not upgraded but ported. Quick, dirty and “works-for-me” ports. Like most other modules, actually. There never was a stable release for Drupal 7, because the module never was really stable in the first place. Yes, it might work (for you), but that is far from stable and releasable.

Between 2005 and now, I learned programming properly. I mean, OOP, unit-testing, patterns and all that (this also led to me abandoning most of my Drupal work).

None of my publicly released Drupal modules ever reflected the progress I made; mainly because Drupal itself is not OOP, has poor testing abilities (please read on, I will explain later) and applies quite a few anti-patterns. This makes writing really clean and pretty code somewhat discouraging. Most of its examples, best practices and defaults go straight against what is in general considered best practice.

But since Tagadelic is used by a lot of people, I wanted to create a proper replacement. A module with pretty code, easy-to-implement APIs and some additional, turnkey modules for those who cannot or will not write these few lines of PHP. A module that reflects what I now consider good code and proper development. As a replacement for what I thought proper 8 years ago.

I coded for several months and today released the first alpha.

In the long run, I can conclude three things:

Drupal is not really ready for OOP development. The interfacing between my module and Drupal required me to write wrappers (so that Tagadelic classes access Drupal-functions in an OOP-manner) and to write the modules themselves with global functions, since that is how Drupal expects the hooks and implementations.

DrupalWebTest is way, way too slow and feature-poor for Test Driven Development (TDD). Tagadelic only has about 150 DrupalWebTests, but running them all takes over 5 minutes (on my machine: quad-core Intel 2.67GHz, SSD drives only). Compare that to a typical Rails project (Rails being, rightfully, known for being very slow), where a Cucumber suite of over 600 tests takes under 30 seconds; that includes Selenium opening Firefox and clicking around in a few tests. And 30 seconds is considered unacceptably slow there.

When developing test-driven (or Behaviour Driven) you typically run the isolated tests five, six times. And the entire suite of tests at least once. So aside from the actual coding, the testing alone takes 30 minutes. This is both discouraging (meh, I’ll just assume everything is still green, will test in next iteration) and very hard on your “flow” and concentration. It is feature-poor in the sense that I ended up writing most assertions and several set-up functions myself. assertXpath()? Nope. assertHasId()? Nope. assertIdenticalArrays()? Nope. And worse is that it breaks a very important rule for testing: isolation. If you want to test whether some admin setting can be saved and creates the proper variable, you are also testing whether a Drupal is installed properly, user can log in, is admin, can access a page, has nodes, has access to creating these nodes and so on. I ended up poking into the database (not even “my” tables) because somewhere in the clutter of setup tasks stuff was created but it failed.

It is really fun to write unit tests with PHPUnit. I was very positively impressed by that test environment and by using it. The biggest adventure was how to stub out Drupal. Drupal, using global functions for stuff like check_plain(), is nearly impossible to mock and stub. I solved this by extending my DrupalWrapper and stubbing that. After all: I don’t care whether check_plain() itself works and clears out XSS, I only care whether or not my classes call that function in the proper places to ensure clean output. Testing whether check_plain() works is not my concern here. I chose PHPUnit over DrupalUnit, because the latter is pretty much unusable for unit-testing of arbitrary classes.

And now it is time for someone, or several someones, to slowly take over the module. Together we will release a Drupal 7 2.0 version and then I can hand over all project rights on my last Drupal project.

Interested? You should be:

Familiar with PHP OOP development. You should probably feel that the usual way of developing Drupal modules in a non-OOP manner is not a very good way.
Familiar with PHPUnit and Drupal Tests. You should feel strongly for TDD and good test-coverage. You should probably feel that even though writing Drupal Web Tests is not (yet) perfect and requires time and effort, it always should happen.
Able to maintain such a module for a substantial time. It being TDD and all, means that it won’t take you a lot of effort or time. But it would be a shame if three months after a release you abandon it altogether because you like Node.js better. Or so.
Wanting to develop on Github. At least until the 2.0 release.

https://berk.es/2013/02/10/tagadelic-tdd-oop-and-seeking-maintainer

Testing colored output with Cucumber

Feb 3, 2013 Updated Feb 3, 2013

I am improving a Command line app to manage my todos. I am developing it entirely ‘Behaviour Driven’, using Cucumber and Aruba.

All is pretty straightforward, but I had a hard time testing the colors in the output. Colors are made with Rainbow; which is really neat, but sometimes a little too smart. Rainbow detects when it outputs to something that cannot handle colors and turns them off. The solution turned out to be really simple though.

Let's start with a simple script that outputs some Rastafari:

#!/usr/bin/env ruby

require "thor"
require "rainbow"

class Example < Thor
  desc "example", "an example task"
  def example
    puts "Yah!".color(:red)
    puts "...".color(:black).bright
    puts "Rasta-".color(:yellow)
    puts "fari".color(:green)
  end
end
Example.start()

Running this, results in:

Example output

But piping this into a file, or for example less, shows no colors. This is a useful feature built into Rainbow. When testing with cucumber, the colors are gone too:

Feature: Example
  Scenario: Yah!
    When I run `example example`
    Then it should pass with:
      """
      Yah!
      ...
      """

This passes, but does not test any colors. First thing is to tell Aruba/Cucumber to not strip the colors, ansi-codes, with an @ansi tag.

Next thing is to tell Rainbow to output colors regardless of where it outputs to. We need to do this in the application itself, by making the application a little more testable. However, Aruba strips the colors for a reason: it is really hard to test when all your output is littered with ANSI escape codes. You really only want to force Rainbow to output them when you are testing for colors.

#...
class Example < Thor
  def initialize(*args)
    super
    Sickill::Rainbow.enabled = true if ENV["FORCE_COLORS"] == "TRUE"
  end
  #....
end

This allows you to force coloring when testing or running by setting the variable, like so: export FORCE_COLORS=TRUE; ./bin/example example. A step could then look like “When I run export FORCE_COLORS=TRUE; ./bin/example example”.

More useful, however, is that we can set this variable in cucumber for all the @ansi-tagged scenarios. In a support file features/support/ansi.rb:
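What the override boils down to can be sketched without Rainbow at all. The following is not Rainbow's implementation, just a standard-library illustration of "color only on a terminal, unless FORCE_COLORS says otherwise"; colors_enabled? and colorize are hypothetical names.

```ruby
# ANSI SGR codes: 31 is red, 32 green, 33 yellow; 0 resets.
COLORS = { red: 31, green: 32, yellow: 33 }.freeze

# Colors only when writing to a terminal, unless FORCE_COLORS=TRUE overrides it.
def colors_enabled?(io = $stdout, env = ENV)
  env["FORCE_COLORS"] == "TRUE" || io.tty?
end

# Wraps the text in escape codes when enabled, returns it untouched otherwise.
def colorize(text, color, enabled: colors_enabled?)
  return text unless enabled

  "\e[#{COLORS.fetch(color)}m#{text}\e[0m"
end

puts colorize("Yah!", :red, enabled: true).inspect  # "\e[31mYah!\e[0m"
puts colorize("Yah!", :red, enabled: false).inspect # "Yah!"
```

Passing the IO and the environment in as arguments keeps the check itself testable without touching the real terminal.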

Before('@ansi') do
  ENV["FORCE_COLORS"] = "TRUE"
end

With the scenario tagged @ansi, it fails: expected "\e[31mYah!\e[0m\n\e[30m\e[1m...\e[0m\n\e[33mRasta-\e[0m\n\e[32mfari\e[0m\n" to include "Yah! .... Good.

Testing against strings like “\e[31m”, however, is both error-prone and unreadable. A simple new step definition, in which we use Rainbow to add the ANSI escape codes to the to-be-tested string, allows us to test colors really easily.

The features/support/ansi.rb file should require “rainbow”.

Then /^it should output "([^"]*)" in "([^"]*)"$/ do |string, color|
  assert_partial_output(string.color(color.to_sym), all_output)
end

Feature: Example

  @ansi
  Scenario: Yah!
    When I run `example example`
    Then it should output "Yah!" in "red"

Readable, easy testing of your colored output!
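To make the mechanics concrete, here is a minimal sketch in plain Ruby of the two ideas used above: wrapping a string in ANSI escape codes, and only doing so when the output is a terminal or when FORCE_COLORS is set. The helper names (colors_enabled?, colorize) and the lookup table are assumptions for illustration; this is not how Rainbow is implemented internally.

```ruby
require "stringio"

# Hypothetical lookup table for this sketch; Rainbow keeps its own.
ANSI_FOREGROUND = { black: 30, red: 31, green: 32, yellow: 33 }.freeze

# Colors are on when the stream is a terminal, or when forced through the
# FORCE_COLORS variable used earlier in this post.
def colors_enabled?(io = $stdout, env = ENV)
  env["FORCE_COLORS"] == "TRUE" || io.tty?
end

# Wrap text in the escape sequence for the given color name, or return it
# untouched when colors are disabled.
def colorize(text, color, enabled: colors_enabled?)
  return text unless enabled
  "\e[#{ANSI_FOREGROUND.fetch(color)}m#{text}\e[0m"
end

# A StringIO behaves like a pipe or file here: it is not a terminal.
puts colors_enabled?(StringIO.new, {})
puts colorize("Yah!", :red, enabled: true).inspect
```

The step definition compares against exactly this kind of wrapped string, which is why the readable "Yah!" in "red" form can stand in for the raw escape codes.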

https://berk.es/2013/02/04/testing-colored-output-with-cucumber

Make cucumber open the browser with the current page

Jan 7, 2013 Updated Jan 7, 2013

The Cucumber Book describes a really nifty trick when testing web pages: open the browser when a step fails. The save_and_open_page helper that makes this possible is provided by Capybara.

Add a support file features/support/debugging.rb:

After do |scenario|
  save_and_open_page if scenario.failed?
end

And add launchy to your Gemfile, and run bundle install (or install it with whatever else you use).

group :test do
   #...
   gem "launchy", "~> 2.1.2"
end

This will save the page that Cucumber is looking at, then open it in your browser. This works fine, until you have a large suite of features and some refactoring breaks many of them. Having to close twenty tabs in your browser after each run is counterproductive and often really frustrating.

I solved this with a flag that allows me to fire this debugging-trick only when I need it. When I have a failing scenario, and I want to investigate it by inspecting the page, I run my cucumber with an additional environment-variable:

$ cucumber debug=open

After do |scenario|
  save_and_open_page if scenario.failed? and (ENV["debug"] == "open")
end

The debug= syntax allows for more simple tricks too. Like debug=pp:

require "pp"
After do |scenario|
  save_and_open_page if scenario.failed? and (ENV["debug"] == "open")
  pp(page) if ENV["debug"] == "pp"
end

Simple trick, works like a charm.
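The decision logic of that hook can be pulled out into a small, testable function. The following is only a sketch: debug_actions and the symbols it returns are hypothetical labels for this example, not part of Cucumber or Capybara.

```ruby
# Hypothetical helper: decide which debugging actions to run for a scenario.
# `env` is any Hash-like object such as ENV; `failed` is the scenario status.
def debug_actions(env, failed)
  case env["debug"]
  when "open" then failed ? [:save_and_open_page] : []
  when "pp"   then [:pp_page]
  else []
  end
end

puts debug_actions({ "debug" => "open" }, true).inspect
puts debug_actions({}, true).inspect
```

Keeping the flag handling in one place like this makes it easy to add another debug= value later without touching the existing ones.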

https://berk.es/2013/01/08/make-cucumber-open-the-browser-with-the-current-page

Developing a tiny ecommerce site using microframework Slim

Nov 18, 2012 Updated Nov 18, 2012

During the last elections, I stepped forward as a volunteer for the Dutch Pirate Party.

They were struggling with Drupal, and the most pressing issue was getting a webshop going, using the Dutch payment system iDeal, to take care of new member subscriptions.

I had quite some experience with Drupal e-commerce, and with the default Dutch payment method iDeal in Drupal. Yet I decided to build something from scratch instead. Here is a short introduction, some reasoning and then how I built the shop.

Reasoning

Managing the Drupal project was hard as it was.

Three people were already working around the clock at that time to get the Drupal site running and bug-free: theming, organic-groups integration and LDAP integration (centralising the login), on top of the daily “we need this and that page, if possible, ready before yesterday” requests.

Adding a fourth track, integrating e-commerce, was just too much juggling and managing. And granting access to their codebase and server for just about everyone who wanted to help was not an option; so all commits would have to go via already overloaded other volunteers.

Making and deploying this as a standalone application was a good option.

Drupal’s (complete) lack of decent e-commerce options.

Yes, there is Commerce now. But in July, when all this was happening, it was simply not ready. For example, there was no stable iDeal payment module, and building one (against a payment layer that was itself not yet stable) was just too much work. But even now, I have great doubts about Commerce’s set-up, architecture and concepts: too many layers, too much in-browser configuration and overly complicated juggling of several tens of modules. I guess that is partly my aversion to this overly complicated Drupal configuring-by-clicking-together stuff, but it might also be because Drupal, as a CMS, will simply never have the focus and targeted development that any of the bazillion e-commerce alternatives have.

And Übercart has actually never been a serious option to me.

Integrating and developing the iDeal part for a payment module was going to be much more work than building from scratch anyway.

Easy for new volunteers.

Drupal is becoming harder and harder to jump into and start developing, with each release; this site was Drupal 7 and I found that a lot of volunteers had trouble solving trivial issues in this Drupalsite. I am not talking about installing the eighty-fifth module, or configuring another view and entity. I am talking theming, bugfixing and module-development.

One of the things I did was helping with many such trivial issues, like organising the menus, side blocks, organic-groups setup and so on. There already was a shortage of people who knew enough of Drupal to help. There was no shortage of people who knew some CSS and HTML, though, yet getting them up to speed to employ these skills in Drupal theming and configuration proved too much work. I learned (again) how hard and difficult Drupal is, even for experienced (web) developers.

Also, the concept of this “become a member” thing was very much unstable: no one knew exactly what was needed, required and such. If you asked four people about what was required, you’d get four different (and even contradicting) answers. So I decided we needed the most agile system possible. Something that could have a new release several times a day, something that anyone with basic webdevelopment-skills could help fix or improve.

Slim

Hence PHP, hence Slim, and managed and deployed with git deploy.

Requirements, documentation and functional designs (wireframes)

Before I started coding I wrote down some guidelines, made a few mockups and created the most basic requirements doc possible.

The set-up

The code consists of a few directories and a very few files.

├ Slim        # Contains the Slim Framework Library (classes)
├ templates   # Contains snippets and PHP-files to render the actual HTML
├ config.inc  # A few settings and globals
├ ideal.inc   # Class/lib to handle the payment processor
├ index.php   # The actual application
└ secrets.inc # file which is excluded from the public git and from the docroot.

Template

The template was generated from the existing Drupal site by saving the page with Firefox (wget will not save CSS and such), passing that through tidy, and then manually cleaning up the rest of the HTML.

This turned out to be, by far, the most work and resulted in two extra directories, “CSS” and “JS”. The biggest problem was Drupal’s extremely convoluted HTML, with divs nested twenty levels deep at some points.

In the end we learned that making the HTML, CSS and JavaScript from scratch would not only have saved us many hours, it would have left us with a far easier to maintain application.

The initial idea, to keep as close to Drupal’s output as possible since that would allow us to transfer changes in design to this subsite too, was not practical.

The application

The entire application lives in index.php.

In a Slim application, you set up routes that react to an HTTP request, do stuff, and then return other stuff. Like so:

$app->get('/', function () use ($app) {
  $app->render('_head.inc');
  $app->render(
    'landing.php',
    array(
      'default_amount' => DEFAULT_AMOUNT,
      'actions' => get_actions(),
    )
  );
  $app->render('_footer.inc');
  });

This is PHP5 syntax, where you can create anonymous functions inline. The above does the following:

Register a path, “/”, with the HTTP verb GET.
Add an anonymous function to that, which will be executed when “/” is requested. PHP5 supports anonymous functions (closures).
When the function is executed, we call render to tell it which files under “/templates/” to render. We do this for _head.inc and _footer.inc.
The template landing.php is rendered, but gets two variables passed along, which can then be printed in landing.php. Note the “bug” where we don’t actually print the actions? This part has been rewritten so often that things got messed up a little.

This is all code needed to create a page, with a form.
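For readers who want to see what "register a path with a verb and attach an anonymous function" amounts to, here is a toy router. It is written in Ruby purely as an illustration; TinyApp and its methods are invented for this sketch and say nothing about how Slim itself is implemented.

```ruby
# Toy router for illustration only; this is not Slim's implementation.
class TinyApp
  def initialize
    @routes = {}
  end

  # Register a block for GET requests on `path`.
  def get(path, &handler)
    @routes[["GET", path]] = handler
  end

  # Register a block for POST requests on `path`.
  def post(path, &handler)
    @routes[["POST", path]] = handler
  end

  # Look up the handler for verb + path and call it with the params.
  def call(verb, path, params = {})
    handler = @routes[[verb, path]]
    return [404, "Not found"] unless handler
    [200, handler.call(params)]
  end
end

app = TinyApp.new
app.get("/") { |_params| "<form method='post' action='/pirate'></form>" }
app.post("/pirate") { |params| "Welcome, #{params['name']}" }

status, body = app.call("POST", "/pirate", { "name" => "Anne" })
puts status, body
```

The whole idea of a microframework fits in that lookup table: a (verb, path) pair maps to a closure, and everything else is up to you.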

When you post this form, a slightly more complex function is called. A simplified version of that is:

$app->post('/pirate', function () use ($app) {
  $pirate = $app->request()->params();
  //... more preparing of the $pirate that will be stored in the database

  if (!valid($app,'email', FALSE, '/.+@.+\..+/', 'E-mailadres is niet correct')) {
    //...Lots of other validations of all the fields.
    $app->redirect("/");
  }

  $pirate = write_pirate($pirate);
  write_mail($pirate);

  $ideal = new Ideal(MERCHANT_ID, SUB_ID, HASH_KEY, AQUIRER_URL);
  //... Preparing a hidden form for payment

  $app->render('_head.inc');
  $app->render(
    'pirate.php',
    array(
      'hidden_form' => $ideal->hidden_form(),
      'url' => $ideal->aquirer_url,
    )
  );
  $app->render('_footer.inc');
  });

The values you posted are validated, then written to the database, and placed in a mail.

A new page is then rendered, with a hidden-form which will be posted to the payment-system.

iDeal

We have the simplest form of iDeal payment, iDeal lite. This is an offsite-payment, where you simply create a form with hidden values and POST that to the offsite payment-system. They then parse the POSTed values and present the customer with a payment-workflow.
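A minimal sketch of such a hidden form, in Ruby for illustration. The hidden_form function, the field names and the URL are placeholders made up for this example; they are not iDeal's actual specification, and a real integration would also need whatever hash or signature the payment provider requires.

```ruby
require "erb"

# Build an HTML form with only hidden inputs, to be POSTed to an offsite
# payment page. Field names and URL are placeholders, not a real PSP spec.
def hidden_form(action_url, fields)
  esc = ->(value) { ERB::Util.html_escape(value.to_s) }
  inputs = fields.map do |name, value|
    %(  <input type="hidden" name="#{esc.call(name)}" value="#{esc.call(value)}">)
  end
  [%(<form method="post" action="#{esc.call(action_url)}">),
   *inputs,
   %(  <input type="submit" value="Pay">),
   "</form>"].join("\n")
end

puts hidden_form("https://psp.example/pay", { "purchaseID" => 42, "amount" => 1750 })
```

Escaping every name and value matters even in a sketch, because these values end up inside HTML attributes.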

On success, the customer might return to /success; on error, there is a chance they end up on /error. Because this is by no means a confirmation, we simply render a success or error page there, but take no actions.

$app->get('/error', function () use ($app) {
  $app->render('_head.inc');
  $app->render('error.php');
  $app->render('_footer.inc');
});

Other iDeal versions have a post-back system, where their server confirms the payment. One of the earlier versions of this application had that, since we did not know exactly which iDeal version we’d go with. A confirm would be really simple to implement here:

$app->post('/confirm', function () use ($app) {
  $payment = $app->request()->params();
  $db = get_connection();
  // Placeholder names in the SQL must match the names bound below.
  $stmt = $db->prepare("UPDATE pirates SET status = :status WHERE id = :id");
  $stmt->bindParam(":status", $payment["paymentStatus"]);
  $stmt->bindParam(":id", $payment["purchaseID"]);
  $stmt->execute();
});

Obviously, in reality you’d probably need to parse the $payment to extract the correct information (the status and the user whose payment it was) from there. But again: you can see how extremely simple it is to implement.

Deploying

Since everything is stored in code, we can deploy really simply, using a git push. Everyone with write access to the codebase can push changes there.

This allows for really fast rolling releases. Unfortunately, this breaks when someone hacks the code online. And with so many people, under such large pressure, that will happen. So all the time saved by having deployments under our fingertips with git was undone by merging in changes that were not pushed correctly.

The solution to this is to make it easier and simpler to deploy The Right Way than to hack something on a live server. (And not, as some might say, to make it harder to do the wrong thing, like removing Vim on the server or disallowing access to the files on the server).

I assumed git-deploy would be this easy, but apparently people under stress grabbed Vim over SSH and hacked away on production code anyway. Apparently the git route was not simple enough.

Conclusion

In several nights of coding, about 20 hours, of which 12 hours were CSS- and HTML-cleaning and fiddling, we made a really simple login-system.

The system was simple enough for others to start hacking on, in mere minutes.

And its result was simple, stable and friendly enough to handle over 1100 subscriptions in less than a month’s time.

Another example of what I call the “seeping through of the complicated underlying technology”. In order to keep a project and its result clean, simple and friendly, make sure the technology you use is simple, friendly and clean; KISS. Despite all the unstructured and rushed development, this application still works, is reasonably clean and can be revived and improved in mere minutes.

Something I doubt would have been possible with a large stack of e-commerce modules in Drupal; or hacked into a Magento or other shop; let alone built with Zend, Rails or Whatever other framework.

https://berk.es/2012/11/19/developing-a-tiny-ecommerce-site-using-microframework-slim

Managing static websites in 2012

Nov 11, 2012 Updated Nov 11, 2012

Static sites, for example weblogs made of simple HTML files, are making a comeback. Tools like Jekyll, together with modern version control systems like git, often make setting up a simple static site many times easier than configuring a CMS. In this article I briefly explain how a CMS relates to a static site. Then I look at a list of pros and cons, possibilities and impossibilities. Finally, I briefly describe how to set up a site with Jekyll in three simple steps (simple for programmers, hackers and more experienced computer users, at least).

A CMS generates pages when they are requested. For every page, for every visitor, a page is built from scratch, for one-time use. This blog article, for example, does not change between the moment you read it and the moment that one other reader reads it; so why build a custom-made page for you and for that other reader? Why should I dig through a database again and again, for you and for everyone who comes before and after you, to find the latest updates and build a page from them?

A Static Site Generator (SSG) generates all files and pages once, for example on an update. The idea is not new: tools that did this for you, such as Dreamweaver, used to be the norm. Nowadays complex, CMS-based or dynamic sites are the norm, but they are usually overkill. Examples of modern Static Site Generators are:

Jekyll, in Ruby, used by GitHub, among others.
Hyde, in Python, comparable to Jekyll.
Phrozn, in PHP, tries to approximate Jekyll and Hyde.

So we are seeing a revival of Static Site Generators, mainly driven by git. Git is a version control system that makes it easy to release and to manage large numbers of files. This generation of static site generators does not use databases or complex and expensive packages but, very simply, text files: one text file for every page and article. You manage these text files with git. With a few scripts you turn the text files into actual web pages, and you put those online.
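To show how little is needed to turn a text file into a web page, here is a toy generator in Ruby. It is a sketch under assumptions: the "---" header mimics the front matter that Jekyll uses, but parse_post, render and the {{title}}/{{content}} placeholders are invented for this example, and nothing is HTML-escaped.

```ruby
# Toy static site generator: one text file in, one HTML page out.
LAYOUT = "<html><head><title>{{title}}</title></head>\n<body>\n{{content}}\n</body></html>"

# Split a post into its header ("key: value" lines between "---" markers)
# and its body.
def parse_post(text)
  _, header, body = text.split(/^---\n/, 3)
  meta = header.lines.map(&:strip).reject(&:empty?)
               .map { |line| line.split(/:\s*/, 2) }.to_h
  [meta, body.strip]
end

# Turn blank-line separated blocks into paragraphs and fill in the layout.
def render(layout, meta, body)
  paragraphs = body.split(/\n{2,}/).map { |text| "<p>#{text}</p>" }.join("\n")
  layout.sub("{{title}}") { meta["title"] }.sub("{{content}}") { paragraphs }
end

post = "---\ntitle: Hello\nauthor: me\n---\nFirst paragraph.\n\nSecond paragraph.\n"
meta, body = parse_post(post)
puts render(LAYOUT, meta, body)
```

A real generator adds Markdown, layouts that nest, and an index of all posts, but the core loop stays this small.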

A Static Site Generator has some disadvantages, but also enormous advantages compared to dynamic sites:

Dynamic and interactive functionality

A static site simply has no access to a database. Posting and storing comments, for example, is therefore simply impossible. That makes things very secure, but it also makes many sites and concepts impossible as static sites. Sites where people log in and post messages, articles, products or whatever cannot be built as a static site. For comments on articles you can still pull off some tricks with, for example, Facebook or Disqus. Beyond that it is as good as impossible. If you need interaction, registrations, or user submissions, a static site is simply not an option. But for most editorially driven sites (even when the editorial team is just one person) it is an extremely suitable tool.

Performance

On an average small web server, an average WordPress or Drupal site (two widely used CMSes) can often handle fewer than ten users per minute. Loading a page takes between half a second and several seconds. And for almost every site, large or small, you have to tune complicated caching layers, proxies or databases, just to show a few pages. A static page is roughly a hundred thousand times as fast as the simplest Drupal page.

Many CMSes therefore fall back on all kinds of caching servers and layers. It is not unusual to dedicate an entire server (count on €300/year and up) to caching. Caching is in fact nothing more than storing the generated page (or parts of it) somewhere once, so that it does not have to be generated again for the next visitor. That is often rather silly: a full CMS running on the server, just to produce static pages. That can be done smarter (hint: put the CMS elsewhere, for example on your own computer, and place the generated pages on the server yourself). Caching is indispensable for most CMSes, and it undoes a large part of the dynamics and interaction. Often it turns out to be perfectly possible not to rebuild that four-year-old archive page from a database every time, but to write it out once as a static file and never look at it again.
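The "store the generated page once" idea can be sketched in a few lines of Ruby. PageCache is a hypothetical class for this example and the renderer block stands in for the expensive CMS work; real caching layers also have to deal with expiry and invalidation, which is left out here.

```ruby
# Minimal page cache: render a page on the first request, then serve the
# stored copy. PageCache is a made-up name for this sketch.
class PageCache
  attr_reader :renders

  def initialize(&renderer)
    @renderer = renderer
    @pages = {}
    @renders = 0
  end

  # Return the stored page, or render and store it when it is missing.
  def fetch(path)
    @pages.fetch(path) do
      @renders += 1
      @pages[path] = @renderer.call(path)
    end
  end
end

cache = PageCache.new { |path| "<html>page for #{path}</html>" }
3.times { cache.fetch("/archive/2008") }
puts cache.renders
```

A static site generator takes this to its conclusion: every page is "cached" up front, as a file, before any visitor arrives.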

Peak performance

Even when you have a well-tuned server with a nicely configured CMS, things often go wrong under peak load. Usually the underlying cause is the complexity of the whole: all sorts of coupled editorial systems, ad servers and many interconnected servers. When you serve only static files, the setup is so simple that the site can simply be kept online under peak load, in really extreme cases by adding a few extra servers. That is something most dynamic systems simply cannot do, while reachability matters most precisely under peak load. If not for critical communication and information during calamities, then at least because you rake in plenty of advertising revenue during a peak.

That is why you see (large) sites switch to static files in extreme situations, or even keep static files ready at all times to take over when the CMS can no longer cope. CNN during 9/11 was a fine example of this: a simple page to which editors kept adding or rewriting information; their CMS had long since died from overload.

Security

Because you remove the most complex and also the most exposed part from your application stack, securing it is very easy. There are no databases, admin environments or interactive parts that can be broken open. The only one who can place things on the server is the application or person that puts the static files there, so that can be secured very strongly. You can (almost) not break into an HTML file. You practically always can break into that same page served by a CMS (it is just not known yet how).

For this reason you can also save enormously on maintenance: you only have to maintain the web server(s), which almost nobody does themselves any more anyway. With an update frequency of once every few weeks for WordPress, or once a week for Drupal with a few modules, this quickly saves ten or more hours of expensive developer time per month.

Hosting and server costs

Hosting can be kept very simple. Because of the thousands of times higher performance, you can in theory host thousands of times as many sites on the same hardware. Usually you can host a static site with a few thousand visitors per week for free quite easily, and still reasonably professionally. For much larger sites you usually need to set up only one server instead of a whole tree of servers. Large hosters such as the NPO often have many tens of times more application servers (the servers that run the CMSes) than static content servers, simply because the latter are so much easier to scale to unbelievable performance.

The web, and with it the best-known servers and all protocols, was simply designed for static content. Only now, in 2012, are web servers and protocols becoming available that are technically suited to interactive sites. Everything we did until now was tinkering and fiddling in the margins. So if you go back to that basis, static pages, things become many times easier.

Hackability

You can reduce your whole stack of server software to one locally running tool: no mod_php, CGI, databases, caching proxies and so on. And even that local piece of software is usually very simple. Jekyll itself consists of a few classes and a few dozen files, no more. With some tinkering you can also produce HTML files from text files with four PHP scripts. Moreover, the application does not run on the (production) server, so trying out small changes is very easy. Only when you are satisfied do you copy everything online. It is therefore very, very hackable (hackable as in customizable, not as in vulnerable to break-ins). With a little programming experience you can build very large sites this way. That is much easier and clearer than tinkering around in the code of some CMS, which is often strongly discouraged anyway, whereas with static site generators it is precisely the norm and the intention!

This customizability also makes you a lot freer than when you do everything according to the structure or standards of the chosen framework or CMS.

SEO

The simplicity and hackability of static sites allow you to make adjustments at a very detailed level: super-clean HTML, good tags, neat redirects and so on. The speed also ensures that search engine robots can read and index your whole site in a few seconds; with many CMSes this takes weeks, mainly because robots do not want to take your site down, so they index more gently when they see that you run a CMS.

Higher threshold

Editing text files on your disk raises the threshold: higher for inexperienced users to start writing text for a site, and higher for yourself, because publishing an article seems a slightly bigger task than on a CMS.

As a result, many blogs built as static sites tend to contain longer articles, published less frequently. It is less suitable for someone who wants to post an update four times a day, preferably from her mobile phone, and more for someone who polishes a piece for days.

Integration of all sorts of gadgets

Most CMSes also integrate Twitter, Facebook and the like simply via so-called “embed codes”. That is exactly as easy on a static site. But when the integration goes a bit further, such as automatically linking comments to Facebook or automatically posting links to new articles on Twitter, some more improvisation is required.

Larger organisations

The current static site generators assume that you have a list of files on your PC that are converted to HTML on publication. Collaborating with more people is somewhat difficult, unless those people do so with a version control system. A shared Dropbox, in which several people place text files at the same time, also counts.

But for a standard editorial team with (online) workflows that pass through many teams, copy editors and so on before going online, it is probably less suitable. Then again, you can also just mail each other text files.

Setting up Jekyll

Below is a short howto for a site with Jekyll. Such a site is very simple and flat. If you want to get ahead faster, you can use, for example, Octopress or Jekyll Bootstrap. But those are immediately much more complex and probably require a lot of cleanup again. I started with Jekyll Bootstrap myself, but had to spend a lot of time removing all sorts of things. In hindsight it would have been much faster to start with a standard Jekyll and simply set everything up myself.

So here are the three steps to make your site:

1. Installation

Install the gem (or first Ruby and RubyGems and then Jekyll; sometimes needed on Linux, on a Mac you already have these):

$ sudo apt-get install ruby rubygems # not needed on Mac, sometimes on Linux.
$ sudo gem install jekyll

Or install Jekyll from the package manager in Ubuntu. Going via gems is better, because the deb packages in Ubuntu tend to lag behind the gems (gems being Ruby’s own package system).

$ sudo apt-get install jekyll

2. Creating directories and files

Create the directories in which you will build your static site, its layout and so on.

    .
    |-- _config.yml
    |-- _includes
    |-- _layouts
    |   |-- default.html
    |   `-- post.html
    |-- _posts
    |   |-- 2012-12-30-mijn-eerste-blogartikel.markdown
    |   `-- 2012-12-31-het-was-een-goed-jaar.markdown
    |-- _site
    `-- index.html

default.html contains the HTML that styles your site; post.html contains the layout of a single article within that default layout. In these you will probably place a few variables to insert titles, dates and so on when the HTML is generated. You write the posts as simple text files, but you can put some extra information (such as tags, author and so on) at the top of the file.

You fill index.html with some text for your front page and code to place a list of articles, for example:

{% for post in site.posts limit:10 %}
  {{ post.date | date_to_string }}: <a href="{{ BASE_PATH }}{{ post.url }}">{{ post.title }}</a><br />
{% endfor %}

Rendered as:

30 Dec 2012: Mijn Eerste Blogartikel
31 Dec 2012: Het was een Goed Jaar
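The date and a default title can both be derived from the post's filename. The following Ruby sketch shows one way to do that; listing_line is a hypothetical helper for this example and is not claimed to be Jekyll's own code, and a title set in the header of the post would of course take precedence.

```ruby
require "date"

# Derive "30 Dec 2012: Some Title" from a filename such as
# "2012-12-30-some-title.markdown". Hypothetical helper for this sketch.
def listing_line(filename)
  match = filename.match(/\A(\d{4}-\d{2}-\d{2})-(.+)\.markdown\z/)
  raise ArgumentError, "unexpected filename: #{filename}" unless match

  date  = Date.strptime(match[1], "%Y-%m-%d")
  title = match[2].split("-").map(&:capitalize).join(" ")
  "#{date.strftime('%d %b %Y')}: #{title}"
end

puts listing_line("2012-12-30-mijn-eerste-blogartikel.markdown")
```

This is why the filename convention matters: the generator has no database to ask, so the date has to live in the name of the file.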

3. Converting to HTML: generating the site

Run jekyll --server. Open a browser at localhost:4000. And voilà, your site.

With jekyll generate you fill the _site directory with the static version of your site; you can place this on almost any host. Hosting (for free) on GitHub is possible too, but requires some knowledge of git and GitHub’s hosting. Of course you can fully automate publishing the static content, for example with git or one of the many other ways.

https://berk.es/2012/11/12/statische-websites-beheren-anno-2012

Fix SEGV for Vims command t on Ubuntu 12.10

Oct 22, 2012 Updated Oct 22, 2012

The upgrade to Ubuntu 12.10 upgraded my Ruby version to 1.9.3 (yay!).

This, however, broke Command-T, a Vim plugin to open files quickly. Command-T is a compiled plugin (for speed) and needs to be compiled against the system-wide Ruby. Otherwise Vim crashes with a SEGV.

A little searching showed me that Command-T was the problem and needed to be recompiled. Obviously, this applies only when you installed Command-T before the upgrade to 12.10 (and thus compiled it against the previous Ruby version). As nearly always, once you know the problem, the fix is easy on Ubuntu; Vim and gVim are already compiled against the correct library.

First, checking what version command-t is compiled against:

  $ ldd ~/.vim/bundle/command-t/ruby/command-t/ext.so
  linux-gate.so.1 =>  (0xb7714000)
  libruby1.8.so.1.8 => /usr/lib/libruby1.8.so.1.8 (0xb7696000)
  libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7330000)
  libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xb7314000)
  librt.so.1 => /lib/i386-linux-gnu/librt.so.1 (0xb730b000)
  libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xb7306000)
  libcrypt.so.1 => /lib/i386-linux-gnu/libcrypt.so.1 (0xb72d5000)
  libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb72a9000)
  /lib/ld-linux.so.2 (0xb7715000)

Hmm. libruby1.8.so.1.8, not good.
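If you want to script this check, the interesting part of the ldd output is easy to pick out. A small sketch in Ruby, assuming output shaped like the listing above; linked_ruby_libs is a made-up helper name, and on a real machine you would feed it the text returned by running ldd yourself.

```ruby
# Extract the libruby entries from `ldd` output. The sample below is copied
# from the listing above; the helper name is hypothetical.
def linked_ruby_libs(ldd_output)
  ldd_output.lines.map { |line| line[/libruby[\w.\-]*\.so[\w.]*/] }.compact
end

sample = <<~LDD
  linux-gate.so.1 =>  (0xb7714000)
  libruby1.8.so.1.8 => /usr/lib/libruby1.8.so.1.8 (0xb7696000)
  libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7330000)
LDD

puts linked_ruby_libs(sample).inspect
```

An empty result would mean the extension is not linked against any libruby at all, which is a different problem from being linked against the wrong one.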

Cd into the Command-T location (mine is at ~/.vim/bundle/command-t/), fetch the latest version, clean the old build and rebuild. To rebuild I use the Rakefile; you might have to install rake first. And, importantly, make sure you run the system Ruby if you are using RVM gemsets. Like so:

  $ rvm use system
  $ ruby -v && which ruby # Just to know what we are using.
  $ sudo gem install rake # We need rake to build.

  $ cd ~/.vim/bundle/command-t/
  $ git pull --rebase origin && git checkout master

  $ make clean # remove old compilations and installation
  $ rake make  # rebuild the extension

And there you go:

  $ ldd ~/.vim/bundle/command-t/ruby/command-t/ext.so 
  linux-gate.so.1 =>  (0xb7714000)
  libruby-1.9.1.so.1.9 => /usr/lib/libruby-1.9.1.so.1.9 (0xb74da000)

Happy command-t-ing.

https://berk.es/2012/10/23/fix-segv-for-vims-command-t-on-ubuntu-1210

Spree e-commerce on budgethoster Site5

Oct 21, 2012 Updated Oct 21, 2012

Recently I rolled out a Ruby-on-Rails/Spree-based webshop on the budget hoster Site5. I thought I would share some gotchas and the reasoning around this project: to debunk the idea that hosting for Rails is either complex and hard, or done on Heroku; to explain a bit about Spree as a good option for your e-commerce (or not); and to go a little into how I modified Spree.

My wife makes custom bags, smartphone and tablet sleeves, and all sorts of handcrafted leather jewelry. Obviously she wanted to sell some through the internet, so I made her a webshop.

Reasons for choosing for Spree

I decided to go for Spree, after some investigation. A few important reasons were:

Its catchphrase: “It was designed to make customization and upgrades as simple as possible.”
It is built in Ruby on Rails, and that is what I do. (Although I probably know more about PHP, and Drupal, at that).
We wanted a simple shop. Her needs are mostly simplicity; fewer features in the shop equals better. Spree promises that, as opposed to most ready-to-go shopping tools, which promise every feature you may wish for and then get in your way.
We needed flexibility. Simple means that it offers fewer options and choices, but also that the business logic needs to handle more.
I wanted the site to be ready for mobile and thus to be responsive. Rails is ready for this, HTML5 and all that. So is Spree.

Some reasons why I had thoughts about either writing my own e-commerce or going with a PHP solution were:

I had (and have, read on) my questions about how true the catchphrase “It was designed to make customization… as simple as possible” was in reality.
Spree is Rails. Rails requires professional hosting; not a problem for a project developed for over 1k, where hosting at $20 per month is peanuts. In this case $20 per month is far over budget; again, the insides of her exact business choices are out of scope for this project. Good-enough PHP hosters come in at under $5 per month.
I was going to build it for free, obviously. So we could not afford me spending hundreds of hours on tweaking some tool, CSS and whatnot. A turnkey solution like Magento would be faster.
We chose the Dutch bank and payment provider Omnikassa because they simply offered the best deal for her. There was no Omnikassa payment plugin for Spree, but there were some for more famous frameworks.

I went for Spree after proof-of-concepting my two biggest challenges: offsite payment, and Zurb Foundation (http://foundation.zurb.com) as CSS/HTML framework instead of the default Skeleton. Both came out quite alright. And a 30-day trial on Site5 proved to me that hosting there was going to work out.

Zurb Foundation

I replaced most views with my own HTML, so that I can use Zurb foundation. It offered just a little more features, such as a slider and more advanced responsive features; like hiding entire subtrees of elements on certain devises.

In the end, we decided to go Desktop-first; I am re-doing some of the views now, so they are prettier on mobile too. The reason we put this lower on our todo-list is that the most-used payment system in the Netherlands, iDeal, through our PSP Omnikassa, is not mobile-friendly in the first place. So why offer an entire mobile-webshop when people cannot pay (properly) on a mobile?

It turned out to be really easy, but it was quite a journey of discovery through all the spree-gems and their views before I found out which views to copy into my own project. The CSS was just as cumbersome to override. In the end, I decided to simply do away with all CSS and JavaScript for the front-end and roll my own.

Site5 Hosting.

For under €5 per month you get a server with git, ssh-access and Passenger to host your Rails application. A few Euro more gets you a static IP address, which I need for the SSL-certificate. It is an e-commerce site, so you need HTTPS.

It requires a few settings to be changed in my Rails application and it requires Ruby 1.8.7, since that is what their Passenger is configured to use, but in the end it works just fine.

The only problem I had was some asset-precompiling issue, where the compiler just died on me. After a support-call, the Site5-engineers upped some memory on the server and I could compile just fine. It turned out that some spree plugin came with several hundreds of (demo) asset-files like huge videos that needed to be “compiled” by the asset-pipeline. Cleaning up Spree Slider and removing its assets fixed the issue for good. But hey, I don’t expect a budget-hoster to support me compiling hundreds of megabytes of video and other demo-stuff. Fair enough.

Also, their uptime is reasonable. Not enterprise-grade reliable (IMHO), so I have nagios check the frontpage every five minutes for certain strings (it checks whether the words “Anna Treurniet” occur in the <title>). Every odd month (or so) there is a restart or some short downtime. One time Site5 changed the MySQL-setup (the location of the socket moved elsewhere), so I had over half a day of downtime until it was solved. And now and again the application gets shut down for no clear reason, so it needs to boot, resulting in the webshop loading very slowly for one or two visitors. The nagios checking actually kind-of solves this too, since it acts as a “visitor” opening the site every five minutes.
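
A check like this could be sketched in Ruby roughly as follows. This is not the actual nagios configuration used for the shop, just an illustration: the helper names `title_contains?` and `check_frontpage` are made up for this example, and the exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```ruby
require "net/http"
require "uri"

# Nagios plugin exit codes: 0 = OK, 2 = CRITICAL.
OK = 0
CRITICAL = 2

# Hypothetical helper: true when the <title> element of the given HTML
# contains the expected string. Text elsewhere on the page is ignored.
def title_contains?(html, needle)
  title = html[/<title[^>]*>(.*?)<\/title>/mi, 1]
  !title.nil? && title.include?(needle)
end

# Hypothetical helper: fetches the URL and returns an [exit_code, message]
# pair that a wrapper script can print and exit with.
def check_frontpage(url, needle)
  response = Net::HTTP.get_response(URI(url))
  unless response.is_a?(Net::HTTPSuccess)
    return [CRITICAL, "CRITICAL: HTTP #{response.code}"]
  end

  if title_contains?(response.body, needle)
    [OK, "OK: title contains #{needle.inspect}"]
  else
    [CRITICAL, "CRITICAL: title does not contain #{needle.inspect}"]
  end
rescue StandardError => e
  # Timeouts, DNS failures and refused connections all count as "down".
  [CRITICAL, "CRITICAL: #{e.class}: #{e.message}"]
end
```

Checking the title rather than just the HTTP status catches the case where the host serves an error page with a friendly status code. As a side effect, every run of the check is also a request that keeps the application booted.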

All in all, I am very happy with this host. It offers far (far!) more than one would expect for €5/month. In my “enterprise”-jobs, I have to deal with €500-and-up-per-month hosters who have far worse deals, support, deployment and uptime.

Their absolute biggest downside is the way their bulkhosting environment holds them back from upgrading to Ruby 1.9.3. So, if you depend on a reasonably recent Ruby-version, bad luck.

Git Deploy

In order to keep the deployment smooth and somewhat close to the experience I have with Heroku, I use git-deploy. Git deploy consists of a few simple scripts that run on the server in post-receive hooks. So, after you push your changes to production with git push production, the server runs a few commands, like (when needed) database-migrations, assets and cache-refreshing and then a restart of the Rails application. I have used this for other, PHP-based systems too. Aside from some problems with asset-precompiling, as mentioned above, it works like a charm.
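
To give an idea of what such a hook does, here is a rough sketch in Ruby. It is not git-deploy's own code: the function name `deploy_steps`, the paths it looks at and the exact rake commands are assumptions for a typical Rails application hosted on Passenger.

```ruby
# Hypothetical helper: decide which deployment steps to run, given the list
# of files changed by a push. In a post-receive hook that list could come
# from `git diff --name-only <old-revision> <new-revision>`.
def deploy_steps(changed_files)
  steps = []

  # Only migrate when the push actually contains new migrations.
  if changed_files.any? { |file| file.start_with?("db/migrate/") }
    steps << "bundle exec rake db:migrate"
  end

  # Only precompile when assets changed; this is the slow, memory-hungry step.
  if changed_files.any? { |file| file.start_with?("app/assets/") }
    steps << "bundle exec rake assets:precompile"
  end

  # Touching tmp/restart.txt asks Passenger to restart the application.
  steps << "touch tmp/restart.txt"
  steps
end

# Runs the steps in order and stops at the first one that fails.
def run_steps(steps)
  steps.all? { |command| system(command) }
end
```

Running the expensive steps only "when needed" is what keeps a push-to-deploy in the range of minutes on a budget host.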

Obviously, when using git to manage the deployment, you need a good branching and releasing management. With that in place, I can fix and deploy changes within minutes. Yes. Minutes. Probably faster than most of you can log in over SSH, find the sourcecode on production, and hack a fix in with Vim.

Testing and TDD

Unfortunately, I did not manage to get good integration tests for Spree going. Most of the extensions lack any form of tests. Spree itself is covered pretty well, but integrating Spree means changing configuration and overriding behaviours with Decorators. And I still have no idea how to test these properly. The rest is mostly view-overrides, which often break Spree’s own tests and require me to rewrite all the spree-tests in my application. It mostly boils down to my inexperience with testing, though.

Omnikassa

A Spree Extension was written to allow offsite-payment with the Dutch payment-system Omnikassa. This is and was a mess. Spree had no (proper) support for offsite payments, so I had to hack into the entire checkout-workflow in order to get this payment-system going. The Omnikassa-extension somehow breaks the feature in Spree to allow discounts; it breaks certain orders in the back-end and whatnot. Spree 1.2.0 promises to have this checkout-workflow-inflexibility fixed, but an upgrade is rather hard, seeing all the customisations the application needed.

Spree again?

I don’t think I will use Spree for a future e-commerce project. Despite its promise to be the most flexible solution, I found it making too many assumptions and being far too inflexible in areas. I’d rather roll my own, next time. The most important parts that Spree offers me are either very easy to develop myself (products, categories, their views, content-management), not needed (credit card-handling, user-login, 3rd party statistics) or covered in solid Rails plugins (administration, editors).

Against that stand the small-things-made-hard, like changing the checkout-workflow (one-click-checkout?, offsite-payments), manually ordering the products on the frontpage, integrating a simple CMS for the few “static pages” and so on. Usually, in an average Rails-project, these things take a few hours of programming and deploying. Here they took me days of stepping through spree-core code in order to understand how my Decorator did (not) change some ordering or some menu-addition.

For me, Spree offers too few benefits to overcome its downsides. Despite its promise, it is very much a ready-made application, which works according to various assumptions about workflow, features and even looks, that you can configure and beat into shape; mostly. The very same conceptual things that drove me from Drupal.

But overall, rolling out a simple Spree-site on Site5 gives you a well-designed, ready-to-go e-commerce environment for under €10 per month. And with a few hacks and tools, you can make the deployment to that host really easy. Whether Spree is the best solution for your e-commerce needs depends on how much (and what exactly) you need to customize.

https://berk.es/2012/10/22/spree-on-budgethoster-site5

Why outside in Webdevelopment is Better; why Designs should be Leading.

Oct 14, 2012 Updated Oct 14, 2012


Designs should be leading when building a website or -application. There! I said it.

Sure, form follows function; content is king and so on: a design should be a Good Design to begin with: it should carry the function, content and user-experience to a higher level. So, for the sake of the argument, let us assume you have the Perfect Design For Your Project, Audience and Branding. It is UX-tested. It is delivered as valid, clean HTML and CSS, backed with PSDs, Wireframes and looks stunning. It has passed numerous PR- and marketing sessions, and even the boss wants it implemented like this.

Enter “the technical guys”. People like me, a backend-guy. “Sorry, but the CMS we are going to build it with makes it really hard to implement this design”, sometimes followed by the really pragmatic solution to have the CMS dictate the way the application works. To redo the wireframes according to how the CMS works, because that will save lots of time and budget. That last word being the most important one: budget.

People think visual. People talk visual. And our delivery is mostly visual: people interact with our -technical- solutions through the “interface”, that which was in “the Design”.

But most important: the design, that interface, is most prone to change; from small changes in wording in a footer, to a grand rebranding-overhaul: mostly interface changes. When I calculate the tickets in the projects I am now involved in, 80% of the tickets are “interface”- or “interface-related”. First because this interface is where any problems below surface and second because this is what 80% of your and my clients care about: the interface.

Yet, even though most of the communications and deliveries are outside-in, we work inside-out: starting with database-designs and CMS-configurations, from which we work towards the interfaces.

But how?

You’ll be developing the innards of the application (its models, controllers) and then, in the end, need a giant hammer to beat the views, glued to it, into shape to match as closely as possible to your designs.

It gets worse when these innards are fixed, because you use a CMS that has them built in, for example. You’ll be writing extensions, hacks and overrides all through that CMS, just to make it output its interfaces as close as possible to the designs.

The solution lies in outside-in development. I encountered that term first in The Rspec Book — A Comprehensive Handbook for Behaviour Driven Webdevelopment. And I am loving it.

The idea is as simple as pragmatic: you develop by making the views, the interface first. Then fixate them in tests (specs), so that the interface, the HTML is fixed in your project. From there on, you develop the underlying application, whether those are modules for a CMS, models and controllers for an MVC-framework or configurations of a point-and-click-tool.

Once the HTML (and thus the CSS and interaction) is fixed, a test will start to fail the moment someone touches that HTML. Developers can therefore develop, refactor, and rewrite the underlying application as often as they wish, without breaking the Designs. Sounds cool, not?

Flexibility.

Well, maybe not if that underlying application lacks the flexibility to be rewritten, altered and changed without making these view-tests fail. If, the moment you install an extension, all your tests turn Red (they fail), your tests, the fixed interfaces, become a burden. It is then near impossible to keep them passing, without all the hacks, extensions and configurations. That is probably the moment that you decide to rather change the view-specs (i.e. divert from the original Designs) just to move forward.

It means you have the wrong tool. It means you have a tool at hand where you’ll simply never get 100% according to design. Where the tool dictates the designs and workflows of the application. If that works for you, then outside-in development makes little sense; but in that case, be very aware of this all through the process. It really makes no sense to have a UX-lab test all the wireframes and mockups if, two weeks later, you find you cannot implement these wireframes and mockups anyway, due to the underlying inflexible tools.

Most RAD frameworks like Rails, Django, and various of their PHP-clones, are flexible enough in this, because they don’t (ever) assume anything about your wireframes and designs. Some, very few actually, CMSes allow this too.

An example?

In Rails many people test with Rspec; Views are (arguably) best described in cucumber features. The exact syntax and setup goes way beyond this post, so let’s look at some pseudo-code:

Scenario: I want to log in
  Given I am signed in
  When I click on my name
  Then I should see my profile-page

A simple feature, described in Gherkin, which fixates the actual views with so-called step-definitions:

# Log in through the real sign-in form, so the view itself is exercised.
Given /^I am signed in$/ do
  @me ||= Factory(:valid_user)
  When %(I go to the new user session page)
  When %(I fill in "Username" with "#{@me.username}")
  When %(I fill in "Password" with "#{@me.password}")
  When %(I press "Sign in")
end

# The link text is the user's full name.
When /^I click on my name$/ do
  click_link("#{@me.first_name} #{@me.last_name}")
end

# The profile page must show the name in an <h1 class="name">.
Then /^I should see my profile-page$/ do
  page.should have_css('h1.name', :text => "#{@me.first_name} #{@me.last_name}")
end

This is a really simple test to ensure that a logged-in user, who clicks on a link with her name, gets to see the profile-edit page, containing an H1-tag with class “name” and the user’s name as its contents. Real-world tests, with more detail, are over at Jared Carroll’s blogpost on this.

Downsides?

Certainly. As you can see, just to describe a login-form, a link with my username, and a generic H1-tag, I already need one entire story. Depending on your test-suite and your application, you may have to write far, far more test-code than “actual” code.

But outside-in development is just one of many reasons why one might choose test-driven development. Writing tests requires effort, time and dedication. Without tests you have no way to describe and fixate the interfaces. And without that, there is no outside-in development (and you’d best tell your designers and clients right away that you are not going to get 100% pixel-perfect interfaces within the agreed budget and time… :) ).

The upside is that your application is tested, and can easily remain so. With each release, a few commands assure you that your (ever growing) application works. That all the work by all your colleagues or contributors remains working. And that the interfaces, the one thing your users care most about, work and look the way they expect them to.

https://berk.es/2012/10/15/why-outside-in-webdevelopment-is-better-why-designs-should-be-leading

Programming on a Non-English Project; best practices

Oct 4, 2012 Updated Oct 4, 2012


As a Dutch webdeveloper, I see mostly Dutch websites being built. And I see many teams struggling with how to do this well. I developed a set of best practices, which I want to share and discuss.

If your entire team’s first language is English, this post is probably not for you. And if you are the kind of person who thinks “are there actually people out there who write Russian code, Swedish documentation or use Dutch classnames? WTF?” this post is probably not for you either (but take this with you before leaving: there are. A lot).

Edit: as offbytwo, a redditor kindly points out, it is non-English, not none-English. Guess that proves that I am none-English (sic) to begin with :).

The Client.

They will most probably communicate in your local language. Not in English. They want to produce their specifications, requirements and even designs in their native language. This input is therefore not-English.

It makes no sense trying to enforce English here; that only creates friction in an area (feedback, requirements) where you want as little friction as possible. If anything, you may want to designate one person to translate this input before it goes into your other systems. But often it is good enough to simply put the non-English input as-is into your project.

The Team.

You might have one or two developers whom you have to communicate in English with. Either because you outsourced stuff, or because you are collaborating in Open Source, or because you have people on your team from abroad.

But even when you don’t, having all the technical communication in English pays off: all your developers understand and can write some English; but not all your potential developers speak your native tongue.

Standardising on English now, allows you to pull in such developers later. Standardising on your native language now, shuts a giant pool of potential collaborators out. Forever.

Code

All code should be English. Always. No exceptions allowed. Programming-languages, libraries and tools are all in English, so if anything, using your native language will only add inconsistency. Sure, if there is an all-Dutch programming-language, you would be writing all your code in Dutch. But such languages are rare or nonexistent.

class GebruikersController extends Controller {
  //Maakt een nieuwe gebruiker aan op basis van zijn identifier.
  function __construct($gebruikers_id) {
    //@TODO: maak de eigenlijke code
  }
}

There. I just used 5 pure English words and statements, 2 mixed words (Gebruikers-Controller and gebruikers-identifier) and a load of Dutch words that no-one but a Dutch-speaking person can understand.

All your objects, classes, variables, functions and whatnot should be English. Always. When you are programming a shop, you don’t call that class InnKaupaKörfu < ActiveRecord but rather just Cart < ActiveRecord. Code, database-tables, tools and whatnot should be in one language only. Defaulting to English.

Documentation

When your code is all in English, a large part of your documentation will automatically be in English. It makes no sense, therefore, to have the other documentation in another language.

Just default to English.

With one exception: the input you receive from your clients. If this input is passed on to English-speaking colleagues, you will need to translate it, but keep the original with the translation. You will then need to appoint one person, e.g. a project manager or lead developer, to do the translating. And you’ll need budget for that, obviously.

Configuration and Integration

In tools like Drupal, a lot of the building is done through UIs within Drupal itself. Things like fields (title, body, images, colors etc.) on articles, article-types and even entire pages (views) are configured through the interface.

Often these are made in the native tongue. That is not a good practice. Just give all these components English names. So don’t call the color-field on your “product”, couleur, or χρώμα. But rather just color.

The reason is that in situations like these with Drupal, but also Expression-engine, Typo3 and such, these names are then used all over the place. Ranging from database names, column-names to variables and autogenerated classes.

It is very ugly having to work with template-files called overzicht_block_content or les_plus_lus_page_header.

Because these names trickle through to your code, keep them English. Else you will be forced to break rule #1: all code is in English.

Interface strings

Quite often, you will be building your application in one language only. It makes little sense to write it in English first, when you are 100% certain that it will never be used in English.

In that case, simply don’t use the full text for the interface. Rather than "Thank you for registering".t, simply use tokens or symbols: "registration.messages.thankyou".t. Many languages and platforms already require this. But some have the default standards to write the full texts throughout your application. In that case, just ignore that standard, when English is too much hassle. Anything is better than having your code filled with pieces like "ขอบคุณสำหรับการลงทะเบียน".t. Only a Thai developer has any idea what is going on there; whether it is one word, two, a sentence or a token?
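
A minimal sketch of such a token lookup, in plain Ruby. The `TRANSLATIONS` table and the `t` helper are made up for this illustration (a real project would use the translation framework of its platform); the Dutch text is just an example string.

```ruby
# Hypothetical translation table: tokens in English, texts in the one
# language the application is actually shipped in (Dutch here).
TRANSLATIONS = {
  "nl" => {
    "registration" => {
      "messages" => { "thankyou" => "Bedankt voor je registratie" }
    }
  }
}.freeze

# Hypothetical helper: walks the nested table along the dotted token.
# Falls back to the token itself, so a missing text is easy to spot.
def t(token, locale: "nl")
  value = token.split(".").reduce(TRANSLATIONS[locale]) do |node, key|
    node.is_a?(Hash) ? node[key] : nil
  end
  value.is_a?(String) ? value : token
end

puts t("registration.messages.thankyou")
```

The code only ever contains the English token; the localized text lives in one place, where a translator (or a client) can change it without touching code.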

Obviously, this means using some kind of translation framework. But nearly every language and platform has these either built-in or as default add-on. And if not, you are probably working with the wrong tool, considering you are building a localized application in the first place.

Revision control

While many teams do adhere to the guidelines that code, and everything around it is English, their commit-messages are local.

The funny thing, however, is that I have never seen a team write only-Dutch commit-messages. They often mix English messages with Dutch. And when upstream software is English, you’ll see that even more.

In all cases, it is best to just use English for commit-messages. For one, because you will be referring to (English!) code in the messages, but also, because mixing English and non-English requires more mental energy from the reviewer, when she or he has to toggle all the time.

There is only one reason to write commit-messages in your native tongue: when otherwise people don’t write them at all. Anything that lowers the barrier to writing good messages is good. Having no commit-messages is definitely worse than having non-English messages.

Bug tracking

When bugs are only managed within the team, just stick to English here too. You already write most of your documentation in English, your commit-messages in English and obviously your code too. Why, then, write your bugs in another language? It only adds inconsistency and might (will!) even lead to confusion and more bugs. Like when the shopping-cart is called “shopping-cart” all over your product, but suddenly a bug-report mentions that the “ostukorvi” is not working. Suddenly! Confusion over already established concepts like “the shopping-cart-icon”, versus the “shopping-cart-process”.

Only when your clients write bug-reports, should you leave them in your native tongue. Here too, you want to remove any friction from your client submitting reports (you do, don’t you?).

I’ve seen three successful methods here:

The Bug-reporting was closed for the client; they emailed issues to their contact. The contact then translated each issue to English and turned it into an actual usable bug-report (i.e. turned “problem with site, clicking on header not well” into “JS-overlay placing DIV over clickable area in header”). Two wins.

The Bug-reports were made in Dutch and followed up in Dutch too. But from there, the person working on it translated everything. In commit-messages, when referring to an issue, the title was quoted. Like

Fixed issue 1337 "Problemen met video. METEEN REPAREREN!!!" by
adding new CDN-urls to javascript for players.

Bug-reports were made in a separate environment and followed up in an English-only workflow. Feedback and updates for the clients were in Dutch, in the “clients-issue-tracking”, yet in-house everything remained English.

What do you think? Is English bringing your productivity down? Do you consider your cultural heritage important enough, to decide to write code in your native tongue? Are you in a team where people don’t speak a word of English? (and how can they program, in that case?)

https://berk.es/2012/10/05/programming-on-a-none-english-project-best-practices

After almost twelve years of Drupal, I am saying goodbye

Sep 30, 2012 Updated Sep 30, 2012


Over eleven years ago, I got involved in Drupal, after running two personal blogs on Drupal. A few years later, to my surprise, I found myself to be amongst the first few people to offer commercial and professional Drupal services.

And boy, it was a ride. With no IT-education, other than webmonkey, the fantastic Drupal-community allowed me to piggyback and stand on shoulders. Great, strong shoulders.

I grew. I learned. And I learned how to program somewhat decently. Then, gradually, I learned there were books and systems that allowed one to program The Proper Way[tm]. Mostly because I was rolling out Drupal in real-world projects every day, bumping into issues that, as I found, had been “solved” ages ago in academic books and studies. I learned that “talk is silver, code is gold” is simply not true. Code that is discussed, thought about, refactored, and then discussed again, is of a much higher quality than code that is “just done”. I learned that properly architecting something turns it into far more than gold. I often complained about “horse-tack-coding” in Drupal, where working on small, isolated issues was (and is) always preferred over refactoring larger parts. This has led to a lot of repetition, inconsistent APIs and very unpredictable behaviour. I learned about encapsulation, separation of concerns, loose-coupling and more such well-known principles. All of which Drupal lacks, or ignores. I had the feeling I grew faster than Drupal.

I also came across Ruby on Rails and found that there were actually real, technical solutions for several of my gripes with Drupal. We are talking 2005, by now. Remember, I have no educational background. At first, I knew nothing about OOP, other than what the great folks, the great shoulders of the Giants in Drupal, told me about it on IRC and in forums.

I learned a lot of new terms, methods and concepts. They were, and still are, an eye-opener. MVC was something that actually existed! Something actually existed, actually got designed and invented, solely to solve most of the issues I had with Theming in Drupal! And these design-patterns were not just there to make technical people happier (or something to flamewar about); they actually solve many management and planning-issues too.

But I had also grown into something of a local Drupal-expert and go-to guy. Serving most of the Netherlands as freelancer and Drupal consultant paid for my mortgage. I got called in on many failing Drupal-projects. Got to help large companies and organisations in their switch to Drupal.

But toggling between Rails and Drupal-work, only made me see all the issues with Drupal more clearly: there was a lot of work for me to do, in order to make Drupal something as elegant and nice to work with as Rails. In an ever growing Drupalcommunity, I decided that my voice and code in this was only noise; especially since that community clearly has a different idea about webdevelopment than I have.

Around that time (beginning of 2009), I co-founded Wizzlern. We developed training and education for Drupal. Training people allows you to meet professionals with lots of different backgrounds. People who have formal training in IT. People with much more in-the-field experience than I will probably ever have. And people who are critical. About things in Drupal.

But developing several training-courses also required me to dive really deep into the what and the why behind things. I suddenly had to paint the big picture around an inconsistent and weird API, answer questions like: Why are modules so hard to find? How come there are so few pretty themes, compared to WordPress? Why is it so much harder to use than WordPress? Why can’t we find a decent workflow to develop in a team and deploy? Feedback from experienced webdevelopers (in Java, Python, .net and PHP-frameworks) made me realize even more that there was something amiss for me.

It became harder for me to defend that, harder to explain my passion for the system, its quirks and its community. They say that once you have looked in the kitchen of your favorite restaurant, you never want to eat there again. That could be the case here. Or maybe it is because I am a vegetarian.

The realisation came slowly. It took years. Drupal actually wants to be what it is now, not what I thought, or hoped, it wanted to be. My idea of a toolkit, developed by webdevelopers, for ourselves, webdevelopers, to create ever better websites for our clients, was not going to be found in Drupal.

Dries’s comment on Copenhagen’s keynote made this even clearer to me. He pointed out that Drupal should not focus on developers.

“Drupal made the webmaster redundant. In future it will make the webdeveloper redundant”.

Unfortunately, I cannot find the exact quote; this one is from my vague memory and a scribbled note I made back then. I can only find Dries answering my question about that quote. So his exact words are most probably different from what I phrased here!

However, the bigger picture became clearer to me, something we have seen happening in Drupal for a while now: it focuses on the click-and-point development of websites, not on the programming of websites. It really wants to be a CMS, albeit a flexible one. Rather than what I consider the future: a developer-platform that allows me to build a CMS.

Development with click-and-point offers little challenge for me. Learning where and how to find, evaluate and configure the umpteenth gallery-slider-view-plugin offers no challenge, nor satisfaction. I also found that this approach of clicking together a site did not satisfy my clients, since I was not able to deliver 100% of what they wished for. And I found it inefficient, especially when my programming skills grew. I could churn out a few objects and a hook or two that output the exact JSON I want, much faster than I would ever be able to click together some Services configuration.

In my search for more challenging Drupal-work, I helped several large sites to solve some of their performance-issues. Helped many projects with their problematic Drupal-development and -deployment. I taught many developers how to write themes, modules and how to deploy. Unfortunately the challenges did not revive my love for Drupal, but only took me further from it. I came to realize that most of the problems stem from the way the Drupal community prefers to do stuff. I even wrote some controversial opinions on that (Dutch).

And I decided that it was time to make a shift. Find new projects outside of Drupal, work on some pet-projects and see if I found more challenges and opportunities to grow again, outside of Drupal. Nearly one year of flipping between Drupal and other projects made me realize that I had to cut all ties in order to progress. That Drupal was never going to be the developer-tool I hoped it would become.

I will put down all my Drupal-work and finish the remaining few of my running Drupal-projects. Both those with clients, and things like a stable release of tagadelic2 for Drupal 7. I will obviously announce those here.

I am moving on to exciting new technologies, tools and development platforms that fit better with my workflow and programming-experience.

Goodbye, it was a fantastic bunch of shoulders to be allowed to stand on!

https://berk.es/2012/10/01/farewell-drupal

git deploy or: How I Learned to Stop Worrying and Love Deployment

Aug 2, 2012 Updated Aug 2, 2012


One of the most surprising things I learned when moving away from Drupal-development towards Rails development, is the impact of fast and low-barrier deployment.

We all know that Drupal’s deployment is severely broken.

I always thought fast and low-barrier deployment was just a nice-to-have, because it would bring down the actual hours spent on deploying. But it gives you so much more:

Very quick response on client requests.
The possibility to fail fast and fail cheap.
Be sure about few regressions and no failures.
Provide guarantees about uptime.
Allows for canonical releases
It allows for more code and less config.

Very quick response on client requests.

There is nothing that makes a client happier than sending in a question or request and seeing it live and online a few minutes later. And nothing pays your bills so well as a deployment.

If you have contracts that only pay after “the entire project is finished”, it most certainly includes a deployment. But more often it requires far more, because “finished” implies that all bugs are fixed and the client gets (almost exactly) what she asked for. In a waterfall this means many small releases, often on some internal “acceptance” site. When you can deliver the bugfixes and improvements faster, you will finish your project faster; the actual time between your first preview-delivery and the final one (the one that gets the invoice paid) is much shorter. This is when you are not working “Agile”. For “Agile”, such fast-and-often deployments are a requirement.

The possibility to Fail Fast and Fail Cheap.

If you can push out releases the way my rabbit has babies, you can afford to have some fail. And because you can release five minutes later, again, you can fix such a fail with lightning speed.

A failing release that needs days of fixing, rollbacks and recovery-work is bad. But when that means that a next release requires even more planning, people on-call, meetings, and so on, you only make things worse: the release becomes even more expensive and cumbersome.

Being able to quickly recover from a mistake helps the client. She or he sees less downtime and has to pay less for “deployment-time” and on-call hosters and sysadmins.

But it mostly allows you to try new stuff. If some new UI-idea, or a fancy payment-method can be released with little or no effort, it becomes less hard for everyone to roll it back if it proves less successful than anticipated. It makes the investment in some feature smaller, and therefore the barrier to simply throw it away when it fails, much lower.

Be sure about regressions and no failures.

One condition for fast releases is that you are certain about their quality. Most often you will have a test-suite in place to insure yourself against regressions.

This allows you to hit a few buttons and when everything turns up green, you can deploy. With arrogance. You know it will go well. And you know everything continues working.

The average Drupal-deployment calls for click-frenzies: developers, clients and other stakeholders click around on the site manually, for many hours, to ensure everything continues working.

A client who sees an unrelated part failing because you bugfixed another part, is an unhappy client. Even if you can explain that this complex access-permission-module touches not just the Wiki (where they asked for some access control) but also the blogs and forums (which you forgot to check thoroughly against the new access-control).

A client who has learned to check the entire, ever-growing site for failures on each and every release is an unhappy (and busy) client. A developer who manually walks through the entire site after each and every change is a very unproductive developer.

Testing as a security against failures is not a result of fast deployment, but a requirement to have them. If you want fast deployments you must have tests.

But when testing is not an option, at least the fast, and low-barrier deployment, allows for quick rollbacks and makes such failures much cheaper.

Provide guarantees about uptime.

With Drupal, you must bring your site down during deployment. When you consider that your average deployment of Drupal takes an hour or more, no-one can afford to have several releases each week. Even when you deploy at 03:00 at night.

With slow, manual deployments, especially in cases like Drupal’s where the site is offline during the entire process, the downtime is unacceptable for many. My clients have often postponed releases for weeks, because of this; because they were afraid to bring down the site for even one hour. My last bigger Drupal release took four(!) hours of manual labor. Half a working-day downtime is Not an Option for most.

That “fear of downtime” and postponing of releases is actually the worst part of it. It means that after developing cool new features, you have to wait weeks before they can be released (and the project can be finished and billed). Or worse: it means that you continue development and squeeze in hundreds more bugfixes and features, making the release even bigger and harder.

Allows for canonical releases

Releasing often, means that you can release after each and every change too. The advantages of that, are huge.

You detect mistakes faster, rolling back is a piece of cake, and the overall impact of a change is much easier to grasp.

It allows for more code and less config.

When a release is cheap, “hardcoding” stuff is cheap too. Instead of writing large and complex “on vacation-message-systems” in a CMS, you can simply set a “We’ll be back August 31” in the template. And deploy. Four minutes work.

Yet when deployment is hard and expensive, you’ll need to allow such things in your application. Quite recently, we implemented a large and complex layout-system, with drag-and-drop placement of content-snippets in a CMS. It had a horrible effect on the system: the design became extremely complex because it had to cater for every possible placement, performance of the application dropped to snail-speed, the code behind it all was large and complex, and the UI of this “in the CMS layout system” required large and expensive projects in itself. A disaster.

Yet the reason behind the request for this layout-system was that the client wanted to change the placement of some content once or twice a year. With the required downtime for a deployment, the overall costs of one such deployment and all the friction that caused, it was no longer an option to call the development-team twice a year with the request for changing the layout. Building this large and complex beast was actually cheaper than having some (frontend) developer change some HTML around twice a year.

With fast deployments, the option to hardcode things is a very valuable option. It not only allows you to keep the application and its backend simple and lean, it is mostly a self-amplifying-loop: large and complex configurable systems require hard and often manual labor on releases.

Which is the main problem in Drupal’s deployment: you don’t code stuff, you configure it. And everything configured cannot be deployed with a deployment-system, but has to be re-applied manually on a production site. Of course you can think of many tricks (like exporting and importing the configuration) but they don’t solve the underlying problem: manually applied configuration is not deployable like code is deployable. And when that configuration (such as the layout) lives in the same database as where your content lives, like in Drupal, the disaster is complete. The chaos is complete when such a manual configuration (like introducing a new content-field, say “teaser”) requires a change in code too. Or when a code-change requires manual configuration.

Learn to stop worrying too

By coding most of the stuff, in a framework that supports automated testing and has a good migration-framework, you lower all the barriers.

Start testing

Testing allows you to be sure about what you are about to release. No need to manually click through the site on some “Acceptance” server, for hours, before releasing. Just a few clicks after every change to assure yourself (and your client) that everything that worked in the past still works.
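As a minimal sketch of what such a safety net could look like, here is the access-control example from earlier in plain Ruby. The `ACCESS` table, the `can_view?` helper and the roles are all hypothetical, made up for illustration; a real project would probably use its framework's test tools (Minitest or RSpec, for instance) rather than a hand-rolled loop.

```ruby
# Hypothetical access rules: which roles may view which section.
ACCESS = {
  wiki:  [:member, :editor],
  blog:  [:anonymous, :member, :editor],
  forum: [:member, :editor]
}.freeze

# True when the given role is allowed to view the given section.
def can_view?(role, section)
  ACCESS.fetch(section, []).include?(role)
end

# Behaviour that worked in the past and must keep working:
# [role, section, expected result].
EXPECTATIONS = [
  [:anonymous, :wiki,  false],
  [:member,    :wiki,  true],
  [:anonymous, :blog,  true],
  [:anonymous, :forum, false],
  [:member,    :forum, true]
].freeze

# Return every expectation that no longer holds.
def regressions
  EXPECTATIONS.reject { |role, section, expected| can_view?(role, section) == expected }
end

failed = regressions
if failed.empty?
  puts "All #{EXPECTATIONS.size} checks green: safe to deploy."
else
  failed.each do |role, section, expected|
    puts "REGRESSION: #{role} viewing #{section} should be #{expected}"
  end
  exit 1
end
```

The point is that tightening the wiki rules in `ACCESS` gets checked against the blog and forum expectations too, without anybody clicking around.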

Write migrations

Instead of manually inserting stuff after a deployment, you should automate that. In Rails, I write migrations to change the database. And Rake tasks for most of the other work, which can then be called from within the migration. Rake tasks are dead-easy to write, mostly because they were designed especially for automating tasks. Every task that can be automated needs no UI, requires no manual labor and, most importantly, can be tested through and through.
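To illustrate, a sketch of such a Rake task, here filling the “teaser” field mentioned earlier. The `ARTICLES` array is a hypothetical stand-in for database rows, and the helper and the task name `content:fill_teasers` are made up for this example; in a real Rails app the task would work on the models instead.

```ruby
require "rake"

# Hypothetical stand-in for database rows; a real task would use the app's models.
ARTICLES = [
  { title: "Deploy often",
    body: "Releasing often means that you can release after each and every change.",
    teaser: nil },
  { title: "Short", body: "Tiny post.", teaser: "Hand-written teaser" }
]

# Give every article without a teaser the first `length` characters of its body.
# Returns the number of articles that were changed.
def fill_missing_teasers(articles, length: 40)
  missing = articles.select { |article| article[:teaser].nil? }
  missing.each { |article| article[:teaser] = article[:body][0, length] }
  missing.size
end

# Make the Rake DSL (task, namespace, desc) available in this plain script.
extend Rake::DSL

namespace :content do
  desc "Fill missing teasers from the article body"
  task :fill_teasers do
    changed = fill_missing_teasers(ARTICLES)
    puts "Filled #{changed} teaser(s)"
  end
end

# Run the task the same way another task or script could.
Rake::Task["content:fill_teasers"].invoke
```

Because the actual work lives in a plain method, it can be tested on its own, without a UI and without touching a production site.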

Deploy. Just do it.

I write my blog in Jekyll, where publishing a new article is a deployment. I don’t blog that often, but experiencing how simple and fast deployments are, has brought some of my deployment-fear down.

For the other systems, I use git-deploy, which ties the deployment on top of git. The setup is simple, but the deployment is ridiculously easy: git push production.

I have attempted to rewrite git-deploy for Drupal, but so far, not to satisfaction. Drush, the Drupal counterpart of Rake, is hard to get configured on each and every (different) production server out there, and it is not very scripting-friendly. But already, it lowers the barrier so far that deploying becomes fun again.

Deploy to acceptance, test or development on a daily basis. Have at least one place where you and your entire team deploy several times a day. It brings experience and makes everyone aware of the benefits of good automation of the process.

Once you start deploying five times a week, as opposed to once every two months, you will be a happier developer. Your clients will see far more progress, faster responses and all your sites will improve much faster.

What is holding you back from deploying once a day?

https://berk.es/2012/08/03/git-deploy-or-how-i-learned-to-stop-worrying-and-love-deployment

Standardising the technology stack (or CMS) is silly

Jul 1, 2012 Updated Jul 1, 2012

I have been involved in many “we want to move all our websites to one technology” projects. Mostly Drupal. Mostly where people wanted to move from a wild range of CMSes, forum-systems, blogging-tools and so on, written in various languages towards everything-in-Drupal. Smart move, you might think. Not often, I know now.

We all like car analogies, when it comes to websites, so here is one: my 1986 Volkswagen T3 camper. A great car. I know its engine (watercooled 1.6 boxer, best engine sound ever), I have gone through its electrical circuits a few times and know them pretty well now.

I know where to find and buy secondhand or deadstock replacement parts. I know the support-forums. And mostly, I can safely invest in knowing the ins-and outs of the car, because I am pretty sure I will buy Volkswagen Classic T3 Camper Vans for all my needs, forever, from now on. Or not.

Busje

Of course not! No-one in his or her right mind would buy a second such camper-van to do the daily shopping, or for the daily commute. If I needed a car for my daily commute, I would not buy a second, rather expensive, fuel-gobbling, always-something-needs-fixing van with all the camping-stuff built in, but one that suits the needs much better, like a small, cheap Japanese car.

Yet this is the argument that goes when people choose to “Always use Foo for every one of our sites from now on”.

It is different if you are a mechanic or an auto-shop: the analogy for a “webbuilder”. It makes more sense then to become an expert on one brand, year and engine-type of car. Specialists have a good place: we, the consumers and clients, will be able to find them when we need them. And I prefer a Volkswagen mechanic with good knowledge of the old-timer transporter vans to fix my car.

This goes just as well for webdevelopment. A Drupal-only-shop will be able to provide better sites for their clients, because they know all the weird little quirks in that CMS inside out. A dedicated python-developer knows much better where to find the most up-to-date resources and tools. And so on.

But as a consumer, as a client, you are much better off choosing your CMS, framework and programming language as best fits the site, service or app you want built. Rather than shoe-horning each and every website you want into, say, WordPress. I have seen it all: e-commerce platforms built on top of WordPress. WordPress turned into a JSON-webservice system. Great ideas molded into nothing, because they had to fit within what WordPress can offer.

Do you really want to sacrifice all that, simply because you chose to standardise on that one CMS?

No, what you want is the same as with your car, or cars: a good and trustworthy mechanic who helps you with your old-timer camper-van as well as your husband’s commute-car. Or at the very least, two such mechanics, one for each. You want trustworthy contractors and service-managers who can help you with whatever problem you have.

You want a good partner, or set of partners who can offer you the best tool and solution for each problem you have. And here is the other problem: if that partner is a shop who standardised on one technology, that is what they are going to offer you. There is no Ruby-on-Rails developer who is going to say, no, bad idea, you should not hire us. Rails is really not the best tool for your job. That is why you need some basic knowledge.

Again, like with the car: you walk into a Volkswagen-dealer to ask what car you need for your new building-company and you will end up with a Volkswagen. Even though a secondhand Toyota pickup is probably a much better fit. You need some basic idea of what you want and what fits. Most people who buy a car know that newer, smaller cars use less fuel. And that (in The Netherlands, at least) driving Diesel is cheaper than Gasoline. Most people also know that a €180k car is far more luxurious than a €400 Suzuki Alto. But you really should avoid shoehorning all your needs into one system. Because that one system is best in one area, you end up with a large range of websites that do not work too well and are too expensive (or too cheap); you will end up like me, commuting to work by train, because doing it with my fuel-gobbling van, which manages 8.5 kilometers on 1 liter, is just too expensive ;). You will end up using technology that is far from the best fit for a certain problem.

Only when all your sites are near 100% equal, then it makes sense to pick one and build every site with that technology.

And only if you insist on doing everything yourself, in-house, then it makes sense to standardise on one technology. Just like I would have to buy a second Volkswagen-van for my commuting, if I would insist on doing all the maintenance myself; learning the ins-and-outs of a, say, Japanese Diesel is too much for my simpleton programmer-brain.

Don’t you just hate these car-analogies in IT :) ?

https://berk.es/2012/07/02/standardising-the-technology-stack-or-cms-is-silly

Map capslock to escape in Ubuntu Linux

Jun 20, 2012 Updated Jun 20, 2012

If you are a Vim user, you probably want the escape key more at hand. On Ubuntu (and probably every GNOME 3 desktop) this is really very easy. It comes with point-and-click tools to map your key to about everything you can think of, and more. If you google this problem, you find all sorts of xmodmap CLI commands. They work too, but this is far easier for the stupid Linux-user like me.

Open System settings.
Click Keyboard Layout.
Click Options.
Under Capslock, choose whatever you want.

I chose to swap Escape and Capslock. So that if I REALLY WANT TO SHOUT AT PEOPLE I STILL CAN! :). And it forces me to relearn the escape key.

https://berk.es/2012/06/21/map-capslock-to-escape-in-ubuntu-linux

Governments must offer APIs

Jun 2, 2012 Updated Jun 2, 2012

President Obama has given all federal government agencies in the United States two months to make their information available through APIs. That shows good vision.

Governments, unfortunately, are not web builders. Everyone knows examples of terrible government websites. You regularly read about ICT projects going dramatically wrong in municipalities, national governments and the like. The average Dutch municipal civil servant, for example, seems unable to answer simple questions such as “when will they collect the waste paper” on their site. Often you first have to study the internal structure of that government organisation to find out which department, which subcontractor or which semi-governmental body, and which website, holds that information. And at the end of your quest you know more about the contracts between municipalities, waste collectors and umbrella organisations than about the waste-paper collection schedule during your summer holiday.

Governments sometimes spend millions on relatively simple websites. Huge projects with huge overhead costs, hugely complex internal functionality and the strangest internal requirements. Things that you, as an end user of such a website, at best do not notice at all, but will usually experience as ridiculously difficult navigation and an odd structure. The cause of the ridiculous costs is, accordingly, the development time. Usually such projects take months. Or sometimes even years. And by the time that nice new desktop site for the municipal service desk, the customs import-tariff overviews or the road-works information site is finished, everyone wants to read your site on a smartphone. Governments, cumbersome and slow as they are, simply cannot keep up with the web.

The mantra you have been bombarded with during the past terms of government is very much in place right here:

Leave it to the market.

That is exactly what Obama sees and makes mandatory. The government agency has to make its data available through APIs. With those, “the market”, meaning fast web developers, smart app builders or very unexpected other parties, can use the content and offer it anew. An information site about road works is nice. But that data directly in your TomTom is far more useful. Or a mashup that offers that data together with train times and parking spaces at stations is, for others, more useful still. The incomprehensible import and export tariffs of customs built directly into your e-commerce package. Tax statements directly in your accounting package. Of course Rijkswaterstaat could build (or commission) all those apps, integrations and mashups. But if they simply offer their road works through APIs (XML, SOAP, JSON), anyone can get to work with them. And a fast app builder can do it for them. That app builder can improve her AnaarBeter App, make money from it, and no tax money needs to be spent on it. Everyone wins.

In the US another important element comes into play: all data produced by governments is in the public domain. So there is no copyright on it. You may use, sell, reuse and distribute it without having to pay for it a second time. The idea is as simple as it is logical: the taxpayer, who has already paid for the making of information films, information, reports and so on, simply owns that work.

Because offering data through APIs alone is not enough; you also have to arrange by law that the information in those APIs may be used. Also in commercial applications and in forms other than the one in which you offer it.

Of course, mandatory APIs cost money too. Yet another requirement a website has to meet. The project becomes more expensive still. But after that, rapid further development no longer costs anything. Better still, in many cases a government would really need to build nothing more than an API. No sites to have designed. They could do without expensive and complicated content management systems with images and support for videos. Just a simple SOAP interface. And a PDF alongside it explaining how to use it.

So, governments, just give us your information in a reusable format, instead of on clumsy, hugely expensive websites. And leave it to the market to build a nice site around it; let Google index it and show it on their maps, TV products import it into your TV, or app builders include it in their specialist packages.

https://berk.es/2012/06/03/overheden-moeten-apis-aanbieden

3D printing barges into the copyright debate

May 30, 2012 Updated May 30, 2012

Wired has a story about Thomas Valenti. As a fan, he designed and printed figurines for Warhammer. And he shared them on Thingiverse.

And that is not allowed. The lawyers of Games Workshop, the British manufacturer of the game, had the files with which you could print the figurines taken offline. And so the “war” begins.

A manufacturer like Alessi, which makes plastic knick-knacks beautiful and so manages to sell a salt shaker for a few tens of euros apiece, will have to look for a different business model. Just like manufacturers of figurines for games, then.

Until a few years ago you had to invest heavily, make moulds, buy expensive injection-moulding machines and so on, to copy those Alessi salt shakers. Quite a number of small factories (in China) surely did and do just that, but with some effort and a small army of lawyers you can pretty much keep those knock-offs out of the local Blokker. Thousands of small factories making knock-offs is something quite different from tens of millions of people doing the same. Once, the music industry could still deal with the few professional CD copiers. Until the CD burner at home arrived, and after that Napster and its cronies.

Give it a year or a year and a half, and everyone will know someone with a 3D printer in the utility room. You can already get ready-made ones for under $500. And if you are handy and own a soldering iron, you are printing for two or three hundred euros. Less than a year ago that do-it-yourself option was the only option, and such a printer cost a couple of thousand euros in materials alone.

3D printing is still nowhere near able to compete with the familiar Chinese plastic press, which spits out rubber ducks, pens and all other plastic stuff for a few cents per kilo. And it probably never will be. Just as you have your birth announcement cards printed at the print shop, despite that colour printer at home on your desk.

But for custom work, research, art and therefore also expensive design trinkets, it is a very suitable device. Companies that run on those, or think themselves protected by patents on their trivial products, must innovate or, like the music industry, go under in the revolution the Internet has brought us.

What those business models are? Learn from the open source software makers: do not sell the product, but the services around it. Or sell extras, such as service, simplicity, warranty and further development. Or one of the other great ideas that nobody even dares dream of yet.

https://berk.es/2012/05/31/3d-printen-walst-de-auteursrechten-discussie-binnen

Leave some Ink in the Well

May 29, 2012 Updated May 29, 2012

Many writers know Hemingway’s tip:

Leave some water in the well.

From Impulse:

It’s a great idea: stop working when you’re writing your best and it’s easier to start writing next time. You leave the work excited to return. You only face the dreaded Blank Page in the middle of your writing session, fresh from a success.

I found the same goes for Coding, albeit for different reasons. And, as a coder, I created a silly little script to help me with that.

Screenshot showing where I left the ink in the well

You get disturbed.

I get disturbed a lot, during work, anyway. When finally you have mapped the entire stack of some weird application in your brain, your phone rings, alarms for some server start blinking, or you get some new twitter reply. Woosh, the effort is lost; the energy put into mapping everything in your brain: wasted.

But even more often you have planned interruptions. Such as the end of the day, lunch, or some meeting.

You have many projects intermingling.

Ideally, as a programmer, you work on one problem at a time. Lucky people work on one technology, in one environment with only one language and toolset.

Luckily I don’t, because that would bore me to death. I love working on multiple projects simultaneously. Most of us do, if the average github commitlog is any proof.

So, you are working on some problem in, say, Drupal. And suddenly your time is up, or a more urgent, say, CSS-issue needs solving. Or some server configuration needs sorting out, because backups or builds are failing.

What do you do? Commit the unfinished state? git stash it? Just leave it as it is?

You could leave some ink in the well. Using a simple @INK marker.

For example:

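def by_ranking
  # @INK: the rank attribute is not updated or filled
  #       in the database, it seems. @TODO: make a
  #       migration to add this field to the database,
  #       then an after_update hook to actually fill this value.
  # sort_by orders by the numeric rank, lowest first; a plain sort block
  # would have to return -1, 0 or 1 (via <=>), not the boolean that < gives.
  sort_by { |item| item.rank.to_f }
end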

A simple ‘@INK’ with a comment. @INK is a different marker than @FIXME or @TODO. Actually, @INK is also a TODO.

Then, whenever you pick up a project, you look for the @INK, have your aha-moment and can jump right in where you left off.

The only problem I have with this, is when you get disturbed, you often don’t have the time to dump your thoughts and mental-state into such a comment. But telling the person behind you to “wait a sec till I finish this sentence” is not too strange.

Some rules apply:

There can be only one @INK. Ever. (A project can have many @FIXME’s or @TODO’s)
@INK marked code may never be pushed to other people’s or a central repository. They are your private markers.
Whenever you open a project, you must search for the ink first. Then either remove it, replace it with a TODO or start where you left off.
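The first rule is easy to check mechanically. A minimal sketch in Ruby, assuming the marker is the literal string `@INK` and that the `.git` directory should be skipped; the helper names `find_ink` and `check_ink` are made up for this example.

```ruby
require "find"
require "tmpdir"

# Return [path, line_number, text] for every line below root that contains the marker.
def find_ink(root, marker: "@INK")
  hits = []
  Find.find(root) do |path|
    if File.directory?(path)
      Find.prune if File.basename(path) == ".git" # do not search git's own files
      next
    end
    next unless File.file?(path) && File.readable?(path)
    File.foreach(path).with_index(1) do |line, number|
      text = line.scrub # tolerate binary files and odd encodings
      hits << [path, number, text.strip] if text.include?(marker)
    end
  end
  hits
end

# Summarise the state of the ink, enforcing "there can be only one".
def check_ink(root)
  hits = find_ink(root)
  case hits.size
  when 0 then "No ink in the well."
  when 1 then "Ink at #{hits[0][0]}:#{hits[0][1]}: #{hits[0][2]}"
  else "Rule broken: #{hits.size} @INK markers found, there can be only one."
  end
end

# Demo on a throw-away directory, so nothing outside it is touched.
Dir.mktmpdir do |project|
  File.write(File.join(project, "ranking.rb"),
             "def by_ranking\n  # @INK: rank is never filled\nend\n")
  puts check_ink(project)
end
```

Calling `check_ink(".")` from a project root would do the same for a real project.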

And with a simple script, you can find your ink in the well:

#!/bin/bash
echo "-- Last four commits --"
git log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative -n4
echo "-- Git status --"
git status -s
echo "-- The INK --"
ack-grep -C 4 "@INK" --all

It gets some context with two git commands: one to render a short, pretty git log, to learn what you did last before you left the project, the other to show the changes in the repository: uncommitted, changed and removed files and such. And the last command shows you where you placed the @INK marker, with a few lines of context.

Instead of ack-grep, you can use grep if you prefer. It’s just slower and needs the additional “--recursive *” parameters.

There, it works very @INK [berkes wo may 30 15:09:51 CEST 2012]: write some catchy finishing line; the postman is ringing at the door.

https://berk.es/2012/05/30/leave-some-ink-in-the-well

DrupalJam presentation on Microframeworks

May 24, 2012 Updated May 24, 2012

For DrupalJam 2012 I was asked to submit a session proposal. It seemed fitting to look for a Drupal-related topic that is not directly about Drupal itself.

Hence: microframeworks. And how you can put them to use in a Drupal environment or project.

Update: resources

Microframeworks: Queen Drupal and her subjects

“MobileFooWizards built an iPhone app for us; could we quickly offer a JSON feed of the news items on the Drupal site, with the data as described in this email?”

Sounds like a piece of cake. Right? A bit of clicking in Views, done! But then it starts: how do you offer mobile-friendly images in that content? How do you make sure future versions of the app can read different JSON? How do you restrict access to it? Will it scale? Before you know it, it is a huge project, with all kinds of dependencies, deployments and so on.

The microframework: simplicity

A good developer is not so much someone who can solve every problem, but someone who can divide any problem into small, easily solvable problems.

Me, just now

Microframeworks are tools with which you build enormously simple, tiny web applications.

Such a web application can work perfectly together with Drupal: it can take work off Drupal’s hands, offer pre-digested information and so on. An ideal subject for your Drupal site.

By building a tiny website next to Drupal that, for example, offers one single JSON feed, you can take a lot of work off Drupal’s hands. You divide your project into independent, separate components, each of which does one thing and does it very well.
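To give an idea of how small such a component can be, here is a sketch of a one-feed application in Ruby. It uses only the bare Rack interface, which Ruby microframeworks such as Sinatra build on; that choice is mine for this illustration, not necessarily what the session shows. The `NEWS` items, the `NewsFeed` name and the `/news.json` path are hypothetical; in practice the data would come from Drupal's database or an export.

```ruby
require "json"

# Hypothetical news items, standing in for content that lives in Drupal.
NEWS = [
  { id: 1, title: "DrupalJam 2012", published: "2012-05-24" },
  { id: 2, title: "A new blog", published: "2012-05-23" }
].freeze

# A Rack-compatible application: an object that responds to call(env)
# and returns [status, headers, body], where body responds to each.
NewsFeed = lambda do |env|
  if env["REQUEST_METHOD"] == "GET" && env["PATH_INFO"] == "/news.json"
    [200, { "content-type" => "application/json" }, [JSON.generate(NEWS)]]
  else
    [404, { "content-type" => "text/plain" }, ["Not found"]]
  end
end

# Exercise the app directly, without a web server.
status, _headers, body = NewsFeed.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/news.json")
puts status
puts body.first
```

With the rack gem installed, a `config.ru` containing `run NewsFeed` should be enough to serve it; the whole application does one thing, and Drupal never has to know about the iPhone app.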

Session

In this session I show several scenarios in which a microframework together with Drupal turns out to be a golden combination.

We look at the iPhone-app problem, but also at how we can get information into Drupal. And we look at how we can connect Drupal to external information and services by placing a microframework in between.

Here and there some code will pass by in the session, but only as illustration. Of course you will have to program to be able to put a microframework to use; but in this session we look at it from a bit more of a distance.

https://berk.es/2012/05/25/drupaljam-presentatie-over-microframeworks

A new blog

May 23, 2012 Updated May 23, 2012

There. Finally. A new blog. My old blog was found on bler.webschuur.com and webschuur.com. Which is also the name of my company: webschuur.com.

Several times, I attempted to rewrite everything, but after 80% of the work was done, I found out I was doing it all wrong. And started over again.

They say the cobbler’s children go the worst shod.

My blogs bler and webschuur.com, respectively my personal Dutch blog and my English business blog, had to be merged. The CMS powering them, Drupal, needed a complete rebuild. And it really needed a redesign. Oh. And I had a much cooler domain name: berk.es.

All content has been migrated, but not everything has been repaired: there are thousands of articles dating back to 2001 in the database. In all kinds of formats, with all sorts of extra content, and many articles had been broken for years. In the coming weeks, I will have to repair them one at a time. By hand.

And the comments are not yet migrated. First, I need to weed out all spam and then migrate everything to Disqus. A hell of a job.

Jekyll

This blog runs on Jekyll. Jekyll is super simple: it uses a text file for each article and generates a site out of all these files. Which you then upload. This sounds old-fashioned, but is actually super effective: no CMS, no database, no complex server software, no security upgrades, no possible intrusions in your CMS, and so on. You cannot get a site faster than this, a safer CMS does not exist even in theory, and the simplicity is unimaginable. And hosting it on a professional environment is cheap (or free).

That is, if you think editing text files is easy.

More technically: you write the text in HTML or Markdown. Which is then converted to clean HTML. You manage all text files in a revision control system (git, in my case) and that revision control provides the deployment: it generates and uploads the site.
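To make “a text file for each article” concrete, here is a small sketch that writes such a file. It follows Jekyll's conventions as I understand them: posts live in a `_posts` directory, are named `YYYY-MM-DD-title.md` and start with YAML front matter between `---` lines. The helper names and the `layout: post` value are assumptions for this example; a site may name its layouts differently.

```ruby
require "date"
require "fileutils"
require "tmpdir"
require "yaml"

# Build the file name for a post: YYYY-MM-DD-slug.md
def post_filename(title, date)
  slug = title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  "#{date.strftime('%Y-%m-%d')}-#{slug}.md"
end

# Write a Markdown post with YAML front matter into site_dir/_posts; return its path.
def write_post(site_dir, title, date, body)
  posts_dir = File.join(site_dir, "_posts")
  FileUtils.mkdir_p(posts_dir)
  path = File.join(posts_dir, post_filename(title, date))
  # Hash#to_yaml already starts with "---\n"; only the closing line is added here.
  front_matter = { "layout" => "post", "title" => title }.to_yaml
  File.write(path, "#{front_matter}---\n\n#{body}\n")
  path
end

# Demo in a throw-away directory.
Dir.mktmpdir do |site|
  path = write_post(site, "A new blog", Date.new(2012, 5, 24), "There. Finally. A new blog.")
  puts File.basename(path)
  puts File.read(path)
end
```

From there on it is just text files in git: commit, push, and let the deployment generate and upload the site.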

Why not Drupal?

Both sites were running Drupal. Both were FUBAR: a total loss. Upgrading was not possible (anymore) and troubleshooting or bugfixing only yielded more problems:

Spam: I have tried all spam solutions for Drupal, but with sometimes millions (!) of spam posts per day there is always some slipping through. 1% of 1 million is still 10,000. Sometimes, after not looking at my site for over a week, I found hundreds of thousands of spam posts that had still seeped through. It is a self-reinforcing effect: published spam attracts spammers. Solution: a new comment system, Disqus. This requires a difficult migration.
Old modules, old content: Over time I have updated Drupal hundreds of times and upgraded seven times. Every time some minor thing broke: there was no upgrade for a certain module, a table did not update completely right, and so on. The result is a broken database, lots of broken content and lint. The solution is a complete rebuild, and a full export and import of all the old articles.
Drupal has become extremely big and bloated over time. Too heavy for a small little blog. You notice this very well when you get a million hits from spammers. But also when the site is briefly busier (e.g. a post on Reddit). My tiny server can power five small Drupal sites. Really, no more. That is ridiculous: a dedicated VPS for five tiny sites. An upgrade to Drupal 7 of all five sites would mean that I have to order a larger or a second VPS: ridiculous; an extra two-hundred-something dollars per year for just five tiny sites. Or that I need to start fiddling with proxies and memcache. For five small sites: ridiculous. Solution: continue with the old, yet better-performing Drupal, or find another CMS.

I also had some small, simple requirements for a new blog:

HTML5 (and CSS3) for the layout.
Mobile Friendly.
Clean HTML.
No / minimal administration and security updates.
Bilingual content.
Cheap hosting.

Drupal7 can, with lots of effort, serve somewhat clean, responsive HTML(5) and CSS3. But after implementing 80% of my design (and development of an entire theme-engine just for this), I decided that this is nonsense. So I decided to look for something better, that gives me full control over the layout.

Rails?

I had converted my entire site to Ruby on Rails. The content migration was finished, a pretty secure commenting system was done. And it was full of fun gadgets (like a Twitter, Reddit and Facebook scraper that copies the comments posted there onto my blog). And it even performed nearly as well as Drupal 7. With some tweaking even better than Drupal 5. Just finish the last 20% and release it.

Then I took a few steps back and looked at it from a distance: a rather large, self-written CMS on Rails, just to publish a super simple blog. I must be mad: even that last 20% probably costs more work than a few evenings of Jekyll hacking.

That is why: Jekyll.

And now some more blogging.

https://berk.es/2012/05/24/a-new-blog

A new weblog

May 23, 2012 Updated May 23, 2012

There. Finally. A new weblog. My old blog lived at bler.webschuur.com and at webschuur.com. That is also the name of my company: webschuur.com.

Several times I tried to overhaul everything, only to find out after 80% of the work that I was going about it the wrong way.

It is at the plumber's house that the tap leaks.

My blogs bler and webschuur.com, my personal Dutch blog and my business English blog respectively, had to be merged. What they ran on, Drupal, had to be completely renewed. And the design badly needed an overhaul. Oh. And I had a much cooler domain: berk.es.

All content has been migrated, but not everything has been repaired yet: there are thousands of articles in the database, going back to 2001. In all kinds of formats, with all kinds of extra content, and many articles turned out to have been broken for years. That has to be fixed by hand. That is a job for the coming weeks. The comments have not been migrated yet. The spam has to be filtered out completely first, and then everything has to be converted and migrated to Disqus. A hellish job, also for the coming weeks.

Jekyll

This blog runs on Jekyll. Jekyll is super simple: it uses one small text file per article and generates a site from all these files. You then upload that site. It sounds old-fashioned, but it is above all super effective: no CMS, no database, no complex server software, no security upgrades, no possible break-ins into your CMS, and so on. You cannot get a site faster than this; a CMS more secure than this does not exist, even in theory; and the simplicity is unimaginable. And hosting is as good as free, even in a professional environment.

At least, if you find editing text files easy. A bit more technical: you write the text in HTML or in Markdown. That is then converted to clean HTML. You manage all the text files with a revision control system (git, in my case), and that revision control system also takes care of the deployment: generating and uploading the site.

Why not Drupal?

Both sites ran Drupal. Both were FUBAR: a total loss. Upgrading was not possible (anymore) and solving problems only produced more problems:

Spam: I have tried every spam solution for Drupal, but with sometimes millions (!) of spam posts per day, a few always slip through. 1% of 1 million is still 10,000. If I did not look for a week, I had hundreds of thousands of spam posts that had seeped through anyway. It is a self-reinforcing effect, because published spam attracts spammers. Solution: a new comment system, Disqus. That requires a difficult migration.
Old modules, old content: Over time I have updated Drupal hundreds of times and upgraded it something like seven times. Something small always went wrong, or there was no upgrade for a module. The result is a broken database, a lot of broken content and an enormous amount of leftover rubble. The solution is a complete rebuild. And a complete export and import of all old articles.
Drupal has become enormously heavy. Far too heavy for a small blog. You notice that especially when you get a million hits from spammers. But you also notice it when it is briefly a bit busier. My little server can handle five small Drupal sites. Really, no more. That is ridiculous: one dedicated VPS for five tiny sites. Upgrading all five to Drupal 7 would mean that I have to order a larger or a second VPS: ridiculous. Or that I have to start working with proxies, memcache and so on. For five small sites: ridiculous. Solution: stay with the old, better-performing Drupal, or pick another CMS.

Besides, I had some small, simple requirements for a new blog:

HTML5 (and CSS3) for the layout.
Mobile friendly.
Clean HTML.
No/hardly any administration and security updates.
Bilingual content.

Drupal 7 can, with a lot of pain and effort, serve somewhat clean HTML5. But after implementing 80% of my design (and writing a whole new theme engine for it) I decided that this was nonsense. I was better off looking for something that gave me more control over the layout and such.

Rails?

I rebuilt my entire site in Ruby on Rails. The content migration was done, and a nice spam-proof commenting system was finished. And it was full of fun gadgets (such as a Twitter, Reddit and Facebook scraper that places the comments posted there on my blog). And it performed almost as well as Drupal 7. With some tweaking even better. Just finish the last 20% and done.

Until I took a few steps back and had another good look at the project: a fairly large, self-written CMS, on Rails, to publish a super simple little blog. I must be crazy: even that last 20% is probably more work than a bit of hacking on Jekyll.

Hence: Jekyll.

And now, blogging a bit more often again.

https://berk.es/2012/05/24/een-nieuw-weblog

Ultimaker 3D printer.

Apr 1, 2012 Updated Apr 1, 2012

Yes! It has arrived, it works and I am overjoyed: my Ultimaker 3D printer.

[Image: Ultimaker printing black lines of plastic as bottom]

For those who do not know the concept at all: these are self-build devices with which you can print (mainly) plastic things. Actual, physical things. I am convinced that we are at the beginning of a second revolution, and I obviously very much want to be part of it.

Hence this purchase. After a long search, the Ultimaker mentioned above turned out to be the best buy, mainly because of import duties and my very limited experience with the subject matter. Better known is the American Makerbot, but all in all it is more expensive to get to the Netherlands than our own Ultimaker. The cheapest, most beautiful, but also most daring variant is the Mendel, from the series of true self-build printers. Daring, because you mainly buy loose parts (pipes, screws, wires, motors and so on) and then print the printer itself!

Anyway. My Ultimaker prints. The weakest part so far is the software with which you control and convert everything; it turns out to be quite error-prone, has version problems, and basically just does not work well. Fortunately that is exactly where I have a fair amount of knowledge, so there is still a task for me there.

I do not have a concrete plan yet. Plenty of ideas to investigate, try out and play with. Although I am mostly looking for other people's ideas. For example, connecting the printer to e-commerce software. Or directly printing the packaging, with shipping labels, around the “product”. Maybe setting up software with which you can print custom work, such as working names or logos into the 3D model directly from a webshop. And windmills. Fucking self-printable windmills. Of course. What I am mostly looking for, however, is even more ideas. And by that I do not immediately mean an idea for yet another nice bottle opener (that is welcome too, of course), but above all broader, bigger! Niche markets, clever business models, handy applications; above all things that are suddenly possible thanks to this 3D printer revolution, but that were not possible before, were not accessible, or were simply unaffordable.

For all these ideas my printer (and my help, and thus my time) is at your disposal too. For now simply at my home, but if there is some interest I will look for a somewhat more public space. So send me your idea, or come and bring it over. Or just come and play :)

https://berk.es/2012/04/02/ultimaker-3d-printer

Several reasons why I prefer Github over Drupal for Hosting my Drupal Projects.

Mar 22, 2012 Updated Mar 22, 2012

Why I prefer Github over Drupal, a crosspost from an issue on Drupal.org.

I do more than “only Drupal”. Github allows me to maintain all these non-Drupal modules in the same environment. This is just another angle on the “but it’s good to have everything centralised” argument: it is now centralised for me, the developer; arguably the most important person in a project.
Collaboration is far easier on Github through its forking and pulling mechanism. No fiddling with patches, continuously re-rolling them and so forth. The entire experience is simply a lot better worked out on Github. “It’s all about the details”^1.
I firmly believe that Drupal should ditch project hosting entirely and leave it to the community members to choose where they host: on their own (companies’) servers, on Github, Bitbucket, Launchpad, whatever floats your boat. And no, that does not rule out central places to find your modules. In my belief, when there is competition between hosters and contrib search engines, they will be a lot better than what we have now. ^2
I care for developers, not users. Users give me little in return (other than high usage rates and self-esteem). Developers are my main target, for they have the tools and skills to help improve my work. Their “payment” comes in patches, bugfixes, performance improvements, refactorings and so on. For me, that is the most valuable payment. Obviously, most developers are users themselves. And many users can become developers. But in the end, I choose a project-management environment aimed at developers, because they are my main target audience, because in my Open Source projects they “pay” best. And so, I prefer to lower the barrier to making such a payment.

In the end, it should be best for all of us.

Obviously, Github lacks a few things, most of which are easily solvable due to the distributed nature of Git. Depending on how many Drupal projects I will continue to maintain (I am evaluating that right now), I might release some of my tools which help me here.

Drupal automatically builds releases. You now need to push to two remotes if you wish nightly builds (the -dev version).
Update and security infrastructure is built around hosting and maintaining on Drupal.org entirely. You host elsewhere? You won’t be able to push out new security releases to your users.
Drupal has a really strange workflow and branching model (annoyingly incompatible with de-facto standards). Aliasing and simply ignoring most Drupal standards helps a lot.
BUT IF IT IS NOT ON DRUPLA.OGR IT IS NOT OFFICIAL!!11onone. Sadly a module has to be released on Drupal.org to be taken into consideration in most projects. Personally, I find that small-minded, since there are great projects that are not on Drupal.org. But facing the facts: a module has to be on Drupal.org, so if you host elsewhere, you still need to host on Drupal.org too. Meaning two environments, twice the fiddling and thrice the amount of description/readme/changelog copy-pasting. By hosting on Drupal.org alone, you avoid most of this.

1: The most hilarious and sad example is that when I decided to move my tickets over from Drupal.org to Github, I closed the ticket feature on Tagadelic; you should no longer be able to post tickets there. But due to some bug, this feature does not work. So now I have to keep replying to tickets on Drupal.org, telling people the tickets are closed there. Sigh.

2: It will also solve another ridiculous problem: that of “too many modules”. Right now, the solution to this is to hold back module development! Hah! Because the mechanism to find The Best Contribs is broken, you simply say: we will stall the creation of new contribs, because then people can find the oldest ones best.

Edit: due to excessive spamming (my server almost crashed, receiving over a million! spam POSTS per day) comments can be posted over at Reddit. I will reply there.

https://berk.es/2012/03/23/several-reasons-why-i-prefer-github-over-drupal-for-hosting-my-drupal-projects

Why not to use -dev versions of Drupal-modules.

Mar 22, 2012 Updated Mar 22, 2012

Cross-post from a Reddit thread. Comments are most welcome there.

Let us assume that a release is some form of agreement between the developers and their users. Usually a release indicates that:

The releasable version is considered to be in a certain state. A state that can be communicated to the users (stable, beta, alpha, security fix, et cetera).
The release indicates an immutable point in time and development. Even twenty years from now, you can rewind to release XYZ and find it in the exact same state.
A release is typically kept forever. Unless, of course, the entire project is removed, that release will exist, with its exactly predictable (and often documented) bugs, shortcomings and other incompatibilities. In most complex software projects (and your Drupal core + 30 contribs is such a project) you will always choose predictability over newness.
A release often runs through test cycles. This will be stated in the project's documentation. Most often through a simple “we are beta, please install and report back”, but sometimes through entire Continuous Integration cycles.
Documentation, readmes and third-party dependencies are most often developed in parallel. A release is a point where they are all brought together and synchronised. This can, indeed, easily mean that a -dev version has fewer bugs than a release. In most situations that is very logical: you make a release. It contains 6 bugs; 3 are solved, which is not enough for a new release. Now the -dev version contains 3 bugs and the released version 6. To many people this is an indication that the -dev version is “better” than the released version.

Dev versions may be (and often are) end-of-line versions. I have, for example, worked on a fully OOP replacement for Tagadelic that is entirely backwards compatible for users. (It is on hold, mostly due to my lessening Drupal involvement.) If it is released, there will be an upgrade path from the various releases to the new version. But not from each and every nightly-build -dev. This is part of that “agreement”.

Dev versions might stop working from one day to another. Often large refactorings mean that features have to be pulled out for a few commits, or that entire subsets stop working. A rewrite will break compatibility with other modules for a while, at least. So even if it works now, you simply never know if it will tomorrow.

Dev versions are aimed at developers. So all the nice tools that lower the barrier for the larger public, such as installers, integration, end-user documentation and so on, are often neglected during this period. “It does not work -what does not work? -it shows nothing on the installer? -what does the debug-log show you? -the what, I don’t know how to look at the log”. Such tickets and emails are all too common and are utter time-wasters. If you don’t know how to read and debug code, then a -dev version, being for developers, is not for you. You may try it, but you certainly should not expect a smooth ride. And you should definitely not consume precious developer time by filing already known or duplicate bugs :).

A few more practical reasons not to use -dev versions are:

You have not pinned a point in time and development: there is nothing more time-consuming and infuriating than attempting to find the exact time and date for some arbitrary -dev module in some arbitrary three-year-old Drupal site that is handed to you.
You cannot be certain of proper database updates. They are (or should be) guaranteed between releases, but not between -dev versions. Chances are that some untested database change between -dev versions spoils your entire database. And if it can happen, it will happen.
The state is uncertain: whereas a release has known issues and bugs, a -dev version is in a state of brokenness (or else it could be a release, now, couldn’t it?) which is entirely unknown. Installing a -dev version is a little like trial-and-error debugging and coding: it might get you a working environment. But if you have no idea why it works, you are probably worse off than when you had a non-working environment but knew exactly why it did not work.
The userbase is small and spread out very thin. There is only one Latest Release, making it the most used version. But new -dev versions are rolled out every night, making the number of users who run the exact same version as you very small. This fact is actually the whole concept behind the “agreement” thing.

TL;DR You know exactly what you get when you pick a released version. You never know what you get or where it may take you, when you pick a -dev.

https://berk.es/2012/03/23/why-not-to-use-dev-versions-of-drupal-modules

How the Code Consultant Bot taught me about a potential solution to Drupal's deployment horrors.

Mar 14, 2012 Updated Mar 14, 2012

I just learned about the Code Consultant Bot, a chat-bot version of the Rubber Duck.

The gist is that by explaining your question to someone (or on Stackoverflow, or to a Duck, or a chat-bot), you think about the question in more abstract, simpler or different ways. It very often solves my problem.

I took the bot out for a test-drive. Posing my favorite problem to it: How to properly deploy Drupal without downtime and tedious manual configuration (answer: you cannot). Here is the transcript:

berkes@jabber.org 11:20:28
Hello	

code-consultant@appspot.com 11:20:32
Hello, I'm your Code Consulatant
Ever notice how just talking through an issue with a co-worker is enough to help you solve a problem, even if they don't say a word? 11:20:32
Let me help you with your problems so they can keep working. 11:20:33
How do you do.  Please state your problem. 11:20:33
 
berkes@jabber.org 11:20:49
I want to deploy Drupal without downtime.	

code-consultant@appspot.com 11:20:49
What would it mean to you if you got to deploy drupal without downtime  ?	

berkes@jabber.org 11:21:10
then I would not need to take the site offline while rolling out a new release	

code-consultant@appspot.com 11:21:10
You say then you would not need to take the site offline while rolling out a new release  ?	

berkes@jabber.org 11:21:18
yes	

code-consultant@appspot.com 11:21:18
You seem to be quite positive.	

berkes@jabber.org 11:21:43
how would I release a drupalsite without taking the site offline?	

code-consultant@appspot.com 11:21:44
Why do you ask ?	

berkes@jabber.org 11:22:26
because Drupal stores its configuration in the database. The same database that contains the content.	

code-consultant@appspot.com 11:23:09
What database are you using ?	

berkes@jabber.org 11:23:16
MySQL	

code-consultant@appspot.com 11:23:16
You're going to have to tell me more.	

berkes@jabber.org 11:23:32
MySQL relational database	

code-consultant@appspot.com 11:23:32
Is your database well normalized ?	

berkes@jabber.org 11:23:37
yes, it is.	

code-consultant@appspot.com 11:23:37
You are sure.	

berkes@jabber.org 11:23:55
Well, no. You are right, that is part of the proble,	

code-consultant@appspot.com 11:23:56
Are you saying no just to be negative?	

berkes@jabber.org 11:24:27
how would I normalise Drupals database?	

code-consultant@appspot.com 11:24:27
Liquibase would help you manage your database better.

LiquiBase. I am not sure about the practical application of this tool, and if it will fit well into a Drupal workflow. But from the description, it tries to solve exactly the Drupal Deployment Problem.

At the very least, it helps a lot:

Rollback database to “tag”. Whoa. If your deployment fails, whether or not you did that with any of the tools such as hook_update_N, Features, or whatever, there is still no rollback. Yet. Here it is.
Database diff changelog generation. Especially useful on your development machine: it allows you to record all the changes you made while developing the new release for your site. Even if such a changelog only helps you to write a release script manually, it is very helpful. Did you never forget to set that one checkbox on live, because you forgot you had it checked on your development machine? Causing a Giant Shipment of Fail?

Funny. To learn about a new tool from a bot. What is next? Bots solving our actual problems for us?

https://berk.es/2012/03/15/how-the-code-consultant-bot-tought-me-about-a-potential-solution-to-drupals-deployment-horrors

A better way to variable_get() and t() in Drupal.

Feb 29, 2012 Updated Feb 29, 2012

When programming in Drupal, repeating default values in variable_get() and repeating strings in translations, all over the place, is a very strong code smell.

I have been playing with solutions for this, and during my last project decided to take these attempts and make it into a very simple system. A pattern.

But, first, let us identify the problems.

Persistent variables

$html .= "Showing ". variable_get("mymodule_amount", 20) ." items";
$html .= pager_query("SELECT * FROM {mymodule_items}", variable_get("mymodule_amount", 20));
if ($total > variable_get("mymodule_amount", 20)) {
  $html .= "there are more";
}

Not only is the magic number 20 all over the place, the whole call is a DRY violation all over the place. In the above example, that DRY violation is not very visible yet, but imagine a module called project_magician_message_center:

variable_get("project_magician_message_center_amount_for_". $node->type, 20);
variable_get("project_magician_message_center_request_limit", 20);

Just open up the variables table of your average larger Drupal project and look around. The horror! (And maybe you have been bitten by the length limit of 128 characters?) There is no pattern; just a list of unpredictable names.

The magic number problem often gets solved by Drupal developers with constants. But as the name suggests, a constant is constant. And a variable is variable. It is very confusing to read this:

define("MYMODULE_AMOUNT", 20);
$items = pager_query("SELECT * FROM {mymodule_items}", variable_get("mymodule_amount", MYMODULE_AMOUNT));

Especially when you clearly get 30 items in some list. Which is what happens when a variable gets another value. Suddenly the constant is no longer used; it acts like a variable. Naming your constants MYMODULE_AMOUNT_DEFAULT is slightly better, but no real solution.

Translations, screentexts.

Translations, through t() act even worse. Some examples:

t("Hello World, today is %date");
t("Hello world, today is %date"); # note the intentional, erroneous lowercase "world".

$actor = "Marsellus";
$subject = "Antwone";
t("Look, just because I don't be givin' no man a foot massage don't make it right for %actor to throw %subject into a glass motherfuckin' house, fuckin' up the way the nigger talks. Motherfucker do that shit to me, he better paralyze my ass, 'cause I'll kill the motherfucker, know what I'm sayin'?", array("%actor" => $actor, "%subject" => $subject));

$message = <<<MESS
Well, the way they make shows is, they make one show. That show's called a pilot.

Then they show that show to the people who make shows, and on the strength of that one show they decide if they're going to make more shows. Some pilots get picked and become television programs.

Some don't, become nothing. She starred in one of the ones that became nothing.
MESS;
t($message);

t(_mymodule_message_contents());

The first and foremost problem with this is that it is not prefixed; namespaced, if you will. Your t("Submit") is the same as that other t("Submit"). Translate this once to “Create new” and suddenly all sorts of labels, tabs, titles and links show the text “Create new”. We have all been there; just admit it already.

But the first two examples pose an even greater problem: too many such sentences are very much alike. Strings like “A new %type was created” show up next to “New %type created!”. Especially when there are many modules, built over time, by many different developers.

Then the larger texts become an even bigger issue: they range from plain ugly to cluttering and convoluted.

Mixing screen texts and logic, which is what we all do, is arguably as bad as mixing code and presentation.
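The collision is easy to reproduce outside Drupal. Below is a minimal sketch, assuming a hypothetical stand-in for t() (called fake_t() here, not a Drupal function) that uses the source string as its only lookup key, which is roughly how the shared “Submit” ends up overridden everywhere:

```php
<?php
// Hypothetical stand-in for Drupal's t(): the source string is the only
// lookup key, so every caller that passes "Submit" shares one entry.
function fake_t($string, array $overrides) {
  return isset($overrides[$string]) ? $overrides[$string] : $string;
}

// One override, intended only for the node form of a single module.
$overrides = array("Submit" => "Create new");

$node_form_button   = fake_t("Submit", $overrides);
$search_form_button = fake_t("Submit", $overrides);

// Both buttons changed, although only the first one was meant to.
echo $node_form_button, "\n";
echo $search_form_button, "\n";

// With prefixed symbols as keys, the override stays local to one module.
$prefixed = array("mymodule.submit" => "Create new");
$mymodule_button = fake_t("mymodule.submit", $prefixed);
$search_button   = fake_t("search.submit", $prefixed);
echo $mymodule_button, "\n";
echo $search_button, "\n";
```

The prefixed keys (mymodule.submit, search.submit) are made up for this illustration; the untranslated fallback simply echoes the symbol back.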

Solution

Imagine you could say:

"Showing ". v("amount");
"Showing ". v("core.amount_per_page");

t("core.hello_world");
t("ecommerce.payment.thank_you");
t("core.thank_you");
t("footmassage", array("%actor" => $actor, "%subject" => $subject));

Where all your defaults are nicely set in a central place, and your screen texts are in a single place, or even a single file. And everything gets prefixed with your module name, unless you define it differently.

The solution is OOP. Just a little, don’t fret, and nicely tucked away so that you won’t suddenly need to program everything in OOP. First a generic class, which we will build upon in our modules.

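class DrupalHelper {
  protected $prefix = "core";

  # Returns the persistent variable; the default is the property with the
  # same name on the helper, if there is one.
  public function v($symbol) {
    $default = isset($this->$symbol) ? $this->$symbol : NULL;
    return variable_get($this->absolute_or_prefix($symbol), $default);
  }

  # Looks up the screen text for a symbol and hands it to Drupal's t().
  public function t($symbol, $params = array()) {
    $symbol = $this->absolute_or_prefix($symbol);
    $function = $this->symbol_to_function($symbol);

    if (method_exists($this, $function)) {
      $untranslated = $this->$function();
    }
    else {
      $untranslated = $symbol;
    }

    return t($untranslated, $params);
  }

  # Prefixes the symbol, unless it already carries a prefix ("core.foo").
  private function absolute_or_prefix($symbol) {
    if (!strstr($symbol, ".")) {
      $symbol = $this->prefix .".". $symbol;
    }
    return $symbol;
  }

  # Maps "mymodule.hello_world" onto the method name "t_mymodule_hello_world".
  private function symbol_to_function($symbol) {
    return "t_". preg_replace("/\./", "_", $symbol);
  }
}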

Within the module, I inherit from this helper:

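class MyModuleHelper extends DrupalHelper {
  protected $prefix = "mymodule";

  #defaults:
  public $length = 20;

  #translations: protected, so that DrupalHelper::t() is allowed to call them.
  #The method name is "t_" plus the prefixed symbol, with dots as underscores.
  protected function t_mymodule_hello_world() {
    return "Hello World";
  }
}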

In your module, you use this as follows:

# Uses the Drupal 6/7 signature of hook_form_alter().
function mymodule_form_alter(&$form, &$form_state, $form_id) {
  if ($form_id == "foo") {
    $helper = new MyModuleHelper();

    $form["field"] = array(
      "#type"  => "textfield",
      "#title" => $helper->t("hello_world"),
      "#size"  => $helper->v("length")
    );
  }
}

The usage-example does not load the library files, but delegating code to separate files is not hard in Drupal, with helpers like module_load_include(). This example assumes the file is already loaded, or that some autoloader is in place. This example-code does not yet handle the variable_del and variable_set functionality for variables, but that is left to the reader to implement.

Also note that I have simplified the code a little for readability. Like leaving out the variable_set and the very much simplified symbol_to_function().
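For the part that is left to the reader, here is a minimal sketch of how setting and deleting could follow the same prefixing rule. It assumes in-memory stand-ins for variable_get(), variable_set() and variable_del() so that it runs outside Drupal (inside Drupal those three already exist and must not be redefined); the class name SketchHelper and the methods set(), del() and name() are hypothetical.

```php
<?php
// In-memory stand-ins for Drupal's variable API, only for running this
// sketch standalone. Do not define these inside a real Drupal site.
$GLOBALS["sketch_variables"] = array();

function variable_get($name, $default = NULL) {
  return array_key_exists($name, $GLOBALS["sketch_variables"])
    ? $GLOBALS["sketch_variables"][$name]
    : $default;
}

function variable_set($name, $value) {
  $GLOBALS["sketch_variables"][$name] = $value;
}

function variable_del($name) {
  unset($GLOBALS["sketch_variables"][$name]);
}

// Hypothetical helper: reading, writing and deleting share one naming rule.
class SketchHelper {
  protected $prefix = "mymodule";

  // Default for the "length" variable.
  public $length = 20;

  public function v($symbol) {
    $default = isset($this->$symbol) ? $this->$symbol : NULL;
    return variable_get($this->name($symbol), $default);
  }

  public function set($symbol, $value) {
    variable_set($this->name($symbol), $value);
  }

  public function del($symbol) {
    variable_del($this->name($symbol));
  }

  // Prefix the symbol unless it already contains a dot.
  private function name($symbol) {
    return strpos($symbol, ".") !== FALSE ? $symbol : $this->prefix .".". $symbol;
  }
}

$helper = new SketchHelper();
$before = $helper->v("length");  // falls back to the default property
$helper->set("length", 30);
$stored = $helper->v("length");  // reads the stored value
$helper->del("length");
$after = $helper->v("length");   // back to the default property
echo $before, " ", $stored, " ", $after, "\n";
```

Because all three methods go through name(), a variable can never be written under one name and read under another.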

Some other todo’s on my list are:

Introduce a fallback for core strings: we now have to call either $helper->t("foo") for our symbol-based translations, or t("foo") for core or third-party module strings. Core messages need to be callable with symbols too.
Allow passing variables into t() instead of a keyed array. Like t("footmassage", $actor, $subject); parsing and cleaning should use sane defaults but would need to be overridable.
A format_plural implementation. I hardly ever need it, but it should be callable like plural("footmassage", $actor, $subject, $count);
Make it easier to place all screen texts in a separate file.
More consistency. Maybe defaults for variables should be defined just like texts, with a private v_var_name() function.
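The second item on that list could start out as something like the sketch below. It is only an illustration under assumptions: positional_params() is a hypothetical function, it only recognises %placeholders, and it pairs values with placeholders in order of first appearance. The overridable cleaning that the item asks for is not covered.

```php
<?php
// Hypothetical helper, not part of Drupal: pair positional values with the
// %placeholders of a source string, in order of first appearance.
function positional_params($string, array $values) {
  preg_match_all('/%[a-zA-Z_]+/', $string, $matches);
  // A placeholder that is used twice should only consume one value.
  $names = array_values(array_unique($matches[0]));

  $params = array();
  foreach ($names as $index => $name) {
    if (array_key_exists($index, $values)) {
      $params[$name] = $values[$index];
    }
  }
  return $params;
}

$source = "%actor throws %subject into a glass house, says %actor";
$params = positional_params($source, array("Marsellus", "Antwone"));
print_r($params);
```

The resulting keyed array can then be handed to t() as usual, so the existing call signature keeps working next to the positional one.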

A real simple pattern, which requires a little understanding of OOP, but has almost only benefits in usage. And as far as I can see only one downside: it is “Un-Drupal-ish”; but that is not a reason, in itself.

comments on Reddit

https://berk.es/2012/03/01/a-better-way-to-variable_get-and-t-in-drupal

What you can and cannot do with Drupal: some rules of thumb

Jan 10, 2012 Updated Jan 10, 2012

Recently I again received two emails containing the Golden Question: “When should I use Drupal, and when not?” The sender of the latter gave me permission to work the question and my answers into this blog post; anonymously, of course.

…

I have been working with X [a well-known other CMS, or “our own system”; BK] for some years now, and I know that system reasonably well by now.

Because of questions from clients/dissatisfaction with its further development/a search for more flexibility, I started looking at Drupal.

I am therefore thinking of switching, but for that I would like to know what cannot be done with Drupal.

Whoever knows that, and can apply it, is sitting on gold and can easily charge rates above $300/hour. In other words: nobody knows.

I am therefore looking for the concrete restrictions/limitations of Drupal.

There are none!

For every restriction you bring up, someone else brings up a solution. And every limitation that is described somewhere is described by someone else either as a deliberate choice (Y is not possible, but that is actually a good thing; you should not want Y at all), or as something that an extension, module or trick can circumvent with some work. Usually you find both.

But I do use a few rules of thumb myself, unfortunately not very concrete ones:

Drupal is a CMS, not a framework or a self-build kit.

Drupal is a CMS, not a framework (pdf). Unfortunately “framework” is a vague term, so here are a few propositions:

A framework makes no assumptions about the behaviour, the look and the feel of the end product to be built.

A framework has a clear target audience: the builders of applications, such as websites (not necessarily programmers).

A framework offers a technical foundation and infrastructure.

A framework offers a technical infrastructure that makes building and/or programming more efficient.

Drupal does not really meet these; it is not only opinionated about how you should develop, it is above all opinionated about how the built result will work and look.

Compared to a CMS like Joomla!, Drupal does meet them more, and is therefore more of a framework than Joomla!. But compared to CodeIgniter, Symfony, Rails or Django it is much less of a framework. Drupal then falls much more in the category of Joomla!, Typo3 and WordPress than in that of Symfony or Django.

Because a CMS is already fully functional (after installation you can get started right away; it is a working site), it has fixed “ways”.

After all, after installation you have a working CMS. How that CMS approaches your content, what workflow it has determined, how menu systems are set up and what the look and feel is, are all fixed in the base of this system.

So Drupal has its own way. And so you have to arrange your project management, wireframes, designs and workflows according to how Drupal “wants” it. Not the other way around.

If you want to make a CMS, Drupal, behave exactly as laid down in your functional or technical designs, you have to develop twice as much and you end up with a system that is three times as complex. Core does it way A. Extension X is developed to undo way A. Extension Y is developed to implement way B.

If you are in a position to create the technical designs, functional designs, wireframes and designs with knowledge of Drupal’s “own way”, you will mostly be able to work with Drupal, instead of having to work against Drupal.

Database-oriented, no abstraction.

Drupal is built very close to the database. In theory this has improved in Drupal 7; practice has yet to show whether it is a real improvement. Unfortunately much is still unknown about Drupal 7 and few case studies can be found.

In practice, therefore, everything has to be stored in a MySQL database determined by Drupal, according to a database design determined by Drupal. If you want to pull information from elsewhere, or store it elsewhere (legacy databases, self-defined database structures, web services, NoSQL, XML files, etc.), you will have to set aside a large part of your budget/time for complex synchronisations, cron scripts and various hooks. A central API, a layer or even a recommended design pattern is entirely missing.

You will also have to be clear up front that such integrations therefore, in practice, always end up as one big duct-tape-and-paperclips system: it works, but it is far from stable and far from clear. With the accompanying operational risks.

Do you have to build integrations with external systems, for example to pull content from them or to push data into them? Then Drupal is most likely mainly in the way. In that case, carefully weigh whether this extra investment and complexity is worth the advantages of Drupal.

A module for everything versus the danger of unmanageable complexity.

There is a module for that.

For a moderately complex site you quickly need over fifty modules. To rebuild, for example, the functionality that WordPress ships with by default in a Drupal blog site, you can count on thirty modules or more. That is an enormous payload that also has to be managed, updated and configured. Not to mention that a larger number of modules almost always has a negative effect on performance. Take this into account when estimating the running costs: a larger server, a time-consuming upgrade, update and maintenance procedure, and increasing complexity in (further) development.

Of course the correct solution is simply “not wanting to rebuild an exact WordPress”. What Drupal ships with by default is already enough to start blogging.

Even though this is a well-known rule of thumb, the majority of the Drupal sites I have had insight into have many more than those fifty modules. Closer to 100 modules than to 10.

Here too it comes down to making essential choices: you can blog perfectly well with a Drupal without a single extra module. Only when you start making all kinds of demands on your workflow do you need modules. Only when you want all kinds of bells and whistles do you have to integrate, style and further develop a few dozen modules.

Are you in a position to make structural and functional choices? Then you can steer the functional design towards how Drupal “does and can do things by default”. Then few extra modules are needed, and the project becomes much clearer and easier to manage.

The enormous grey area called theming.

A theme is, in theory, not much work, but in practice usually the biggest job in building a site.

Much depends on how you organise, for your own project, the enormous grey area that Drupal has between the “View” (the actual theme files, the code), the configuration (setting up content types, settings, views, panels, blocks and so on) and the modules used. For example, you can choose to switch off the “Posted by” on an article in the configuration, or simply not render it in the node-article.tpl.php template file. Or you can deploy a set of modules that makes this configurable in a very powerful way.

A typical custom theme, where the design already takes Drupal into account, costs an experienced themer at least three full working days to build. Usually much more, because besides building the theme, this themer has to switch continuously between setting up and configuring parts of Drupal and building the theme and the CSS.

Do you limit theming purely to changing the code in your theme? Then you are admittedly limited in what you can do, but the job is very manageable and little work. But if you also bring the introduction of all sorts of functionality into it, or if you also want the theme to determine how things behave, it becomes by far the biggest job in building your website.

Crossing the ocean in a wooden canoe.

Furthermore, I want to stress that these are rules of thumb, not laws of the Medes and Persians.

I know full well that there is always an example to be found somewhere of a site that “proves” me wrong.

But if someone has crossed the Atlantic Ocean in a wooden canoe, that only proves that you can cross that ocean in a wooden canoe. It is by no means evidence against a general statement such as “a wooden canoe is not a suitable vessel for crossing the ocean”.

More concretely: of course fine Drupal sites have been built that use external databases for their content; but that does not refute that, in general, integrating external sources is a very difficult job in Drupal.

Have I overlooked some rules of thumb? What are your rules of thumb? Are there things that, in your view, Drupal absolutely cannot do? Or are there areas or cases where Drupal turns out to be the very best fit?

https://berk.es/2012/01/11/wat-kan-wel-en-niet-met-drupal-enkele-vuistregels

CDPATH, add paths to your "cd" which are accessible from anywhere on your system for autocompletion.

Jan 2, 2012 Updated Jan 2, 2012

If you use the command line a lot, you will find yourself going to the same directories over and over. Most graphical file browsers offer some sort of bookmarking system, so that you can get to the place where you keep your invoices in two clicks instead of clicking all the way through Documents » Administration » Finance » 2012 » 01.

Bash has something similar but, as always with the command line, more powerful: CDPATH.

I have several CDPATH entries set, for example on my desktop machine:

export CDPATH=~/Documenten:~/Documenten/Administratie/Facturen/huidig_jaar/

And on my webservers:

export CDPATH=/var/www

The first, for example, lets me type cd <tab>, regardless of the current working directory, to show all active projects. Typing cd MCD_<tab><enter> expands to cd MCD_my_current_drupal and takes me to /home/ber/Documenten/MCD_my_current_drupal/.
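To try this without touching a real home directory, here is a minimal sketch that builds a throwaway tree and resolves a project name through CDPATH. The directory names are only illustrations borrowed from the example above, and putting "." first in CDPATH is my own assumption (it keeps ordinary relative paths working as before).

```shell
# Build a scratch tree that mimics ~/Documenten with one project in it.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/Documenten/MCD_my_current_drupal" "$demo_root/elsewhere"

# "." first, so directories below the current one still take precedence.
export CDPATH=".:$demo_root/Documenten"

# Start somewhere unrelated.
cd "$demo_root/elsewhere"

# The name is not a child of the current directory, so cd falls back to
# the CDPATH entries. cd prints the resolved path; silence it here.
cd MCD_my_current_drupal >/dev/null
landed="$(pwd)"
echo "$landed"
```

Because cd announces the directory it resolved through CDPATH on standard output, scripts that capture the output of cd can be surprised by it; that is one reason to set CDPATH only in interactive shell startup files.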

One of those tiny settings that makes a small thing a little more efficient. And because I type that several hundred times each week, its overall benefit is rather large.

https://berk.es/2012/01/03/cdpath-add-paths-to-your-cd-which-are-accessible-from-anywhere-on-your-system-for-autocopletion

HJKL cheatsheet

Dec 7, 2011 Updated Dec 7, 2011

Because I too always forget about hjkl.

hjkl, Motherfucker. Do you type it?

https://berk.es/2011/12/08/hjkl-cheatsheet

I was wrong: It was not a leak in a Drupalsite.

Sep 17, 2011 Updated Sep 17, 2011

I tweeted too fast, and wrong:

Site were the Dutch Government accidentally leaked its 2012 budget, is a Drupalsite. Yes #Drupal does not secure its files. Drupal for govs?

The major news outlets in the Netherlands did not link to the leaking site, but to a site that carried (a mirror of) the leaked PDFs as well as background information. I followed these links without checking whether these sites were the actual leaking sites. The site they linked to instead is a Drupal site. The one with the unprotected files was not.

So much for not investigating a little myself! The site that leaked the file was an ASP (.NET?) site.

I am sorry for this misinformation. As said, I tweeted too fast and did too little investigation, and that makes me look stupid. I am grateful to those who pointed out my mistake. And because I got married the next morning, writing this erratum took more time than is appropriate. Sorry for that too.

As a bonus, and to make up for things a little, here are some common Drupal leaks that I helped fix in clients’ projects. Obviously I have responsibilities (and even a few NDAs), so I don’t give names and URLs.

The avatar-fiasco.

A group of people took part in a grassroots campaign, backed by a closed forum (the access-content permission was only given to hand-picked people). The party the grassroots campaign was aimed at knew about that forum, but could not access it. They did, however, write a silly script that

scanned URL patterns for user profiles: if they got a 404, that user did not exist; if they got a 403, they still could not access the content, but knew the person existed.
fetched avatars for all users that had one and used those to intimidate the participants.

I did three things: I migrated existing users and added some pseudorandom numbers to their uid, I hacked core so it sent a 404 for access-denied pages too, and I disabled avatars. I also explained to the users that their Drupal site was not hacked, but was leaking some minor privacy information.
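The enumeration trick comes down to reading meaning into status codes. A rough sketch of that logic, with the actual HTTP request left out: classify_profile_status is a hypothetical helper of mine, and treating both 401 and 403 as "access denied" is an assumption.

```shell
# Hypothetical helper: what an outsider can infer from the status code
# returned for a guessed profile URL.
classify_profile_status() {
  case "$1" in
    404)     echo "no such user" ;;
    401|403) echo "user exists, access denied" ;;
    200)     echo "user exists, profile readable" ;;
    *)       echo "unknown" ;;
  esac
}

# Sample codes instead of live requests. Against a real site the code
# could be obtained with something like:
#   curl -s -o /dev/null -w '%{http_code}' "$base_url/user/$uid"
# where base_url and uid are placeholders.
for code in 404 403 200; do
  printf '%s -> %s\n' "$code" "$(classify_profile_status "$code")"
done
```

Once access-denied pages answer with 404 as well, the first two branches collapse into one and the script learns nothing.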

This was minor. But imagine this happening on a sexual-diversity, and/or civil-rights forum in, say, Iran?

The video-file settlement.

A non-Drupal site ran a (very interesting) documentary on a person with a mental illness. After a preliminary injunction they had to take the video offline (or pay a fine of X for each day it remained online). The publishers took the article offline, but some journalists/bloggers found that the video could still be accessed by going directly to the URL of the video file. Luckily a settlement was reached and the publishers did not have to pay the fine for all the days the video had remained available.

We were just then planning to migrate this site to Drupal. The incident caused us to find a solution for this in Drupal: when a node gets unpublished the attached files should no longer be servable. We decided upon a custom-built module with a hook_node that acted upon “unpublishing” and simply renamed the files to some obscure salted-hash-name. Yes, that is security-by-obscurity, but the only affordable solution here.
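The renaming step itself is small. The sketch below is not the module we built, only an illustration of the idea in shell; obscured_name, the salt value and the file names are all made up, and it assumes GNU sha256sum and a file name that has an extension.

```shell
# Hypothetical helper: derive an unguessable name from a secret salt and
# the original path, keeping the extension so the file type stays clear.
obscured_name() {
  on_salt="$1"
  on_path="$2"
  on_ext="${on_path##*.}"
  on_hash="$(printf '%s%s' "$on_salt" "$on_path" | sha256sum | cut -d' ' -f1)"
  printf '%s.%s\n' "$on_hash" "$on_ext"
}

# Demonstrate on a scratch file instead of a real upload.
demo_dir="$(mktemp -d)"
printf 'fake video bytes' > "$demo_dir/documentary.mp4"

new_name="$(obscured_name 'replace-with-a-long-random-salt' "$demo_dir/documentary.mp4")"
mv "$demo_dir/documentary.mp4" "$demo_dir/$new_name"
ls "$demo_dir"
```

The old URL now gives a 404, while anyone who knows the salt and the original path can recompute the new name when the node is published again.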

The imagecache downloads

A site that (re)sold images used imagecache to watermark the images, resize them and only present low resolutions to users. Someone found this out, probably knew Drupal, and fiddled with the URLs to fetch the original files. Those were >5MB JPEGs, copyrighted and, by contract, not allowed to be distributed. Ever.

My client was warned (luckily) and hired me to write a (very ugly) imagemagick hack that moved the original files to a place outside the web-root, but accessible for re-building of the derivatives.

The Multisite jokes

Back when I ran our Drupal-hoster we thought that multisite was a good solution for hosting. It is not, for many reasons, but one is most interesting here.

Two domains, for the sake of the example upload-your-xxx.com and some-brochureware-about-us.nl, were multisites.

Some funny people found out that by switching the URLs, one could present images uploaded on upload-your-xxx.com on the domain of some-brochureware-about-us.nl, and posted that on some forums: http://some-brochureware-about-us.nl/files/upload-your-xxx.com/hardcore.jpg with the message: look, company Y is dealing in pr0n. Embarrassing in this case, but potentially harmful, especially when one of the two sites has user-generated content, or when the sites are fierce opponents involved in smear campaigns. It is also potentially harmful when “good” sites get blocked in schools and libraries for “having non-compliant or adult content”. Or when a multisite acts as a proxy to pass in disallowed content.

Our solution was to nuke multisite with a thousand flames.

But, private-files?

Drupal has a private-files mode. That is fine for small sites, but it does not scale. You cannot deliver (very) large files that way, and certainly cannot deliver large numbers of files concurrently that way.

I see no solution, and earlier research has made me believe this is simply not possible with a classic LAMP (Apache and PHP) stack. One needs a real document-management server or application (things like Alfresco). Most probably a Java-based solution. Or some thin proxy in front of the app that knows who can access which files.

When you need to deliver large files, or large numbers of them, keep away from simple ASP, LAMP and similar solutions, including Drupal. Maybe Alfresco behind Drupal can offer a real solution, but I have never tested it.

Again, sorry.

As you know, I was wrong in my conclusion that (a badly configured) Drupal was leaking governmental information. But as you can see, it happens a lot. And it requires quite some effort to avoid Drupal leaking information. Including core hacks if you are really serious.

https://berk.es/2011/09/18/i-was-wrong-it-was-not-a-leak-in-a-drupalsite

Bitcoin break-in

Jun 20, 2011 Updated Jun 20, 2011

Last weekend the largest exchange and trading platform, mtGox, was broken into (http://ftalphaville.ft.com/blog/2011/06/21/600441/george-clooney-roils-the-bitcoin-market). A number of people asked me what I thought of that, given my recent positive stories about Bitcoin. I draw a few conclusions, but first some background.

The break-in and crash can be seen nicely in a “live report” by a trader.

In short: there was a break-in, and the burglar sold bitcoins worth one and a half million dollars far below the going price, thereby crashing the whole exchange. These bitcoins were held in one or two accounts at the exchange.

MtGox immediately closed the platform and announced that everyone would get a new password. They also announced that they would roll back all buy and sell transactions. Tough luck for the many people who bought bitcoins for less than one dollar during the crash. But since it was illegal trading, it is only fair to all the other traders on the platform.

There is still some uncertainty about how the break-in was carried out, but everything points to a database having been stolen from the computer of one of the developers, and the contents of that database having been used to break in.

A few things I can conclude now:

Developers’ computers are a very weak point in the security of web systems. I know plenty of web developers with shoddy operating systems, poorly encrypted or unencrypted hard drives, feeble passwords and so on. Developers who nevertheless carry databases with tens of thousands of user accounts on the laptop in their backpack. Probably your data too.
Bitcoin is still very uncertain. Promising and technically sound, but as an economy and community still inexperienced and searching. An exchange that crashes the whole economy after a robbery is comparable to a complete crash of the Dutch economy after a break-in and enormous theft at Rabobank. That must not and cannot happen. The Bitcoin economy still lacks all sorts of safety nets and mechanisms to prevent this.
Bitcoin is strong all the same. The exchanges that stayed open did dive deep into the red, but have since stabilised and are recovering. We still have to see what happens when mtGox reopens. Possibly a great many people will immediately try to sell all their bitcoins for euros or dollars because they no longer trust Bitcoin. But it is just as possible that, after a sharp drop in value, those who still believe in it (and have often invested a lot in it) will buy up the cheap bitcoins again. For now it does not look too bad, and “the end of Bitcoin” is certainly not yet in sight.
The cloud is dangerous. People who kept their bitcoins on the mtGox site rather than at home on a secure (encrypted or detachable USB) drive are trusting an external company. A company that, precisely because of the great value it holds, is an important target for robbers, and that at the same time apparently did not have its security in order. That is the “cloud” that marketers in particular are so lyrical about: your mail, your documents, your bookkeeping and now also your money, not kept at home on a secure computer but with all sorts of hip “cloud” services on their online computers. Bitcoin is decentralised in principle, but if we all choose one central database to store the money in, it is simply centralised again.
Bitcoin is exciting. What happened last weekend reads like the script of the James Bond film Goldfinger, in which Auric Goldfinger tries to contaminate all the American gold (the largest stockpile in the world) with radiation, making it worthless. And thereby making Auric Goldfinger’s own gold suddenly worth more.
For me Bitcoin is above all a long-term plan. In two, three or even four years we will be able to say whether it was, or is, a success. Until then it is mainly enormously exciting. And with every major event, supporters and opponents will immediately conclude that it “therefore” works fine, or can never work.

So for now it cannot be concluded that Bitcoin is finished or that its value has completely crashed. We will know more once the largest exchange reopens. In any case it is an important moment for Bitcoin. To be continued, for sure.

https://berk.es/2011/06/21/bitcoin-inbraak

My bitcoin adventure, part two: trading and accepting it (on Marktplaats) as a means of payment.

Jun 1, 2011 Updated Jun 1, 2011

As described earlier, I have dived into the bitcoin world.

And I have already made money with it right away (yes indeed, Mr. Taxman who may be reading this too: I simply declare it properly as income). A whole lot of unclear issues have also bubbled up; most are discussed on the Dutch bitcoin IRC channel #bitcoin-nederland. Things like invoicing, VAT, conversion rates and so on.

Currency trading with bitcoins is not difficult, nor unclear. Because of the rocketing price, my relatively small investment has already seen a generous increase in value after a few days. If I cashed out now, I would have made a very generous profit. Without giving you all the details of my income right away: I would have to run Google ads on my sites for a couple of years to earn that. I am waiting to see which way the cat jumps for a while, hoping it becomes worth even more. And so I run the risk that it bursts like a pretty balloon and my money is suddenly worth nothing.

That is also the biggest risk at the moment. The buzz is everywhere, everyone is trying to get hold of bitcoins, so their value is zooming up. It is therefore important that actual “real world” goods can be bought with them. Fortunately there are already many shops to be found, but on a global scale that still amounts to almost nothing. Because when people accept them, they have an anchored value. If you could buy a cup of coffee for 1 bitcoin, that would now (2 June 2011, 15:20) be an expensive cuppa, at an exchange rate of $10 per BTC. But there would be a guarantee that you can always still get cups of coffee worth about €2. So the bitcoin would be anchored to that €2 value.

But you have to start somewhere. In the Netherlands you can still hardly pay with bitcoins anywhere. I cleared out my attic and put some stuff on Marktplaats. I accept bitcoins for it. And euros, because I do not expect anyone to pay in BTC: after all, by the time you have walked to my house to look at the bike, the agreed price in bitcoins has already gone up in value because of that soaring exchange rate.

I hope I get to hear something about this from Marktplaats. There is nothing in their terms about accepting other currencies (or I read past it). So it simply seems to be allowed. But let us above all continue the experiment and see what they think of it at Marktplaats.

https://berk.es/2011/06/02/mijn-bitcoin-avontuur-deel-twee-handelen-en-accepteren-op-marktplaats-als-betaalmiddel

Bitcoins. The revolutionary currency with the potential to bring down our banking system

May 28, 2011 Updated May 28, 2011

Several attempts have already been made to get alternative economies off the ground. Some have succeeded reasonably well. But now, especially in recent months, it finally seems to be really working: a money system that can replace our current bank-based system. Sounds scary? It is; for governments and bankers, at least. If you do not believe me, read on and follow the links too.

Bitcoin is a currency. Just like the euro (or the dollar). Unlike the euro, bitcoins have no central organisation behind them that guarantees their value.

“The peer-to-peer topology of Bitcoin and the absence of a central administration make it practically impossible for a government, or anyone else, to manipulate the value of bitcoins or to induce more inflation than was laid down in advance. Bitcoin’s design provides for anonymous ownership and transfer of value.”

I too have started experimenting and trading with bitcoins.

Unlike euros, bitcoins are therefore also very easy to “transfer” and hardly traceable. But above all it is money that is “ours” and not “the government’s”.

Bitcoins compare well with gold. Gold is an impractical, unwieldy metal; nobody makes their hoe or bicycle out of gold, so as a material it is hardly valuable. But because we have all agreed that gold is worth something, it is. The fact that gold is fairly hard to come by means it keeps this value and usually even becomes worth more. The fact that you can often keep gold unregistered makes it an anonymous store of value. If I buy a bicycle and pay for it with gold, that transfer is hardly traceable by governments.

At the moment about $50 million has been put into the Bitcoin network. So about $50,000,000 has been exchanged for bitcoins. About 6 million bitcoins are in circulation. One bitcoin is therefore worth about $8. And with the influx of interested people, this will very probably increase enormously. So it is very simple: because people exchanged euros, dollars and other money for bitcoins, those bitcoins are worth something.
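The division behind that figure, as a quick sketch using the two rounded numbers from the paragraph above (the variable names are mine):

```shell
# Rounded figures from the text: money put in, and coins in circulation.
invested_usd=50000000
coins_in_circulation=6000000

# Implied price per bitcoin = money put in / number of coins.
price="$(LC_ALL=C awk -v usd="$invested_usd" -v coins="$coins_in_circulation" \
  'BEGIN { printf "%.2f", usd / coins }')"

echo "implied price per bitcoin: \$$price"
```

This prints 8.33, which the text rounds to about $8.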

The number of new bitcoins to be brought into circulation is fixed. This can also no longer be changed. But the number of interested people, people who exchange “real” money for bitcoins, will only increase. A “shortage” arises, which makes bitcoins worth more.

This is also evident from the short history. A year and a half ago, only $50 was offered for 10,000 bitcoins. That is $0.005 per bitcoin. Today they change hands for over $8. In fourteen months the value has thus multiplied more than a thousandfold. If you had converted €2,500 into bitcoins back then, they would now have been worth well over €2.5 million. A multimillionaire, in other words.
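For anyone who wants to check the multiplication, a small sketch using the per-bitcoin prices quoted above ($0.005 then, $8 now). It ignores the euro/dollar exchange rate, and the variable names are mine.

```shell
# Prices per bitcoin as quoted in the text, and the example stake.
price_then=0.005
price_now=8
stake_eur=2500

# Growth multiple, and what the stake would have grown to.
multiple="$(LC_ALL=C awk -v a="$price_then" -v b="$price_now" \
  'BEGIN { printf "%.0f", b / a }')"
value_now="$(LC_ALL=C awk -v m="$multiple" -v s="$stake_eur" \
  'BEGIN { printf "%.0f", m * s }')"

echo "multiple: ${multiple}x, value now: EUR $value_now"
```

At exactly $8 the multiple comes out at 1600, so the "thousandfold" and the €2.5 million in the text are conservative roundings.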

Whether it will keep growing at such an insane pace is of course unclear. There are still risks attached. And above all it remains to be seen what governments will try to do. But it is certainly interesting. For that reason I have already converted a few hundred euros into bitcoins. Because it has to be transferred from my “traditional” bank, that takes a long time, because “that is just how it works”, but as soon as the money has been transferred I can start trading. Or simply hold on to it. I will describe here what I do with it and what I come across.

But let us take a look at the risks.

Banks and governments will stir and try to ban it.

Bitcoin is peer to peer. Between your computer and mine. And those of tens of thousands of others. Everyone who has bitcoins is at once the infrastructure. The money is on my computer (in the form of an encrypted file, my “wallet”) and not in a “bank account”. Compare it to DSB and a mattress. DSB can be shut down by the authorities, after which the money that DSB held is gone. Money in my mattress stays there, regardless of whether banks collapse. Governments can certainly ban Bitcoin (under pressure from banks). But they cannot check whether we comply without tapping all internet traffic with enormous supercomputers. Only that way can they cut off bitcoin users or throttle bitcoin traffic. Just as governments and the film and music industries fail to ban, or even reduce, file sharing (downloading a film), they will fail to stamp this out.

The cryptographic algorithm behind the network turns out to have a weakness.

If it is a deep-seated flaw, one that touches the foundation of how this system works, trust will fall away, people will sell their bitcoins and it will collapse. Comparable to when a bank turns out to be making mistakes and everyone withdraws their savings like mad. For now there are no signs that such a flaw is going to be found. And there are many mathematicians and theorists who claim that it cannot be found either. And there are many people searching like mad for such holes (to bring the system down, or to make an enormous amount of money from such a weakness).

People lose interest.

If interest ebbs away and people exchange their bitcoins back for dollars or euros, the network will be hollowed out: bitcoins become worth less again and services around it disappear. The most obvious cause of such reduced attention is, for example, an alternative system. Compare it to peer-to-peer music sharing. iTunes and more recently Spotify offer paid, legal music downloads. Because these are good alternatives, with big names behind them, they draw many people away from the file-sharing network. As a result this illegal supply may even shrink (in reality it grows a little less fast).

When, for example, Amazon, or PayPal, or even the big credit card companies come up with alternatives that offer the same advantages, people may walk away from the bitcoin network en masse. However, once a critical mass is in that bitcoin network, this risk will be small.

Other unknowns.

Bitcoin is new. An entirely new concept. With an entirely new dynamic. Nobody can predict what may happen, so the most unexpected thing could suddenly take this network down. Some examples critics have already raised are: mass deflation, where the smallest payment unit (a tenth of a cent) can buy nothing less than a house. The internet collapses because of big external changes (unrest or war in the Western world, an energy crisis, et cetera). Someone buys up all the bitcoins, à la Goldfinger. And so on.

But for now I see enormous potential. A (possible) revolution comparable to the emergence of banks in the Renaissance. And the possibility to trade, store and invest money without having to take part in the banking system we have now, which many despise.

Want to join in too? Then install the bitcoin client and request a tiny amount of bitcoins for free. If you want to earn bitcoins, you already have a whole range of options. Including simply buying them. And if you found this little story interesting, then donate, in bitcoins, to my account: 1Hga5LMhjrfSjwtwxhFQnUmfPNmCRpaVvX :)

https://berk.es/2011/05/29/bitcoins-de-revolutionaire-valuta-met-een-potentie-voor-de-teloorgang-van-ons-banksysteem

Mailcatcher for Drupal and other PHP-applications - The simple version

May 28, 2011 Updated May 28, 2011

This is an updated version of my earlier post. Since msmtp is no longer needed, things are a lot simpler, hence the new article.

Problem: on development (and test) you don’t want to send out mail. But you do want to test it. You certainly don’t want to be in my shoes when a client called me, telling me she had received dozens of confused and angry mails from users on her site, after I fired up cron on my local development machine and sent out approximately 3000 notification mails to users, with stuff like “new post for you: W000t, fieldz0rz developmentz in CCK is workinggggg!” (I am making this up now. Although… ;) )

Problem: when debugging mail, you want to inspect the headers and often (in case of multipart or HTML mail) the source too. Most email clients are crap for that (and rightly so: who other than the odd mail/web developer needs to inspect the source of a mail, ever?)

Solution: the brilliant Ruby application named mailcatcher. This is a simple SMTP server and sendmail replacement that shows the mails sent to it in a handy web application. The web application features debug tools such as header and source display.

Screenshot of a Drupal password recovery mail in Mailcatcher

Aside: Windows. It is probably possible, but since even the most basic proper command line requires a lot of hassle there, all this is far from as trivial as on Mac and Linux. I am sorry, but please use the comments if you got mailcatcher running with PHP on Windows.

Installation

Mac comes with ruby installed. On Ubuntu Linux you may need to install it still:

sudo apt-get install ruby rubygems

Install mailcatcher (Use sudo for installing systemwide).

gem install mailcatcher

Configure PHP to use mailcatcher for delivering mail:

Edit php.ini (where this lives depends on your installation, but on Ubuntu it is /etc/php5/apache2/php.ini)
Under [mail function], if available, change the sendmail_path to /usr/bin/env catchmail and you’re set.

sendmail_path = "/usr/bin/env /var/lib/gems/1.8/bin/catchmail "

Find out where catchmail lives by running which catchmail. On Ubuntu it was installed at /var/lib/gems/1.8/bin/catchmail. Make sure you have the gems installed system-wide, else Apache (or the user running the web server) does not have access to catchmail and the required libraries.
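If you would rather not type the path by hand, something along these lines prints the line to paste into php.ini. It is only a sketch: sendmail_path_line is a made-up helper name, and the fallback path is simply the Ubuntu location mentioned above.

```shell
# Hypothetical helper: format the php.ini setting for a given catchmail path.
sendmail_path_line() {
  printf 'sendmail_path = "/usr/bin/env %s"\n' "$1"
}

# Look catchmail up on PATH; fall back to the Ubuntu location from the text.
catchmail_bin="$(command -v catchmail || echo /var/lib/gems/1.8/bin/catchmail)"

sendmail_path_line "$catchmail_bin"
```

Run it as the user the web server runs as if you want to be sure that user can see catchmail too.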

And restart apache.

sudo service apache2 restart

Start up mailcatcher.

mailcatcher

Open your browser and visit http://localhost:1080

Mailcatcher gotcha’s and tips

Just terminate mailcatcher (Ctrl-C) and restart it to flush the received mail.
Don’t forget to start up mailcatcher before you start hacking away on your site. If you forget, mail will not be sent out but will fail, and PHP (Drupal) will give errors on mailing.

Happy Mailing on your development machine!

https://berk.es/2011/05/29/mailcatcher-for-drupal-and-other-php-applications-the-simple-version

Mailcatcher for Drupal and other PHP-applications

May 27, 2011 Updated May 27, 2011

UPDATE: Please see the newer version of this article. The latest mailcatcher has its own sendmail replacement, making installation for PHP a lot simpler.

Problem: on development (and test) you don’t want to send out mail. But you /do/ want to test it. You certainly don’t want to be in my shoes when a client called me, telling me she had received dozens of confused and angry mails from users on her site, after I fired up cron on my local development machine and sent out approximately 3000 notification mails to users, with stuff like “new post for you: W000t, fieldz0rz developmentz in CCK is workinggggg!”. Not cool.

Problem: when debugging mail, you want to inspect the headers and the source (in case of multipart or HTML mail). Most email clients are crap for that (and rightly so: who other than the odd mail/web developer needs to inspect the source of a mail, ever?)

Solution: the brilliant Ruby application named mailcatcher. This is a simple SMTP server, which shows the mails sent to it in a handy web application. The web application features debug tools such as header and source display.

Screenshot of a Drupal password recovery mail in Mailcatcher

Additional problem: PHP on non-Windows machines cannot deliver mail to an arbitrary SMTP server. It requires a sendmail program being invoked somewhere. Drupal does not allow sending mail to an arbitrary SMTP server without additional configuration. The solution for that is the ultralight sendmail alternative msmtp. If we configure msmtp to act as sendmail and deliver mail to mailcatcher, we are fine: Drupal » PHP mail() » /bin/msmtp --foo --bar » Mailcatcher

Mac

Install mailcatcher (use sudo for installing system-wide).

gem install mailcatcher

Install msmtp, using MacPorts.

sudo port install msmtp

Configure PHP to use msmtp for delivering mail:

Edit php.ini
Under [mail function], if available, change the sendmail_path variable. Else just add it to php.ini.

sendmail_path = "/usr/bin/msmtp -t"

And restart apache.

Configure msmtp with some defaults (this file should probably be named /etc/msmtprc)

account mailcatcher
host localhost
# MailCatcher will tell you the port it listens to.
port 1025
# Enable logfile for additional troubleshooting.
# logfile /var/log/msmtp.log
# From does not work 100% for me yet, because the envelope-from is still
# wrong. But leaving this out makes msmtp fail with PHP.
auto_from on

account default: mailcatcher

Start up mailcatcher.

mailcatcher

Open your browser and visit http://localhost:1080

Linux (Ubuntu, Debian and derivatives)

Install ruby and rubygems first (if you don’t already have them).

sudo apt-get install ruby rubygems

Then install mailcatcher.

gem install mailcatcher

Install msmtp.

sudo apt-get install msmtp

Configure PHP to use msmtp for delivering mail:

Edit /etc/php5/apache2/php.ini
Under [mail function], change the sendmail_path variable.

sendmail_path = "/usr/bin/msmtp -t"

And restart apache (sudo service apache2 restart) to load the new php.ini.

Configure msmtp with some defaults, so that it accepts PHP’s sendmail calls. Edit /etc/msmtprc.

account mailcatcher
host localhost
# MailCatcher will tell you the port it listens to.
port 1025
# From does not work 100% for me yet, because the envelope-from is still
# wrong. But leaving this out makes msmtp fail with PHP.
auto_from on

account default: mailcatcher
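To experiment without touching /etc/msmtprc, the same settings can be written to a scratch file and handed to msmtp with its -C (--file) option. A minimal sketch, assuming a hypothetical helper write_msmtprc and a scratch location under the temp directory:

```shell
# Hypothetical scratch location; any path the calling user can read works.
rc_file="${TMPDIR:-/tmp}/msmtprc-mailcatcher"

# Write the settings from above to the given file, with a configurable port.
write_msmtprc() {
  cat > "$1" <<EOF
account mailcatcher
host localhost
port ${2:-1025}
auto_from on

account default: mailcatcher
EOF
  # Keep the file private; msmtp can refuse user config files others can read.
  chmod 600 "$1"
}

write_msmtprc "$rc_file" 1025
cat "$rc_file"

# A manual test could then look like this (needs msmtp installed and
# mailcatcher running, so it is left commented out):
# printf 'To: me@example.com\nSubject: Test\n\nHello\n' | msmtp -C "$rc_file" -t
```

Once the test mail shows up in mailcatcher, the same lines can go into /etc/msmtprc.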

Start up mailcatcher.

mailcatcher

Open your browser and visit http://localhost:1080

Troubleshooting

If things don’t work out, check the main Apache error log. Errors from msmtp will show up there and tell you what went wrong. Use the logs, Luke, use the logs!

You can turn on msmtp logging with logfile /var/log/msmtp.log in msmtprc. There is no need to restart Apache for that. msmtp will tell you what calls it received and what parameters it got.

Cut out PHP and send a mail with msmtp to mailcatcher on the command line:

echo -e "Subject: Test Mail\r\n\r\nThis is a test mail" |msmtp --debug --from=default -t me@example.com

If that works, then the problem is between Drupal/PHP and msmtp, and the apache-logs should give a hint (see above).
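To narrow things down further, you can also bypass msmtp and hand a test mail straight to MailCatcher over SMTP. Below is a minimal sketch in Ruby (already installed for MailCatcher), assuming MailCatcher listens on localhost:1025 as in the msmtprc above. The helper names and the addresses are made up for this illustration.

```ruby
# Build a minimal mail message: headers, a blank line, then the body.
def build_test_mail(from:, to:, subject:, body:)
  [
    "From: #{from}",
    "To: #{to}",
    "Subject: #{subject}",
    "",
    body
  ].join("\r\n") + "\r\n"
end

# Hand the message to an SMTP server. Host and port are assumptions that
# match the msmtprc above (MailCatcher on localhost:1025).
def deliver_to_mailcatcher(message, from:, to:, host: "localhost", port: 1025)
  require "net/smtp" # loaded lazily; ships as a bundled gem on recent Rubies
  Net::SMTP.start(host, port) do |smtp|
    smtp.send_message(message, from, to)
  end
end

# Usage, with MailCatcher running:
#   mail = build_test_mail(from: "dev@example.com", to: "me@example.com",
#                          subject: "Test Mail", body: "This is a test mail")
#   deliver_to_mailcatcher(mail, from: "dev@example.com", to: "me@example.com")
```

If this mail shows up in the web interface but the msmtp test above does not, the problem is most likely in msmtprc rather than in MailCatcher.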

Mailcatcher gotchas and tips

Just terminate mailcatcher (Ctrl-C) and restart it to flush the received mail.
On Firefox (4) the mailcatcher web interface did not look that good. Chrome(ium) rendered it fine, though.
Don’t forget to start mailcatcher before you start hacking on your site. If you forget, mail will not be sent out but will fail, and PHP (Drupal) will report errors on mailing.

Happy Mailing on your development machine!

https://berk.es/2011/05/28/mailcatcher-for-drupal-and-other-php-applications

“Ruby on Rails is not as good as PHP because it is hard to host”

Mar 30, 2011 Updated Mar 30, 2011

My various articles on the suitability of Drupal also drew many responses from people who miss the point entirely. Such as: “It may well be easier to build in Rails, but Rails is hard to host”. Side note: I have very little experience with Python hosting (only the Trac project management tool), so at most I dare say that I expect the following to apply there as well.

First: yes! There is less hosting to be found for your Ruby on Rails project.

Second: that is completely irrelevant for the projects we were talking about. We are not talking about a site for the bakery on the corner, but about sites and projects where a working day can easily be set aside for hosting. Where professional hosting companies are usually involved anyway, with hosting costs of several thousand euros per year. The 1-eurohost.biz hoster usually supports Rails too, by the way, because it has shipped with Plesk by default for years. So that bakery on the corner can perfectly well have a site delivered in Rails too.

The biggest challenge lies with organisations and companies that already have hosting environments set up. Usually for Java and/or PHP+MySQL. Rarely for Ruby or Python. Especially when the administrators of that environment are inflexible, they tend to keep Ruby or Python out. Depending on the situation (such as Windows-only hosting), often entirely justified in my opinion; but that is a different discussion altogether. For an intranet it may well be that Ruby or Python drop out immediately because of these restrictions. As an aside: that does not make developing in “a framework” any less suitable, it only narrows the choice of frameworks. On the other hand, the Ruby community has an environment like Heroku, for example. For Drupal people, imagine:

“drush hosting create bakkerophoek”
“git push hosting”
“drush hosting online”
“drush hosting domains add bakkerophoek.nl”
the invoice follows a few minutes later by mail (patent pending… :) )

On Heroku you switch on Solr for your site for a few euros (imagine that “drush hosting solr enable” is all you have to do for Solr!). For a few euros a month you get some extra CPU power, or for a few tens of euros per year you run a full master-slave database setup behind your site. And you deploy with a few commands, integrated into your version control system. Hosting has never been this easy and affordable. I wish it were this easy for PHP.

So when people claim that Diaspora, for example, can never amount to anything because it is built in Ruby and therefore hard to deploy, that is only partly true. The kind of people who will deploy Diaspora are familiar with hosting and servers. They are not the people who already get stuck installing FileZilla, but the geeks who enjoy tinkering with nginx or Apache proxies. Diaspora is really aimed at “small” local community sites rather than at “everyone and their mother” running their own Diaspora: more a Hyves building its community on Diaspora than “your nephew and his two friends from the Secret Pirate Club”. The problem is indeed that a Ruby project like rstat.us is thereby at a disadvantage compared to status.net, which is based on PHP and MySQL. The latter can be installed on any 1-euro host, by anyone with basic knowledge of webmastering or web development. A larger target audience, but possibly not the right one. The reach in number of installations is greater, but that says nothing (as yet) about the reach in users.

Those Rails projects require at least some experience with the command line. But really just knowledge of (web) server administration. That need not be an expensive, rare developer; it can be anyone who can set up a VPS on a Sunday afternoon, for a few tens of euros. Or someone who works at a hosting company and sets up the right environment for you.

Coming back to my earlier arguments: a professional project has such a person involved. For PHP projects too; otherwise, I dare say, it simply is not a professional or large project.

Someone who can only just manage to set up a decent Drupal or WordPress environment will indeed be very disappointed by what it takes to roll out a Django or Rails product. But such a person is never the yardstick for the suitability of those products in a professional, larger project: such a person then hires someone who knows how to roll out servers.

https://berk.es/2011/03/31/ruby-on-rails-is-minder-goed-dan-php-want-het-is-moeilijk-te-hosten

Simplest authentication in Rails: Basic Authentication with a logged_in? helper.

Mar 28, 2011 Updated Mar 28, 2011

By far the simplest way to add some form of authentication in Rails is HTTP basic authentication. It has a lot of downsides, but its simplicity is such a benefit that it may just outweigh them.

Downsides are, amongst others:

No users, no user management.
Your username and password are hardcoded in the application.
No fancy or good-looking login screens: just the basic HTTP login provided by your browser.
No logout, other than closing the browser.

Here is a simple implementation for a simple app I needed. Since I am the only editor, there is no need to introduce session controllers, user models and so on. If you are relatively new to Rails (like me) you may overlook this simplest of solutions and dive right into Devise or Authlogic, or start writing your own, and miss out on that ten-minutes-and-you’re-done solution.

First, we introduce a basic authenticate method that can be used throughout our controllers. This method uses the Rails helper authenticate_or_request_with_http_basic.

class ApplicationController < ActionController::Base
  protect_from_forgery

  protected
    def authenticate
      authenticate_or_request_with_http_basic do |username, password|
        username == USER_ID && password == PASSWORD
      end
    end
end

In a controller, we can then add a before_filter to require authentication for all methods but the index and the show.

class ImagesController < ApplicationController
  before_filter :authenticate, :except => [:index, :show]
  #...
end

A new file under config/initializers, named user.rb or anything else you want, contains the hardcoded username and password. Putting them in a separate file allows you to leave them out of your version control, for example.

USER_ID   = "Sauron"
PASSWORD  = "s3cr3t"
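If you would rather not keep the real credentials in a file at all, the same initializer could read them from the environment instead. This is only a sketch: the variable names APP_USER_ID and APP_PASSWORD are invented for this example, and the fallback values are there purely so that a development machine keeps working without any setup.

```ruby
# config/initializers/user.rb (alternative sketch)
# ENV.fetch returns the value of the environment variable, or the second
# argument when that variable is not set.
USER_ID  = ENV.fetch("APP_USER_ID", "Sauron")
PASSWORD = ENV.fetch("APP_PASSWORD", "s3cr3t")
```

The rest of the code does not change, since it only refers to the USER_ID and PASSWORD constants.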

Furthermore, we define a logged_in? helper, useful in our views. This checks whether the authorization is a string (it is set) or nil (the user is not authorized):

module ApplicationHelper
 def logged_in?
   not request.authorization.nil?
 end
end

Using that helper is simple too. E.g. show.html.erb:

<% if logged_in? %>
  <li><%= link_to 'Edit', edit_image_path(@image) %></li>
<% end %>

I am not certain how well evaluating request.authorization.nil? performs, but since it is about as simple as it gets, I would say the overhead is minimal.
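One thing to keep in mind: request.authorization only tells you that the browser sent an Authorization header, not that the credentials in it are correct. For a link in a view that is usually good enough, because the edit action itself is still protected by authenticate. For reference, the check behind basic authentication can be sketched in plain Ruby, outside Rails. The names decode_basic and valid_credentials? are made up for this illustration, and the constants mirror the initializer above.

```ruby
# Hypothetical stand-ins for the constants from config/initializers/user.rb.
USER_ID  = "Sauron"
PASSWORD = "s3cr3t"

# Split the value of an "Authorization: Basic <base64>" header into
# [user, password]. Returns nil when the header is missing or uses
# another scheme.
def decode_basic(header)
  return nil if header.nil?
  scheme, encoded = header.split(" ", 2)
  return nil unless scheme.to_s.casecmp?("basic") && encoded
  encoded.unpack1("m").split(":", 2) # "m" means base64 in pack/unpack
end

# True only when the header carries exactly the configured credentials.
def valid_credentials?(header)
  user, password = decode_basic(header)
  user == USER_ID && password == PASSWORD
end

good = "Basic " + ["Sauron:s3cr3t"].pack("m0")
bad  = "Basic " + ["Frodo:ring"].pack("m0")

puts valid_credentials?(good) # true
puts valid_credentials?(bad)  # false
puts valid_credentials?(nil)  # false
```

With the bad header above, a presence-only check like logged_in? would still return true, which is why it should only be used to decide what to show, never what to allow.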

https://berk.es/2011/03/29/simplest-authentication-in-rails-basic-authentication-with-a-logged_in-helper

Some good responses to my claim “Drupal is the wrong choice for government sites”, highlighted

Mar 21, 2011 Updated Mar 21, 2011

Besides the many responses without any substantiation, or even a few ad hominem fallacies, my article on Webwereld (http://webwereld.nl/opinie/106086/drupal-verkeerde-keus-voor-overheidssites--opinie-.html) also received a few very good arguments.

The majority was, contrary to expectation, positive. Contrary to expectation, because bulls in china shops usually get chased away :).

The only substantive response I do not have an immediate answer to is from MexMast on Monday 21 March 2011 17:07 (unfortunately I cannot link directly to Webwereld comments, so you will have to look it up yourself):

Well, that seems to me a dangerous generalisation [see below, BK], as an argument of desperation. That it involves a lot of money changes little about the whole. Of course you will not get there by giving your nephew 50 euros for a Sunday’s work.

You literally say that all government websites that use Drupal made the wrong choice. And that is, I am afraid, your personal preference notwithstanding, manifestly incorrect.

Examples of government websites running on Drupal are abundant, unless you can cite a few examples of Drupal websites and comparable websites where the Drupal variant turned out much more expensive in terms of total cost of ownership or functionality. I doubt it very much.

That I literally say that all government websites using Drupal made a wrong choice is not true. I do insinuate it, but I keep coming back to “large Drupal projects”. My original title was in fact “There are better alternatives to Drupal”. The Webwereld editors sharpened that somewhat, since I substantiate and nuance it well in the article itself anyway.

I am certainly not saying that governments should stay far away from Drupal. I repeat: for many large Drupal projects a different environment would be a better choice. Which I go on to substantiate.

That many government websites therefore run successfully on Drupal is easily proven and was raised repeatedly in the responses. Those responders read the article selectively, or simply did not read it. Otherwise they would certainly have understood that nowhere do I claim that every government project should stay away from Drupal, but that I argue that large, build-intensive projects are better off with environments other than Drupal. Apparently I cannot repeat this nuance often enough.

The generalisation refers to my earlier response:

Unfortunately, projects like these are always for the most part “development” jobs. At least half of the budget goes to development. And these jobs rarely have budgets with fewer than four zeros.

And that cuts far too many corners and deserves elaboration.

I have seen and evaluated a great many quotes. Most of them for Drupal projects. I know of no project in which the amount quoted for development comes or came in under 80%. The closest is a quote in which the company also had to write all the technical and functional designs, deliver the graphic designs, and take part in the brainstorms. There the start-up amount was so substantial that the development budget ended up making up only 60% of the whole. A project to build a site will always contain a substantial amount for building the site. When in a web project of two hundred thousand euros only five thousand euros has been set aside for building the site (97.5% overhead), something is thoroughly wrong with the project, and the chosen technology indeed does not matter.

For the Tweede Kamer (the Dutch House of Representatives) there is a vacancy online asking for a Drupal developer. I cannot think of a clearer hint that development is going to take place in this specific project.

I cannot disclose confidential information (nor do I want to), but suffice it to say that the budget for this Tweede Kamer site is “large”. Public information about other municipal projects with Drupal all shows that hundreds of hours of development went into them.

Governments and municipalities do indeed have to deal with far more than a small development job:

@berkes The issues you mention could be about any cms/fw package. Imho most problems with governments sites lie in OSI-layer 8: politics

But also with many complex environments. With legacy databases, complicated integrations, single sign-on, services implemented elsewhere or earlier, and so on:

Krishna Kurvers: @berkes @bertboerland within my sector various open source #CMS are being considered. Does not fit this era of #nora and #sga. My opinion
@berkes In favour of #OSS, but #CMS is outdated. Does not work well for interconnected governments with modular construction and data reuse.

Once more: this is not to say that all government projects automatically have to deal with this. Governments also regularly deploy temporary, small or self-contained sites. For neighbourhood information, events, PR, campaigns, construction projects and so on. In these kinds of projects a CMS is, far more often than not, an ideal fit. For the site of the construction project next door to me, for example, a CMS may even be too big and complex already. Let alone that a framework could be used efficiently there.

In short: I ask anyone who claims that I need to know the specs, or that a government project is more than development alone, to respond. Give me examples, indications, projects and sites where:

No, or minimal, development took place. Where Drupal was used as a simple box of building blocks, with a nice theme and perhaps a single small custom module. In short: where Drupal was used in a project for which it is optimal. And/or where the total budget had fewer than four zeros.
The developing, rigging up, migrating and other development work ultimately took up less than 30% of the total budget. Where, as a whole, it was a large project, but deploying Drupal was only a small part of it. Where, within a “large” project, hardly any development took place after all.

For those projects it would indeed appear from the outside that Drupal is not the right system, because of too much development. But on closer inspection Drupal will turn out to be very suitable after all.

For all other projects it is safe to say, cutting corners, that they required “substantial development time and budget”. So development was a large part. So the choice of underlying development platform matters.

I have, unsurprisingly, not had a single response arguing against my claim that Drupal is not optimal for development and that there are much better environments for that, such as frameworks. That is unsurprising, because even the most die-hard Drupal believers recognise that a CMS is not strong at this. And that this is not a problem at all: after all, a CMS does not aim to be a development platform.

I hope to hear from everyone who disagrees with my claim and can offer examples or arguments where government sites do benefit from a “ready-made” product like Drupal, and are worse off with a development environment.

https://berk.es/2011/03/22/enkele-goede-reacties-op-mijn-stelling-drupal-verkeerde-keus-voor-overheidssites-uitgelicht

Am I attacking or abandoning Drupal there?

Mar 20, 2011 Updated Mar 20, 2011

No. I am not. Responses I received (some summarised):

You are shooting yourself in the foot!

(Background: I am mainly a Drupal developer, so it is not smart to turn against the product that puts bread on my table.)

I think it is a pity that you name Drupal specifically and not “CMSs in general”. What you describe is inherent to a CMS. Every CMS has its own way of handling workflows and registration, so deviating from it takes custom work.

Or:

If you portray the project like that, it is also bad for the Drupal community.

(Background: the Drupal community is perhaps the most important part of Drupal. Possibly more important than the product itself.)

A pity, because now municipalities may fall back on their old, familiar closed source CMSs.

This is all, partly, true. And it certainly played a part in writing these articles and in rewriting the story for Webwereld.

As someone else responded: a daring article that you must have brooded on for a while. Indeed, I have been on the verge of publishing this several times and each time refrained, because I saw too much potential negative effect. Let me therefore list my personal motives for running my business, in no particular order:

Making money (medium-term motive).
Completing beautiful, successful and pleasant projects (daily, short-term motive).
Promoting (the use of) Open Source and (with it) transparency and open standards (ideological and long-term motive).

Starting with why I name Drupal specifically and not the concept of a CMS. Drupal is currently the choice for ever larger (government) projects. Everywhere, not only in the Netherlands. All other Open Source projects seem to be dropping off the radar. Until I was pointed to DevCMS. That system hit exactly the core of The Piece I Did Not Dare To Publish. DevCMS may be (but may also not be) a much better solution. What matters most to me, however, is that the DevCMS approach addresses precisely what I see going wrong in so many Drupal projects.

The Open Source community is better served by low-level solutions than by end products: nobody in the Drupal community gains anything from the design of gemeente-kerkstraaaten.nl. Nor from its exact configuration. But they do gain from DigiD libraries, from advanced workflow and the like. When Drupal is used as a framework, the latter is also what the community gets out of it. In my opinion, many of these “enterprise Drupal projects” currently yield nothing for the Drupal community. All I see is a very nice Rijkshuisstijl theme and a DigiD module. If you work out that, as a very rough estimate, well over a million in budget has been burned through on Drupal development for government sites, the result for the community is abominable. The net effect of such large projects is therefore only name recognition and image.

Drupal has a natural habitat where it feels at home: projects for which Drupal is entirely suited. And Drupal has a habitat where it can survive only with difficulty. When we deploy Drupal in that latter environment too often, friction arises. Clients become dissatisfied (often rightly so) with Drupal, or with the site that was built with Drupal. The public sees a lot of mess and starts spreading FUD. The blame is laid on Drupal. Then that name recognition and image suddenly often turn out to be negative.

I dread the thought of reading the headline “Noord-Holland dismantles Drupal infrastructure due to budget and cost overruns”. Then the net result is entirely negative. Noord-Holland, incidentally, does nothing with Drupal as far as I know.

When Drupal gets blamed more often for less successful projects, that is unpleasant for Drupal. But even more unpleasant for Open Source. If we keep proclaiming that The Gimp is the best that Open Source has produced, I can well imagine that so many people are so tremendously negative about Open Source (in general). The Gimp is a very complex, advanced and good photo editing program. But it is not for the layperson. And it also demands enormous adjustments from the graphic designer coming from Photoshop. For certain environments The Gimp is ideal. But for many environments it is not at all.

If we turn Drupal into a kind of second Gimp, by deploying it everywhere it reflects negatively on Open Source, I would almost rather see no Drupal deployed at all. Because I want to keep my three motives in balance, I often tell clients that they should not use Drupal: better no project than a frustrating, overpriced, ugly project. But I also want to do everything I can to see Open Source deployed well and successfully. And I hope that Drupal thereby sees its strengths put to much better use, instead of being deployed in projects where it has no decent chance of success, merely because a Drupal developer does not dare to say that Drupal is unsuited for this.

Better two hundred smaller, satisfied Drupal users than one expensive government project in which Drupal does not feel at home.

https://berk.es/2011/03/21/val-ik-daar-drupal-aan-of-af

Minor sidenotes for Tagadelic users, regarding SA-CONTRIB-2011-013

Mar 15, 2011 Updated Mar 15, 2011

Tagadelic, Drupal’s tag-cloud module, was found to have a security vulnerability. From the advisory:

The module does not sanitize some of the user-supplied data before displaying it on abovementioned cloud pages, leading to a Cross Site Scripting XSS vulnerability that may lead to a malicious user gaining full administrative access.

This vulnerability is mitigated by the fact that the attacker must have a role with the ‘administer taxonomy’ permission which should generally only be granted to trusted roles.

The fix simply escapes the description and the title before they are passed along.

This may cause problems for people who “abused” this vulnerability. Admins who, for example, had embedded video, HTML markup or JavaScript in the description of their tag cloud page will no longer see this after upgrading.

For them, there is no simple solution other than the strongly discouraged “solution” of not upgrading. I discourage this not only for security reasons, but also because every future release will contain the same fix, so the problem simply returns with the next upgrade.

Taxonomy descriptions and titles were never meant to hold any markup, so if this upgrade hits people, they were abusing a Drupal non-feature in the first place.

A better solution would be to place such markup in a block and embed that in the theme (in a region). That way you use the proper Drupal-tools for the proper job.

Also note that the unreleased Drupal 7 branch is not yet fixed.

https://berk.es/2011/03/16/minor-sidenotes-for-tagadelic-users-regarding-sa-contrib-2011-013

Drupal is also less suited to non-standard, custom interaction and functional designs

Mar 9, 2011 Updated Mar 9, 2011

In response to my article on why Drupal is not the most suitable tool for large projects, I also received some responses along the lines of

You only look at it from the technical side. And you forget about design, functional design and maintenance.

I did not forget this; I left it out on purpose so as not to make the story even longer. For completeness: functional design, interaction design and of course graphic design all precede the building of a site.

In the case of a CMS, something interesting happens here: the CMS severely limits what is possible within such a design. For example, in almost every Drupal project I call out:

There needs to be a techie at the table who can steer, so that we use Drupal optimally and do not start building things that turn out to be very hard in Drupal.

Sounds reasonable: that the technology in use is employed optimally. Getting to know Drupal’s ins and outs as early as the first idea phase, and letting them guide you, is not a bad way of working.

And that way you avoid hundreds of hours of developing a detail that, in hindsight, turns out not to be worth those hundreds of hours at all.

Also a good idea: using, for free, things that have already been worked out for you. It saves money.

If you start from “standard where possible, custom where it really must” and you as a client are already satisfied with what can be configured out of the box (expectation management!), then with a framework or custom code you are hundreds of thousands of euros further along just to rebuild what you already have with Drupal and all its contribs.

A more concrete example:

Just use the standard Drupal login and registration procedure; then we do not even need to think about it, nor develop (and maintain) anything for it.

But what if you do want that freedom of design? To stay with the login example: a good business case, one that is said to make your conversion rate shoot up, is this: people can post a comment anonymously and only after that get a login or registration screen. With the option to register with OpenID, Twitter, Facebook and so on.

Then all the advantage of a free, already thought-out system disappears. And that is exactly what I see happening everywhere: thousands of euros in a quote for a system that deviates from how the CMS “already did it”.

The argument “standard where possible, custom where it really must” thereby only supports my case: a standard product is simple, cheap and quick to roll out (and if it is not, something else is thoroughly wrong). Then a budget of several hundred thousand euros cannot be justified: in practice that money always sits in the “custom where it really must” part. And so we come full circle: “standard where possible, (a little bit of) custom where it really must, and otherwise a tool that is much better suited to custom work”. Which means Drupal can mainly be used for smaller, cheaper projects, and the large projects (such as government or enterprise sites) need more suitable tools.

It is not for nothing that people say “A large portion of time spent building […] is spent undoing the assumptions that Drupal has baked into core directly”.

And with that we have design, interaction design and maintenance covered as well.

Design:

Designers have to conform to the CMS and to how things are done in it.

It must be said that Drupal may not be the easiest to learn for designers (themers), but it is certainly one of the most flexible (if not the most flexible) when it comes to design.

Incidentally, there is also a lot of discontent among Drupal designers about theming. To a large extent that has an architectural cause: Drupal is not MVC and has no “designed” theme layer, but largely an organically grown “area”. That resulted in an inconsistent and messy “interface”, where the interface is the toolbox with which the designer/themer gets to work.

(Interaction) design

A framework makes far fewer (or no) assumptions about interaction design. A CMS does. A CMS offers (clearly) delineated boundaries: this is how we do it, and no other way.

When this interaction has been tailored very well to a specific purpose in the CMS, you have what I earlier called a hyper-focused CMS. PHPBB, a forum tool, hardly needs a customised interaction design either: after all, it is already fully optimised for managing forums. That PHPBB limits your freedom to design your own admin interfaces, workflows and the like hardly matters.

But a CMS that acts as a generic box of building blocks does require it. There you need the freedom to set up your menu structures entirely yourself, to set up optimal workflows yourself. In short, to be able to follow the interaction design, the wireframes or the functional design exactly.

Then “using Drupal optimally” is suddenly much less valuable, because you simply want to be able to follow the design and not have to take into account the ill-fitting ideas of a module builder.

Maintenance.

When you build a site like this, in which you have to “undo” a great deal, you not infrequently end up with a hundred, a hundred and fifty modules. A large part of which was built very specifically for the project or case.

Incidentally, the average Rails project I am familiar with also has about 50 external, required libraries (gems). But fifty libraries is something quite different from a hundred and fifty modules. My experience with Django, Symfony or .NET projects is too limited to make a general statement about them, but I expect roughly the same. A recent CakePHP project I built had two such libraries: a PDF library and a Twitter library. Nothing more.

In Drupal projects I not infrequently come across the following pattern:

Core does X1
A contrib module turns that into X3
A custom module undoes part of the contrib module and produces the desired result X2
The theme and theme preprocessors build X2’ from that.

That is also what Development Seed meant by “undoing”. And so there is twice as much code and twice as many features in there as needed: first the features, and the code they require, that are not going to be used. And then code to hide those features again.

Another common pattern is:

Contrib A does X in its own little way, not quite matching the interaction design.
Contrib B does Y in another little way of its own.
A custom module makes sure X and Y are consistent and work together according to the interaction design.
The theme uses this data, processes and adapts it, into a custom interface.

In Drupal that is usually called “glue code”. Because it is a CMS and not a framework, modules (and core) have these “little ways”. A (good) library has no little ways, but is at most “opinionated”, which means that it makes technical assumptions, such as the naming of your database.

This pattern also causes so-called “tight coupling” between the theme and the implementation. A theme cannot work unless all three modules are available and configured exactly according to a pattern. And conversely, without the theme (or with a different theme) the site will work very differently (or not at all). “Tight coupling” is a well-known cause of many maintenance problems and of a huge number of bugs.

A Drupal module, however, is library, implementation and often even the design of it, all in one. A library is only a library. The implementation, and certainly the design, is up to the builder. So no undoing is needed. And no glue code (or, as someone once told me: building with Django is nothing but writing glue code).

When you pull in a few packages (libraries) in Django, or include a few gems in Rails, the pattern is very different:

Core does nothing.
Gem X offers a number of abstracted helpers.
Your Rails app uses those helpers to get the desired result, with custom code.

An example:

Rails has no upload functionality of its own.
Carrierwave offers the ability to attach files to “objects” very easily.
The “article” model you developed uses a few lines of code to add images to articles, to process them (thumbnails, watermarks and so on) and to make the results available and display them.
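The shape of that pattern can be sketched in plain Ruby. Note that this is not Carrierwave’s actual API: Attachable, attach_file and Article are hypothetical names, and the “library” below only records a file name. The point is merely to show where the library ends and your own code begins.

```ruby
# "Gem X": a small library that offers one abstracted helper and nothing else.
module Attachable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Defines a reader and a writer for a named attachment on the class
    # that includes this module.
    def attach_file(name)
      define_method(name) { @attachments && @attachments[name] }
      define_method("#{name}=") do |filename|
        @attachments ||= {}
        @attachments[name] = filename
      end
    end
  end
end

# "Your app": a few lines of custom code decide how the helper is used.
class Article
  include Attachable
  attach_file :image

  # Implementation and presentation stay in the app, not in the library.
  def thumbnail_name
    image && "thumb_#{image}"
  end
end

article = Article.new
article.image = "bakery.png"
puts article.thumbnail_name # prints "thumb_bakery.png"
```

Nothing in the library has to be undone or hidden: whatever the app does not call simply does not exist for the app.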

For maintenance, then, what matters is that to achieve the desired “design” with a CMS, a great many extra modules, add-ons or custom code often have to be shipped along. These usually vary widely in implementation, security and quality. Managing 150 modules is hell. All the more so when a large part of them is glue code and undoing.

With a framework, libraries can vary just as much in security and quality, but the way they are used is so different that managing these libraries takes hardly any effort. The way they are used also always follows very clearly described patterns, because that is laid down in the framework. As a result, any (tidy) Rails app can be understood by any Rails developer within a few hours.

Moreover, frameworks are in essence little more than tools for managing and implementing all those libraries. With a CMS, managing modules or add-ons is often only a side issue; after all, it is a Content management system and not a Code management system. In Drupal this is admittedly changing with tools like Features and Drush. But it still does not come anywhere near a tool like Bundler.

So when you conform to the CMS, you not only have much less development and thinking to do, maintenance also becomes a lot cheaper. But a project in which you hardly develop anything, let your functional design be guided entirely by the conventions of the CMS, and in which maintenance is a piece of cake, surely costs next to nothing?! A site like that is made in a week. It costs a couple of thousand euros in design and theming and can by definition never cost hundreds of hours of work, because that work is precisely what you avoided by doing everything the standard way!

So why does a Drupal site still often cost several (tens of) thousands of euros? Am I missing an important cost item? Or are we mainly building “custom where it must” with Drupal?

https://berk.es/2011/03/10/drupal-is-ook-minder-geschikt-voor-afwijkende-custom-interactie-en-functionele-ontwerpen

But in practice it is not all that bad, is it? Drupal is very often used successfully, right?

Mar 7, 2011 Updated Mar 7, 2011


A number of people, on Twitter or by mail, found my article on why Drupal is not a good option for large projects such as municipal sites not very concrete. One response:

A number of successful sites for municipalities have recently been built with Drupal. Surely that proves you wrong?

Of course! These sites prove that Drupal can indeed be used for such projects. Nowhere do I say it cannot. So first, once more, the essence of my argument:

Drupal is a fine platform for building sites. Large sites too. But Drupal is certainly not the most suitable platform for building those large sites. Drupal will always turn out more expensive (and possibly deliver poorer projects) than a more suitable tool. Frameworks like Django, Rails or Symfony are more suitable tools.

And now the practical part. I have since had some time to play with DevCMS. Note that this is not a Drupal versus DevCMS article; I present DevCMS as a product that a team can build with a framework.

As all Drupal developers know, an access model in Drupal is very tricky. It is enormously complicated to build a tightly regulated editorial workflow in which articles only become visible after passing through an administrative mill. Take something seemingly simple like this (a quote from a recent request for proposal, simplified by me): A group of articles is written in draft by the editorial staff, then edited, then goes to the Legal Affairs department […], on approval to the person with final responsibility and on rejection back to the author(s) […]. The person with final responsibility can publish the group of articles, after which old revisions are kept and the approved revision set goes online.

I honestly would not know how to do that easily in Drupal. As a rough guess, I estimate at least 100 hours of development for this feature (though I would only dare give a real estimate after a proof of concept). The crux, by the way, is in the detail “group of articles”, which could be an article with sub-articles. And in “tightly regulated”, because something that roughly resembles this is fairly easy in Drupal.

DevCMS does have this. I obviously do not know how long that team worked on it, but I know from experience that with Rails’ state machine add-ons such workflows do not take an enormous amount of work and time. From the manual:

Nodes in the tree can be in several states, which is stored in the node’s status attribute:

• unapproved: The node has been created or changed by an editor and is waiting for approval.
• approved: The node has been approved, created or updated by an administrator or final editor. A new version of the content node is recorded.
• rejected: An unapproved node has explicitly been rejected by an administrator or final editor. The editor responsible for the change will be notified when a node enters this state.
• drafted: The node (unapproved or approved) is drafted, meaning a user has not finished changing this node and should therefore not be shown on the website. The node will not be listed for approval.

When a content node is created by an administrator or final editor, it is automatically considered to be approved and therefore a new version of this node is recorded. [….] Conclusively, when an editor makes a change, the table of the content type will contain the new, unapproved version and the website should show the yaml-ized, approved version until the unapproved version has been approved by an administrator or final editor.

They themselves note in the manual that a lot of work remains here for the DevCMS developers, and that they have by no means finished it to their own satisfaction. But my first impression is that this is already many times better than any editorial workflow I have ever seen implemented in Drupal.

Drupal can handle something as essential as editorial workflows, but it demands a solution that has to be built practically from scratch: hundreds of hours of clicking together standard components and bits of custom code, which every project has to pay for again, for each Drupal site to be implemented. And the result very often turns out to be poorly finished in the details, resulting in workflows that usually fit an organisation reasonably, but rarely precisely.

In a framework, by contrast, it is fairly easy to use components such as a state machine to build a workflow that fits the project exactly. That workflow also has to be built from scratch, but thanks to much more suitable reuse of other kinds of libraries it can be embedded far more precisely in an existing workflow, and it is also much faster and easier to implement.
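
To make that concrete, here is a minimal sketch of such a state machine. It is plain PHP rather than any Rails add-on. The state names are taken from the DevCMS manual quoted above; the NodeWorkflow class and its transition table are my own assumptions for illustration, not how DevCMS implements it.

```php
<?php
// Hypothetical sketch: the state names come from the manual quoted above,
// the allowed transitions are an assumption made for this example only.
final class NodeWorkflow
{
    /** Allowed transitions, as: current state => list of reachable states. */
    private const TRANSITIONS = [
        'drafted'    => ['unapproved'],
        'unapproved' => ['approved', 'rejected', 'drafted'],
        'rejected'   => ['drafted', 'unapproved'],
        'approved'   => ['drafted', 'unapproved'],
    ];

    private string $status;

    public function __construct(string $status = 'drafted')
    {
        if (!isset(self::TRANSITIONS[$status])) {
            throw new InvalidArgumentException("Unknown status: {$status}");
        }
        $this->status = $status;
    }

    public function status(): string
    {
        return $this->status;
    }

    /** Whether the transition table allows moving to $target right now. */
    public function canMoveTo(string $target): bool
    {
        return in_array($target, self::TRANSITIONS[$this->status], true);
    }

    /** Move to $target, or throw when the workflow forbids it. */
    public function moveTo(string $target): void
    {
        if (!$this->canMoveTo($target)) {
            throw new LogicException("Cannot go from {$this->status} to {$target}");
        }
        $this->status = $target;
    }
}
```

Fitting the workflow to a different organisation then mostly means editing the transition table, which is the kind of precise fit argued for above.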

A CMS essentially always asks the organisation to adapt to the editorial workflow baked into that CMS, whereas a framework gives developers building blocks to develop a precisely fitting workflow.

And that is just one small practical example. I am sure we can come up with many more like it.

What seemingly simple problem has cost you hours of research and development time in Drupal?

What does Drupal, or another CMS, do out of the box (without any work) that takes hours of development in a framework? Edit: second question added.

https://berk.es/2011/03/08/maar-in-praktijk-valt-dat-toch-allemaal-wel-mee-wordt-drupal-toch-erg-vaak-succesvol-ingezet

No CMS, and certainly not Drupal, for large web projects such as municipal sites.

Mar 6, 2011 Updated Mar 6, 2011


Municipalities and governments should not build their sites with Drupal.

Background.

I am a fan of Drupal, have been developing with it for years and have a lot of experience with successful and just as many failed Drupal projects. The latter mainly through my role as “troubleshooter” on Drupal projects. People mostly hire me to rescue their stalled or overrunning Drupal projects. I am (perhaps precisely because of that) also a Drupal sceptic. Drupal is used too much and too often for projects it is not at all suited for. Drupal also lacks just about everything a good “architecture” requires. It lacks a matured security model, abstraction is entirely absent, junk and shoddy work in the community (the modules) are the rule rather than the exception, and so on.

I am also a pragmatist: work has to get done (within the deadline). Sites have to look good (and not cost too much). Software simply has to work (for the end user), and so on. A sound academic design is nice, but never the goal: a well-designed project is designed that way precisely because it lets you reach the goals above! And never for the sake of “good design” in itself. So Drupal should be used where it can be successful, where it comes into its own. And above all it should not be used for projects it is unsuited for.

CMSes and their alternatives.

Roughly, there are three groups of software with which sites can be made: CMS, framework or barebones.

A CMS is a ready-made package that can be used to manage content without any programming. For convenience the C of Content is usually stretched very wide: software to run a web shop or packages to manage customers are also lumped into this group, as long as it can be deployed without programming.

A framework is an environment, platform or software package with which programmers can build sites. Frameworks have often already abstracted all sorts of complex matters for the programmers, so that programming is little more than getting a range of settings right and fleshing out pre-baked code.

It is an enormously broad category, so from here on I limit myself to a few modern, widely used frameworks: Symfony, Django and Ruby on Rails.

A barebone is not really a system, platform or environment, but rather the absence of one. Simply building something yourself, from the ground up. Almost every “web design agency” has built its own CMS this way. Nearly all large “enterprise environments” are built this way.

Open Source

«The municipality of Grootezee now has ExpensivePaidCMS X; Drupal is Open Source, so it is better.» That is an often-heard argument. Let me state up front that for public services such as municipal sites in particular, two things are enormously important:

For visitors and users: accessibility and compliance with standards.
For governments: independence from companies, contracts and licences.

But Drupal is certainly not the only solution that can meet this. Nor does Open Source mean that something has to be purchased as Open Source. A municipality can perfectly well buy up an existing closed source package, or have something developed entirely from the ground up, and then release it as Open Source! I even see many government projects for which Drupal modules were commissioned that have not been released! Fortunately, some projects do turn out to release them. In any case, using Open Source is no guarantee that the investment also flows back to the community.

With a framework it is exactly the same: using Open Source technology does not guarantee that the end product is given back to the community. But it can be; see devCMS, the CMS custom-made for Deventer.

Return on Investment: building a site effectively.

The most important argument remains, however, that Drupal, as a CMS, is hardly suited for building large, complex custom projects. That has largely to do with the technical design of these CMSes.

To illustrate this, some graphs are included below. They give a rough idea of how much effort one has to put into developing a site after X hours.

The y-axis shows the “effectiveness”: the number of hours needed per delivery. 100% effective means no effort at all and still something delivered (and so does not exist). The bottom is 0%: an enormous amount of work done and nothing delivered. The graphs assume experienced developers, so the initial learning time (to learn a language or framework) is left out of consideration.

The x-axis shows the number of hours spent on the whole project. For example: after 100 hours of Drupal development, further development becomes ever more expensive. The developer then ends up in a domain where he or she has to do everything alone, has to build modules and needs knowledge that is less and less readily accessible (known among developers as: having to read the entire Views source code to solve that stupid Cartesian product).

[Figure: effectiveness plotted against investment]

Incidentally, this also immediately exposes another common fallacy: a large project is a project with a long lead time, many man-hours and large budgets. That says nothing about the eventual use of a site. Drupal can very well be used for a site with millions of visitors. Conversely, an intranet environment for just a few dozen researchers may swallow thousands of man-hours and thus be a very large project.

WordPress, or the hyper-focused CMSes

WordPress is a CMS that was developed for one purpose: blogging. That makes it highly optimised for this task: user-friendly, purposeful, tuned to its audience and so on. It also immediately indicates what it cannot do: namely, everything else. This naturally applies to developers too: with WordPress they always build blog-like sites. Anything else simply is not possible. This way of designing software is also called opinionated software. Other examples are phpBB, the well-known forum system, MediaWiki (the software behind Wikipedia, among others) or Status.net (set up your own Twitter-like community).

Within a few hours you have a site running, but real custom changes outside the clear scope and purpose of the CMS take an enormous amount of work.

Drupal, or the boxes of building blocks.

Drupal is often, wrongly, called a content management framework, by which people try to indicate that Drupal can hold its own quite well as a framework. The cause of this confusion is that Drupal is really more of a box of building blocks. Unlike a CMS such as WordPress with only one purpose, Drupal has no such focus at all. As a result Drupal is not user-friendly (friendly to use for what?), not optimised and not tuned to an audience. That gives flexibility and freedom. Developers can build much more with it than just what it was once intended for.

But it is emphatically not a framework, because it offers little to programmers in particular: it is not object-oriented, has hardly any abstraction, requires an enormous amount of duplicated code for simple changes, lacks any form of development environment, and has hardly any migration system (rolling out a site via a test environment to the live environment). And so on.

For the original audience that is not a problem at all: a small site has no need for a DTAP pipeline. My personal blog published through an ISO 9002, ITIL-certified workflow? Come on! But what if you do want this? What if you do want to hook your system up to other processes, tools or systems? Then you have to build everything yourself, with hardly any technical infrastructure. The effectiveness of all that extra work is then usually even lower than if everything had been built “barebone” from scratch. Drupal then often turns out to “get in the way” more than it “helps”. This is independent of experience with and knowledge of Drupal: someone with a lot of experience just as easily spends 100+ hours on a simple Create Read Update Delete environment for custom “things” such as, say, payments. Drupal offers hardly any tools for this: all pages, list views and workflows have to be programmed by hand, as if it were built in a barebone PHP environment.

Drupal has a steep learning curve, much steeper than the purpose-built CMSes. Within some ten hours there is a basic site. Within a week and a half a quite decent custom site can be set up with standard components. But after that all productivity collapses: outside the use of standard components, Drupal is a poor development platform.

The frameworks

These have a fairly long start-up time, which is really determined mostly by one's experience with them.

Someone who has already built a few sites in Ruby on Rails has a basic site up within a few hours, but a team that has hardly any experience with frameworks, or with the language they are built in (Ruby for Rails and Python for Django), will of course first have to master the language and the concepts.

That is much less necessary with a system like Drupal, and not at all with a focused system like WordPress. Someone can set up a quite decent site without programming. With Django you will not get far if “objects and classes” only appear on the to-do list under “should look into this sometime”.

But after that things move fast. Once the team, or the developer, knows how things work, understands how it fits together conceptually and has mastered the development tooling, further development and custom work keep costing the same effort. In effect, with a framework one develops a CMS that belongs in the first category: a hyper-focused CMS.

A framework is also much better optimised for working in teams. Drupal offers hardly any abstraction (towards databases, services and so on), and has hardly any migrations, test environments or deployment tools. In frameworks you usually cannot even develop without these.

Moreover, the literature on frameworks is far more aimed at developers. It explains precisely the structure, philosophy and architecture, whereas with a CMS the use of the end product is always in the foreground. The learning curve for developers and development teams is therefore usually a good deal lower than with a CMS that is really meant not for developers but for end users.

Furthermore, a good framework will also allow reuse of code. More so than a CMS, where custom work is exactly that: custom work, and so by definition hardly reusable, if at all. A framework is set up so that custom work takes the least possible effort. And it provides structure and standards, so that self-developed libraries can be built separately from the custom work: a link to an external system then consists, for example, of an abstract “general Python DigiD library” with a thin layer of custom work on top (the integration of that library into your Django project). This also makes releasing work as Open Source much more attractive than is the case with many CMSes. A Drupal DigiD module can only be used in Drupal (so that the government in question has “locked itself into” Drupal). Whereas a DigiD Ruby gem, a Java library, or a PHP or Python package is much more widely usable. Such projects are really much more valuable to the (open source) community.

When a site is built by professionals, a product can be ready within a few hours. And developing it further costs hardly any extra time per component. Until the moment one wants to go outside the bounds of that platform. Then suddenly one's own libraries or services have to be developed. But that applies just as much to advanced features in a CMS. I mean mainly complex matters such as worker queues, load balancing, communication with other systems and so on. A good framework, however, will certainly not get in the way here and may already have all kinds of tools (hooks, plugin systems, workflows) ready to integrate such things.

Conclusion

Drupal is very interesting for projects that stay under, roughly, two hundred hours; projects that cost roughly less than €5000. Above that, a framework, provided it is used by experienced web developers, will always prove more efficient. Drupal then offers no advantage at all, other than that it is “Open Source”. But Open Source works just as well with a framework.

Governments and large projects should therefore preferably not be built in Drupal, but in a framework much better suited to the purpose.

https://berk.es/2011/03/07/geen-cms-en-al-zeker-geen-drupal-voor-grote-web-projecten-zoals-gemeentesites

Why I chose to disclose a security issue and not report it to the Drupal security team.

Feb 2, 2011 Updated Feb 2, 2011


Okay. So I did not play nice. In fact, I probably got quite a few sites out there into trouble by disclosing a Drupal security issue on Twitter, without mentioning it to the security team.

I had several reasons for doing this.

* I was frustrated with this module, with its code, and with it causing several ugly bugs on an already frustrating site. Being frustrated and having access to Twitter is never a good combination. More on this below.
* It is one of many security issues in contrib modules I have stumbled upon of late. Some I have reported; quite a few are hard to reproduce and not worth reporting. I am by no means a security expert. Hence the frustration.
* It is one of many, many more project-specific security issues I have come across of late. Some in custom code, some in themes, many more in crappy configuration and even crappier custom glue code. Hence the frustration: I often get the idea that it is way too easy to write crappy, insecure or badly performing Drupal code. I know of other projects where it is much harder to build insecure code.
* This specific issue has been around since December 2007. That was the main reason for me to vent my anger and disclose the issue.

It is never smart to post such issues when frustrated. And I am very sorry if I got the Drupal security team into trouble by this. That was not intended.

When I see the quality of contributions, I often get very sad. Or frustrated. I, too, often write bad code; I, too, learn new things about writing proper code every day. And I try to improve my code by not letting in features, code or other stuff that lacks Good Architecture, fails to fit in the Grand Scheme and so on.

**If code is bad, people should not use it! At all. Bad code should not be allowed to exist.**

Bad code will exist. Bugs will creep in. Security holes will open. That is reality. But we should not allow such things to linger for long. Any software project should have processes in place to weed out bad code, security issues and the like. Drupal has such processes; one of them is the security team.

However. This hole has existed (at time of writing) for probably over 3 years. No one has reported it, yet over 2000 sites are reportedly using this module. Something is wrong here. Had I reported it to the security team, some patch would have been released. And all 2000 sites would have been patched (you patch, don’t you?). At least the choice to either close down the project, fix it, or do anything else would have been that of the security team, not mine. I understand that fully. However, this time I chose disclosure. For two (IMO) good reasons:

* A project with a security hole that is actively used by thousands of users for several years is wrong. This is the proof that at least some trivial security holes will leak through the current process. We must be aware of that fact.
* People should know their own responsibility. I would be very optimistic to say that all 2000+ reported users of this module have found that hole themselves and fixed it locally, but did not manage to report it to Drupal's security team. It is more realistic to think that hardly any of these reported users have fixed it locally. To me, that is a good indication that many eyeballs fail to find security holes. This, too, must be known.

I should probably have taken another route to raise such awareness. But in the light of things, I find full disclosure a good way to raise this:

**Your Drupal (or WordPress, Joomla! or proprietary) site with 100+ modules and custom code is probably insecure. Unless you have reviewed it and know for sure it is not.**

Be aware of that.

https://berk.es/2011/02/03/why-i-chose-to-disclose-a-security-issue-and-not-report-it-to-drupal-securty-team

Clean and maintainable pattern for blocks development in Drupal 6

Dec 30, 2010 Updated Dec 30, 2010

Drupal 7 has finally removed the confusing $op parameter from hooks and replaced it with a family of related hooks instead: one hook per op.

Here is a way to achieve the same in Drupal 6, by building a simple router in hook_block(). We use call_user_func() for this, dispatching to functions by name, a pattern well known in Drupal as hooks.

/**
 * Implementation of hook_block().
 */
function example_block($op = 'list', $delta = '', $edit = array()) {
  if ($op == 'list') {
    return _example_block_list();
  }

  $callback = "_example_block_{$op}_{$delta}";
  if (function_exists($callback)) {
    if ($op == 'save') {
      return call_user_func($callback, $edit);
    }
    else {
      return call_user_func($callback);
    }
  }
  # @TODO remove debug
  # else {
  #   dvm("block callback not found: {$callback}");
  # }
}

From here on, we can implement a simple family of functions, instead of cramming everything into one huge, cluttered, multifunctional hook implementation.

For example, we implement the list callback as

function _example_block_list() {
  # ...build blocks...
  return $blocks;
}

Note the leading underscore in the function name. It has no special technical meaning, but is a de facto convention in PHP to indicate that a function should be considered private. This way, we tell other developers never to rely on our callbacks, leaving us the freedom to change our functions at will.

But first, dissecting our hook_block will show what we actually do:

if ($op == 'list') {
  return _example_block_list();
}

This makes an exception for list. The list operation does not receive a delta, because it is the one that defines the deltas. “Delta” is a severe misnomer in Drupal, because it is simply an identifier. However, Drupal calls it a delta, so we should too. A common misunderstanding is that a delta must be numeric, because that is what the name implies. This implementation works best with simple textual identifiers (letters, digits and underscores, since they become part of a function name).

With the list we define the blocks. For example to define two blocks, foo and bar

function _example_block_list() {
  $blocks['foo'] = array(
    'info'       => t('Renders block foo'),
  );
  $blocks['bar'] = array(
    'info'       => t('Renders block bar'),
  );
  return $blocks;
}

There are a lot more parameters you can define; they are all included in the final example implementation.

Back to the hook_block, we see

$callback = "_example_block_{$op}_{$delta}";
if (function_exists($callback)) {
  if ($op == 'save') {
    return call_user_func($callback, $edit);
  }
  else {
    return call_user_func($callback);
  }
}

A callback name is built; the code checks whether a function with that name exists and, if so, calls it. We make one more exception, for save. Save takes an extra parameter, $edit, which none of the other callbacks take. To keep things clean, we only pass parameters to functions that actually use them, hence the exception for save.
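
The routing idea can be tried outside Drupal. The sketch below assumes nothing but plain PHP; demo_block_dispatch() and the _demo_block_* callbacks are hypothetical names used only for this illustration.

```php
<?php
// Standalone sketch of the routing idea, runnable without Drupal.
// All demo names are hypothetical and exist only for this example.

function _demo_block_view_foo() {
  return array('subject' => 'Foo', 'content' => 'Foo content');
}

function _demo_block_save_foo($edit) {
  return 'saved: ' . $edit['read_more'];
}

function demo_block_dispatch($op, $delta = '', $edit = array()) {
  // Build the callback name from the operation and the block identifier.
  $callback = "_demo_block_{$op}_{$delta}";
  if (!function_exists($callback)) {
    // Nothing implemented for this op/delta pair.
    return NULL;
  }
  // Only 'save' receives the submitted values.
  if ($op == 'save') {
    return call_user_func($callback, $edit);
  }
  return call_user_func($callback);
}
```

Calling demo_block_dispatch('configure', 'foo') returns NULL here, because no _demo_block_configure_foo() exists; that is the case the debug message in the real hook is meant to surface.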

Finally, we add a little piece of code that tells us about callbacks that are not implemented, so we can implement these, or ignore them.

# @TODO remove debug
# else {
#   dvm("block callback not found: {$callback}");
# }

Obviously to be removed before releasing.

We can now implement some of the callbacks. To do that, say we want a setting on block foo, but not in bar. This setting will allow us to toggle a “Read more »”-link on block foo in the block’s configuration. For this, we need a configure callback and a save callback.

/** Block callback for configure op, delta foo.
 *
 * @return Array
 *   Form api array.
 */
function _example_block_configure_foo() {
  $form = array();
  $form['read_more'] = array(
    '#type'          => 'checkbox',
    '#title'         => t('Show "Read more »"'),
    '#default_value' => variable_get('example_foo_read_more', FALSE),
  );
  return $form;
}

Configure simply returns a FAPI form, as per hook_block documentation.

/** Block callback for save op, delta foo.
 *
 * @param $edit Array
 *   The submitted form values.
 *
 * @return Boolean
 *   TRUE once the setting has been stored.
 */
function _example_block_save_foo($edit) {
  ## Save values for block
  variable_set('example_foo_read_more', $edit['read_more']);
  return TRUE;
}

Save simply stores the submitted setting in a variable.

Finally, we need a view callback for each block: the function that actually renders the block.

/** Block callback for view op, block foo.
 *
 * @return Array
 *   Block array with content and subject key.
 */
function _example_block_view_foo() {
  $block['subject'] = t('Title of block #1');
  $block['content'] = 'Content of block #1';
  if (variable_get('example_foo_read_more', FALSE)) {
    $block['content'] .= l(t('Read more »'), 'foo/more');
  }
  return $block;
}
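
To see how save and view hang together, here is a small round trip that runs outside Drupal. demo_variable_get() and demo_variable_set() are in-memory stand-ins I made up for Drupal's variable_get() and variable_set(), and the other demo_* functions are hypothetical, simplified copies of the callbacks above.

```php
<?php
// In-memory stand-ins for Drupal's variable storage, so the example can run
// without Drupal. Every demo_* name here is hypothetical.
$GLOBALS['demo_variables'] = array();

function demo_variable_get($name, $default) {
  return isset($GLOBALS['demo_variables'][$name]) ? $GLOBALS['demo_variables'][$name] : $default;
}

function demo_variable_set($name, $value) {
  $GLOBALS['demo_variables'][$name] = $value;
}

// Simplified save callback: stores the checkbox value from the block form.
function demo_block_save_foo($edit) {
  demo_variable_set('example_foo_read_more', $edit['read_more']);
}

// Simplified view callback: appends the link text only when the setting is on.
function demo_block_view_foo() {
  $block = array(
    'subject' => 'Title of block #1',
    'content' => 'Content of block #1',
  );
  if (demo_variable_get('example_foo_read_more', FALSE)) {
    $block['content'] .= ' Read more »';
  }
  return $block;
}
```

Before demo_block_save_foo() has run, the view falls back to the FALSE default and renders without the link; after saving a truthy value, the link text is appended.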

A similar view for bar is needed, this time without the optional “Read more »”.

/** Block callback for view op, block bar.
 *
 * @return
 *   Block array with content and subject key.
 */
function _example_block_view_bar() {
  $block['subject'] = t('Title of block #2');
  $block['content'] = 'Content of block #2';
  return $block;
}

Once you get more and more blocks, you can even split them out over include files. By using module_load_include(), we can load a separate file for each delta. This only becomes useful when your module has many different blocks, each with many helper functions and libraries.

function example_block($op = 'list', $delta = '', $edit = array()) {
  if ($op == 'list') {
    return _example_block_list();
  }

  $lib_file = "example_{$delta}";
  module_load_include('inc', 'example', $lib_file);
  # ...
}

This allows us to move all callbacks into this include. The exception is the list callback: that one should remain in the module, since it defines the available blocks. It would be possible to make that dynamic too, by calling an info callback per include file. The benefit would be that the include files are completely self-contained, with all the information about a single block in a single file. The downside is complexity and overhead.

We use $delta to group the libraries, so you have one file for each block. The alternative would be a library per op. That only makes things worse, since your code will be spread all over the place, and adding, removing or changing a block would often require you to touch all four files, instead of just the module and the block file.
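
The dynamic list mentioned above could look roughly like this. It is a sketch under assumptions: each block exposes a hypothetical _example_block_info_<delta>() callback, which in a real module would live in that block's include file, and the list of deltas is hard-coded here, whereas a module would have to discover it somehow. Neither is part of Drupal's API.

```php
<?php
// Hypothetical per-block info callbacks. In a real module each of these
// would sit in its own include file next to the block's other callbacks.
function _example_block_info_foo() {
  return array('info' => 'Renders block foo');
}

function _example_block_info_bar() {
  return array('info' => 'Renders block bar');
}

// Collect the block definitions by asking every known delta for its info.
// Deltas without an info callback are silently skipped.
function _example_block_list_from_info($deltas) {
  $blocks = array();
  foreach ($deltas as $delta) {
    $callback = "_example_block_info_{$delta}";
    if (function_exists($callback)) {
      $blocks[$delta] = call_user_func($callback);
    }
  }
  return $blocks;
}
```

The extra lookup per block is the overhead mentioned above, and the module still needs some way to know which deltas exist.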

The final *example.module* looks like this:

<?php // $Id$
/**
 * Module:      example for blocks pattern
 * Date:        2010-12-31  10:18
 * Author:      ber
 *
 * Description:
 *   Example blocks 
 *
 * License:
 *
 *   Copyright (C) 2010  ber
 *
 *   This program is free software: you can redistribute it and/or modify
 *   it under the terms of the GNU General Public License as published by
 *   the Free Software Foundation, either version 3 of the License, or
 *   (at your option) any later version.
 *
 *   This program is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *   GNU General Public License for more details.
 *
 *   You should have received a copy of the GNU General Public License
 *   along with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 */

/**
 * Implementation of hook_block().
 */
function example_block($op = 'list', $delta = '', $edit = array()) {
  if ($op == 'list') {
    return _example_block_list();
  }

  $callback = "_example_block_{$op}_{$delta}";
  if (function_exists($callback)) {
    if ($op == 'save') {
      return call_user_func($callback, $edit);
    }
    else {
      return call_user_func($callback);
    }
  }
  #else { #TODO: remove debug!
  #  dvr("block callback not found: {$callback}");
  #}
}

/** Renderer lists all blocks.
 *
 * @return
 *   Block description array. see hook_block() documentation for details on array contents.
 */
function _example_block_list() {
  $blocks['foo'] = array(
    'info'       => t('Renders foo'),
    'cache'      => BLOCK_CACHE_PER_ROLE, # | BLOCK_CACHE_PER_USER | BLOCK_CACHE_PER_PAGE | BLOCK_CACHE_GLOBAL | BLOCK_NO_CACHE,
    'status'     => TRUE,
    'weight'     => 0,
    'region'     => '',
    'visibility' => 1,
    'pages'      => '',
  );
  $blocks['bar'] = array(
    'info'       => t('Renders bar'),
    'cache'      => BLOCK_CACHE_PER_ROLE, # | BLOCK_CACHE_PER_USER | BLOCK_CACHE_PER_PAGE | BLOCK_CACHE_GLOBAL | BLOCK_NO_CACHE,
    'status'     => TRUE,
    'weight'     => 0,
    'region'     => '',
    'visibility' => 1,
    'pages'      => '',
  );
  return $blocks;
}

/** Block callback for configure op, delta foo.
 *
 * @return Array
 *   Form api array.
 */
function _example_block_configure_foo() {
  $form['read_more'] = array(
    '#type'          => 'checkbox',
    '#title'         => t('Show "Read more »"'),
    '#default_value' => variable_get('example_foo_read_more', FALSE),
  );
  return $form;
}

/** Block callback for save op, delta foo.
 *
 * @param $edit Array
 *   The submitted form values.
 *
 * @return Boolean
 *   TRUE once the setting has been stored.
 */
function _example_block_save_foo($edit) {
  variable_set('example_foo_read_more', $edit['read_more']);
  return TRUE;
}

/** Block callback for view op, block foo.
 *
 * @return Array
 *   Block array with content and subject key.
 */
function _example_block_view_foo() {
  $block['subject'] = t('Title of block #1');
  $block['content'] = 'Content of block #1';
  if (variable_get('example_foo_read_more', FALSE)) {
    $block['content'] .= l(t('Read more »'), 'foo/more');
  }
  return $block;
}

/** Block callback for view op, block bar.
 *
 * @return
 *   Block array with content and subject key.
 */
function _example_block_view_bar() {
  $block['subject'] = t('Title of block #2');
  $block['content'] = 'Content of block #2';
  return $block;
}

From here on, you can start improving even more, e.g. by splitting out the theme functions that render blocks, the database or parsers that fetch and model the data and so on. But that is for later.

https://berk.es/2010/12/31/clean-and-maintainable-pattern-for-blocks-development-in-drupal-6

Local whistleblower site Opennu is a farce

Dec 19, 2010 Updated Dec 19, 2010


Today Michel Spekkers released a "local WikiLeaks". I am a strong proponent of transparency and of what WikiLeaks does. So I can only applaud it when the example of WikiLeaks is followed and the movement decentralizes. That only makes the movement stronger. But what was released here is far from a WikiLeaks. I would even call it a farce. A whistleblower outlet (WikiLeaks) stands or falls on one very important condition:

Anonymity must be guaranteed. Now and in the future.

That guarantee in turn requires three important conditions:

The organization must be legally very robust.
The infrastructure must be technically very robust.
The marketing and PR must be well set up.

The first is very unclear to me, so I can say little more about it than that it appears to be a one-man effort with nothing worked out legally. The reason this matters is simple: if the justice department shows up with a court order to collect the servers and the records, it must be possible to resist that. It cannot be that at the first setback all leaked information ends up in the hands of the Public Prosecution Service (OM). That is no guarantee of anonymity, that is merely a nice experiment.

The second is clear to me. It is simply very bad. Technically a great deal is missing. A little research gave me the following: the domain name opennu.nl is managed by domein-direct.nl, part of web-direct. The server is hosted at flexwebhosting, and in turn appears to be sublet to ingento, which at first glance does not seem to be registered with the Chamber of Commerce (KvK). According to a brief analysis, at least thirteen other sites run on this server.

I do not know those companies and cannot quickly find more information about them, but that does not matter. What matters is the simple fact that data you leak ends up, in the best case, on a shared environment run by a chain of well-meaning companies. Only one of these companies needs to give a government a good reason to seize the servers, and your leaked information, including your IP address and so on, ends up exactly where you do not want it.

An arguably even more important indication that this, most likely well-meaning, Mr Spekkers does not have his affairs in order and absolutely cannot guarantee your anonymity is the lack of https, a secure connection to upload over. The certificate is invalid, the https site does not exist at all, and so neither does the option to upload over it. As it stands, everyone on your network and everyone between you and opennu.nl can (and will) simply read along with what you send. Even if you send encrypted files, your IP address, the fact that you are sending something to opennu, and all accompanying data can simply be intercepted. And as far as I know this happens in every medium-sized or large organization that secures its network to any degree. So uploading that one PDF from work will get you caught almost immediately with this setup. And from home, anyone between you and opennu can still eavesdrop. Ziggo simply has logs in which people can read that you sent a PDF in the night of N to M. No, the way Spekkers has set it up now, it is at best a well-meaning but very naive affair. A quickly created Hotmail account in the name of Piet Snot, Kerkstraat 12, Ons Dorp offers more guarantees.

And that makes it a threat to the WikiLeaks movement rather than an addition to it. I dread the thought of the almost inevitable dismissal of a Dutch civil servant, after leaking via opennu, being covered widely in the press. That would only make the WikiLeaks movement come across as unreliable.

https://berk.es/2010/12/20/lokale-klokkenluiderssite-opennu-is-een-farce

Counter queries for complex, non-distinct SQL in Drupal's pager system.

Nov 9, 2010 Updated Nov 9, 2010


I think everyone knows these moments: you have a problem, a question. And just by asking that question, the answer pops up in your head. It happens to me often when programming. It is obvious why: by asking the question, you have to analyze and simplify the problem. And by doing so, you often arrive at the answer yourself.

Today Stackoverflow helped me in exactly that way: I had a problem with a Drupal pager query on a non-distinct SQL query. And right when I was finishing up the question, the answer struck me. But because I had put so much effort into the question, and I don't want to forget the answer, I decided to share it.

Drupal offers pager_query() for fetching a limited result set that is shown as a paged list.

A simple example would be (I am aware of my code not adhering to Drupal standards, done that for simplicity):

$html = '';
$nodes = pager_query('SELECT title, created FROM node WHERE published = 1', 20, 0);
while ($node = db_fetch_object($nodes)) {
   $html .= "$node->title ($node->created)";
}
$html .= theme('pager'); //This collects "magic" variables set by pager_query to build a string containing pagerlinks.

Now, I need to tackle a much more complex query, one whose result rows, unlike the ones above, are not distinct per node. I was not sure if I should solve this in the domain of SQL, or rather in the domain of Drupal/PHP.

session<->node is an N:1 relation: any node has_many sessions. A session has_one node.

$nodes = pager_query('SELECT node.nid, node.title, node.created, sessions.time, sessions.sid FROM node INNER JOIN sessions ON sessions.nodes_nid = node.nid WHERE published = 1', 20, 0);
$items = array();
while ($row = db_fetch_object($nodes)) {
   // Start a fresh entry the first time a node shows up.
   if (!isset($items[$row->nid])) {
      $items[$row->nid] = (object) array('nid' => $row->nid, 'title' => $row->title, 'sessions' => array());
   }
   // Every row carries one session; collect it under its node.
   $items[$row->nid]->sessions[$row->sid] = $row;
}

The above routine allows me to query the database ONCE, fetch nodes that have_many sessions, and collect them in a list that:

has one row per $node.
each $node row has a list of all its associated sessions under $node->sessions.

However, pager_query counts one item for each row, instead of one item per node, because it does not use a smarter counter query. This is where the answer became clear.

And so, the answer is really simple: a counter query.

The fourth parameter of pager_query(), $count_query, is an alternative query to be used as counter. In the above-mentioned example, that would be:

$sql = 'SELECT node.nid, node.title, node.created, sessions.time, sessions.sid FROM node INNER JOIN sessions ON sessions.nodes_nid = node.nid WHERE published = 1';
$counter = 'SELECT COUNT(DISTINCT node.nid) FROM node INNER JOIN sessions ON sessions.nodes_nid = node.nid WHERE published = 1';
pager_query($sql, 20, 0, $counter);

https://berk.es/2010/11/10/counter-queries-for-complex-none-distinct-sql-in-drupals-pager-system

You cannot criticize Apple.

Nov 7, 2010 Updated Nov 7, 2010


Open any forum topic or blog post with "Mac", "just works" and "viruses" in the title, and watch the discussions unfold like true religious wars. I suspect that Niels 't Hooft had that in mind with his piece in NRC Next, because otherwise he would not have written it so full of contradictions, fallacies and factual errors. Niels starts from the idea that viruses arrive with programs. His entire piece is essentially built on that. But viruses do not arrive because users install them; viruses you bring in yourself are called Trojan horses, and they are a very different problem. That is not semantics, it really is a problem of a different order. Mac and Linux, too, are unsafe against Trojan horses: if someone is foolish enough to install the application themselves, the most you can do is limit what such a Trojan horse can do. BSD, and with it the Mac, does that very well. So does Linux, by the way. What matters is being able to screen applications. The biggest argument against the closed nature of the App Store is therefore that nobody knows whether Apple actually checks for Trojans. Apparently not always equally well. Another argument may be even more important: is a closed model actually needed for control? Android, Firefox plugins, the Ubuntu installer and many more examples of slightly to very open systems are known in which security problems are investigated just the same. These models have not been proven worse (or better), but they are in any case more open. Niels likes to believe that closed is better, just as many an open source enthusiast believes that openness is always better. But that does not make it true. As far as I know, after many decades of research the experts still do not agree on which model is best. Mac knows that, but acts as if its model is the best. As any commercial company with good marketing should.

On his Mac, Niels is well protected against viruses (and here I do not mean Trojans) by the very well-built foundation, the operating system. An open source Unix variant named BSD. Linux (known under "brand names" such as Ubuntu, Debian, SuSE, Red Hat and so on) is another well-known variant. Windows uses a completely different "underside" that is in itself a good deal less secure. So the fact that Niels has little trouble with viruses is mainly thanks to the open source world. Not to the closed nature of the Mac. The other large share of viruses stems from holes in (widely) used software. Adobe (known for PDFs) and Internet Explorer are well-known examples of software with many security problems, holes through which viruses enter. Here too, openness has turned out to be the solution. In the case of Internet Explorer (IE), the web browser that ships with Windows by default, Microsoft had a monopoly. Everyone used this default browser, so Microsoft stopped developing it; a badly outdated system for which there was hardly an alternative was no longer improved. The crackers and virus writers did not sit still; IE did. Until Firefox came along: safer, faster, better. That was roughly the slogan with which it broke the monopoly of IE. Microsoft answered with further development: IE7, IE8 and now almost 9. Also marketed with slogans like "safer, faster, better". Choice turned out to be the medicine for the biggest digital security problem ever. Lack of choice turned out to be the cause of the ailment. The problems with Adobe's PDF reader are of a different nature, but they too stem largely, indirectly, from a lack of freedom of choice. Mac simply shipped this (leaky) software for its Macs too. Fortunately Mac users have the freedom to use other PDF software. For now. Until Mac strikes a deal with Adobe and refuses all other PDF readers. Admittedly, this specific example is almost unthinkable: Adobe and Mac are actually at odds. But hypothetically the problem is enormous. If Mac picks one app to do X, the user has no choice to install the safer Y. As I said: the experts still do not agree on which model is the safest: open or closed, capitalism or a planned market, eco-diversity or not, market forces or government intervention (or one of the many hybrids and other alternatives). But in any case it is true that in many cases it was precisely freedom of choice and openness that led to better software. If Niels were right, and less choice were a good thing, we would all still be working on Windows 95 and calling with a Nokia 3310. Fortunately Niels could choose, and he chose a Mac with Mac OS and an iPhone.

Then Niels drags in a third party by the hair. One that stands entirely outside the whole story of "what keeps viruses out best": Defective by Design. That organization was founded to fight DRM, that is, to fight copy protection and the like. That Mac uses DRM to make sure you cannot share your apps with others is a thorn in its side. That is a very different discussion for a very different time. Niels brings out a classic fallacy from many religious debates: "If you are not with us, you are against us". As with everything, the world here is a good deal more nuanced than Defective by Design and Niels suggest.

I will not deny that Mac is user-friendly, thinks things through and knows extremely well how to market. On the contrary: they do that brilliantly. But just as my Philips toothbrush "just works", so did my earlier Braun; freedom of choice does not stand in the way of "just working". The two are not mutually exclusive. Mac likes to make it seem that way, though, because it is their standard answer to the complaint that their model is closed. A fallacy, so as not to have to name the real reason (simply, like any other healthy company, making money from customers and gaining market share). It sounds friendlier, but that does not make it true.

As an example Niels brings up the Android store. A store that is open. Google does exercise some oversight, but that is, yes indeed, mainly to keep out unsafe applications. Yet neither store is notoriously fuller of unsafe apps than the other. Fortunately Niels does not claim that either, but he does act as if this is a matter of cause and effect. It is not. A closed model does not produce better working products. On my Ubuntu laptop applications work together many times better, and are much more "usable", than on many a Windows environment. And Ubuntu works in an even more open way than Android. Not only may anyone contribute their app to the repositories, anyone can also help develop many of the applications. Just like that, without a company in between deciding who takes part and why! I do not want to get into the discussion here of whether Windows or Ubuntu is better in a particular area. But I do want to refute that the specific problem Niels mentions, apps working together, has anything to do with the closed model. Niels himself probably uses a lot of open source software on his Mac as well. The Mac users I know do, at any rate. And that software works very well with other software. Software that is user-friendly and works well with other software. Oh, and which Niels will no longer be able to use, by the way, once he may only install software through an App Store.

I can hear you thinking all along: yes but, the sales, the numbers? True, Mac sells well. Mac suffers less from viruses than Windows PCs. But Linux, for example Ubuntu, suffers even less. Does that make it better? Does that prove the Ubuntu model is better? I believe so, but unlike Niels I would not dare to present that as proof.

Many people then immediately shout "that is because there are far more Windows PCs". That is not true: there are far more Linux servers and yet hardly any viruses have been written for them. Moreover, as I wrote in the first paragraph, Mac (BSD), Linux and Windows are simply built differently. Windows Vista suffers, relatively speaking, far less from viruses than its Windows 95 predecessors. Simply because it is technically much better thought out. But this too is a very different discussion; whoever is "right" here, it has nothing to do with the argument Niels makes, that a closed model is the better way to distribute safer software.

And the figures showing that more iPhone apps are sold than Android apps? As Niels states? Mac sells far more apps than Android, right? If Niels were right, and freedom of choice were bad for ease of use and security, then surely Google would be the safer and more usable one? The more applications, the less safe. And the less user-friendly. That was precisely the argument Niels made. I no longer understand him. But I have that more often with religious fanatics. That I do not understand them.

And to close, the idea that on a Mac everything "just works" at all. I do not like anecdotal evidence, but here we need only one example to prove this untrue. So a Mac at best "just works much more often than X or Y". But certainly not "always". During the trainings I give and at the clients I help get started, I often have to help Mac users. With computer problems. Programs that do not work, firewalls that make certain applications crash. Files that cannot be opened. Oh well, see for yourself how many users still have problems.

Update: while shortening this piece (yes, the earlier version was even longer) I left out an important fact and forgot to bring it back elsewhere: on his blog Niels nuances a great many of his claims. There he clearly shows that he does understand the nuances, and he substantiates the contradictions in his article. That does not change the fact that the piece that appeared in the newspaper is simply full of fallacies and contradictions.

https://berk.es/2010/11/08/op-apple-kun-je-geen-kritiek-hebben

Drupal filetypes for Ack grep.

Oct 26, 2010 Updated Oct 26, 2010


How often have I thought "sigh, I wish I could just grep this pile of invoices for that date". Unfortunately, the all-powerful search tool grep is not available IRL. But it is available on most unices, including all Linux systems and OS X.

But life gets even better. With ack-grep. A much faster, better and more targeted tool. For example, it will ignore all sorts of files you usually want to ignore when "grepping" through a pile of files. You know, searching for that line "sent to foo@example.com", but getting all sorts of results from backup files, revision databases and whatnot. Ack leaves those out. And does more.

Ack also allows you to define profiles: sets of files to be searched through and sets of files to ignore. It comes with lots of built-in sets, but not with a predefined one for Drupal.

To get a Drupal profile, just add a .ackrc file to your home directory and add the profiles there.

echo "--type-set=drupal=.php,.inc,.module,.install,.info,.engine" >> ~/.ackrc

Now you can search through Drupal with

ack "Implementation of hook_"

Or, if you want to ignore all non-Drupal(ish) files, with

ack --drupal "Implementation of hook_"

Or, if you want to search through all files, except Drupal files

ack --nodrupal "Licence"

Many additional tools, such as gedit addons, prefer ack over grep when they find it, and will benefit from this Drupal profile too.
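
One small wrinkle with the echo line above: run it twice and the profile ends up in ~/.ackrc twice. A minimal sketch of an idempotent variant follows. The helper name add_ackrc_line is made up for this example, and the ack 2 syntax in the comments is how I understand newer versions to define types (an assumption on my part; check ack --help-types on your system).

```shell
# Hypothetical helper: append a line to an ackrc file only if it is not
# already in there.
# $1: the line to add, $2: path to the ackrc (defaults to ~/.ackrc)
add_ackrc_line() {
  line=$1
  file=${2:-"$HOME/.ackrc"}
  touch "$file"
  # -q quiet, -x match the whole line, -F fixed string,
  # -e because the pattern itself starts with a dash.
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# ack 1.x syntax, as used in this post:
#   add_ackrc_line "--type-set=drupal=.php,.inc,.module,.install,.info,.engine"
# ack 2 and later (assumed syntax, type:filter:arguments):
#   add_ackrc_line "--type-set=drupal:ext:php,inc,module,install,info,engine"
```

Calling the helper a second time with the same line leaves the file untouched, so it is safe to keep in a dotfiles setup script.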

https://berk.es/2010/10/27/drupal-filetypes-for-ack-grep

Why I am already saying goodbye to Spotify.

Oct 24, 2010 Updated Oct 24, 2010


I gave Spotify a brief chance. Since I spend more than €10 a month on music, the Spotify model seemed interesting to me: unlimited music listening for €10 a month. New music, old music, music from your own library, music from friends and so on. Nice model. Just not for me.

With some creativity there is certainly no shortage of (legal) free music. My "Never listened" playlist still holds 4 weeks, 2 days and 3 hours of music.
When you cancel your Spotify account (or the company Spotify disappears) you lose all "your" music. Not what is already on your hard drive (bought or downloaded earlier), but all music that belongs to Spotify itself. Music you buy, you pay for once and can then keep listening to. Whether the shop where you bought it goes bankrupt or not makes no difference. In principle you can even resell this music (mainly in the case of CDs, LPs and such). The music is really yours.
The Spotify music player for Linux is rubbish. It crashes, is slow, does not fit my desktop (the buttons are all in the wrong place), does not work with the normal multimedia keys on my keyboard, and so on. If I pay for software, it should just work. Without all kinds of tricks and hassle. Spotify does not "just" work, something every other (free) open source music player does.
If you listen from another computer (or my phone), the other client stops. Logical, given their business model, that you can never listen from multiple places at once. Annoying for me, because we have a small multimedia PC in the living room (instead of a TV) on which we listen to music, watch TV and so on. If my housemate wants to listen to music on that PC, she first has to log me out, log in to her Spotify (which she does not have, by the way, so this is hypothetical) and only then can she listen. If she accidentally forgets, my music stops playing. A lot of hassle and far from ideal for our situation.
Ownership of the listening data is vague. Spotify can (and in part must) keep track of everything I play. I do not mind that very much, as long as it is clear who owns that data. It is, after all, fairly privacy-sensitive data. In all honesty I have to say that I have been sending my listening data to last.fm for ages, and recently to libre.fm as well. So in my case it happens not to matter much. I can, however, cancel last.fm and libre.fm (and simply keep listening to my music) or delete my data there. I can cancel Spotify too, but, see above, then I "lose" my music. Privacy is and remains a problem with this kind of service.

So nothing remains for me but to bid Spotify farewell and, some day, set up an open source, distributed, free and beautiful playlist-sharing system. Perhaps on top of libre.fm or in an alternative version of Diaspora, who knows. A nice summer holiday project.

Coming back to that free music: it consists of many mixes from, for example, SoundCloud, loads of podcasts, a few "listen again" shows from Studio Brussel, a large amount of Creative Commons and other free music, and a hefty backlog of dead-musician torrents. I know full well that downloading a complete discography from The Pirate Bay is not by the rules, but with Michael Jackson dead, I feel no scruples whatsoever about fetching his discography. Well, except perhaps that he also made some dreadfully bad music :). So there is no shortage of "new" music coming my way. And I feel even less guilty because in the end I do buy a great deal of music.

https://berk.es/2010/10/25/waarom-ik-spotify-meteen-alweer-gedag-zeg

Four troublesome notes with Drupal module-updates and upgrades.

Oct 12, 2010 Updated Oct 12, 2010


On Stackoverflow, I wrote an answer to someone having issues with Drupal module upgrades. Something I thought worth noting down here too. There are many gotchas in module upgrades that people find out about sooner or later. Often in not-so-nice ways. :)

Not all module developers think the same about dot (minor) releases: sometimes 5.x-1.2 and 5.x-1.3 are major rewrites or come with completely new features, themeable-functions, pages or APIs.
Not all upgrades are compatible with others. Sometimes you cannot update module B to 6.x-1.4, because of its dependency with A, when A is not compatible with 6.x-1.4 (yet). Drupal does not support dependencies on versions.
Major releases imply (but do not guarantee) incompatibility, or even complete rewrites: Upgrading from 5.x-1.4 to 5.x-2.1 might force major rewrites of custom code, including your theme.
Security updates often depend on earlier releases: 6.x-1.2 might introduce new features (that you do not want, or wish to ignore), while 6.x-1.3 can be a security release that requires (some of) the changes in 6.x-1.2 to be available. You must then either fiddle around with patches, or go through that feature release anyway.

Of course there are all these other things to take note of, such as database migrations (that might go wrong and destroy or break all your hard-earned data), new-features-come-with-new-bugs problems, your own custom code breaking on a new version, et cetera. But you already knew that, did you not? :)

https://berk.es/2010/10/13/four-troublesome-notes-with-drupal-module-updates-and-upgrades

Should you use Drupal 7 for a new Drupal site?

Sep 26, 2010 Updated Sep 26, 2010


In a recent email exchange I gave someone some advice about Drupal 7. Drupal 7 is the Drupal that will soon be released as the successor to Drupal 6. That does not mean the end of Drupal 6; it may well be maintained for years. Drupal 5, however, does reach its end of life with that release. The email asked:

> I would actually like to start with D7, since it is nearly here and the project will take some four months anyway. An employee of Dries at Acquia also pushed us in this direction for our community site.

And I replied: frankly, I do not believe in this one bit. Unless you can surface concrete examples where Drupal 7 is already better than Drupal 6. I do not see those for your project yet. Either way you will have to draw up a technical design (following a functional design). If major problems surface in that technical design that Drupal 7 would solve, it is of course a good option. But if not, Drupal 7 is always a disadvantage:

Unknown (count on months, possibly a year after release, before there are as many experts and developers for it as for 6).
Unproven (marketers will of course shout that it is tested and ready, but that has yet to be proven).
Unfinished: third-party modules are largely being prepared in advance. But from what I have seen, a) the number of modules ready for 7 is disappointing, and above all b) the quality of those ports is often very (very) poor. Expect half of those modules to be replaced by alternatives in the first six months of Drupal 7.

Jumping on the train this early is of course unavoidable if Drupal 7 has new features that cannot be had in 6, but otherwise it is very unwise: you are truly running in the vanguard, have to figure out a lot yourself (instead of following beaten paths) and so often walk down dead ends, with inevitable painful migrations ahead. My advice, in short, was not to do anything with Drupal 7 for now. This concerns a large project (4 months of development), not Mien's Blog. For a small, easily surveyable project Drupal 7 is exactly as good as Drupal 6. And there, small advantages such as "But Drupal 7 has a really cool interface" are valid arguments for choosing it. But for any larger project you have to focus on the technical requirements and wishes. Drupal 7 may have some of those ready-made where 6 does not. But conversely the chance is just as great, or greater, that a well-tested and worked-out solution already exists for Drupal 6, while in the world of Drupal 7 nothing has been worked out yet for that specific problem.

https://berk.es/2010/09/27/moet-je-drupal7-gaan-gebruiken-voor-een-nieuwe-drupalsite

Drupal needs you to conform, a framework will conform to you.

Sep 12, 2010 Updated Sep 12, 2010


An answer on Stackoverflow to the endless question "should I choose Drupal or Foo" captures the whole problem with many Drupal projects in one sentence:

Drupal needs you to conform, symfony will conform to you - choose whichever you want.

I would rephrase that as:

Drupal needs you to conform, a framework will conform to you - choose whichever you want.

Now, that would not be a big problem, if you have that freedom: the freedom to conform. But many projects (clients) have certain demands, or expectations.

On those projects, you often cannot conform. And even if you can conform, you often should not conform; not every case is best served with The Drupal Way[tm].

Ask yourself these questions:

Do I want to conform to The Drupal Way?
Can I conform to The Way of The Drupal?
Will my client allow us to conform to the Drupal Way?

If the answer to all of these is yes, the next step would be to find out what That Drupal Way[tm] is; that Drupal Way often needs to be found out and described in great detail.

If the answer to any of these questions is no, certainly not, you might want to not use Drupal at all. Or, more correctly, not use a CMS at all, but learn a real framework.

https://berk.es/2010/09/13/drupal-needs-you-to-conform-a-framework-will-conform-to-you

Toggle your webservers' production environment by using symlinks

Sep 7, 2010 Updated Sep 7, 2010


I love simple. If I can do something in a simple way, that is the way I will do it. Releasing new features, updates or upgrades of sites can be a PIASS, even when you use the whole shebang of version-control systems, release-management environments and whatnot.

I think it can be done very simply (okay, I don't run bank applications that are critical to the world economy, but still) with symlinks.

Say, I have a tool called “foo” that gets a critical update. I know most of you would just fire up ws_ftp (the more savvy would prolly fire up filezilla) and then overwrite the old code with the new code and be done with it. That, however, is the kind of simple that is even too simple for me. It is so extremely error-prone, that I don’t recommend it for anyone. Not even if you have that site that is only visited by three people and an accidental cat, per week.

What I do is keep two directories for my app under /var/www: foo_r and foo_l. The _r and _l stand for right and left. You could also call them one and two, or tinky and winky. One symlink points to one of them: foo.

  ber@luscious:/var/www$ ls -ahl
  lrwxrwxrwx  1 www-data www-data 12 2010-09-08 19:31 foo -> foo_l
  drwxr-xr-x  6 www-data www-data 4,0K 2010-07-20 17:30 foo_l
  drwxr-xr-x  6 www-data www-data 4,0K 2010-09-08 19:29 foo_r

My vhost (/etc/apache2/sites-enabled/foo) points to foo:

  <VirtualHost *:80>
    ServerAdmin webmaster@foo.com
    ServerName foo.com
    DocumentRoot /var/www/foo
    <Directory /var/www/foo>
      AllowOverride all
      Options -MultiViews
    </Directory>
  </VirtualHost>

And all you have to do is:

update the code in the folder that is not symlinked to: foo_r, in the above example.
optionally test that code. (but you had your tests done on the test environment, right?)
switch the symlink: rm /var/www/foo && ln -s /var/www/foo_r /var/www/foo
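
The switch in the last step can be sketched as a small shell function. This is only a sketch under a few assumptions: the function name toggle_release is made up, the two directories are named like the symlink plus _l and _r, and mv supports -T (GNU coreutils). Instead of removing the symlink first, it creates the new symlink under a temporary name and renames it over the old one, so there is no moment at which foo does not exist.

```shell
# Hypothetical helper: flip a symlink between its _l and _r directory.
# $1: path of the symlink, e.g. /var/www/foo
# Assumes "$1_l" and "$1_r" exist next to it and that mv is GNU mv (-T).
toggle_release() {
  link=$1
  name=$(basename "$link")
  # Point at whichever directory is NOT live right now.
  case "$(readlink "$link")" in
    *_l) target="${name}_r" ;;
    *)   target="${name}_l" ;;
  esac
  # Create the new symlink under a temporary name, then rename it over
  # the old one. The rename replaces the old symlink in a single step.
  ln -s "$target" "${link}.tmp" &&
  mv -T "${link}.tmp" "$link"
}

# Usage: toggle_release /var/www/foo
```

The symlink target is kept relative (foo_r rather than /var/www/foo_r), which matches the ls output above and keeps working if /var/www is ever moved.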

This way, you solve many problems, without bringing in the big guns, such as capistrano, or whatever-release-tool.

While you overwrite old code with new code, your users could (and will) hit a situation where half of the code is new and the other half is old. Switching a symlink avoids that.
If you work with a revision control system, you can solve merge conflicts before people get hit by them.
You can test-drive your unreleased code by introducing a vhost that uses foo_r (or the other one; at least the one unused at that moment) as DocumentRoot.
If you have your code spread over multiple servers, you can distribute it first, then switch the symlinks on all the servers at once, instead of waiting for code to be distributed and having a period during which some of the balanced servers serve old code and others serve the new stuff.
This is so simple that it can be integrated into just about every script and administrative frontend.

https://berk.es/2010/09/08/toggle-your-webservers-production-environment-by-using-symlinks

Publishers: don’t turn me into a book pirate

Aug 10, 2010 Updated Aug 10, 2010

The Dutch e-book is rapidly going the way of the mp3. The next industry, a regional one this time, is about to be pirated to death. A pity? No, it only has itself to blame. Today De Pers ran a sour piece about the e-book failing to arrive. Not the e-book reader, which is available everywhere, but the book itself. For just under a week now I have been reading my books digitally too, and it is wonderful. So far I am mostly hoarding. But of the more than 160 books I now have on it, only about 60 are legal. Most of those come from various public-domain or Creative Commons sources; a few I bought. The ones I bought are all expensive IT books from my wishlist. So the rest is illegal. Illegal in the sense that I don’t have the paper version on my shelf either, I have no intention of buying a legal version, and I don’t feel guilty at all about “stealing” the Writers’ bread. Well, all right, a little guilty, or I wouldn’t be writing this piece.

Here is the thing: I want to buy e-books. But when the supply isn’t there, there is little I can buy. So I don’t buy. Of course that still doesn’t make stealing OK. If the baker says he is out of brown bread, I am out of luck. Even if I can still see rows of brown bread in the bakery. You know how it goes: on Saturday half the bakery is full of people who ordered their bread, people from a generation that phoned the baker on Friday to be guaranteed brown bread on Saturday. But for you it is “sold out”.

Of course I can’t then just walk into the bakery through the back door and grab my bread myself. If the baker says the brown bread is sold out, I simply have to accept that; it is his decision. And he has every right to send me out of the shop empty-handed.

Downloading without permission, however, is something quite different from stealing. The clearest argument is that if I steal that loaf from the baker, he has one loaf less (and may therefore have to disappoint Mrs Pollux, who “ordered”), whereas if I download an e-book, the publisher still has exactly as many books as before. Once more: I want to buy. For things I value, I gladly put money down. In 2009 I bought almost €600 worth of music and films, online and offline. And I have a subscription to the NRC, even though every morning on the train I leaf through all sorts of free papers. For various reasons I am happy to pay, but then paying does have to be made possible for me.

What is more, it should not just be possible, it should also be easy. Preferably easier than downloading without paying. It is anything but. A quick Google search, a click through to one of the many downloads, and done. Far easier than hunting through lists of paper books for the digital edition, then working through all sorts of payment steps (nine on bol.com, to be precise, including the ever-entertaining “find the random reader for iDEAL among the junk in the junk drawer”) and ending up, ten euros poorer, with that very same book on my reader.

I am not going to pay, and jump through more hoops, and settle for a limited selection. Then I take the easy road, which also happens to be cheaper. And once I am used to that, the publisher is fighting a rearguard action, lost in advance, just like that wretched music industry, which umpteen years later is at its wits’ end. Where Dirk Knops concludes in De Pers that he will just leave his e-book reader on the shelf, I fear that many people, like me, will simply start swapping their e-books illegally. Hurry up and give me that easy, cheap supply. Otherwise I will be a book pirate for good.

https://berk.es/2010/08/11/uitgevers-maak-van-mij-geen-boekenpiraat

The problem of Drupal’s exponential complexity

Jun 22, 2010 Updated Jun 22, 2010

Over the last few days, I helped a client with some bugs in a really complex Drupal site. The site is that complex because the client’s “needs” and “wishes” had to be met. So gradually more and more modules were stacked onto this Drupal, resulting in a site that no one can really grasp. At all. Now, if modules in Drupal were entirely self-contained and very loosely coupled, something I consider good practice, this would be little of a problem.

The issue, however, lies on the conceptual side, not so much the technical side. Technically, such systems suffer from what is called Exponential Complexity. For every feature (module) added, the overall complexity increases exponentially. Hence the number of broken features, bugs and regressions will grow exponentially too. For every feature introduced in your site, several new modules are required. For every new module, the complexity can grow N times. Let us say 5 times: an average module contains about 5 hooks and overrides. A Drupal site with 10 modules might suffer from 6 bugs; big chance you won’t see any of them become a problem on your project. A Drupal site with 12 modules would then suffer from 150 bugs, part of which will become a problem at some time.

The solution can be sought on the technical side, but frankly, I don’t believe there is a holy grail. A system built from self-contained, loosely coupled entities will typically suffer far fewer bugs and related problems than one built from tightly coupled entities. In web development you will see that, e.g., a project in Django, due to its loosely coupled design, will suffer a lot less from “exponential complexity”: if there is a bug in the blogs, that is where the bug is.

> A key advantage of such an approach is that components are loosely coupled. That is, each distinct piece of a Django-powered Web application has a single key purpose and can be changed independently without affecting the other pieces. For example, a developer can change the URL for a given part of the application without affecting the underlying implementation. A designer can change a page’s HTML without having to touch the Python code that renders it. A database administrator can rename a database table and specify the change in a single place, rather than having to search and replace through a dozen files.

The bug will not travel through the entire site and pop up in random other places. Drupal’s design philosophy is exactly the opposite: it is entirely horizontal. Due to this horizontal design, bugs can travel throughout the entire project. Introduce a bug in the messaging system and pow! All mail stops working. Maybe (as was the case in this client’s project) the whole cron stops working: search indexes, sessions, garbage collection, et cetera no longer work. One small bug, a misconfiguration, caused a PHP error that could have brought down the entire site in due time.

One bug caused at least 7 things to break. These 7 things would in turn cause X new problems in due time. How to fix this? In Drupal, the only way to fix this is to use as few modules as possible. And even then, to select these modules on their “loosely coupled-ness”. So avoid modules that depend on certain Views configurations. Avoid modules that sit in between all your mail transports. Avoid modules that depend on other modules. In practice that would mean: just avoid all modules altogether :). Not very practical.

Again, this comes down to common sense: at the very least, avoid Drupal projects that are so complex that no one understands them. If you don’t understand the messaging-system modules, then don’t use them. Look for an alternative. Even outside of Drupal. Choose the simplest solution. You can always let your site grow over time: add features when they are really needed. That way, at least, you will have to deal with the exponential complexity only one step at a time: even if those steps become bigger as your site grows, they are still smaller than the giant leap of delivering a huge site all at once.
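
The arithmetic behind the module example earlier in this post can be checked with a few lines of shell. This is only a sketch of that back-of-the-envelope model: the function name bugs_after is made up, and the factor of 5 per added module and the starting point of 6 bugs at 10 modules are the guesses from the text, not measurements.

```shell
#!/bin/sh
# bugs_after START_BUGS FACTOR EXTRA_MODULES
# Hypothetical helper: multiplies the bug count by FACTOR once for
# every extra module, which is the growth model assumed in the text.
bugs_after() {
  bugs=$1
  factor=$2
  extra=$3
  i=0
  while [ "$i" -lt "$extra" ]; do
    bugs=$((bugs * factor))
    i=$((i + 1))
  done
  echo "$bugs"
}

# 10 modules with 6 bugs, two more modules, factor 5 per module.
bugs_after 6 5 2
```

Two extra modules take the count from 6 to 30 to 150, the 12-module figure. A third would already give 750, which is why adding features one step at a time keeps each jump manageable.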

https://berk.es/2010/06/23/the-problem-of-drupals-exponential-complexity

Grid Systems, Drupal and Semantics (why CSS frameworks are not that bad in Drupal)

Jun 17, 2010 Updated Jun 17, 2010

It seems Grid systems, or CSS frameworks, are being picked up by the Drupal themer community. I think this is a good thing. Some think it is a bad thing. So let us have a look at the downsides and upsides of grid systems in Drupal.

Grid Systems force you to change your HTML, that is bad.

This is wrong, for three reasons.

It might be bad, if it were true. But in theory, not all grid systems or their implementations require changes to the HTML. I love SASS and its toolkit Compass. Compass pulls in grid systems such as Blueprint (native) or 960.gs (plugin) in such a way that the CSS is the only thing you rewrite.

... you can apply battle-tested styles from frameworks like Blueprint to your style sheets instead of your markup.

The second reason is that changing your HTML might not be bad at all. More on this in the next argument, “Semantics”, by the way. But, summarized: this is only an argument if you can afford to be a purist. In all other cases, there is nothing wrong with changing and moving some HTML around. Of course, only to some extent (see Semantics). But grid systems usually require only minimal changes to your markup.
The third reason lies with Drupal. In order to control the exact outputted HTML, you need not only a huge amount of Drupal theming experience, you need patience. And a gigantic maintenance budget. Drupal works with overrides: it will output source X by default, until you decide you want to change it into Y. Now, for a theme_item_list, that is not too hard. But for inherently complex functions such as theme_table, it is. And these themable “things” are themed globally: if you change the item-list generator, all your lists throughout the entire Drupal site will be changed. Not necessarily bad, but it takes a large part of the full control over your source away from you. Drupal also works with nested theme calls: theme_page calls theme_foo, theme_foo calls theme_bar, which calls theme_item_list. Imagine hunting down that one item list where you insist on having a .first and .last class, or a .horizontal-list, as required by your CSS framework. And lastly, Drupal is modular and flexible. Depending on your enabled modules, configuration, context or situation, the source will change drastically. A logged-in user may see different source and elements than an anonymous visitor. An admin with a certain quick-edit module might get popup links when hovering over certain elements. And dynamic modules, such as Views or CCK, allow you to configure not only the data, but also the way the data is shown. These are all examples of modules that alter your source drastically. Again: full control of what is outputted is nearly impossible. So grid systems in Drupal are hard to achieve and require a lot of work, especially in details such as smaller elements on your site. But the fact that Drupal requires a lot of work for minor changes in source does not make the concept of “making minor changes to source to force certain display” wrong altogether. In Drupal it may not be practical, true. But the concept itself is not that wrong.

Semantics.

Changing the source may be bad because of semantics. Purists say that needing to change your markup (the meaning of the information) in order to change the display (the visualisation of the information) is wrong and was never the idea of HTML.

I agree.

However, purists may not have to deal with Internet Explorer in their work (Unfortunately, I do). And purists will steer away from Drupal. Or should.

Taken from a random Drupal site. Whitespace and indentation deliberately left the way they were generated.

                         		<div class="view-content">
                		<div class="views-row-1 views-row-odd views-row-first views-row-last">

            	<div class="field-content"><p style="font-size:1.2em"><img src="sites/default/files/images/drupal.png" style="float:right; margin-left:20px" />Met Drupal: maken en beheren van simpele tot en met complexe websites. Dit is de site van de Belgische en Nederlandse Drupal-community. Lees hier over Drupal's <a href="over-drupal" rel="nofollow">sterke punten</a>.<a href="node/1819" rel="nofollow"><br /></a></p>
	</div>
  	</div>
<a href="/sites/default/files/drupal-6.17.tar.gz"><img src="/sites/all/themes/lagelanden/images/download-drupal-btn.png" alt="" title="" width="215" height="32" /></a>  </div>
            	</div>

Font sizes? Inline CSS? Field-content? empty alt tags? No alt tags at all? 4 nested Divs for a single paragraph? rel nofollow on something that clearly should be followed? Empty A-tags? How people, who work daily with a system that outputs such sources by default, dare mention the argument “semantics”, is beyond me.

This tagsoup in the example is mostly the fault of Views, which, in practice, adds gigantic loads of meaningless markup.

A class like “views-row-1 views-row-odd views-row-first views-row-last” is debatable. Some say that these are correct semantics. Maybe. But even if they are, the way some classes are embedded and some are chained makes no meaningful sense.

Why is .view-content outside of views-row and its subclasses? Why are these subclasses, but is views-field-body not a subclass of field-content? Why do we need these in the first place?

The answer is technical: because that markup is dynamically generated with highly flexible and complex code, and we still want to provide enough handles for designers to attach their CSS to.

Certainly not semantic. You cannot convince me that the subclasses views-row-odd views-row-first views-row-last make any semantic sense. Last and first together? It is the only item in the list, so technically it is correct that it is both the last and the first item. And since it is the first, it is also the odd item. But semantic, as in carrying meaning? Certainly not more than some additional grid-two-column class. Odd and even classes are just as semantic as classes used to identify columns in a grid.

Now, I will agree with you that the difference between:

<p class="paragraph teaser">
 <img src="sites/default/files/images/drupal.png" alt="Drupal screenshot showing the coolness of Drupal" />
 Met Drupal: .... Lees hier over Drupal's
 <a href="over-drupal">sterke punten</a>
</p>

and:

<p class="paragraph teaser grid-left">
 <img src="sites/default/files/images/drupal.png" alt="Drupal screenshot showing the coolness of Drupal" />
 Met Drupal: .... Lees hier over Drupal's 
 <a href="over-drupal" class="inline-button">sterke punten</a>
</p>

is important. And that the latter is worse than the first. But Drupal’s markup does not even get close to my handcoded and cleaned example. Adding a .grid-left to the tagsoup from the example output of Views makes absolutely no semantic difference. At all. Adding it to the corrected and cleaned examples above does make a difference.

My random example may be a particularly bad example. But before you comment with urls to examples that are cleaner, consider the heading-layout. Consider the source for logged in admins. And evaluate the entire source/markup ratio. It will be bad in Drupal. Please prove me wrong. :)

The other point is that semantics are a little overvalued. Not that I think we should abandon the idea of putting meaning in our HTML and go for dirty solutions such as table-based layouts. I just say we should be pragmatic. Source order, for example. Most screen readers and Braille terminals “look” at CSS. That’s because, in practice, most sites change “meaning” by changing the layout. A form label that stands above the form element (as it does by default in Drupal), even if done with CSS, will force the Braille terminal to insert a line break: users must take an action to enter the form field. Changing the CSS so that a label is not display:block, but display:inline will make your forms a lot more accessible. Being a purist gets you only halfway in this: you will still need to look at the entire picture: JavaScript, CSS and HTML. No (sane) web indexer will ignore JavaScript entirely. The Google bot may not execute all JavaScript, but will certainly evaluate it to see if the source is altered through these scripts.

Good semantics are not just putting a navigation below the content and providing a “skip to navigation” link. Good semantics are about the entire picture. From source order, via minimalistic source (four nested divs around a single paragraph, for goodness sake!), via correct weight of elements (heading layout, et cetera), to meaning-altering JavaScript or CSS. In practical Drupal this is as good as impossible; you can develop and design a minimalistic Drupal, but those are not the sites that stand as Drupal examples. Such a site will, most probably, be considered ugly, boring or not very representative. Views is a de facto standard. Zen, a theme for theme developers, probably has the worst source/content ratio of all themes. And it is the most used theme.

Good semantics is about the big picture. And no Drupal site will manage to be semantically correct in that big picture. Not without a huge amount of work that leaves you with a maintenance nightmare, with overrides that, in lines of code, will be far larger than their originals. And a content and editors’ handbook that will make all editors depressed.

Are grid systems bad?

In theory: yes, but only in a place where you control your source and can therefore afford to be entirely semantically correct. In practice: hardly. They require minimal changes to your source: adding style to a place that should only contain meaning. Adding a class=”horizontal-list” to a UL in order to make it horizontal is bad practice in a place where the rest of the source is perfect. But in a tagsoup like that of Drupal, a single class=”horizontal-list” will not make anything worse. Provided you can add that class in the first place, without large code changes (that need to be maintained). And having to reorder some HTML, while keeping it valid, is always a lot better than getting into ugly IE6 CSS hacks, which not only make your CSS invalid, but often add a huge amount of extra CSS complexity to your designs.

Any Drupal themer who does not want a CSS framework because it does not use HTML like it should be used is acting silly. Drupal, by default, renders HTML that is so far from semantically correct that the additional downside of a few extra non-semantic classes, or of a few extra not-so-well-source-ordered blocks, changes exactly nothing; it certainly does not make your source less correct. If you really care for semantics, start with the low-hanging fruit and make Drupal, or its contributions, a little more semantically correct. A Drupal themer who says that using a grid framework is not very practical, because Drupal is far too dynamic and full control of the outputted HTML is nearly impossible, is more correct.

https://berk.es/2010/06/18/grid-systems-drupal-and-semantics-why-css-frameworks-are-not-that-bad-in-drupal

Mustard after the meal (too little, too late)

Jun 13, 2010 Updated Jun 13, 2010

Also nice: now, all of a sudden, newspapers (NRC Next today: a fine piece on the differences between PVV and VVD) come up with in-depth, substantive political analysis. Surely that belongs during the campaign: now one-point-something million uninformed people have voted for the PVV (or for the VVD instead) on the strength of all sorts of self-serving “We at WC-Eend recommend WC-Eend” style advice and arguments. Had they learned from the newspaper, TV or RTL Boulevard how far apart PVV and VVD really are on substance, then they would have made a real choice. A pity. “The media” are perhaps the ones most to blame for the bankruptcy of Dutch Progressive and Open Free Thought.

https://berk.es/2010/06/14/mosterd-na-de-maaltijd

Design principles for creating Good Classes let you write Good Drupal Modules.

Jun 9, 2010 Updated Jun 9, 2010

Drupal is not object oriented (OO). No really, it is not! It merely borrows some design principles from OO, and uses some design concepts from OO (such as the Observer Pattern, or hooks, in Drupal). Many module developers, though, actually use another design principle: that of classes. When creating a module, one can borrow almost all ideas from the (good) design of classes to create a good design for a module.

Maybe you think: “Why should I design my module?” (When I say design, I do not mean graphical or UI design, but technical design, often called software architecture.) If you ask yourself that, then go back to the modules you have developed. They may be so small and well designed that you are a natural talent. But more often, you will let them grow, maintain them, add features and think every time: “I should really rewrite this module from scratch”. The problem isn’t that ad hoc is a bad way of coding, but that good abstraction, good design, offers many benefits. It not only makes your projects easier to run, it also makes your sites a lot more stable, predictable and, overall, better to use, maintain and extend.

Good classes, and hence good modules, have several benefits:

You can hide implementation details
Changes don’t affect the whole Drupal environment
The Drupal environment is more obviously correct
You don’t have to pass data all over your entire Drupal
You are able to work with real-world (and website) entities rather than with low-level implementation structures.

(Taken and adapted from Code Complete, second edition, “Working Classes”, p. 127-128.)

But first let me answer the question whether modules can be compared to classes, at all. The Drupal handbook, Introduction to Drupal modules writes:

...[a module] is more of a concept that encourages good design principles. Modularity also suits the open-source development model, because it allows a number of developers to contribute functionality to Drupal without risk of interference.

(emphasis added)

Those are reasons why classes were created in the first place: as a good design principle, to lower the risk of interference. Or, to avoid changes infecting your entire Drupal site.

But, more important, modules in Drupal are supposed to be highly focused: do one task and do that well. The general idea in Drupal is not that one forum module offers all the features phpBB offers, but that a phpBB-like forum is achieved by pulling together many modules: often one module for each feature you wish to introduce. This matches that other great benefit of classes: you are able to work with real-world (and website) entities rather than with low-level implementation structures.

The User Display API offers a consistent, focused programming interface to deal with the statuses of users. When I talk of interfaces, I mean programming interfaces, not user interfaces (UIs). Interfaces in Drupal are hooks, theme functions, database API functions, and public functions. Even though Drupal, or actually PHP, has no proper support for setting the scope of data and methods (functions), the Drupal convention is to prefix private functions with an _underscore().

So, instead of a forum module that has several features to control the display of the online status of users, Drupal encourages the use of several modules, on top of Drupal core, to introduce such features by themselves.

The answer therefore is: “no, modules are not really classes”, but “good Drupal modules follow a lot of the design principles of classes in OO”. You can approach a module as you would approach a class. But you cannot use all the concepts from classes in a Drupal module.

I took the liberty of modifying the checklist by McConnell, the author of Code Complete, to suit module development. In the book he gives a checklist that you can use to see whether your classes and their use are Good. Another book to read on this is Design Patterns by the Gang of Four. The latter is slightly more academic, but still great if you want to become a better Drupal developer.

I adjusted the checklist from Code Complete so that it shows you whether your module is Good. It makes a great checklist for writing awesome modules:

Abstract Data Types

Have you thought of the modules in your Drupal implementation as Abstract Data Types and evaluated their interfaces from that point of view? (Where, again, interfaces are programming interfaces, not UIs.)

Abstraction

Does the module have a central purpose?
Is the module well named, and does its name describe its central purpose?
Does the module’s interface present a consistent abstraction?
Does the module’s interface make obvious how you should use the module?
Is the module’s interface abstract enough that you don’t have to think about how its services are implemented? Can you treat the module as a black box?
Are the module’s services complete enough that other modules don’t have to meddle with its internal data?
Has unrelated information, user interfaces and functionality been moved out of the module?
Have you thought about subdividing the module into smaller modules, and have you subdivided it as much as you can?
Are you preserving the integrity of the module’s interface as you modify the module? (i.e. can you provide backwards compatibility without losing the ability to change the code in your module?)

Encapsulation

Does the module minimize accessibility to its internal functions?
Does the module avoid exposing data, such as global or accessible variables?
Does the module hide its implementation details from other modules as much as the used concepts (hooks, theme, etc.) permit?
Does the module avoid making assumptions about its users (the other modules using this module, not users visiting the site), including its derived modules (modules depending on this module)?
Is the module independent of other modules? Is it loosely coupled? (e.g. a form_alter that expects forms to be in an exact state is tightly coupled; a nodeapi inserting a new piece of data into a node is loosely coupled.)

Inheritance

Is inheritance used only to model “is a” relationships? (e.g. dog.module depends on mammal.module, but never on user.module)
Does the module documentation describe the inheritance strategy? (e.g. when module Cat depends on module Feline, does it tell this to the users?)
Do derived modules avoid “overriding” non-overridable routines? In PHP and Drupal this is only achievable through well-commented code.
Are inheritance trees fairly shallow?

Other Implementation Issues

Does the module contain about seven data members (public functions) or fewer?
Does the module minimize direct and indirect routine (function) calls to other modules?
Does the module collaborate with other modules only to the extent absolutely necessary?

If you want more in-depth information on these statements, please refer to Code Complete, second edition, Chapter 6. Or leave a comment below so that I can try to explain it in more detail.

In general, the idea is that all rules of thumb that apply to designing good classes are useful for designing good modules. Keep it small, simple and focused, and try to hide as much as possible from others. In Drupal that would mean: provide hook implementations, but keep all the other functions private. That function that iterates over the latest coffee mugs to extract their availability in the shop should never be available to other modules.

You can prefix private functions with an underscore, such as _coffeemugs_extract_availability(). Or stick them in include files, and mention in the code comments that others should keep away from your inc files, at all times! Avoid calling functions in include files. Avoid calling any function in any other module, unless it is explicitly advertised as “usable by others”. Try to introduce such functions as rarely as possible; rather create a new hook, which, by its nature, is public.

Keep your module focused. A print_and_pdf_and_mail_for_nodes.module is a bad module. An “alternative_rendering.module”, with inheriting modules “print.module”, “pdf.module” and “mail.module”, is a far, far better design.

And go read Code Complete. It will make you a happier Drupal developer.

https://berk.es/2010/06/10/design-principles-for-creating-good-classes-let-you-write-good-drupal-modules

Why I withdraw my Pledge to have Tagadelic ready for Drupal 7

Jun 3, 2010 Updated Jun 3, 2010

Actually, it is very simple: I had a slot in February. And one in May. Both are gone now. In February, because of the lack of anything stable-ish, I decided to use that slot for diving into some documentation on the proposed (and some submitted) changes to 7. And for getting stuff synced and Tagadelic migrated to git(hub).

I planned a new slot in May. But again, there was no stable Drupal 7 to work against. Of course, Tagadelic is simple, and could probably be migrated against a current unstable Drupal 7 and then work perfectly on release. But maybe not. And I really have neither the time nor the will to shoot at a moving target. Nor do I plan to upgrade my sites to Drupal 7 anytime soon: never fix something that ain’t broken. They run just fine on Drupal 5 and Drupal 6. As soon as a client who uses Tagadelic hires me to upgrade a site, I will upgrade it and make it ready for 7. Or when someone steps up with a properly tested and clean patch to migrate to 7, I will commit it, or grant that person commit rights. In fact, you already have these rights: just fork Tagadelic and upgrade it to 7. Then let me know about it, so I can review the code.

https://berk.es/2010/06/04/why-i-withdraw-my-pledge-to-have-tagadelic-ready-for-drupal-7

Small but Useful modules: are they worth the pain?

Jun 2, 2010 Updated Jun 2, 2010

A blog post on Merge brought back a question that has haunted me for a while now. What about all these small modules? The first part of that question is: how to deal with the many small modules, day to day? Quite often, I see sites that drag more than 50 modules along. Most often these are really very simple sites.

That introduces several problems in itself: a humongous effort on upgrade management and maintenance, a lot of time spent on selection, gigantic complexity (to a level where debugging and troubleshooting become impossible) and lastly: performance. The latter is, imho, one of the least of the problems. With 50+ modules, however small they may be, you can be sure of a security release every week, possibly more than one. With proper testing and management, that will mean a couple of hours of technical maintenance every week. That is unacceptable for (out of thin air) over three quarters of Drupal sites. Every time you see a module that does what you want, you should consider the impact of that module on the project as a whole. Not just how many minutes it saves while developing, but also how many hours of upgrade pain it may cause.

Second part in this question is: UNIX has this philosophy with gazillion, tiny, focused and optimised libraries and apps, why is it not a problem there?. The answer is probably: managment - and upgrade tools. Drupal has no APT, Gems, VersionTracker or Fink. It has drush that can resolve the minimum of requirements, but hardly more. It has makefiles that provide a good starter for- but are far from- a real package managment tool. The other part of this answer is that UNIX libraries offer no user interface, and that the majority of the tools offer only really low-level user interfaces, most often in the form of configuration options. A small subset offers user interfaces in the form of commandline options. And an even smaller subset offers a real graphical user interface, with options to click, buttons to press and objects to drag. To illustrate: From the 29 packages that help deal with printing (on paper) on my ubuntu machine, only two have a GUI: one to configure and manage printers, the other to print stuff and view the printqueue. Or at least: that is my knowledge, if there are more user interfaces, I do not know of them, nor should I. In Drupal, most modules offer some interaction, add stuff to configuration-pages, offer settings, cases, and so on. In Drupal they not only add technical complexity (dependencies of -, reliance on-, tight coupling with- other modules) they also offer complexity for the user. How often do you not read things like “create a content type, then add a pathauto alias for these nodes, then select the hierarchy from the simple-hierarchy-based-on-aliases module s config interfaces”? I have never read anything like this on ubuntu in order to print an invoice.[1] Small modules are hardly ever librries, they are applications. On UNIX stuff is managable, because probably less then 5% of the apps offer an interface to the user, the rest offers interfaces to software, not to the users.

The third part of the question is: are you not better off just hardcoding some stuff? I know that, say, the analytics module offers a handy tool to inject analytics code into your pages. But be honest: is it that hard to copy-paste it into your page.tpl.php, the page template? Is it really so hard that it is worth the overhead in upgrades, management, complexity and performance? Do you really need a module to add a JavaScript file to the header? So, small modules are useful. And they may come in handy at times. But most often you will find that they cost you more pain in the long run than they gain you in the short run.

[1] Actually, I have worked mostly with Linux for over 10 years now, so I have seen the days of piping stuff through Ghostscript conversion filters, via line-printer tools, into /dev/something devices. But Ubuntu really is on a whole new level :)

https://berk.es/2010/06/03/small-but-useful-modules-are-they-worth-the-pain

Accessibility of 10+1 party websites tested.

Jun 1, 2010 Updated Jun 1, 2010

How well, or how badly, do the websites of the political parties comply with the accessibility guidelines? I'll give away the answer right away: badly.

The Piratenpartij is the only one that has the basic guidelines in order. But it, too, gets some details wrong. GroenLinks and SGP score worst, with 8 errors each, among them many serious ones.

Why this list? Websites that comply with these guidelines are accessible to everyone, for example people using screen readers or braille terminals. This list may show something about the attention the parties have paid to the accessibility of their sites, and so about what they already do themselves to make their information accessible to minorities. Obviously there is much more to it than validating a site. So do look further into the parties' positions on minorities' access to information (and to government services and the like).

The test was done with a WAI validator. A simple check, which the Dutch foundation Drempelvrij also uses.

CDA - Fail, 4 errors.
PVDA - Fail, 4 errors.
SP - Fail, but only 3 errors.
VVD - Fail, 5 errors.
PVV - Fail, 5 errors.
GroenLinks - Fail, 8 errors!
ChristenUnie - Fail, 5 errors.
D66 - Fail, but only 3 errors.
Partij voor de Dieren - Fail, 5 errors.
SGP - Fail, 8 errors!
Piratenpartij - Fail, but only 3 errors. However, the basics are in order! The errors found are details.

I left out the small parties, with the exception of the Piratenpartij, because it scores best of all parties. It also has 3 errors, but those are the least critical ones. Where every other party fails on the most basic accessibility guidelines, the Piratenpartij has those in order. It, too, misses a few details. Snapshot taken on 2 June, around 17:00.

https://berk.es/2010/06/02/toegankelijkheid-gestest-van-101-partij-websites

Little status update from Thailand (no photos, just letters)

Apr 30, 2010 Updated Apr 30, 2010

We are back from Isan, the north (of Thailand, because that is where I am on holiday). Isan is the Thai countryside. The part of Thailand where, according to the Lonely Planet, there is fsck all to do (which is true). Quite nice, in other words. We stocked up on a few hundred euros' worth of hand-woven silk, from which Anna wants to make “something”, selling the rest. Other than that, we drank some beer. Looked at a cow and then looked at another cow. And we walked two laps around the house, through the village. And drank cola.

Today we are back at Dad's, in Chonburi, where tomorrow we go diving (in the swimming pool). That is supposed to be fun. (Not in the swimming pool, I don't think that is much fun, but in the sea.) On Tuesday we go for the real thing: diving in the sea. I hope I see a fish (I think so). And that I don't see any shrimp (I think I will), because I find them gross. And that my blood sugar behaves itself a bit. Because eating a sandwich under water is rather difficult.

https://berk.es/2010/05/01/status-update-je-thailand-geen-fotos-alleen-lettertjes

The first rule of coding for Drupal: never forget about the option to write your own code.

Apr 18, 2010 Updated Apr 18, 2010

yelvington writes: “The first rule of coding for Drupal: We do not write code for Drupal.” I must say that, after years, I have come to the exact opposite conclusion. Right now, I should be writing another webshop (instead I am writing a blog post, but this article is not about procrastination :)).

I was one of the most outspoken advocates for getting the first CCK, flexinode, into core. Not that project itself, per se, but the concept. I have been a firm believer in “don’t duplicate code”; as such I even introduced the rule “Join forces with others” in Drupal. I consider myself a moderate programmer (speaking some Perl, Bash, rather good Ruby, almost fluent PHP and rusty C, C++ and even some Java; hardly a hardcore programmer). I am lazy and tend to be pragmatic (and most often disguise the former by calling it ‘pragmatic’ :)).

Why write yet another shop system when you can pick from several e-commerce tools? Because, face it: e-commerce is not ready. Übercart -no offence!- simply sucks when it comes to extensibility, usability and flexibility. But that was not the main reason, that was more technical. More on that later.

Views has a performance horror lurking around the corner. It might not hit you, but it often will. Views does not perform badly per se, but it can. CCK: well, exactly the same. And Panels. Don’t get me started on that! If you sincerely hate your frontend developers, give them Panels. I have seriously had a person resign from his job because of Panels being used in their project (Panels 2, in his defence; 3 is an improvement). But I do use Views in most projects, together with -obviously- CCK and about 20 other modules.

Views, CCK and Panels are all great tools for the average quick project. Typically projects where the 80/20 rule is applied as: we build the 80 and forget about the 20. And we all know the problem that Features tries to solve: you create CCK fields, use these in (dependent!) views, and override that in -PHP- templates. The ever-returning staging horror. AKA ‘simply repeat the creation on LIVE all over again’. But I do not want to go into more detail on the technical downsides of these modules. However important they are, a far greater concern overshadows them.

The problem that made me turn 180° was developing for and in CCK, Views, Panels and all these high-level building bricks. To illustrate, let me give you some quite frequently recurring requirements; try to build them with CCK, Views and related modules:

An event-listing: next upcoming item, whose end-date is not yet passed (event is not yet finished), grouped by day.
An article with some fields extracted (live) from a webservice: content lives not in the local database, but is pulled over SOAP, REST or similar. E.g. the editor fills in the “trailer_id” and the trailer is pulled from a filmtrailer service.
On Cron, fill certain nodes with data from a service or an XML-file.
Validate a postal-code field against a city-field; a postal-code implies a certain city. (using, e.g. a local lookup-table or some provided library).
People must provide either a telephone number OR an email address.
After submitting a node (say, a classified ad) people are redirected to the next node form (say, to add photos), parts of which are pre-filled and which is related to the first (in database or ORM speak: one classified has_many photos).
A table, listing all profile-nodes, but where the fields Prefix, FirstName, middlename, SurName, Postfix are aggregated into one column, sortable by Surname.

Right? Of course, with the right combination of computed fields, custom template logic(!) and maybe some Views and CCK addons it is possible. But far from easy.

Now, I developed each and every one of the above in custom modules. Let me summarise how I did that, and how much code it required.

Event listing: Custom node, defined in an event.module, with a (really simple) date field and a (slightly less simple) database query, pushing the result to a theme(‘table’). Done. Isolated code for this is less than 200 lines, one small module! The module became more complex because we later changed the model into “event has_many playdates”. It now weighs in at less than 600 lines, still small.
Extracted content from a webservice: Very small custom node, defined in movie.module. On hook_insert etc. it inserts the ID into a local joined table; on hook_load it requests the external source using the value from that table. Tiny module, without theme functions, less than 400 lines of PHP.
Fill from an external resource: On cron, fill some custom module-defined nodes. Before, we filled CCK nodes, but the dynamic use of the database (the database layout changes when reconfiguring fields) made us decide to simply push all data to our custom joined table. Simple. Effective. Less than a thousand lines of code, most of it for XML parsing and validation.
Telephone or email field: A custom node with a joined table, with hook_validate checking that at least one of the two fields is filled in and presenting the user with a proper message. Less than 40 lines of PHP. 20 minutes of development or so. Other fields on this custom node are added with CCK.
Redirect to a related, pre-filled form: A module with several custom nodes, extendable with CCK, but with some fields stored in the database by the module itself (e.g. the abovementioned telephone/mail fields). The module does the redirecting, validating and pre-filling on several hooks provided by Drupal. One of the larger modules, still less than 1000 lines of PHP.
Aggregated name column: Simple SQL pager query, some PHP looping over the items and aggregating them at will. Less than 50 lines of PHP. Less than an hour of development.
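
Two of the smaller items above are easy to sketch outside of Drupal. The snippet below is a minimal illustration in plain Python, not the original PHP and not Drupal's hook API: the function names, the dictionary keys (`telephone`, `email`, `prefix` and so on) and the error message are all assumptions made up for this example. It shows the "telephone OR email" rule and the profile listing where the name parts are folded into one column that is sorted by surname.

```python
# Illustrative sketch only: plain Python, not the original Drupal/PHP code.
# All function names and dictionary keys below are hypothetical.

def validate_contact(values):
    """Return a list of error messages; empty when the submission is valid."""
    telephone = (values.get("telephone") or "").strip()
    email = (values.get("email") or "").strip()
    # At least one of the two fields must be filled in.
    if not telephone and not email:
        return ["Please provide either a telephone number or an email address."]
    return []


NAME_PARTS = ("prefix", "first_name", "middle_name", "surname", "postfix")


def full_name(profile):
    """Join the name parts that are present into one display column."""
    return " ".join(profile[part] for part in NAME_PARTS if profile.get(part))


def name_column(profiles):
    """Build the aggregated name column, sorted case-insensitively by surname."""
    ordered = sorted(profiles, key=lambda p: (p.get("surname") or "").lower())
    return [full_name(p) for p in ordered]
```

In the setup described above, the first function corresponds to what the module's hook_validate does, and the other two to the PHP loop over the pager-query results.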

I am not trying to look cool and say “look how fast and small I can develop”, nor do I want to give a thumbs-down to CCK or Views, or any of the other building blocks. I am trying to point out how an often forgotten, simple tool can help. And that writing Views addons, CCK plugins and the like requires far -FAR- more development and complexity. It will introduce a lot more (unhandled) edge cases (seen a module lately that does not handle multiple fields correctly?). And it offers hardly any benefit other than the -theoretically- better re-usability. Theoretically, because when you are pragmatic, you can just as easily copy-paste some code from an old project as write a perfectly flexible and generic solution.

To illustrate: we spent a month on addons for Übercart, Views and CCK: single-click checkout, inserting barcodes in invoices, hacking the Übercart interface in templates, writing complex -dependent and related- fields for CCK, and so on. The client was not very happy with the workflow; we were far less happy with the enormous amount of (dependent!) code for all the addons and overrides. We lost all the benefits of re-using code. A complex form-alter introduces just the same amount of tight coupling as a fork would: you have to maintain your form-alter code on every change of the altered form, just as well. An amount of override and template code that exceeds the amount of re-used code defeats the purpose of getting a quick start.

Rewriting the entire thing in my own modules took less than 3 weeks. And we are far further than the 80% now, nearing 90. The generic solution, meanwhile, had left us entirely stuck at 80%, unable to get out, with the only solution being to “convince the client that the last 20% is not very important”. Well, it was, and rightly so. We killed a project which required over 30 modules and 3000+ lines of template code, replacing it with two custom modules (4000+ lines, so rather large) and no template logic.

It is as if a carpenter only used his completely computerised drilling robot, automated sawing machines and super-high-tech glue gun, when often a handsaw, a nail and a hammer will get two planks together in less than 5 minutes. A good carpenter might have all the high-tech tools, but never forgets about the ease and speed of a hammer and a nail.

So, yes. Using Views and CCK helps you forward. And it will get you to the 100% if your 100% is not that demanding. Say, in rapid prototyping: get a CCK+Views+Panels version up in a few hours, see if the general idea is good, throw it out and rewrite it in your own code.

But when your requirements are slightly more specialised, then a few simple modules, -developed in PHP-, are the quickest, cleanest and most pragmatic way. The only way that will make your client 100% happy. Especially when you are your own client!

Edit: we had over 3000 lines, not over 300 lines, of template code.

https://berk.es/2010/04/19/the-first-rule-of-coding-for-drupal-never-forget-about-the-option-to-write-your-own-code

"Pick up where you left last X" by not committing last changes, good branches and pseudocode.

Apr 14, 2010 Updated Apr 14, 2010

When programming, you often need to carry a single task over to the next day, or till “after the meeting” and such. I experimented and found a good method to pick up where I left off. Working the GTD way, well-documented code and a good software architecture all help to keep a project in line, up to speed and manageable. But they do not solve one thing: “picking up where you left off”. I have many clients, have done many projects and have lots of strings attached. Hence I get disturbed very often. Even when you can focus entirely on one single programming task (lucky you!), five o’clock is the end of the day (or, in my case, 18:30). Going on for another half hour or so is perfectly fine, but most often I cannot finish a single task and need to carry it over to the next day. Or till after that meeting. Getting back into the flow then requires quite some effort. What I do is twofold: work with pseudocode and don’t commit the last run.

First, I outline my project in the usual diagrams, documents and such. Then, when working on a more micro level (the methods), I first write out the inner workings in comments. Steve McConnell, in Code Complete, calls this the Pseudocode Programming Process. When finished, I commit this. When not finished, see below.

You often see some meeting, break or end of day coming closer while working in several classes, on several routines, or in the database, documentation and code all at once. You are in a certain flow, but really need to halt, because people are waiting, kids need to be picked up from school or your girlfriend needs attention. Valid enough reasons to stop working for now. Most people I know finish up hastily and commit the work. I call these five o’clock commits. Not only are they horrible from a revision-control point of view (a commit should always describe a complete change), they also waste an opportunity: to stay in the flow. Do not commit this. Leave it as it is. Even if you have your work in a committable state, leave it! After your break, or the next day, you open the code, and if you left it committable, the first thing you must do is commit the work.

This forces you to read through the diff, describe the changes and then commit. I found this exercise more than enough to get me right back into the flow, to pick up where I left off. If you left it in an uncommittable state, then much the same applies. You can read through the diff, describe what was going on and pick up the work right away.

When combined with good branching practice, this “reading through the diff” takes less than ten minutes. And, in my experience, it even helps when picking up work after a few days of other work, a long weekend or any other situation where the diff alone does not ring a bell anymore. In fact, after finishing this blog post, I am going to pick up where I left off on Friday (six days ago), and I am confident that the commented pseudocode, good branch names and the diff will leave me more than enough hints to be on track in less than ten minutes.
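
The comment-first step can be sketched like this. It is a hypothetical Python example: the function `upcoming_events` and the shape of its event dictionaries (`title`, `start`, `end`) are assumptions for illustration only. The comments were written first, as the outline of the routine; the code under each comment was filled in afterwards. A routine that is only half filled in still tells you, in plain words, what the next step was.

```python
from datetime import datetime
from itertools import groupby


def upcoming_events(events, now):
    """Group events that have not finished yet by the day they start on."""
    # Keep only events whose end date has not passed yet.
    unfinished = [e for e in events if e["end"] >= now]

    # Sort by start time, so the next upcoming event comes first.
    unfinished.sort(key=lambda e: e["start"])

    # Group the sorted events by calendar day, keeping the titles per day.
    by_day = {}
    for day, items in groupby(unfinished, key=lambda e: e["start"].date()):
        by_day[day] = [e["title"] for e in items]

    return by_day
```

When the day ends halfway through, the remaining comments without code under them, together with the uncommitted diff, are the hints for picking the work up again.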

https://berk.es/2010/04/15/pick-up-where-you-left-last-x-by-not-committing-last-changes-good-branches-and-pseudocode