Paperless

Nix function from string to pseudorandom integer
Nix
Or: Load balancing remote builders with pseudorandom numbers.

Background [1]

Nix remote builders are fantastic - a machine reachable by SSH plus a few lines of configuration (and avoiding some footguns), and you’ve got yourself seamless remote builds. The main issue is the network traffic between the builder and client host, but if that slows things down, disabling remote builders is usually as simple as adding --builders '' or a similar option to the command. Conversely, --max-jobs 0 forces remote builds.

Another issue is when a group of people, each on their own machine, are sharing a pool of builders. Since none of these are aware of each other, and there’s no central scheduling, it’s easy for a build machine to end up as a choke point, having every client asking it for resources before trying the next host in the list. This can delay build start-up. Even worse, if a builder is configured to run more than one job in parallel, it might end up doing that even when other builders are doing nothing. This means there will be unnecessary resource contention.

Finally, we want to minimise download time. It would be ideal for each build from a single user to always go to the same builder, so that the builder doesn’t have to download anything which is already in the Nix store. But since builders are shared, we can’t always guarantee this. Each user should have a unique list of builder priorities. With a lot of users and a lot of builders, this should result in a decent spread of load across the machines, and minimal downloads for each individual user.

The hack

An equivalent to a “priority” for builders is the speed factor. The higher the speed factor, the more preferred the host. We could generate a pseudo-random list of speed factors for each user, and achieve crude load balancing this way. (If we configured each user’s machine centrally we could of course assign each of them a list of speed factors, but this is not the scenario I’m dealing with.)

All we need now is a pseudo-random number generator which is likely to be unique per combination of user and builder. Thanks to Joel McCracken pointing out the relevant building blocks, this should do the trick:

pkgs.lib.fromHexString (
  builtins.substring 0 15 (
    builtins.hashString "md5" "your string"));

How it can be used in the build machine configuration [2][3]:

{ config, pkgs, ... }:
{
  config.nix.buildMachines =
    let
      # Map a string to a speed factor between 2 and 2^60 + 1
      pseudoRandomSpeedFactor =
        content: 2 + pkgs.lib.fromHexString (builtins.substring 0 15 (builtins.hashString "md5" content));
    in
    [
      rec {
        hostName = "[…]";
        speedFactor = pseudoRandomSpeedFactor (hostName + sshUser);
        sshUser = "[…]";
        # Other properties omitted
      }
      # Other builders omitted
    ];
}

To verify the new configuration, we can pull out the mapping from client host to builder host and speed factor with nix eval. For example, with two NixOS host configurations “weak” and “strong”, where “strong” doesn’t have a remote builder, and “weak” uses “strong” as the builder:

❯ nix eval .#nixosConfigurations --apply 'clientHosts: builtins.mapAttrs (_clientHost: props: map (builder: {${builder.hostName} = builder.speedFactor;}) props.config.nix.buildMachines) clientHosts'
{ strong = [ ]; weak = [ { strong = 290089632656129126; } ]; }

  1. Caveat: I’m relatively new to remote builders. They are pretty simple, so I haven’t actually read the Nix code to verify all my assertions, and have made assumptions about how they work based on the “simplest thing which could possibly work” heuristic. Please let me know if you have any factual corrections.

  2. If everyone is using the same SSH user, you could use config.networking.hostName instead of sshUser.

  3. The 16 hex characters (“0” through “9”, “a” through “f”) representing four bits each, and the 15 character limit of fromHexString, mean that the final number is between 0 and 2^(15×4)-1, inclusive. Since a speed factor of 0 would be pointless, and a speed factor of 1 presumably makes the remote builder as likely to be used as the local host, I’m adding 2 to the result so that it’s between 2 and 2^(15×4)+1, inclusive. Yes, this is a silly thing to worry about, but my brain would not shut up about the corner case.
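To sanity-check the arithmetic outside Nix, the same calculation can be sketched in Bash. This is not part of the original configuration: the function name and the sample host and user are made up for illustration, and it assumes md5sum and cut are installed and that Bash uses 64-bit integers.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the Nix expression's arithmetic, it is not used by Nix.
# Assumes md5sum and cut are available, and 64-bit shell arithmetic.

pseudo_random_speed_factor() {
    local content="$1" digest
    # printf '%s' avoids hashing a trailing newline, so the digest matches
    # hashing the bare string. md5sum prints the 32 hex digits first.
    digest="$(printf '%s' "${content}" | md5sum | cut -c 1-15)"
    # 15 hex digits are 60 bits, which fits in a signed 64-bit integer.
    printf '%d\n' "$((2 + 16#${digest}))"
}

# Hypothetical host name and SSH user, concatenated like hostName + sshUser
pseudo_random_speed_factor 'builder.example.org''alice'
```

The output should be the same for the same input on every machine, which is what makes the per-user builder ordering stable.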

https://paperless.blog/string-to-pseudorandom-integer-in-nix
Remote Nix build footguns
Nix
Remote builds are one of the biggest superpowers of Nix. But while it is in theory easy to set up, there are a fair number of footguns. I have no feet, but I must walk. Here are a couple of tips for making sure everything is set up properly.

First, a quick recap of what’s needed: the root user on the client needs to be able to SSH non-interactively into the builder as a user who is allowed to run Nix builds [1]. We can verify different aspects of this on the command line.

On the client host:

  • Get the builders configuration with nix config show builders, and if it points to a file, cat it to get the usernames, host names, and SSH key paths. If this configuration is missing, you might need to restart the Nix daemon.
  • Verify that your SSH keys don’t have a passphrase with ssh-keygen -f PRIVATE_KEY -y. If it prompts for a passphrase, remove it with ssh-keygen -f PRIVATE_KEY -p.

On the builder host:

  • nix config show allowed-users lists who can trigger Nix builds. The user listed in the relevant builders entry on the client (or one of its groups if there are any @GROUP entries) needs to be in the allowed users list. That is, if the client builders configuration has an entry starting with ssh-ng://alice@big, and nix config show allowed-users on big lists root @wheel, then groups alice on big must include “wheel”.
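The allowed-users check above can be sketched as a few Bash helpers. The function names are hypothetical, and the sketch assumes the builder URI always includes a user name, as in the ssh-ng://alice@big example.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are made up for illustration.
# Assumes URIs of the form SCHEME://USER@HOST.

builder_user() {
    local uri="${1#*://}"   # drop the scheme, e.g. "ssh-ng://"
    printf '%s\n' "${uri%%@*}"
}

builder_host() {
    local uri="${1#*://}"
    printf '%s\n' "${uri#*@}"
}

# Succeeds if the space-separated group list in $2 contains the group $1.
has_group() {
    local wanted="$1" group
    local -a groups
    read -r -a groups <<< "$2"
    for group in "${groups[@]}"; do
        [[ "${group}" == "${wanted}" ]] && return 0
    done
    return 1
}

# On the builder, for an allowed-users entry of @wheel:
#   has_group wheel "$(id -Gn "$(builder_user 'ssh-ng://alice@big')")"
```

id -Gn prints only the group names, which is easier to parse than the output of groups.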

It’s a bit clunky to verify the connection from the client to the builder, but the following should do: sudo ssh -o 'IdentityAgent none' -i PRIVATE_KEY USER@HOST true. sudo is necessary because the Nix daemon by default runs as the root user. And the IdentityAgent none setting is necessary to stop SSH from using keys from the SSH agent of the user running sudo (via $SSH_AUTH_SOCK).

  1. @waffle8946 points out that I should use allowed-users for security reasons. In my case I’m already root on the builder host, so I use trusted-users.

https://paperless.blog/remote-nix-build-footguns
Hardening systemd services via system call filter
systemd, hardening, security
No code is perfect, so it’s useful to limit its potential for doing harm. systemd provides a system call filter which we can use to do just that, but it’s easy to limit it so much that it breaks the service.

Let’s start with systemd-analyze security SERVICE to get a list of settings which can limit how much of a risk the service poses to the system (“exposure”). (We’re only looking into system calls in this article, but I would recommend looking into everything it lists.) Here’s an example analysing the SSH daemon with systemd-analyze security sshd:

  NAME                                                        DESCRIPTION                                                             EXPOSURE
✓ AmbientCapabilities=                                        Service process does not receive ambient capabilities
✗ CapabilityBoundingSet=~CAP_AUDIT_*                          Service has audit subsystem access                                           0.1
✗ CapabilityBoundingSet=~CAP_BLOCK_SUSPEND                    Service may establish wake locks                                             0.1
✗ CapabilityBoundingSet=~CAP_BPF                              Service may load BPF programs                                                0.1
✗ CapabilityBoundingSet=~CAP_(CHOWN|FSETID|SETFCAP)           Service may change file ownership/access mode/capabilities unrestricted      0.2
✗ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)         Service may override UNIX file/IPC permission checks                         0.2
✗ CapabilityBoundingSet=~CAP_IPC_LOCK                         Service may lock memory into RAM                                             0.1
✗ CapabilityBoundingSet=~CAP_KILL                             Service may send UNIX signals to arbitrary processes                         0.1
✗ CapabilityBoundingSet=~CAP_LEASE                            Service may create file leases                                               0.1
✗ CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE                  Service may mark files immutable                                             0.1
✗ CapabilityBoundingSet=~CAP_MAC_*                            Service may adjust SMACK MAC                                                 0.1
✗ CapabilityBoundingSet=~CAP_MKNOD                            Service may create device nodes                                              0.1
✗ CapabilityBoundingSet=~CAP_NET_ADMIN                        Service has network configuration privileges                                 0.2
✗ CapabilityBoundingSet=~CAP_NET_(BIND_SERVICE|BROADCAST|RAW) Service has elevated networking privileges                                   0.1
✗ CapabilityBoundingSet=~CAP_SET(UID|GID|PCAP)                Service may change UID/GID identities/capabilities                           0.3
✗ CapabilityBoundingSet=~CAP_SYS_ADMIN                        Service has administrator privileges                                         0.3
✗ CapabilityBoundingSet=~CAP_SYS_BOOT                         Service may issue reboot()                                                   0.1
✗ CapabilityBoundingSet=~CAP_SYS_CHROOT                       Service may issue chroot()                                                   0.1
✗ CapabilityBoundingSet=~CAP_SYSLOG                           Service has access to kernel logging                                         0.1
✗ CapabilityBoundingSet=~CAP_SYS_MODULE                       Service may load kernel modules                                              0.2
✗ CapabilityBoundingSet=~CAP_SYS_(NICE|RESOURCE)              Service has privileges to change resource use parameters                     0.1
✗ CapabilityBoundingSet=~CAP_SYS_PACCT                        Service may use acct()                                                       0.1
✗ CapabilityBoundingSet=~CAP_SYS_PTRACE                       Service has ptrace() debugging abilities                                     0.3
✗ CapabilityBoundingSet=~CAP_SYS_RAWIO                        Service has raw I/O access                                                   0.2
✗ CapabilityBoundingSet=~CAP_SYS_TIME                         Service processes may change the system clock                                0.2
✗ CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG                   Service may issue vhangup()                                                  0.1
✗ CapabilityBoundingSet=~CAP_WAKE_ALARM                       Service may program timers that wake up the system                           0.1
✓ Delegate=                                                   Service does not maintain its own delegated control group subtree
✗ DeviceAllow=                                                Service has no device ACL                                                    0.2
✗ IPAddressDeny=                                              Service does not define an IP address allow list                             0.2
✓ KeyringMode=                                                Service doesn't share key material with other services
✗ LockPersonality=                                            Service may change ABI personality                                           0.1
✗ MemoryDenyWriteExecute=                                     Service may create writable executable memory mappings                       0.1
✗ NoNewPrivileges=                                            Service processes may acquire new privileges                                 0.2
✓ NotifyAccess=                                               Service child processes cannot alter service state
✗ PrivateDevices=                                             Service potentially has access to hardware devices                           0.2
✗ PrivateMounts=                                              Service may install system mounts                                            0.2
✗ PrivateNetwork=                                             Service has access to the host's network                                     0.5
✗ PrivateTmp=                                                 Service has access to other software's temporary files                       0.2
✗ PrivateUsers=                                               Service has access to other users                                            0.2
✗ ProcSubset=                                                 Service has full access to non-process /proc files (/proc subset=)           0.1
✗ ProtectClock=                                               Service may write to the hardware clock or system clock                      0.2
✗ ProtectControlGroups=                                       Service may modify the control group file system                             0.2
✗ ProtectHome=                                                Service has full access to home directories                                  0.2
✗ ProtectHostname=                                            Service may change system host/domainname                                    0.1
✗ ProtectKernelLogs=                                          Service may read from or write to the kernel log ring buffer                 0.2
✗ ProtectKernelModules=                                       Service may load or read kernel modules                                      0.2
✗ ProtectKernelTunables=                                      Service may alter kernel tunables                                            0.2
✗ ProtectProc=                                                Service has full access to process tree (/proc hidepid=)                     0.2
✗ ProtectSystem=                                              Service has full access to the OS file hierarchy                             0.2
  RemoveIPC=                                                  Service runs as root, option does not apply
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                        0.3
✗ RestrictAddressFamilies=~AF_NETLINK                         Service may allocate netlink sockets                                         0.1
✗ RestrictAddressFamilies=~AF_PACKET                          Service may allocate packet sockets                                          0.2
✗ RestrictAddressFamilies=~AF_UNIX                            Service may allocate local sockets                                           0.1
✗ RestrictAddressFamilies=~…                                  Service may allocate exotic sockets                                          0.3
✗ RestrictNamespaces=~cgroup                                  Service may create cgroup namespaces                                         0.1
✗ RestrictNamespaces=~ipc                                     Service may create IPC namespaces                                            0.1
✗ RestrictNamespaces=~mnt                                     Service may create file system namespaces                                    0.1
✗ RestrictNamespaces=~net                                     Service may create network namespaces                                        0.1
✗ RestrictNamespaces=~pid                                     Service may create process namespaces                                        0.1
✗ RestrictNamespaces=~user                                    Service may create user namespaces                                           0.3
✗ RestrictNamespaces=~uts                                     Service may create hostname namespaces                                       0.1
✗ RestrictRealtime=                                           Service may acquire realtime scheduling                                      0.1
✗ RestrictSUIDSGID=                                           Service may create SUID/SGID files                                           0.2
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                0.1
  SupplementaryGroups=                                        Service runs as root, option does not matter
✗ SystemCallArchitectures=                                    Service may execute system calls with all ABIs                               0.2
✗ SystemCallFilter=~@clock                                    Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@cpu-emulation                            Service does not filter system calls                                         0.1
✗ SystemCallFilter=~@debug                                    Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@module                                   Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@mount                                    Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@obsolete                                 Service does not filter system calls                                         0.1
✗ SystemCallFilter=~@privileged                               Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@raw-io                                   Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@reboot                                   Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@resources                                Service does not filter system calls                                         0.2
✗ SystemCallFilter=~@swap                                     Service does not filter system calls                                         0.2
✗ UMask=                                                      Files created by service are world-readable by default                       0.1
✗ User=/DynamicUser=                                          Service runs as root user                                                    0.4

→ Overall exposure level for sshd.service: 9.6 UNSAFE 😨

As we can see, any issues with the SSH daemon would open up the system to a lot of scary side effects. Based on discussions it seems that limiting the capabilities of the SSH daemon without breaking any of the vast array of functionality it provides is actually quite difficult. But most services are much simpler and can be hardened a lot more, for example:

  NAME                             DESCRIPTION                                                                                         EXPOSURE
✓ SystemCallFilter=~@clock         System call allow list defined for service, and @clock is not included
✓ SystemCallFilter=~@cpu-emulation System call allow list defined for service, and @cpu-emulation is not included
✓ SystemCallFilter=~@debug         System call allow list defined for service, and @debug is not included
✓ SystemCallFilter=~@module        System call allow list defined for service, and @module is not included
✓ SystemCallFilter=~@mount         System call allow list defined for service, and @mount is not included
✓ SystemCallFilter=~@obsolete      System call allow list defined for service, and @obsolete is not included
✓ SystemCallFilter=~@privileged    System call allow list defined for service, and @privileged is not included
✓ SystemCallFilter=~@raw-io        System call allow list defined for service, and @raw-io is not included
✓ SystemCallFilter=~@reboot        System call allow list defined for service, and @reboot is not included
✗ SystemCallFilter=~@resources     System call allow list defined for service, and @resources is included (e.g. ioprio_set is allowed)      0.2
✓ SystemCallFilter=~@swap          System call allow list defined for service, and @swap is not included

Not bad! This service only needs access to some of the system calls in the resources group, such as ioprio_set. We can use systemctl show --property=SystemCallFilter SERVICE to show the full list.

Anything not allowed will result in signal 31, also known as SIGSYS, which can be inspected in the core dump log. Below is an example from a service which I had changed without updating the hardening settings:

❯ journalctl --identifier=systemd-coredump --lines=1 --output=cat
Process 582 (sed) of user 1000 dumped core.

Module [omitted]/sed without build-id.
Stack trace of thread 582:
#0  0x00007f8f72d1051b fchown (libc.so.6 + 0x11051b)
#1  0x000000000040817b closedown ([omitted]/sed + 0x817b)
#2  0x0000000000408d70 read_pattern_space ([omitted]/sed + 0x8d70)
#3  0x000000000040ad4d process_files ([omitted]/sed + 0xad4d)
#4  0x0000000000403abf main ([omitted]/sed + 0x3abf)
#5  0x00007f8f72c2a4d8 __libc_start_call_main (libc.so.6 + 0x2a4d8)
#6  0x00007f8f72c2a59b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a59b)
#7  0x0000000000403be5 _start ([omitted]/sed + 0x3be5)
ELF object binary architecture: AMD x86-64

In this case we can see that sed tried to run fchown. We can use systemd-analyze syscall-filter to see which system calls (and other groups) are in each group, and searching for fchown we can see that the chown group contains fchown. We’ll need to either

  1. add fchown or @chown to SystemCallFilter (easy, and low risk since the service runs as my user);
  2. change the sed command to not need to call fchown (potentially impossible without changing sed itself); or
  3. use some other command than sed in the service (potentially lots of work).

The first option is easy, and the risk is tolerable, so that’s what I ended up doing.
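The lookup described above, from stack trace to system call group, can be sketched with two small Bash functions. Both names are hypothetical. The second one assumes that systemd-analyze syscall-filter prints each group name at the start of a line, followed by its members on indented lines.

```shell
#!/usr/bin/env bash
# Sketch only: function names are made up for illustration.

# Print the symbol of frame #0 from a systemd-coredump stack trace on stdin.
top_frame_symbol() {
    awk '$1 == "#0" { print $3; exit }'
}

# Print every group that lists the system call $1.
# Assumes group headers start with "@" in the first column and members
# are indented below them.
groups_containing() {
    awk -v name="$1" '
        /^@/ { group = $1; next }
        $1 == name { print group }
    '
}

# Usage on a systemd host:
#   journalctl --identifier=systemd-coredump --lines=1 --output=cat | top_frame_symbol
#   systemd-analyze syscall-filter | groups_containing fchown
```

The top frame is not always the system call wrapper itself, so it’s worth reading the rest of the trace before changing SystemCallFilter.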

Knowing which commands are available and how to find relevant debugging info is half the battle, so I hope this was useful.

Thanks to _Andrew on the NixOS Discourse for the journalctl tip!

https://paperless.blog/hardening-systemd-services-via-system-call-filter
nixf-tidy pre-commit hook
quality assurance, Nix, linting
nixf-tidy (introduction, docs) is a simple Nix linter CLI tool which takes Nix code on standard input and emits a JSON array of issues. This mode of operation makes it non-trivial to use as a pre-commit hook, so I’ve written a small wrapper:

nixf-tidy.bash
#!/usr/bin/env bash

set -o errexit -o noclobber -o nounset -o pipefail
shopt -s failglob inherit_errexit

for file; do
    result="$(nixf-tidy --pretty-print --variable-lookup < "${file}")"
    if [[ "${result}" != '[]' ]]; then
        printf '%s: %s\n' "${file}" "${result}" >&2
        exit_code=3
    fi
done

exit "${exit_code-0}"

Use

Save nixf-tidy.bash above somewhere in your repository and use the following snippet:

.pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: nixf-tidy
        name: nixf-tidy
        entry: ./path/to/nixf-tidy.bash
        files: \.nix$
        language: system
        stages: [pre-commit]

https://paperless.blog/nixf-tidy-pre-commit-hook
“Treadmill vs. Real Hill” response
physics, experiment
In Treadmill vs. Real Hill: Which is harder to run, Steve Mould and Jared Reabow ran an interesting experiment: does it take more energy to run up a hill or to run in place on a treadmill with the same inclination? The result was interesting, but also a bit of a let-down: the treadmill run used 9.1 W, while the ramp run used 10.1 W, 11% more. More research is clearly needed 😅! What could improve the experimental setup?

The friction of a regular treadmill and a wooden slope is going to be different, so using the same surface material would eliminate that variable. Using a hard surface material would also reduce friction, but could increase the amount of slip. We can avoid the slip by using a rack railway, which would also allow testing with much bigger inclines, even beyond 45°. A Lego treadmill should work beautifully for this: run on the treadmill for the stationary test, and lay the tread out on an incline to do the moving test.

Comparing with different inclinations should also help. If we see an X% difference between the energy use on the treadmill and the stationary surface both at a positive angle and when running horizontally, we can conclude that the difference is not because of the incline. A bigger incline would also be interesting.

Even a negative incline could be interesting, if instead of running the engine we used engine braking to generate power. Would this result be relevant for the experiment? I’m not sure, but intuitively I would expect the power use on a slope of X° to be similar to the power generated on a slope of -X°. Unfortunately this result would probably be dominated by the engine and dynamo efficiencies, which are both going to be much less than 100%. They could also be completely different, say, 80% and 50%.

Another easy win would be to use an aerodynamic vehicle to minimise differences due to air resistance. In a similar vein, a heavy vehicle would make the air resistance a smaller fraction of the total energy use. And since drag is proportional to the square of the speed, using a slow speed would further reduce the contribution from air resistance. Slowing down would also reduce the risk of transferring extra energy to the vehicle because of rattling in the treadmill system itself (thanks Adon!).
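As a rough sketch of why mass and speed matter here, assume the textbook quadratic drag model and a constant speed v up a slope of angle θ. The symbols are my own choice rather than anything from the video: m is the vehicle mass, g the gravitational acceleration, ρ the air density, C_d the drag coefficient and A the frontal area.

```latex
P_{\text{climb}} = m g v \sin\theta
\qquad
P_{\text{drag}} = \tfrac{1}{2} \rho C_d A v^3
\qquad
\frac{P_{\text{drag}}}{P_{\text{climb}}} = \frac{\rho C_d A v^2}{2 m g \sin\theta}
```

Under these assumptions the share of power lost to drag grows with the square of the speed and shrinks in proportion to the mass. So a heavy, slow vehicle should bring the ramp run closer to the treadmill run, where the vehicle barely moves relative to the air.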

In summary and conclusion, I’ll need about 500,000 moon dollars for Lego parts, a to-be-negotiated sum to contract a friend to “find a Lagrangian and minimise the action”, and three months to produce a Git repo and YouTube video with the results.

https://paperless.blog/treadmill-vs-real-hill-response
Getting started with automated quality assurance
quality assurance, automation
The bad news is that every discussion we’ve had about tabs vs spaces was a waste of time, and neither of us learned anything useful. The good news is that we won’t need to have that discussion ever again - there are tools which apply idiomatic indentation for pretty much any language under the sun. But the good news doesn’t stop there. That discussion about whether entries in a long list should all be on a single line, broken into long lines, or broken into one item per line? There’s a tool for that. Newline at end of file? There’s a tool for that. How imports should be sorted and grouped? Detecting overly general catch statements, unused variables, or bad spelling? You guessed it. In this article I’ll go through why you might want to automate QA, and some practicalities of how to adopt automated QA tools, so that the team can concentrate on the actually important work.

Why

Do any of the following apply to you?

  • Participated in a discussion about tabs vs. spaces.
  • Found a bug in production which would’ve been obvious at review time if the code had been properly formatted.
  • Created a function which had more cyclomatic complexity than the London Underground.
  • Committed a file with a syntax error.
  • Committed a syntactically valid file with schema errors. For example, a JSON file with a $schema property, or an XML file with a schemaLocation attribute.
  • Committed a spelling mistake, risking making a future git grep miss important information, or looking bad in front of clients.
  • Committed a script with the wrong line endings, causing incomprehensible runtime errors.
  • Received a complaint about misleading, overly complex, or non-idiomatic code.
  • Silently cursed someone, maybe even yourself, for using such weird formatting.

You see where this is going - all of these can be avoided by using tools which are widely available, robust, and generally have excellent defaults.

Why not?

  • More maintenance burden, making sure the tools keep working.
  • Each tool has quirks.
  • Some tools use your least favourite configuration format.
  • Some big commits. (The next section discusses ways to minimise that.)

Strategy

tl;dr: Start early, improve steadily.

When learning a new set of tools it’s not a good idea to try to adopt all of them at once. Automated QA tools are still a relatively new phenomenon, so conventions are still being developed, and learning how to use one tool does not usually translate into an easier time with the next one.

Make sure to get buy-in before introducing a tool. If someone is sceptical, try a quick demo to see if they like it. One important thing to keep in mind at this stage is that today’s tools are generally extremely robust. Gone are the days when auto-formatting a piece of code had a significant chance of breaking it. Of course I can’t speak for all tools, but anything mainstream has almost by definition been tested on thousands of projects already.

After introducing one tool I’d recommend looking for another one which is a small but useful step forward. Any time we find some particular part of development tedious, chances are someone has developed a tool to avoid most of that work. A quick search for “CI [the task]” (without quotes) in a search engine should find something relevant.

Make sure everyone working on the project has time to get familiar with new tools before introducing another one. Otherwise you risk becoming the “Keeper of the Tool”, leading a solitary life in a silo.

Don’t be afraid to revert or change tools. Sometimes the tool is too painful to work with. Many years ago I tried a Java formatter. But rather than sensible defaults, the first thing I had to do was to choose between a bunch of formatting standards, none of which I was familiar with because I was just getting started. I won’t name names, but I still run into this with “modern” tools, having to do a bunch of obscure configuration just to get started. Other times the community default changes or crystallises, such as nixfmt-rfc-style recently becoming the default Nix formatting tool.

Sometimes two tools overlap in functionality. For example, I’d recommend using isort with Black, even though we have to manually configure isort to use a style compatible with Black. Other times the tools refuse to work together, and we have to choose between them. 🤷

If a few hours with a tool doesn’t give much benefit, just stash that work. Maybe look into it again in a month or two, when the original experience has faded a bit. At the same time, we shouldn’t feel obliged to introduce all the tools we possibly can - some might just be more trouble than they are worth, or don’t fit how we want to work. For example, I like the idea of prose style checkers, but not of excising the word “is” from my blog!

The earlier in a project automation is introduced, the better. Introducing a formatter usually results in a single commit with a big diff, which can make it harder to explore the version control log. That said, if a project is in active development it’s probably going to be around for a long time yet, so we should try to judge fairly the cost of such a one-time big diff against the repeated return on investment from automation. In the golden words of Randall Munroe, Is It Worth the Time? (But please also consider the developer experience improvement! Time is not the only dimension worth optimising for.)

When working on a project with many developers, we need to be careful to make the introduction of a new tool as painless as possible. This means we need to be prepared to learn the tool in some depth before even suggesting it for production use. We might have to showcase it, discuss any quirks (slow speed, bad defaults, workarounds for common issues), and create a plan for how to introduce it. This might involve setting up temporary logic to only apply the tool to new files, so that the team can get used to it before committing to the big diff resulting from applying it to the entire repository. Then we might apply the tool to all changed files. Make sure developers apply these changes in a separate commit or branch before the changes they are working on, so that it’s easy to review the formatting changes separately from any semantic changes. This should organically lead to a better state, and after a while we can introduce a single commit (or small series of commits) to finish the job and tear down any temporary code.
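The "new files first, then all changed files" rollout could be driven by a small helper like the one below. This is only a sketch: the function names and the origin/main default are my own assumptions, and it relies on the tab-separated output of git diff --name-status.

```python
import subprocess
from typing import List


def files_to_check(name_status: str, new_only: bool) -> List[str]:
    """Pick paths from `git diff --name-status` output.

    With new_only=True only added files are returned (phase one of the
    rollout); otherwise every file that still exists is returned (phase two).
    """
    selected = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            continue  # Deleted files cannot be linted or formatted
        if new_only and not status.startswith("A"):
            continue
        # For renames and copies the last field is the new path
        selected.append(paths[-1])
    return selected


def changed_since(base: str = "origin/main") -> str:
    """Return the raw name-status diff between the merge base and HEAD."""
    # Assumption: the default branch is origin/main; adjust as needed.
    result = subprocess.run(
        ["git", "diff", "--name-status", f"{base}...HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

The resulting list can be handed to the tool of choice, for example as the arguments after pre-commit run --files. Once the whole repository has been converted, this helper is part of the temporary code to tear down.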

Which tools?

This isn’t really the best place to go into any depth (future articles, perhaps), but I’d recommend these tools to anyone who wants their team to be able to concentrate on the important parts of the work:

  • pre-commit can be used in CI to run all linters and formatters on the entire repo with a single command, pre-commit run --all-files. Locally, pre-commit install will set up hooks to run only on the changed files when committing, to fix issues quickly before committing. All of the following tools work with pre-commit; see my pre-commit configuration for some examples.
  • pre-commit-hooks also has a grab-bag of useful hooks.
  • EditorConfig is a simple way to tell all modern editors how to do the basics.
  • check-jsonschema can verify conformance with both JSON and YAML schemas out of the box.
  • gitlint checks that commit messages conform to your requirements.
  • Prettier formats not just code, but also data files and markup.
  • Vale checks prose rules.
https://paperless.blog/getting-started-with-automated-quality-assurance
Automating ticket progression
automation, Git, project management
I forget to move tickets around the board when I’m supposed to. After working with 10+ ticketing systems for some 20 years, it’s still too alien to have to perform a manual step in a completely separate system at the same time as trying to concentrate on the development. Why not automate this process a bit?

This article is a brain dump for my future self to implement, and therefore makes a bunch of simplifying assumptions, glossing over a bunch of details and considerations like team size. Still, the central idea might be useful.

Assumptions
  • Tickets are tracked electronically, or you have a robot and sufficient time to automate your physical Kanban board 😉.
  • The ticket tracking system has a sane API.
  • Tickets all have a unique ID, so that you don’t have to do fancy heuristics to connect them to change sets.
  • You use feature branches. It would be possible to adapt this approach to trunk-based development, but it would probably require some rethinking and heuristics.
  • Each branch corresponds to exactly one ticket. This is doable if tickets are small.
  • Some work is not related to a specific ticket, and that’s OK. We have finite time, after all.
Discussion

Tickets are often created with just some hastily scribbled notes. This may be OK in a one-person team, but implicit knowledge can’t generally be inferred from such notes, so we probably want to make sure there are separate “(not) ready [for an arbitrary developer to start working on without further context]” states. Only once the ticket is ready should anyone start working on it.

In standard “make the change easy, then make the easy change” fashion, feature branches mean that preparatory work such as refactoring should probably be in separate commits in the same branch as the actual feature. These commits should not be squashed when merging, since that makes reverts extremely painful.

Process
  1. Work starts when creating the branch[1]. This can be signalled by implementing a create-feature-branch Git alias. This alias would take just the ticket ID, then perform something like the following process:
    1. Download the ticket metadata. If it’s not in the “ready” state, either warn the user or abort, depending on the state and how strict you want to be with the process. For example, it’s probably not expected that a cancelled ticket moves into “in progress”, so any such attempt can probably be assumed to be a typo.
    2. Create a slug containing the ticket title and ID, in such a way that it’s easy to programmatically extract the ID from the slug. For example, a ticket with a title of “Alert on excess CPU use” and ID of “1234” might get a slug of “alert-on-excess-cpu-use-1234”.
    3. Create a branch with the name of the slug. As in, git branch SLUG DEFAULT_REMOTE/DEFAULT_BRANCH.
    4. Change to the new branch.
    5. Call the ticketing system API to change the ticket state to “in progress”.
  2. Create a draft merge request when pushing the first time. This could use the name of the ticket as the title. This shows how work is progressing, and should give the right idea about the fact that the branch is not ready to base other work on, because the commit history may change. Signalling that the history might change encourages atomic commits (as opposed to fix-up commits for linting issues) and a clean history through rebasing onto the target branch rather than merging from it.
  3. Move the ticket to “in review” when a merge request is no longer in draft, and/or when deployed to a user acceptance environment.
  4. Move the ticket to “in production” (often called “done”, which is misleading) when it’s been deployed to production.
  5. Move the ticket to “in use” when log monitors detect that the relevant code has been triggered.
  6. Move the ticket to “cancelled” if the branch is deleted before merging.
  7. Create a new ticket to remove the functionality when log monitors detect that the relevant code has not been triggered for a Long Time™. This might be too much work, but for a complex system it could be one way to keep the maintenance burden manageable.
[1] It might be simpler to implement a server-side hook to set the state to “in progress” when someone first pushes a branch with a ticket ID, but that pushes a lot of information-gathering much later into the process. Part of the appeal of the process above is that information is updated ASAP.
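
The slug handling and state check from step 1 might look something like this minimal sketch. It assumes numeric ticket IDs, and the state names other than “ready” and “cancelled” as well as all function names are hypothetical; a real ticketing system will have its own.

```python
import re
import unicodedata
from typing import Optional


def make_slug(title: str, ticket_id: str) -> str:
    """Build a branch name which ends with the ticket ID."""
    # Fold accented characters to ASCII so the slug is safe in branch names
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen
    words = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return f"{words}-{ticket_id}" if words else ticket_id


def extract_ticket_id(slug: str) -> Optional[str]:
    """Return the trailing numeric ticket ID, or None if there is none."""
    match = re.search(r"(?:^|-)(\d+)$", slug)
    return match.group(1) if match else None


def start_action(state: str) -> str:
    """Decide whether to 'start', 'warn' or 'abort' for a ticket state."""
    if state == "ready":
        return "start"
    if state in {"cancelled", "in production"}:
        # Starting work on a finished or cancelled ticket is probably a typo
        return "abort"
    return "warn"
```

Keeping the ID as the last hyphen-separated part means later steps, such as moving the ticket when a merge request leaves draft, can recover it from the branch name alone.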

https://paperless.blog/automating-ticket-progression
TLS certificate setup on NixOS
HTTPS, TLS, NixOS, Let's Encrypt
Info dump on how to set up an HTTPS-enabled service on NixOS.

Prerequisites:

  • API access to change DNS entries for your own domain
  • NixOS
  • Static IP or dynamic DNS setup

Starting off configuration.nix:

{
  modulesPath,
  config,
  lib,
  pkgs,
  ...
}:
let
  certName = "${config.networking.hostName}-dot-${config.networking.domain}-wildcard";
in
{
  # TODO: See below
}

Networking:

networking = {
  domain = "example.org"; # TODO: Replace with your own domain
  firewall.allowedTCPPorts = [
    22 # TODO: Configure SSH (not shown)
    # I've intentionally left out port 80 since all modern clients support TLS
    443
  ];
  hostName = "your-hostname"; # TODO: Replace with the name you want for your server
};

Let’s Encrypt (ACME) setup:

security.acme = {
  acceptTerms = true;
  certs."${certName}" = {
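    # TODO: Replace with a real path or something like `config.sops.secrets.acmeCredentials.path` if you enable SOPS
    # Quoted on purpose: a string avoids any risk of the file being copied into the world-readable Nix store
    credentialsFile = "/path/to/acme/credentials";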
    # TODO: Replace with a value from https://search.nixos.org/options?query=security.acme.certs.%3Cname%3E.dnsProvider
    dnsProvider = "some-provider";
    domain = "*.${config.networking.hostName}.${config.networking.domain}";
  };
  defaults.email = "jdoe@example.org"; # TODO: Replace with your email address
};

Enable any HTTP service. In this case I’ll be setting services.audiobookshelf.enable = true; to serve audiobooks locally. Then I’ll use nginx to serve audiobookshelf to the world:

services.nginx = {
  clientMaxBodySize = "4G"; # Allow uploading big audiobooks
  enable = true;
  recommendedGzipSettings = true;
  recommendedOptimisation = true;
  recommendedProxySettings = true;
  recommendedTlsSettings = true;
  virtualHosts = {
    "audiobookshelf.${config.networking.fqdn}" = {
      forceSSL = true;
      locations."/" = {
        proxyPass = "http://127.0.0.1:${builtins.toString config.services.audiobookshelf.port}";
        proxyWebsockets = true;
        extraConfig = ''
          proxy_redirect http:// $scheme://;
        '';
      };
      useACMEHost = certName;
    };
  };
};

Using sops-nix is optional, but it is a good way to automate secrets management. The Nix part of the configuration should look something like this:

sops = {
  age.sshKeyPaths = [ "/etc/ssh/ssh_host_ed25519_key" ];
  defaultSopsFile = ./secrets/default.yaml;
  secrets = {
    acmeCredentials = { };
  };
};

Then there’s the file configuration in .sops.yaml:

keys:
  # TODO: Use `age-keygen --output ~/.config/sops/age/keys.txt` to replace the value below
  - &admin age125nlnhal9c90u8vwtveurccf5emtdk2u5nr3vv3yyu5kfmldgsusre2n8k
  # TODO: Use `ssh-keyscan HOST | ssh-to-age` to replace the value below
  - &host_audiobookshelf age1hr9vcf9jlfgxf4f6c3f9yq2kklzj7tplxv7cyulhqzfvhx42le4ssnq89j
creation_rules: # sops updatekeys secrets/*
  - path_regex: audiobookshelf/secrets/[^/]+\.(yaml|json|env|ini|sops)$
    key_groups:
      - age:
          - *admin
          - *host_audiobookshelf

You can then add encrypted credentials in audiobookshelf/secrets/default.yaml, which would look something like this:

acmeCredentials: ENC[AES256_GCM,data:…,type:str]
sops:
  kms: []
  gcp_kms: []
  azure_kv: []
  hc_vault: []
  age:
    - recipient: age125nlnhal9c90u8vwtveurccf5emtdk2u5nr3vv3yyu5kfmldgsusre2n8k
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        …
        -----END AGE ENCRYPTED FILE-----
    - recipient: age1hr9vcf9jlfgxf4f6c3f9yq2kklzj7tplxv7cyulhqzfvhx42le4ssnq89j
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        …
        -----END AGE ENCRYPTED FILE-----
  lastmodified: "2024-11-20T04:20:13Z"
  mac: ENC[AES256_GCM,data:…,type:str]
  pgp: []
  unencrypted_suffix: _unencrypted
  version: 3.9.1

Finally, to allow nginx to read the TLS certificate:

users.users.nginx.extraGroups = [ "acme" ];

Apply this configuration, and https://audiobookshelf.your-hostname.example.org should be accessible online.
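
To double-check the result from another machine, something like the following sketch could help. It is not part of the NixOS configuration and the function names are made up for illustration. The first function shows why the wildcard certificate covers the virtual host: the * stands for exactly one DNS label. The second lists the DNS names in the certificate the server actually presents.

```python
import socket
import ssl
from typing import List


def wildcard_covers(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name like '*.host.example.org' covers hostname."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    # A wildcard replaces exactly one label, so the label counts must match
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[0] != "" and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def served_dns_names(hostname: str, port: int = 443) -> List[str]:
    """Connect with default verification and list the certificate's DNS names."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

Calling served_dns_names("audiobookshelf.your-hostname.example.org") raises an ssl.SSLCertVerificationError if the certificate is not trusted or does not match the host name, which is a quick way to spot a misconfigured certificate.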

https://paperless.blog/tls-certificate-setup-on-nixos
The lost taglines
documentation, silliness, software
Software documentation usually does not include the most important facts you want to know as a beginner. The kind of stuff you might internalise after using something for years, but you could not have learned by reading the documentation. The stuff which should show up as a blinking orange & teal banner on top of the documentation page — the lost taglines:

  • Bash: “Use More Quotes™!”
  • CSV: Your data will contain commas.
  • Git: bisect — you’ll never be the same again.
  • JSON: Add a schema!
  • Markdown: The parser is the limiting factor.
  • Nix: It’s worth the pain.
  • PHP: The comments are the most useful parts of the docs.
  • Perl: The first thing you must do is add every safety pragma.
  • Python: mypy --strict and ruff FTW.
  • Rust: The compiler loves you, and only wants you to be happy.
  • vCard: v3 is the only one supported anywhere.
  • YAML: Have you considered JSON?
https://paperless.blog/the-lost-taglines
Nightly Rust development with Nix
Nix, Rust, development, IDE
This one took a while to get simple enough; enjoy!

Background

While getting started with Bevy I wanted to make sure I had an easily reproducible development environment by using only Nix as the package manager. However, the latest version of Bevy was not compatible with stable Rust, so I looked into Nix development environments for the nightly version. Unfortunately the instructions I could find either recommended using Rustup (that is, involving another package manager and probably sacrificing reproducibility), were “left as an exercise to the reader”-style vague, or looked complex to maintain. I also wanted to start out with stable integration with JetBrains IDEA, my IDE of choice, so that when I upgrade any part of the setup it would have a good chance to Just Keep Working™.

I tried starting from a standard Nix shell, but nightly Rust integration seemed like it would need one of the aforementioned complex setups or some third party involvement. I didn’t want to add another framework, but finally caved and started using one. This happily turned out to be the right choice.

Prerequisites Setup
  1. Go to your project directory
  2. Run devenv init to create devenv and direnv configuration
  3. Run devenv inputs add fenix github:nix-community/fenix --follows nixpkgs to add nightly Rust support
  4. Add the following in devenv.nix:

    languages.rust = {
      channel = "nightly";
      components = [
        "cargo"
        "rust-src"
        "rustc"
      ];
      enable = true;
    };
    

At this point cargo --version and rustc --version should print -nightly version numbers, and you can run cargo init to start the Rust journey.

Optional IDE setup

Start your IDE inside the project directory[2] to have all the tools you just installed available on the path. In Languages & Frameworks → Rust, you can now set the relevant paths for the project:

  • Toolchain location: [path to project]/.devenv/profile/bin
  • Standard library: [path to project]/.devenv/profile/lib/rustlib/src/rust/library

This should be enough to build and run the code, and to follow references to Rust core, within the IDE.

[1] I’ve only tested this with IDEA, but it should work similarly in RustRover.

[2] I use nohup idea-ultimate nosplash . &>/dev/null & disown to be able to use the shell immediately after starting IDEA.

https://paperless.blog/nightly-rust-development-with-nix