LornaJane

Lorna Jane Mitchell's Website

Selectively ignore lines in git diff
tech, git, tips
I have a things-as-code project that outputs mostly text-based formats, but a lot of them. To keep an eye on consistency, I rebuild all the outputs and dump them into a local git repository so I can very easily diff to spot any changes – which was fine until we added a build timestamp, so every file looks changed on every run! This post is about ignoring the matching line with git diff -I.

With a task like this where I want a quick check on a LOT of output (a few million lines I think) which probably hasn’t changed much, the text-based output and source control tools are a great fit. Also since I’m a confident git user, I can very easily re-run the script in the directory (already set up with the paths for my system) and when it finishes, check there’s nothing in git diff.

It worked great – until it didn’t! We added a nice piece of metadata showing when the file was last regenerated, which seemed like a good feature and I approved the change myself. But of course it means all the output files look changed each time. I needed a way to ignore only that change and still see anything except the timestamp updates.

Ignore patterns in git diff

The -I option (it’s a capital “i” and is short for --ignore-matching-lines) accepts a regex and can be used with diff, log, show and probably other git commands. I’m using it with diff, like this:

git diff -I 'x-generated-date'

Git hides any change where every changed line matches this regex, so I only see diff entries that contain something other than the timestamp (okay, in this example it’s a string, but you can do cleverer things if you need them! I could put the datetime pattern in here to be super certain what I’m matching), which is exactly what I need. I know that this line will always be changed so we can ignore it and only inspect any other changes that are present.

To see just the list of files so I can get an idea of the blast radius on what I just changed, I can add --stat to my diff command.
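
As a minimal sketch of the "datetime pattern" idea, here is the stricter regex wrapped in a small Python helper. The shape of the metadata line (an ISO-style date after x-generated-date) is an assumption for illustration, and the helper names are hypothetical; only the -I and --stat options come from the text above.

```python
import re
import shlex

# Assumed (hypothetical) shape of the metadata line:
#   x-generated-date: 2025-01-31T09:15:00Z
# [0-9] is used instead of \d so the same pattern suits POSIX-style regexes.
TIMESTAMP_LINE = r"x-generated-date: [0-9]{4}-[0-9]{2}-[0-9]{2}"


def is_timestamp_line(line: str) -> bool:
    """True when the line contains the timestamp metadata pattern."""
    return re.search(TIMESTAMP_LINE, line) is not None


def diff_command(pattern: str = TIMESTAMP_LINE, stat: bool = False) -> list[str]:
    """Build the git diff invocation; --stat asks for the per-file summary."""
    command = ["git", "diff", "-I", pattern]
    if stat:
        command.append("--stat")
    return command


# Print the command so it can be pasted into a shell in the output repository.
print(shlex.join(diff_command(stat=True)))
```

Checking the pattern against a few sample lines before handing it to git is a cheap way to be sure it matches the timestamp line and nothing else.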

Working with larger filesets and particularly with generated output where it’s useful to quickly sanity check any side effects, tricks like this are so useful – and I thought I would share. If you have additional strategies to offer me in return then I would love to hear about them in the comments, please!

https://lornajane.net/?p=5304
Git renames are not renames
tech, git, tips
I consider myself pretty git-confident, I’ve worked with it a lot, taught it, been a git consultant, run engineering and various things-as-code teams. This week I had a spectacular git problem where merging one branch into another produced changes that didn’t exist on either branch. Turns out, renaming directories in a monorepo with multiple almost-identical boilerplate documentation files comes with surprises…

The short version is: Git actually doesn’t track renames at all. It tracks adds and deletes, and presents them to the end user as renames. By default, if an added file is at least 50% the same as a deleted one, it’s considered a rename. It’s a bit of a blunt instrument and so it goes wrong sometimes. You can adjust whether your client shows you renames, or how similar the files need to be – but you don’t impact how other users see it.

The situation

I’m in a situation where I need to alter the directory structure of a monorepo with lots of similar-structured but actually different files that appear in every folder. I’m adding an additional folder level, to allow siblings of each project. Did I mention that git doesn’t support renames? It doesn’t track folders either.

The similar structures of the projects turned out to be even more interesting when I realised that there are two sets of mostly-boilerplate documentation in every folder. The two folders (they both have markup content that gets published as documents) have some files that are named the same, and every folder has both folders, with either no changes, or with very minor (but crucial!) changes.

I renamed every folder, committed the changes, updated ALL the build scripts to handle the changes, and thought I’d done the difficult bit (famous last words). Then when I started applying these updates to the existing work-in-progress branches, I noticed that git was deleting or updating files in other projects within the monorepo – changes that were not in either of the branches I was combining.

Why this happens

Git simply has no idea which adds go with which deletes – and on a big project, with a lot of almost-the-same files which were all renamed? It just fails in a big messy heap.
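
To make the ambiguity concrete, here is a toy sketch of similarity-based pairing. It is not git's algorithm: difflib.SequenceMatcher stands in for git's similarity scoring, and the file paths and contents are invented for illustration. Only the 50% threshold comes from the description above.

```python
from difflib import SequenceMatcher

# Invented boilerplate shared by every project's documentation folder.
BOILERPLATE = "# Documentation\n\nThis folder holds the published documents.\n"

# Hypothetical "before" and "after" paths for a restructure that adds a level.
deleted = {
    "alpha/docs/README.md": BOILERPLATE + "Project: alpha\n",
    "beta/docs/README.md": BOILERPLATE + "Project: beta\n",
}
added = {
    "group/alpha/docs/README.md": BOILERPLATE + "Project: alpha\n",
    "group/beta/docs/README.md": BOILERPLATE + "Project: beta\n",
}


def similarity(a: str, b: str) -> float:
    """Score two texts between 0 and 1 (a stand-in, not git's own measure)."""
    return SequenceMatcher(None, a, b).ratio()


def plausible_targets(old_text: str, added_files: dict, threshold: float = 0.5) -> list:
    """List every added path similar enough to count as the 'renamed' file."""
    return sorted(
        path for path, text in added_files.items()
        if similarity(old_text, text) >= threshold
    )


for old_path, old_text in deleted.items():
    print(old_path, "->", plausible_targets(old_text, added))
```

Every deleted file has more than one plausible destination here, and that is with only two projects. With dozens of near-identical files, a heuristic that has to pick one pairing has plenty of ways to pick the wrong one.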

Things I tried that did not help at all:
– splitting up the restructure into small chunks so I only moved one directory at a time, per commit
– adjusting git’s rename settings when merging the branches
– swearing a lot

(I did have it on my list to make all the repeated content into templated files, because it’s annoying to maintain as it is – but I had no idea that was on the critical path for the directory restructure!)

This story has no ending

As things stand, I have no solution! I knew this was the way git worked and I knew there was content that could be problematic – but this problem manifested in a way I didn’t expect when the changes which had been applied apparently without any issue were then merged into the ongoing feature branches (it’s a slow moving project, a lot of contributors, all the reasons I normally advise against doing this sort of repository surgery in the first place). I started seeing phantom changes that simply didn’t exist in any of the “before” branches and it took me a moment to understand what happened.

Things I am doing so we can make progress with the changes:
– lots of checks in CI for pull requests to help humans catch any weirdness at review time and before merge
– having a good attitude about fixing minor documentation excitements that result from these changes, because they are absolutely a positive change for the project and it’s the least interesting files that are impacted
– manually applying the same folder restructures to the feature branches (it was a script in the first place so it’s repeatable) before allowing users to sync with the main branch seems to help git to know what’s happening

My only advice is either not to do this sort of change, or to have a no-pull-requests interval. Neither of those were options for me though so I’m sharing in case you find yourself in a similar situation some day! Be aware that “renames” (and in fact “directories”) are not always what they seem ….

https://lornajane.net/?p=5288
Manage Diagrams in AsciiDoc on GitHub
tech, asciidoc, ci, git, github, markup, tips
I use a lot of asciidoc these days for work documentation (and I love it) and I’ve been so happy that GitHub renders it when you view a repository in the web browser, just like it does for Markdown and reStructuredText. BUT GitHub does not render the diagram types that the asciidoc tools do. So even though I’m working with asciidoc and PlantUML, and the asciidoc tools render those diagrams nicely in PDF and HTML output, GitHub’s rendering doesn’t. So here’s a quick overview of how I handle those repositories.

Before I start, I should mention that for our context, the outputs are usually A4-sized PDF documents, so GitHub in the browser is a similar sort of form factor, and none of this stuff needs to be print quality. If you have a use case that’s none of the above, you might want to take a different approach.

Setup for GitHub AND Asciidoctor

Asciidoctor has good rendering of PlantUML (and all sorts of other formats) diagrams, but GitHub does not, so the tradeoff is to have the diagram source in the repository, have build scripts (that run locally and in CI) to build the SVG version of the diagram and also include that in the repository, and to reference the SVG to include the diagram in the document.

When adding a diagram, the author creates the PlantUML (.puml) file and runs the build script which generates the SVG format in an images/ directory. The author can then put the link to the SVG file into the text.

The build script regenerates images if the user has the PlantUML tooling installed so that any updates are included. If the tool isn’t there, the images/ directory is included in the repository anyway so that the content is viewable in GitHub, so the existing images can be used locally too.
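
The build step described above could be sketched roughly like this. It is a hedged illustration rather than the real build script: it assumes the PlantUML tooling is available as a plantuml command on the PATH, that the .puml sources sit together in one directory, and that each output file is named after its source file. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def svg_path_for(puml_file: Path, images_dir: Path) -> Path:
    """Where the SVG for a diagram source is expected (assumes same base name)."""
    return images_dir / (puml_file.stem + ".svg")


def build_diagrams(source_dir: Path, images_dir: Path) -> list:
    """Regenerate SVGs when PlantUML is installed; otherwise keep committed ones."""
    if shutil.which("plantuml") is None:
        # No tooling: the SVGs already in the repository are used as they are.
        return []
    images_dir.mkdir(parents=True, exist_ok=True)
    built = []
    for puml_file in sorted(source_dir.glob("*.puml")):
        # -tsvg selects SVG output, -o sets the output directory (absolute here
        # to avoid any doubt about what a relative path is resolved against).
        subprocess.run(
            ["plantuml", "-tsvg", "-o", str(images_dir.resolve()), str(puml_file)],
            check=True,
        )
        built.append(svg_path_for(puml_file, images_dir))
    return built
```

Returning an empty list when the tool is missing, rather than failing, matches the idea that authors without PlantUML can still build with the committed images.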

Image content that works in multiple publish destinations

This version of the diagrams works perfectly well in the generated PDFs too, and since there’s a build step for PDFs there was an obvious place to hook the image generation step into.

SVG works well enough in both contexts, and while it’s annoying to have the extra build requirement for authors, I think the benefits of having the content readable with images in the repository are worth it. Having to get the published PDF to have all the information available is not great, and this is a document that has a lot more readers than writers!

https://lornajane.net/?p=5286
Tag Kinds in OpenAPI 3.2
APIs, kind, openapi, tags, tips
OpenAPI tags have always been annoying: user-supplied arbitrary data for endpoints should be a fabulous feature – but the documentation tools seem to think that tags are only for them so it becomes more difficult to use tags for other purposes. In fact it is very useful to be able to tag endpoints with lots of different categories of data and so in OpenAPI 3.2, tags were enhanced to include an additional “kind” field so that different kinds of tag could be used.

Being grumpy about documentation tools using tags for themselves is silly (especially as I built some of those tools!), and I’m really pleased there’s a better way now. Tags might be useful for:
– tagging endpoints that are only for an exclusive group of users
– tagging endpoints to show something is “new” in an API, or relates to a particular version
– tagging endpoints to indicate their lifecycle status such as being in beta, or being deprecated
– tagging endpoints to say whether they are included in the published documentation or SDKs
– tagging if this endpoint follows a particular pattern or exception to a pattern
– tagging for some other tool to pick up and use the tag for more information later

… you get the idea, tags can be used for anything now.

Declare the kind of tag it is

Tags are declared in their own tags section, and it’s here that you can say what sort of tags they are. Here’s an example:

tags:
  - name: product
    summary: Products
    description: Product operations
  - name: partner
    summary: Partner
    description: Operations available only to partner organisations, or other privileged accounts.
    kind: audience
  - name: v2
    summary: Version 2
    description: Available in API v2 or later
    kind: version

Tags without a “kind” default to type “nav” – meaning the navigation tags which is the pre-3.2 standard behaviour. This default setting means that you can upgrade an OpenAPI description from 3.1 to 3.2 without needing to make changes, but when you want to introduce new tags you can do so by including the “kind” of tag that they are.

Apply the tag

Applying tags is done the same way as before – again, no changes needed on upgrade and the additional data is applied in the tags entry rather than at the operation level. So it looks something like the example below if I apply a v2 tag:

paths:
  "/v2/products/":
    get:
      operationId: getProductsV2
      summary: Get all products
      tags:
        - v2
        - product
Applying multiple different kinds of tag is expected and doesn’t need any special setup – the tags have their own properties, and you can apply as many as you like.
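
As a small sketch of how a tool might consume this, the snippet below groups the tags on one operation by their declared kind. The tag data mirrors the examples above as plain Python dicts. The helper names are hypothetical, and treating an undeclared tag as "nav" is my assumption rather than something the specification promises.

```python
# Tag declarations mirroring the example above (descriptions left out).
tags = [
    {"name": "product", "summary": "Products"},
    {"name": "partner", "summary": "Partner", "kind": "audience"},
    {"name": "v2", "summary": "Version 2", "kind": "version"},
]


def kinds_by_name(tag_declarations: list) -> dict:
    """Map each tag name to its kind, using "nav" when no kind is given."""
    return {tag["name"]: tag.get("kind", "nav") for tag in tag_declarations}


def group_operation_tags(operation_tags: list, tag_declarations: list) -> dict:
    """Group the tag names applied to one operation by tag kind."""
    kinds = kinds_by_name(tag_declarations)
    grouped = {}
    for name in operation_tags:
        # Assumption: a tag that was never declared is treated as "nav" too.
        grouped.setdefault(kinds.get(name, "nav"), []).append(name)
    return grouped


print(group_operation_tags(["v2", "product"], tags))
# → {'version': ['v2'], 'nav': ['product']}
```

A docs tool could then build its navigation from the "nav" group only, while an SDK generator or a linter looks at the other kinds.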

Tags Registry

The kind field is an open string – you can define your own kinds of tag and that’s intentional so that this feature can be used in different ways in different contexts. That said, there are some common use cases and for those we’ll get much better experiences if we collaborate on the tags we use and how we use them. That way, we can all align and have the tools support our use cases and give us seamless interoperability as we send our OpenAPI descriptions through all sorts of different pipelines.

The OpenAPI tag kinds registry has already been set up with some basic entries: nav (default), badge, and audience. Please check out the list for new entries and open pull requests to add tag kinds you’re using yourself.

Tags are not just for docs any more

Tell your friends, your colleagues, and your AI assistants: tags are not a docs feature. They’re an arbitrary user-supplied data feature, and I can’t wait to see what you build!

For more information about tags, read the tag kinds documentation on the OpenAPI Learn site.

https://lornajane.net/?p=5282
Nested tags in OpenAPI 3.2
APIs, tech, openapi, tags, tips
OpenAPI has always had support for simple tags, but the OpenAPI 3.2 release brought in some serious tag upgrades including a summary field, a “kind” field with registry – and the ability to nest tags which is the focus of today’s post. If one level of organisation isn’t enough for your API (and on bigger APIs I’d argue it shouldn’t be) then the ability to indicate which tag is the parent of this tag will be a good feature to adopt when you upgrade your OpenAPI descriptions.

The basic premise is: each tag has a ‘parent’ field which contains the name of a tag for this tag to belong to. Let’s quickly look at an example:

tags:
  - name: account
    summary: Account Details
    description: Operations relating to the customer's account
  - name: accVerification
    summary: Verification
    description: Operations for verifying a new user or a change to user details
    parent: account

The parent tag does not have awareness of its children, but the child tag indicates the parent that it belongs to. This works well for APIs that are composed of smaller APIs such as in a microservices context.

There are no rules in the specification itself requiring that the matching parent exists – there are also no restrictions on creating structures that clearly won’t work such as loops. The tools that implement the tag structures will probably point out the error of your ways though, if you try to use something like that!
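
Here is a hedged sketch of the kind of check a tool might run over the tags list: it reports parents that are not declared and parent chains that loop. The function name and message wording are hypothetical, and the tags are plain dicts rather than a parsed OpenAPI document.

```python
def check_tag_parents(tags: list) -> list:
    """Report undeclared parents and parent chains that loop."""
    names = {tag["name"] for tag in tags}
    parent_of = {tag["name"]: tag["parent"] for tag in tags if "parent" in tag}
    problems = []

    # A parent should be the name of a tag that is declared somewhere.
    for name, parent in parent_of.items():
        if parent not in names:
            problems.append(f"{name}: parent '{parent}' is not declared")

    # Walk up from each tag; meeting a name twice means the chain is a loop.
    for name in parent_of:
        seen = {name}
        current = parent_of.get(name)
        while current is not None:
            if current in seen:
                problems.append(f"{name}: parent chain loops back to '{current}'")
                break
            seen.add(current)
            current = parent_of.get(current)
    return problems


example = [
    {"name": "account", "summary": "Account Details"},
    {"name": "accVerification", "summary": "Verification", "parent": "account"},
]
print(check_tag_parents(example))  # → []
```

Running something like this in CI catches a mistyped parent name before a docs tool quietly drops the tag from its navigation.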

APIs that have previously used extensions to group tags, such as x-tagGroups (which I see fairly frequently), will find this approach gives an official OpenAPI way to do the same thing. The major benefit is that all tools now use the same syntax to describe the same thing and your API descriptions should be more portable across different contexts.

For more details on nesting tags in OpenAPI, see the OpenAPI Learn site – then come back here and leave me a comment on what your structure looks like!

https://lornajane.net/?p=5276
Notification Contexts Matter
work, career, notifications, tips, writing
Like many of you, my days are dominated by notifications. Emails from project management systems, source control systems, calendar invitations, ticket updates, and messages about messages on other platforms. I’ve noticed that some people use notifications as a power tool, while others seem blind to what happens when they do something. So this post is some tips that I’ve picked up along the way.

The entry level here is to use a correct and meaningful title for the issue/pull request/ticket. I see too many which are just a reference number or two words (and I do edit them if I have write access)! This data is included everywhere, in search results, in notifications – choose your wording well or risk obscurity.

“Report task” vs “Markdown lint and spellcheck results”.

Timing also matters: if I just talked to you about this an hour ago, it’s fine to write a shorter update. But if we’re collaborating on something that’s quite slow-moving and the last update was weeks ago? I’m going to need more context! And most of the time, it’s not much more effort but it does make it much more likely that I’ll take in what you’re asking me to do.

“I agree with the above” vs “I agree that the format option should be added, but not enabled by default”.

If I’ll find out about whatever action you just took by email, then think about how people use email. I don’t have images enabled in email and I’m probably working through my inbox in a triage activity. If you need me to know or do something – make it easier. One of my collaborators completely failed to nerd snipe me into fixing something this week because the examples were supplied as screenshots and I totally missed how fascinating the bug was because I couldn’t see them.

[row of empty image boxes] vs almost anything else. If it’s code or syntax, paste it, not a picture of it! (side benefit, it’ll be in search results)

Your notification should support your intended outcome. If you open a pull request, maybe you want a review for that? It’s on my list if you tagged me but it’s in a crowd there. A pull request that says what is being changed in the title, plus an explanation of why we’re doing it and any particular feedback that’s expected? I’ve got a clear sense of what’s involved, what’s expected and it’ll probably happen because it’s less of an unknown quantity.

“Bug fixes” vs “Add multipath to diagrams, please test on your platform”.

When you send a calendar invite with “follow up as discussed” I have no idea what this is or why I need to reschedule another meeting to attend it. If the meeting is next month I will have even less idea and even if I do attend, there’s no chance that I’ll be well prepared to make good use of the time. But it’s unlikely to survive my calendar contention without more context – and it’s not more words or more work, it’s just more attention while you’re sending the invite.

“Followup sync” vs “Updates and next steps on project supermoon”.

If you’re tagging someone in a conversation that they’re not already participating in, such as on a platform like GitHub or JIRA, they will have no context at all. So “@lornajane, your thoughts?” is just going to sit in my inbox until I run out of things that have the information included with the notification and choose to go digging through some history. You don’t need to replicate the whole conversation but just a note “we’re looking at the options for adding versioning to help with [something I care about], we could use your guidance on how to handle deletions” is enough for me to know how this fits the picture and what you need from me.

Context is everything

The context in which your audience – or audiences – reads what you write is crucial to good communication. I’ve used workplace examples this time, but your open source and community projects follow the same rules. The collaborators who do this well get really, really remarkable results and I think I’m sharing a lot of their secrets here! What are yours? Comment and let the rest of us know, please?

https://lornajane.net/?p=5267
Quick local API docs with Scalar
APIs, api, scalar, tips, tools
Today I’m sharing a quick-and-dirty script to take an OpenAPI description, spin up a docs server locally, and copy the URL into your clipboard. I also use a bit of glob expansion in my script to find the right folder, because I have a lot of APIs with long and formulaic directory names (TM Forum members know this story). I’m spinning up Scalar here as it’s my current favourite just-works local API docs platform.

The script:

#!/usr/bin/env -S uv run --script

# /// script
# requires-python = ">=3.11"
# dependencies = []
# ///

import glob
import random
import subprocess

# Pick a port between 3000 and 3500; nothing checks that it is free
port = random.randint(3000, 3500)

# Add your own fudge/expansion magic here if you have APIs to find
path = '../api-*/oas/*.oas.yaml'
url = f'http://localhost:{port}'

files = glob.glob(path)
if files:
    # copy URL to clipboard for convenience (pbcopy is a macOS command)
    subprocess.run(["pbcopy"], input=url, text=True)
    # serve only the first match, because this call blocks until the server stops
    subprocess.run(["scalar", "document", "serve", "-p", str(port), files[0]])
else:
    print("API not found")

I’m using Python and I run this script with uv because I run this script from lots of different folders and it takes care of dependencies so neatly (confession: my actual script uses argparse and some inputs to feed the “guess the location of the API file from these three digits” magic but that part probably isn’t useful to other people!)

First I generate a random port number and use it to construct a URL – there’s no checking that the port is available but given that I rarely have more than 4 or 5 of these up at one time, it rarely collides and I just run the script again if it does!
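
If the occasional collision ever gets annoying, a port check is only a few lines. This is a sketch of one way to do it, not something the script above does: it tries to bind the candidate port on localhost and picks again if that fails. The function names and the retry count are made up for illustration.

```python
import random
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failure means something else is using it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(low: int = 3000, high: int = 3500, attempts: int = 20) -> int:
    """Pick a random port in the range, retrying a few times if it is taken."""
    for _ in range(attempts):
        port = random.randint(low, high)
        if port_is_free(port):
            return port
    raise RuntimeError("no free port found")
```

There is still a small window between the check and the docs server starting in which another process could grab the port, so this reduces collisions rather than ruling them out.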

The subprocess calls then run the commands I would actually run myself on the commandline if I didn’t use a script to generate a random port number and do some path fudging logic for me:

  • printf 'http://localhost:3333' | pbcopy (run this command first because the next one blocks)
  • scalar document serve -p 3333 openapi.yaml

Then I have the URL and can paste it into my browser. You could just open a browser tab instead of copying but I find it always picks the wrong tab/window/app/whatever so in my clipboard is more useful to me. I also use a clipboard history tool so it isn’t a problem to have lots of things writing to my copy buffer.

Feel free to adapt the script to fit your own paths and preferred tools; probably some of you do this often enough that it’s worth having a wrapper rather than remembering the correct incantation or relying on a hosted platform to download and upload to when the file is already local. And as always, if you do something differently, I’d love to hear about it!

https://lornajane.net/?p=5244
API Specificity with Overlays and Enums
APIs, tech, openapi, overlays, speakeasy, tools
The more I work on API standards, the more I realise how few teams understand that they can adopt the standards and, without breaking any contract, adapt them to make a strong interface for their own application. One of my favourite examples is to add enums where a standard interface cannot dictate the exact values, but in your own implementation it is very helpful to do so.

To start with, let’s use this library book example snippet from an OpenAPI file – this is just the Components section:

components:
  schemas:
    Book:
      type: object 
      properties:
        isbn:
          type: string
          description: ISBN publication identifier
          format: isbn
          example: "9781250236210"
        title:
          type: string
          description: Book title
          example: "A Psalm for the Wild-Built"
        author:
          type: string
          description: Author's name
          example: "Becky Chambers"
        genre:
          type: string
          description: Genre of the book
          example: "Solarpunk"

That genre field at the end of the list is just a string – but in my library, I’m going to only offer a very specific set of genres, so I want to limit the values that can be used there. To do that I’ll use an OpenAPI Overlay.

Here’s the Overlay:

overlay: "1.0.0"
x-speakeasy-jsonpath: rfc9535
info:
    title: "Set the genres list"
    version: "1.0.1"
actions:
    - target: $["components"]["schemas"]["Book"]["properties"]["genre"]
      update:
        enum:
            - Science Fiction
            - Fantasy
            - Speculative
            - Solarpunk
            - Philosophy

There are a few different tools that you can use to apply an Overlay – I’ve been using Speakeasy CLI a lot lately, mostly because I’m working with other OpenAPI tools and then using this tool’s compare command to get an OpenAPI diff in Overlay format that I can reapply elsewhere.

Using Speakeasy CLI, the command to apply the overlay (let’s say the file is overlay.yaml) to the OpenAPI description (imaginatively, mine is called library.openapi.yaml) is as follows:

speakeasy overlay apply --overlay=overlay.yaml --schema=library.openapi.yaml --out updated.openapi.yaml

The overlay updates the Book schema so that now the components section looks like this:

components:
  schemas:
    Book:
      type: object
      properties:
        isbn:
          type: string
          example: "9781250236210"
          description: ISBN publication identifier
          format: isbn
        title:
          type: string
          description: Book title
          example: "A Psalm for the Wild-Built"
        author:
          type: string
          description: Author's name
          example: "Becky Chambers"
        genre:
          type: string
          description: Genre of the book
          example: "Solarpunk"
          enum:
            - Science Fiction
            - Fantasy
            - Speculative
            - Solarpunk
            - Philosophy
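
To show what the update action is doing, here is a simplified Python sketch that applies it to a dict version of the description. It is not how Speakeasy or any other tool implements Overlays: the target handling only understands the quoted-key form used above, not full JSONPath, and lists in the update simply replace whatever is there. The function names are hypothetical.

```python
import re


def select(document: dict, target: str) -> dict:
    """Follow a target such as $["a"]["b"]; only quoted keys are supported."""
    node = document
    for key in re.findall(r'\["([^"]+)"\]', target):
        node = node[key]
    return node


def merge(node: dict, update: dict) -> None:
    """Merge update into node in place, recursing into nested objects."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(node.get(key), dict):
            merge(node[key], value)
        else:
            # Simplification: lists and scalars replace the existing value.
            node[key] = value


def apply_update(document: dict, action: dict) -> dict:
    """Apply one overlay action that has a target and an update."""
    merge(select(document, action["target"]), action["update"])
    return document


library = {"components": {"schemas": {"Book": {"properties": {
    "genre": {"type": "string", "example": "Solarpunk"},
}}}}}
action = {
    "target": '$["components"]["schemas"]["Book"]["properties"]["genre"]',
    "update": {"enum": ["Science Fiction", "Fantasy", "Speculative", "Solarpunk", "Philosophy"]},
}
apply_update(library, action)
```

The existing type and example on genre are left alone and only the enum is added, which is why the same overlay can be reapplied whenever the upstream description changes.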

Make your API implementations your own

Adding repeatable OpenAPI description updates with Overlays is a good way to get from a generic or rather vague API description to something that you can implement efficiently in your own projects. Especially with published standards that might be used in many different settings, it can be particularly valuable to adapt to your own needs.

In this example, we showed adding enums which is a great way to give your SDKs and validators clearer instructions on the values that should be expected. Don’t neglect the other fields, such as updating descriptions or adding meaningful examples that will help those implementing the API (human or machine) to have the context they need to be successful.

https://lornajane.net/?p=5237