Thomas Broyer — GeistHaus

Nov 5, 2024 Updated Nov 10, 2024

HTML event handlers are those onxxx attributes and properties many of us are used to, but do you know how they actually work? If you're writing custom elements and would like them to have such event handlers, what would you have to do? And what would you possibly be unable to implement? What differences would there be from native event handlers?

Before diving in: if you just want something usable, I wrote a library that implements all this (and more) but first jump to the conclusion for the limitations; otherwise, read on.

High-level overview

First of all, an event handler is a property on an object whose name starts with on (followed by the event type) and whose value is a JS function (or null). When that object is an element, the element also has a similarly-named attribute whose value will be parsed as JavaScript, with a variable named event whose value will be the current event being handled, and that can return false to cancel the event (how many times have we seen those infamous oncontextmenu="return false" to disable right click?).

Setting an event handler is equivalent to adding a listener (removing the previous one if any) for the corresponding event type.

Quite simple, right? But the devil is in the details!

(fwiw, there are two special kinds of event handlers, onerror and onbeforeunload, that I won't talk about here.)

In details

Let's go through those details the devil hides in (in no particular order).

Globality

All built-in event handlers on elements are global, and available on every element (actually, every HTMLElement; that excludes SVG and MathML elements). This includes custom elements so you won't need to implement, e.g., an onclick yourself, it's built into every element. This also implies that as new event handlers are added to HTML in the future, they might conflict with your own event handlers for a custom event (this is also true of properties and methods that could later be added to the Node, Element and HTMLElement interfaces though).

Custom elements already have all "native" event handlers built-in.

Conversely, this globality isn't something you'll be able to implement for a custom event: you can create an onfoo event handler on your custom element, but you won't be able to put an onfoo on a <div> element and expect it to do anything useful. (Technically, you possibly could monkey-patch the HTMLElement.prototype and use a MutationObserver to detect the attribute, but you'll still miss attributes on detached elements and, well, monkey-patching… do I need to say more?)

To avoid forward-incompatibility (be future-proof) you might want to name your event handler with a dash or a non-ASCII character in its attribute name, and maybe an uppercase character in its property name. When custom attributes are a thing, then maybe this will also allow having such an attribute globally available on all elements. I'm not sure it's a good idea though: if you ask me, I think I'd just use a simple name and hope HTML won't add a conflicting one in the future.

Return value

We briefly talked about the return value of the event handler function above: if it returns false then the event will be cancelled.

It happens that we're talking about the exact false value here, not just any falsy value.

Fwiw, by cancelled here, we mean just as if the event handler function had called event.preventDefault().
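As a minimal sketch of that rule, here is how a wrapper around a handler could treat the return value. The callHandler helper is hypothetical (it is not part of any API), and a plain EventTarget stands in for an element so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: invokes a handler and applies the return-value rule.
function callHandler(handler, target, event) {
  if (typeof handler !== "function") return;
  const result = handler.call(target, event);
  // Only the exact value `false` cancels; 0, "", null or undefined do not.
  if (result === false) event.preventDefault();
}

const target = new EventTarget();
let handler = () => false;
target.addEventListener("foo", (event) => callHandler(handler, target, event));

// dispatchEvent() returns false when a cancelable event has been cancelled.
const cancelled = !target.dispatchEvent(new Event("foo", { cancelable: true }));

handler = () => 0; // falsy, but not `false`
const notCancelled = target.dispatchEvent(new Event("foo", { cancelable: true }));

console.log(cancelled, notCancelled);
```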

Listener ordering

When you set an event handler, it adds an event listener for the corresponding event, so if you set it in between two element.addEventListener(), it'll be called in between the event listeners.

Now if you set it to another value later on, it won't actually remove the listener for the previous value and add one for the new value; it will actually reuse the existing listener! This was likely some kind of optimization in browser engines in the past (from the time of Internet Explorer or even Netscape I suppose), but as websites relied on it, it's now part of the spec.

const events = [];
element.addEventListener("click", () => events.push("click 1"));
element.onclick = () => "replaced below"; // starts listening
element.addEventListener("click", () => events.push("click 3"));
element.onclick = () => events.push("click 2"); // doesn't reorder the listeners
element.click();
console.log(events);
// → ["click 1", "click 2", "click 3"]

If you remove an event handler (set the property to null –wait, there's more about it, see below– or remove the attribute), the listener will be removed though. So if for any reason you want to make sure an event handler is added to the end of the listeners list, then first remove any previous value then set your own.
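For a custom event, one way to approximate this is to register a single stable listener and only swap the function it calls. The sketch below assumes a hypothetical foo event and an onfoo property, uses a plain EventTarget as a stand-in for a custom element, and assumes callers only assign a function or null.

```javascript
// Stand-in for a custom element: an EventTarget with an `onfoo` handler.
class FooSource extends EventTarget {
  #onfoo = null;
  // One stable listener: its position among the listeners never changes
  // while the handler value is being swapped.
  #listener = (event) => {
    if (typeof this.#onfoo !== "function") return;
    if (this.#onfoo.call(this, event) === false) event.preventDefault();
  };

  get onfoo() {
    return this.#onfoo;
  }
  set onfoo(value) {
    const wasSet = this.#onfoo !== null;
    this.#onfoo = value;
    // Start listening on the first non-null value, stop when set back to null.
    if (value !== null && !wasSet) this.addEventListener("foo", this.#listener);
    if (value === null && wasSet) this.removeEventListener("foo", this.#listener);
  }
}

const source = new FooSource();
const calls = [];
source.addEventListener("foo", () => calls.push("foo 1"));
source.onfoo = () => "replaced below"; // starts listening
source.addEventListener("foo", () => calls.push("foo 3"));
source.onfoo = () => calls.push("foo 2"); // reuses the existing listener
source.dispatchEvent(new Event("foo"));
console.log(calls);
```

Setting onfoo to null and then to a new function removes and re-adds the listener, which moves it to the end of the list.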

Non-function property values

We talked about setting an event handler and removing an event handler already, but even there there are small details to account for.

When you set an event handler's property, any object value (which includes functions) will set the event handler (and possibly add an event listener). When an event is dispatched, only function values will have any useful effect, but any object can be used to activate the corresponding event listener (and possibly later be replaced with a function value without reordering the listeners).

Conversely, any non-object, non-function value will be coerced to null and will remove the event handler.

This means that element.onclick = new Number(42) sets the event handler (to some useless value, but still starts listening to the event), and element.onclick = 42 removes it (and element.onclick then returns null).
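A setter for a custom event handler could apply the same rule with a small coercion step. The coerceEventHandler name is made up for this sketch.

```javascript
// Hypothetical helper: keeps objects and functions, turns anything else into null.
function coerceEventHandler(value) {
  const isObject = typeof value === "object" && value !== null;
  return isObject || typeof value === "function" ? value : null;
}

const boxed = new Number(42);
console.log(coerceEventHandler(boxed) === boxed); // an object: kept as-is
console.log(coerceEventHandler(42)); // a primitive: coerced to null
```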

Invalid attribute values, lazy evaluation

Attribute values are never null, so they always set an event handler (to remove it, remove the attribute). They're also evaluated lazily: invalid values (that can't be parsed as JavaScript) will be stored internally until they're needed (either the property is read, or an event is dispatched that should execute the event handler), at which point they'll be tentatively evaluated.

When the value cannot be parsed as JavaScript, an error is reported (to window.onerror among others) and the event handler's value is replaced with null, but this won't remove the listener! (So yes, you can have an event handler property returning null while still listening to the event, and the listener won't be reordered when the property is later set to another value.)

const events = [];
element.addEventListener("click", () => events.push("click 1"));
element.setAttribute("onclick", "}"); // invalid, but starts listening
console.log(element.onclick); // reports an error and logs null, but doesn't stop listening
element.addEventListener("click", () => events.push("click 3"));
element.onclick = () => events.push("click 2"); // doesn't reorder the listeners
element.click();
console.log(events);
// → ["click 1", "click 2", "click 3"]

The error reports the original location of the value, that is the setAttribute() call in a script, or even the attribute in the HTML, even though the value is actually evaluated much later. This is something that I don't think could be implemented in userland.
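The lazy part, at least, can be approximated in userland. The following sketch uses a hypothetical LazyHandlerSlot class that stores the raw attribute value and compiles it on first read. It deliberately ignores scope and source text (covered in the next sections), and it takes a report callback as a stand-in for the browser's error reporting.

```javascript
class LazyHandlerSlot {
  #source = null; // attribute value not compiled yet, if any
  #value = null; // current handler (function or null)
  #report;
  listening = false;

  constructor(report) {
    this.#report = report;
  }

  setFromAttribute(source) {
    this.#source = source; // stored as-is: nothing is parsed at this point
    this.#value = null;
    this.listening = true; // true even if the source turns out to be invalid
  }

  get value() {
    if (this.#source !== null) {
      const source = this.#source;
      this.#source = null; // compile (and report) at most once
      try {
        this.#value = new Function("event", source);
      } catch (error) {
        this.#report(error);
        this.#value = null; // the value is null, but `listening` stays true
      }
    }
    return this.#value;
  }
}

const reported = [];
const slot = new LazyHandlerSlot((error) => reported.push(error));
slot.setFromAttribute("}"); // invalid, but nothing is reported yet
console.log(reported.length);
console.log(slot.value, slot.listening, reported.length);
```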

Scope

We've said above that an event variable is available in the script set as an attribute value, but that's not the only variable in scope: every property of the current element is directly readable as a variable as well. Also in scope are properties of the associated form element if the element is form-associated, and properties of the document.

This means that <a onclick="alert(href)"> will show the link's target URL, <button onclick="alert(action)"> will show the form's target URL (as a side effect, you can also refer to other form elements by name), and <span onclick="alert(location)"> will show the document's URL.

This is more or less equivalent to evaluating the attribute value inside this:

with (document) {
  with (element.form) {
    with (element) {
      // evaluate attribute value here
    }
  }
}
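A runnable approximation of that nesting can be built with new Function(), whose body is non-strict by default and can therefore use with. The compileInScope function and its double-underscore parameter names are made up for this sketch, plain objects stand in for the element, form, and document, and the body is embedded without any protection against a value that closes the braces early.

```javascript
// Hypothetical helper: compiles `body` with the three objects in scope.
// Note: the __doc, __form and __element names leak into the handler's scope.
function compileInScope(body, element, form, doc) {
  const factory = new Function(
    "__doc",
    "__form",
    "__element",
    `with (__doc) { with (__form) { with (__element) {
      return function (event) {\n${body}\n};
    } } }`
  );
  // `with (null)` would throw, so use an empty scope when there is no form.
  return factory(doc, form ?? {}, element);
}

const link = { href: "https://example.com/", title: "from element" };
const owner = { action: "/submit" };
const doc = { location: "about:blank", title: "from document" };

const scoped = compileInScope(
  "return [href, action, location, title, event.type];",
  link,
  owner,
  doc
);
// The element is the innermost scope, so its `title` wins over the document's.
console.log(scoped.call(link, { type: "click" }));
```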

Related to scope too is the script's base URL that would be used when import()ing modules with a relative URL. Browsers seem to behave differently already on that: Firefox resolves the path relative to the document URL, whereas Chrome and Safari fail to resolve the path to a URL (as if there was no base URL at all). I don't think anything can be done here in a userland implementation.

Function source text

When the event handler has been set through an attribute, the function returned by the event handler property has a very specific source text (which is exposed by its .toString()), which is close to, but not exactly the same as what new Function("event", attrValue) would do (declaring a function with an event argument and the attribute's value as its body).

You couldn't directly use new Function("event", attrValue) anyway due to the scope you need to set up, but there's a trick to control the exact source text of a function so this isn't insurmountable:

const handlerName = "onclick"
const attrValue = "return false;"
const fn = new Function(`return function ${handlerName}(event) {\n${attrValue}\n}`)()
console.log(fn.toString())
// → "function onclick(event) {\nreturn false;\n}"

Content Security Policy

Last, but not least, event handler attribute values are rejected early by a Content Security Policy (CSP): the violation will be reported as soon as the attribute is tentatively set, and this won't have any effect on the state of the event handler (that could have been set through the property).

The CSP directive that controls event handler attributes is script-src-attr (which falls back to script-src if not set, or to default-src). When implementing an event handler for a custom event in a custom element, the attribute value will have to be evaluated by scripting though (through new Function() most likely) so it will be controlled by script-src that will have to include either an appropriate hash source, or 'unsafe-eval' (notice the difference from native event handlers that would use 'unsafe-inline', not 'unsafe-eval'). Hash sources will be a problem though, because you'll have to evaluate not just the attribute's value, but a script that embeds the attribute's value (to set up the scope and source text). And you'd have to actually evaluate both to make sure the attribute value doesn't mess with your evaluated script (think SQL injection but on JavaScript syntax). This would mean that each event handler attribute would have to have two hash sources allowed in the script-src CSP directive, one of them being dependent on the custom element's implementation of the event handler.

An alternative would be to use a native event handler for parsing, but then the function would have that native event handler as its function name, and you'd have to make sure to use an element associated with the same form (if not using the custom element directly because e.g. you don't want to trigger mutation observers) to get the appropriate variables in scope.

Recap: What does it mean for custom event handlers?

As seen above, it's not possible to fully implement event handlers for a custom event in a way that would make it indistinguishable from native event handlers:

they won't be globally available on every element (except maybe in the future with custom attributes)
a Content Security Policy won't be able to use script-src-attr on those custom event handlers, and if it uses hash sources, chances are that 2 hash sources will be needed for each attribute value (one of them being dependent on the custom event handler implementation details)
errors emitted by the scripts used as event handler attribute values won't point to the source of the attribute value
an import() with a relative URL, inside an event handler attribute value, won't behave the same as in a native event handler

The first point alone (or the first two) might make one reevaluate the need for adding such event handlers at all. And if you're thinking about only implementing the property, think about what it brings compared to just having users call addEventListener().

That being said, I did the work (more as an exercise than anything else), so feel free to go ahead and implement event handlers for your custom elements.

http://blog.ltgt.net/html-event-handlers/

Making Web Component properties behave closer to the platform

Jan 21, 2024 Updated Feb 25, 2024

Built-in HTML elements' properties all share similar behaviors, that don't come for free when you write your own custom elements. Let's see what those behaviors are, why you'd want to implement them in your web components, and how to do it, including how some web component libraries actually don't allow you to mimic those behaviors.

Built-in elements' behaviors

I said it already: built-in elements' properties all share similar behaviors, but there are actually several different such shared behaviors. First, there are properties (known as IDL attributes in the HTML specification) that reflect attributes (also known as content attributes); then there are other properties that are unrelated to attributes. One thing you won't find in built-in elements are properties whose value will change if an attribute changes, but that won't update the attribute value when they are changed themselves (in case you immediately thought of value or checked as counter-examples, the situation is actually a bit more complex: those attributes are reflected by the defaultValue and defaultChecked properties respectively, and the value and checked properties are based on an internal state and behave differently depending on whether the user already interacted with the element or not).

Type coercion

But I'll start with another aspect that is shared by all of them, whether reflected or not: typing. DOM interfaces are defined using WebIDL, that has types and extended attributes, and defines mapping of those to JavaScript. Types in JavaScript are rather limited: null, undefined, booleans, IEEE-754 floating-point numbers, big integers, strings, symbols, and objects (including errors, functions, promises, arrays, and typed arrays). WebIDL on the other hand defines, among others, 13 different numeric types (9 integer types and 4 floating point ones) that can be further annotated to change their overflowing behavior, and several string types (including enumerations).

The way those types are experienced by developers is that getting the property will always return a value of the defined type (that's easy, the element owns the value), and setting it (if not read-only) will coerce the assigned value to the defined type. So if you want your custom element to feel like a built-in one, you'll have to define a setter to coerce the value to some specific type. The underlying question is what should happen if someone assigns a value of an unexpected type or outside the expected value space?

Convert and validate the new value in a property custom setter.

You probably don't want to use the exact WebIDL coercion rules though, but similar, approximated, rules that will behave the same most of the time and only diverge on some edge cases. The reason is that WebIDL is really weird: for instance, by default, numeric values overflow by wrapping around, so assigning 130 to a byte (whose value space ranges from -128 to 127) will coerce it to… -126! (128 wraps to -128, 129 to -127, and 130 to -126; and by the way 256 wraps to 0; for the curious, BigInt.asIntN and BigInt.asUintN will do such wrapping in JS, but you'll have to convert numbers to BigInt and back.) Non-integer values assigned to integer types are truncated by default, except when the type is annotated with [Clamp], in which case they're rounded, with half-way values rounded towards even values (something that only happens natively in JS when setting such non-integer values on a Uint8ClampedArray: Math.round(2.5) is 3, but Uint8ClampedArray.of(2.5)[0] is 2).
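To make the wrapping and rounding concrete, here is a small sketch. The toByte helper is hypothetical and only approximates the default WebIDL conversion to byte (truncate, then wrap modulo 256).

```javascript
// Hypothetical helper approximating WebIDL's default conversion to `byte`.
function toByte(value) {
  const n = Number(value);
  if (!Number.isFinite(n)) return 0; // NaN and infinities become 0
  // Truncate towards zero, then wrap into the -128..127 range.
  return Number(BigInt.asIntN(8, BigInt(Math.trunc(n))));
}

console.log(toByte(130), toByte(256), toByte(-129)); // -126 0 127

// Half-way values: Math.round() rounds up, Uint8ClampedArray rounds to even.
console.log(Math.round(0.5), Math.round(2.5)); // 1 3
console.log(Uint8ClampedArray.of(0.5)[0], Uint8ClampedArray.of(2.5)[0]); // 0 2

// Other integer typed arrays truncate rather than round.
console.log(Int8Array.of(1.5)[0]); // 1
```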

Overall, I feel like, as far as primitive/simple types are concerned, boolean, integers, double (not float), string (WebIDL's DOMString), and enumerations are all that's needed; truncating (or rounding, but with JavaScript rules), and clamping or enforcing ranges for integers. In other words, wrapping integers around is just weird, and what matters is coercing to the appropriate type and value space. Regarding enumerations, they're probably best handled by the reflection rules though (see below), and treated only as strings: not a single built-in element has a property of a type that's a WebIDL enum.
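Such approximated rules could look like the following. Both helper names, the default range, and the choice to clamp rather than throw for out-of-range integers are assumptions made for this sketch.

```javascript
// Hypothetical helper: truncates, then clamps into [min, max].
function toInteger(value, { min = -(2 ** 31), max = 2 ** 31 - 1 } = {}) {
  const n = Math.trunc(Number(value)) || 0; // NaN (and -0) become 0
  return Math.min(Math.max(n, min), max);
}

// Hypothetical helper: like WebIDL's `double`, rejects NaN and infinities.
function toDouble(value) {
  const n = Number(value);
  if (!Number.isFinite(n)) {
    throw new TypeError(`${String(value)} is not a finite number`);
  }
  return n;
}

console.log(toInteger("12.9")); // 12
console.log(toInteger(1e12)); // 2147483647
console.log(toInteger(-3.7, { min: 0 })); // 0
console.log(toDouble("1.5")); // 1.5
```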

Reflected properties

Now let's get back to reflected properties: most properties of built-in elements reflect attributes or similarly (but with specific rules) correspond to an attribute and change its value when set; non-reflected properties are those that either expose some internal state (e.g. the current value or validation state of a form field), computed value (from the DOM, such as the selectedIndex of a select, or the cellIndex of a table cell) or direct access to DOM elements (elements of a form, rows of a table, etc.), or that access other reflected properties with a transformed value (such as the valueAsDate and valueAsNumber of input). So if you want your custom element to feel like a built-in one, you'll want to use similar reflection wherever appropriate.

Have your properties reflect attributes by default.

The way reflection is defined is that the source of truth is the attribute value: getting the property will actually parse the attribute value, and setting the property will stringify the value into the attribute. Note that this means possibly setting the attribute to an invalid value that will be corrected by the getter. An example of this is setting the type property of an input element to an unknown value: it will be reflected in the attribute as-is, but the getter will correct it to text. Another example where this is required behavior is with dependent attributes like those of progress or meter elements: without this you'd have to be very careful setting properties in the right order to avoid invalid combinations and having your set value immediately rewritten, but this behavior makes it possible to update properties in any order as the interactions between them are resolved internally and exposed by the getters: you can for example set the value to a value higher than max (on getting, value would be clamped to max) and then update the max (on getting, value could now return the value you previously set, because it wasn't actually rewritten on setting). Actually, these are not technically reflected then as they have specific rules, but at least they're consistent with actual reflected properties; for the purpose of this article, I'll consider them as reflected properties though.

This is at least how it theoretically works; in practice, the parsed value can be cached to avoid parsing every time the property is read; but note that there can be several properties reflecting the same attribute (the most known one probably being className and classList both reflecting the class attribute). Reflected properties can also have additional options, depending on their type, that will change the behavior of the getter and setter, not unlike WebIDL extended attributes.
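The dependent-attributes case can be illustrated with a tiny stand-in for a progress-like element. ProgressLike is hypothetical: it stores attributes in a Map instead of the DOM, and it parses with Number.parseFloat() rather than HTML's own floating-point parsing rules.

```javascript
class ProgressLike {
  #attributes = new Map();
  getAttribute(name) {
    return this.#attributes.has(name) ? this.#attributes.get(name) : null;
  }
  setAttribute(name, value) {
    this.#attributes.set(name, String(value));
  }

  // Parses an attribute as a positive number, or returns the fallback.
  #parse(name, fallback) {
    const raw = this.getAttribute(name);
    const n = raw === null ? NaN : Number.parseFloat(raw);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  }

  get max() {
    return this.#parse("max", 1);
  }
  set max(value) {
    this.setAttribute("max", Number(value));
  }
  get value() {
    // Normalized on getting only: the attribute keeps what was set.
    return Math.min(this.#parse("value", 0), this.max);
  }
  set value(value) {
    this.setAttribute("value", Number(value));
  }
}

const progress = new ProgressLike();
progress.value = 150; // higher than the default max of 1
console.log(progress.getAttribute("value"), progress.value); // "150" 1
progress.max = 200;
console.log(progress.value); // 150
```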

Also note that HTML only defines reflection for a limited set of types (if looking only at primitive/simple types, only non-nullable and nullable strings and enumerations, long, unsigned long, and double are covered, and none of the narrower integer types, big integers, or the unrestricted double that allows NaN and infinity).
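As an illustration of what a reflecting getter has to do, here is a sketch of the parsing side for an unsigned long property. The parseUnsignedLong name and its regular expression are my own approximation of HTML's rules for parsing non-negative integers, where leading whitespace is skipped and trailing garbage is ignored.

```javascript
// Hypothetical helper: parses an attribute value for an `unsigned long`
// reflected property, falling back to a default value.
function parseUnsignedLong(attrValue, defaultValue = 0) {
  if (attrValue === null) return defaultValue; // attribute is absent
  const match = /^[\t\n\f\r ]*([+-]?[0-9]+)/.exec(attrValue);
  if (!match) return defaultValue;
  const n = Number(match[1]);
  if (n === 0) return 0; // also normalizes "-0"
  // Reflected unsigned long values are limited to 0..2147483647 on getting.
  return n > 0 && n <= 2147483647 ? n : defaultValue;
}

console.log(parseUnsignedLong("  42px")); // 42
console.log(parseUnsignedLong("-1", 7)); // 7
console.log(parseUnsignedLong(null, 7)); // 7
```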

You can see how Mozilla tests the compliance of their built-in elements in the Gecko repository (the ok and is assertions are defined in their SimpleTest testing framework). And here's the Web Platform Tests' reflection harness, with data for each built-in element in sibling files, that almost every browser passes.

Events

Most direct changes to properties and attributes don't fire events: user actions or method calls will both update a property and fire an event, but changing a property programmatically generally won't fire any event. There are a few exceptions though: the events of type ToggleEvent fired by changes to the popover attribute or the open attribute of details elements, or the select event when changing the selectionStart, selectionEnd or selectionDirection properties of input and textarea elements (if you know of others, let me know); but notably changing the value of a form element programmatically won't fire a change or input event. So if you want your custom element to feel like a built-in one, don't fire events from your property setters or other attribute changed callbacks, but fire an event when (just after) you programmatically change them.
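In code, that separation could look like the sketch below. FieldLike is a plain EventTarget standing in for a form-like custom element, and commitUserInput is a hypothetical name for whatever code path handles an actual user action.

```javascript
class FieldLike extends EventTarget {
  #value = "";

  get value() {
    return this.#value;
  }
  set value(newValue) {
    // Programmatic change: update the state, fire nothing.
    this.#value = String(newValue);
  }

  // Hypothetical entry point called in response to a user action.
  commitUserInput(text) {
    this.#value = String(text);
    this.dispatchEvent(new Event("input"));
  }
}

const field = new FieldLike();
let inputEvents = 0;
field.addEventListener("input", () => inputEvents++);

field.value = "set by script";
console.log(inputEvents); // 0
field.commitUserInput("typed by user");
console.log(inputEvents, field.value); // 1 "typed by user"
```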

Don't fire events from your property setters or other attribute changed callbacks.

Why you'd want to implement those

If you (your team, your company) are the only users of the web components (e.g. building an application out of web components, or an internal library of reusable components), then OK, don't use reflection if you don't need it, you'll be the only user anyway so nobody will complain. If you're publicly sharing those components, then my opinion is that, following the principle of least astonishment, you should aim at behaving more like built-in elements, and reflect attributes.

Similarly, for type coercions, if you're the only users of the web components, it's ok to only rely on TypeScript (or Flow or whichever type-checker) to make sure you always pass values of the appropriate type to your properties (and methods), but if you share them publicly then you should in my opinion coerce or validate inputs, in which case you'd want to follow the principle of least astonishment as well, and thus use rules similar to WebIDL and reflection behaviors. This is particularly true for a library that can be used without specific tooling, which is generally the case for custom elements.

For example, all the following design systems can be used without tooling (some of them provide ready-to-use bundles, others can be used through import maps): Google's Material Web, Microsoft's Fluent UI, IBM's Carbon, Adobe's Spectrum, Nordhealth's Nord, Shoelace, etc.

How to implement them

Now that we've seen what we'd want to implement, and why we'd want to implement it, let's see how to do it. First without, and then with libraries.

I started collecting implementations that strictly follow (as an exercise, not as a goal) the above rules in a GitHub repository (strictly because it directly reuses the above-mentioned Gecko and Web Platform Tests harnesses).

Vanilla implementation

In a vanilla custom element, things are rather straightforward:

class MyElement extends HTMLElement {
  get reflected() {
    const strVal = this.getAttribute("reflected");
    return parseValue(strVal);
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.setAttribute("reflected", stringifyValue(newValue));
  }
}

or with intermediate caching (note that the setter is identical, setting the attribute will trigger the attributeChangedCallback which will close the loop):

class MyElement extends HTMLElement {
  #reflected;

  get reflected() {
    return this.#reflected;
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.setAttribute("reflected", stringifyValue(newValue));
  }

  static get observedAttributes() {
    return [ "reflected" ];
  }
  attributeChangedCallback(name, oldValue, newValue) {
    // Note: in this case, we know it can only be the attribute named "reflected"
    this.#reflected = parseValue(newValue);
  }
}

And for a non-reflected property (here, a read-write property representing an internal state):

class MyElement extends HTMLElement {
  #nonReflected;
  get nonReflected() {
    return this.#nonReflected;
  }
  set nonReflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.#nonReflected = newValue;
  }
}

Because many rules are common to many attributes (the coerceType operation is defined by WebIDL, or using similar rules, and the HTML specification defines a handful of microsyntaxes for the parseValue and stringifyValue operations), those could be packaged up in a helper library. And with decorators coming to ECMAScript (and already available in TypeScript), those could be greatly simplified:

class MyElement extends HTMLElement {
  @reflectInt accessor reflected;
  @int accessor nonReflected;
}
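Without decorators, the same packaging can be sketched with a plain function that defines the accessor pair. Everything here is hypothetical (defineReflected, ElementLike, and the count property), and a Map-backed class stands in for HTMLElement so the snippet runs outside a browser.

```javascript
// Hypothetical helper: defines a reflected property from three operations.
function defineReflected(klass, name, { parse, stringify, coerce }) {
  Object.defineProperty(klass.prototype, name, {
    configurable: true,
    enumerable: true,
    get() {
      return parse(this.getAttribute(name));
    },
    set(value) {
      this.setAttribute(name, stringify(coerce(value)));
    },
  });
}

// Stand-in for HTMLElement, with just enough attribute support.
class ElementLike {
  #attributes = new Map();
  getAttribute(name) {
    return this.#attributes.get(name) ?? null;
  }
  setAttribute(name, value) {
    this.#attributes.set(name, String(value));
  }
}

class MyCounter extends ElementLike {}
defineReflected(MyCounter, "count", {
  coerce: (value) => Math.trunc(Number(value)) || 0,
  stringify: String,
  parse: (attr) => {
    const n = Number.parseInt(attr ?? "", 10);
    return Number.isNaN(n) ? 0 : n;
  },
});

const counter = new MyCounter();
counter.count = "12.7";
console.log(counter.getAttribute("count"), counter.count); // "12" 12
```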

I actually built such a library, mostly as an exercise (and I already learned a lot, most of the above details actually). It's currently not published on NPM but you can find it on GitHub.

With a library

Surprisingly, web component libraries don't really help us here.

First, like many libraries nowadays, most expect people to just pass values of the appropriate types (relying on type checking through TypeScript) and basically leave you handling everything including how to behave in the presence of unexpected values. While it's OK, as we've seen above, in a range of situations, there are limits to this approach and it's unfortunate that they don't provide tools to at least make type coercion easier.

Regarding reflected properties, most libraries tend to discourage you from doing it, while (fortunately!) supporting it, if only minimally.

All libraries (that I've looked at) support observed attributes though (changing the attribute value updates the property, but not the other way around), and most default to this behavior.

Now let's dive into the how-to with Lit, FAST, and then Stencil (other libraries left as a so-called exercise for the reader).

With Lit

By default, Lit reactive properties (annotated with @property()) observe the attribute of the same (or configured) name, using a converter to parse the value if needed (by default only handling numbers through a plain JavaScript number coercion, booleans, strings, or possibly objects or arrays through JSON.parse(); but a custom converter can be given). If your property is not associated to any attribute (but needs to be reactive to trigger a render when changed), then you can annotate it with @property({ attribute: false }) or @state() (the latter is meant for internal state though, i.e. private properties).

To make a reactive property reflect an attribute, you'll add reflect: true to the @property() options, and Lit will use the converter to stringify the value too. This won't be done immediately though, but only as part of Lit's reactive update cycle. This timing is a slight deviation compared to built-in elements that's probably acceptable, but it makes it harder to implement some reflection rules (those that set the attribute to a different value than the one returned by the getter) as the converter will always be called with the property value (returned by the getter, so after normalization). For a component similar to progress or meter with dependent properties, Lit recommends correcting the values in a willUpdate callback (this is where you'd check whether the value is valid with respect to the max for instance, and possibly overwrite its value to bring it in-range); this means that attributes will have the corrected value, and this requires users to update all properties in the same event loop (which will most likely be the case anyway).

It should be noted that, surprisingly, Lit actively discourages reflecting attributes:

Attributes should generally be considered input to the element from its owner, rather than under control of the element itself, so reflecting properties to attributes should be done sparingly. It's necessary today for cases like styling and accessibility, but this is likely to change as the platform adds features like the :state pseudo selector and the Accessibility Object Model, which fill these gaps.

No need to say I disagree.

For type coercion and validation, Lit allows you to have your own accessors (and version 3 makes it even easier), so everything's ok here, particularly for non-reflected properties:

class MyElement extends LitElement {
  #nonReflected;
  get nonReflected() {
    return this.#nonReflected;
  }
  @state()
  set nonReflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.#nonReflected = newValue;
  }
}

For those cases where you'd want the attribute to possibly have an invalid value (to be corrected by the property getter), it would mean using a non-reactive property wrapping a private reactive property (this assumes Lit won't flag them as errors in future versions), and parsing the value in its getter:

class MyElement extends LitElement {
  @property({ attribute: "reflected", reflect: true })
  accessor #reflected = "";

  get reflected() {
    return parseValue(this.#reflected);
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.#reflected = stringifyValue(newValue);
  }
}

or with intermediate caching (note that the setter is identical):

class MyElement extends LitElement {
  @property({ attribute: "reflected", reflect: true })
  accessor #reflected = "";

  #parsedReflected = "";
  get reflected() {
    return this.#parsedReflected;
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.#reflected = stringifyValue(newValue);
  }

  willUpdate(changedProperties) {
    if (changedProperties.has("#reflected")) {
      this.#parsedReflected = parseValue(this.#reflected);
    }
  }
}

It might actually be easier to directly set the attribute from the setter (and as a bonus behaving closer to built-in elements) and only rely on an observed property from Lit's point of view (setting the attribute will trigger attributeChangedCallback and thus Lit's observation code that will use the converter and then set the property):

class MyElement extends LitElement {
  @property({
    attribute: "reflected",
    converter: (value) => parseValue(value),
  })
  accessor #reflected = "";

  get reflected() {
    return this.#reflected;
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.setAttribute("reflected", stringifyValue(newValue));
  }
}

Note that this is actually very similar to the approach in the vanilla implementation above but using Lit's own lifecycle hooks. It should also be noted that for a USVString that contains a URL (where the attribute value is resolved to a URL relative to the document base URI) the value needs to be processed in the getter (as it depends on an external state –the document base URI– that could change independently from the element).

A previous version of this article contained a different implementation that happened to be broken.

class MyElement extends LitElement {
  #reflected = "";
  get reflected() {
    return this.#reflected;
  }
  @property()
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    const stringValue = stringifyValue(newValue);
    // XXX: there might be a more optimized way
    // than stringifying and then parsing
    this.#reflected = parseValue(stringValue);
    // Avoid unnecessarily triggering attributeChangedCallback
    // that would reenter that setter.
    if (this.getAttribute("reflected") !== stringValue) {
      this.setAttribute("reflected", stringValue);
    }
  }
}

This implementation would for instance have the setter called with null when the attribute is removed, which actually needs to behave differently than user code calling the setter with null: in the former case the property should revert to its default value, in the latter case that null would be coerced to the string "null" or the numeric value 0 and the attribute would be added back with that value.

If we're OK only reflecting valid values to attributes, then we can fully use converters but things aren't necessarily simpler (we still need the custom setter for type coercion and validation, and marking the internal property as reactive to avoid triggering the custom setter when the attribute changes; we don't directly deal with the attribute but we now have to normalize the value in the setter in the same way as stringifying it to the attribute and parsing it back, to have the getter return the appropriate value):

const customConverter = {
  fromAttribute(value) {
    return parseValue(value);
  },
  toAttribute(value) {
    return stringifyValue(value);
  },
};

class MyElement extends LitElement {
  @property({ reflect: true, converter: customConverter })
  accessor #reflected = "";
  get reflected() {
    return this.#reflected;
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    // XXX: this should use a more optimized conversion/validation
    this.#reflected = parseValue(stringifyValue(newValue));
  }
}

With FAST

I know FAST is not used that much but I wanted to cover it as it seems to be the only library that reflects attributes by default. By default it won't do any type coercion unless you use the mode: "boolean", which works almost like an HTML boolean attribute, except an attribute present but with the value "false" will coerce to a property value of false!

Otherwise, it works more or less like Lit, with one big difference: the converter's fromView is also called when setting the property (this means that fromView receives any external value, not just string values from the attribute). But unfortunately this doesn't really help us as most coercion rules need to throw at one point and we want to do it only in the property setters, never when parsing attribute values; and those rules that don't throw will have possibly different values between the attribute and the property getter (push invalid value to the attribute, sanitize it on the property getter), or just behave differently between the property (e.g. turning a null into 0 or "null") and the attribute (where null means the attribute is not set, and the property should then have its default value which could be different from 0, and will likely be different from "null").

This means that in the end the solutions are almost identical to the Lit ones (here using TypeScript's legacy decorators though; and applying the annotation on the private property to avoid triggering the custom setter on attribute change):

class MyElement extends FASTElement {
  @attr({ attribute: "reflected" })
  private _reflected = "";

  get reflected() {
    return parseValue(this._reflected);
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this._reflected = stringifyValue(newValue);
  }
}

or with intermediate caching (note that the setter is identical):

class MyElement extends FASTElement {
  @attr({ attribute: "reflected" })
  private _reflected = "";

  private _reflectedChanged(oldValue, newValue) {
    this._parsedReflected = parseValue(newValue);
  }

  private _parsedReflected;
  get reflected() {
    return this._parsedReflected;
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this._reflected = stringifyValue(newValue);
  }
}

Or if you want immediate reflection to the attribute (the internal property can now be used to store the parsed value):

class MyElement extends FASTElement {
  @attr({
    attribute: "reflected",
    mode: "fromView",
    converter: {
      fromView(value) {
        return parseValue(value);
      },
      toView(value) {
        // mandatory in the converter type
        throw new Error("should never be called");
      }
    }
  })
  private _reflected;

  get reflected() {
    return this._reflected ?? "";
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this.setAttribute("reflected", stringifyValue(newValue));
  }
}

Note that the internal property is not initialized, to avoid calling the converter's fromView, and the default is handled in the getter instead (our fromView expects a string or null coming from the attribute, so we'd have to initialize the property with such a string value, which would hurt readability of the code as that could be a value different from the one actually stored in the property and returned by the public property getter).

If we're OK only reflecting valid values to attributes, then we can fully use converters but things aren't necessarily simpler (we still need the custom setter for type coercion and validation, and marking the internal property as reactive to avoid triggering the custom setter when the attribute changes; we don't directly deal with the attribute but we still need to call stringifyValue as we know the converter's fromView will receive the new value):

const customConverter = {
  fromView(value) {
    return parseValue(value);
  },
  toView(value) {
    return stringifyValue(value);
  },
};

class MyElement extends FASTElement {
  @attr({ attribute: "reflected", converter: customConverter })
  private _reflected;

  get reflected() {
    return this._reflected ?? "";
  }
  set reflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this._reflected = stringifyValue(newValue);
  }
}

For non-reflected properties, you'd want to use @observable instead of @attr, except that it doesn't work on custom accessors, so you'd have to do it manually:

class MyElement extends FASTElement {
  private _nonReflected = "";
  get nonReflected() {
    Observable.track(this, 'nonReflected');
    return this._nonReflected;
  }
  set nonReflected(value) {
    const newValue = coerceType(value);
    // …there might be additional validations here…
    this._nonReflected = newValue;
    Observable.notify(this, 'nonReflected');
  }
}

With Stencil

First a disclosure: I never actually used Stencil, only played with it a bit locally in a hello-world project while writing this post.

Stencil is kind of special. It supports observable attributes through the @Prop() decorator, and reflected ones through @Prop({ reflect: true }). It will however reflect default values to attributes when the component initializes, doesn't support custom converters, and like FAST will convert an attribute value of "false" to a boolean false. You also have to add mutable: true to the @Prop() if the component modifies its value (Stencil assumes properties and attributes are inputs to the component, not state of the component).

A @Prop() must be public too, and cannot have custom accessors. You can use a @Watch() method to do some validation, but throwing from there won't prevent the property value from being updated; you can revert the property to the old value from the watch method, but other watch methods for the same property will then be called twice, and not necessarily in the correct order (depending on declaration order).

You cannot expose properties on the element's API if they are not annotated with @Prop(), making them at a minimum observe an attribute.

In other words, a Stencil component cannot, by design, feel like a built-in element (another thing specific to Stencil: besides @Prop() properties, you can expose methods through @Method() but they must be async).

http://blog.ltgt.net/web-component-properties/

Improving a web component, one step at a time

Dec 16, 2023 Updated Dec 16, 2023

Earlier this month, Stefan Judis published a small web component that makes your text sparkle.

In the spirit of so-called HTML web components, which apparently often come with some sort of aversion to the shadow DOM, the element directly manipulates the light DOM. As a developer of web apps with heavy DOM manipulations, and lover of the platform, this feels weird to me as it could possibly break so many things: other code that manipulates the DOM and now sees new elements and could also change them, handling of disconnection and reconnection of the element (as most such elements modify their children in the connectedCallback without checking whether it has already been done), MutationObserver, etc.

The first thing that came to my mind was that shadow DOM, for all its drawbacks and bugs, was the perfect fit for such an element, and I wanted to update Stefan's element to use the shadow DOM instead. Then a couple days ago, Zach Leatherman published a similar element that makes it snow on its content, and I was pleased to see he used shadow DOM to encapsulate (hide) the snowflakes. That was the trigger for me to actually take the time to revisit Stefan's <sparkle-text> element, so here's a step by step of various improvements (in my opinion) I made.

Disclaimer before I begin: this is not in any way a criticism of Stefan's work! On the contrary actually, it wouldn't have been possible without this prior work. I just want to show things that I think could be improved, and this is all very much subjective.

I'll link to commits in my fork without any (intermediate) demo, as all those changes don't have much impact on the element's behavior, as seen by a reader of the web page (if you're interested in what it changes when looked at through the DevTools, then clone the repository, run npm install, npm run start, then check out each commit in turn), except in some specific situations. The final state is available here if you want to play with it in your DevTools.

Using shadow DOM

The first step was moving the sparkles to shadow DOM, to avoid touching the light DOM. This involves of course attaching shadow DOM, with a <slot> to let the light DOM show, and then changing where the sparkles are added, but also changing how CSS is handled!

Abridged diff of the changes (notably excluding CSS)

@@ -66,16 +62,21 @@ class SparklyText extends HTMLElement {
 `;
     let sheet = new CSSStyleSheet();
     sheet.replaceSync(css);
-    document.adoptedStyleSheets = [...document.adoptedStyleSheets, sheet];
-    _needsStyles = false;
+    this.shadowRoot.adoptedStyleSheets = [sheet];
   }
 
   connectedCallback() {
+    if (this.shadowRoot) {
+      return;
+    }
+
     this.#numberOfSparkles = parseInt(
       this.getAttribute("number-of-sparkles") || `${this.#numberOfSparkles}`,
       10
     );
 
+    this.attachShadow({ mode: "open" });
+    this.shadowRoot.append(document.createElement("slot"));
     this.generateCss();
     this.addSparkles();
   }
@@ -99,7 +100,7 @@ class SparklyText extends HTMLElement {
       Math.random() * 110 - 5
     }% - var(--_sparkle-base-size) / 2)`;
 
-    this.appendChild(sparkleWrapper);
+    this.shadowRoot.appendChild(sparkleWrapper);
     sparkleWrapper.addEventListener("animationend", () => {
       sparkleWrapper.remove();
     });

In Stefan's version, CSS is injected to the document, with a boolean to make sure it's done only once, and styles are scoped to .sparkle-wrapper descendants of the sparkle-text elements. With shadow DOM, we gain style encapsulation, so no need for that scoping, we can directly target .sparkle-wrapper and svg as they're in the shadow DOM, clearly separate from the HTML that had been authored. We need to do it for each element though (we'll improve that later), but we now need to make sure we initialize the shadow DOM only once instead (I'm going step by step, so leaving this in the connectedCallback).

As a side effect, this also fixes some edge-case bug where the CSS would apply styles to any descendant SVG of the element, whether a sparkle or not (this could have been fixed by only targeting SVG inside .sparkle-wrapper actually); and of course with shadow DOM encapsulation, page author styles won't affect the sparkles either.

Small performance improvements

Those are really small, and probably negligible, but I feel like they're good practice anyway so I didn't even bother measuring actually.

First, as said above, the CSS needs to be somehow injected into each element's shadow DOM, but the constructible stylesheet can actually be shared between all of them. I've thus split construction of the stylesheet from its adoption in the shadow DOM, and made sure construction only happens once. Again, to limit the changes, everything's still in the same method, just moved inside an if (I think I would have personally constructed the stylesheet early, as soon as the script is loaded, rather than waiting for the element to actually be used; it probably doesn't make a huge difference).

   generateCss() {
-    const css = `…`;
-    let sheet = new CSSStyleSheet();
-    sheet.replaceSync(css);
+    if (!sheet) {
+      const css = `…`;
+      sheet = new CSSStyleSheet();
+      sheet.replaceSync(css);
+    }
     this.shadowRoot.adoptedStyleSheets = [sheet];
   }

Similarly, sparkles were created by setting the SVG markup as the innerHTML of each new wrapper. I changed that to using cloneNode(true) on an element prepared only once.

   addSparkle() {
-    const sparkleWrapper = document.createElement("span");
-    sparkleWrapper.classList.add("sparkle-wrapper");
-    sparkleWrapper.innerHTML = this.#sparkleSvg;
+    if (!sparkleTemplate) {
+      sparkleTemplate = document.createElement("span");
+      sparkleTemplate.classList.add("sparkle-wrapper");
+      sparkleTemplate.innerHTML = this.#sparkleSvg;
+    }
+
+    const sparkleWrapper = sparkleTemplate.cloneNode(true);

We actually don't even need the wrapper element, we could directly use the SVG without wrapper.

Handling disconnection

The element uses chained timers (a setTimeout callback that itself ends up calling setTimeout with the same callback, again and again) to re-add sparkles at random intervals (removing the sparkles is done as soon as the animation ends; and all of this is done only if the user didn't configure their browser to prefer reduced motion).

If the element is removed from the DOM, this unnecessarily continues in the background and could create memory leaks (in addition to just doing unnecessary work). I started with a very small change: check whether the element is still connected to the DOM before adding the sparkle (and calling setTimeout again). It could have been better (for some definition of better) to track the timer IDs so we could call clearTimeout in disconnectedCallback, but I feel like that would be unnecessarily complex.

       const {matches:motionOK} = window.matchMedia('(prefers-reduced-motion: no-preference)');
-      if (motionOK) this.addSparkle();
+      if (motionOK && this.isConnected) this.addSparkle();
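
For comparison, the timer-tracking alternative could look something like the sketch below. The SparkleTimers class and its method names are hypothetical (nothing like it exists in the element), and the timer functions are injectable only to make the example easy to exercise outside a browser; the 2 to 3 seconds delay matches what the element uses.

```javascript
// Hypothetical helper tracking chained timers so they can all be
// cancelled at once, e.g. from disconnectedCallback().
class SparkleTimers {
  #ids = new Set();
  #setTimeout;
  #clearTimeout;

  constructor(timers = globalThis) {
    this.#setTimeout = timers.setTimeout.bind(timers);
    this.#clearTimeout = timers.clearTimeout.bind(timers);
  }

  // Run `callback` after a random delay, again and again.
  start(callback, minDelay = 2000, maxDelay = 3000) {
    const delay = minDelay + Math.random() * (maxDelay - minDelay);
    const id = this.#setTimeout(() => {
      this.#ids.delete(id);
      // Schedule the next run first, so that a stop() called from
      // within the callback also cancels it.
      this.start(callback, minDelay, maxDelay);
      callback();
    }, delay);
    this.#ids.add(id);
  }

  // Cancel every pending timer.
  stop() {
    for (const id of this.#ids) this.#clearTimeout(id);
    this.#ids.clear();
  }

  // Number of timers currently waiting to fire.
  get pending() {
    return this.#ids.size;
  }
}
```

The element would then call start() once per sparkle in connectedCallback and stop() in disconnectedCallback, at the cost of an extra class compared to the one-line isConnected check.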

This handles disconnection (as could be done by any destructive change to the DOM, like navigating with Turbo or htmx, I'm not even talking about using the element in a JavaScript-heavy web app) but not reconnection though, and we've exited early from the connectedCallback to avoid initializing the element twice, so this change actually broke our component in situations where it's moved around, or stashed and then reinserted. To fix that, we need to always call addSparkles in connectedCallback, so move all the rest into an if; it's actually as simple as that… except that when the user prefers reduced motion, sparkles are never removed, so they keep piling up each time the element is connected again. One way to handle that, without introducing our own housekeeping of individual timers, is to just remove all sparkles on disconnection. Either that, or only add them in connectedCallback if we're initializing the element (including attaching the shadow DOM) or the user doesn't prefer reduced motion. The difference between both approaches is in whether we want the small animation when the sparkles appear (and have them appear at new random locations). I went with the latter.

This still doesn't handle the situation where prefers-reduced-motion changes while the element is displayed though: if it turns to no-preference, then sparkles will start animating (due to CSS) then disappear at the end of their animation (due to JS listening to the animationend event), and no other sparkle will be added (because the setTimeout chain would have been broken earlier). I don't feel like it's worthy enough of a fix for such an element but it's also rather easy to handle so let's do it: listen to the media query change and start the timers whenever the user no longer prefers reduced motion.

@@ -94,6 +94,19 @@ connectedCallback() {
       );
       this.addSparkles();
     }
+
+    motionOK.addEventListener("change", this.motionOkChange);
+  }
+
+  disconnectedCallback() {
+    motionOK.removeEventListener("change", this.motionOkChange);
+  }
+
+  // Declare as an arrow function to get the appropriate 'this'
+  motionOkChange = () => {
+    if (motionOK.matches) {
+      this.addSparkles();
+    }
   }

Browser compatibility

Constructible stylesheets aren't supported in Safari 16.3 and earlier (and possibly other browsers). To avoid the code failing and strange things (probably, I haven't tested) happening, I started by bailing out early if the browser doesn't support constructible stylesheets (the element would then just do nothing; I could have actually even avoided registering it at all). Fwiw, I borrowed the check from Zach's <snow-fall> which works this way already (thanks Zach). As an aside, it's a bit strange that the code assumed constructible stylesheets were available, but tested for the availability of the custom element registry 🤷

   connectedCallback() {
-    if (this.shadowRoot) {
+    // https://caniuse.com/mdn-api_cssstylesheet_replacesync
+    if (this.shadowRoot || !("replaceSync" in CSSStyleSheet.prototype)) {
       return;
     }

But Safari 16.3 and earlier still represent more than a third of users on macOS, and more than a quarter of users on iOS (according to CanIUse)! To widen browser support, I therefore added a workaround, which consists of injecting a <style> element in the shadow DOM. Contrary to the constructible stylesheet, styles cannot be shared by all elements though, as we've seen above, so we only conditionally fall back to that approach, and continue using a constructible stylesheet everywhere it's supported.

-      sheet = new CSSStyleSheet();
-      sheet.replaceSync(css);
+      if (supportsConstructibleStylesheets) {
+        sheet = new CSSStyleSheet();
+        sheet.replaceSync(css);
+      } else {
+        sheet = document.createElement("style");
+        sheet.textContent = css;
+      }
     }

-    this.shadowRoot.adoptedStyleSheets = [sheet];
+    if (supportsConstructibleStylesheets) {
+      this.shadowRoot.adoptedStyleSheets = [sheet];
+    } else {
+      this.shadowRoot.append(sheet.cloneNode(true));
+    }

Other possible improvements

I stopped there but there's still room for improvement.

For instance, the number-of-sparkles attribute is read once when the element is connected, so changing the attribute afterwards won't have any effect (but will have if you disconnect and then reconnect the element). To handle that situation (if only because you don't control the order of initialization when that element is used within a JavaScript-heavy application with frameworks like React, Vue or Angular), one would have to listen to the attribute change and update the number of sparkles dynamically. This could be done either by removing all sparkles and recreating the correct number of them (with addSparkles()), but this would be a bit abrupt, or by reworking entirely how sparkles are managed so they could adapt dynamically (don't recreate a sparkle, let it expire, when changing the number of sparkles down, or create just as many sparkles as necessary when changing it up). I feel like this would bump complexity by an order of magnitude, so it's probably not worth it for such an element.

The number of sparkles could also be controlled by a property reflecting the attribute; that would make the element more similar to built-in elements. Once the above is in place, this hopefully shouldn't be too hard.

That number of sparkles is expected to be, well, a number, and is currently parsed with parseInt, but the code doesn't handle parsing errors and could set the number of sparkles to NaN. Maybe we'd prefer using the default value in this case, and similarly for a zero or negative value; basically defining the attribute as a number limited to only positive numbers with fallback.
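
As a rough sketch of that last idea, keeping parseInt as the element does today, a small helper could centralize the fallback. The function name and the default of 10 used in the calls are assumptions made for the example.

```javascript
// Hypothetical helper: parse an attribute value as a strictly positive
// integer, returning `fallback` for missing, invalid, zero or negative input.
function parsePositiveInt(value, fallback) {
  if (value === null) return fallback;
  const parsed = Number.parseInt(value, 10);
  return Number.isNaN(parsed) || parsed <= 0 ? fallback : parsed;
}

console.log(parsePositiveInt("20", 10)); // → 20
console.log(parsePositiveInt("abc", 10)); // → 10
console.log(parsePositiveInt("0", 10)); // → 10
console.log(parsePositiveInt("-3", 10)); // → 10
console.log(parsePositiveInt(null, 10)); // → 10
```

In connectedCallback, this would replace the parseInt call, with this.getAttribute("number-of-sparkles") as the value and the current default as the fallback.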

All this added complexity is, to me, what separates so-called HTML web components from others: they're designed to be used from HTML markup and not (or rarely) manipulated afterwards, so shortcuts can be taken to keep them simple.

Still speaking of that number of sparkles, the timers that create new sparkles are entirely disconnected from the animation that also makes them disappear. The animation length is actually configurable through the --sparkly-text-animation-length CSS custom property, but the timers delay is not configurable (a random value between 2 and 3 seconds). This means that if we set the animation length to a higher value than 3 seconds, there will actually be more sparkles than the configured number, as new sparkles will be added before the previous one has disappeared. There are several ways to fix this (if we think it's a bug –this is debatable!– and is worth fixing): for instance we could use the Web Animations API to read the computed timing of the animation and compute the timer's delay based on this value. Or we could let the animation repeat and move the element on animationiteration, rather than remove it and add another, and to add some randomness it could be temporarily paused and then restarted if we wanted (with a timer of some random delay). The code would be much different, but not necessarily more complex.

[Interactive demo: 10 sparkles, animation lengthened to 10 seconds, with a live count of the sparkles currently displayed.]

Regarding the animation events (whether animationend like it is now, or possibly animationiteration), given that they bubble, they could be listened to on a single parent (the element itself –filtering out possible animations on light DOM children– or an intermediate element inserted to contain all sparkles). This could hopefully simplify the code handling each sparkle.

Last, but not least, the addSparkles and addSparkle methods could be made private, as there's no reason to expose them in the element's API.

Final words

Had I started from scratch, I probably wouldn't have written the element the same way. I tried to keep the changes small, one step at a time, rather than doing a big refactoring, or starting from scratch and comparing the outcome to the original, as my goal was to specifically show what I think could be improved and how it wouldn't necessarily involve big changes. Going farther, and/or possibly using a helper library (I have written earlier about their added value), is left as an exercise for the reader.

http://blog.ltgt.net/web-component-step-by-step-improvement/

Beyond the login page

Nov 29, 2023 Updated Nov 29, 2023

There are many blog posts floating around about “adding authentication to your application”, be it written in Node.js, ASP.NET, Java with Spring Boot, JS in the browser talking to a JSON-based Web API on the server, etc. Most of them handle the login page and password storage, and sometimes logout and a user registration page. But authentication is actually much more than that!

Don't get me wrong, it's great that we can describe in a single blog post how to do such things, but everyone should be aware that this is actually just the beginning of the journey, and most of the time those blog posts don't have any such warnings.

So here are some things to think about when “adding authentication to your application”:

- are you sure you store passwords securely? and verify them securely?
- is your logout secure? (ideally it cannot be abused by tricking you into just clicking a link in an email or on a random site)
- are passwords robust?
  - put a lower bound on password length (NIST recommends a minimum of 8 characters); don't set an upper bound, or if you really want to, make sure it's high enough (NIST recommends accepting at least 64 characters)
  - if possible, check passwords (at registration or change) against known compromised passwords (use Pwned Passwords or similar)
- how well do you handle non-ASCII characters? For example, macOS and Windows encode diacritics differently, so make sure that someone who signed up on one device will be able to sign in on another (put differently, use Unicode normalization on inputs; NIST recommends using NFKC or NFKD)
- do you have a form to securely change the password? (when already authenticated)
- are your forms actually compatible with password managers?
- do you protect against brute-force attacks? if you do (e.g. by locking out accounts, or even just throttling), do you somehow protect legitimate users against DDoS?
- once authenticated, how do you maintain the authenticated state (sessions; btw don't use JWTs)? and is this secure? (in other words, do you protect against session fixation? cross-site request forgery?)
- how long are your sessions? There's a balance between short and long sessions regarding security and convenience, but a choice needs to be made.
- do you have a mechanism to ask for re-authentication before sensitive actions?
- what do you do if a user forgot their password? Password recovery generally requires an email address, do you have one? how can you make sure that the user didn't mistype it and you will actually be able to use it when they need it? Put differently: you need a secure email verification process before you can have a secure password reset process. Implementing those processes securely goes beyond the scope of this post, but let's just say we've just gone from one single blog post explaining how to “add authentication to your application” to a series of blog posts.
- by the way, now that you store an email address for password reset purposes, how can the user securely update it? and by that I also mean, how do you handle the case where the account got breached and the attacker changes the email address? There's unfortunately no simple answer to that, because there are a handful of cases to handle: the user may have lost access to the previous email, an attacker may have gained access to the previous email, the user may still have access to the previous email but have mistyped the new email, etc.
- speaking of changing passwords, do you make it easier for password managers? (spoiler: through a /.well-known/change-password URL)
- do you handle multi-factor authentication? do you plan on handling it in the future? If you use SMS to send one-time codes, can the device autofill the form?
- how about passkeys?
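
To make the Unicode normalization item more concrete, here is a minimal sketch using the standard String.prototype.normalize. The helper names and the way the minimum length is applied (counting code points after normalization) are assumptions for the example, not a complete password policy.

```javascript
// Hypothetical helper: normalize a password before hashing or comparing it.
function normalizePassword(password) {
  return password.normalize("NFKC");
}

const composed = "caf\u00e9"; // "é" as a single code point
const decomposed = "cafe\u0301"; // "e" followed by a combining acute accent

console.log(composed === decomposed); // → false
console.log(normalizePassword(composed) === normalizePassword(decomposed)); // → true

// Minimum length, counted in code points rather than UTF-16 code units.
const MIN_LENGTH = 8;
function isLongEnough(password) {
  return [...normalizePassword(password)].length >= MIN_LENGTH;
}

console.log(isLongEnough("short")); // → false
```

Whatever normalization form is chosen, it has to be applied both when the password is set and every time it is verified.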

That being said, I don't think I ever implemented all of the above perfectly. There are always tradeoffs. But these are things to think about and make choices, and sometimes deliberate choices to postpone things (or just not implement them, after pondering the risks). Unfortunately, I did however see big mistakes in implementations of the various processes hinted above.

Most of the time nowadays, I prefer offloading this to an identity provider, using OpenID Connect or soon Federated Credential Management (FedCM), even if that means shipping an identity provider as part of the deliverables (I generally go with Keycloak, with keycloak-config-cli to provision its configuration). I'm obviously biased though as I work in IT services, developing software mainly for intranets/extranets, and companies now increasingly have their own identity providers or at a minimum have that in their roadmap. So YMMV.

And we've only talked about authentication, not even authorization!

Some resources to go farther:

http://blog.ltgt.net/beyond-the-login-page/

What are JWT?

Nov 29, 2023 Updated Nov 29, 2023

This article's goal is to explain what JWTs are, for whenever you come across them. As we'll see, you won't deliberately choose to use JWTs in a project, and more importantly: you won't use JWTs as session tokens.

What is it?

JSON Web Token (JWT) is a compact, URL-safe means of representing data to be transferred between two parties. The data is encoded as a JSON object that can be signed and/or encrypted.

This is, paraphrased, the definition from the IETF standard that defines it (RFC 7519).

What's the point? What's the use case?

So the goal is to transfer data, with some guarantees (or none, by the way): authenticity, integrity, even possibly confidentiality (if the message is encrypted). There are therefore many possible uses.

JWT is thus used in OpenID Connect to encode the ID Token that forwards to the application information on the authentication process that took place at the identity server. OpenID Connect also uses JWT to encode aggregated claims: information from other identity servers, for which we'll want to verify the authenticity and integrity.

A JWT might be used to authenticate to a server, such as with the OAuth 2 JWT Bearer (RFC 7523).

Still in OAuth 2 land, access tokens could themselves be JWTs (RFC 9068), authorization request parameters could be encoded as a JWT (RFC 9101), as well as token introspection responses (IETF draft: JWT Response for OAuth Token Introspection), and finally dynamic client registration uses a JWT to identify the software of which an instance attempts to register (so-called software statements of RFC 7591).

How does it work?

A JWT is composed of at least 2 parts, separated with a . (dot), the first one always being the header. Each part is always encoded as base64url, a variant of Base 64 with the + and / characters (that have special meaning in URLs) replaced with - and _ respectively, and without the trailing =.
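
The base64url transformation itself is small enough to sketch. The two function names below are made up for the example, and the code relies on the btoa and atob globals (available in browsers and recent Node.js versions); it only illustrates the encoding and is in no way a JWT library.

```javascript
// Encode bytes as base64url: standard Base64, then swap the two
// URL-unsafe characters and drop the trailing "=" padding.
function base64urlEncode(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary)
    .replaceAll("+", "-")
    .replaceAll("/", "_")
    .replace(/=+$/, "");
}

// Decode base64url back to bytes: undo the swaps, restore the padding.
function base64urlDecode(text) {
  const base64 = text.replaceAll("-", "+").replaceAll("_", "/");
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, "=");
  return Uint8Array.from(atob(padded), (char) => char.charCodeAt(0));
}

// Bytes 0xFB 0xFF are "+/8=" in standard Base64.
console.log(base64urlEncode(new Uint8Array([0xfb, 0xff]))); // → -_8

// Reading the header of an unprotected JWT.
const header = new TextDecoder().decode(base64urlDecode("eyJhbGciOiJub25lIn0"));
console.log(header); // → {"alg":"none"}
```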

There are two types of JWTs: JSON Web Signature (JWS, defined by RFC 7515), and JSON Web Encryption (JWE, defined by RFC 7516). The most common case is the JWS, composed of 2 or 3 parts: the header, the payload, and optionally the signature. JWEs are rarer (and more complex) so I won't talk about them here.

The header, common to both types, describes the type of JWT (JWS or JWE) as well as the different signature, MAC, or encryption algorithms being used (codified by RFC 7518), along with other useful information, as a JSON object.
In the case of JWS, we'll find the signature or MAC algorithm, possibly a key identifier (whenever multiple keys can be used, e.g. to allow for key rotation), or even a URL pointing to information about the keys (in JWKS format, defined by RFC 7517), etc.

In the case of JWS, the payload will generally be a JSON object with the transferred data (but technically could be another JWT).

The third part is the signature or MAC. This part is absent if the header says the JWT is unprotected ("alg":"none").

For debugging, one can use the JWT Debugger by Auth0 to decode JWTs (beware not to use it with sensitive data, only on JWTs coming from test servers).

⚠️ JWTs being almost always used in security-related contexts, handle them with care, specifically when it comes to their cryptographic components.

One MUST use dedicated libraries to manipulate JWTs, and be careful to use them correctly to avoid introducing vulnerabilities.

RFC 8725 has a set of best practices when manipulating and using JWTs.

Criticism

Numerous security experts, among them cryptographers, vehemently criticize JWTs and advise against their use.

The main criticism relates to its complexity, even though it could look simple to developers:

first, you need to know how to decode UTF-8 and JSON; those are just as many sources of bugs (and potential vulnerabilities).
and of course because it's a generic format capable of signing and/or encrypting, or even not protecting anything at all ("alg":"none"), with a list of supported algorithms as long as your arm, you have to handle many cases (even if only to reject them).

As a result, a number of vulnerabilities have been identified; among them (identified as early as March 2015):

As the JWT itself declares the algorithm used to sign or encrypt it, software that receives it needs to partly trust it, or correctly check the used algorithm against a list of authorized algorithms. Because of its apparent simplicity, many libraries came out that didn't do those necessary checks and readily accepted unprotected JWTs ("alg":"none"), allowing an attacker to use any JWT, without authenticity or integrity check. And as incredible as it may seem, we still find vulnerable applications nowadays!

Note: in the same way, the header can directly include the public key to be used to verify the signature. Using it will prove the integrity of the JWT, but not its authenticity as the signature could have been generated by anyone.
Another attack involves using the public key intended to verify an asymmetric signature ("alg":"RS256" or "alg":"ES256") as a MAC key ("alg":"HS256"): the application receiving the JWT could then mistakenly validate the MAC and allow the JWT in. Anybody could then create a JWS that would be accepted by the application, while the application thinks it's verifying an asymmetric signature.

This vulnerability could be due to a misuse of the library used to verify JWTs, but also in some cases directly to its API, which cannot tell a public key from a shared secret (generally for the sake of making it easy to use).
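
The common thread in both attacks is that the verifier lets the token choose the algorithm. As an illustration only (real code should go through a dedicated library, as stated above), here is a hedged Python sketch of a verifier that pins a single expected algorithm, HS256, and rejects everything else, including "alg":"none". The function names and the demo secret are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Build a compact JWS protected with HMAC-SHA256 (demo helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the payload, or raise ValueError.

    The expected algorithm is fixed here by the application; the token's
    own "alg" header is only compared against it, never trusted.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison of the MACs.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))


secret = b"demo-secret-not-for-production"
token = sign_hs256({"sub": "alice"}, secret)
print(verify_hs256(token, secret))  # {'sub': 'alice'}
```

Because the key type and the algorithm are decided together by the caller, there is no code path where a public key could end up being used as an HMAC secret.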

Aside: despite ID Tokens in OpenID Connect being JWTs, you won't actually need to verify their signature as you generally get them through HTTPS, that already guarantees authenticity and integrity (and confidentiality), which saves us from a whole class of vulnerabilities.

Another criticism is due to the misuse of JWT, most often by ignorance or lack of expertise in software security: the validity of a JWT is directly verifiable, without the need for a database of valid tokens or a validation service (authenticity and integrity are verifiable, so the validity period contained within the JWT is reliable), but this makes the JWT impossible to revoke (unless you add such a mechanism –possibly based on the jti claim, initially designed to protect against replay attacks– going against the whole reason for which JWT was chosen in the first place). If a JWT is used as a session token for example, it then becomes impossible to sign out or terminate a session. In most use cases (in the specifications), a JWT is validated and used as soon as it's received from the issuer, so revocation is not even an issue. It's when a JWT is stored by the receiver for later use that the problem arises (such as with a session token or an access token).

Some articles critical of JWT:

JWT should not be your default for sessions (by Evert Pot, developer)
Why JWTs Suck as Session Tokens (by Okta, vendor of an identity management platform)
section "JSON Web Tokens" of "API Tokens: A Tedious Survey" (by Thomas H. Ptacek, security researcher)
Alternatives to JWTs (by Scott Brady, engineering manager specializing in identity management systems)
Stop using JWTs for sessions and Stop using JWT for sessions, part 2: Why your solution doesn’t work (on a web site surprisingly without HTTPS)
No Way, JOSE! Javascript Object Signing and Encryption is a Bad Standard That Everyone Should Avoid (by Scott “CiPHPerCoder” Arciszewski, cryptographer)
How to Write a Secure JWT Library If You Absolutely Must (by the same author)

http://blog.ltgt.net/jwt/

How I teach Git

Nov 26, 2023 Updated Nov 26, 2023

I've been using Git for a dozen years. Eight years ago, I had to give a training session on Git (and GitHub) to a partner company about to create an open source project, and I'm going to tell you here about the way I taught it. Incidentally, we created internal training sessions at work since then that use the same (or similar) approach. That being said, I didn't invent anything: this is heavily inspired by what others wrote before, including the Pro Git book, though not in the same order, and that IMO can make a difference.

The reason I'm writing this post is that over the years, I've kept seeing people actually use Git without really understanding what they're doing; they'd either be locked into a very specific workflow they were told to follow, and unable to adapt to another that, say, an open source project is using (this also applies to open source maintainers not really understanding how external contributors use Git themselves), or they'd be totally lost if anything doesn't behave the way they thought it would, or if they made a mistake invoking Git commands. I've been inspired to write it down by Julia Evans' (renewed) interest in Git, as she sometimes asks for comments on social networks.

My goal is not to actually teach you about Git, but rather to share my approach to teaching Git, so that others who teach it can possibly take inspiration. So if you're learning Git, this post was not written with you in mind (sorry), and as such might not be self-sufficient, but hopefully the links to other learning resources will be enough to fill in the blanks and make it a helpful learning resource as well. If you're a visual learner, those external learning resources are illustrated, or even oriented towards visual learning.

Mental model

Once we're clear why we use a VCS (Version Control System) where we record changes inside commits (or in other words we commit our changes to the history; I'm assuming some familiarity with this terminology), let's look at Git more specifically.

One thing I think is crucial to understanding Git is getting an accurate mental model of the concepts behind it.

First, though it's not really important, Git doesn't actually record changes but rather snapshots of our files (at least conceptually; it will use packfiles to store things efficiently and will actually store changes –diffs– in some cases), and will generate diffs on demand. This sometimes shows in the result of some commands though (like why some commands show one file removed and another added, while other commands show a file being renamed).

Now let's dive into some Git concepts, or how Git implements some common VCS concepts.

Commit

A Git commit is:

one or more parent commit(s), or none for the very first commit (root)
a commit message
an author and an author date (actually a timestamp with timezone offset)
a committer and commit date
and our files: their pathname relative to the repository root, their mode (UNIX file-system permissions), and their content

Each commit is given an identifier determined by computing the SHA1 hash of this information: change a comma and you get a different SHA1, a different commit object. (Fwiw, Git is slowly moving to SHA-256 as the hashing function).

Aside: how's the SHA1 computed?

Git's storage is content-addressed, meaning that each object is stored with a name that's directly derived from its content, in the form of its SHA1 hash.

Historically, Git stored everything in files, and we can still reason that way. A file's content is stored as a blob, a directory is stored as a tree (a text file that lists files in the directory with their name, mode, and the SHA1 of the blob representing their content, and their subdirectories with their name and the SHA1 of their tree).

If you want the details, Julia Evans wrote an amazing (again) blog post; or you can read it from the Pro Git book.
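
As a small illustration of content addressing, here is a Python sketch that recomputes the identifier Git gives to a blob: the SHA1 of a short header ("blob", the content length, and a NUL byte) followed by the content itself. The function name git_blob_sha1 is mine; the two sample values match what git hash-object prints for the same inputs.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha1(b""))                # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_sha1(b"test content\n"))  # d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

Trees and commits are hashed the same way, with their own header and serialized content, which is why changing anything in a commit yields a different identifier.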

A graph with 5 boxes organized in 3 columns, each box labelled with a 5-digit SHA1 prefix; the one on the left is sub-labelled "commit" and includes metadata "tree" with the SHA1 of the box in the middle, and "author" and "committer" both with value "Scott", and text "The initial commit of my project"; the box in the middle is sub-labelled "tree" and includes three lines, each labelled "blob", with the SHA1 of the 3 remaining boxes and what looks like file names: "README", "LICENSE" and "test.rb"; the last 3 boxes, aligned vertically on the right are all sub-labelled "blob" and contain what looks like the beginning of a README, LICENSE, and Ruby source file content; there are arrows linking boxes: the commit points to the tree, which points to the blobs. — A commit and its tree (source: Pro Git)

The parent commit(s) in a commit create a directed acyclic graph that represents our history: a directed acyclic graph is made of nodes (our commits) linked together with directed edges (each commit links to its parent commit(s); there's a direction, hence directed) and cannot have loops/cycles (a commit will never be its own ancestor, none of its ancestor commits will link to it as a parent commit).

A graph with 6 boxes arranged in 2 lines and 3 columns; each box on the first line is labelled with a 5-digit SHA1 prefix, sub-labelled "commit" and with metadata "tree" and "parent" both with a 5-digit SHA1 prefix –different each time–, "author" and "committer" both with value "Scott", and some text representing the commit message; the box on the left has no "parent" value, the two other boxes have as "parent" the SHA1 of the box on their left; there's an arrow between those boxes, pointing to the left representing the "parent"; incidentally, the box on the left has the same SHA1 and same content as the commit box from the above figure; finally, each commit box also points to a box beneath it each labelled "Snapshot A", "Snapshot B", etc. and possibly representing the "tree" object linked from each commit. — Commits and their parents (source: Pro Git)
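
A toy model can help here. The following Python sketch represents a history as a mapping from (made-up) commit identifiers to the identifiers of their parents, and walks the parent links to collect a commit's ancestors; it is a teaching aid under those assumptions, not how Git stores anything.

```python
# Hypothetical history: commit id -> ids of its parent commits.
history = {
    "a1": [],            # root commit: no parent
    "b2": ["a1"],
    "c3": ["b2"],
    "d4": ["b2"],        # diverges from c3
    "e5": ["c3", "d4"],  # merge commit: two parents
}


def ancestors(history, commit_id):
    """Return the set of all commits reachable through parent links."""
    seen = set()
    todo = list(history[commit_id])
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(history[current])
    return seen


def is_ancestor(history, maybe_ancestor, commit_id):
    """True if maybe_ancestor is reachable from commit_id."""
    return maybe_ancestor in ancestors(history, commit_id)


print(sorted(ancestors(history, "e5")))    # ['a1', 'b2', 'c3', 'd4']
print(is_ancestor(history, "c3", "d4"))    # False: they diverged from b2
```

Edges only ever point from a commit to older commits, so the walk always terminates.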

References, branches and tags

Now SHA1 hashes are impractical to work with as humans, and while Git allows us to work with unique SHA1 prefixes instead of the full SHA1 hash, we'd need simpler names to refer to our commits: enter references. Those are labels for our commits that we choose (rather than Git).

There are several kinds of references:

branches are moving references (note that main or master aren't special in any way, their name is only a convention)
tags are immutable references
HEAD is a special reference that points to the current commit. It generally points to a branch rather than directly to a commit (we'll see why later). When a reference points to another reference, this is called a symbolic reference.
there are other special references (FETCH_HEAD, ORIG_HEAD, etc.) that Git will set up for you during some operations

A graph with 9 boxes; 6 boxes are arranged the same as the above figure, and are labelled the same (three commits and their 3 trees); two boxes above the right-most (latest) commit, with arrows pointing towards it, are labelled "v1.0" and "master" respectively; the last box is above the "master" box, with an arrow pointing towards it, and is labelled "HEAD". — A branch and its commit history (source: Pro Git)

The three states

When you work in a Git repository, the files that you manipulate and record in the Git history are in your working directory. To create commits, you'll stage files in the index or staging area. When that's done you attach a commit message and move your staged files to the history.

And to close the loop, the working directory is initialized from a given commit from your history.

A sequence diagram with 3 participants: "Working Directory", "Staging Area", and ".git directory (Repository)"; there's a "Checkout the project" message from the ".git directory" to the "Working Directory", then "Stage Fixes" from the "Working Directory" to the "Staging Area", and finally "Commit" from the "Staging Area" to the ".git directory". — Working tree, staging area, and Git directory (source: Pro Git)

Aside: ignoring files

Not all files need to have their history tracked: those generated by your build system (if any), those specific to your editor, and those specific to your operating system or other work environment.

Git allows defining naming patterns of files or directories to ignore. This does not actually mean that Git will ignore them and they cannot be tracked, but that if they're not tracked, several Git operations won't show them to you or manipulate them (but you can manually add them to your history, and from then on they'll no longer be ignored).

Ignoring files is done by putting their pathname (possibly using globs) in ignore files:

.gitignore files anywhere in your repository define ignore patterns for the containing directory; those ignore files are tracked in history as a means to share them between developers; this is where you'll ignore those files generated by your build system (build/ for Gradle projects, _site/ for an Eleventy website, etc.)
.git/info/exclude is local to the repository on your machine; rarely used but sometimes useful so good to know about
and finally ~/.config/git/ignore is global to the machine (for your user); this is where you'll ignore files that are specific to your machine, such as those specific to the editors you use, or those specific to your operating system (e.g. the .DS_Store on macOS, or Thumbs.db on Windows)

Summing up

Here's another representation of all those concepts:

Commits, references, and areas (source: A Visual Git Reference, Mark Lodato)

Basic operations

This is where we start talking about Git commands, and how they interact with the graph:

git init to initialize a new repository
git status to get a summary of your files' state
git diff to show changes between any two of your working directory, the index, and HEAD, or actually between any two commits
git log to show and search your history
creating commits
- git add to add files to the index
- git commit to transform the index into a commit (with an added commit message)
- git add -p to add files interactively to the index: pick which changes to add and which ones to leave only in your working directory, on a file-by-file, part-by-part (called hunk) basis
managing branches
- git branch to show branches, or create a branch
- git switch (also git checkout) to check out a branch (or any commit, any tree, actually) to your working directory
- git switch -c (also git checkout -b) as a shortcut for git branch and git switch
git grep to search your working directory, index, or any commit; this is kind of an enhanced grep -R that's aware of Git
git blame to know the last commit that changed each line of a given file (so, who to blame for a bug)
git stash to put uncommitted changes aside (this includes staged files, as well as tracked files from the working directory), and later unstash them.

Commit, branch switching, and HEAD

When you create a commit (with git commit), Git not only creates the commit object, it also moves the HEAD to point to it. If the HEAD actually points to a branch, as is generally the case, Git will move that branch to the new commit (and HEAD will continue to point to the branch). Whenever the current branch is an ancestor of another branch (the commit pointed to by the branch is also part of another branch), committing will move HEAD in the same way, and the branches will diverge.

When you switch to another branch (with git switch or git checkout), HEAD moves to the new current branch, and your working directory and index are set up to resemble the state of that commit (uncommitted changes are tentatively kept; if Git is unable to do it, it will refuse the switch).
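
The interplay between HEAD, branches, and commits can be sketched with a toy model in Python. Everything here is hypothetical (the ToyRepo class, one-letter commit identifiers, single-parent commits only, no detached HEAD); it only mimics how labels move.

```python
class ToyRepo:
    """Minimal model: branches are labels on commits, HEAD names a branch."""

    def __init__(self):
        self.parents = {}               # commit id -> parent id (None for the root)
        self.branches = {"main": None}  # branch name -> commit id
        self.head = "main"              # symbolic reference to the current branch

    def current_commit(self):
        return self.branches[self.head]

    def commit(self, commit_id):
        # The new commit's parent is the current commit...
        self.parents[commit_id] = self.current_commit()
        # ...and the current branch moves; HEAD keeps pointing to the branch.
        self.branches[self.head] = commit_id

    def branch(self, name):
        # A new branch is just another label on the current commit.
        self.branches[name] = self.current_commit()

    def switch(self, name):
        if name not in self.branches:
            raise KeyError(name)
        self.head = name


repo = ToyRepo()
repo.commit("A")
repo.commit("B")
repo.branch("feature")   # main and feature both point to B
repo.switch("feature")
repo.commit("C")         # only feature moves
repo.switch("main")
repo.commit("D")         # main moves too: the branches have diverged
print(repo.branches)     # {'main': 'D', 'feature': 'C'}
```

Both C and D have B as their parent, which is exactly the divergence described above.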

For more details, and visual representations, see the commit and checkout sections of Mark Lodato's A Visual Git Reference (be aware that this reference was written years ago, when git switch and git restore didn't exist and git checkout was all we had; so the checkout section covers a bit more than git switch as a result). Of course, the Pro Git book is also a good reference with visual representations; the Branches in a Nutshell subchapter covers a big part of all of the above.

Aside: Git is conservative

As we've seen above, due to its content-addressed storage, any “change” to a commit (with git commit --amend for instance) will actually result in a different commit (different SHA1). The old commit won't disappear immediately: Git uses garbage collection to eventually delete commits that aren't reachable from any reference. This means that many mistakes can be recovered from if you manage to find the commit's SHA1 again (git reflog can help here, or the notation <branch-name>@{<n>}, e.g. main@{1} for the last commit that main pointed to before it changed).

Working with branches

We've seen above how branches can diverge. But diverging calls for eventually merging changes back (with git merge). Git is very good at that (as we'll see later).

A special case of merging is when the current branch is an ancestor of the branch being merged. In this case, Git can do a fast-forward merge: it simply moves the current branch forward to that commit, without creating a merge commit.

Because operations between two branches will likely always target the same pair of branches, Git allows you to set up a branch to track another branch. That other branch will be called the upstream of the branch that tracks it. When set up, git status will, for example, tell you how much the two branches have diverged from one another: is the current branch up to date with its upstream branch, behind it and can be fast-forwarded, ahead by a number of commits, or have they diverged, each by some number of commits. Other commands will use that information to provide good default values for parameters so they can be omitted.

To integrate changes from another branch, rather than merging, another option is to cherry-pick (with the same-named command) a single commit, without its history: Git will compute the changes brought in by that commit and apply the same changes to the current branch, creating a new commit similar to the original one (if you want to know more about how Git actually does it, see Julia Evans' How git cherry-pick and revert use 3-way merge).

Finally, another command in your toolbelt is rebase. You can see it as a way to do many cherry-picks at once but it's actually much more powerful (as we'll see below). In its basic use though, it's just that: you give it a range of commits (between any commit as the starting point and an existing branch as the end point, defaulting to the current one) and a target, and it cherry-picks all those commits on top of the target and finally updates the branch used as the end point. The command here is of the form git rebase --onto=<target> <start> <end>. As with many Git commands, arguments can be omitted and will have default values and/or specific meanings: thus, git rebase is a shorthand for git rebase --fork-point upstream where upstream is the upstream of the current branch (I'll ignore --fork-point here, its effect is subtle and not that important in every-day use), which itself is a shorthand for git rebase upstream HEAD (where HEAD must point to a branch), itself a shorthand for git rebase --onto=upstream upstream HEAD, a shorthand for git rebase --onto=upstream $(git merge-base upstream HEAD) HEAD, and will rebase all commits between the last common ancestor of upstream and the current branch on one hand and the current branch (i.e. all commits since they diverged) on the other hand, and will reapply them on top of upstream, then update the current branch to point to the new commits. Explicit use of --onto (with a value different from the starting point) is rare actually, see my previous post for one use case.
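
To illustrate which commits a plain git rebase upstream would replay, here is a Python sketch on a toy history. It assumes every commit has a single parent (no merge commits), uses made-up one-letter identifiers, and the function names are mine; Git's actual merge-base computation handles arbitrary graphs.

```python
# Hypothetical history: commit id -> parent id (None for the root).
#   A - B - C        (upstream)
#        \
#         D - E      (current branch)
parents = {"A": None, "B": "A", "C": "B", "D": "B", "E": "D"}


def chain(parents, tip):
    """Commits from tip back to the root, newest first."""
    result = []
    while tip is not None:
        result.append(tip)
        tip = parents[tip]
    return result


def merge_base(parents, a, b):
    """Most recent commit that is in the history of both a and b."""
    history_of_a = set(chain(parents, a))
    for commit in chain(parents, b):
        if commit in history_of_a:
            return commit
    return None


def commits_to_replay(parents, upstream, branch):
    """Commits of branch made since it diverged from upstream, oldest first."""
    base = merge_base(parents, upstream, branch)
    selected = []
    for commit in chain(parents, branch):
        if commit == base:
            break
        selected.append(commit)
    return list(reversed(selected))


print(merge_base(parents, "C", "E"))         # B
print(commits_to_replay(parents, "C", "E"))  # ['D', 'E']
```

Rebasing then amounts to cherry-picking D and E, in that order, on top of C, and moving the branch to the last new commit.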

We cannot present git rebase without its interactive variant git rebase -i: it starts with exactly the same behavior as the non-interactive variant, but after computing what needs to be done, it'll allow you to edit it (as a text file in an editor, one action per line). By default, all selected commits are cherry-picked, but you'll be able to reorder them, to skip some commit(s), or even combine some into a single commit. You can actually cherry-pick a commit that was not initially selected, and even create merge commits, thus entirely rewriting the whole history! Finally, you can also stop on a commit to edit it (using git commit --amend then, and/or possibly create new commits before continuing with the rebase), and/or run a given command between two commits. This last option is so useful (to e.g. validate that you didn't break your project at each point of the history) that you can pass that command in an --exec option and Git will execute it between each rebased commit (this works with non-interactive rebase too; in interactive mode you'll see execution lines inserted between each cherry-pick line when given the ability to edit the rebase scenario).

For more details, and visual representations, see the merge, cherry pick, and rebase sections of Mark Lodato's A Visual Git Reference, and the Basic Branching and Merging, Rebasing, and Rewriting History subchapters of the Pro Git book. You can also look at the “branching and merging” diagrams from David Drysdale's Git Visual Reference.

Working with others

For now, we've only ever worked locally in our repository. But Git was specifically built to work with others.

Let me introduce remotes.

Remotes

When you clone a repository, that repository becomes a remote of your local repository, named origin (just like with the main branch, this is just the default value and the name in itself has nothing special, besides sometimes being used as the default value when a command argument is omitted). You'll then start working, creating local commits and branches (therefore forking from the remote), and the remote will probably get some more commits and branches from its author in the meantime. You'll thus want to synchronize those remote changes into your local repository, and want to quickly know what changes you made locally compared to the remote. The way Git handles this is by recording the state of the remote it knows about (the branches, mainly) in a special namespace: refs/remotes/. Those are known as remote-tracking branches. Fwiw, local branches are stored in the refs/heads/ namespace, and tags in refs/tags/ (tags from remotes are generally imported right into refs/tags/, so for instance you lose the information of where they came from). You can have as many remotes as needed, each with a name. (Note that remotes don't necessarily live on other machines, they can actually be on the same machine, accessed directly from the filesystem, so you can play with remotes without having to set up anything.)

Fetching

Whenever you fetch from a remote (using git fetch, git pull, or git remote update), Git will talk to it to download the commits it doesn't yet know about, and will update the remote-tracking branches for the remote. The exact set of references to be fetched, and where they're fetched, is passed to the git fetch command (as refspecs), with the default value defined in your repository's .git/config, and configured by default by git clone or git remote add to take all branches (everything in refs/heads/ on the remote) and put them in refs/remotes/<remote>/ (so refs/remotes/origin/ for the origin remote), with the same name (so refs/heads/main on the remote becomes refs/remotes/origin/main locally).
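
The default fetch refspec that git clone writes for origin is +refs/heads/*:refs/remotes/origin/* (the leading + allows non-fast-forward updates of the remote-tracking branches). The following Python sketch shows how such a pattern maps a reference name on the remote to a local one; map_ref is a hypothetical helper that handles only a single * wildcard, not Git's full refspec syntax.

```python
DEFAULT_FETCH_REFSPEC = "+refs/heads/*:refs/remotes/origin/*"


def map_ref(refspec, ref):
    """Return the local name for a remote ref, or None if it doesn't match."""
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":")
    if "*" not in src:
        return dst if ref == src else None
    prefix, suffix = src.split("*")
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    # Whatever the wildcard matched is copied to the destination side.
    matched = ref[len(prefix):len(ref) - len(suffix)]
    return dst.replace("*", matched)


print(map_ref(DEFAULT_FETCH_REFSPEC, "refs/heads/main"))
# refs/remotes/origin/main
print(map_ref(DEFAULT_FETCH_REFSPEC, "refs/tags/v1.0"))
# None
```

Tags are not covered by that refspec, which is consistent with them being handled separately and landing directly in refs/tags/.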

A diagram with 3 big boxes, representing machines or repositories, containing smaller boxes and arrows representing commit histories; one box is labelled "git.ourcompany.com", sublabelled "origin", and includes commits in a branch named "master"; another box is labelled "git.team1.ourcompany.com", sublabelled "teamone", and includes commits in a branch named "master"; the commit SHA1 hashes are the same in "origin" and "teamone" except "origin" has one more commit on its "master" branch, i.e. "teamone" is "behind"; the third box is labelled "My Computer", it includes the same commits as the other two boxes, but this time the branches are named "origin/master" and "teamone/master"; it also includes two more commits in a branch named "master", diverging from an earlier point of the remote branches. — Remotes and remote-tracking branches (source: Pro Git)

You'll then use branch-related commands to get changes from a remote-tracking branch to your local branch (git merge or git rebase), or git pull which is hardly more than a shorthand for git fetch followed by a git merge or git rebase. BTW, in a number of situations, Git will automatically set up a remote-tracking branch to be the upstream of a local branch when you create it (it will tell you about it when that happens).

Pushing

To share your changes with others, they can either add your repository as a remote and pull from it (implying accessing your machine across the network), or you can push to a remote. (If you ask someone to pull changes from your remote, this is called a… pull request, a term you'll have probably heard of from GitHub or similar services.)

Pushing is similar to fetching, in reverse: you'll send your commits to the remote and update its branch to point to the new commits. As a safety measure, Git only allows remote branches to be fast-forwarded; if you want to push changes that would update the remote branch in a non-fast-forward way, you'll have to force it, using git push --force-with-lease (or git push --force, but be careful: --force-with-lease will first ensure your remote-tracking branch is up-to-date with the remote's branch, to make sure nobody pushed changes to the branch since the last time you fetched; --force won't do that check, doing what you're telling it to do, at your own risk).

As with git fetch, you pass the branches to update to the git push command, but Git provides a good default behavior if you don't. If you don't specify anything, Git will infer the remote from the upstream of the current branch, so most of the time git push is equivalent to git push origin. This actually is a shorthand for git push origin main (assuming the current branch is main), itself a shorthand for git push origin main:main, shorthand for git push origin refs/heads/main:refs/heads/main, meaning to push the local refs/heads/main to the origin remote's refs/heads/main. See my previous post for some use cases of specifying refspecs with differing source and destination.
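
The chain of shorthands can be written down as a small Python function. This is a deliberately simplified sketch that assumes every short name designates a branch; the function name is hypothetical, and Git's real resolution rules also consider tags and other namespaces.

```python
def expand_push_refspec(refspec):
    """Expand 'main' or 'src:dst' into fully qualified branch refs."""
    src, separator, dst = refspec.partition(":")
    if not separator:
        dst = src  # no destination given: push to the same-named branch

    def qualify(name):
        # Assumption: anything not already under refs/ is a branch name.
        return name if name.startswith("refs/") else "refs/heads/" + name

    return qualify(src) + ":" + qualify(dst)


print(expand_push_refspec("main"))
# refs/heads/main:refs/heads/main
print(expand_push_refspec("topic:review/topic"))
# refs/heads/topic:refs/heads/review/topic
```

The second call shows a source and destination that differ, pushing a local branch to a differently named branch on the remote.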

git push (source: Git Visual Reference, David Drysdale)

For more details, and visual representations, see the Remote Branches, Working with Remotes, and Contributing to a Project subchapters of the Pro Git book, and the “dealing with remote repositories” diagrams from David Drysdale's Git Visual Reference. The Contributing to a Project chapter of Pro Git also touches on contributing to open source projects on platforms like GitHub, where you have to first fork the repository, and contribute through pull requests (or merge requests).

Best practices

Those are directed towards beginners, and hopefully not too controversial.

Try to keep a clean history:

use merge commits wisely
clear and high-quality commit messages (see the commit guidelines in Pro Git)
make atomic commits: each commit should compile and run independently of the commits following it in the history

This only applies to the history you share with others. Locally, do whatever you want. For beginners, I'd give the following advice though:

don't work directly on main (or master, or any branch that you don't specifically own on the remote as well), create local branches instead; it helps decouple work on different tasks: about to start working on another bug or feature while waiting for additional details or instructions on the current one? switch to another branch, you'll get back to that later by switching back; it also makes it easier to update from the remote as you're sure you won't have conflicts if your local branches are simply copies of the remote ones of the same name, without any local change (except when you want to push those changes to that branch)
don't hesitate to rewrite your commit history (git commit --amend and/or git rebase -i), but don't do it too early; it's more than OK to stack many small commits while working, and only rewrite/clean up the history before you share it
similarly, don't hesitate to rebase your local branches to integrate upstream changes (until you've shared that branch, at which point you'll follow the project's own branching workflow)

If you run into a problem and are lost, my advice is to use gitk or gitk HEAD @{1}, also possibly gitk --all (I'm using gitk here but use whichever tool you prefer), to visualize your Git history and try to understand what happened. From this, you can roll back to the previous state (git reset @{1}) or try to fix things (cherry-picking a commit, etc.). And if you're in the middle of a rebase, or possibly a failed merge, you can abort and roll back to the previous state with commands like git rebase --abort or git merge --abort.

To make things even easier, don't hesitate, before any possibly destructive command (git rebase), to create a branch or a tag as a "bookmark" you can easily reset to if things don't go as expected. And of course, inspect the history and files after such a command to make sure the outcome is the one you expected.

Advanced concepts

Only a few of them, there are many more to explore!

Detached HEAD: the git checkout manpage has a good section on the topic, also see my previous post, and for a good visual representation, see the Committing with a Detached HEAD section of Mark Lodato's A Visual Git Reference.
Hooks: those are executables (shell scripts most of the time) that Git will run in reaction to operations on a repository; people use them to lint the code before each commit (aborting the commit if that fails), generate or post-process commit messages, or trigger actions on the server after someone pushes to the repository (trigger builds and/or deployments).
A couple of rarely needed commands that can save you hours when you actually need them:
- git bisect: an advanced command to help you pinpoint which commit introduced a bug, by testing several commits (manually or through scripting); with a linear history, this is using bisection and could be done manually, but as soon as you have many merge commits this becomes much more complex and it's good to have git bisect do the heavy lifting.
- git filter-repo: a third-party command actually, as a replacement to Git's own filter-branch, that allows rewriting the whole history of a repository to remove a mistakenly added file, or help extract part of the repository to another.

We're done.

With this knowledge, one should be able to map any Git command to how it will modify the directed acyclic graph of commits, and understand how to fix mistakes (ran a merge on the wrong branch? rebased on the wrong branch?). I'm not saying understanding such things will be easy, but it should at least be possible.

http://blog.ltgt.net/teaching-git/

Confusing git terminology

Nov 12, 2023 Updated Nov 12, 2023

Last week, Julia Evans published on her blog about confusing git terminology. This is an awesome post but not all explanations resonated with me so I thought I'd write my own version (or rather, add my own notes) in case others felt the same (Julia, feel free to cherry pick from here to your blog 😉). I'll also reorder them to make it easier to cross-reference without you having to jump around.

My mental representation of git

First, let me quickly describe how I represent a git repository in my head.

A git repository is a set of directed acyclic graphs of commits. In many cases a repository has only one such graph, but there can actually be multiple (early users of GitHub Pages know about the gh-pages branch, in most cases it's an entirely separate branch, a separate graph not connected in any way to the other branches).

Then to easily reference some of those commits, we put labels on them: those are our branches and tags (among other things).

Each git repository on a machine contains such a set of directed acyclic graphs of commits, and each time you git clone, git fetch and git push you copy parts of these graphs between repositories.

You can use gitk --all or git log --all --oneline --graph to visualize the graphs known on your machine.

HEAD and “heads”

As Julia says, “heads” are “branches” (contrary to tags that are immutable, those “heads” move along the graph).

The way I see HEAD though is more like “what's been checked out in the working directory”. It will thus indeed be “the current branch” most of the time, but not always (we'll come to those cases below).

One interesting thing: a remote repository also has a HEAD, it then represents the “default branch” that will be checked out when you clone the repository (unless you tell git to checkout a specific branch). Actually, git makes no distinction between a repository on a server that everyone will clone from (e.g. on GitHub), and any of these clones: git is decentralized before all. You can even clone from a repository you already have on your machine, and observe that the branch that will be checked out by default will be that source repository's HEAD. When you change the “default branch” of your repository on GitHub, what you're actually doing is updating its HEAD.

“reference”, “symbolic reference”

A reference is any label on a commit in the directed acyclic graph of commits. It allows you to reference (sic!) a commit by a (somewhat) simple name (much simpler than the commit ID at least). Those are branches (local and remote), tags, as well as HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD, etc.

A symbolic reference is a reference that points to another reference, rather than directly to a commit. This is the case of HEAD when you checkout a branch: it points to the branch so that git knows to move that branch forward when you make a new commit.

Note that as Julia notes, HEAD^^^ is not technically a reference, it's one of many different ways of specifying revisions (another name for a commit).

“index”, “staged”, “cached”

I have nothing to add to what Julia wrote. tl;dr: they're all the same thing, but --cached (or --staged which is a synonym) and --index mean slightly different things.

“untracked files”, “remote-tracking branch”, “track remote branch”

The word “track” here has three different meanings:

an “untracked file” is a file that's not included in HEAD or the index (technically it could exist in another commit, but when only looking at HEAD and comparing it to the state of your working directory, it only exists in your working directory and not in HEAD or in the index)
a “remote-tracking branch” is a reference that corresponds to a branch in a remote repository that you fetched. Whenever you git fetch (or git clone) from a remote repository, the branches in that remote repository (in refs/heads/ there) are copied/updated to your repository under new names, in refs/remotes/<remotename>/ rather than refs/heads/ (refs/heads/ being reserved for local branches). Those refs/remotes/<remotename>/ branches are thus tracking the corresponding refs/heads/ from the remote repository.
in git, a branch can be configured to “track” another (e.g. using git branch --track when creating a branch, git branch --set-upstream-to= to change a branch); that other branch is then said to be the “upstream” of the former. Git will use that information in git status to tell you by how many commits the two branches diverge, and in git pull and git push to synchronize the two branches. The “upstream” branch can be a “remote-tracking branch” or a local branch. When you git switch (or git checkout) to a local branch that actually doesn't exist but has a name match in a single remote, git will automatically create it from the matching “remote-tracking branch”, and set it up to “track” it (by extension, the repository you cloned/forked from, and whose branches you'll track, can also be called the “upstream repository”).

“detached HEAD state”

When the HEAD points to a (local) branch, each new commit will move the branch label to the new commit.

When the HEAD points to anything other than a (local) branch, git won't be able to move the reference to a new commit: you're in a “detached HEAD state”. If you make a new commit, only HEAD will reference it and nothing else, so if you switch to a branch you'll no longer have any reference (label) to that commit. In other words, you're in a “detached HEAD state” when HEAD is not a “symbolic reference” but directly references a commit.

Note that when you checkout anything that's not a local branch (in refs/heads/), whether it's a tag or a “remote-tracking branch”, git will resolve it to the commit ID and set up HEAD to point to that ID, so you'll be in a “detached HEAD state”.
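To make the difference concrete, here is a minimal sketch of that behavior in Python. It is a toy model only: the `refs` dictionary, the `("ref", name)` / `("commit", id)` tuples and the commit IDs `c1` to `c4` are all made up for illustration and are not how git stores anything.

```python
# Toy model of git references (an illustration, not git's real storage format).
refs = {"refs/heads/main": "c2", "refs/tags/v1.0": "c1"}


def checkout(name):
    """Check out a ref: only local branches keep HEAD symbolic."""
    if name.startswith("refs/heads/"):
        return ("ref", name)
    # Tag or remote-tracking branch: resolve to the commit ID and detach.
    return ("commit", refs[name])


def commit(head, new_commit):
    """Record a new commit and return the (possibly updated) HEAD."""
    if head[0] == "ref":
        # HEAD is symbolic: the branch label it points to moves forward.
        refs[head[1]] = new_commit
        return head
    # Detached HEAD: only HEAD itself references the new commit.
    return ("commit", new_commit)


head = checkout("refs/heads/main")
head = commit(head, "c3")
on_branch = (head, refs["refs/heads/main"])

head = checkout("refs/tags/v1.0")
head = commit(head, "c4")
detached = (head, refs["refs/tags/v1.0"])
```

In the first case the `main` label follows the new commit; in the second, the tag stays where it was and nothing but HEAD knows about `c4`.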

“ours” and “theirs” while merging or rebasing

“Ours” and “theirs”, or “local” and “remote”, are indeed confusing.

When merging, you merge another branch into the current branch: the current branch is “ours” and the other one is thus “theirs”.

But when rebasing the current branch on top of another branch, you're repeatedly cherry-picking the commits from the current branch on top of the other branch, so the other branch is “ours” or “local”, and the commits from the current branch are “theirs”. To make things a bit clearer, I like to think of how rebase works (conceptually at least): after determining the list of commits that differ between the branches and need to be rebased, first checkout the other (target) branch, then for each commit in the list cherry-pick it, and finally update the branch to point to the last rebased commit. Because you start by moving to the branch on top of which you want to rebase, it becomes the “ours” or “local”, and the branch you started from becomes the “theirs” or “remote”.

“Your branch is up to date with ‘origin/main’”

This is directly derived from the “tracking” of your branch, as seen above: if your current branch “tracks” refs/remotes/origin/main, then git status will display by how many commits the two branches diverge. When they don't diverge (i.e. both references point to the exact same commit), then the branch is said to be “up to date” with its “upstream”.

Remember though, as Julia points out, that refs/remotes/origin/main is only updated when you explicitly fetch from the remote repository (with git fetch, git pull, or git remote update).

“can be fast-forwarded”

This is another message you can see in the output of git status related to the state of this branch relative to its “upstream” branch. We've seen that when they both point to the same commit you'll get an “is up-to-date” message; this one is another situation when the branches have not diverged, but they're not identical either. This happens when the current branch is “behind” its “upstream”: it points to a commit that's part of the “upstream”, but “upstream” actually has more commits.

A - B (main)
     \
      C - D (origin/main)

or if you prefer

A - B (main) - C - D (origin/main)

This will typically be the case when you did git pull a few days ago to bring your main “up-to-date” with origin/main (at that time, both main and origin/main pointed to commit B) and didn't touch it since then, and things continued moving in the origin remote repository (commits C and D were added). When you git fetch origin main, you retrieve commits C and D locally into origin/main; now main can be “fast-forwarded” to commit D by just moving the main label along the graph towards origin/main.

In other words, there's no need to create a merge commit when running git merge (or git pull), and there's no risk of merge conflict. There's hardly any situation safer than a “fast-forward merge”.

Note that such a “fast-forward merge” can actually bring in merge commits (here, main can be fast-forwarded to origin/main, and bring in commits C, D, E, F, and G):

A - B (main) - C - D (origin/main)
 \            /
  E -- F --- G (origin/newfeature)

As for the name, I like to imagine those commits as a timeline, or a tape in a tape cassette or VHS. You were following changes but ⏸️ paused a few days ago at your last git pull. Git knows that there's origin/main ahead in a “straight line” so you can just press the “⏩ fast forward” button to safely reach that new state.

The other situations you can experience that are neither an “is up to date with” nor a “can be fast-forwarded” are:

when your branch has more commits than its “upstream”: git will show “Your branch is ahead of 'origin/main' by N commits”
```
A - B (origin/main)
     \
      C - D (main)
```
when they have diverged: “Your branch and 'origin/main' have diverged, and have M and N different commits each, respectively”
```
A - B (main)
 \
  C - D (origin/main)
```
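The four situations can be told apart just by comparing what is reachable from each label. Here is a minimal sketch on a toy graph; the `parents` dictionary, the commit names and the wording of the returned messages are assumptions for illustration, not git's actual implementation or output.

```python
# Toy commit graph: each commit maps to the list of its parents.
# A - B - C - D, plus X branching off B.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "X": ["B"]}


def ancestors(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def status(local, upstream):
    """Describe how the `local` commit relates to its `upstream` commit."""
    ahead = len(ancestors(local) - ancestors(upstream))
    behind = len(ancestors(upstream) - ancestors(local))
    if ahead == 0 and behind == 0:
        return "up to date"
    if ahead == 0:
        return f"behind by {behind}, can be fast-forwarded"
    if behind == 0:
        return f"ahead by {ahead}"
    return f"diverged ({ahead} and {behind} different commits)"


print(status("B", "B"))
print(status("B", "D"))
print(status("D", "B"))
print(status("X", "D"))
```

A fast-forward is possible exactly when the local branch has nothing the upstream lacks, which is why moving the label is enough.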

HEAD^, HEAD~, HEAD^^, HEAD~~, HEAD^2, HEAD~2

When you need to specify commits as parameters to git commands, one way is to use the commit ID, or a reference (branch, tag) name. But git makes it easier for those commits that are not directly pointed to by a reference: if you know how to find that commit then no need to use git log to go search the commit ID yourself, you can tell git how to get to it from another commit.

That's what the ^ and ~ suffixes do (there are other notations as well).

So ^ is actually a shorthand for ^1 which takes the “first parent” of the commit you apply it to. Most commits have only a single parent, but merge commits will have at least 2 (yes, at least, you can actually have merge commits with more than 2 parents), so ^ or ^1 will take the first, and ^2 the second (and ^3 the third, you got it).

HEAD^^ actually just applies the ^ operator to HEAD^, which itself had applied it to HEAD, therefore taking “two commits ago”.

To make it easier to follow the “first parents”, the ~ operator can be used. Similarly, ~ is actually a shorthand for ~1. Directly taken from the docs, ~3 is equivalent to ^^^ and directly expresses “three commits before” (or “three commits ago” when applied to HEAD). So “ten commits ago” can be written either HEAD^^^^^^^^^^ or HEAD~10, one is easier to read than the other 😉
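If it helps, here is a small sketch of how those suffixes walk the graph. The graph, the commit names and the `resolve` helper are hypothetical, and only the ^, ^N, ~ and ~N forms are handled (git supports more notations than this).

```python
import re

# Toy graph: M is a merge commit whose first parent is C and second parent is F.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "E": ["A"], "F": ["E"],
    "M": ["C", "F"],
}


def resolve(revision):
    """Resolve a name followed by any mix of ^, ^N, ~ and ~N suffixes."""
    name, suffixes = re.fullmatch(r"(\w+)((?:[\^~]\d*)*)", revision).groups()
    commit = name
    for operator, number in re.findall(r"([\^~])(\d*)", suffixes):
        n = int(number) if number else 1  # a bare ^ or ~ means ^1 or ~1
        if operator == "^":
            if n > 0:  # ^0 designates the commit itself
                commit = parents[commit][n - 1]  # ^N: the Nth parent
        else:
            for _ in range(n):  # ~N: follow the first parent N times
                commit = parents[commit][0]
    return commit


print(resolve("M^"), resolve("M^2"), resolve("M~2"), resolve("M^2~"))
```

Suffixes are applied left to right, so `M^2~` first moves to the second parent of the merge and then to that commit's first parent.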

.. and ...

Those are generally used with git log and git diff.

The notation r1..r2 selects all commits reachable from r2 that are not reachable from r1 (note that r1 and r2 can be any form of revision: a reference or a commit ID), whereas r1...r2 selects all commits reachable from either r1 or r2 but not both.

In a typical tree with two diverging branches like this:

A - B (main)
  \ 
    C - D (test)

the notation main..test will select commits C and D only, whereas main...test will select all of B, C and D (but not A).

Note that the behavior is different with git diff, as git diff is about comparing two points in the graph, not a range of commits! git diff thus has its own definition for .. and ...: whereas git diff r1..r2 is equivalent to git diff r1 r2, showing the difference between those 2 commits, git diff r1...r2 will however find the last common ancestor of r1 and r2 (same as git merge-base r1 r2), and diff between that common ancestor and r2. In other words, git diff main...test will show the changes in test since the point it diverged from main (what changes did I add to my branch, ignoring commits added to the “upstream” since then? or what changes exist in my “upstream” branch since I branched out, ignoring changes in my branch?)

While this might seem the reverse of git log (commit B is taken into account by git log main...test but not git log main..test, and by git diff main..test but not git diff main...test), the pairing to remember is crosswise: git log main..test and git diff main...test will both only tell you about commits C and D (notice that the three-dot diff is what GitHub is using when clicking on those compare links).

TL;DR: to see what a branch adds since it diverged, use .. with git log and ... with git diff; the .. notation is almost never what you want for git diff, where the space-separated form says the same thing more clearly.
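Both notations boil down to set operations on "what is reachable from where". Here is a minimal sketch on the two-branch graph drawn earlier in this section (B and C both have A as parent); the helper names are made up for illustration.

```python
# Toy graph: main points to B, test points to D, both branches start from A.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"]}
branches = {"main": "B", "test": "D"}


def reachable(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def two_dots(r1, r2):
    """git log r1..r2: reachable from r2 but not from r1."""
    return reachable(branches[r2]) - reachable(branches[r1])


def three_dots(r1, r2):
    """git log r1...r2: reachable from either, but not from both."""
    return reachable(branches[r1]) ^ reachable(branches[r2])


def merge_base(r1, r2):
    """Common ancestors that are not a proper ancestor of another common one."""
    common = reachable(branches[r1]) & reachable(branches[r2])
    return {c for c in common
            if not any(c != other and c in reachable(other) for other in common)}


print(sorted(two_dots("main", "test")))
print(sorted(three_dots("main", "test")))
print(merge_base("main", "test"))
```

With git diff, the three-dot form compares the merge base (A here) with test, while the two-dot form compares the two tips B and D directly.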

refspecs

Refspecs are used by git fetch and git push to determine what to fetch or push, respectively, and the mapping between local references and remote ones (though most of the time one uses those commands without an explicit refspec). A default refspec can also be configured for a remote (remote repository) for each action (fetch or push); one will generally be configured for fetching.

When you clone a repository, git sets up a remote named origin and configures its default refspec, generally with +refs/heads/*:refs/remotes/origin/* but this can differ depending on the options passed to git clone.

This refspec tells git that when fetching from the remote repository, all the references inside refs/heads/ (due to the * wildcard) will be fetched and stored locally into refs/remotes/origin/ (using the same name suffix). The + is equivalent to passing --force to the commands and will update the destination reference even if the new value is not “fast-forwarded” from the current value. When fetching, this means that if someone force-pushed a branch, git will update the corresponding refs/remotes/ on your side to make it match the remote reference; without the +, your “remote-tracking branch” would instead stay desynchronized.
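The name mapping itself is simple pattern substitution. As a rough sketch (the `apply_refspec` function is hypothetical and only covers the `[+]<source>:<destination>` form with at most one wildcard, not everything git accepts):

```python
def apply_refspec(refspec, remote_ref):
    """Return (local_ref, force) for a remote ref, or None if it does not match."""
    force = refspec.startswith("+")
    source, destination = refspec.lstrip("+").split(":")
    if "*" not in source:
        # No wildcard: the source must match exactly.
        return (destination, force) if remote_ref == source else None
    prefix, suffix = source.split("*")
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The part matched by * is copied into the destination pattern.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return (destination.replace("*", matched), force)


default = "+refs/heads/*:refs/remotes/origin/*"
print(apply_refspec(default, "refs/heads/main"))
print(apply_refspec(default, "refs/tags/v1.0"))
```

The boolean only records that the + was present; deciding whether a non-fast-forward update is allowed is a separate step that this sketch leaves out.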

The --tags flag is actually a shorthand for adding the refs/tags/*:refs/tags/* refspec: tags are synchronized (either fetched or pushed, depending on the command) between repositories (without overwriting existing tags at the destination).

As I said above, you can actually use those refspecs for pushing too.

For example, with git push origin HEAD:test you will update (or possibly create) a test branch on the remote repository (git will expand test to refs/heads/test) to point to the commit that's locally your HEAD (this will send the appropriate commits to the remote to make it possible). I use this from time to time on side-projects where I'm the only maintainer to test local commits on a scratch branch, to trigger my GitHub Actions; if the build passes, then only will I push to main; all without having to create that test branch locally.

I sometimes also use the form git push origin main^:main to push my main branch, except for its last commit, that I will keep local as it's likely a work in progress.

People working with Gerrit will be familiar with git push origin HEAD:refs/for/main to push commits for review (refs/for is a magic namespace in Gerrit to push for review for a target branch), and now you know what it means 😉.

You might sometimes also see things like git push origin :test, this will delete the remote test branch, and is equivalent to git push origin --delete test (and it was the only way to delete a remote branch or tag before the --delete flag was added).

“reset”, “revert”, “restore”

Those three terms are all meant to somehow destroy something, but in different ways. Heck, there's even a section of the docs dedicated to disambiguating them!

“reset” is meant to move the current branch to another commit (a “fast-forward merge” is actually equivalent to a “reset”), though it can also be used to manipulate the “index” (opposite of git add and equivalent to git restore --staged). You can tell git reset what to do with your index and working tree with flags such as --hard.
“revert” will create new commits that will undo the effects of previous commits
“restore” is all about files in your working directory or index, to undo changes made to them and restore them to a specific version recorded in some commit or the index.

checkout

The git checkout command can do two seemingly unrelated things:

“switching” to another branch, and
“restoring” files from a given commit

Technically, those are actually quite similar as they're about changing files in your working directory, and in the case of “switching” also changing what HEAD points to.

Nowadays, you should rather use the git switch and git restore commands to the same effects.

“tree-ish”

In git, each commit is a snapshot of the state of the repository, along with some metadata (among them the commit message, committer, and author). That snapshot is stored as a tree object. A “tree-ish” is anything that resolves to a tree object: either the tree ID itself, or a commit-ish (a commit ID, a reference name, possibly using the ^ or ~ operators as seen above).

Technically you can also refer to a subtree (directory) of a given tree-ish by suffixing it with : followed by the path of the directory. While I sometimes use this notation with git show to refer to files (show me the content of the given file inside that commit), I've never ever used it for a subtree (this can apparently be used with git restore --source=, git checkout, and git reset; looks like a very advanced feature to me).

reflog

The reflog, or reference log, is kind of an audit log of any change ever done to references in your local repository.

You'll almost never use it but it can save you in some gnarly situations, to recover things you accidentally deleted.

merge vs rebase vs cherry-pick

I have to say I don't quite understand how those terms are confusing 🤷

I suppose this is due to superficial knowledge of git; knowing mostly git commands and not really having a mental representation of the concepts at hand. Git core concepts aren't that hard to comprehend, but if nobody explains them to you and you only learned to use git by memorizing a few commands, you can quickly get lost, particularly when told to change your workflow (fwiw, this is I think the main reason we created internal training sessions at work, starting from those concepts towards the commands that manipulate them, dispensed to all new hires).

The commands can sometimes be confusing to use though:

git merge will create a new commit joining two lines of commit history (two branches)
git rebase will replay your commits on top of another commit (selecting the commits since the last common ancestor). In more advanced use cases, you can also specify exactly which set of commits to rebase, and onto which commit to rebase them (see below).

Because git stores snapshots, and not diffs, it will compute the diff of each commit (similar to git diff) and apply it on the new base. Julia has a wonderful post explaining how this all works in detail.

git rebase also has some super powers in the form of its interactive mode, where you can tell it to reorder the commits, skip some, squash others into a single commit, etc. You generally use this form to replay your history without changing your “base”.
git cherry-pick will also replay a commit, but works kinda the reverse of git rebase: you tell it which commit (from another branch) to replay on top of your current branch; the commits from your current branch don't change, you're creating a new commit that does the same as another commit from another branch.

The thing to remember: git rebase can be destructive, so use with care and don't hesitate to create a branch as bookmark before you rebase, and/or abort your rebase if you feel like you lose control of it. That being said, my personal workflow involves rebasing a lot.

git rebase --onto

When you use git rebase main to rebase your current branch on top of main (e.g. just before merging it, as a “fast-forward merge”, because you like your history to be linear; or just to avoid all those merge commits whenever you want to sync your feature branch with new changes from main), git will first find the last common ancestor between your current branch and main, and get the list of commits in your branch since that point (this is the exact equivalent to git log main.. or git log main..HEAD if you remember). It will then replay them on top of main.

So main is used twice here: to find which commit to rebase, and “onto” which base.

Imagine you started working on a new feature, so you branched from main at some point. Then management decides that the feature becomes a priority and should be released early, without other features that already landed on main. So a new branch (let's call it release-X) is created from an earlier point of main than you branched from, then possibly a few bugfixes are cherry-picked too. You would then want to take all the commits from your branch and move them as if you branched from that new branch (or any earlier point from main than you initially branched from): git rebase --onto release-X main.
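Here is a conceptual sketch of what that command does to the graph. Everything in it is a made-up illustration: the commit names, the convention of naming a replayed commit by appending `'`, and the assumption that the feature branch is linear; real git also has to apply each commit's diff, which is skipped here.

```python
# Toy graph for the scenario above.
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],  # main: A - B - C - D
    "F1": ["C"], "F2": ["F1"],                    # feature, branched from main at C
    "R": ["B"],                                   # release-X, branched from main at B
}


def reachable(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def rebase_onto(new_base, upstream, branch):
    """Replay the commits of `branch` not in `upstream` on top of `new_base`."""
    excluded = reachable(upstream)
    to_replay = []
    commit = branch
    while commit not in excluded:  # walk back along the (linear) feature history
        to_replay.append(commit)
        commit = parents[commit][0]
    tip = new_base
    for commit in reversed(to_replay):  # oldest first
        copy = commit + "'"  # a replayed commit is a new commit
        parents[copy] = [tip]
        tip = copy
    return tip  # the branch label is finally moved here


new_tip = rebase_onto(new_base="R", upstream="D", branch="F2")
print(new_tip, parents[new_tip], parents["F1'"])
```

With plain git rebase main, `new_base` and `upstream` would be the same commit; --onto is what lets you pick them separately.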

commit, more confusing terms, and all the rest…

I'll stop there: I have nothing to add to what Julia says on “commit”.

I might actually do a followup post with some of the things she left out. I'd personally add fork vs. clone too.

http://blog.ltgt.net/confusing-git-terminology/

Climate-friendly software: don't fight the wrong battle

Apr 30, 2023 Updated Apr 30, 2023

When talking about software ecodesign, green IT, climate-friendly software, the carbon footprint of software, or however you name it, most of the time people focus on energy efficiency and server-side code, sometimes going to great lengths measuring and monitoring it. But what if all this was misguided?

Ok, this is a bit of a bold statement, but don't get me wrong: I'm not saying you shouldn't care about this. Let's look at one of the most recent examples I've seen: GitHub's ReadME Project Q&A: Slash your code's carbon footprint newsletter issue. It's good and I agree with many things in there (go read it if you haven't already), but it talks almost exclusively about energy efficiency and server-side code, or in other words it limits actions to the scope 2 of the GHG Protocol.

So let's first understand which impacts we're talking about before I give you my opinion on the low-hanging fruits.

Disclaimer: people regarded as experts in green IT trusted me enough to have me contribute to a book on the subject but I'm not myself an expert in the field.

Note: this post is written for developers and software architects; there are other actions to lower the climate impact of the digital world that won't be covered here.

Stepping back

Most software nowadays is client-server: whether web-based or mobile, more and more end-user software talks to servers. This means there's a huge asymmetry in usage: even for small-scale professional software the end users generally vastly outnumber the servers. And this implies the impacts of the individual clients need to be much lower than those of the servers.

Data table showing greenhouse gas emissions share broken down by tier and lifecycle stage; all values in user equipment line are red, other values in use phase column are orange; in the total column, user equipment is red, networks orange, and data centers green — Greenhouse gas emissions balance (source, PDF, 533 KB)

What life-cycle assessments (LCA) for end-users' devices tell us is that manufacturing, transport and disposal summed up immensely outweighs use, ranging from 65% up to nearly 98% of the global warming potential (GWP). Of course, this depends where the device was manufactured and where it's being used, with the use location's biggest impact being related to the carbon footprint of the electric system, as the use phase is all about charging or powering our smartphones, laptops and desktops.

Bar chart of the estimated greenhouse gas (GHG) emissions for the Google Pixel 7; production is 7 times bigger than customer use, itself much bigger than transportation or recycling — Estimated Greenhouse Gas (GHG) emissions for a Google Pixel 7 (source, PDF, 224 KB)

Piechart of the estimated carbon footprint for a Dell Precision 3520 broken down by lifecycle phase, with a secondary piechart breaking down the footprint of the manufacturing phase by component; manufacturing is more than 4.5 times bigger than use, itself much bigger than transportation or end of life; components with the biggest impacts are the display, twice as big as the solid state drive, followed by the power supply and mainboard — Estimated carbon footprint allocation for my Dell Precision 3520, assuming 4 years of use (I've had mine for more than 5.5 years already): 304 kg CO₂e ± 68 kg CO₂e (source, PDF, 557 KB)

I am French, working mainly for French companies with most of their users in France, so I'm ready to admit I'm biased towards a very low use phase weight compared to other regions: go explore data for your users on Electricity Map and Our World in Data. And yet, that doesn't change the fact that the use phase has a much lower carbon footprint than all three of manufacturing, transport, and disposal as a whole.

What we can infer from this, is that keeping our devices longer will increase the share of use in the whole life-cycle impacts. Fairphone measured that extending the lifespan of their phones from 3 to 5 years helps reduce the yearly emissions on global warming by 31%, while a further extension to 7 years of use helps reduce the yearly impact by 44%.

Barchart of yearly emissions for the Fairphone 4, per baseline scenario — Fairphone 4: comparative of yearly emissions per baseline scenario (source, PDF, 1.1 MB)

Extending the lifespan of a smartphone from 3 to 5 years can reduce its yearly global warming impacts by almost a third.

Things are different for servers though, where the use phase's share varies much more depending on use location: from 4% up to 85%! As noted in the ReadME Project Q&A linked above, big companies' datacenters are for the most part net-neutral in carbon emissions, so not only the geographic regions of your servers matter, but also the actual datacenters in those regions. This implies that whatever you do on the server side, its impact will likely be limited (remember what I was saying in the introduction?) Of course there are exceptions, and there will always be, so please look at this through the prism of your own workloads.

Piechart of estimated carbon footprint allocation for a Dell PowerEdge R640, assuming 4 years of use: use is more than 4.5 times bigger than manufacturing, itself an order of magnitude bigger than transportation or end of life. — Estimated carbon footprint for a Dell PowerEdge R640 server, assuming 4 years of use: 7730 kg CO₂e (source, PDF, 514 KB)

Keep in mind the orders of magnitude though: 70 kg CO₂e for a single Pixel 7 (on 3 years) vs. 7730 kg CO₂e for a Dell PowerEdge R640 server (on 4 years), that's 110 smartphones for a server (or a 83:1 ratio when considering yearly emissions): chances are that you'll have many more users than that. The ratio for laptops (304 kg CO₂e on 4 years for a Dell Precision 3520) would be 25 laptops for a server. But as seen previously the actual carbon footprint will vary a lot depending on the location; you can explore some data in the Boavizta data visualization tool that compiles dozens of LCAs of various manufacturers. The Dell PowerEdge R640 in France would actually emit 1701 kg CO₂e rather than 7730 kg CO₂e: that's a 4.5:1 ratio! Comparatively, my Dell Precision 3520 would fall from 304 kg CO₂e to 261 kg CO₂e, only a 1.16:1 ratio. The laptop to server ratio would thus fall from 25:1 down to 6.5:1, which makes the laptops' impacts weigh comparatively much more against the server's than in other regions.
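For those who want to check the arithmetic, here it is spelled out; the figures are the ones quoted above, and the variable names are mine.

```python
# Life-cycle footprints quoted above, in kg CO2e.
pixel_7 = 70            # Google Pixel 7, over 3 years
laptop_default = 304    # Dell Precision 3520, over 4 years
server_default = 7730   # Dell PowerEdge R640, over 4 years
laptop_france = 261     # same laptop, used in France
server_france = 1701    # same server, used in France

phones_per_server = server_default / pixel_7
yearly_phone_ratio = (server_default / 4) / (pixel_7 / 3)
laptops_per_server = server_default / laptop_default
laptops_per_server_france = server_france / laptop_france

print(f"smartphones per server: {phones_per_server:.0f}")
print(f"same, on yearly emissions: {yearly_phone_ratio:.0f}")
print(f"laptops per server: {laptops_per_server:.0f}")
print(f"server, default vs. France: {server_default / server_france:.1f}")
print(f"laptop, default vs. France: {laptop_default / laptop_france:.2f}")
print(f"laptops per server in France: {laptops_per_server_france:.1f}")
```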

Note that there are three tiers: end-users, datacenters, and networks. Network energy consumption however doesn't vary proportionally to the amount of data transferred, which means we as users of those networks don't have many levers on their footprint. That being said, data transmission is among the things that will drain the batteries of mobile devices, so reducing the amount of data you exchange on the network could have a more direct impact on the battery life of end-users' smartphones (even though what will drain the battery the most will more likely be the screen).

Taking action

So, what have we learned so far?

It's important that end users keep their devices longer,
we can't do much about networks,
the location (geographic region and datacenter) of servers matters a lot, more so than how and how much we use them.

Now, what can we do about it?

For servers, it's relatively simple: if you can, rent servers in energy efficient datacenters, and/or countries with low-carbon electricity; in addition, or otherwise, then of course optimize your server-side architecture and code. If you manage your own servers, avoid buying machines to let them sit idle: maximize their utilization.

Pick servers in carbon-neutral or low-carbon datacenters first, then optimize your architecture and code.

For the networks, our actions are probably limited to reducing data usage, not because it reduces immediate emissions (it doesn't), but to avoid the need for rapid expansion of the network infrastructure (I'm quoting Wim Vanderbauwhede here, from a private conversation).

For the end-users' devices, it's more complicated, but not out of reach: we want users to keep their devices as long as possible so, put differently, we must not be the reason they change their devices. There will always be people changing devices "for the hype" or on some scheduled basis (or just because the vendor stopped pushing security updates, some form of planned obsolescence, or because the device can't be repaired; two things laws could alleviate), but there are also many people who keep them as long as possible (because they're eco-conscious or can't afford purchasing a new device, or simply because they don't feel the need for changing something that's still fully functioning.) For those people, don't be the one to make them change their mind and cross the line.

Don't be the one that will make your users change their device.

This is something we won't ever be able to measure, as it depends on how people perceive the overall experience on their device, but it boils down to perceived performance. So by all means, optimize your mobile apps and web frontends, test on old devices and slow networks (even if only emulated), and monitor their real-user performance (e.g. through Web Vitals). As part of performance testing, have a look at electricity use, as it will both be directly associated with emissions to produce that electricity, and be perceptible by the user (battery drain). And don't forget to account for the app downloads as part of the overall perceived performance: light mobile apps that don't need to be updated every other day, frontend JS and CSS that can be cached and won't update several times a day either (defeating the cache).

Optimize for the perceived performance and battery life.

Don't forget about the space taken by your app on the user's device too: users shouldn't have to make a choice between apps due to no space left on device, so when possible prefer a website or progressive web app (PWA) to a native application (you can still publish them to application stores if required, through tiny wrapper native apps).

When possible, prefer a website or PWA to a native application.

A note to product managers

The above advice was mostly technical, answering the question What can I do as an architect or developer? but product managers have their share, and they're actually the ones in power here: they can choose which features to build or not build, they can shape the features, they can reduce software complexity by limiting the number of features and of levers and knobs. This will undoubtedly avoid bloat and help you make things leaner and faster.

Avoid feature creep and beware of Wirth's law.

Refrain from adding features, reduce software complexity.

Last, but not least, make sure you really need software! Sometimes you should embrace low-tech. For example, instead of developing a mobile app with accounts to identify the user so you can notify them, maybe you could simply use SMS (assuming you have some out-of-band means of knowing their phone number, and the latency of distribution is acceptable). And sometimes what you're trying to address with software just isn't worth it, particularly if it involves IoT (remember that we should strive for fewer devices that we keep longer, not more).

Sometimes, ideas aren't even worth their impacts.

Conversely, as we'll need to electrify parts of our economy to reduce their carbon footprint, software is one of the few sectors to start with a head start: we get greener at the same rate as the grid without other work needed (I'm quoting Alex Russell here, from a private conversation), so please do use software to digitalize and replace more carbon-intensive activities.

Other pitfalls

Besides only evaluating electricity consumption on your servers, another pitfall is trying to attribute emissions to each user or request: when you have dozens, hundreds or even thousands of concurrent requests, how do you distribute electricity consumption among them? There's an IETF proposal for an HTTP response header exposing such information, and while it's a commendable idea I doubt it's realistic. My personal belief is that displaying such information is often a sign of greenwashing. To my knowledge, data can only be accurate in aggregates.

If you really do want to show how green you are, conduct a life-cycle assessment (LCA): take all three scopes into account, all three tiers, evaluating impacts over more criteria than the global warming potential (GWP) alone.

Here are a couple resources if you want to go farther:

Thanks to Alex Russell and Wim Vanderbauwhede for their feedback.

http://blog.ltgt.net/climate-friendly-software/

Naming things is hard, SPA edition

Mar 28, 2023 Updated Mar 28, 2023

During the past few months, social networks have been shaken by a single-page vs multi-page applications (SPA vs MPA) battle, more specifically related to Next.js and React, following, among other things, a tweet by Guillermo Rauch and a GitHub comment by Dan Abramov.

I've read a few articles and been involved in a few discussions about those, and it appears that we don't all have the same definitions, so I'll give mine here and hope people rally behind them.

SPA vs MPA: it's about navigation

It's not that hard: a single-page application means that you load a page (HTML) once, and then do everything in there by manipulating its DOM and browser history, fetching data as needed. This is the exact same thing as client-side navigation and requires some form of client-side routing to handle navigation (particularly from history, i.e. using the back and forward browser buttons).
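As a rough illustration of what "some form of client-side routing" involves, here is a minimal sketch. The route table, the `matchRoute` and `installRouter` helpers and the `render` callback are hypothetical names, not taken from any particular framework; the browser wiring only relies on the standard History API.

```javascript
// Hypothetical route table: each entry maps a path pattern to a view name.
const routes = [
  { pattern: /^\/$/, view: "home" },
  { pattern: /^\/posts\/([^/]+)\/?$/, view: "post" },
];

// Find the first route matching a path; returns null when nothing matches.
function matchRoute(path) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    if (match) {
      return { view: route.view, params: match.slice(1) };
    }
  }
  return null;
}

// Browser-only wiring; assumes a `render(route)` function that updates the DOM.
function installRouter(render) {
  // Intercept same-origin link clicks and navigate without loading a new page.
  document.addEventListener("click", (event) => {
    const link = event.target.closest("a[href]");
    if (!link || link.origin !== location.origin) return;
    const route = matchRoute(link.pathname);
    if (!route) return; // unknown path: let the browser load the page
    event.preventDefault();
    history.pushState(null, "", link.href);
    render(route);
  });
  // Back and forward buttons: re-render from the current URL.
  window.addEventListener("popstate", () => render(matchRoute(location.pathname)));
}
```

Note everything this sketch leaves out: loading feedback, error handling, focus management, scroll restoration, modified clicks. Those are exactly the things a browser does on its own when navigation loads a new page.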

Conversely, a multi-page application means that each navigation involves loading a new page.

SPA means you load a page once then navigate by manipulating the DOM and history. MPA means that each navigation involves loading a new page.

This by itself is a controversial topic: despite SPAs having lots of problems (user experience –aborting navigation, focus management, timing of when to update the URL bar–, accessibility, even performance by not being able to leverage streaming) due to taking responsibility and having to reimplement many things from the browser (loading feedback, error handling, focus management, scrolling), some people strongly believe this is “one of the first interesting optimizations” and they “can’t really seriously consider websites that reload page on every click good UX” (I've only quoted Dan Abramov from the React team here, but I don't want to single him out: he's far from being alone with this view; others are in denial thinking that “this is the strategy used by most of the industry today”). Some of those issues are supposedly (and hopefully) fixed by the new navigation API that's currently only implemented in Chromium browsers. And despite their many advantages, MPAs aren't free from limitations either, otherwise we probably wouldn't have had SPAs to begin with.

My opinion? There's no one-size-fits-all: most sites and apps could (and probably should) be MPAs, and an SPA is a good (and better) fit for others. It's also OK to use both MPA and SPA in a single application depending on the needs. Jason Miller published a rather good article 4 years ago (I don't agree with everything in there though). Nolan Lawson also has written a good and balanced series on MPAs vs SPAs.

And we haven't even talked about where the rendering is done yet!

Rendering: SSR, ESR, SWSR, and CSR

Before diving into where it's done, we first need to define what rendering is.

My definition of rendering is applying some form of templating to some data. This means that getting some HTML fragment from the network and putting it into the page with some form of innerHTML is not rendering. Conversely, getting some virtual DOM as JSON for example and reconstructing the equivalent DOM from it would qualify as rendering.

Rendering is applying some form of templating to some data.
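To make the distinction concrete, here is a small sketch under assumed names (`renderComment`, `insertFragment` and the comment data shape are made up for illustration). The first function applies a template to data, so it is rendering by the definition above; the second only inserts HTML that was rendered elsewhere.

```javascript
// Escape text so it can be safely placed inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Rendering: a template applied to data (hypothetical comment shape).
function renderComment(comment) {
  return `<li><b>${escapeHtml(comment.author)}</b>: ${escapeHtml(comment.text)}</li>`;
}

// Not rendering by this definition: the HTML was produced somewhere else
// (origin server, edge, service worker) and is only inserted as-is.
function insertFragment(container, htmlFragment) {
  container.innerHTML = htmlFragment;
}
```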

Now that we've defined what rendering is, let's see where it can be done: basically at any stage of delivery, be it the origin server (SSR), the edge (ESR), a service worker (SWSR), or the client (CSR).

There's also a whole bunch of prerendering techniques: static site generation (SSG), on-demand generation, distributed persistent rendering (DPR), etc.

All these rendering stages, except client-side rendering (CSR), generate HTML to be delivered to the browser engine. CSR will however directly manipulate the DOM most of the time, but sometimes will also generate HTML to be used with some form of innerHTML; the details here don't really matter.

Rendering at the origin server or at the edge (Cloudflare Workers, Netlify Functions, etc.) can be encompassed under the name server-side rendering (SSR), but depending on the context SSR can refer to the origin server only. Similarly, rendering in a service worker could be included in client-side rendering (CSR), but most of the time CSR is only about rendering in a browsing context. I suppose we could use browser-side rendering (BSR) to encompass CSR and SWSR.

Schema of SSR, ESR, SWSR and CSR, with grouping representing SSR-in-the-broader-sense (SSR and ESR) vs. BSR (SWSR and CSR), and which generate HTML (SSR, ESR and SWSR) or manipulate the DOM (CSR)

As noted by Jason Miller and Addy Osmani in their Rendering on the Web blog post, applications can leverage several stages of rendering (SSR used in the broader sense here), but like many they conflate SPA and CSR. Eleventy (and possibly others) also allows rendering a given page at different stages, with parts of the page prerendered at build-time or rendered on the origin server, while other parts will be rendered at the edge.

What does that imply?

My main point is that rendering is almost orthogonal to single-page vs multi-page: an SPA doesn't imply CSR.

SPA doesn't necessarily imply CSR.

Most web sites are MPAs with SSR, sometimes ESR.
Most React/Vue/Angular applications are SPAs with CSR: the HTML page is mostly empty, generally the same for every URL, and the page loads data on boot and renders it (at the time of writing, the Angular website is such an SPA+CSR).
Next.js/Gatsby/Remix/Nuxt/Angular Universal/Svelte Kit/Solid Start/îles applications are SPAs with SSR and CSR: data is present as HTML in the page, but navigations then use CSR staying on the same page (and actually, despite the content being present in the HTML page, those frameworks will discard and re-render it client-side on boot).
Qwik City/Astro/Deno Fresh/Enhance/Marko Run applications are MPAs with SSR (and CSR as needed through islands of interactivity); Qwik City provides an easy way to switch to an SPA with SSR and CSR (though contrary to the above-mentioned frameworks, Qwik City won't re-render on page load).
Hotwire Turbo Drive (literally HTML over the wire; formerly Turbolinks) and htmx applications are SPAs with SSR.
GitHub is known for its use of Turbolinks and is actually both MPA and SPA, depending on pages and sometimes navigation (going from a user profile to a repository loads a new page, but the reverse is a client-side navigation).

Some combinations aren't really useful: an MPA with CSR (and without SSR) would mean loading an almost empty HTML page at each navigation to then fetch data (or possibly get it right from the HTML page) and do the rendering. Imagine the Angular website (which already makes the dubious choice, for a documentation site, of not including the content in the HTML page) but where all navigations would load a new (almost empty) page.

Similarly, if you're doing an SPA, there's no real point in doing rendering in a service worker as it could just as well be done in the browsing context; unless maybe you're doing SPA navigation only on some pages/situations (video playing?) and want to leverage SWSR for all pages including MPAs?

Other considerations

In an application architecture, navigation and rendering locality aren't the only considerations.

Inline updates

Not every interaction has to be a navigation: there are many cases where a form submission would return to the same page (reacting to an article on Dev.to, posting a comment, updating your shopping cart), in which case progressive enhancement could be used to do an inline update without a full page refresh.

Those are independent from SPAs: you can very well have an MPA and use such inline updates. Believe it or not, this is exactly what Dev.to does for their comment form (most other features like following the author, reacting to the post or a comment, or replying to a comment however won't work at all if JavaScript is somehow broken).
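Such an inline update could be sketched as follows. This is not how Dev.to does it; `enhanceForm`, the `onSuccess` callback and the `X-Requested-With` header value are assumptions for illustration, and the form is assumed to use the POST method. Without JavaScript, the form simply submits and the server answers with a full page.

```javascript
// Progressively enhance a form: when this script is not loaded (or broken),
// the form still works through a regular full-page submission.
function enhanceForm(form, onSuccess) {
  form.addEventListener("submit", async (event) => {
    event.preventDefault(); // take over only once the script is running
    try {
      const response = await fetch(form.action, {
        method: form.method,
        body: new FormData(form),
        // Hypothetical hint so the server can answer with a fragment only.
        headers: { "X-Requested-With": "fetch" },
      });
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      onSuccess(await response.text()); // e.g. update the comment list in place
    } catch (error) {
      form.submit(); // fall back to the regular, non-enhanced submission
    }
  });
}
```

Calling `form.submit()` in the fallback does not fire the `submit` event again, so the handler cannot loop.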

Concatenation and Includes

Long before we had capable enough JavaScript in the browser to build full-blown applications (in the old times of DHTML, before AJAX), there already were optimization techniques on the servers to help build an HTML page from different pieces, some of which could have been prerendered and/or cached. Those were server-side includes and edge-side includes.

While they are associated with specific syntaxes, the concepts can be used today in edge functions or even in service workers.

The different parts being concatenated/included this way can be themselves static or prerendered, or rendered on-demand. Actually the above-mentioned feature of Eleventy where parts of a page are server-rendered or prerendered and other parts are rendered at the edge is very similar to those includes as well.

http://blog.ltgt.net/naming-things-is-hard-spa-edition/

Migrating from Jekyll to Eleventy

Mar 12, 2023 Updated Mar 12, 2023

Yes, this is going to be yet another one of those articles explaining how I migrated this blog from Jekyll to Eleventy. You've been warned.

Why?

I don't really have issues with Jekyll and I've been using it for 10 years now here, but I haven't really chosen Jekyll: it's been more-or-less imposed on me by GitHub Pages. But GitHub now has added the possibility to deploy using a custom GitHub Actions workflow, and this is a game-changer!

I could have kept using Jekyll with unlocked possibilities, but I'm not a Rubyist, that's just not a language I'm comfortable with, and I know almost nothing about Gems, so definitely not something I'd be comfortable maintaining going forward.

I also could have just kept using the built-in Jekyll Pages integration, and this is what I would have done if I hadn't found any satisfying alternative. I'm not forced to change, so at least I have a fallback in the form of the status quo.

So what would replace it? Let's evaluate my requirements.

The Requirements

I have articles written in HTML (exports from Posterous) and Markdown, using a bit of Liquid to link to other articles (with the post_url Jekyll tag). The Markdown articles use GitHub Flavored Markdown, including syntax-highlighted fenced code blocks, with embedded HTML. Ideally I shouldn't have to update the articles at all.
I only have 4 templates (index.html, rss.xml, and default and post layouts) so migrating to another templating engine wouldn't really be a problem. The index.html uses pagination (even though I still only have a single page). The default layout builds a Content Security Policy using flags from the articles' front matter.
I also have a few static files: CSS, JS, and images (and a file to verify ownership for the Google Search Console).
Of course, because cool URIs don't change, the permalinks have to be ported to the new solution.
I hadn't identified it at first, but I actually have an old article that's not published, through Jekyll's published: false in the front matter. In the worst case, I'd just delete it (it'd still be there in the Git history).
Nice to have: I kinda like Jekyll's _drafts folder using the file's last modified date, and _posts folder with the publication date as part of the file name. (I don't commit my drafts, and yes that means I don't have backups; I don't have many drafts, and I'll probably never finish and publish them so 🤷)
Of course I want something I'm comfortable using for the next 10 years, in terms of technology and ecosystem. This means essentially that I'd like a Node-based solution.
Last, but not least, I want the output to be (almost) identical (for now at least) to the Jekyll site: must be static HTML, with <script>s added by the layouts and possibly right from the articles, no client-side hydration or upgrading to a Single Page Application.

The choice

The HTML-first approach rules out (a priori, correct me if I'm wrong) every React or Vue based approach, or similar.

I've quickly evaluated a couple alternatives, namely Astro and Eleventy.

Astro is fun, but I must say it doesn't really look content oriented, relegating the content into its src/pages, or worse, a subfolder inside src/content/. I really like the typesafe nature of content collections, but moving everything down to src/content/blog really hides the content away IMO. Extracting the publication date from the file name is possible, but it looks more and more like a development project rather than a content project. It's great, but not what I'm looking for here.

I then looked at Eleventy. I have to admit my first contacts with the Eleventy documentation months ago left me with a bitter taste as I couldn't really figure out how collections worked and how you were supposed (or not) to organize your files. Looking at tweetback more recently didn't really help: absolutely everything is JS, loading content from a SQLite database.

I decided to give it a chance: maybe I misunderstood the documentation the last time(s) I read it. And indeed it was the case: moving from Jekyll to Eleventy probably couldn't be easier.

How?

I felt my way a bit, so I'll summarize here what I ended up doing, also describing some things I tried along the way.

Getting Started

Removing Jekyll consists in deleting the _config.yml and possibly the Gemfile (I didn't have one). Adding Eleventy means initializing a new NPM package and adding the @11ty/eleventy dependency (and of course adding node_modules to the .gitignore), and creating a configuration file (I chose eleventy.config.cjs rather than the .eleventy.js hidden file).

Because the deployment workflow is different, the CNAME file becomes useless and can be deleted. A new GitHub Actions workflow also has to be created, using the actions/configure-pages, actions/upload-pages-artifact, and actions/deploy-pages actions. I took inspiration from the Astro starter workflow and updated it for Eleventy.

Markdown

Eleventy supports Markdown out of the box, with all the options I needed, except syntax highlighting and heading anchors for deep linking. It also automatically extracts the date from the file name.

Syntax highlighting is as easy as using the official plugin, but then the generated HTML markup is different than with the Rouge highlighter in Jekyll, so I had to change the CSS accordingly. I ended up importing an existing theme: display would be slightly different than before, but actually probably better looking.

Deep linking requires using the markdown-it-anchor plugin, and to make sure existing deep links wouldn't break I provided my own slugify function mimicking the way CommonMarkGhPages computes the slug from the heading text (I happen to have a few headings with <code> in them, and CommonMarkGhPages would compute the slug from the rendered HTML leading to things like codejavaccode; I chose to break those few links in favor of better-looking anchor slugs). I also disabled tabIndex to keep the same rendering as previously (I'll read more on the accessibility implications and possibly revert that choice later.)
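A sketch of what such a setup might look like, assuming Eleventy 2.0 (for `amendLibrary`) and the `slugify` and `tabIndex` options of markdown-it-anchor. The slug rules below are a guess at GitHub-style slugs and have not been checked against CommonMarkGhPages; `configureMarkdown` is a made-up helper name.

```javascript
// Assumed GitHub-style slug rules (not verified against CommonMarkGhPages):
// lowercase, drop punctuation, then turn each space into a hyphen.
function slugify(headingText) {
  return headingText
    .trim()
    .toLowerCase()
    .replace(/[^\p{L}\p{N}_\- ]/gu, "")
    .replace(/ /g, "-");
}

// Wiring sketch: `markdownItAnchor` would be require("markdown-it-anchor"),
// and this would be called from the function exported by eleventy.config.cjs.
function configureMarkdown(eleventyConfig, markdownItAnchor) {
  eleventyConfig.amendLibrary("md", (mdLib) =>
    mdLib.use(markdownItAnchor, { slugify, tabIndex: false })
  );
}
```

With these rules, a heading such as "What does that imply?" gets the `what-does-that-imply` anchor.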

I reimplemented the post_url first as a custom short code but that meant updating all articles to quote the argument (due to how Eleventy wires things up), so I ended up using a custom tag; that's specific to the Liquid template engine (in case I would want to change later on) but at least I don't have to update the articles.
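For illustration, a custom tag along those lines might look like the sketch below. It follows the custom tag shape from the Eleventy documentation (`addLiquidTag` with `parse` and `render`), but `resolvePostUrl`, `addPostUrlTag`, the matching on file names and the existence of a `posts` collection are assumptions, not the actual implementation.

```javascript
// Find the URL of the post whose source file name (without extension)
// matches `name`, e.g. "2023-03-12-from-jekyll-to-eleventy".
function resolvePostUrl(posts, name) {
  const post = posts.find((item) => {
    const fileName = item.inputPath.split("/").pop();
    return fileName.replace(/\.[^.]+$/, "") === name;
  });
  if (!post) throw new Error(`post_url: no post named "${name}"`);
  return post.url;
}

// Wiring sketch, to be called from the Eleventy configuration function.
function addPostUrlTag(eleventyConfig) {
  eleventyConfig.addLiquidTag("post_url", function (liquidEngine) {
    return {
      parse(tagToken) {
        // The raw, unquoted argument, as written in the Jekyll articles.
        this.name = tagToken.args.trim();
      },
      async render(scope) {
        // Assumes a "posts" collection is available in the template data.
        const posts = await liquidEngine.evalValue("collections.posts", scope);
        return resolvePostUrl(posts, this.name);
      },
    };
  });
}
```

Because the tag receives its argument as raw text, the articles can keep writing the post name without quotes.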

In terms of rendering, besides syntax highlighting, the only difference is that line breaks are now rendered as <br> rather than <br /> (there's an option in markdown-it but I'll keep the less XHTML-y, more HTML-y syntax).

The rss.xml file wouldn't be treated as a template by default, so I aliased the xml extension to the Liquid engine, and added an explicit permalink: to avoid Eleventy creating an rss.xml/index.html file. I did the same with the css extension so I could use an include to bring in the syntax-highlighting theme in my style.css.
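In configuration terms, the aliasing could look like this sketch, assuming Eleventy 2.0 (where `addExtension` with only a `key` creates an alias); `configureTemplateFormats` is a made-up helper that would be called from the function exported by the configuration file.

```javascript
function configureTemplateFormats(eleventyConfig) {
  // Have Eleventy pick up *.xml and *.css files as templates...
  eleventyConfig.addTemplateFormats(["xml", "css"]);
  // ...and process both with the Liquid engine.
  eleventyConfig.addExtension("xml", { key: "liquid" });
  eleventyConfig.addExtension("css", { key: "liquid" });
}
```

Each aliased file then needs its own `permalink:` in its front matter (for example `permalink: rss.xml`), otherwise Eleventy writes it as `rss.xml/index.html`.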

Liquid Templating

I had to rename my layout files to use a .liquid extension rather than .html. I didn't want to move them though, so I configured a layouts directory instead.

I also had to handle all the Jekyll-specific things I was using: xml_escape, date_to_xmlschema, date_to_string, and date_to_long_string filters, and the site.time and site.github.url variables (we already handled the post_url tag above).

At first, I tried to recreate them in Eleventy (which is easy with custom shortcodes and global data files), but finally decided that I could replace most with more standard Liquid that would be compatible right away with LiquidJS: xml_escape becomes escape, date_* become date: with the appropriate format (this made it possible to fix my <time> elements erroneously including the time), and site.time becomes "now" or "today" with the date filter. I put that in a separate commit as that's compatible with Jekyll Liquid as well. All that's left is therefore site.github.url, which can be put in a global data file (a JS file getting the value out of an environment variable, fed by the actions/configure-pages output in the GitHub Actions workflow).
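Such a global data file could be as small as the sketch below. The `_data/site.js` location follows Eleventy's default data directory, while the `SITE_URL` variable name and the `siteData` helper are assumptions for illustration.

```javascript
// Hypothetical _data/site.js: exposes `site.github.url` to the templates.
// SITE_URL is an assumed variable name, to be set by the deployment workflow.
function siteData(env = process.env) {
  return {
    github: {
      url: env.SITE_URL || "", // empty when building locally
    },
  };
}

// In the actual data file, this would end with:
// module.exports = siteData();
```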

Finally, I actually had to update all templates to use Eleventy's way of handling pagination, and looping over collections.

Speaking of collections, I initially used directory data files to assign a post tag to all posts in _posts and _drafts. This didn't handle the published: false, so I used a custom collection in the configuration file instead. I probably could have also used a computed eleventyExcludeFromCollections to exclude it, but this also helped fix an issue with the sort order and apparently a bug in LiquidJS's for loop with both reversed and limit: where it would limit before reversing whichever way I wrote things, contrary to what the doc says.
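A custom collection of that kind might be declared as in this sketch. `publishedPosts` and `addPostsCollection` are hypothetical names and the `_posts/*` glob is an assumption; sorting newest first in JavaScript is one way to avoid relying on `reversed` and `limit:` in the Liquid loop.

```javascript
// Keep only published posts, newest first.
function publishedPosts(items) {
  return items
    .filter((item) => item.data.published !== false)
    .sort((a, b) => b.date - a.date);
}

// Wiring sketch, to be called from the Eleventy configuration function.
function addPostsCollection(eleventyConfig) {
  eleventyConfig.addCollection("posts", (collectionApi) =>
    publishedPosts(collectionApi.getFilteredByGlob("_posts/*"))
  );
}
```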

One last change I made: update the Content Security Policy to account for the Eleventy dev mode autoreload; I used eleventy.env.runMode != "build" to detect when run with autoreload.

Static Files

Contrary to Jekyll where any file without front matter is simply copied, static files have to be explicitly declared with Eleventy. I also had to ignore those HTML files I needed to just copy without processing.
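As a sketch, declaring static files could look like the following. The `js`, `images` and `google*.html` paths are placeholders, not the blog's actual layout, and `configureStaticFiles` is a made-up helper to be called from the Eleventy configuration function.

```javascript
function configureStaticFiles(eleventyConfig) {
  // Directories copied as-is to the output (placeholder names).
  for (const path of ["js", "images"]) {
    eleventyConfig.addPassthroughCopy(path);
  }
  // HTML files that must be copied verbatim (placeholder pattern):
  // ignore them as templates, then copy them through.
  eleventyConfig.ignores.add("google*.html");
  eleventyConfig.addPassthroughCopy("google*.html");
}
```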

Permalinks

Permalinks for the rss.xml and style.css are defined right in those files' front matter. The index.html uses pagination so I declared a mapping there as well.

Finally I decided to compute the permalink for posts right in the front matter of the post layout, using page.fileSlug, which gives me exactly what I want (the date part has already been removed by Eleventy). Using a JS front matter allowed me to filter out the published: false article so it wouldn't ever be rendered to disk (I already excluded it from the posts collection, but Eleventy would still process and render it).
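The logic boils down to something like this sketch. `postPermalink` is a hypothetical function, and how exactly it gets wired into the layout's JS front matter is left out; it relies on Eleventy treating `permalink: false` as "do not write this page to disk".

```javascript
// Compute the permalink of a post from Eleventy's data cascade.
function postPermalink(data) {
  // `false` tells Eleventy not to write the page to disk.
  if (data.published === false) return false;
  // fileSlug no longer contains the date prefix of the file name.
  return `/${data.page.fileSlug}/`;
}
```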

Drafts

To handle drafts, I'm using the getFilteredByGlob function when declaring the posts collection, so I can decide whether to include the _drafts folder depending on an environment variable. This would include the drafts in the posts collection so they would appear in the index.html and rss.xml.

More importantly though, when not including drafts, I have to ignore the _drafts folder, otherwise the drafts are still processed and generated (despite not being linked to as they don't appear in the posts collection). This is actually not really a problem given that I don't commit drafts to my Git repository, so I would observe this behavior only locally.
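Both aspects could be sketched as follows. The `WITH_DRAFTS` variable name, the globs and the helper names are assumptions; `getFilteredByGlob` and `ignores.add` are the Eleventy APIs mentioned or implied above.

```javascript
// Globs for the posts collection, with drafts only on demand.
// WITH_DRAFTS is a hypothetical environment variable name.
function postGlobs(env) {
  const globs = ["_posts/*"];
  if (env.WITH_DRAFTS) globs.push("_drafts/*");
  return globs;
}

// Wiring sketch, to be called from the Eleventy configuration function.
function configureDrafts(eleventyConfig, env = process.env) {
  if (!env.WITH_DRAFTS) {
    // Otherwise drafts would still be processed and written to the output.
    eleventyConfig.ignores.add("_drafts/");
  }
  eleventyConfig.addCollection("posts", (collectionApi) =>
    collectionApi.getFilteredByGlob(postGlobs(env))
  );
}
```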

Comparing the results

To make sure the output was identical to the Jekyll-based version, I built the site once with Jekyll before any modification and backed up the _site folder; then compared it with the output of Eleventy to make sure everything was OK.

Conclusion

As I felt my way and learned about Eleventy, this took me nearly two weekends to complete (not full time, don't worry!) What took me the most time actually was probably finding (and deciding on) the new syntax-highlighting theme! Otherwise, things went really smoothly.

I'm very happy with the outcome, so I switched over. And now that I control the build workflow, I know I could set up an asset pipeline, minify the generated HTML, bring in more Eleventy plugins to split the syntax-highlighting theme out and only send it when there's a code block on the page, etc.

A big would recommend!

http://blog.ltgt.net/from-jekyll-to-eleventy/