
raganwald.com

New Yorker Cartoon, c. 2025

"We're impressed with the programming assignment you submitted, and would like to hire the ChatGPT Whisperer who created it."

https://raganwald.com/2023/04/27/chatgpt-whisperer
Cryptographic techniques used by nCrypt Light in 1994

Newton MessagePad 100, circa 1993

Preface (2023)

The following describes the cryptographic protocol and algorithm used by nCrypt Light back in 1993-94. I wrote nCrypt Light in the hope of creating a strong cryptography app for the original Newton MessagePad 100. Rolling your own crypto is well-understood to be the complete opposite of implementing secure cryptography, so this is presented purely for nostalgia and amusement purposes.

The MessagePad at that time had messaging and email if connected to a network of some kind, but it did not have a strong, secure protocol for ensuring the privacy of messages or documents. My business partner and I thought there might be a market for encrypting messages much as Telegram or Signal do today. Our first goal was to ship a symmetrical encryption app (where sender and receiver must share a passphrase as the secret), and he hoped to write a public-key app later.

Between other business commitments, quickly discovering how little I understood about cryptography, and the fact that the Newton did not become the next big platform, we never wrote another line of Newton code. I lost the source code (written in NewtonScript), but from time to time I look around and discover it’s still possible to download the app from FTP sites.

On the plus side, we uploaded it to various sites, one of which was CompuServe. A few months later, I received a cheque: Someone had not just downloaded our app, but paid us the optional USD 25 for it! That was the very first time something I wrote “on spec” for the world to try, made any money, and I still have the original physical cheque in my memory box.

And here is the original text, which has been hosted for all these years by Peter Conrad. You can find it here.


The description below is © 1994 CustomWare, Inc. and Reginald Braithwaite-Lee. That copyright overrides the Creative Commons license that applies to the remainder of this blog. Note the use of pseudocode in the description: At the time, we were concerned that sharing source code directly could place us in jeopardy of the laws of the time, which considered cryptography a munition covered by strict export restrictions backed by draconian penalties for noncompliance.


Introduction

nCrypt Light (“nCrypt”) is a password protection application for the Apple Newton. A derivative application, incorporating public key cryptographic techniques, is currently being developed. nCrypt Light is an enabling architecture; cryptographic protocols are “installed” into nCrypt, and those protocols become available for users.

nCrypt’s architecture is composed of layers. At the highest level is the nCrypt application. nCrypt makes use of protocols. Protocols implement an algorithm and supporting procedures such as key generation, message padding, and error detection. Protocols make use of algorithms. At this time, nCrypt includes one protocol, the Alternating Stop and Go Generator (“Stop & Go”), built-in. Another, an implementation of Bruce Schneier’s Blowfish, is available as a “drop-in” module.

This document describes the Stop & Go protocol, with an emphasis on providing details useful to cryptanalysts searching for weaknesses. Notes indicate where nCrypt’s implementation of Stop & Go differs from this description. Two future documents are planned: The first will describe nCrypt’s binary to text translation formats and the second will describe how developers may write other “drop-in” algorithms.

Both nCrypt (the Newton application) and the cryptographic algorithms used by nCrypt are works-in-progress. This document is published in order to foster analysis which will improve the specific cryptographic techniques used in nCrypt as well as to improve Cryptography in general.

The Stop & Go protocol is constructed from independent modules. The most basic, the Secure Message Authentication Code (“SMAC”), is a key-dependent variant of NIST’s Secure Hash Algorithm (“SHA”). SMAC is used to build the Stop & Go algorithm. The keys needed for the Stop & Go protocol are generated from passphrases using a salt and SHA.

CustomWare hereby places its interest in the cryptographic techniques used in Stop & Go (“SMAC” and “Stop & Go”) in the public domain, with no restrictions on their use (CustomWare does not indemnify users from any consequences of such use: before proceeding, perform a thorough patent search and consult qualified legal counsel). nCrypt (the application) remains the proprietary property of CustomWare and may only be used and distributed as allowed by the license agreement accompanying nCrypt.

The Secure Message Authentication Code

The Secure Message Authentication Code (“SMAC”) is a modification of the Secure Hash Algorithm (“SHA”).[^1] SHA is an 80 round hash which securely compresses a 512 bit value to a 160 bit result. It is conjectured that:

  • Given any message, 2^160 operations are required to discover another message which generates the same hash result.
  • 2^80 operations are required to find any two messages which generate the same hash value.

SHA makes use of five 32 bit buffers which are initialized to 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, and 0xC3D2E1F0. The input material is repeatedly mixed into the buffers, then the original buffer contents are added (modulo 2^32) to the result. When more than one 512 bit block is to be hashed, the result of one block’s hash is used as the buffer for the next.

SMAC makes use of a 160 bit key, which is divided into five 32 bit buffers. The key is used instead of SHA’s initial buffer values. Every time SMAC is used to compress a 512 bit value to a 160 bit result, the internal buffers are reset to the key value. No other modification has been made to the algorithm. SMAC’s keys are defined either as values produced by hashing bit strings with SHA, or as output from SMAC.

We conjecture that setting the internal buffers to cryptographically secure values produced by SHA or SMAC produces a hash which is as secure as SHA. Our reasoning is that this operation is equivalent to prepending SMAC’s inputs with key material before hashing.
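
As an illustration of the idea, here is a minimal JavaScript sketch of a keyed compression function. It is not the original NewtonScript. It assumes SHA-1's compression function as a stand-in for the 1994 SHA (which may differ in its message schedule), and the names `compress`, `smac`, and `toHex` are invented for this example.

```javascript
// Sketch only: SHA-1's compression function stands in for the 1994 SHA.
const SHA_INITIAL_BUFFERS = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0];

// Rotate a 32-bit value left by n bits, returning an unsigned result.
const rotl = (x, n) => ((x << n) | (x >>> (32 - n))) >>> 0;

// Compress one 512-bit block (sixteen 32-bit words) starting from five
// 32-bit buffers. With SHA_INITIAL_BUFFERS this is one block of SHA-1;
// with a key in their place it behaves like the SMAC described above.
function compress(buffers, block) {
  const w = block.slice();
  for (let t = 16; t < 80; t++) {
    w[t] = rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
  }
  let [a, b, c, d, e] = buffers;
  for (let t = 0; t < 80; t++) {
    let f, k;
    if (t < 20) { f = (b & c) | (~b & d); k = 0x5a827999; }
    else if (t < 40) { f = b ^ c ^ d; k = 0x6ed9eba1; }
    else if (t < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
    else { f = b ^ c ^ d; k = 0xca62c1d6; }
    const temp = (rotl(a, 5) + f + e + k + w[t]) >>> 0;
    e = d; d = c; c = rotl(b, 30); b = a; a = temp;
  }
  // Add the original buffer contents back in, modulo 2^32.
  return [a, b, c, d, e].map((v, i) => (v + buffers[i]) >>> 0);
}

// SMAC: every call starts from the key, never from a previous block's result.
const smac = (key, block) => compress(key, block);

// Render five 32-bit words as 40 hex digits.
const toHex = (words) => words.map((w) => w.toString(16).padStart(8, '0')).join('');

// Usage: a hypothetical key and an all-zero 512-bit register.
const exampleKey = [0x01234567, 0x89abcdef, 0xfedcba98, 0x76543210, 0x0f1e2d3c];
console.log(toHex(smac(exampleKey, new Array(16).fill(0))));
```

Passing `SHA_INITIAL_BUFFERS` as the first argument turns `compress` back into a single block of ordinary SHA-1, so the sketch can be checked against a published SHA-1 test vector before trusting it with a key.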

The Alternating Stop and Go Generator Algorithm

The Alternating Stop and Go Generator (“Stop & Go”) is based on a stream cipher described by Schneier[^2] from a paper presented by Günther[^3]. Stop & Go uses three generators (“rL”, “rR”, & “rA”) constructed from SMACs. The output of Stop & Go is independent of the plaintext; it is a large generator which outputs 160 bits per round.

Each generator consists of a register and a SMAC function. The keys for all three generators in nCrypt are identical. If more key material is available, using independent keys could increase security. Also, generating separate keys by “expanding” a single key could increase security.

Each generator maintains its own 512 bit shift register. The registers are initially loaded as follows: rL is loaded with 0xFFFF...FFFF. rR is loaded with 0x0000...0000. rA is loaded with 0xAAAA...AAAA. (The generator shift registers should not be confused with SMAC’s internal register.) Stop & Go’s generators work by repeatedly “clocking:” clocking a generator generates a 160 bit result from the generator’s shift register. The shift register is then modified by discarding the first 160 bits and appending 160 bits of material.

Before any encryption is performed, the Stop & Go protocol requires that the plaintext be padded to a multiple of 512 bits. (The padding mechanism is unspecified; nCrypt’s implementation pads plaintext with a repeated byte value equal to the number of bytes of padding). Once the padding has been completed, all three generators are “clocked”. The results of rL & rR are XORed to produce a “mask.” The mask is XORed with the first block of plaintext to produce the first block of ciphertext.

For the second and each subsequent block of plaintext, a bit from the result of rA is used to determine whether to clock rL or rR. These “alternation” bits are taken from consecutive bit positions in rA’s result. The mask from the previous round is pushed through the generator’s shift register and a new result is obtained. The results from both generators (one new and one previous) are then XORed together to produce a new mask.

Should more than 160 blocks need to be encrypted, rA is clocked to produce 160 more bits for alternation. The first bit of rA’s new result is then used to clock rL or rR and the resulting mask is used for the next block of plaintext.
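
The clocking scheme described in this section can be sketched in JavaScript as follows. This is a rough illustration under several assumptions, not nCrypt's code: SHA-1 over key followed by register (via Node's `node:crypto`) stands in for SMAC, rA is assumed to feed its own result back into its register, alternation bits are read most significant bit first, and the mask is applied in 20 byte chunks. The names `smacStandIn`, `shift`, `masks`, and `crypt` are invented here.

```javascript
const { createHash } = require('node:crypto');

// ASSUMPTION: SHA-1 over key || register stands in for SMAC.
const smacStandIn = (key, register) =>
  createHash('sha1').update(key).update(register).digest();

// XOR two byte sequences, over the length of the first.
const xor = (a, b) => {
  const out = Buffer.alloc(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i];
  return out;
};

// Discard the first 160 bits (20 bytes) of a register and append 20 new bytes.
const shift = (register, material) => Buffer.concat([register.subarray(20), material]);

// Yield an endless sequence of 160-bit masks for the given key.
function* masks(key) {
  let regL = Buffer.alloc(64, 0xff);
  let regR = Buffer.alloc(64, 0x00);
  let regA = Buffer.alloc(64, 0xaa);

  // First block: clock all three generators.
  let resultL = smacStandIn(key, regL);
  let resultR = smacStandIn(key, regR);
  let resultA = smacStandIn(key, regA);
  regA = shift(regA, resultA); // ASSUMPTION: rA appends its own result
  let mask = xor(resultL, resultR);
  yield mask;

  for (let bit = 0; ; bit++) {
    if (bit === 160) {
      // Alternation bits exhausted: clock rA for 160 more.
      resultA = smacStandIn(key, regA);
      regA = shift(regA, resultA);
      bit = 0;
    }
    const alternation = (resultA[bit >> 3] >> (7 - (bit & 7))) & 1;
    if (alternation === 0) {
      regL = shift(regL, mask); // push the previous mask through rL
      resultL = smacStandIn(key, regL);
    } else {
      regR = shift(regR, mask); // push the previous mask through rR
      resultR = smacStandIn(key, regR);
    }
    mask = xor(resultL, resultR);
    yield mask;
  }
}

// Encrypt or decrypt: the mask stream does not depend on the plaintext,
// so the same function performs both operations.
function crypt(key, padded) {
  const out = Buffer.alloc(padded.length);
  const stream = masks(key);
  for (let i = 0; i < padded.length; i += 20) {
    xor(padded.subarray(i, i + 20), stream.next().value).copy(out, i);
  }
  return out;
}
```

Because the masks never depend on the message, calling `crypt` twice with the same key returns the original bytes.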

Generating Session Keys from Passphrases

Generating a Session Key

Session keys consist of five 32-bit values. Session keys should be unique for each message generated by the cipher. The Stop & Go protocol generates session keys for passphrase-protection by concatenating a passphrase with a unique salt and hashing the two together using SHA. The process is repeated a variable number of times and the result, a 160-bit value, is used as the session key. The number of iterations is not considered secure and is transmitted with the ciphertext.

The unique salt is the result returned by Newton’s TimeInSeconds() function, which is the number of seconds elapsed since Midnight, January 1, 1904. This number need not be secure and is transmitted with the ciphertext.

The Stop & Go protocol currently performs four iterations when encrypting, and can handle any number of iterations when decrypting. nCrypt’s implementation of the Stop & Go protocol does not perform a fast passphrase check. After decrypting the entire ciphertext, the Stop & Go protocol checks the last block for well-formed padding. CustomWare has experimented with a fast passphrase check and may incorporate this feature into a future version of the Stop & Go protocol.
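
A hedged JavaScript sketch of this key generation follows. The text does not say exactly what is fed into each repeated hash, so the sketch assumes every iteration after the first hashes the previous result alone. It also assumes SHA-1 from Node's `node:crypto` in place of the 1994 SHA, a UTF-8 passphrase, and a salt encoded as four big-endian bytes. `deriveSessionKey` is a name made up for this example.

```javascript
const { createHash } = require('node:crypto');

// Hash the concatenation of any number of byte strings with SHA-1.
const sha1 = (...parts) =>
  parts.reduce((hash, part) => hash.update(part), createHash('sha1')).digest();

// Derive a 160-bit session key from a passphrase and a numeric salt.
// ASSUMPTION: iterations after the first re-hash the previous result only.
function deriveSessionKey(passphrase, salt, iterations = 4) {
  const saltBytes = Buffer.alloc(4);
  saltBytes.writeUInt32BE(salt >>> 0); // ASSUMPTION: 32-bit big-endian salt
  let key = sha1(Buffer.from(passphrase, 'utf8'), saltBytes);
  for (let i = 1; i < iterations; i++) key = sha1(key);
  return key;
}

// Usage: the salt and iteration count travel with the ciphertext in the clear.
const exampleSalt = 2840000000; // a hypothetical TimeInSeconds() value
console.log(deriveSessionKey('correct horse battery staple', exampleSalt).toString('hex'));
```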

The Stop & Go protocol allows for variable effective keylengths from 8 to 160 bits (for practical reasons, nCrypt’s implementation of the Stop & Go protocol restricts keylengths to multiples of 8). In all cases, the Stop & Go protocol generates a 160 bit session key using the method described above. When encrypting, the Stop & Go protocol checks the desired length and performs one of two operations:

  • If 160 bits of effective keylength are required, the Stop & Go protocol uses the 160 bit session key as is and proceeds without further operations on the session key.
  • If fewer than 160 bits of effective keylength are required, the Stop & Go protocol truncates the session key to the desired length and hashes the truncated key with SHA. The resulting 160 bits are used as the session key.

The number of bits of effective keylength is not considered secure and is transmitted with the ciphertext.

nCrypt’s shareware implementation of the Stop & Go protocol only encrypts with 40 bits of effective keylength. It can decrypt messages created with any number of effective bits of keylength.
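
The keylength reduction can be sketched in a few lines of JavaScript. As before, this assumes SHA-1 from Node's `node:crypto` as a stand-in for the 1994 SHA, and `effectiveKey` is a hypothetical name. Lengths are limited to multiples of 8, as in nCrypt's implementation.

```javascript
const { createHash } = require('node:crypto');

// Reduce a 160-bit session key to `bits` of effective keylength.
function effectiveKey(sessionKey, bits) {
  if (bits < 8 || bits > 160 || bits % 8 !== 0) {
    throw new RangeError('keylength must be a multiple of 8 between 8 and 160');
  }
  // Full strength: use the session key as is.
  if (bits === 160) return sessionKey;
  // Otherwise keep only the leading bits and hash them back up to 160 bits.
  return createHash('sha1').update(sessionKey.subarray(0, bits / 8)).digest();
}

// Usage: the shareware limit of 40 effective bits.
const exampleSessionKey = Buffer.from(Array.from({ length: 20 }, (_, i) => i));
console.log(effectiveKey(exampleSessionKey, 40).toString('hex'));
```

The result is always 160 bits long, but with 40 effective bits an attacker only has to search the 2^40 possible truncated keys.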

Cryptanalytic Comments

The basis for the security of Stop & Go is the conjecture that guessing the initial buffer value given a chosen SMAC register and a known output is as difficult as guessing a SMAC register given a known initial buffer and a known output. Could this conjecture be false?

Peter Gutmann (pgut01@cs.aukuni.ac.nz) pointed out that rL and rR could enter the same state (i.e. their registers become identical) which would produce an extremely weak sequence. A possible improvement would be to check for this possibility and immediately clock again should this happen.

The XORing of rL and rR could assist an attacker using differential cryptanalysis.[^4]

Bill Stewart (bill.stewart@pleasantonca.ncr.com) suggested that the use of TimeInSeconds() as the salt value could be too limited a range of salt values; an attacker using a dictionary attack against a weak passphrase may not be sufficiently hampered if the range of possible salts is too small. A possible improvement would be to concatenate additional information to TimeInSeconds() such as the plaintext, a range of memory in the Newton, or other values and hash the result with SHA into a 160 bit salt. This may add significantly to the number of possible salt values and frustrate dictionary precomputation.

Bill Stewart also suggested incorporating passphrase material into each iteration of the session key generation process.
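
Both of Bill Stewart's suggestions are easy to picture in code. The JavaScript below is only one possible reading of them, assuming SHA-1 from Node's `node:crypto`. The names `widerSalt` and `deriveWithPassphrase` and the choice of extra salt material are invented for illustration.

```javascript
const { createHash } = require('node:crypto');

// Hash the concatenation of any number of byte strings with SHA-1.
const sha1 = (...parts) =>
  parts.reduce((hash, part) => hash.update(part), createHash('sha1')).digest();

// First suggestion: hash TimeInSeconds() together with other material
// (plaintext, a range of memory, ...) into a 160-bit salt.
const widerSalt = (timeInSeconds, ...extraMaterial) =>
  sha1(Buffer.from(String(timeInSeconds)), ...extraMaterial);

// Second suggestion: mix the passphrase into every iteration,
// not just the first.
function deriveWithPassphrase(passphrase, salt, iterations = 4) {
  const pass = Buffer.from(passphrase, 'utf8');
  let key = sha1(pass, salt);
  for (let i = 1; i < iterations; i++) key = sha1(pass, key);
  return key;
}

// Usage with hypothetical inputs.
const exampleWideSalt = widerSalt(2840000000, Buffer.from('some plaintext'));
console.log(deriveWithPassphrase('correct horse', exampleWideSalt).toString('hex'));
```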

The use of highly regular initial registers for rL, rR and rA could assist an attacker in cryptanalyzing the session key. Possible improvements would be:

  • Expand the session key and use key material for the initial registers. We conjecture that this approach necessitates splitting the key material into parts which are generated separately: the key material used to generate the initial registers should be independent of the key material used to generate the keys for each generator.
  • Abandon the SMAC functions and use SHA for all three generators, but expand the key material to fill the three registers with different initial states.
  • Devise a “chosen register” attack in which an attacker can discover information about a SMAC’s registers from the result when the attacker gets to choose the input. Determine the worst possible register values for this type of attack and use those for the initial states.

nCrypt is implemented on an Apple Newton MessagePad. The Newton’s “operating system” is completely dynamic and may relocate objects at any time, without warning. An attacker with access to a Newton used to generate ciphertext may be able to recover the plaintext or even the passphrase from RAM. For this reason, CustomWare suggests that nCrypt only be used to generate messages for secure transmission or storage on other systems.

PseudoCode
FUNCTION Stop&Go IS LAMBDA( paddedBinary, sessionKey, bitsInKey )
  BEGIN

    TEMPORARY i, j;
    TEMPORARY mask BECOMES nil;
    TEMPORARY resultL BECOMES nil;
    TEMPORARY resultR BECOMES nil;
    TEMPORARY alternationBits BECOMES -1;
    TEMPORARY shiftRegisterL BECOMES NEW-BINARY( 64 );
    TEMPORARY shiftRegisterR BECOMES NEW-BINARY( 64 );
    TEMPORARY shiftRegisterA BECOMES NEW-BINARY( 64 );

    FOR-EACH-VALUE-OF i BECOMES 0 TO 63 IN-STEPS-OF 2 DO BEGIN
        shiftRegisterL AT i BECOMES 0x0000;
        shiftRegisterR AT i BECOMES 0xFFFF;
        shiftRegisterA AT i BECOMES 0xAAAA
    END;

    IF bitsInKey < 160 THEN
      sessionKey BECOMES SecureHashOf( bitsInKey N-BITS-FROM sessionKey AT 0 BITS );

    TEMPORARY AA BECOMES 32-BITS-FROM sessionKey AT 0 BYTES;
    TEMPORARY BB BECOMES 32-BITS-FROM sessionKey AT 4 BYTES;
    TEMPORARY CC BECOMES 32-BITS-FROM sessionKey AT 8 BYTES;
    TEMPORARY DD BECOMES 32-BITS-FROM sessionKey AT 12 BYTES;
    TEMPORARY EE BECOMES 32-BITS-FROM sessionKey AT 16 BYTES;
    TEMPORARY cryptBinary BECOMES NEW-BINARY( length(paddedBinary) );
    TEMPORARY lenPadded BECOMES length(paddedBinary);

     FOR-EACH-VALUE-OF i BECOMES 0 TO lenPadded-20 IN-STEPS-OF 20 DO BEGIN

      IF alternationBits < 0 THEN BEGIN
        TEMPORARY resultA BECOMES SecureMACOf( AA, BB, CC, DD, EE, shiftRegisterA );
        DROP-20-BYTES-FROM shiftRegisterA;
        APPEND resultA TO shiftRegisterA;
        alternationBits BECOMES 159
      END;

      TEMPORARY switch BECOMES 1-BIT-FROM resultA AT alternationBits BITS;
      alternationBits BECOMES alternationBits - 1;
      IF switch = 0 THEN
        resultL BECOMES nil
      ELSE
        resultR BECOMES nil;

      IF LOGICAL-NOT resultL THEN BEGIN
        IF mask THEN BEGIN
          DROP-20-BYTES-FROM shiftRegisterL;
          APPEND mask TO shiftRegisterL
        END;
        resultL BECOMES SecureMACOf( AA, BB, CC, DD, EE, shiftRegisterL );
      END;

      IF LOGICAL-NOT resultR THEN BEGIN
        IF mask THEN BEGIN
          DROP-20-BYTES-FROM shiftRegisterR;
          APPEND mask TO shiftRegisterR
        END;
        resultR BECOMES SecureMACOf( AA, BB, CC, DD, EE, shiftRegisterR );
      END;

      mask BECOMES BITWISE-XOR( resultL, resultR );

      cryptBinary FROM i TO i+19 BECOMES BITWISE-XOR( mask, paddedBinary FROM i TO i+19 );

     END;

     RETURN cryptBinary

  END

Bibliography

  1. “SHA: The Secure Hash Algorithm,” William Stallings, Dr. Dobb’s Journal, April 1994, pp. 32-34.
  2. “Applied Cryptography,” Bruce Schneier, John Wiley & Sons, 1994, pp. 360-361.
  3. “Alternating Step Generators Controlled by de Bruijn Sequences,” C. G. Günther, Advances in Cryptology: EUROCRYPT ’87 Proceedings, Springer-Verlag, 1988, pp. 5-14.
  4. “Differential Cryptanalysis,” Eli Biham & Adi Shamir, Springer-Verlag, 1993.

https://raganwald.com/2023/01/10/ncrypt-lite
Mutual Recursion in Language

This is not a programming post.

loanwords

A loanword is a term taken from another language and used without translation; it has a specific meaning that (typically) does not otherwise exist in a single English word. Sometimes the word’s spelling or pronunciation (or both) is slightly altered to accommodate English orthography, but, in most cases, it is preserved in its original language.

Résumé (🇫🇷) and kindergarten (🇩🇪) are loanwords.

calques

A calque, on the other hand, is a word or phrase taken from another language but translated (either in part or in whole) to corresponding English words while still retaining the original meaning.

Forget-me-not calques ne m’oubliez mye (Old 🇫🇷). Beer garden calques biergarten (🇩🇪).

the grand conclusion

But wait: Calque is a word taken from French and used without translation, so calque is a loanword! Meanwhile, loanword calques German’s lehnwort.

Therefore… “Loanword” is a calque, and “calque” is a loanword!

https://raganwald.com/2022/11/03/mutual-recursion
The Inner Osborne Effect
Show full content

In software development, we talk a lot about software anti-patterns, how to recognize them, and how to extricate yourself from them via refactoring.

An anti-pattern is a common response to a recurring problem that is usually ineffective and risks being highly counterproductive. The term, coined in 1995 by computer programmer Andrew Koenig, was inspired by the book Design Patterns, which highlights a number of design patterns in software development that its authors considered to be highly reliable and effective.

The term was popularized three years later by the book AntiPatterns, which extended its use beyond the field of software design to refer informally to any commonly reinvented but bad solution to a problem.

Anti-patterns occur throughout businesses, not just in software code and architecture. One famous business anti-pattern is the Osborne Effect, which is when a company pre-announces a future product that will obsolete the product line it’s selling today.

Customers take notice and cancel their orders for the current product, waiting for the wonderful future in the company’s marketing brochure. This stalls the company’s momentum, kills the company’s revenue, and in extreme cases, drives it right out of business.

But the Osborne Effect isn’t just a marketing and sales anti-pattern. It’s also a product management anti-pattern, the Inner Osborne Effect.



The Product of Tomorrow

I worked for two different companies who inflicted the Inner Osborne Effect on themselves, and both did so in theatrically dramatic ways. Pull up a pew, I’ll describe what happened, and how it became a disaster. The following story is a mashup inspired by both sets of circumstances, with some artistic license taken.

Our semi-fictitious company began its life as a scrappy startup selling into mid- to enterprise-sized customers. It found early success, and powered by its sales-led revenues, grew to dominate its niche. Driven by closing deals, its long-term development was constantly derailed by “fire drills” to ship new features that were alleged to be keys to closing big contracts, or by bug fixes driven by whichever customer complained the loudest at contract renewal time.

Tech debt piled on faster than pounds at an all-you-can-eat pancake breakfast. Development velocity slowed, which increased the urgency of shipping features at the expense of writing quality code or refactoring existing problem code.

But there was another threat looming:

Because of the company’s success, each sale was paradoxically getting harder to make. It had already plucked the low-hanging fruit in its niche, so the remaining customers were those who had less of a fit for the company’s product. And each new feature generated less net new revenue than the preceding features, because with fewer customers remaining in the niche, each new feature unlocked a proportionally smaller share of net new revenue.

If the company was to continue to grow, it needed to escape its niche with a bold new effort.



Lumburgh

By this time, trust between the C-Suite and Engineering was at an all-time low. The executives didn’t understand why bugs were climbing and velocity was dropping. The Director of Engineering was articulate in explaining exactly what was going on, but Lumburgh, the VP of Marketing, would always interject with “Poppycock! Back when I worked at FamousCo, we had none of these problems!”

The C-Suite were receptive when Lumburgh approached them on the sly. “I can start a skunkworks, remotely, under my direct command, to build The Product of Tomorrow. It will be both a floor wax and a dessert topping!”

Engineering was strictly fire-walled off from The Product of Tomorrow Team, which would emerge from stealth RealSoonNow™️ to replace everything Engineering was working on today.

How did the engineers feel about this? They were already demoralized by the feature factory mindset. They had been complaining forever that if they weren’t given the time to build the product right, they’d never find the time to build it over, and now they felt punished for management’s choice to ignore their warnings.

Resignations followed, first a trickle, then a flow.

Meanwhile, the company’s Product Management group was in chaos. Every time they wanted to build something, they’d be stopped with “Hold up, that will be half the cost and twice the value with the Product of Tomorrow!”

The current product was eventually put on life support, because senior management had neither the will, nor the resources, to do anything except fix bugs while waiting for the Product of Tomorrow.



Whither The Product of Tomorrow?

I was long gone by the time the Product of Tomorrow was killed. Yes, the Product of Tomorrow failed, and the Inner Osborne Effect took the existing product with it. The Product of Tomorrow turned out to be demo-ware, carefully crafted to look good in front of the CEO, but lacking essential features “under the hood.”

The central cause of the failure was another anti-pattern, CEO as Customer. And it turns out that a company that can’t invest in the quality of its mainline product, won’t invest in the quality of its replacement, either. That’s a management anti-pattern, and starting over with new people and a blank piece of paper doesn’t change management. Only management can change management.



From Osborne 1 to MessagePad 100

But getting back to our company and its executives, the problem wasn’t that the executives bet on the Product of Tomorrow, but that they also announced to its Engineering and Product Management groups that they were betting everything on the Product of Tomorrow. And then they followed through by obstructing any attempt by Engineering or Product Management to invest in the original product.

Compare and contrast this approach with Apple, who having succeeded with Apple II, went on to make big bets on Apple III, Lisa, Macintosh, Newton, iPhone, and iPad.

Apple III and Lisa failed. Macintosh was underpowered and overpriced on launch. But Apple continued to invest in Apple II, which financed investing in Macintosh, which became its future. Macintosh financed betting on Newton, which failed, and iPhone, which succeeded. Now Apple’s betting on AppleTV, iPad and Apple Watch, which are “nice little businesses” compared to iPhone. iPad Pro is close to displacing Macintosh, but Apple is famously tight-lipped about its visions of the future, and only once, to my knowledge, made the grievous mistake of inflicting the Inner Osborne Effect on itself.

Could the Product of Tomorrow have succeeded? Maybe. But following Apple’s example, the way to do that would have been to make it fully independent, and continue to build the existing product as if the plans for the Product of Tomorrow did not exist. Call the Product of Tomorrow “research.” Or an “experiment.” Locate its team in Texaco Towers. But don’t call it the future that will obsolete the present. And especially don’t respond to that by choking all progress on the cash cow that is financing the company’s bets.

post scriptum

The Osborne Effect was not the sole cause of the failure of the Osborne Computer Company, and the Inner Osborne Effect I described above wasn’t the sole cause of either source company’s failures. But both companies would have had a fighting chance to succeed had they not pre-announced that they were going all-in on an all-singing, all-dancing piece of vapourware built outside of their core product development groups.

https://raganwald.com/2021/10/28/the-inner-osborne-effect
Remembering John Conway's FRACTRAN, a ridiculous, yet surprisingly deep language

On April 8, 2020, John Horton Conway developed symptoms of COVID-19. On April 11, 2020, he succumbed to the disease.[^1][^2][^3][^4]

Like so very, very many, I mourn Conway’s passing, and yet I also celebrate his life. I celebrate his accomplishments, I celebrate his curiosity, and I celebrate his skill at making important topics in mathematics engaging and interesting.

One of the finest examples of that skill is the programming language FRACTRAN, the subject of this essay.


Prelude

John Horton Conway in 1993


Conway touched my own life from early days. As I described in The Eight Queens Problem… and Raganwald’s Unexpected Nostalgia:

My mother had sent me to a day camp for gifted kids once, and it was organized like a university. The “students” self-selected electives, and I picked one called Whodunnit. It turned out to be a half-day exercise in puzzles and games, and I was hooked.

One of the things we talked about in “Whodunnit” was Conway’s Game of Life. I don’t recall playing with it much: There was a lot going on, and it’s entirely possible that I was too busy falling in love with Raymond Smullyan to have curiosity left over for John Conway.[^5][^6][^7]


Hashlife

An infinitely scrolling implementation of Conway’s Game of Life


I went on to rediscover Conway’s Game of Life several times in my life. Some years ago, I read William Poundstone’s The Recursive Universe: Cosmic Complexity and the Limits of Scientific Knowledge, and it literally blew my mind.

I learned a little about Game Theory. I spotted Games of Strategy: Theory and Applications in a library and picked it up, thinking it would help my Backgammon.

That led me to Conway’s On Numbers and Games, and via parallel paths, to Surreal Numbers. Like the Game of Life, Surreal Numbers keep popping up unexpectedly, reigniting my interest in how the way we represent data, affords or hinders working with that data.

The subject of numbers and representation leads us to FRACTRAN.[^8][^9]


Table of Contents

“Books” © Stewart Butterfield, 2012, Some Rights Reserved


Prelude

FRACTRAN

Marvin Minsky’s Magnificent Machines

Marvellous Minsky Machines

Gödel Numbering and Masterful Minsky Machines

On Equivalence

The Collatz Conjecture

Addenda


FRACTRAN

Only FRACTRAN has these star qualities

A fragment of John Horton Conway’s paper on FRACTRAN


In 1987, Conway contributed FRACTRAN: A SIMPLE UNIVERSAL PROGRAMMING LANGUAGE FOR ARITHMETIC to a special workshop on problems in communication and computation conducted in the summers of 1984 and 1985 in Morristown, New Jersey, and the summer of 1986 in Palo Alto, California.

FRACTRAN itself was not an important open problem in the field, but as the editors noted:

Perhaps the most entertaining of all the contributions is Conway’s fascinating article on FRACTRAN, a strange collection of numbers, which when operated on in a simple way, yield all possible computations. We begin with his article.

–Thomas M. Cover & B. Gopinath, “Open Problems in Communication & Computation”[^10]

our first fractran program

As Wikipedia notes, a FRACTRAN program is an ordered list of positive fractions together with an initial positive integer input n. The program is run by updating the integer n as follows:

  1. for the first fraction f in the list for which nf is an integer, replace n by nf
  2. repeat this rule until no fraction in the list produces an integer when multiplied by n, then halt.
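
The two rules above translate almost directly into JavaScript. This is a small sketch rather than the interpreter developed later in this post. It represents each fraction as a pair of BigInts, and the names `parse`, `step`, and `run` are chosen just for this example.

```javascript
// Turn '3/2, 5/7' into [[3n, 2n], [5n, 7n]].
const parse = (text) =>
  text.split(',').map((fraction) => fraction.trim().split('/').map(BigInt));

// Rule 1: find the first fraction f for which nf is an integer and
// return nf. Returns undefined when no fraction qualifies.
const step = (n, fractions) => {
  for (const [numerator, denominator] of fractions) {
    if ((n * numerator) % denominator === 0n) return (n * numerator) / denominator;
  }
};

// Rule 2: keep applying rule 1 until it finds nothing, then halt.
function* run(n, fractions) {
  for (let next = step(n, fractions); next !== undefined; next = step(next, fractions)) {
    yield next;
  }
}

// Usage: the one-fraction program 3/2 moves every factor of 2 into a
// factor of 3, so starting from 2^3 * 3^2 = 72 it halts at 3^5 = 243.
console.log([...run(72n, parse('3/2'))]); // [ 108n, 162n, 243n ]
```

BigInts are used because the values of n in real programs quickly outgrow ordinary JavaScript numbers.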

For example, this is a FRACTRAN program for computing any Fibonacci number: 17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23, 74/341, 31/37, 41/31, 129/287, 41/43, 13/41, 1/13, 1/3.[^11]

All FRACTRAN programs also start with an initial value for n. That value is sometimes a constant, and sometimes provided by the user. When it’s provided by the user, there is sometimes a need to prepare n to make it usable.

In this program’s case, to compute fib(x) for some value of x, we compute n = 78 * 5^(x - 1).

Let’s use this program to compute fib(7). We start with n = 78 * 5^(7-1), which is 1,218,750. We’ll follow along for a while to get the feel for what happens:[^12]

  • The first fraction in the program is 17/65. 1,218,750 multiplied by 17/65 is 318,750, so we replace 1,218,750 with 318,750 and begin again.
  • The first fraction in the program is 17/65. 318,750 leaves a remainder when divided by 65, so we move on.
  • The next fraction in the program is 133/34. 318,750 multiplied by 133/34 is 1,246,875, so we replace 318,750 with 1,246,875 and begin again.

We leave it to run for a very long time, and then we see:

  • The next fraction in the program is 13/41. 24,576 leaves a remainder when divided by 41, so we move on.
  • The next fraction in the program is 1/13. 24,576 leaves a remainder when divided by 13, so we move on.
  • The next fraction in the program is 1/3. 24,576 multiplied by 1/3 is 8,192, so we replace 24,576 with 8,192 and begin again.

8,192 is an important number, because none of the divisors divide evenly into 8,192. So we see

  • The first fraction in the program is 17/65. 8,192 leaves a remainder when divided by 65, so we move on.
  • The next fraction in the program is 133/34. 8,192 leaves a remainder when divided by 34, so we move on.
  • The next fraction in the program is 17/19. 8,192 leaves a remainder when divided by 19, so we move on.

  • The next fraction in the program is 13/41. 8,192 leaves a remainder when divided by 41, so we move on.
  • The next fraction in the program is 1/13. 8,192 leaves a remainder when divided by 13, so we move on.
  • The next fraction in the program is 1/3. 8,192 leaves a remainder when divided by 3, so we move on. None of the denominators in the program divide evenly into 8,192, so the program halts.

All FRACTRAN programs produce a series of values for n, and the result we want must be extracted from them. For our Fibonacci program, the values begin with 1,218,750, 318,750, 1,246,875, and 1,115,625, and then end with 221,184, 73,728, 24,576, and 8,192.[^13]

In the case of Fibonacci, the result we want is the log2 of the last value for n. The last value of n is 8,192, and log2(8,192) is 13, the answer we want. The 7th Fibonacci number is 13.

We have now seen the three elements that every FRACTRAN program has:

  1. The program itself, a finite list of fractions. This program’s list is 17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23, 74/341, 31/37, 41/31, 129/287, 41/43, 13/41, 1/13, and 1/3.
  2. An initial value of n. This may be a constant, it may be a user-supplied value, or it may be a transformation of a user-defined value. This program’s transformation can be expressed in JavaScript as n => 78 * Math.pow(5, n-1).
  3. A transformation from the values of n into the result we want, encoded the way we want it. In our case, it is something like values => Math.log2(last(values)).

writing a fractran-based fibonacci function in javascript

Writing a FRACTRAN interpreter is very easy. Let’s begin by writing a JavaScript Fibonacci function that uses our FRACTRAN program for its implementation. The main thing we’ll need to watch out for is that the values of n can grow very, very large, so we will want to use big integers, aka “BigInts.”14

One consequence of working with big integers is that many of the things we depend on for numbers no longer work. For example, Math.log2(8192) => 13, but Math.log2(8192n) => TypeError: Cannot convert a BigInt value to a number. We’ll have to write our own log2 function.

The same goes for Math.pow: we’ll have to write our own. Feel free to use these implementations if you like:15

// Any sufficiently complicated function that loops, contains an ad hoc,
// informally-specified, bug-ridden, slow implementation of
// half of Linear Recursion
const log2 = (n) => {
  let result = 0n;

  while (true) {
    // degenerate condition
    if (n === 1n) break;

    // termination conditions
    if (n % 2n === 1n) return;
    if (n < 1n) return;

    //divide and conquer
    ++result;
    n = n / 2n;
  }

  return result;
}

const pow = (base, exponent) => {
  if (exponent < 0n) return;

  let result = 1n;

  while (exponent-- > 0n) result = result * base;

  return result;
}
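
As an aside, engines that support BigInt also support the ** operator on BigInts, so shorter versions are possible. The following is only a sketch: the names powBig and log2Exact are ours, and log2Exact assumes, like log2 above, that we only care about exact powers of two.

```javascript
// Hypothetical alternatives to pow and log2 above; the names are ours.

// BigInt exponentiation is built into the language via the ** operator.
// A negative BigInt exponent would throw, so we guard against it.
const powBig = (base, exponent) =>
  exponent < 0n ? undefined : base ** exponent;

// An exact power of two is a 1 followed only by 0s in binary,
// so its log2 is the number of digits after the leading 1.
const log2Exact = (n) => {
  if (n < 1n) return undefined;

  const bits = n.toString(2);

  return /^10*$/.test(bits) ? BigInt(bits.length - 1) : undefined;
};

powBig(5n, 6n);
  //=> 15625n

log2Exact(8192n);
  //=> 13n
```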

Now go ahead and write your own implementation. Ignore the code below until you’ve written your own.

// uses log2 and pow from above to formulate the seed and decipher the result

const fib = (x) => {
  const program = (
    '17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23, 74/341,' +
    ' 31/37, 41/31, 129/287, 41/43, 13/41, 1/13, 1/3'
  ).split(/(?:\s*,|\s)\s*/).map(f => f.split('/').map(n => BigInt(n)));

  let n = 78n * pow(5n, BigInt(x) - 1n);

  program_start: while (true) {
    for (const [numerator, denominator] of program) {
      if (n % denominator === 0n) {
        n = (n * numerator) / denominator;
        continue program_start;
      }
    }
    break;
  }

  return log2(n);
};

fib(7)
  //=> 13n

This is very cool. The FRACTRAN program is very small and ridiculously simple: It’s just fractions. And the central FRACTRAN interpreter is also very small: It’s literally a for loop inside a while loop, and the while loop (along with a break statement) could have been avoided if JavaScript supported GOTO.16
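
Nothing in that interpreter is specific to Fibonacci. As a sketch, here it is pulled out into a generator that yields every value of n, so that we can inspect the whole series and not just the last value. The names parseFractran, fractran, and fibonacciProgram are our own inventions, and the seed is hard-coded for the 7th Fibonacci number.

```javascript
// A sketch of a general-purpose interpreter. The names parseFractran and
// fractran are ours, not part of any standard.
const parseFractran = (source) =>
  source.split(',').map(f => f.trim().split('/').map(s => BigInt(s)));

function* fractran(program, seed) {
  let n = BigInt(seed);

  yield n;

  while (true) {
    // find the first fraction whose denominator divides n evenly
    const fraction = program.find(([, denominator]) => n % denominator === 0n);

    // no such fraction: the program halts
    if (fraction === undefined) return;

    const [numerator, denominator] = fraction;

    n = (n * numerator) / denominator;
    yield n;
  }
}

const fibonacciProgram = parseFractran(
  '17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23, 74/341, ' +
  '31/37, 41/31, 129/287, 41/43, 13/41, 1/13, 1/3'
);

// the seed for the 7th Fibonacci number is 78 * 5^6
const values = [...fractran(fibonacciProgram, 78n * 5n ** 6n)];

values.slice(0, 4);
  //=> [1218750n, 318750n, 1246875n, 1115625n]

values[values.length - 1];
  //=> 8192n
```

Yielding the values one at a time keeps the interpreter separate from the decision about what to do with them: taking the log2 of the last value is just one possible transformation.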

For all its apparent elegance, FRACTRAN appears at first glance to be inscrutable. Is it one of those languages that is neither good for reading nor writing? Or is there a method to the madness of writing a program by composing a list of fractions?

To answer this question, we must first consider the work of another great mind no longer with us, Marvin Minsky.


Marvin Minsky’s Magnificent Machines

Marvin Minsky

Marvin Minsky posing with one of MIT’s demonstrations of Robotics and AI


In September of 1987, Stewart Brand published The Media Lab: Inventing the Future at M.I.T. To say that the book affected me does not do its impact justice. Like many business and technical hagiographies, it was breathless in its admiration for what M.I.T., and director Nicholas Negroponte, were trying to do.

It was also highly revealing: Brand described in detail how the Media Lab was funded by its corporate sponsors, and how the funding model drove research. Their motto was, “Demo or Die,” which meant that the money went to the people who could not only wow their peers, but also provide the stage-magic that would open the wallets of their corporate sponsors.

I was working in technology Enterprise Sales at the time, and keenly understood that while it’s really difficult to sell sizzle without steak, it’s equally improbable to secure the sale when you have perfectly good steak, but no sizzle.17

One of the figures mentioned in The Media Lab was Marvin Minsky. Minsky was considered a giant in Artificial Intelligence, and Artificial Intelligence was the tulip mania of the day. History would show that as the book was released, AI funding was starting another of its cyclical collapses, but as I read the book, the magazines and book shelves were groaning under the weight of breathless prose and utopian visions.



In 1985, Minsky had published The Society of Mind, and I bought it on the strength of reading about Minsky in The Media Lab. In 1994, a company called Voyager published a CD-ROM version of The Society of Mind that I “played” using one of Apple’s quirkier products, a CD-ROM reader and audio-CD player they borrowed from Philips and branded as PowerCD.

One of the features of the CD-ROM was an interactive tour of Minsky’s office. I still vividly recall his discussion of a mirror he had hanging on the wall, and his mention that Feynman had written an entire book devoted to explaining just one thing: How light actually reflects off a mirror, and that in this book, he explains why all of the things non-specialists believe about light and reflection are actually false.

Like Feynman and Conway, Minsky had a talent for explaining challenging ideas. And also like Feynman and Conway, he had made enormous contributions to the progress of human knowledge. One of his areas of research was in computability, and specifically, the study of a certain class of idealized computing machines that are now named Minsky Machines in his honour.1819

magnificent minsky machines

A Minsky Machine is an idealized machine that has been proven to be computationally universal. Minsky Machines are part of the family of idealized machines called Register Machines.

Like Register Machines and Turing Machines, Minsky machines form a little family, with slight differences between them in the way they are imagined, but all are computationally equivalent. We shall discuss one particularly simple form of Minsky Machine that we shall call the “Magnificent” Minsky Machine.20

Like a Turing Machine, a Magnificent Minsky Machine has a finite number of states. However, instead of having a single tape that stretches to infinity in one or both directions, Magnificent Minsky Machines have a finite number of tapes, each of which stretches to infinity in only one direction.

Like a Turing Machine, a Magnificent Minsky Machine has a tape-head for each tape, and based on its instructions, can move its tape-heads and change to a different state. Unlike a Turing Machine, a Magnificent Minsky Machine can move its tape-heads any finite amount in either direction. We call these directions forward (“away from the beginning”), and backwards (“towards the beginning”).

When a Turing Machine matches a particular symbol under its tape-head, it can write a symbol, and it can also move the tape-head one square in either direction.

Unlike a Turing Machine, a Magnificent Minsky Machine cannot write anything on its tapes, and therefore, does not match anything under its tape-heads. What it can do is test whether it is possible to perform a move. Since it is always possible to move forward, the only test we care about is whether a move backwards is possible.

Also unlike a Turing Machine, a Magnificent Minsky Machine can move zero or more of its tape-heads at the same time. Therefore, its test may check whether one or more tape-heads can be moved backwards.

To give a literal example, if the tape-head is at the beginning, no backwards move is possible. If the tape-head is n squares forward, all moves backwards <= n squares are possible, but all moves > n squares are not possible.

The net effect of these design choices is that a Magnificent Minsky Machine can store an arbitrary amount of state, just like a Turing Machine, but it stores that state by having each tape-head be an arbitrary distance from the beginning of its tape.21
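
Because nothing is ever written on a tape, the only thing a tape contributes is the distance of its tape-head from the beginning. Here is a minimal sketch, assuming we model each tape-head as a plain non-negative integer. The function names are ours.

```javascript
// Each tape-head is modelled as its distance from the beginning of its tape.

// a move backwards is possible only if it does not pass the beginning
const canMoveBackwards = (position, squares) => squares <= position;

// a move forward is always possible
const moveForward = (position, squares) => position + squares;

const moveBackwards = (position, squares) => {
  if (!canMoveBackwards(position, squares)) {
    throw new RangeError(`cannot move ${squares} squares back from ${position}`);
  }

  return position - squares;
};

canMoveBackwards(0, 1);
  //=> false, the tape-head is at the beginning

canMoveBackwards(3, 3);
  //=> true

moveBackwards(moveForward(0, 5), 2);
  //=> 3
```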


Multi Tape

An illustration of a multi-tape machine.


creating a magnificent minsky machine

Let’s write a program using a Magnificent Minsky Machine. Now that we have the general idea of how these machines operate, here is how we will specify each Magnificent Minsky Machine, i.e. how we shall write its “program”:

  1. Our machine will have a finite number of states, denoted with consecutive positive integers, i.e. 1, 2, 3, …
  2. Each state will have a finite and ordered list of rules.
  3. Each rule will be expressed as “Do this, provided that.” Thus, they will have two clauses: An action clause, and a guard clause.
  4. The action clause shall consist of a set of tape-heads to move forward, and a positive integer for each tape-head stating how many squares to move. We shall note these as tuples of (tape-identifier^squares-to-move-forward). The action clause is an unordered set of such tuples, with no two tuples in the same rule sharing the same tape-head. The action clause will also include a positive integer indicating the next state to enter.
  5. The guard clause shall consist of a set of tape-heads to move backwards, and a positive integer for each tape-head stating how many squares to move towards the beginning. As with the action clause, no two tuples in the same guard clause can share the same tape-head.
  6. No tape-head can appear in both clauses of the same rule. It follows that no one rule can both test a tape-head in the guard clause and simultaneously move that same tape-head in the action clause.

Our Magnificent Minsky Machine is initialized with the tape-heads being placed in a specific set of positions, so in addition to listing the states and rules therein, the description of a Magnificent Minsky Machine will also include instructions for setting up the initial position of the tape-heads.

When we wish to run a Magnificent Minsky Machine, we start it in state 1. We operate our machine by scanning the rules within the current state, in order.

For each rule, we check its guard clause. If it is possible to move all of the tape-heads in the guard clause towards the beginning, the rule “fires,” and we move the tape-heads towards the beginning as specified by the guard clause. We then consult the action clause, and move all of the tape-heads listed away from the beginning by the amounts listed. Finally, when a rule fires, if it lists a next state, we change to that state and return to scanning the rules within the current state, in order.

If any rule’s guard clause cannot perform all of the required movement of tape-heads towards the beginning, the rule fails, and the tape-heads are not disturbed. We then move on and try the next rule in that state’s list, and the next, and so forth. If all of the rules in the current state fail, the machine halts. It follows, trivially, that if the machine enters a state without any rules, it must necessarily halt.

If the machine attempts to transition to an undefined state, it also halts. An explicit transition to state 0, therefore, is the well-formed way to explicitly halt the machine.

Now we’re ready to discuss a simple notation for Magnificent Minsky Machines, and to try one out.

a notation for magnificent minsky machines

Consider a Magnificent Minsky Machine that will add two numbers. We will only need one state, with two rules. Our notation will be simple. In each state, there is a comma-separated list of rules. Since we will only have one state in our first machine, we will only discuss how to write down the rules right now.

Our state will have a comma-separated list of rules, i.e. rule1, rule2. Whitespace is not significant, so we could also write rule1,rule2 or even:

rule1,
rule2

(As is usual with most programming languages, how we arrange our program with whitespace is a matter of organizing the layout for human readability.)

Each rule will have an action clause and a guard clause, separated with /, e.g.:

actionClause1/guardClause1,
actionClause2/guardClause2

The / obviously resembles the notation for division, but in a Magnificent Minsky Machine, it doesn’t actually mean “divide,” it just separates the action clause from the guard clause in each rule.

The action clauses and guard clauses both have exactly the same notation: One or more tuples of the form (t^s), where t identifies the tape-head, and s identifies the number of squares to move. The ^ character means exponentiation in some programming languages, but in a Magnificent Minsky Machine, it’s just a way of separating two numbers.

Thus, (1^1)(4^1)/(3^1) is a rule with two clauses:

  • (1^1)(4^1) is the action clause, and it has two tuples: (1^1) says to move tape-head 1 by one square, and (4^1) says to move tape-head 4 by one square.
  • (3^1) is the guard clause, and it says to move tape-head 3 by one square.

In a Magnificent Minsky Machine, action clauses always move the tape-heads forward, and guard clauses always move the tape-heads backwards, so our clauses don’t need a + or -, or arrows pointing in different directions. It is enough to know that clauses to the left of the / are action clauses and move the tape-heads forward, while clauses to the right of the / are guard clauses and move the tape-heads backwards.

Now to our first machine: Our machine only has one state, 1, and it has two rules:

(1^1)/(2^1),
(1^1)/(3^1)

Our program adds two positive numbers, which we shall call a and b. It has three tapes, 1, 2, and 3. To set the machine up, we place tape-head 1 at the beginning of its tape, tape-head 2 a squares forward, and we place tape-head 3 b squares forward. When the machine halts, tape-head 1 will be a + b squares forward, while tape-heads 2 and 3 will be at the beginning.

We can use it to add the numbers 2+2, and we’ll see if it comes up with 4. Here’s how we’ll show the position of the tape-heads graphically:22

1: *
2: ..*
3: ..*

This shows that tape-head 1 is at the beginning, while tape-heads 2 and 3 are two squares forward, the beginning conditions for adding 2 and 2.

Our machine begins in state 1 (it’s the only state this machine has). The first rule is (1^1)/(2^1). It checks to see if tape-head 2 can move one square towards the beginning. It can, therefore it also executes the action of moving tape-head 1 one square away from the beginning. It remains in state 1, and now the tape-head positions are:

1: .*
2: .*
3: ..*

Once again, it consults its rules, and the first rule fires again, producing:

1: ..*
2: *
3: ..*

The third time through, rule 1 fails to fire because its guard clause (2^1) fails. It remains in its only state, and checks the second rule, (1^1)/(3^1). Tape-head 3 can be moved one square towards the beginning, and it does so while moving tape-head 1 away from the beginning:

1: ...*
2: *
3: .*

Once again, it consults its rules, and the second rule fires again, producing:

1: ....*
2: *
3: *

The last time through, neither rule can fire, because neither tape-head 2 nor tape-head 3 can move one square towards the beginning, so the machine halts. As we desired, tape-head 1 is now four squares forward of the beginning, so our machine has added 2 and 2 to produce 4.
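
The walkthrough of adding 2 and 2 can be reproduced with a few lines of JavaScript. This is only a sketch: the two rules are hard-coded as objects instead of being parsed from our notation, and the names rules, render, run, and trace are ours.

```javascript
// The adder's two rules, (1^1)/(2^1) and (1^1)/(3^1), written out by hand.
// Each rule moves one tape-head forward and one tape-head backwards.
const rules = [
  { forward: 1, backwards: 2 },
  { forward: 1, backwards: 3 }
];

// draw the tape-head positions the way we drew them by hand, e.g. "2: ..*"
const render = (tapes) =>
  Object.entries(tapes)
    .map(([tape, position]) => `${tape}: ${'.'.repeat(position)}*`)
    .join('\n');

const run = (tapes) => {
  const snapshots = [render(tapes)];

  while (true) {
    // the first rule whose guard can move its tape-head one square backwards
    const rule = rules.find(({ backwards }) => tapes[backwards] >= 1);

    // no rule fires: the machine halts
    if (rule === undefined) return snapshots;

    tapes[rule.backwards] -= 1;
    tapes[rule.forward] += 1;
    snapshots.push(render(tapes));
  }
};

const trace = run({ 1: 0, 2: 2, 3: 2 });

trace.length;
  //=> 5, the starting position plus one snapshot for each rule that fired

trace[trace.length - 1];
  //=> '1: ....*\n2: *\n3: *'
```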

the magnificent minsky multiplication machine

Now we’ll write a Magnificent Minsky Machine that multiplies two numbers using our notation. We will use more than one state, so we separate states with ;, and show a transition to a different state with →.

Our states will be numbered from 1, in the order we list them. With this scheme, our multiplication machine can be written as:

(1^0)/(2^1)→2,    (1^0)/(3^1)   ;
(1^1)(4^1)/(3^1), (1^0)/(1^0)→3 ;
(3^1)/(4^1),      (1^0)/(1^0)→1

Or even:

(1^0)/(2^1)→2,
(1^0)/(3^1);

(1^1)(4^1)/(3^1),
(1^0)/(1^0)→3;

(3^1)/(4^1),
(1^0)/(1^0)→1

As with the addition machine, we give it three tapes: A result tape that must be empty, and two multiplicand tapes, with the tape-head advanced by the value of the multiplicand. We also supply a fourth tape to use as a temporary variable. Thus, to multiply 3 and 39, we set the machine’s tapes to 0, 3, 39, 0.

The operation of the machine is easy to describe at a high level. There’s one idiom that we must introduce first: (1^0) is a NOOP clause. If used as a guard clause, it always succeeds because every tape-head is always in a position where it can move zero squares backwards. And if used as an action clause, it does nothing because it moves the tape-head zero squares forward. A rule like (1^0)/(1^0)→1 is a Magnificent Minsky Machine’s way of writing GOTO: It always succeeds, doesn’t move any tape-heads, and changes to state 1.

The initial state (1) has two rules:

  1. (1^0)/(2^1)→2, which decrements tape 2 and then changes to state 2, followed by;
  2. (1^0)/(3^1), which decrements tape 3 and remains in state 1.

The net effect of these two rules is that whenever the machine is in state 1, it tries to decrement tape 2 and move to state 2. If that fails, it tries to decrement tape 3 and remain in state 1. When it decrements tape 3 and remains in state 1, it clearly will do it again, and again, and again until it can decrement tape 3 no more, at which point it will fail both of its rules, and halt.

The second state’s two rules are:

  1. (1^1)(4^1)/(3^1), which increments tapes 1 and 4, decrements tape 3, and stays in state 2, followed by;
  2. (1^0)/(1^0)→3 which is a simple GOTO 3.

The net effect of state 2’s rules is simply to add the value of tape 3 to whatever is in tape 1, while simultaneously making a copy of tape 3 in tape 4 (because in our Magnificent Minsky Machines, successfully reading a value also consumes that value).

The third state’s two rules are:

  1. (3^1)/(4^1), which copies tape 3’s original value back to tape 3 from tape 4, followed by;
  2. (1^0)/(1^0)→1 which is a simple GOTO 1.

The net effect of the third state is to restore tape 3 from tape 4. Since this machine copies the value of tape 3 to tape 1 once for every square of tape 2, the net effect is to multiply tapes 2 and 3.
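
One way to convince ourselves of this is to transliterate the three states into ordinary loops. This is a sketch of what the machine computes, not of how a machine is implemented. The function name multiply and the variables t1 through t4 (one per tape) are ours.

```javascript
const multiply = (a, b) => {
  let [t1, t2, t3, t4] = [0, a, b, 0];

  // state 1, first rule: decrement tape 2 and go to state 2
  while (t2 >= 1) {
    t2 -= 1;

    // state 2: move tape 3 onto tapes 1 and 4, one square at a time
    while (t3 >= 1) {
      t3 -= 1;
      t1 += 1;
      t4 += 1;
    }

    // state 3: restore tape 3 from tape 4, then go back to state 1
    while (t4 >= 1) {
      t4 -= 1;
      t3 += 1;
    }
  }

  // state 1, second rule: clear tape 3, after which no rule fires
  while (t3 >= 1) t3 -= 1;

  return [t1, t2, t3, t4];
};

multiply(3, 13);
  //=> [39, 0, 0, 0]
```

Each while loop corresponds to a rule that stays in its own state, and falling out of a loop corresponds to that rule’s guard failing.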

implementing magnificent minsky machines

Manually simulating ideal computing machines loses its lustre once you’ve manually verified that 3 * 13 = 39. Without peeking ahead, try writing an interpreter that can parse our notation and output the result.

For example, we’d like to write:

const evaluate = (program, ...tapes) => {
  // ...
};

evaluate(
  '(1^1)/(2^1), (1^1)/(3^1)',
  0, 2, 3
)
  //=> [5, 0, 0]

evaluate(`
  (1^0)/(2^1)→2,    (1^0)/(3^1)   ;
  (1^1)(4^1)/(3^1), (1^0)/(1^0)→3 ;
  (3^1)/(4^1),      (1^0)/(1^0)→1
  `, 0, 3, 13, 0
)
  //=> [39, 0, 0, 0]

No peeking until you’ve tried it yourself!

const parse = (program) => {
  const parseProgram = program => program.split(/\s*;\s*/).map(s => s.trim());
  const parseState = state => state.split(/(?:\s*,|\s)\s*/).map(s => s.trim());
  const parseRule = (rule, stateIndex) => {
    const [clauses, nextState] = rule.includes('→') ? rule.split(/\s*→\s*/).map(s => s.trim()) : [rule.trim(), stateIndex + 1]

    return [clauses, typeof nextState === 'string' ? parseInt(nextState, 10) : nextState];
  };
  const parseClauses = clauses => clauses.split('/').map(s => s.trim());
  const parseClause = clause => {
    if (clause === '') return [];

    const strippedClause = clause.substring(1, clause.length - 1); // strip opening and closing ()
    const pairs = strippedClause.split(/\)\s*\(/).map(s => s.trim());

    return pairs.map(p => p.split(/\s*\^\s*/).map(s => parseInt(s, 10)));
  };

  return [[]].concat(
    parseProgram(program).map(
      (state, stateIndex) => parseState(state).map(
        rule => {
          const [_clauses, nextState] = parseRule(rule, stateIndex);
          const clauses = parseClauses(_clauses).map(parseClause);

          return clauses.concat([nextState]);
        }
      )
    )
  );
}

const interpret = (parsed, input = []) => {
  const tapes = ['ANCHOR', ...input]; // fake 1-indexing

  let stateIndex = 1;

  run: while (stateIndex > 0 && stateIndex < parsed.length) {
    const rules = parsed[stateIndex];
    for (const [actionClauses, guardClauses, nextState] of rules) {
      if (guardClauses.some(
        ([tapeIndex, squares]) => tapes[tapeIndex] === undefined || tapes[tapeIndex] < squares
      )) continue;
      for (const [tapeIndex, squares] of guardClauses) {
        tapes[tapeIndex] = tapes[tapeIndex] - squares;
      }
      for (const [tapeIndex, squares] of actionClauses) {
        tapes[tapeIndex] = (tapes[tapeIndex] || 0) + squares;
      }
      stateIndex = nextState;
      continue run;
    }
    break;
  }

  const output = tapes.slice(1); // unfake 1-indexing

  return output;
}

const evaluate = (program, ...tapes) => interpret(parse(program), tapes);

Now that we have a machine to emulate our machines, we can explore an interesting idea.


Marvellous Minsky Machines


Marvellous Spatuletails are endemic to Peru. (Dubi Shapiro/ABCbirds.org)


Here is another Magnificent Minsky Machine that multiplies two numbers. We shall call it the marvellous multiplier:

(1^1)(4^1)(6^1)/(3^1)(5^1) ,
(7^1)/(5^1)                ,
(3^1)(8^1)/(4^1)(7^1)      ,
(1^0)/(7^1)                ,
(5^1)/(6^1)                ,
(7^1)/(8^1)                ,
(5^1)/(2^1)                ,
(1^0)/(3^1)

This is a one-state multiplier. Our previous multiplier needed three states:

(1^0)/(2^1)→2,    (1^0)/(3^1)   ;
(1^1)(4^1)/(3^1), (1^0)/(1^0)→3 ;
(3^1)/(4^1),      (1^0)/(1^0)→1

How does our marvellous multiplier work without the other states?

The not-very-secret secret is that in addition to the four tapes our multiplier needs to do its main business, we’ve added four more tapes: 5, 6, 7, and 8. We only ever set them to 1 or zero, so they act like flags that emulate the additional states.

Let’s analyze this, state by state. The first state of the original multiplier had two rules:

(1^0)/(2^1)→2,
(1^0)/(3^1)

These two rules are the last rules of the marvellous multiplier:

(5^1)/(2^1),
(1^0)/(3^1)

Why are they the last rules? Since we are using tapes 5, 6, 7, and 8 to emulate other states, every rule that comes before these rules is guarded by (5^1), (6^1), (7^1), or (8^1). Thus, these rules only come into play if none of these state-emulation tape-heads are one square forward.

But what about the rules themselves? The second rule, (1^0)/(3^1), is the same rule we already have: it decrements tape 3 and remains in the only state, so it will clear tape 3 and then the entire program will halt, just as with the original.

The first of our original rules is different. Instead of (1^0)/(2^1)→2, we have (5^1)/(2^1). Instead of setting the next state to 2, we increment tape 5.

That leads us to the original state 2:

(1^1)(4^1)/(3^1),
(1^0)/(1^0)→3

And our marvellous machine’s equivalent:

(1^1)(4^1)(6^1)/(3^1)(5^1),
(7^1)/(5^1)

Both of these rules are guarded by (5^1), which matches what we saw from the rule in the original state 1: incrementing tape 5 makes this machine act like it’s in the original state 2.

The first rule of the original machine incremented tapes 1 and 4, decremented tape 3, and remained in state 2. Our new rule also increments tapes 1 and 4, and decrements tape 3. And we know that we have to have it get back to the emulated state 2 by incrementing tape 5. But it is forbidden to both decrement and increment the same tape in our Minsky Machines, so it increments tape 6 instead.

How does that get us to incrementing tape 5? Look further down, and we find this rule:

(5^1)/(6^1)

If the marvellous multiplier finds itself with tape 6 incremented, it turns around and increments tape 5. Thus, incrementing tape 6 is an indirect way of incrementing tape 5, and that’s what our machine does when it is already decrementing tape 5.

We already know why the second of our original rules for state 2, (1^0)/(1^0)→3, becomes (7^1)/(5^1) in the marvellous multiplier. Instead of always matching with a (1^0) guard, it only matches when tape 5 is incremented, because tape 5 emulates state 2. And instead of having a NOOP action with (1^0) and setting the next state to 3, it increments tape 7 because tape 7 emulates state 3.

We now know everything we need to know to interpret how the marvellous machine’s (3^1)(8^1)/(4^1)(7^1), (1^0)/(7^1) emulates the original’s (3^1)/(4^1), (1^0)/(1^0)→1, and why there’s a (7^1)/(8^1) rule added to support it.

The marvellous multiplier illustrates a key property of these Minsky Machines: For every Magnificent Minsky Machine with two or more states, there exists an equivalent Marvellous Minsky Machine with just one state, but more tapes.
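
We can check the marvellous multiplier with a sketch of a one-state machine. With only one state there is no next state to track, so each rule is just a pair of clauses. The rules are hard-coded as data instead of being parsed from our notation, the name runOneState is ours, and the sketch assumes that an initial position is supplied for every tape the rules mention.

```javascript
// Each rule is [actionClause, guardClause]; each clause is a list of
// [tape, squares] tuples. Tapes are numbered from 1.
const marvellousMultiplier = [
  [[[1, 1], [4, 1], [6, 1]], [[3, 1], [5, 1]]],
  [[[7, 1]],                 [[5, 1]]],
  [[[3, 1], [8, 1]],         [[4, 1], [7, 1]]],
  [[],                       [[7, 1]]], // (1^0) is a NOOP, so no tuples
  [[[5, 1]],                 [[6, 1]]],
  [[[7, 1]],                 [[8, 1]]],
  [[[5, 1]],                 [[2, 1]]],
  [[],                       [[3, 1]]]
];

const runOneState = (rules, ...input) => {
  const tapes = [undefined, ...input]; // fake 1-indexing

  while (true) {
    // the first rule whose guard can move every one of its tape-heads back
    const rule = rules.find(
      ([, guard]) => guard.every(([tape, squares]) => tapes[tape] >= squares)
    );

    // no rule fires: the machine halts
    if (rule === undefined) return tapes.slice(1);

    const [action, guard] = rule;

    for (const [tape, squares] of guard) tapes[tape] -= squares;
    for (const [tape, squares] of action) tapes[tape] += squares;
  }
};

runOneState(marvellousMultiplier, 0, 3, 13, 0, 0, 0, 0, 0);
  //=> [39, 0, 0, 0, 0, 0, 0, 0]
```

Note that tapes 5 through 8 all finish at zero: the flags that emulate states 2 and 3 are always consumed before the machine falls through to the rules of the original state 1.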

an algorithm to derive a marvellous minsky machine from any magnificent minsky machine

From our exploration of the marvellous multiplier, we picked up a few tricks for emulating states with additional tapes:

  1. Place the original state 1 last;
  2. For each state beyond 1, set up two additional tapes, stateTape and stateTapePrime;
  3. Guard the rules of the original state with stateTape;
  4. For rules outside the state that transfer to the state, remove the nextState and add an action to increment stateTape;
  5. For rules of the original state that remain in the state, add an action to increment stateTapePrime;
  6. Add a rule to transfer from stateTapePrime to stateTape.

This can, of course, be automated. If you’re keen, have a go at it before reading this example. If you’re really keen, see if you can devise your own algorithm for deriving a Marvellous Minsky Machine from any Magnificent Minsky Machine.23

const maxTapeIndexOf = (parsed) => {
  let max = undefined;

  for (const rules of parsed.slice(1)) {
    for (const [actionClause, guardClause] of rules) {
      for (const [tapeIndex] of actionClause.concat(guardClause)) {
        if (max === undefined || tapeIndex > max) max = tapeIndex;
      }
    }
  }

  return max;
};

const maxStateNumberOf = (parsed) => {
  return parsed.length - 1;
}

const NOOP = [1, 0];
const isNOOP = ([,squares]) => squares === 0;
const isActionable = ([,squares]) => squares !== 0;

const SUCCESS = [1, 0];
const isSuccess = ([,squares]) => squares === 0;
const canFail = ([,squares]) => squares !== 0;

// a rule can fail if any tuple in its guard clause can fail
const ruleCanFail = ([, guardClause]) => guardClause.some(canFail);

const withClause = (clauses, clause) => clauses.filter(canFail).concat([clause]);

const toMarvellous = (parsed) => {
  const maxTapeIndex = maxTapeIndexOf(parsed);
  const maxStateNumber = maxStateNumberOf(parsed);

  const stateToTape = new Map();

  for (let stateNumber = 2; stateNumber <= maxStateNumber; ++stateNumber) {
    const offset = maxTapeIndex + (2 * stateNumber) - 3;

    stateToTape.set(stateNumber, { stateIndex: offset, statePrimeIndex: offset + 1 });
  }

  const state1 = parsed[1];

  // adjust all rules in state 1 to set an emulated state
  // rather than use an explicit nextState if they point to
  // another state
  for (const rule of state1) {
    const [actionClause, guardClause, nextState] = rule;
    if (nextState > 1) {
      rule[0] = withClause(actionClause, [stateToTape.get(nextState).stateIndex, 1]);
      rule[2] = 1;
    }
  }

  let stateIndex = 1;
  let aggregateRules = [];
  for (const rules of parsed.slice(2)) {
    ++stateIndex;

    if (rules.every(ruleCanFail)) {
      // if we cannot guarantee action,
      // add an explicit fall-through to halt
      // this will get guarded below
      rules.push([[NOOP], [SUCCESS], 0]);
    }

    const stateEmulationGuard = [stateToTape.get(stateIndex).stateIndex, 1];

    for (const rule of rules) {
      const [actionClause, guardClause, nextState] = rule;

      if (nextState === stateIndex) {
        // this rule remains in the same state. we cannot set the
        // state tape directly, because we are already guarding on it,
        // so we set the state-prime tape instead
        rule[0] = withClause(actionClause, [stateToTape.get(nextState).statePrimeIndex, 1]);
      } else if (nextState > 1) {
        // set an emulated state rather than use an explicit nextState
        // if nextState is a state other than state 1
        rule[0] = withClause(actionClause, [stateToTape.get(nextState).stateIndex, 1]);
      }

      rule[1] = withClause(guardClause, stateEmulationGuard);
      rule[2] = 1;
    }

    aggregateRules = aggregateRules.concat(rules); // TODO: refactor to flatMap

  }

  for (const { stateIndex, statePrimeIndex } of stateToTape.values()) {
    const actionClauses = [[stateIndex, 1]];
    const guardClauses = [[statePrimeIndex, 1]];

    aggregateRules.push([actionClauses, guardClauses, 1]);
  }

  aggregateRules = aggregateRules.concat(state1);

  return [[]].concat([aggregateRules]);
}
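
To compare a derived machine with one written by hand, it helps to turn the parsed form back into our notation. Here is a sketch of a hypothetical unparse helper. It is not part of the code above, and it assumes the parsed format used by parse and interpret: an array of states with an empty placeholder at index 0, where each rule is [actionClause, guardClause, nextState].

```javascript
// Hypothetical helper; the names unparse and unparseClause are ours.
const unparseClause = clause =>
  clause.map(([tapeIndex, squares]) => `(${tapeIndex}^${squares})`).join('');

const unparse = (parsed) =>
  parsed
    .slice(1) // skip the placeholder that fakes 1-indexing
    .map(
      (rules, index) => rules.map(
        ([actionClause, guardClause, nextState]) => {
          const clauses = `${unparseClause(actionClause)}/${unparseClause(guardClause)}`;

          // only show a transition when the rule leaves its own state
          return nextState === index + 1 ? clauses : `${clauses}→${nextState}`;
        }
      ).join(', ')
    )
    .join('; ');

unparse([
  [],
  [[[[5, 1]], [[2, 1]], 1], [[[1, 0]], [[3, 1]], 1]]
]);
  //=> '(5^1)/(2^1), (1^0)/(3^1)'
```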

That is truly marvellous, but is it meaningful?

Yes, it is meaningful, and we’re about to find out why.


Gödel Numbering and Masterful Minsky Machines


Kurt Gödel and Albert Einstein


Formally, a Gödel Numbering is a scheme for assigning a unique natural number to every symbol and statement in a formal language. Kurt Gödel used such a scheme to show that formal systems capable of making statements about themselves were equivalent to systems that make statements about numbers, and from there he went on to show that in such systems, there were necessarily statements that were true but not provable.

Informally, a Gödel Numbering is a mapping between natural numbers and some other mathematical concept or entity. Let’s keep that in mind.

register machines

A Minsky Machine’s tapes are easily represented as natural numbers. We have presented the machine as moving over a tape, with our actions and guards moving the head away from and back towards the beginning of the tape.

And while we could implement the tapes with arrays or some such, natural numbers are a better choice because they more neatly fit the affordances our tapes provide: Incrementing by a certain amount, testing whether they can be decremented by a certain amount, and decrementing them by a certain amount.

In fact, if we let go of the notion of tapes, we can think of our Minsky Machines as operating on a finite set of registers, each of which holds a natural number. And this is where things get interesting:

In a “Magnificent” Minsky Machine, the current state of the world is represented by the machine’s current state and the current state of its registers. But a “Marvellous” Minsky Machine does away with the machine’s current state. The “state of the world” is encoded entirely by the contents of its registers.

This brings up an interesting possibility: We were able to “flatten” a Magnificent Minsky Machine into a Marvellous Minsky Machine, getting rid of its states by encoding the machine states into register values.

Could we “flatten” the registers by encoding their value into a single natural number? Could we develop a Minsky Machine that only needs one number to encode its entire state?

encoding state with prime factorization

When Gödel developed his numbering system, he needed a way to encode a finite number of finite numbers in a single finite natural number. In this way, he could encode any statement in any formal language as a single natural number.

There are various ways to encode a finite number of finite numbers in a single finite natural number. For example, we could just write the numbers out like text, separating the numbers with a comma like a CSV file. We wouldn’t need 8 or 16 bits for each “character,” 4 bits could handle this easily.

While that’s certainly possible, the things that lists of characters make easy, like searching for any arbitrary substring, are not useful to us, and the things that are useful to us are not easy with lists of characters.

We know our requirements: We need to quickly and easily decrement one or more “registers,” preferably in a single step. And we need to quickly and easily increment several “registers” in a single step.

Gödel chose to use prime factorization to encode a finite number of finite numbers, and that method suits us perfectly.

With prime factorization, we encode lists of numbers as exponents of consecutive primes. Thus, the list [1, 9, 6, 2] is encoded as 2¹3⁹5⁶7², or 30,139,593,750. While the numbers might become very large, they’re workable for the kinds of toy exercises we’re performing here.
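
As a minimal sketch of this encoding, we can turn a short list into a single number and back again with BigInts. The names `FIRST_PRIMES`, `encode`, and `decode` are assumptions made up for this illustration and are not part of the essay's code, and the sketch only handles lists of up to four numbers:

```javascript
// Hypothetical helpers for illustration; only the first four primes are needed here.
const FIRST_PRIMES = [2n, 3n, 5n, 7n];

// Encode a list of natural numbers as exponents of consecutive primes.
const encode = (list) =>
  list.reduce((acc, exponent, i) => acc * FIRST_PRIMES[i] ** BigInt(exponent), 1n);

// Decode by repeatedly dividing out each prime and counting the divisions.
const decode = (n, length = FIRST_PRIMES.length) =>
  FIRST_PRIMES.slice(0, length).map((prime) => {
    let exponent = 0;
    while (n % prime === 0n) {
      n = n / prime;
      exponent = exponent + 1;
    }
    return exponent;
  });

console.log(encode([1, 9, 6, 2])); // 30139593750n
console.log(decode(30139593750n)); // [ 1, 9, 6, 2 ]
```

Incrementing a "register" is a multiplication by its prime, and decrementing is a division, which is what makes this encoding a good fit for our requirements.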

Prime factorization can be used to encode a Minsky Machine’s state, using the exponents of consecutive primes as virtual registers. But that’s not its only use. Action clauses in our rules are tuples of a register and an amount to increment. So (3^2) means, “increment the register identified as 3, by 2.”

The two is definitely a natural number. But the register indicator could be anything. Our Minsky Machines don’t allow indirect access or iteration over them, so there’s nothing really binding them to consecutive integers.

So what if we labeled the registers with consecutive primes? In that case, an action clause like (1^1)(4^1)(6^1) would become (2^1)(7^1)(13^1). And we can do the same thing with guard clauses: (3^1)(5^1) would become (5^1)(11^1).

Now we see the obvious!24

We can represent action and guard clauses as natural numbers with prime factorization. So a rule like (1^1)(4^1)(6^1)/(3^1)(5^1) would become (2^1)(7^1)(13^1)/(5^1)(11^1) when we swap the consecutive tape numbers for consecutive primes. We then turn that notation into an arithmetic expression by converting the increments and decrements to exponents, which would be 2¹7¹13¹/5¹11¹.

And that can be written as a pair of natural numbers separated with a forward slash, i.e. 182/55.

The final part of our single-number Minsky Machine will be its input and output. With a Magnificent or Marvellous Minsky Machine, we listed the initial values of the registers. So for a machine having a value of 0 in register 1, 3 in register 2, and 13 in register 3, we would include the parameters 0, 3, 13.

And for output, our machine would return the value of all the registers, and the value we want would be in one or more of them, depending on how we wrote our program. For example, the Marvellous Multiplication Machine returns 39, 0, 0, 0, 0, 0, 0, 0.

We will encode these with prime factorization as well. So the input of 0, 3, 13 would become 3³5¹³, or 32,958,984,375. Likewise, the output of 39, 0, 0, 0, 0, 0, 0, 0 would become 2³⁹, or 549,755,813,888.

This transformation is extremely simple, and once again we can perform it mechanically:25

const PRIMES = [
  2n, 3n, 5n, 7n, 11n,
  13n, 17n, 19n, 23n, 29n,

  // ... 180 more primes ...

  1153n, 1163n, 1171n, 1181n, 1187n,
  1193n, 1201n, 1213n, 1217n, 1223n
];

const tapeToPrime = tape => PRIMES[tape - 1];
const exponentiate = ([tape, amount]) => pow(tapeToPrime(tape), BigInt(amount));
const multiply = (x, y) => x * y;
const godelizeClauses = clauses => clauses.map(exponentiate).reduce(multiply, 1n);

export const toMasterful = (magnificentProgram) => {
  const parsedMarvellous = parse(toMarvellous(magnificentProgram));
  const godelized = parsedMarvellous[1].map(
    ([actions, guards]) => [actions, guards].map(godelizeClauses)
  );

  return pp(godelized);
}

Now what about actually evaluating a single-number Minsky Machine? If you are extremely keen, you can write your own implementation. We want to end up with something like:

const evaluate = (program, initialState) => {
  // ...
};

// masterful adding machine
// 3²5³ => 1,125
// 2⁵ => 32
evaluate(
  '2/3, 2/5',
  1125
)
  //=> 32n

// masterful multiplication machine
// 3³5¹³ => 32,958,984,375
// 2³⁹ => 549,755,813,888
evaluate(
  '182/55, 17/11, 95/119, 1/17, 11/13, 17/19, 11/3, 1/5',
  32958984375n
)
  //=> 549755813888n

We will need functions to convert BigInts to their factorization, and factorizations back to BigInts. If you wish, you may use these helpers for your implementation:

// Converts BigInts to their prime factorizations,
// represented as a map from prime to exponent.
//
// Examples:
//   17 => Map { 17 => 1 }
//   39 => Map { 3 => 1, 13 => 1 }
//   44 => Map { 2 => 2, 11 => 1}
//
// Accepts BigInts or numbers, returns a factorization
// as ordinary numbers

// Relies on simple factoring code adapted from
// http://www.javascripter.net/math/primes/factorization.htm
export const toFactors = (n) => {
  n = BigInt(n);

  if (n <= 0n) return;

  const factorization = new Map();

  while (n > 1n) {
    const primeFactorBigInt = leastFactor(n);
    const primeFactor = unsafeToNumber(primeFactorBigInt);

    factorization.set(
      primeFactor,
      factorization.has(primeFactor) ? factorization.get(primeFactor) + 1 : 1
    );

    n = n / primeFactorBigInt;
  }

  return factorization;
}

const pow = (base, exponent) => {
  base = BigInt(base);
  exponent = BigInt(exponent);

  if (exponent < 0n) return;

  let result = 1n;

  while (exponent-- > 0n) result = result * base;

  return result;
}

// Converts prime factorization to BigInts
export const fromFactors = g => [...g.entries()].reduce((acc, [factor, exponent]) => acc * pow(factor, exponent), 1n);

// find the least factor in n by trial division
function leastFactor(composite) {

 // if (isNaN(n) || !isFinite(n)) return NaN;

 if (composite === 0n) return 0n;
 if (composite % 1n || composite*composite < 2n) return 1n;
 if (composite % 2n === 0n) return 2n;
 if (composite % 3n === 0n) return 3n;
 if (composite % 5n === 0n) return 5n;

 for (let i = 7n; (i * i) <= composite; i += 30n) {
   if (composite % i         === 0n) return i;
   if (composite % (i +  4n) === 0n) return i + 4n;
   if (composite % (i +  6n) === 0n) return i + 6n;
   if (composite % (i + 10n) === 0n) return i + 10n;
   if (composite % (i + 12n) === 0n) return i + 12n;
   if (composite % (i + 16n) === 0n) return i + 16n;
   if (composite % (i + 22n) === 0n) return i + 22n;
   if (composite % (i + 24n) === 0n) return i + 24n;
 }

 // it is prime
 return composite;
}

function unsafeToNumber(big) {
  return parseInt(big.toString(), 10);
}

Have a go at it before peeking!

the masterful minsky machine

This is the Masterful Minsky Machine. It has just one state, and it encodes its state and clauses with single natural numbers:

const parse = (program) => program.trim().split(/,?\s+/).map(
  rule => rule.split('/').map(
    chars => parseInt(chars, 10)
  )
);

const interpret = (rules, state) => {
  run: while (true) {
    for (const [action, guard] of rules) {
      const factoredState = toFactors(state);

      // check guard clause
      const factoredGuard = toFactors(guard);
      if ([...factoredGuard.keys()].some(
        factor => factoredGuard.get(factor) > (factoredState.get(factor) || 0)
      )) continue;

      for (const [factor, guardValue] of factoredGuard.entries()) {
        const oldStateValue = factoredState.get(factor);

        factoredState.set(factor, oldStateValue - guardValue);
      }

      const actionGuard = toFactors(action);
      for (const [factor, actionValue] of actionGuard.entries()) {
        const oldStateValue = factoredState.get(factor) || 0;

        factoredState.set(factor, oldStateValue + actionValue);
      }

      state = fromFactors(factoredState);

      continue run;
    }
    break;
  }

  return state;
}

const evaluate = (program, initialState) => interpret(parse(program), initialState);

And of course, we can try it out:

// the masterful adding machine
// 3²5³ => 1,125
// 2⁵ => 32
evaluate(
  '2/3, 2/5',
  1125
)
  //=> 32n

// the masterful multiplication machine
// 3³5¹³ => 32,958,984,375
// 2³⁹ => 549,755,813,888
evaluate(
  '182/55, 17/11, 95/119, 1/17, 11/13, 17/19, 11/3, 1/5',
  32958984375n
)
  //=> 549755813888n

It works. And that’s not all… Does this look familiar?

// the masterful fibonacci machine
// 78 * 5⁽⁷⁻¹⁾ => 1,218,750
// 2¹³ => 8,192
evaluate(
  `17/65, 133/34, 17/19, 23/17, 2233/69, 23/29, 31/23,
   74/341, 31/37, 41/31, 129/287, 41/43, 13/41, 1/13, 1/3`,
  1218750
)
  //=> 8192n

Our Masterful Minsky Machine evaluates FRACTRAN programs, and it’s not difficult to see why.


On Equivalence


Alan Matheson Turing


The word “equivalence” has a very specific meaning in computability. When we say that two computation machines are “computationally equivalent,” we mean that they both share some set of properties we consider important with respect to computing, although they may differ wildly in other respects.

Computers that we program with programming languages are computation machines. If we program one of them in the language Haskell, and program the other in the language Piet, we can say that they are both computationally equivalent, as there is nothing in principle that we can compute in Haskell that we can’t compute with Piet.26

Even if “computational equivalence” doesn’t care about how a computation is achieved, how it does it matters in a different, deeper way. The languages CoffeeScript and JavaScript are equivalent in a deeper way than Haskell and Piet, because under the hood, “CoffeeScript is just JavaScript,” as creator Jeremy Ashkenas says.

With this in mind, let’s ask ourselves: Is a Marvellous Minsky Machine equivalent to a FRACTRAN machine in a superficial, computationally equivalent way? Or is there a deeper relationship? If we look under the hood, will we find that “FRACTRAN is just a Marvellous Minsky Machine”?

fractran and marvellous minsky machines

Examining the code superficially, FRACTRAN looks very different from the Marvellous Minsky Machine. The core of the FRACTRAN interpreter tests a remainder, then performs one multiplication and one division if the remainder is zero:

for (const [numerator, denominator] of program) {
  if (n % denominator === 0n) {
    n = (n * numerator) / denominator;
    continue program_start;
  }
}

By contrast, the Marvellous Minsky Machine performs multiple comparisons of magnitude, then subtractions and additions, all wrapped in a lot of factorization faff:

const factoredState = toFactors(state);

// check guard clause
const factoredGuard = toFactors(guard);
if ([...factoredGuard.keys()].some(
  factor => factoredGuard.get(factor) > (factoredState.get(factor) || 0)
)) continue;

for (const [factor, guardValue] of factoredGuard.entries()) {
  const oldStateValue = factoredState.get(factor);

  factoredState.set(factor, oldStateValue - guardValue);
}

const actionGuard = toFactors(action);
for (const [factor, actionValue] of actionGuard.entries()) {
  const oldStateValue = factoredState.get(factor) || 0;

  factoredState.set(factor, oldStateValue + actionValue);
}

state = fromFactors(factoredState);

Those of you familiar with number theory have already grasped that these are both the same algorithm!

If we divide 1,218,750 by 65 and have no remainder, it’s because:

  1. The prime factorization of 65 is 5¹13¹.
  2. The prime factorization of 1,218,750 is 2¹3¹5⁶13¹.
  3. Checking whether the remainder of 1,218,750 divided by 65 is zero is exactly the same thing as checking that none of the prime factors of 65 has an exponent greater than the exponent of the same factor in 1,218,750.

Likewise, actually dividing 1,218,750 by 65 and getting a result of 18,750 is exactly the same thing as turning 2¹3¹5⁶13¹ into 2¹3¹5⁵13⁰, which is 18,750. That’s just basic arithmetic.

If we look at how our code handles the action clauses, it’s no different. Multiplying 18,750 by 17 is exactly the same thing as turning 2¹3¹5⁵ into 2¹3¹5⁵17¹, for the same reason: “Arithmetic.”

Marvellous Minsky Machines are deeply related to FRACTRAN interpreters, because they are FRACTRAN interpreters. We’re just doing a lot of things by hand that the %, /, and * operators do for us in JavaScript, or in whatever language Conway would have used in 1972, had he bothered to write a computer implementation.
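
To make the correspondence concrete, here is a small sketch that performs the guard check both ways for the numbers used above. The helper `exponentOf` is an assumption introduced for this illustration (it is not the essay's `toFactors`), and the primes of 65 are listed by hand rather than computed:

```javascript
// Illustrative helper (an assumption, not the essay's toFactors): the exponent
// of `prime` in the factorization of `n`, found by repeated division.
const exponentOf = (n, prime) => {
  let exponent = 0;
  while (n % prime === 0n) {
    n = n / prime;
    exponent = exponent + 1;
  }
  return exponent;
};

const state = 1218750n;        // 2¹3¹5⁶13¹
const guardPrimes = [5n, 13n]; // 65 = 5¹13¹

// The guard check: no prime of 65 needs more than the state can supply.
const guardPasses = guardPrimes.every(
  (prime) => exponentOf(65n, prime) <= exponentOf(state, prime)
);

console.log(guardPasses);         // true
console.log(state % 65n === 0n);  // true, the same test in one operation
console.log((state / 65n) * 17n); // 318750n, i.e. 2¹3¹5⁵17¹
```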

polygame

In 1979, Ludmila Gregušová and Ivan Korec published “Small Universal Minsky Machines” in The Proceedings of the 8th Symposium on Mathematical Foundations of Computer Science. In it, they describe universal Minsky Machines that can compute anything computable.27

In the paper, they present a universal machine, U, with 37 rules. It can simulate a Minsky Machine, just as a Universal Turing Machine can simulate any Turing Machine. A Universal Minsky Machine takes as its input a description of another Minsky Machine, plus input for that machine, and produces as its output what the described Minsky Machine would produce.

These machines are ingenious, and a lot of work goes into figuring out exactly how to encode the machine to be simulated so that it’s amenable to being simulated. For example, Gregušová and Korec also give a 32-rule universal machine, but it requires an enormously larger encoding for the simulated machine than the 37-rule version.

Now, what do Universal Minsky Machines tell us about FRACTRAN?

Well, if:

  1. A Magnificent Minsky Machine, U, can simulate any Magnificent Minsky Machine, and;
  2. For every Magnificent Minsky Machine, there is an equivalent Marvellous Minsky Machine, and;
  3. For every Marvellous Minsky Machine, there is an equivalent FRACTRAN program, therefore:
  4. For every Magnificent Minsky Machine, there is an equivalent FRACTRAN program.

And thus:

  1. Since there exists a Magnificent Minsky Machine, U, that can simulate any Magnificent Minsky Machine,
  2. Therefore, there exists a FRACTRAN program that can simulate any FRACTRAN program.

And here it is, John Horton Conway’s POLYGAME:



An excerpt of Conway’s paper on FRACTRAN, showing the POLYGAME program.


There is a wonderful explanation of POLYGAME’s workings in Open Problems in Communication & Computation. POLYGAME can compute any computable function; we just have to find the function’s “catalogue number.” You’ll want to read Conway’s original description to grasp how that works.

But the thing to know about POLYGAME is that every recursively enumerable function f has a catalogue number, c. And given c * 2 ^ 2 ^ n, POLYGAME will produce 2 ^ 2 ^ f(n).

POLYGAME is a Universal FRACTRAN Program.28

The Collatz Conjecture


Lothar Collatz in mid-lecture.


The Collatz conjecture is a conjecture in mathematics that concerns a sequence defined as follows: Start with any positive integer n. Then each term is obtained from the previous term as follows: if the previous term is even, the next term is one half of the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that no matter what value of n, the sequence will always reach 1.

The sequence of terms obtained in this manner is known as the Collatz sequence. The conjecture is named after Lothar Collatz, who introduced the idea in 1937, two years after receiving his doctorate.
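
For comparison with the FRACTRAN version, the sequence itself can be sketched directly as a generator. The name `collatzSequence` is an assumption for this illustration, and the sketch assumes its argument is a positive integer:

```javascript
// A plain (non-FRACTRAN) Collatz sequence, for comparison.
// Assumes n is a positive integer; if the conjecture were false for n,
// this generator would never finish.
function* collatzSequence(n) {
  n = BigInt(n);
  yield n;
  while (n !== 1n) {
    n = n % 2n === 0n ? n / 2n : 3n * n + 1n;
    yield n;
  }
}

console.log([...collatzSequence(3)].join(', '));
// 3, 10, 5, 16, 8, 4, 2, 1
```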

Here’s a FRACTRAN program to test the Collatz Conjecture for any x. We set n to be 2ˣ, and then every subsequent value of n of the form 2ʸ is the next term of the Collatz sequence.

165/14, 11/63, 38/21, 13/7, 34/325, 1/13,
184/95, 1/19, 7/11, 13/17, 19/23, 1575/4

It halts when it reaches 2¹, corresponding to the original Collatz sequence reaching 1. If you can find a number Z that disproves the Collatz conjecture, this program will run forever when given 2ᶻ.

If you want to run it, the existing FRACTRAN interpreter given above will work, although it won’t give any clue as to what is happening. As an exercise, you might want to try rewriting our FRACTRAN interpreter to output the machine’s state after every pass over the program.

If you’re really ambitious, you’ll rewrite the interpreter as a generator. That will allow you to output the Collatz sequence. Here’s a sample implementation:

function * interpret (program, n) {
  program_start: while (true) {
    yield n;
    for (const [numerator, denominator] of program) {
      if (n % denominator === 0n) {
        n = (n * numerator) / denominator;
        continue program_start;
      }
    }
    break;
  }
}

const parse = program => program
  .trim().split(/(?:\s*,|\s)\s*/)
  .map(f => f.split('/').map(n => BigInt(n)));

const collatz =
  '165/14, 11/63, 38/21, 13/7, 34/325, 1/13, ' +
  '184/95, 1/19, 7/11, 13/17, 19/23, 1575/4';

function test (x) {
  const n = pow(2n, x);

  for (const nn of interpret(parse(collatz), n)) {
    if (log2(nn) != undefined)  console.log(log2(nn).toString());
  }
}

test(3)
  //=>
    3
    10
    5
    16
    8
    4
    2
    1

Interesting. And?

why fractran really matters


The Collatz 5n+1 simulator is an unknown fate pattern constructed by David Bell in December 2017 that simulates the Collatz 5n+1 algorithm using sliding block memory and p1 technology, while always having a population below 32000.29


In On Unsettleable Arithmetical Problems, Conway described the specific Collatz function n/2 | 3n + 1 as a bipartite linear function. He then generalized the idea to k-partite linear functions, with k linear possibilities.

A Collatz function is a k-partite linear function where each possibility is defined as (a * n / b) + c, and the value of the entire function is the first linear possibility with an integral result.

Next, Conway described the Collatz Game, which takes a Collatz function, a starting value of n, and a target value. For the Collatz conjecture, the Collatz function is n/2 | 3n + 1 and the target value is 1.

The Collatz game consists of generating the Collatz sequence for n, and stopping when its next term is the target value, when its next term is undefined, or when the sequence enters a loop.
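
Here is one way these definitions might be sketched in code. Everything named here is an assumption for illustration: possibilities are represented as `[a, b, c]` triples of BigInts, and `maxSteps` is a safety limit added for the sketch that is not part of Conway's definition of the game.

```javascript
// Each possibility is a hypothetical triple [a, b, c] standing for (a * n / b) + c.
const collatzFunction = (possibilities) => (n) => {
  for (const [a, b, c] of possibilities) {
    // Take the first possibility whose result is an integer.
    if ((a * n) % b === 0n) return (a * n) / b + c;
  }
  return undefined; // no possibility applies
};

// Plays the game; maxSteps is an added safety limit, not part of the definition.
const play = (fn, n, target, maxSteps = 10000) => {
  const seen = new Set();
  for (let step = 0; step < maxSteps; step++) {
    if (seen.has(n)) return 'loop';
    seen.add(n);
    const next = fn(n);
    if (next === undefined) return 'undefined';
    if (next === target) return 'target';
    n = next;
  }
  return 'gave up';
};

const classic = collatzFunction([[1n, 2n, 0n], [3n, 1n, 1n]]); // n/2 | 3n + 1
const halve = collatzFunction([[1n, 2n, 0n]]);                 // n/2 only

console.log(play(classic, 27n, 1n)); // target
console.log(play(classic, 4n, 5n));  // loop
console.log(play(halve, 12n, 1n));   // undefined
```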

We say that a Collatz game is decidable if we can create an algorithm for determining whether that game reaches the target value for all starting values of n. Lots of Collatz games are decidable. For example, the game n/2 | n + 1 is decidable.

Where does FRACTRAN come in?

Every FRACTRAN program is a k-partite Collatz function where the value of c for every possibility is 0. Thus, every FRACTRAN program is both a Collatz function and the basis of a Collatz game. We will call the set of all Collatz games that are also FRACTRAN programs, FRACTRAN games.
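
As a sketch of that correspondence, we can rewrite a FRACTRAN program as a list of linear possibilities with c fixed at 0 and iterate it until no possibility applies. The names `toPossibilities`, `step`, and `run` are assumptions made up for this illustration:

```javascript
// Turn a FRACTRAN program into hypothetical [a, b, c] possibilities with c = 0.
const toPossibilities = (program) =>
  program.trim().split(/\s*,\s*/).map((fraction) => {
    const [a, b] = fraction.split('/').map((digits) => BigInt(digits));
    return [a, b, 0n];
  });

// One step of the k-partite function: the first possibility with an integral result.
const step = (possibilities, n) => {
  for (const [a, b, c] of possibilities) {
    if ((a * n) % b === 0n) return (a * n) / b + c;
  }
  return undefined;
};

// Iterate until no possibility applies, which is where the FRACTRAN program halts.
const run = (possibilities, n) => {
  for (let next = step(possibilities, n); next !== undefined; next = step(possibilities, n)) {
    n = next;
  }
  return n;
};

// The adding machine from earlier: 3²5³ => 2⁵
console.log(run(toPossibilities('2/3, 2/5'), 1125n)); // 32n
```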

Rice’s Theorem states that all non-trivial, semantic properties of programs are undecidable. That includes whether the program enters a particular state. For a FRACTRAN program, its state is encoded in n, so it is undecidable whether FRACTRAN programs ever generate a particular target value of n.

And it follows that FRACTRAN games are undecidable. This doesn’t speak directly to the Collatz conjecture, because (1/2)n | 3n + 1 is not a FRACTRAN game. But it does follow that arbitrary Collatz games are undecidable, since the set of all Collatz games includes the set of all FRACTRAN games.

Conway went on to do much more work on the subject of Collatz functions and Collatz games, including addressing Collatz functions like (1/2)n | 3n + 1 through the medium of POLYGAME. Others have taken these ideas further. Stuart A. Kurtz and Janos Simon built upon Conway’s work with FRACTRAN in The Undecidability of the Generalized Collatz Problem.

FRACTRAN, then, is more than just a ridiculous way to represent register machines: Its correspondence to Collatz functions and its universality make a direct contribution to our understanding of Collatz functions, and may help us one day determine whether the Collatz Conjecture is provably true, or undecidable.

THE END

(discuss on /r/math, /r/programminglanguages, and Hacker News)


Addenda


John Horton Conway © 2005 Thane Plambeck


conway’s “fractran: a ridiculous logical language” lecture



norman wildberger’s lecture on the collatz conjecture



vikram ramanathan on fractran

Vikram Ramanathan has written a nice, compact essay about their experience developing a FRACTRAN interpreter and how to approach FRACTRAN programming.


notes
  1. “Can you solve it? John Horton Conway, playful maths genius” 

  2. Mathematician John Horton Conway, a ‘magical genius’ known for inventing the ‘Game of Life,’ dies at age 82 

  3. Obituary in The Guardian 💰 

  4. John Horton Conway, a ‘Magical Genius’ in Math, Dies at 82 💰 

  5. HashLife in the Browser 

  6. Time, Space, and Life As We Know It 

  7. Cafe au Life, an implementation of HashLife in CoffeeScript 

  8. Elegance and the Surreals 

  9. A Surreal Encounter with a Winged Elephant 

  10. Open Problems in Communication & Computation is available on SciHub, or by paying rent to Springer-Verlag. I chose to use SciHub and to simultaneously purchase a used copy of the book online. I don’t know how long we’ll still be able to buy used books, but I’m taking full advantage of the privilege while we’re still permitted to do so. 

  11. Feel free to use FRACTRAN to ace the programming interview when someone asks for a program to compute fib(n).

  12. The complete debug output is in this gist

  13. The complete list of values of n is in this gist

  14. BigInt is slated to enter JavaScript formally in ES2020. Most of the code is deliberately “lowest common denominator.” However, exceptions are justified when the feature greatly simplifies the code by removing accidental complexity, and doesn’t affect the central idea of the code itself. 

  15. If you wish to try out this code for yourself, feel free to peruse the code repository for all the code in this essay, plus some other experiments that didn’t make the cut. You will also find instructions on how to use babeljs.io to compile the code. 

  16. Before anybody quotes Dijkstra, remember that “Considered Harmful” Essays Considered Harmful 

  17. There is a persistent myth in software that Enterprise Technology Sales are conducted on the golf course, and that the sales team who spends the most on steak and bourbon wins the deal no matter how crap-tastic their proposal.

    People promulgating this myth greatly underestimate the number of individuals in any non-trivial organization who lie in wait for proposals, ready to make their own reputations by pointing out the flaws in every proposed adoption project. 

  18. Minsky Machines on esolangs.org 

  19. Minsky Machines on Wikipedia 

  20. “Magnificent Minsky Machine” is not a term of art; if anything, it’s a fond reference to Raymond Smullyan’s use of songbirds to describe combinators. 

  21. An equivalent metaphor for a Magnificent Minsky Machine is to have a single tape, but multiple tape-heads. We could imagine them as moving back and forth, and when one tape-head encounters another, it crawls over the other tape-head like crabs crawling over each other in a great heap. 

  22. Using ASCII to draw things rather than say things has a very long tradition in programming, and ASCII graphics are just as “graphical” as an animated, interactive high-definition virtual-reality depiction of a process.

    And besides, the Latin root of the word “graph” refers to graphite, the material used in pencils. So if we want to be literal about this, if you can write it with a pencil, it’s a graphic. 

  23. The suggested algorithm for automatically deriving a Marvellous Minsky Machine from any Magnificent Minsky Machine actually only works for the subset of Magnificent Minsky Machines that do not use a transfer to state 0 to explicitly halt.

    A good exercise for the enthusiastic reader is to consider this question: For every Magnificent Minsky Machine that explicitly halts, is there an equivalent Magnificent Minsky Machine that only halts implicitly? If so, how can we derive it? 

  24. So obvious that I have no doubt that you saw it much more quickly than I did when I first read about these subjects. 

  25. Our implementation can only handle Magnificent and Marvellous Minsky Machines with a maximum of 200 registers, but that’s enough to demonstrate the principle, and it is not particularly difficult to write a version of tapeToPrime that can handle any number of registers. 

  26. Piet is an esoteric programming language in which programs look like paintings in the neo-plasticism style. Piet was invented by David Morgan-Mar and is named after geometric abstract art pioneer Piet Mondrian. There’s an excellent book devoted to the practical use of the language. 

  27. Alas, The Proceedings of the 8th Symposium on Mathematical Foundations of Computer Science is also behind the Springer-Verlag rent-collecting gate. 

  28. Others have written Universal FRACTRAN Programs. The “catalogue” approach has certain benefits from the perspective of proving things about FRACTRAN, but if we want to write a universal program that behaves more like an interpreter, we want a FRACTRAN program that interprets directly encoded FRACTRAN programs, rather than an index into a catalogue.

    For example, Chris Lomont has written an 84-fraction Universal FRACTRAN Interpreter in FRACTRAN. Lomont’s solution doesn’t require a catalogue, we just directly encode a FRACTRAN program as input using a very approachable base-11 numbering scheme.

    And as a final tribute to Conway, there’s a competition to find the FRACTRAN interpreter with the fewest fractions.

  29. Conway became very tired of all the attention his Game of Life garnered, but its appeal seems to be evergreen. For those still fascinated by the Game of Life, this 5n+1 simulator connects Conway’s Game of Life with his work on the Collatz conjecture. 

https://raganwald.com/2020/05/03/fractran
Exploring Regular Expressions, Part II: Regular Languages and Finite-State Automata

This is Part II of “Exploring Regular Expressions.” If you haven’t already, you may want to read Part I first, where we wrote a compiler that translates formal regular expressions into finite-state recognizers.

You may also want another look at the essay, A Brutal Look at Balanced Parentheses, Computing Machines, and Pushdown Automata. It covers the concepts behind finite-state machines and the kinds of “languages” they can and cannot recognize.


Table of Contents

The Essentials from Part I

Regular Expressions

Our Code So Far

For Every Regular Expression, There Exists an Equivalent Finite-State Recognizer

Beyond Formal Regular Expressions

Implementing Level One Features

Implementing Level Two Features

What Level Two Features Tell Us, and What They Don’t

For Every Finite-State Recognizer, There Exists An Equivalent Formal Regular Expression

The Essentials from Part I

If you’re familiar with formal regular expressions, and are very comfortable with the code we presented in Part I, or just plain impatient, you can skip ahead to Beyond Formal Regular Expressions.

But for those who want a refresher, we’ll quickly recap regular expressions and the code we have so far.

Regular Expressions

In Part I, and again in this essay, we will spend a lot of time talking about formal regular expressions. Formal regular expressions are a minimal way to describe “regular” languages, and serve as the building blocks for the regexen we find in most programming languages.

Formal regular expressions describe languages as sets of sentences. The three basic building blocks for formal regular expressions are the empty set, the empty string, and literal symbols:

  • The symbol ∅ describes the language with no sentences, { }, also called “the empty set.”
  • The symbol ε describes the language containing only the empty string, { '' }.
  • Literals such as x, y, or z describe languages containing single sentences, containing single symbols. e.g. The literal r describes the language { 'r' }.

What makes formal regular expressions powerful, is that we have operators for alternating, catenating, and quantifying regular expressions. Given that x is a regular expression describing some language X, and y is a regular expression describing some language Y:

  1. The expression x|y describes the union of the languages X and Y, meaning, the sentence w belongs to x|y if and only if w belongs to the language X, or w belongs to the language Y. We can also say that x|y represents the alternation of x and y.
  2. The expression xy describes the language XY, where a sentence ab belongs to the language XY if and only if a belongs to the language X, and b belongs to the language Y. We can also say that xy represents the catenation of the expressions x and y.
  3. The expression x* describes the language Z, where the sentence ε (the empty string) belongs to Z, and, the sentence pq belongs to Z if and only if p is a sentence belonging to X, and q is a sentence belonging to Z. We can also say that x* represents a quantification of x.

Before we add the last rule for regular expressions, let’s clarify these three rules with some examples. Given the constants a, b, and c, resolving to the languages { 'a' }, { 'b' }, and { 'c' }:

  • The expression b|c describes the language { 'b', 'c' }, by rule 1.
  • The expression ab describes the language { 'ab' } by rule 2.
  • The expression a* describes the language { '', 'a', 'aa', 'aaa', ... } by rule 3.

Our operations have a precedence that follows the order of the rules as presented, from lowest (alternation) to highest (quantification). So:

  • The expression a|bc describes the language { 'a', 'bc' } by rules 1 and 2.
  • The expression ab* describes the language { 'a', 'ab', 'abb', 'abbb', ... } by rules 2 and 3.
  • The expression b|c* describes the language { '', 'b', 'c', 'cc', 'ccc', ... } by rules 1 and 3.

As with the algebraic notation we are familiar with, we can use parentheses:

  • Given a regular expression x, the expression (x) describes the language described by x.

This allows us to alter the way the operators are combined. As we have seen, the expression b|c* describes the language { '', 'b', 'c', 'cc', 'ccc', ... }. But the expression (b|c)* describes the language { '', 'b', 'c', 'bb', 'bc', 'cb', 'cc', 'bbb', ... }.
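
We can spot-check these claims with JavaScript's own regexen, used here only as a convenient stand-in for formal regular expressions. The helper `belongsTo` is an assumption for this illustration; it anchors the expression so that the entire sentence must match:

```javascript
// Anchor each expression with ^ and $ so that the whole string must match.
// The non-capturing group keeps alternation from escaping the anchors.
const belongsTo = (expression, sentence) =>
  new RegExp(`^(?:${expression})$`).test(sentence);

console.log(belongsTo('b|c*', 'ccc'));   // true
console.log(belongsTo('b|c*', 'bb'));    // false
console.log(belongsTo('(b|c)*', 'bb'));  // true
console.log(belongsTo('(b|c)*', 'bcb')); // true
console.log(belongsTo('a|bc', 'ac'));    // false
console.log(belongsTo('ab*', 'abbb'));   // true
```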

It is quite obvious that regexen borrowed a lot of their syntax and semantics from regular expressions. Leaving aside the mechanism of capturing and extracting portions of a match, almost every regular expression is also a regex. For example, /reggiee*/ is a regex that matches words like reggie, reggiee, and reggieee anywhere in a string.

Our Code So Far

In Part I, we established that for every formal regular expression, there is an equivalent finite-state recognizer, establishing that the set of all languages described by formal regular expressions–that is to say, regular languages–is a subset of the set of all languages recognized by finite-state automata.

We did this in constructive proof fashion by writing a compiler that takes any formal regular expression as input, and returns a JSON description of an equivalent finite-state recognizer. We also wrote an automator that turns the description of a finite state recognizer into a JavaScript function that takes any string as input and answers whether the string is recognized.

Thus, we can take any formal regular expression and get a function that recognizes strings in the language described by the formal regular expression. And because the implementation is a finite-state automaton, we know that it can recognize strings in at most linear time, which can be an improvement over some regex implementations for certain regular expressions.

We’re going to revisit the final version of most of our functions.


the shunting yard

Our pipeline of tools starts with a shunting yard function that takes a regular expression in infix notation, and translates it into reverse-polish representation. It also takes a definition dictionary that configures the shunting yard by defining operators, a default operator to handle catenation, and some details on how to handle escaping symbols like parentheses that would otherwise be treated as operators.

It is hard-wired to treat ( and ) as parentheses for controlling the order of evaluation.

function error(m) {
  console.log(m);
  throw m;
}

function peek(stack) {
  return stack[stack.length - 1];
}

function shuntingYard (
  infixExpression,
  {
    operators,
    defaultOperator,
    escapeSymbol = '`',
    escapedValue = string => string
  }
) {
  const operatorsMap = new Map(
    Object.entries(operators)
  );

  const representationOf =
    something => {
      if (operatorsMap.has(something)) {
        const { symbol } = operatorsMap.get(something);

        return symbol;
      } else if (typeof something === 'string') {
        return something;
      } else {
        error(`${something} is not a value`);
      }
    };
  const typeOf =
    symbol => operatorsMap.has(symbol) ? operatorsMap.get(symbol).type : 'value';
  const isInfix =
    symbol => typeOf(symbol) === 'infix';
  const isPrefix =
    symbol => typeOf(symbol) === 'prefix';
  const isPostfix =
    symbol => typeOf(symbol) === 'postfix';
  const isCombinator =
    symbol => isInfix(symbol) || isPrefix(symbol) || isPostfix(symbol);
  const awaitsValue =
    symbol => isInfix(symbol) || isPrefix(symbol);

  const input = infixExpression.split('');
  const operatorStack = [];
  const reversePolishRepresentation = [];
  let awaitingValue = true;

  while (input.length > 0) {
    const symbol = input.shift();

    if (symbol === escapeSymbol) {
      if (input.length === 0) {
        error(`Escape symbol ${escapeSymbol} has no following symbol`);
      } else {
        const valueSymbol = input.shift();

        if (awaitingValue) {
          // push the escaped value of the symbol

          reversePolishRepresentation.push(escapedValue(valueSymbol));
        } else {
          // value catenation

          input.unshift(valueSymbol);
          input.unshift(escapeSymbol);
          input.unshift(defaultOperator);
        }
        awaitingValue = false;
      }
    } else if (symbol === '(' && awaitingValue) {
      // opening parenthesis case, going to build
      // a value
      operatorStack.push(symbol);
      awaitingValue = true;
    } else if (symbol === '(') {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    } else if (symbol === ')') {
      // closing parenthesis case, clear the
      // operator stack

      while (operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const op = operatorStack.pop();

        reversePolishRepresentation.push(representationOf(op));
      }

      if (peek(operatorStack) === '(') {
        operatorStack.pop();
        awaitingValue = false;
      } else {
        error('Unbalanced parentheses');
      }
    } else if (isPrefix(symbol)) {
      if (awaitingValue) {
        const { precedence } = operatorsMap.get(symbol);

        // pop higher-precedence operators off the operator stack
        while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
          const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

          if (precedence < opPrecedence) {
            const op = operatorStack.pop();

            reversePolishRepresentation.push(representationOf(op));
          } else {
            break;
          }
        }

        operatorStack.push(symbol);
        awaitingValue = awaitsValue(symbol);
      } else {
        // value catenation

        input.unshift(symbol);
        input.unshift(defaultOperator);
        awaitingValue = false;
      }
    } else if (isCombinator(symbol)) {
      const { precedence } = operatorsMap.get(symbol);

      // pop higher-precedence operators off the operator stack
      while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

        if (precedence < opPrecedence) {
          const op = operatorStack.pop();

          reversePolishRepresentation.push(representationOf(op));
        } else {
          break;
        }
      }

      operatorStack.push(symbol);
      awaitingValue = awaitsValue(symbol);
    } else if (awaitingValue) {
      // as expected, go straight to the output

      reversePolishRepresentation.push(representationOf(symbol));
      awaitingValue = false;
    } else {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    }
  }

  // pop remaining symbols off the stack and push them
  while (operatorStack.length > 0) {
    const op = operatorStack.pop();

    if (operatorsMap.has(op)) {
      const { symbol: opSymbol } = operatorsMap.get(op);
      reversePolishRepresentation.push(opSymbol);
    } else {
      error(`Don't know how to push operator ${op}`);
    }
  }

  return reversePolishRepresentation;
}

the stack machine

We then use a stack machine to evaluate the reverse-polish representation. It uses the same definition dictionary to evaluate the effect of operators.

function stateMachine (representationList, {
  operators,
  toValue
}) {
  const functions = new Map(
    Object.entries(operators).map(
      ([key, { symbol, fn }]) => [symbol, fn]
    )
  );

  const stack = [];

  for (const element of representationList) {
    if (typeof element === 'string') {
      stack.push(toValue(element));
    } else if (functions.has(element)) {
      const fn = functions.get(element);
      const arity = fn.length;

      if (stack.length < arity) {
        error(`Not enough values on the stack to use ${String(element)}`)
      } else {
        const args = [];

        for (let counter = 0; counter < arity; ++counter) {
          args.unshift(stack.pop());
        }

        stack.push(fn.apply(null, args))
      }
    } else {
      error(`Don't know what to do with ${String(element)}`)
    }
  }
  if (stack.length === 0) {
    return undefined;
  } else if (stack.length > 1) {
    error(`should only be one value to return, but there were ${stack.length} values on the stack`);
  } else {
    return stack[0];
  }
}

evaluating arithmetic expressions

To evaluate an infix expression, the expression and definition dictionary are fed to the shunting yard, and then the resulting reverse-polish representation and definition dictionary are fed to the stack machine. For convenience, we have an evaluation function to do that:

function evaluate (expression, definition) {
  return stateMachine(
    shuntingYard(
      expression, definition
    ),
    definition
  );
}

The evaluate function takes a definition dictionary as an argument, and passes it to both the shunting yard and the stack machine. If we pass in one kind of definition, we have a primitive evaluator for arithmetic expressions:

const arithmetic = {
  operators: {
    '+': {
      symbol: Symbol('+'),
      type: 'infix',
      precedence: 1,
      fn: (a, b) => a + b
    },
    '-': {
      symbol: Symbol('-'),
      type: 'infix',
      precedence: 1,
      fn: (a, b) => a - b
    },
    '*': {
      symbol: Symbol('*'),
      type: 'infix',
      precedence: 3,
      fn: (a, b) => a * b
    },
    '/': {
      symbol: Symbol('/'),
      type: 'infix',
      precedence: 2,
      fn: (a, b) => a / b
    },
    '!': {
      symbol: Symbol('!'),
      type: 'postfix',
      precedence: 4,
      fn: function factorial (a, memo = 1) {
        if (a < 2) {
          return a * memo;
        } else {
          return factorial(a - 1, a * memo);
        }
      }
    }
  },
  defaultOperator: '*',
  toValue: n => +n
};

evaluate('(1+2)3!', arithmetic)
  //=> 18
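One detail worth spelling out: the stack machine decides how many arguments to pop by reading fn.length, and JavaScript's length property counts only the parameters that come before the first one with a default value. That appears to be why factorial can carry a memo accumulator and still be treated as a one-argument postfix operator. A small sketch to confirm the behaviour (add and factorial here are local stand-ins, not the definitions from the dictionary):

```javascript
const add = (a, b) => a + b;

// same shape as the factorial operator above: one real parameter,
// plus an accumulator with a default value
function factorial (a, memo = 1) {
  return a < 2 ? a * memo : factorial(a - 1, a * memo);
}

console.log(add.length);
  //=> 2
console.log(factorial.length);
  //=> 1
console.log(factorial(3));
  //=> 6
```

If memo had no default, factorial.length would be 2, and the stack machine would pop two values for every !.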

The code for both the shunting yard and the stack machine has been extracted into a GitHub repository.


compiling formal regular expressions

With a different definition dictionary, we can compile formal regular expressions to a finite-state recognizer description:

const formalRegularExpressions = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2merged
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: catenation2
    },
    '*': {
      symbol: Symbol('*'),
      type: 'postfix',
      precedence: 30,
      fn: zeroOrMore
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

We will not reproduce all of the code needed to implement emptySet, emptyString, union2merged, catenation2, and zeroOrMore here in the text, but the full implementations can be found here.

Here it is working:

evaluate('0|1(0|1)*', formalRegularExpressions);
  //=>
    {
      "start": "G37",
      "transitions": [
        { "from": "G37", "consume": "0", "to": "G23" },
        { "from": "G37", "consume": "1", "to": "G25" },
        { "from": "G25", "consume": "0", "to": "G25" },
        { "from": "G25", "consume": "1", "to": "G25" }
      ],
      "accepting": [ "G23", "G25" ]
    }

This is a description in JSON, of this finite-state recognizer:

stateDiagram
  [*]-->G37
  G37-->G23 : 0
  G37-->G25 : 1
  G25-->G25 : 0, 1
  G23-->[*]
  G25-->[*]

It recognizes the language consisting of the set of all binary numbers.
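To see how the description behaves, we can walk it by hand. The following is a minimal sketch, assuming only the JSON shape shown above; trace is a hypothetical helper written for this illustration, and it records every state visited while consuming the input:

```javascript
// the description from above, copied verbatim
const binary = {
  start: 'G37',
  transitions: [
    { from: 'G37', consume: '0', to: 'G23' },
    { from: 'G37', consume: '1', to: 'G25' },
    { from: 'G25', consume: '0', to: 'G25' },
    { from: 'G25', consume: '1', to: 'G25' }
  ],
  accepting: ['G23', 'G25']
};

// hypothetical helper: lists the states visited while consuming `input`,
// and reports whether the walk ends in an accepting state
function trace ({ start, transitions, accepting }, input) {
  const visited = [start];
  let state = start;

  for (const symbol of input) {
    const transition = transitions.find(
      ({ from, consume }) => from === state && consume === symbol
    );

    if (transition === undefined) {
      // no transition for this symbol: the recognizer halts and fails
      return { visited, recognized: false };
    }

    state = transition.to;
    visited.push(state);
  }

  return { visited, recognized: accepting.includes(state) };
}

console.log(trace(binary, '101'));
  //=> { visited: [ 'G37', 'G25', 'G25', 'G25' ], recognized: true }
console.log(trace(binary, '01'));
  //=> { visited: [ 'G37', 'G23' ], recognized: false }
```

The second trace shows why a leading zero is rejected: G23 is accepting, but it has no outgoing transitions, so nothing may follow a 0 that starts the string.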


automation and verification

We don’t rely strictly on inspection to have confidence that the finite-state recognizers created by evaluate recognize the languages described by regular expressions. We use two tools.

First, we have an automate function that takes a JSON description of a finite-state recognizer as an argument, and returns a JavaScript recognizer function. The recognizer function takes a string as an argument, and returns true if the string belongs to the language recognized by that finite-state recognizer, and false if it doesn’t.

This is the core automate function:

function automate (description) {
  if (description instanceof RegExp) {
    return string => !!description.exec(string)
  } else {
    const {
      stateMap,
      start,
      acceptingSet,
      transitions
    } = validatedAndProcessed(description);

    return function (input) {
      let state = start;

      for (const symbol of input) {
        const transitionsForThisState = stateMap.get(state) || [];
        const transition =
          transitionsForThisState.find(
            ({ consume }) => consume === symbol
          );

        if (transition == null) {
          return false;
        }

        state = transition.to;
      }

      // reached the end. do we accept?
      return acceptingSet.has(state);
    }
  }
}

automate interprets the finite-state recognizers as it goes, and could be faster. But for the purposes of running test cases, it is sufficient for our needs. Its supporting functions can be found here.

Speaking of running tests, we use a general-purpose verify function that works for any function, and for convenience, a verifyEvaluate function that uses evaluate and automate to convert any expression into a recognizer function first:

function deepEqual(obj1, obj2) {
  function isPrimitive(obj) {
    return (obj !== Object(obj));
  }

  if (obj1 === obj2) // it's just the same object. No need to compare.
    return true;

  if (isPrimitive(obj1) && isPrimitive(obj2)) // compare primitives
    return obj1 === obj2;

  if (Object.keys(obj1).length !== Object.keys(obj2).length)
    return false;

  // compare objects with same number of keys
  for (let key in obj1) {
    if (!(key in obj2)) return false; //other object doesn't have this prop
    if (!deepEqual(obj1[key], obj2[key])) return false;
  }

  return true;
}

const pp = value => value instanceof Array ? value.map(x => x.toString()) : value;

function verify(fn, tests, ...additionalArgs) {
  try {
    const testList =
      typeof tests.entries === 'function'
        ? [...tests.entries()]
        : Object.entries(tests);
    const numberOfTests = testList.length;

    const outcomes = testList.map(
      ([example, expected]) => {
        const actual = fn(example, ...additionalArgs);

        if (deepEqual(actual, expected)) {
          return 'pass';
        } else {
          return `fail: ${JSON.stringify({ example, expected: pp(expected), actual: pp(actual) })}`;
        }
      }
    )

    const failures = outcomes.filter(result => result !== 'pass');
    const numberOfFailures = failures.length;
    const numberOfPasses = numberOfTests - numberOfFailures;

    if (numberOfFailures === 0) {
      console.log(`All ${numberOfPasses} tests passing`);
    } else {
      console.log(`${numberOfFailures} tests failing: ${failures.join('; ')}`);
    }
  } catch (error) {
    console.log(`Failed to validate: ${error}`)
  }
}

function verifyEvaluate (expression, definition, examples) {
  return verify(
    automate(evaluate(expression, definition)),
    examples
  );
}

We can put it all together and verify our “binary numbers” expression:

verifyEvaluate('0|1(0|1)*', formalRegularExpressions, {
  '': false,
  'an odd number of characters': false,
  'an even number of characters': false,
  '0': true,
  '10': true,
  '101': true,
  '1010': true,
  '10101': true
});
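A handful of examples is reassuring, but for a small language like this one we can also check exhaustively up to some length. The sketch below does not use our evaluator at all: it uses JavaScript's built-in RegExp as a stand-in for the recognizer, and compares it against an independent definition of "binary number" (a string that survives a round trip through parseInt and toString in base 2). The names binaryStrings and isCanonical are assumptions made for this illustration:

```javascript
// JavaScript's built-in RegExp, anchored, standing in for our recognizer
const binaryNumber = /^(?:0|1(?:0|1)*)$/;

// all strings over {0, 1} of length 0 through maxLength
function binaryStrings (maxLength) {
  let current = [''];
  const all = [''];

  for (let length = 1; length <= maxLength; ++length) {
    current = current.flatMap(s => [s + '0', s + '1']);
    all.push(...current);
  }

  return all;
}

// a string is a canonical binary numeral when parsing it and printing it
// back in base 2 reproduces the string exactly (no leading zeroes, not empty)
const isCanonical = s => parseInt(s, 2).toString(2) === s;

const disagreements = binaryStrings(6).filter(
  s => binaryNumber.test(s) !== isCanonical(s)
);

console.log(disagreements.length);
  //=> 0
```

The same idea should carry over to verifyEvaluate by generating the examples dictionary programmatically rather than writing it by hand.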

For Every Regular Expression, There Exists an Equivalent Finite-State Recognizer

Armed with the code that compiles a formal regular expression to an equivalent finite-state recognizer, we have a constructive demonstration of the fact that for every regular expression, there exists an equivalent finite-state recognizer.

If someone were to hand us a formal regular expression and claim that there is no equivalent finite-state recognizer for that expression, we would feed the expression into our evaluate function, it would return an equivalent finite-state recognizer, and would thus invalidate their alleged counter-example.

Another way to put this is to state that the set of all languages described by formal regular expressions is a subset of the set of all languages recognized by finite-state recognizers. In this essay, we will establish, amongst other things, that the set of all languages described by formal regular expressions is equal to the set of all languages recognized by finite-state recognizers.

In other words, we will also show that for every finite-state recognizer, there exists an equivalent formal regular expression. We’ll begin by looking at some ways to extend formal regular expressions, while still being equivalent to finite-state recognizers.


Beyond Formal Regular Expressions

Formal regular expressions are–deliberately–as minimal as possible. There are only three kinds of literals (∅, ε, and literal symbols), and three operations (alternation with |, catenation, and quantification with *). Minimalism is extremely important from a computer science perspective, but unwieldy when trying to “Get Stuff Done.”

Thus, all regexen provide functionality above and beyond formal regular expressions.


a hierarchy of regex functionality

Functionality in regexen can be organized into a rough hierarchy. Level Zero of the hierarchy is functionality provided by formal regular expressions. Everything we’ve written in Part I is at this base level.

Level One of the hierarchy is functionality that can be directly implemented in terms of formal regular expressions. For example, regexen provide a ? postfix operator that provides “zero or one” quantification, and a + postfix operator that provides “one or more” quantification.

As we know from our implementation of the Kleene star, “zero or one” can be implemented in a formal regular expression very easily. If a is a regular expression, ε|a is a regular expression that matches zero or one sentences that a accepts. So intuitively, a regex flavour that supports the expression a? doesn’t do anything we couldn’t have done by hand with ε|a.

The same reasoning goes for +: If we have the Kleene star (which ironically we implemented on top of one-or-more), we can always express “one or more” using catenation and the Kleene star. If a is a regular expression, aa* is a regular expression that matches one or more sentences that a accepts. Again, a regex flavour that supports the expression a+ doesn’t do anything we couldn’t have done by hand with aa*.
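These two equivalences can be spot-checked with JavaScript's own regexen before we write any code of our own. This is only a sketch on a few sample strings, not a proof; the empty alternative in (?:|a) plays the role that ε plays in a formal regular expression:

```javascript
// each pair is [Level One regex, hand-written equivalent without ? or +]
const pairs = [
  [/^a?$/, /^(?:|a)$/], // a? versus ε|a
  [/^a+$/, /^aa*$/]     // a+ versus aa*
];

const samples = ['', 'a', 'aa', 'aaa', 'b', 'ab'];

// for each pair, do both regexen agree on every sample?
const results = pairs.map(
  ([levelOne, levelZero]) =>
    samples.every(s => levelOne.test(s) === levelZero.test(s))
);

console.log(results);
  //=> [ true, true ]
```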

Level Two of the hierarchy is functionality that cannot be directly implemented in terms of formal regular expressions, but that still compiles to finite-state recognizers. As we mentioned in the prelude, and will show later, for every finite-state recognizer, there is an equivalent formal regular expression.

So if a particular piece of functionality can be implemented as a finite-state recognizer, then it certainly can be implemented in terms of a formal regular expression. However, compiling an expression to a finite-state machine and then deriving an equivalent formal regular expression is “going the long way ‘round.” We therefore classify such functionality as being directly implemented as a finite-state recognizer, and only indirectly implemented in terms of formal regular expressions.

Examples of Level Two functionality include complementation (if a is a regular expression, ¬a is an expression matching any sentence that a does not match), and intersection (if a and b are regular expressions, a∩b is an expression matching any sentence that both a and b match).
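To pin down what these two operators mean, here is a sketch that models recognizers as plain predicates (functions from a string to true or false). It says nothing about how complementation or intersection would be compiled to finite-state recognizers; complement, intersection, and the two sample recognizers are names made up for this illustration:

```javascript
// recognizers modelled as plain predicates: string => boolean
const startsWithOne = s => /^1(0|1)*$/.test(s);
const endsWithZero = s => /^(0|1)*0$/.test(s);

// ¬a: accepts exactly the sentences that a rejects
const complement = a => s => !a(s);

// a∩b: accepts exactly the sentences that both a and b accept
const intersection = (a, b) => s => a(s) && b(s);

const both = intersection(startsWithOne, endsWithZero);
const notBoth = complement(both);

console.log(both('10'));
  //=> true
console.log(both('11'));
  //=> false
console.log(notBoth('11'));
  //=> true
console.log(notBoth(''));
  //=> true
```

Note that the complement accepts every string that both rejects, including the empty string.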

beyond our hierarchy

There are higher levels of functionality, but they involve functionality that cannot be implemented with finite-state recognizers.

The Chomsky–Schützenberger hierarchy categorizes grammars from Type-3 to Type-0. Type-3 grammars define regular languages. They can be expressed with formal regular expressions and recognized with finite-state recognizers. Our Level Zero, Level One, and Level Two functionalities do not provide any additional power to recognize Type-2, Type-1, or Type-0 grammars.

As we recall from A Brutal Look at Balanced Parentheses, Computing Machines, and Pushdown Automata, languages like “balanced parentheses” require a Type-2 grammar, and cannot be recognized by a finite-state automaton. Thus, features that some regexen provide, like recursive regular expressions, are beyond our levels.
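For a feel of why balanced parentheses are out of reach, consider a hand-written recognizer. The sketch below (balanced is a hypothetical function, not something from our toolkit) needs a counter that can grow without bound, and a machine with a fixed, finite number of states has nowhere to keep such a counter:

```javascript
// hypothetical recognizer for balanced parentheses, using a counter
function balanced (input) {
  let depth = 0;

  for (const symbol of input) {
    if (symbol === '(') {
      ++depth;
    } else if (symbol === ')') {
      // a closing parenthesis with nothing open can never be repaired
      if (depth === 0) return false;
      --depth;
    } else {
      // anything else is outside this language's alphabet
      return false;
    }
  }

  // every parenthesis that was opened must have been closed
  return depth === 0;
}

console.log(balanced('(())()'));
  //=> true
console.log(balanced('(()'));
  //=> false
console.log(balanced(')('));
  //=> false
```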

In addition to features that enable regexen to recognize languages beyond the capabilities of finite-state recognizers, regexen also provide plenty of features for extracting match or partial match data, like capture groups. This functionality is also outside of our levels, as we are strictly concerned with recognizing sentences.


Implementing Level One Features

As mentioned, the ? and + operators from regexen can be implemented as “Level One” functionality. a? can be expressed as ε|a, and a+ can be expressed as aa*.

The easiest way to implement these new operators is to write new operator functions. Let’s begin by extending our existing operators:

function dup (a) {
  const {
    start: oldStart,
    transitions: oldTransitions,
    accepting: oldAccepting,
    allStates
  } = validatedAndProcessed(a);

  const map = new Map(
    [...allStates].map(
      old => [old, names().next().value]
    )
  );

  const start = map.get(oldStart);
  const transitions =
    oldTransitions.map(
      ({ from, consume,  to }) => ({ from: map.get(from), consume, to: map.get(to) })
    );
  const accepting =
    oldAccepting.map(
      state => map.get(state)
    )

  return { start, transitions, accepting };
}

const extended = {
  operators: {

    // ...existing operators...

    '?': {
      symbol: Symbol('?'),
      type: 'postfix',
      precedence: 30,
      fn: a => union2merged(emptyString(), a)
    },
    '+': {
      symbol: Symbol('+'),
      type: 'postfix',
      precedence: 30,
      fn: a => catenation2(a, zeroOrMore(dup(a)))
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

verifyEvaluate('(R|r)eg(gie(e+!)?)?', extended, {
  '': false,
  'r': false,
  'reg': true,
  'Reg': true,
  'Regg': false,
  'Reggie': true,
  'Reggieeeeeee!': true
});
  //=> All 7 tests passing

This is fine. Its only drawback is that our faith that we are not doing anything a regular expression couldn’t do is based on carefully inspecting the functions we wrote (a => union2merged(emptyString(), a) and catenation2(a, zeroOrMore(dup(a)))) to ensure that we are replicating functionality that is baked into formal regular expressions.1

But that isn’t in the spirit of our work so far. What we are claiming is that for every regex containing the formal regular expression grammar plus the quantification operators ? and +, there is an equivalent formal regular expression containing only the formal regular expression grammar.

Instead of appealing to intuition, instead of asking people to believe that union2merged(emptyString(), a) is equivalent to ε|a, what we ought to do is directly translate expressions containing ? and/or + into formal regular expressions.


implementing quantification operators with transpilation

We demonstrated that there is a finite-state recognizer for every formal regular expression by writing a function to compile formal regular expressions into finite-state recognizers. We will take the same approach of demonstrating that there is a Level Zero (a/k/a “formal”) regular expression for every Level One (a/k/a extended) regular expression:

We’ll write a function to compile Level One to Level Zero regular expressions. And we’ll begin with our evaluator.

Recall that our basic evaluator can compile an infix expression into a postfix list of symbols, which it then evaluates. But it knows nothing about what its operators do. If we supply operators that perform arithmetic, we have a calculator. If we supply operators that create and combine finite-state recognizers, we have a regular-expression to finite-state recognizer compiler.

We can build a transpiler exactly the same way: Use our evaluator, but supply a different set of operator definitions. We’ll start by creating a transpiler that transpiles formal regular expressions to formal regular expressions. The way it will work is by assembling an expression in text instead of assembling a finite-state recognizer.
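The idea of swapping operator definitions can be seen in miniature. The sketch below is a stripped-down stand-in for our stack machine, under some simplifying assumptions: operators are looked up by name rather than by Symbol, and every operator takes exactly two arguments. evaluateRPN, calculate, and render are names invented for this illustration:

```javascript
// simplified stand-in for the stack machine: binary operators only
function evaluateRPN (rpn, { operators, toValue }) {
  const stack = [];

  for (const element of rpn) {
    if (operators.has(element)) {
      const b = stack.pop();
      const a = stack.pop();

      stack.push(operators.get(element)(a, b));
    } else {
      stack.push(toValue(element));
    }
  }

  return stack[0];
}

// operators that compute: the evaluator behaves like a calculator
const calculate = {
  operators: new Map([
    ['+', (a, b) => a + b],
    ['*', (a, b) => a * b]
  ]),
  toValue: n => +n
};

// operators that assemble text: the same evaluator behaves like a transpiler
const render = {
  operators: new Map([
    ['+', (a, b) => `(${a}+${b})`],
    ['*', (a, b) => `(${a}*${b})`]
  ]),
  toValue: n => n
};

const rpn = ['1', '2', '+', '3', '*'];

console.log(evaluateRPN(rpn, calculate));
  //=> 9
console.log(evaluateRPN(rpn, render));
  //=> '((1+2)*3)'
```

The evaluator never changes. Only the meaning attached to each operator does, and that is the property the transpiler relies on.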

Here’s the first crack at it:

function p (expr) {
  if (expr.length === 1) {
    return expr;
  } else if (expr.length === 2 && expr[0] === '`') {
    // a single escaped symbol
    return expr;
  } else {
    // always wrap: a string like (a→b)|(c→d) starts with ( and ends
    // with ), but it is not parenthesized as a whole
    return `(${expr})`;
  }
}

const toValueExpr = string => {
  if ('∅ε|→*()'.indexOf(string) >= 0) {
    return '`' + string;
  } else {
    return string;
  }
};

const transpile0to0 = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: () => '∅'
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: () => 'ε'
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: (a, b) => `${p(a)}|${p(b)}`
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: (a, b) => `${p(a)}→${p(b)}`
    },
    '*': {
      symbol: Symbol('*'),
      type: 'postfix',
      precedence: 30,
      fn: a => `${p(a)}*`
    }
  },
  defaultOperator: '→',
  toValue: toValueExpr
};

const before = '(R|r)eg(ε|gie(ε|ee*!))';

verifyEvaluate(before, formalRegularExpressions, {
  '': false,
  'r': false,
  'reg': true,
  'Reg': true,
  'Regg': false,
  'Reggie': true,
  'Reggieeeeeee!': true
});
  //=> All 7 tests passing

const after = evaluate(before, transpile0to0);

verifyEvaluate(after, formalRegularExpressions, {
  '': false,
  'r': false,
  'reg': true,
  'Reg': true,
  'Regg': false,
  'Reggie': true,
  'Reggieeeeeee!': true
});
  //=> All 7 tests passing

The result has an excess of parentheses, and does not take advantage of catenation being the default, but it works just fine.

Extending it is now trivial:

const transpile1to0q = {
  operators: {

    // ...as above...

    '?': {
      symbol: Symbol('?'),
      type: 'postfix',
      precedence: 30,
      fn: a => `ε|${p(a)}`
    },
    '+': {
      symbol: Symbol('+'),
      type: 'postfix',
      precedence: 30,
      fn: a => `${p(a)}${p(a)}*`
    }
  },

  // ...
};

const beforeLevel1 = '(R|r)eg(gie(e+!)?)?';
const afterLevel1 = evaluate(beforeLevel1, transpile1to0q);
  //=> '(R|r)→(e→(g→(ε|(g→(i→(e→(ε|((ee*)→!))))))))'

verifyEvaluate(afterLevel1, formalRegularExpressions, {
  '': false,
  'r': false,
  'reg': true,
  'Reg': true,
  'Regg': false,
  'Reggie': true,
  'Reggieeeeeee!': true
});
  //=> All 7 tests passing

Note that the postfix operators ? and + are associated with functions that create formal regular expressions, rather than functions that manipulate finite-state recognizers.


implementing the dot operator

Regexen provide a convenient shorthand–.–for an expression matching any one symbol. This is often used in conjunction with quantification, so .? is an expression matching zero or one symbols, .+ is an expression matching one or more symbols, and .* is an expression matching zero or more symbols.

Implementing . is straightforward. All regular languages are associated with some kind of total alphabet representing all of the possible symbols in the language. Regexen have the idea of a total alphabet as well, but it’s usually implied to be whatever the underlying platform supports as characters.

For our code, we need to make it explicit, for example:

const ALPHA =
  'abcdefghijklmnopqrstuvwxyz' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
const DIGITS = '1234567890';
const PUNCTUATION =
  `~!@#$%^&*()_+=-\`-={}|[]\\:";'<>?,./`;
const WHITESPACE = ' \t\r\n';

const TOTAL_ALPHABET = ALPHA + DIGITS + PUNCTUATION + WHITESPACE;

What does the . represent? Any one of the characters in TOTAL_ALPHABET. We can implement that with alternation, like this:

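// escape any symbol that a formal regular expression would otherwise
// treat as an operator, a parenthesis, or the escape symbol itself
const dotExpr =
  TOTAL_ALPHABET.split('').map(
    symbol => "∅ε|→*()`".includes(symbol) ? '`' + symbol : symbol
  ).join('|');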

{
  operators: {

    // ...as above...

    '.': {
      symbol: Symbol('.'),
      type: 'atomic',
      fn: () => dotExpr
    }
  },

  // ...
};

There are, of course, more compact (and faster) ways to implement this if we were writing a regular expression engine from the ground up, but since the computer is doing all the work for us, let’s carry on.


implementing shorthand character classes

In addition to convenient operators like ? and +, regexen also provide shorthand character classes–such as \d, \w, and \s–to make regexen easy to write and read.

In regexen, instead of associating shorthand character classes with their own symbols, the regexen syntax overloads the escape character \ so that it usually means “Match this character as a character, ignoring any special meaning,” but sometimes–as with \d, \w, and with \s–it means “match this shorthand character class.”

Fortunately, we left a back-door in our shunting yard function just for the purpose of overloading the escape character’s behaviour. Here’s the full definition:

const UNDERSCORE ='_';

const digitsExpression =
  DIGITS.split('').join('|');
const wordExpression =
  (ALPHA + DIGITS + UNDERSCORE).split('').join('|');
const whitespaceExpression =
  WHITESPACE.split('').join('|');

const digitsSymbol = Symbol('`d');
const wordSymbol = Symbol('`w');
const whitespaceSymbol = Symbol('`s');

const transpile1to0qs = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: () => '∅'
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: () => 'ε'
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: (a, b) => `${p(a)}|${p(b)}`
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: (a, b) => `${p(a)}→${p(b)}`
    },
    '*': {
      symbol: Symbol('*'),
      type: 'postfix',
      precedence: 30,
      fn: a => `${p(a)}*`
    },
    '?': {
      symbol: Symbol('?'),
      type: 'postfix',
      precedence: 30,
      fn: a => `ε|${p(a)}`
    },
    '+': {
      symbol: Symbol('+'),
      type: 'postfix',
      precedence: 30,
      fn: a => `${p(a)}${p(a)}*`
    },
    '__DIGITS__': {
      symbol: digitsSymbol,
      type: 'atomic',
      fn: () => digitsExpression
    },
    '__WORD__': {
      symbol: wordSymbol,
      type: 'atomic',
      fn: () => wordExpression
    },
    '__WHITESPACE__': {
      symbol: whitespaceSymbol,
      type: 'atomic',
      fn: () => whitespaceExpression
    }
  },
  defaultOperator: '→',
  escapedValue (symbol) {
    if (symbol === 'd') {
      return digitsSymbol;
    } else if (symbol === 'w') {
      return wordSymbol;
    } else if (symbol === 's') {
      return whitespaceSymbol;
    } else {
      return symbol;
    }
  },
  toValue (string) {
    if ('∅ε|→*()'.indexOf(string) >= 0) {
      return '`' + string;
    } else {
      return string;
    }
  }
};

As you can see, the keys __DIGITS__, __WORD__, and __WHITESPACE__ are longer than one symbol, so they can never be typed directly as operators. Instead, we support using back-ticks with d, w, and s, much as regexen use backslashes:

const beforeLevel1qs = '((1( |-))?`d`d`d( |-))?`d`d`d( |-)`d`d`d`d';
const afterLevel1qs = evaluate(beforeLevel1qs, transpile1to0qs);

verifyEvaluate(afterLevel1qs, formalRegularExpressions, {
  '': false,
  '1234': false,
  '123 4567': true,
  '987-6543': true,
  '416-555-1234': true,
  '1 416-555-0123': true,
  '011-888-888-8888!': false
});

Excellent!
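As a cross-check, the same expression can be handed to JavaScript's built-in RegExp by swapping each back-tick for a backslash. This only works here because the expression happens to use nothing but syntax the two notations share (parentheses, |, ?, and the d shorthand), so treat it as a sketch rather than a general converter. The names phoneNumber, asRegExp, and agreements are assumptions made for this illustration:

```javascript
const phoneNumber = '((1( |-))?`d`d`d( |-))?`d`d`d( |-)`d`d`d`d';

// swap our back-tick escape for the regex backslash, and anchor both ends
// so that the whole string must match
const asRegExp = new RegExp(
  '^(?:' + phoneNumber.replace(/`/g, '\\') + ')$'
);

const examples = {
  '': false,
  '1234': false,
  '123 4567': true,
  '987-6543': true,
  '416-555-1234': true,
  '1 416-555-0123': true,
  '011-888-888-8888!': false
};

// does the built-in engine give the expected answer for each example?
const agreements = Object.entries(examples).map(
  ([example, expected]) => asRegExp.test(example) === expected
);

console.log(agreements.every(Boolean));
  //=> true
```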


thoughts about custom character classes

Regexen allow users to define their own character classes “on the fly.” In a regex, [abc] is an expression matching an a, a b, or a c. In that form, it means exactly the same thing as (a|b|c). Custom character classes enable us to write gr[ae]y to match grey and gray, which saves us one character as compared to writing gr(a|e)y.

If that’s all they did, they would add very little value: They’re only slightly more compact, and they add the cognitive load of embedding an irregular kind of syntax inside of regular expressions.

But custom character classes add some other affordances. We can write [a-f] as a shorthand for (a|b|c|d|e|f), or [0-9] as a shorthand for (0|1|2|3|4|5|6|7|8|9). We can combine those affordances, e.g. we can write [0-9a-fA-F] as a shorthand for (0|1|2|3|4|5|6|7|8|9|a|b|c|d|e|f|A|B|C|D|E|F). That is considerably more compact, and arguably communicates the intent of matching a hexadecimal character more cleanly.
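The range shorthand is mechanical enough to sketch. The hypothetical helper below expands the inside of a character class into a parenthesized alternation. It assumes the body contains only single characters and simple from-to ranges; negation, escapes, and a literal - are deliberately out of scope, and the output is not escaped for use with our evaluator:

```javascript
// hypothetical helper: 'a-f' becomes '(a|b|c|d|e|f)'
function expandClass (body) {
  const symbols = [];

  for (let i = 0; i < body.length; ++i) {
    if (body[i + 1] === '-' && i + 2 < body.length) {
      // a range: emit every character from body[i] through body[i + 2]
      const from = body.charCodeAt(i);
      const to = body.charCodeAt(i + 2);

      for (let code = from; code <= to; ++code) {
        symbols.push(String.fromCharCode(code));
      }

      i += 2; // skip the '-' and the character that ends the range
    } else {
      symbols.push(body[i]);
    }
  }

  return `(${symbols.join('|')})`;
}

console.log(expandClass('ae'));
  //=> '(a|e)'
console.log(expandClass('a-f'));
  //=> '(a|b|c|d|e|f)'
console.log(expandClass('0-9a-fA-F'));
  //=> '(0|1|2|3|4|5|6|7|8|9|a|b|c|d|e|f|A|B|C|D|E|F)'
```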

And if we preface our custom character classes with a ^, we can match a character that is not a member of the character class, e.g. [^abc] matches any character except an a, b, or c. That can be enormously useful.

Custom character classes are a language within a language. However, implementing the full syntax would be a grand excursion into parsing the syntax, while the implementation of the character classes would not be particularly interesting. We will, however, be visiting the subject of negating expressions when we discuss Level Two functionality. We will develop an elegant way to achieve expressions like [^abc] with the syntax ^(a|b|c), and we’ll also develop the ¬ prefix operator that will work with any expression.


eschewing transpilation

There are lots of other regexen features we can implement using this transpilation technique,2 but having implemented a feature using transpilation, we’ve demonstrated that it provides no functional advantage over formal regular expressions. Having done so, we can return to implementing the features directly in JavaScript, which saves adding a transpilation step to our evaluator.

So we’ll wrap Level One up with:

const zeroOrOne =
  a => union2merged(emptyString(), a);
const oneOrMore =
  a => catenation2(a, zeroOrMore(dup(a)));
const anySymbol =
  () => TOTAL_ALPHABET.split('').map(literal).reduce(union2merged);
const anyDigit =
  () => DIGITS.split('').map(literal).reduce(union2merged);
const anyWord =
  () => (ALPHA + DIGITS + UNDERSCORE).split('').map(literal).reduce(union2merged);
const anyWhitespace =
  () => WHITESPACE.split('').map(literal).reduce(union2merged);

const levelOneExpressions = {
  operators: {
    // formal regular expressions

    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2merged
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: catenation2
    },
    '*': {
      symbol: Symbol('*'),
      type: 'postfix',
      precedence: 30,
      fn: zeroOrMore
    },

    // extended operators

    '?': {
      symbol: Symbol('?'),
      type: 'postfix',
      precedence: 30,
      fn: zeroOrOne
    },
    '+': {
      symbol: Symbol('+'),
      type: 'postfix',
      precedence: 30,
      fn: oneOrMore
    },
    '.': {
      symbol: Symbol('.'),
      type: 'atomic',
      fn: anySymbol
    },
    '__DIGITS__': {
      symbol: digitsSymbol,
      type: 'atomic',
      fn: anyDigit
    },
    '__WORD__': {
      symbol: wordSymbol,
      type: 'atomic',
      fn: anyWord
    },
    '__WHITESPACE__': {
      symbol: whitespaceSymbol,
      type: 'atomic',
      fn: anyWhitespace
    }
  },
  defaultOperator: '→',
  escapedValue (symbol) {
    if (symbol === 'd') {
      return digitsSymbol;
    } else if (symbol === 'w') {
      return wordSymbol;
    } else if (symbol === 's') {
      return whitespaceSymbol;
    } else {
      return symbol;
    }
  },
  toValue (string) {
    return literal(string);
  }
};

And now it’s time to look at implementing Level Two features.


Implementing Level Two Features

Let’s turn our attention to extending regular expressions with features that cannot be implemented with simple transpilation. We begin by revisiting union2:

function union2 (a, b) {
  const {
    states: aDeclaredStates,
    accepting: aAccepting
  } = validatedAndProcessed(a);
  const aStates = [null].concat(aDeclaredStates);

  const {
    states: bDeclaredStates,
    accepting: bAccepting
  } = validatedAndProcessed(b);
  const bStates = [null].concat(bDeclaredStates);

  // P is a mapping from a pair of states (or any set, but in union2 it's always a pair)
  // to a new state representing the tuple of those states
  const P = new StateAggregator();

  const productAB = product(a, b, P);
  const { start, transitions } = productAB;

  const statesAAccepts =
    aAccepting.flatMap(
      aAcceptingState => bStates.map(bState => P.stateFromSet(aAcceptingState, bState))
    );
  const statesBAccepts =
    bAccepting.flatMap(
      bAcceptingState => aStates.map(aState => P.stateFromSet(aState, bAcceptingState))
    );

  // the union of the two collections of accepting states, without duplicates
  const allAcceptingStates =
    [...new Set([...statesAAccepts, ...statesBAccepts])];

  const { stateSet: reachableStates } = validatedAndProcessed(productAB);
  const accepting = allAcceptingStates.filter(state => reachableStates.has(state));

  return { start, accepting, transitions };
}

function union2merged (a, b) {
  return mergeEquivalentStates(
    union2(a, b)
  );
}

We recall that the above code takes the product of two recognizers, and then computes the accepting states for the product from the union of the accepting states of the two recognizers.

Let’s refactor, and extract the set union:

function productOperation (a, b, setOperator) {
  const {
    states: aDeclaredStates,
    accepting: aAccepting
  } = validatedAndProcessed(a);
  const aStates = [null].concat(aDeclaredStates);

  const {
    states: bDeclaredStates,
    accepting: bAccepting
  } = validatedAndProcessed(b);
  const bStates = [null].concat(bDeclaredStates);

  // P is a mapping from a pair of states (or any set, but in union2 it's always a pair)
  // to a new state representing the tuple of those states
  const P = new StateAggregator();

  const productAB = product(a, b, P);
  const { start, transitions } = productAB;

  // These are Sets so that any setOperator can rely on .has()
  const statesAAccepts = new Set(
    aAccepting.flatMap(
      aAcceptingState => bStates.map(bState => P.stateFromSet(aAcceptingState, bState))
    )
  );
  const statesBAccepts = new Set(
    bAccepting.flatMap(
      bAcceptingState => aStates.map(aState => P.stateFromSet(aState, bAcceptingState))
    )
  );

  const allAcceptingStates =
    [...setOperator(statesAAccepts, statesBAccepts)];

  const { stateSet: reachableStates } = validatedAndProcessed(productAB);
  const accepting = allAcceptingStates.filter(state => reachableStates.has(state));

  return { start, accepting, transitions };
}

function setUnion (set1, set2) {
  return new Set([...set1, ...set2]);
}

function union (a, b) {
  return mergeEquivalentStates(
    productOperation(a, b, setUnion)
  );
}

We’ll create a new set union operator for this:

const levelTwoExpressions = {
  operators: {

    // ... other operators from formal regular expressions ...

    '∪': {
      symbol: Symbol('∪'),
      type: 'infix',
      precedence: 10,
      fn: union
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

verifyEvaluate('(a|b|c)|(b|c|d)', levelTwoExpressions, {
  '': false,
  'a': true,
  'b': true,
  'c': true,
  'd': true
});
  //=> All 5 tests passing

verifyEvaluate('(a|b|c)∪(b|c|d)', levelTwoExpressions, {
  '': false,
  'a': true,
  'b': true,
  'c': true,
  'd': true
});
  //=> All 5 tests passing

It does exactly what our original union2merged function does, as we expect. But now that we’ve extracted the set union operation, what if we substitute a different set operation?


intersection

function setIntersection (set1, set2) {
  return new Set(
    [...set1].filter(
      element => set2.has(element)
    )
  );
}

function intersection (a, b) {
  return mergeEquivalentStates(
    productOperation(a, b, setIntersection)
  );
}

const levelTwoExpressions = {
  operators: {

    // ... other operators from formal regular expressions ...

    '∪': {
      symbol: Symbol('∪'),
      type: 'infix',
      precedence: 10,
      fn: union
    },
    '∩': {
      symbol: Symbol('∩'),
      type: 'infix',
      precedence: 10,
      fn: intersection
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

verifyEvaluate('(a|b|c)∩(b|c|d)', levelTwoExpressions, {
  '': false,
  'a': false,
  'b': true,
  'c': true,
  'd': false
});

This is something new:

  • If a is a regular expression describing the language A, and b is a regular expression describing the language B, the expression a∩b describes the language Z where a sentence z belongs to Z if and only if z belongs to A, and z belongs to B.

Intersection can be useful for writing expressions that separate concerns. For example, if we already have 0|1(0|1)* as the expression for the language containing all binary numbers, and .(..)* as the expression for the language containing an odd number of symbols, then (0|1(0|1)*)∩(.(..)*) gives the language containing all binary numbers with an odd number of digits.
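To see why swapping the set operator is enough, here is a minimal, self-contained sketch of a product construction. It does not use the essay’s product, StateAggregator, or validatedAndProcessed helpers: the functions step, product, and accepts are stand-ins written for this illustration, and the oddLength recognizer is made up for the example. Instead of set operations on accepting states, it applies a boolean combine function to each pair of states, which amounts to the same thing for union, intersection, and difference.

```javascript
// A recognizer is { start, accepting, transitions }, as in the essay.
// A missing transition means "fail", modelled here by the state null.
function step (recognizer, state, symbol) {
  if (state === null) return null;
  const transition = recognizer.transitions.find(
    ({ from, consume }) => from === state && consume === symbol
  );
  return transition ? transition.to : null;
}

// Build a recognizer whose states are pairs of states of a and b.
// combine(aAccepts, bAccepts) decides which pairs are accepting.
// Assumes combine(false, false) is false, so the pair (null, null)
// never needs to be an accepting state.
function product (a, b, combine) {
  const alphabet = [...new Set(
    [...a.transitions, ...b.transitions].map(({ consume }) => consume)
  )];
  const name = (p, q) => JSON.stringify([p, q]);

  const start = name(a.start, b.start);
  const seen = new Map([[start, [a.start, b.start]]]);
  const queue = [[a.start, b.start]];
  const transitions = [];

  while (queue.length > 0) {
    const [p, q] = queue.shift();
    for (const symbol of alphabet) {
      const p2 = step(a, p, symbol);
      const q2 = step(b, q, symbol);
      if (p2 === null && q2 === null) continue; // both have failed

      const to = name(p2, q2);
      transitions.push({ from: name(p, q), consume: symbol, to });
      if (!seen.has(to)) {
        seen.set(to, [p2, q2]);
        queue.push([p2, q2]);
      }
    }
  }

  const accepting = [...seen]
    .filter(([, [p, q]]) => combine(a.accepting.includes(p), b.accepting.includes(q)))
    .map(([stateName]) => stateName);

  return { start, accepting, transitions };
}

function accepts (recognizer, input) {
  let state = recognizer.start;
  for (const symbol of input) {
    state = step(recognizer, state, symbol);
  }
  return recognizer.accepting.includes(state);
}

const binary = {
  start: 'start',
  accepting: ['zero', 'notZero'],
  transitions: [
    { from: 'start', consume: '0', to: 'zero' },
    { from: 'start', consume: '1', to: 'notZero' },
    { from: 'notZero', consume: '0', to: 'notZero' },
    { from: 'notZero', consume: '1', to: 'notZero' }
  ]
};

// Hypothetical recognizer for strings of 0s and 1s with odd length.
const oddLength = {
  start: 'even',
  accepting: ['odd'],
  transitions: [
    { from: 'even', consume: '0', to: 'odd' },
    { from: 'even', consume: '1', to: 'odd' },
    { from: 'odd', consume: '0', to: 'even' },
    { from: 'odd', consume: '1', to: 'even' }
  ]
};

const both = product(binary, oddLength, (x, y) => x && y);       // like ∩
const either = product(binary, oddLength, (x, y) => x || y);     // like ∪
const onlyFirst = product(binary, oddLength, (x, y) => x && !y); // like \

console.log(accepts(both, '101'));     // a binary number with three digits
console.log(accepts(both, '10'));      // a binary number, but with two digits
console.log(accepts(onlyFirst, '10')); // a binary number that is not odd-length
```

The only thing that changes between the three recognizers is the combine function; the states and transitions of the product are identical.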


difference

Here’s another:

function setDifference (set1, set2) {
  return new Set(
    [...set1].filter(
      element => !set2.has(element)
    )
  );
}

function difference (a, b) {
  return mergeEquivalentStates(
    productOperation(a, b, setDifference)
  );
}

const levelTwoExpressions = {
  operators: {

    // ... other operators from formal regular expressions ...

    '∪': {
      symbol: Symbol('∪'),
      type: 'infix',
      precedence: 10,
      fn: union
    },
    '∩': {
      symbol: Symbol('∩'),
      type: 'infix',
      precedence: 10,
      fn: intersection
    },
    '\\': {
      symbol: Symbol('-'),
      type: 'infix',
      precedence: 10,
      fn: difference
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

verifyEvaluate('(a|b|c)\\(b|c|d)', levelTwoExpressions, {
  '': false,
  'a': true,
  'b': false,
  'c': false,
  'd': false
});

\ is the set difference, or relative complement operator:3

  • If a is a regular expression describing the language A, and b is a regular expression describing the language B, the expression a\b describes the language Z where a sentence z belongs to Z if and only if z belongs to A, and z does not belong to B.

Where intersection was useful for separating concerns, difference is very useful for excluding sentences that belong to a particular language. For example, we may want to match all sentences that contain the word “Braithwaite”, but not “Reggie Braithwaite”:

verifyEvaluate('.*Braithwaite.*\\.*Reggie Braithwaite.*', levelTwoExpressions, {
  'Braithwaite': true,
  'Reg Braithwaite': true,
  'The Reg Braithwaite!': true,
  'The Notorious Reggie Braithwaite': false,
  'Reggie, but not Braithwaite?': true,
  'Is Reggie a Braithwaite?': true
});

verifyEvaluate('(.*\\.*Reggie )(Braithwaite.*)', levelTwoExpressions, {
  'Braithwaite': true,
  'Reg Braithwaite': true,
  'The Reg Braithwaite!': true,
  'The Notorious Reggie Braithwaite': false,
  'Reggie, but not Braithwaite?': true,
  'Is Reggie a Braithwaite?': true
});

The second test above includes an interesting pattern.


complement

If s is an expression, then .*\s is the complement of the expression s. In set theory, the complement of a set S is everything that does not belong to S. If we presume the existence of a universal set U, where u belongs to U for any u, then the complement of a set S is the difference between U and S.

In sentences of symbols, if we have a total alphabet that we use to derive the dot operator ., then .* is an expression for every possible sentence, and .*\s is the difference between every possible sentence and the sentences in the language S. And that is the complement of S.
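The essay builds complement out of difference. A different and commonly taught route to the same language, offered here only as a hedged sketch and not as the essay’s method, is to make the recognizer “complete” by sending every missing transition to an explicit dead state, and then swap accepting and non-accepting states. The names complementOf, accepts, and the '__dead__' state are assumptions made for this illustration, and the complement is taken relative to whatever alphabet is passed in.

```javascript
// Route every missing transition to an explicit dead state, then swap
// accepting and non-accepting states.
function complementOf ({ start, accepting, transitions }, alphabet) {
  const dead = '__dead__'; // assumed not to clash with a real state name

  const states = new Set([start, dead]);
  for (const { from, to } of transitions) {
    states.add(from);
    states.add(to);
  }

  const completed = [...transitions];
  for (const state of states) {
    for (const symbol of alphabet) {
      const defined = transitions.some(
        ({ from, consume }) => from === state && consume === symbol
      );
      if (!defined) {
        completed.push({ from: state, consume: symbol, to: dead });
      }
    }
  }

  return {
    start,
    accepting: [...states].filter(state => !accepting.includes(state)),
    transitions: completed
  };
}

// Run a recognizer over a string; a missing transition means "fail".
function accepts ({ start, accepting, transitions }, input) {
  let state = start;
  for (const symbol of input) {
    const transition = transitions.find(
      ({ from, consume }) => from === state && consume === symbol
    );
    if (transition === undefined) return false;
    state = transition.to;
  }
  return accepting.includes(state);
}

const binary = {
  start: 'start',
  accepting: ['zero', 'notZero'],
  transitions: [
    { from: 'start', consume: '0', to: 'zero' },
    { from: 'start', consume: '1', to: 'notZero' },
    { from: 'notZero', consume: '0', to: 'notZero' },
    { from: 'notZero', consume: '1', to: 'notZero' }
  ]
};

const notBinary = complementOf(binary, ['0', '1']);

console.log(accepts(notBinary, ''));   // the empty string is not a binary number
console.log(accepts(notBinary, '01')); // neither is a number with a leading zero
console.log(accepts(notBinary, '10')); // but this one is
```

Flipping accepting states only works once the dead state is explicit: without it, a sentence that makes the original recognizer fail partway through would also make the flipped one fail.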

We can implement complement as a prefix operator:

const complement =
  s => difference(zeroOrMore(anySymbol()), s);

const levelTwoExpressions = {
  operators: {

    // ... other operators  ...

    '¬': {
      symbol: Symbol('¬'),
      type: 'prefix',
      precedence: 40,
      fn: complement
    }

  }

  // ... other definition ...

};

verifyEvaluate('¬(.*Reggie )Braithwaite.*', levelTwoExpressions, {
  'Braithwaite': true,
  'Reg Braithwaite': true,
  'The Reg Braithwaite!': true,
  'The Notorious Reggie Braithwaite': false,
  'Reggie, but not Braithwaite?': true,
  'Is Reggie a Braithwaite?': true
});
  //=> All 6 tests passing

complement can surprise the unwary. The expression ¬(.*Reggie )Braithwaite.* matches strings containing Braithwaite but not Reggie Braithwaite. But if we expect .*¬(Reggie )Braithwaite.* to do the same thing, we’ll be unpleasantly surprised:

verifyEvaluate('.*¬(Reggie )Braithwaite.*', levelTwoExpressions, {
  'Braithwaite': true,
  'Reg Braithwaite': true,
  'The Reg Braithwaite!': true,
  'The Notorious Reggie Braithwaite': false,
  'Reggie, but not Braithwaite?': true,
  'Is Reggie a Braithwaite?': true
});
  //=> 1 tests failing: fail: {"example":"The Notorious Reggie Braithwaite","expected":false,"actual":true}

This failed because the three “clauses” of our level two regular expression matched something like the following:

  1. .* matched The Notorious Reggie ;
  2. ¬(Reggie ) matched '' (also known as ε);
  3. Braithwaite.* matched Braithwaite.

That’s why we need to write our clause as ¬(.*Reggie ) if we are trying to exclude the symbols Reggie appearing just before Braithwaite. For similar reasons, the expression ¬(a|b|c) is not equivalent to the [^abc] character class from regex syntax. Not only will the empty string match that expression, but so will strings with more than one symbol.

If we want to emulate [^abc], we want the intersection of ., which matches exactly one symbol, and ¬(a|b|c), which matches any sentence except a or b or c.
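One way to make this concrete is to treat languages as plain sets of strings. The sketch below is an illustration only: it truncates .* to strings of at most two symbols over a made-up four-symbol alphabet, and the names universe, notAbc, anyOneSymbol, and characterComplement are invented for the example.

```javascript
// Every string over `alphabet` of length 0..maxLength: a finite,
// truncated stand-in for the language described by .*
function universe (alphabet, maxLength) {
  const result = [''];
  let frontier = [''];
  for (let length = 1; length <= maxLength; ++length) {
    frontier = frontier.flatMap(
      prefix => alphabet.map(symbol => prefix + symbol)
    );
    result.push(...frontier);
  }
  return result;
}

const U = universe(['a', 'b', 'c', 'd'], 2);
const abc = new Set(['a', 'b', 'c']);

// ¬(a|b|c): everything in the universe except a, b, and c
const notAbc = U.filter(sentence => !abc.has(sentence));

// . : exactly one symbol
const anyOneSymbol = U.filter(sentence => sentence.length === 1);

// .∩¬(a|b|c): one symbol, and not a, b, or c
const characterComplement =
  anyOneSymbol.filter(sentence => notAbc.includes(sentence));

console.log(notAbc.includes(''));   // the empty string is in ¬(a|b|c)
console.log(notAbc.includes('ab')); // so are longer strings
console.log(characterComplement);   // only the single symbol d survives
```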

We can represent [^abc] with .∩¬(a|b|c):

verifyEvaluate('.∩¬(a|b|c)', levelTwoExpressions, {
  '': false,
  'a': false,
  'b': false,
  'c': false,
  'd': true,
  'e': true,
  'f': true,
  'ab': false,
  'abc': false
});
  //=> All 9 tests passing

That’s handy, so let’s make it an operator:

const characterComplement =
  s => intersection(anySymbol(), complement(s));

const levelTwoExpressions = {
  operators: {

    // ... other operators  ...

    '^': {
      symbol: Symbol('^'),
      type: 'prefix',
      precedence: 50,
      fn: characterComplement
    }

  }

  // ... other definition ...

};

verifyEvaluate('^(a|b|c)', levelTwoExpressions, {
  '': false,
  'a': false,
  'b': false,
  'c': false,
  'd': true,
  'e': true,
  'f': true,
  'ab': false,
  'abc': false
});
  //=> All 9 tests passing

The syntax ^(a|b|c) is close enough to [^abc] for our purposes.


What Level Two Features Tell Us, and What They Don’t

The Level Two features we’ve implemented are useful, and they demonstrate some important results:

We already know that:

  • if x is a finite-state recognizer that recognizes sentences in the language X, and y is a finite-state recognizer that recognizes sentences in the language Y, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence a belongs to Z if and only if a belongs to either X or Y. We demonstrated this by writing functions like union2 that take x and y as arguments and return z.
  • if x is a finite-state recognizer that recognizes sentences in the language X, and y is a finite-state recognizer that recognizes sentences in the language Y, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence ab belongs to Z if and only if a belongs to X and b belongs to Y. We demonstrated this by writing the function catenation2 that takes x and y as arguments and returns z.
  • if x is a finite-state recognizer that recognizes sentences in the language X, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence ab belongs to Z if and only if a is either the empty string or a sentence belonging to X, and b is a sentence belonging to Z. We demonstrated this by writing the function zeroOrMore that takes x as an argument and returns z.

These three results tell us that the set of finite-state recognizers is closed under alternation, catenation, and quantification.

Implementing our Level Two features has also demonstrated that:

  • if x is a finite-state recognizer that recognizes sentences in the language X, and y is a finite-state recognizer that recognizes sentences in the language Y, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence a belongs to Z if and only if a belongs to both X and Y. We demonstrated this by writing the function intersection that takes x and y as arguments and returns z.
  • if x is a finite-state recognizer that recognizes sentences in the language X, and y is a finite-state recognizer that recognizes sentences in the language Y, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence a belongs to Z if and only if a belongs to X and a does not belong to Y. We demonstrated this by writing the function difference that takes x and y as arguments and returns z.
  • if x is a finite-state recognizer that recognizes sentences in the language X, there exists a finite-state recognizer z that recognizes sentences in the language Z, where a sentence a belongs to Z if and only if a does not belong to X. We demonstrated this by writing the function complement that takes x as an argument and returns z.

These three results also tell us that the set of finite-state recognizers is closed under intersection, difference, and complementation.

Writing Level Two features does come with a known limitation. Obviously, we can translate any Level Two regular expression into a finite-state recognizer. This tells us that the set of languages defined by Level Two regular expressions is a subset of the set of languages recognized by finite-state recognizers.

But what we don’t know is whether the set of languages defined by Level Two regular expressions is equivalent to the set of languages defined by formal regular expressions. We don’t have an algorithm for translating Level Two regular expressions to Level Zero regular expressions. Given what we have explored so far, it is possible that the set of languages recognized by finite-state recognizers is larger than the set of languages defined by formal regular expressions (“Level Zero”).

If that were the case, it could be that some Level Two regular expression compiles to a finite-state recognizer, but there is no Level Zero expression that compiles to an equivalent finite-state recognizer.

How would we know if this were true?

With Level One expressions, we showed that for every Level One expression, there is an equivalent Level Zero expression by writing a Level One to Level Zero transpiler. With Level Two expressions, we’ll take a different tack: We’ll show that for every finite-state recognizer, there is an equivalent Level Zero expression.

If we know that for every finite-state recognizer, there is an equivalent Level Zero expression, and we also know that for every Level Zero expression, there is an equivalent finite-state recognizer, then we know that the set of languages recognized by finite-state recognizers is equal to the set of languages recognized by Level Zero expressions, a/k/a Regular Languages.

And if we know that for every Level Two expression, there is an equivalent finite-state recognizer, then it would follow that for every Level Two expression, there is an equivalent Level Zero expression, and it would follow that the set of all languages described by Level Two expressions is the set of regular languages.


For Every Finite-State Recognizer, There Exists An Equivalent Formal Regular Expression

It is time to demonstrate that for every finite-state recognizer, there exists an equivalent formal regular expression. We’re going to follow Stephen Kleene’s marvellous proof, very much leaning on Shunichi Toida’s excellent notes for CS390 Introduction to Theoretical Computer Science Structures. The proof of this aspect of Kleene’s Theorem can be found here.

Our constructive proof-like approach will be to write a function that takes as its argument a description of a finite-state recognizer, and returns an equivalent formal regular expression in our syntax. The approach will be an old one in computer science:

For any pair of states (“any pair” includes the case where both are the same state) in a finite-state recognizer, we will find all the paths from one to the other, and for each path, we can write a regular expression representing that path using catenation. When there is more than one path between them, we can combine the expressions using alternation. We’ll explain how quantification comes into that in a moment.

If we had such a function, we could apply it to the start state and each accepting state, getting a formal regular expression for the paths from the start state to each accepting state. And if there is more than one accepting state, we could use alternation to combine the regular expressions into one big regular expression that is equivalent to the finite-state recognizer.


the regularExpression function

Let’s get started writing this in JavaScript. Given a description like:

const binary = {
  "start": "start",
  "transitions": [
    { "from": "start", "consume": "0", "to": "zero" },
    { "from": "start", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": ["zero", "notZero"]
}

It will be a matter of finding the regular expressions for the paths from start to zero, and from start to notZero, and taking the union of those paths. We’re going to do that with a function we’ll call between. Our function will take an argument for the state from, another for the state to, and a third argument called viaStates that we’ll explain in a moment.4

Note that from, to, and via can be any of the states in the recognizer, including being the same state.

Here’s an empty function for what we want to begin with:

function regularExpression (description) {
  const pruned =
    reachableFromStart(
      mergeEquivalentStates(
        description
      )
    );
  const {
    start,
    transitions,
    accepting,
    stateSet
  } = validatedAndProcessed(pruned);

  // ...TBD

  function between ({ from, to, viaStates }) {
    // ... TBD
  }
};

Let’s get the most degenerate case out of the way first. If a finite-state recognizer has no accepting states, then its formal regular expression is the empty set:

function regularExpression (description) {
  const pruned =
    reachableFromStart(
      mergeEquivalentStates(
        description
      )
    );
  const {
    start,
    transitions,
    accepting,
    acceptingSet,
    stateSet
  } = validatedAndProcessed(pruned);

  if (accepting.length === 0) {
    return '∅';
  } else {
    // ...TBD

    function between ({ from, to, viaStates }) {
      // ... TBD
    }
  }
};

// ----------

verify(regularExpression, new Map([
  [emptySet(), '∅']
]));

Now what if there are accepting states? As described, the final regular expression must represent the union of all the expressions for getting from the start state to each accepting state. Let’s fill that in for a moment, deliberately omitting viaStates:

function alternateExpr(...exprs) {
  const uniques = [...new Set(exprs)];
  const notEmptySets = uniques.filter( x => x !== '∅' );

  if (notEmptySets.length === 0) {
    return '∅';
  } else if (notEmptySets.length === 1) {
    return notEmptySets[0];
  } else {
    return notEmptySets.map(p).join('|');
  }
}

function regularExpression (description) {
  const pruned =
    reachableFromStart(
      mergeEquivalentStates(
        description
      )
    );
  const {
    start,
    transitions,
    accepting,
    acceptingSet,
    stateSet
  } = validatedAndProcessed(pruned);

  if (accepting.length === 0) {
    return '∅';
  } else {
    const from = start;
    const pathExpressions =
      accepting.map(
        to => between({ from, to })
      );

    const acceptsEmptyString = accepting.indexOf(start) >= 0;

    if (acceptsEmptyString) {
      return alternateExpr('ε', ...pathExpressions);
    } else {
      return alternateExpr(...pathExpressions);
    }

    function between ({ from, to, viaStates }) {
      // ... TBD
    }
  }
};

There’s another special case thrown in: Although we haven’t written our between function yet, we know that if a finite-state recognizer begins in an accepting state, then it accepts the empty string, and thus we can take all the other expressions for getting from the start state to an accepting state, and union them with ε.

Now how about the between function?


the between function

The between function returns a formal regular expression representing all of the possible ways a finite-state recognizer can consume strings to get from the from state to the to state.

The way it works is to divide-and-conquer. We begin by choosing any state as the via state. We can divide up all the paths as follows:

  1. All the paths from from to to that go through some state we shall call via at least once, and;
  2. All the paths from from to to that do not go through via at all.

If we could compute formal regular expressions for each of these two sets of paths, we could return the union of the two regular expressions and be done. So let’s begin by picking a via state. Kleene numbered the states and began with the largest; we will simply take whichever state comes first when we enumerate viaStates:

function between ({ from, to, viaStates = [...allStates] }) {
  if (viaStates.length === 0) {
    // .. TBD
  } else {
    const [via] = viaStates;

    // ... TBD
  }
}

We have left room for the degenerate case where viaStates is empty. We’ll get to that in a moment. The first part of our case is to write an expression for all the paths from from to to that go through via at least once. Here’s the formulation for that:

  1. The expression representing all the paths from from to via that do not go through via, catenated with;
  2. The expression representing all the paths from via looping back to via that do not go through via, repeated any number of times, catenated with;
  3. The expression representing all the paths from via to to that do not go through via.

Our normal case is going to look something like this:

function zeroOrMoreExpr (a) {
  if (a === '∅' || a === 'ε') {
    return 'ε';
  } else {
    return `${p(a)}*`;
  }
}

function catenateExpr (...exprs) {
  if (exprs.some( x => x === '∅' )) {
    return '∅';
  } else {
    const notEmptyStrings = exprs.filter( x => x !== 'ε' );

    if (notEmptyStrings.length === 0) {
      return 'ε';
    } else if (notEmptyStrings.length === 1) {
      return notEmptyStrings[0];
    } else {
      return notEmptyStrings.map(p).join('');
    }
  }
}

function between ({ from, to, viaStates = [...allStates] }) {
  if (viaStates.length === 0) {
    // .. TBD
  } else {
    const [via] = viaStates;

    const fromToVia = between({ from, to: via });
    const viaToVia = zeroOrMoreExpr(
      between({ from: via, to: via })
    );
    const viaToTo = between({ from: via, to });

    const throughVia = catenateExpr(fromToVia, viaToVia, viaToTo);
  }
}

That being said, we have left out what to pass for viaStates. Well, we want our routine to do the computation for paths not passing through the state via, so what we really want is all the remaining states except via:

function between ({ from, to, viaStates = [...allStates] }) {
  if (viaStates.length === 0) {
    // .. TBD
  } else {
    const [via, ...exceptVia] = viaStates;

    const fromToVia = between({ from, to: via, viaStates: exceptVia });
    const viaToVia = zeroOrMoreExpr(
      between({ from: via, to: via, viaStates: exceptVia })
    );
    const viaToTo = between({ from: via, to, viaStates: exceptVia });

    const throughVia = catenateExpr(fromToVia, viaToVia, viaToTo);
  }
}

Now how about the second part of our case? It’s the expression for all the paths from from to to that do not go through via. Which we then alternate with the expression for all the paths going through via:

function between ({ from, to, viaStates = [...allStates] }) {
  if (viaStates.length === 0) {
    // .. TBD
  } else {
    const [via, ...exceptVia] = viaStates;

    const fromToVia = between({ from, to: via, viaStates: exceptVia });
    const viaToVia = zeroOrMoreExpr(
      between({ from: via, to: via, viaStates: exceptVia })
    );
    const viaToTo = between({ from: via, to, viaStates: exceptVia });

    const throughVia = catenateExpr(fromToVia, viaToVia, viaToTo);
    const notThroughVia = between({ from, to, viaStates: exceptVia });

    return alternateExpr(throughVia, notThroughVia);
  }
}

Eventually,5 this function will end up calling itself and passing an empty list of states. That’s our degenerate case. Given two states, what are all the paths between them that don’t go through any other states? Why, just the transitions directly between them. And the expressions for those are the symbols consumed, plus some allowance for symbols we have to escape.

function between ({ from, to, viaStates = [...allStates] }) {
  if (viaStates.length === 0) {
    const directExpressions =
      transitions
      .filter( ({ from: tFrom, to: tTo }) => from === tFrom && to === tTo )
      .map( ({ consume }) => toValueExpr(consume) );

    return alternateExpr(...directExpressions);
  } else {
    const [via, ...exceptVia] = viaStates;

    const fromToVia = between({ from, to: via, viaStates: exceptVia });
    const viaToVia = zeroOrMoreExpr(
      between({ from: via, to: via, viaStates: exceptVia })
    );
    const viaToTo = between({ from: via, to, viaStates: exceptVia });

    const throughVia = catenateExpr(fromToVia, viaToVia, viaToTo);
    const notThroughVia = between({ from, to, viaStates: exceptVia });

    return alternateExpr(throughVia, notThroughVia);
  }
}

const a = evaluate('a', formalRegularExpressions);

regularExpression(a)
  //=> ((((∅|a)∅∅)|∅)(((∅|a)∅∅)|∅)(((∅|a)∅∅)|(∅|a)))|(((∅|a)∅∅)|(∅|a))

This is a valid regular expression, but all the ∅s make it unreadable. We’re not going to get into functions for finding the minimal expression for a finite-state recognizer, but we can make things less ridiculous with five easy optimizations:

  1. Catenating any expression a with the empty set returns the empty set.
  2. Alternating any expression a with the empty set returns the expression a.
  3. Repeating the empty set zero or more times returns the empty string.
  4. Catenating any expression a with the empty string returns the expression a.
  5. Repeating the empty string zero or more times returns the empty string.

function alternateExpr(...exprs) {
  const uniques = [...new Set(exprs)];
  const notEmptySets = uniques.filter( x => x !== '∅' );

  if (notEmptySets.length === 0) {
    return '∅';
  } else if (notEmptySets.length === 1) {
    return notEmptySets[0];
  } else {
    return notEmptySets.map(p).join('|');
  }
}

function catenateExpr (...exprs) {
  if (exprs.some( x => x === '∅' )) {
    return '∅';
  } else {
    const notEmptyStrings = exprs.filter( x => x !== 'ε' );

    if (notEmptyStrings.length === 0) {
      return 'ε';
    } else if (notEmptyStrings.length === 1) {
      return notEmptyStrings[0];
    } else {
      return notEmptyStrings.map(p).join('');
    }
  }
}

function zeroOrMoreExpr (a) {
  if (a === '∅' || a === 'ε') {
    return 'ε';
  } else {
    return `${p(a)}*`;
  }
}

function regularExpression (description) {
  const pruned =
    reachableFromStart(
      mergeEquivalentStates(
        description
      )
    );
  const {
    start,
    transitions,
    accepting,
    allStates
  } = validatedAndProcessed(pruned);

  if (accepting.length === 0) {
    return '∅';
  } else {
    const from = start;
    const pathExpressions =
      accepting.map(
        to => between({ from, to })
      );

    const acceptsEmptyString = accepting.indexOf(start) >= 0;

    if (acceptsEmptyString) {
      return alternateExpr('ε', ...pathExpressions);
    } else {
      return alternateExpr(...pathExpressions);
    }

    function between ({ from, to, viaStates = [...allStates] }) {
      if (viaStates.length === 0) {
        const directExpressions =
          transitions
          .filter( ({ from: tFrom, to: tTo }) => from === tFrom && to === tTo )
          .map( ({ consume }) => toValueExpr(consume) );

        return alternateExpr(...directExpressions);
      } else {
        const [via, ...exceptVia] = viaStates;

        const fromToVia = between({ from, to: via, viaStates: exceptVia });
        const viaToVia = zeroOrMoreExpr(
          between({ from: via, to: via, viaStates: exceptVia })
        );
        const viaToTo = between({ from: via, to, viaStates: exceptVia });

        const throughVia = catenateExpr(fromToVia, viaToVia, viaToTo);
        const notThroughVia = between({ from, to, viaStates: exceptVia });

        return alternateExpr(throughVia, notThroughVia);
      }
    }
  }
};

Done! Now let’s look at what it does:


using the regularExpression function

First, let’s take an arbitrary finite-state recognizer, and convert it to a formal regular expression:

regularExpression(binary)
  //=> 0|((1((0|1)*)(0|1))|1)

The result, 0|((1((0|1)*)(0|1))|1), isn’t the most compact or readable regular expression, but if we look at it carefully, we can see that it produces the same result: It matches a zero, or a one, or a one followed by either a zero or one, followed by either a zero or one zero or more times. Basically, it’s equivalent to 0|1|1(0|1)(0|1)*. And 1|1(0|1)(0|1)* is equivalent to 1(0|1)*, so 0|((1((0|1)*)(0|1))|1) is equivalent to 0|1(0|1)*.
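If reasoning through the equivalence by hand feels shaky, a brute-force comparison can add some confidence. The sketch below assumes that JavaScript’s built-in RegExp is an acceptable stand-in for our evaluator, which is reasonable here because both expressions use only |, *, and parentheses. The helpers allStrings and matcher are invented for this example, and agreement on short strings is evidence, not proof.

```javascript
// Every string over `alphabet` of length 0..maxLength.
function allStrings (alphabet, maxLength) {
  const result = [''];
  let frontier = [''];
  for (let length = 1; length <= maxLength; ++length) {
    frontier = frontier.flatMap(
      prefix => alphabet.map(symbol => prefix + symbol)
    );
    result.push(...frontier);
  }
  return result;
}

// Anchor the expression so that it must match the entire string.
function matcher (expression) {
  return new RegExp(`^(?:${expression})$`);
}

const generated = matcher('0|((1((0|1)*)(0|1))|1)');
const handwritten = matcher('0|1(0|1)*');

// Collect every short string on which the two expressions disagree.
const disagreements =
  allStrings(['0', '1'], 10).filter(
    sentence => generated.test(sentence) !== handwritten.test(sentence)
  );

console.log(disagreements.length);
  //=> 0
```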

Let’s check it:

verifyRecognizer(binary, {
  '': false,
  '0': true,
  '1': true,
  '00': false,
  '01': false,
  '10': true,
  '11': true,
  '000': false,
  '001': false,
  '010': false,
  '011': false,
  '100': true,
  '101': true,
  '110': true,
  '111': true,
  '10100011011000001010011100101110111': true
});
  //=> All 16 tests passing

const reconstitutedBinaryExpr = regularExpression(binary);
  //=> 0|((1((0|1)*)(0|1))|1)

verifyEvaluate(reconstitutedBinaryExpr, formalRegularExpressions, {
  '': false,
  '0': true,
  '1': true,
  '00': false,
  '01': false,
  '10': true,
  '11': true,
  '000': false,
  '001': false,
  '010': false,
  '011': false,
  '100': true,
  '101': true,
  '110': true,
  '111': true,
  '10100011011000001010011100101110111': true
});
  //=> All 16 tests passing

0|((1((0|1)*)(0|1))|1) may be a fugly way to describe binary numbers, but it is equivalent to 0|1(0|1)*, and what counts is that for any finite-state recognizer, our function finds an equivalent formal regular expression. And if we know that for every finite-state recognizer, there is an equivalent formal regular expression, then we now have a universal demonstration that our Level One and Level Two features describe regular languages just like formal regular expressions. This is true even if–like our Level Two features–there is no obvious and direct translation to a formal regular expression.

However, testing binary doesn’t actually demonstrate that the finite-state recognizer produced by compiling a Level Two expression can be compiled back to an equivalent formal regular expression. We already know that the binary numbers are a regular language. So let’s try our function with some Level Two examples.


a test suite for the regularExpression function

We can check a few more results to give us confidence. But instead of reasoning through each one, we’ll check the equivalence using test cases. What we’ll do is take a regular expression and run it through test cases. Then we’ll evaluate it to produce a finite-state recognizer, translate the finite-state recognizer to a formal regular expression with regularExpression, and run it through the same test cases again.

If all the tests pass, we’ll declare that our regularExpression function does indeed demonstrate that there is an equivalent formal regular expression for every finite-state recognizer. Here’s our test function, and an example of trying it with 0|1(0|1)*:

function verifyRegularExpression (expression, tests) {
  const recognizer = evaluate(expression, levelTwoExpressions);

  verifyRecognizer(recognizer, tests);

  const formalExpression = regularExpression(recognizer);

  verifyEvaluate(formalExpression, formalRegularExpressions, tests);
}

verifyRegularExpression('0|1(0|1)*', {
  '': false,
  '0': true,
  '1': true,
  '00': false,
  '01': false,
  '10': true,
  '11': true,
  '000': false,
  '001': false,
  '010': false,
  '011': false,
  '100': true,
  '101': true,
  '110': true,
  '111': true,
  '10100011011000001010011100101110111': true
});

And now to try it with some Level Two examples:


verifyRegularExpression('(a|b|c)∪(b|c|d)', {
  '': false,
  'a': true,
  'b': true,
  'c': true,
  'd': true
});

verifyRegularExpression('(ab|bc|cd)∪(bc|cd|de)', {
  '': false,
  'ab': true,
  'bc': true,
  'cd': true,
  'de': true
});

verifyRegularExpression('(a|b|c)∩(b|c|d)', {
  '': false,
  'a': false,
  'b': true,
  'c': true,
  'd': false
});

verifyRegularExpression('(ab|bc|cd)∩(bc|cd|de)', {
  '': false,
  'ab': false,
  'bc': true,
  'cd': true,
  'de': false
});

verifyRegularExpression('(a|b|c)\\(b|c|d)', {
  '': false,
  'a': true,
  'b': false,
  'c': false,
  'd': false
});

verifyRegularExpression('(ab|bc|cd)\\(bc|cd|de)', {
  '': false,
  'ab': true,
  'bc': false,
  'cd': false,
  'de': false
});

Success! There is an equivalent formal regular expression for the finite-state recognizers we generate with our Level Two features.
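The expected values in those tests follow directly from what ∪, ∩, and \ mean for sets of sentences. As a small illustration, here is a sketch that works them out for finite languages represented as JavaScript Sets; the helper names union, intersection, and difference are hypothetical and are not the recognizer-level functions used elsewhere in this essay:

```javascript
// Hypothetical helpers over finite languages represented as Sets of strings.
const union = (x, y) => new Set([...x, ...y]);
const intersection = (x, y) => new Set([...x].filter(s => y.has(s)));
const difference = (x, y) => new Set([...x].filter(s => !y.has(s)));

const X = new Set(['a', 'b', 'c']);   // the language described by a|b|c
const Y = new Set(['b', 'c', 'd']);   // the language described by b|c|d

console.log([...union(X, Y)]);
  //=> [ 'a', 'b', 'c', 'd' ]
console.log([...intersection(X, Y)]);
  //=> [ 'b', 'c' ]
console.log([...difference(X, Y)]);
  //=> [ 'a' ]
```

These are exactly the sentences marked true in the (a|b|c)∪(b|c|d), (a|b|c)∩(b|c|d), and (a|b|c)\(b|c|d) tests.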


conclusion

We have now demonstrated, in constructive fashion, that for every finite-state recognizer, there is an equivalent formal regular expression.

This implies several important things. First and foremost, since we have also established that for every formal regular expression, there is an equivalent finite-state recognizer, we now know that the set of languages described by formal regular expressions–regular languages–is identical to the set of languages recognized by finite-state automata. Finite-state automata recognize regular languages, and regular languages can be recognized by finite-state automata.

Second, if we devise any arbitrary extension to formal regular expressions–or even an entirely new kind of language–and we also devise a way to compile such descriptions to finite-state recognizers, then we know that the languages we can describe with these extensions or languages are still regular languages.

Although we are not emphasizing performance, we also know that sentences in any such extensions or languages we may care to create can still be recognized in at worst linear time, because finite-state recognizers recognize sentences in at worst linear time.
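To make the linear-time claim concrete, here is a minimal sketch of a hand-written deterministic recognizer for binary numbers. The table layout and the names binaryTable and recognizes are assumptions made for this illustration, not the description format used by the rest of our code. The point is only that a deterministic recognizer performs at most one table lookup per symbol:

```javascript
// A hand-written deterministic recognizer for binary numbers.
// The table layout is an assumption made for this sketch.
const binaryTable = {
  start: 'start',
  accepting: new Set(['zero', 'notZero']),
  transitions: {
    start:   { '0': 'zero', '1': 'notZero' },
    zero:    {},
    notZero: { '0': 'notZero', '1': 'notZero' }
  }
};

function recognizes ({ start, accepting, transitions }, sentence) {
  let state = start;
  let steps = 0;

  for (const symbol of sentence) {
    const next = transitions[state][symbol];

    ++steps; // exactly one lookup per symbol consumed

    // no transition for this symbol: the recognizer halts and fails
    if (next === undefined) return { accepted: false, steps };

    state = next;
  }

  return { accepted: accepting.has(state), steps };
}

console.log(recognizes(binaryTable, '1011'));
  //=> { accepted: true, steps: 4 }
console.log(recognizes(binaryTable, '01'));
  //=> { accepted: false, steps: 2 }
```

The number of steps never exceeds the length of the sentence, no matter how many states the table has.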



Notes
  1. A more subtle issue is that all of our code for manipulating finite-state recognizers depends upon them having unique state names. Invoking union2(a, a) or catenation2(a, a) will not work properly because the names will clash. To make such expressions work, we have to make a duplicate of one of the arguments, e.g. union2(a, dup(a)) or catenation2(a, dup(a)). In this case, we invoked catenation2(a, zeroOrMore(dup(a))).

    None of this is a consideration with our existing code, because it always generates brand new recognizers with unique states. But when we manually write our own expressions in JavaScript, we have to guard against name clashes by hand. Which is another argument against writing expressions in JavaScript. aa and a|a in a formal regular expression “just work.” union2(a, a) and catenation2(a, a) don’t. 

  2. If you feel like having a go at one more, try implementing another quantification operator, explicit repetition. In many regexen flavours, we can write (expr){5} to indicate we wish to match (expr)(expr)(expr)(expr)(expr). The syntax allows other possibilities, such as (expr){2,3} and (expr){3,}, but ignoring those, the effect of (expr){n} for any n from 1 to 9 could be emulated with an infix operator, such as ⊗, so that (expr)⊗5 would be transpiled to (expr)(expr)(expr)(expr)(expr).

  3. Our source code uses a lot of double back-slashes, but this is an artefact of JavaScript the programming language using a backslash as its escape operator. The actual strings use a single backslash internally. 

  4. In most proofs, this function is called L, and its arguments are called p, q, and k. One-character names are terrific when writing proofs by hand using chalk and a blackboard, but we’ve moved on since 1951 and we’ll use descriptive names. Likewise, Kleene numbered the states in order to create an ordering that is easy to work with by hand. We’ll work with sets instead of numbers, because once again, we have computers to do all the bookkeeping for us.

  5. How eventually? With enough states in a recognizer, it could take a very long time. This particular algorithm has exponential running time! But that being said, we are writing it to prove that it can be done; we don’t need to run it in order to recognize sentences.

https://raganwald.com/2019/12/17/regular-expressions
Exploring Regular Expressions and Finite-State Recognizers, Part I

Snowblower

Prelude

In this essay, we’re going to explore regular expressions by implementing regular expressions.

This essay will be of interest to anyone who would like a refresher on the fundamentals of regular expressions and pattern matching. It is not intended as a practical “how-to” for using modern regexen, but the exercise of implementing basic regular expressions is a terrific foundation for understanding how to use and optimize regular expressions in production code.

As we develop our implementation, we will demonstrate a number of important results concerning regular expressions, regular languages, and finite-state automata, such as:

  • For every formal regular expression, there exists an equivalent finite-state recognizer.
  • For every finite-state recognizer with epsilon-transitions, there exists a finite-state recognizer without epsilon-transitions.
  • For every finite-state recognizer, there exists an equivalent deterministic finite-state recognizer.
  • The set of finite-state recognizers is closed under union, catenation, and kleene*.
  • Every regular language can be recognized by a finite-state recognizer.

Then, in Part II, we will explore more features of regular expressions, and show that if a finite-state automaton recognizes a language, that language is regular.

All of these things have been proven, and there are numerous explanations of the proofs available in literature and online. In this essay, we will demonstrate these results in a constructive proof style. For example, to demonstrate that for every formal regular expression, there exists an equivalent finite-state recognizer, we will construct a function that takes a formal regular expression as an argument, and returns an equivalent finite-state recognizer.1

We’ll also look at various extensions to formal regular languages that make it easier to write regular expressions. Some–like + and ?–will mirror existing regex features, while others–like ∩ (intersection), \ (difference), and ¬ (complement)–do not have direct regex equivalents.

When we’re finished, we’ll know a lot more about regular expressions, finite-state recognizers, and pattern matching.

If you are somewhat familiar with formal regular expressions (and the regexen we find in programming tools), feel free to skip the rest of the prelude and jump directly to the Table of Contents.


what is a regular expression?

In programming jargon, a regular expression, or regex (plural “regexen”),2 is a sequence of characters that defines a search pattern. Regexen can also be used to validate that a string has a particular form. For example, /ab*c/ is a regex that matches an a, zero or more bs, and then a c, anywhere in a string.

Regexen are–fundamentally–descriptions of sets of strings. A simple example is the regex /^(0|1(0|1)*)$/, which describes the set of all strings that represent whole numbers in base 2, also known as the “binary numbers.” (The parentheses matter: without them, the ^ and $ anchors would each apply to only one side of the |.)

In computer science, the strings that a regular expression matches are known as “sentences,” and the set of all strings that a regular expression matches is known as the “language” that a regular expression matches.

So for the regex /^(0|1(0|1)*)$/, its language is “The set of all binary numbers,” and strings like 0, 11, and 1010101 are sentences in its language, while strings like 01, two, and Kltpzyxm are sentences that are not in its language.3

Regexen are not descriptions of machines that recognize strings. Regexen describe “what,” but not “how.” To actually use regexen, we need an implementation, a machine that takes a regular expression and a string to be scanned, and returns–at the very minimum–whether or not the string matches the expression.

Regex implementations exist in most programming environments and most command-line environments. grep is a regex implementation. Languages like Ruby and JavaScript have regex libraries built in and provide syntactic support for writing regex literals directly in code.

Wrapping a regex in / characters is a syntactic convention in many languages that support regex literals, and we repeat it here to help distinguish regexen from formal regular expressions.
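Since JavaScript is one of the languages with regex literals, we can try these ideas directly. The following is a small sketch using the built-in RegExp test method; the variable names are made up for this illustration, and the binary-number regex is written with its alternation grouped so that the ^ and $ anchors apply to both branches:

```javascript
// /ab*c/ is unanchored, so it finds a match anywhere in the string.
console.log(/ab*c/.test('xxabbbcxx'));
  //=> true
console.log(/ab*c/.test('abd'));
  //=> false

// Anchored and grouped, so the whole string must be a binary number.
const binaryNumber = /^(0|1(0|1)*)$/;

// Keep only the candidate strings that are sentences in the language.
const inLanguage = ['0', '11', '1010101', '01', 'two', 'Kltpzyxm']
  .filter(s => binaryNumber.test(s));

console.log(inLanguage);
  //=> [ '0', '11', '1010101' ]
```

The regex says what the language is; the RegExp engine built into JavaScript is the machine that decides whether a given string belongs to it.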


formal regular expressions

Regex programming tools evolved as a practical application for Formal Regular Expressions, a concept discovered by Stephen Cole Kleene, who was exploring Regular Languages. Regular Expressions in the computer science sense are a tool for describing Regular Languages: Any well-formed regular expression describes a regular language, and every regular language can be described by a regular expression.

Formal regular expressions are made with three “atomic” or indivisible expressions:

  • The symbol ∅ describes the language with no sentences, { }, also called “the empty set.”
  • The symbol ε describes the language containing only the empty string, { '' }.
  • Literals such as x, y, or z describe languages containing single sentences, containing single symbols. e.g. The literal r describes the language { 'r' }.

What makes formal regular expressions powerful, is that we have operators for alternating, catenating, and quantifying regular expressions. Given that x is a regular expression describing some language X, and y is a regular expression describing some language Y:

  1. The expression x|y describes the union of the languages X and Y, meaning, the sentence w belongs to x|y if and only if w belongs to the language X, or w belongs to the language Y. We can also say that x|y represents the alternation of x and y.
  2. The expression xy describes the language XY, where a sentence ab belongs to the language XY if and only if a belongs to the language X, and b belongs to the language Y. We can also say that xy represents the catenation of the expressions x and y.
  3. The expression x* describes the language Z, where the sentence ε (the empty string) belongs to Z, and, the sentence pq belongs to Z if and only if p is a sentence belonging to X, and q is a sentence belonging to Z. We can also say that x* represents a quantification of x.

Before we add the last rule for regular expressions, let’s clarify these three rules with some examples. Given the constants a, b, and c, resolving to the languages { 'a' }, { 'b' }, and { 'c' }:

  • The expression b|c describes the language { 'b', 'c' }, by rule 1.
  • The expression ab describes the language { 'ab' } by rule 2.
  • The expression a* describes the language { '', 'a', 'aa', 'aaa', ... } by rule 3.

Our operations have a precedence, and it is the order of the rules as presented. So:

  • The expression a|bc describes the language { 'a', 'bc' } by rules 1 and 2.
  • The expression ab* describes the language { 'a', 'ab', 'abb', 'abbb', ... } by rules 2 and 3.
  • The expression b|c* describes the language { '', 'b', 'c', 'cc', 'ccc', ... } by rules 1 and 3.

As with the algebraic notation we are familiar with, we can use parentheses:

  • Given a regular expression x, the expression (x) describes the language described by x.

This allows us to alter the way the operators are combined. As we have seen, the expression b|c* describes the language { '', 'b', 'c', 'cc', 'ccc', ... }. But the expression (b|c)* describes the language { '', 'b', 'c', 'bb', 'bc', 'cb', 'cc', 'bbb', 'bbc', 'bcb', 'bcc', 'cbb', ... }.
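We can watch the three rules and the effect of parentheses play out by modelling languages as Sets of strings. This is only a sketch: real languages such as c* are infinite, so the helpers below (union, catenation, kleeneStar, and sorted, all hypothetical names) truncate everything to sentences of at most MAX symbols:

```javascript
// Finite approximations of languages: Sets of strings, truncated to MAX symbols.
const MAX = 2;

// Rule 1: w belongs to x|y if it belongs to X or to Y.
const union = (x, y) => new Set([...x, ...y]);

// Rule 2: ab belongs to xy if a belongs to X and b belongs to Y.
const catenation = (x, y) => {
  const result = new Set();

  for (const a of x) {
    for (const b of y) {
      if ((a + b).length <= MAX) result.add(a + b);
    }
  }

  return result;
};

// Rule 3, unrolled: start with { '' } and keep catenating x onto the front
// until nothing new (within the length bound) appears.
const kleeneStar = x => {
  let result = new Set(['']);

  while (true) {
    const bigger = union(result, catenation(x, result));

    if (bigger.size === result.size) return result;
    result = bigger;
  }
};

const b = new Set(['b']);
const c = new Set(['c']);

// Shortest sentences first, then alphabetical, for readable output.
const sorted = language =>
  [...language].sort((p, q) => p.length - q.length || (p < q ? -1 : p > q ? 1 : 0));

console.log(sorted(union(b, kleeneStar(c))));   // b|c*
  //=> [ '', 'b', 'c', 'cc' ]
console.log(sorted(kleeneStar(union(b, c))));   // (b|c)*
  //=> [ '', 'b', 'c', 'bb', 'bc', 'cb', 'cc' ]
```

Raising MAX reveals more of each language, but the difference between b|c* and (b|c)* is already visible at two symbols.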

It is quite obvious that regexen borrowed a lot of their syntax and semantics from regular expressions. Leaving aside the mechanism of capturing and extracting portions of a match, almost every regular expression is also a regex. For example, /reggiee*/ is a regular expression that matches words like reggie, reggiee, and reggieee anywhere in a string.

Regexen add a lot more affordances like character classes, the dot operator, decorators like ? and +, and so forth, but at their heart, regexen are based on regular expressions.

And now to the essay.


Roundhouse


Table of Contents

  • Prelude
  • Our First Goal: “For every regular expression, there exists an equivalent finite-state recognizer”
  • Evaluating Arithmetic Expressions
  • Finite-State Recognizers
  • Building Blocks
  • Alternating Regular Expressions
  • Taking the Product of Two Finite-State Automata
  • From Product to Union
  • Catenating Regular Expressions
  • Converting Nondeterministic to Deterministic Finite-State Recognizers
  • Quantifying Regular Expressions
  • What We Have Learned So Far

Ffestiniog Locomotive


Our First Goal: “For every regular expression, there exists an equivalent finite-state recognizer”

As mentioned in the Prelude, Stephen Cole Kleene developed the concept of formal regular expressions and regular languages, and published a seminal theorem about their behaviour in 1951.

Regular expressions are not machines. In and of themselves, they don’t generate sentences in a language, nor do they recognize whether sentences belong to a language. They define the language, and it’s up to us to build machines that do things like generate or recognize sentences.

Kleene studied machines that can recognize sentences in languages. Studying such machines informs us about the fundamental nature of the computation involved. In the case of formal regular expressions and regular languages, Kleene established that for every regular language, there is a finite-state automaton that recognizes sentences in that language.

(Finite-state automata that are arranged to recognize sentences in languages are also called “finite-state recognizers,” and that is the term we will use from here on.)

Kleene also established that for every finite-state recognizer, there is a formal regular expression that describes the language that the finite-state recognizer accepts. In proving these two things, he proved that the set of all regular expressions and the set of all finite-state recognizers are equivalent.

In the first part of this essay, we are going to demonstrate these two important components of Kleene’s theorem by writing JavaScript code, starting with a demonstration that “For every regular expression, there exists an equivalent finite-state recognizer.”


our approach

Our approach to demonstrating that for every regular expression, there exists an equivalent finite-state recognizer will be to write a program that takes as its input a regular expression, and produces as its output a description of a finite-state recognizer that accepts sentences in the language described by the regular expression.

Or in computer jargon, we’re going to write a regular expression to finite-state recognizer compiler. Compilers and interpreters are obviously extremely interesting tools for practical programming: They establish an equivalency between expressing an algorithm in a language that humans understand, and expressing an equivalent algorithm in a language a machine understands.

Our compiler will work like this: Instead of thinking of a formal regular expression as a description of a language, we will think of it as an expression, that when evaluated, returns a finite-state recognizer.

Our “compiler” will thus be an algorithm that evaluates regular expressions.


Evaluating Arithmetic Expressions

We needn’t invent our evaluation algorithm from first principles. There is a great deal of literature about evaluating expressions, especially expressions that consist of values, operators, and parentheses.

One simple and easy-to-work-with approach works like this:

  1. Take an expression in infix notation (when we say “infix notation,” we include expressions that contain prefix operators, postfix operators, and parentheses).
  2. Convert the expression to reverse-polish representation, also called reverse-polish notation, or “RPN.”
  3. Push the RPN onto a stack.
  4. Evaluate the RPN using a stack machine.

Before we write code to do this, we’ll do it by hand for a small expression, 3*2+4!:

Presuming that the postfix ! operator has the highest precedence, followed by the infix * and then the infix + has the lowest precedence, 3*2+4! in infix notation becomes [3, 2, *, 4, !, +] in reverse-polish representation.

Evaluating [3, 2, *, 4, !, +] with a stack machine works by taking each of the values and operators in order. If a value is next, push it onto the stack. If an operator is next, pop the necessary number of arguments off, apply the operator to the arguments, and push the result back onto the stack. If the reverse-polish representation is well-formed, after processing the last item from the input, there will be exactly one value on the stack, and that is the result of evaluating the reverse-polish representation.

Let’s try it:

  1. The first item is a 3. We push it onto the stack, which becomes [3].
  2. The next item is a 2. We push it onto the stack, which becomes [3, 2].
  3. The next item is a *, which is an operator with an arity of two.
  4. We pop 2 and 3 off the stack, which becomes [].
  5. We evaluate *(3, 2) (in a pseudo-functional form). The result is 6.
  6. We push 6 onto the stack, which becomes [6].
  7. The next item is 4. We push it onto the stack, which becomes [6, 4].
  8. The next item is a !, which is an operator with an arity of one.
  9. We pop 4 off the stack, which becomes [6].
  10. We evaluate !(4) (in a pseudo-functional form). The result is 24.
  11. We push 24 onto the stack, which becomes [6, 24].
  12. The next item is a +, which is an operator with an arity of two.
  13. We pop 24 and 6 off the stack, which becomes [].
  14. We evaluate +(6, 24) (in a pseudo-functional form). The result is 30.
  15. We push 30 onto the stack, which becomes [30].
  16. There are no more items to process, and the stack contains one value, [30]. We therefore return 30 as the result of evaluating [3, 2, *, 4, !, +].
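The sixteen steps above can be sketched as a tiny stack machine. This is an illustration under simplifying assumptions, not the evaluator we will end up with: the operators table, its arity field, and the evaluateRPN name are all hypothetical, and operators are represented as plain strings sitting next to numeric values:

```javascript
// Hypothetical operator table for this sketch: each operator knows how many
// arguments it pops (its arity) and how to combine them.
const operators = {
  '*': { arity: 2, fn: (a, b) => a * b },
  '+': { arity: 2, fn: (a, b) => a + b },
  '!': { arity: 1, fn: n => (n < 2 ? 1 : n * operators['!'].fn(n - 1)) }
};

function evaluateRPN (rpn) {
  const stack = [];

  for (const item of rpn) {
    if (Object.prototype.hasOwnProperty.call(operators, item)) {
      const { arity, fn } = operators[item];

      if (stack.length < arity) throw new Error(`Not enough values for ${item}`);

      // pop `arity` arguments off the top, keeping their original order
      const args = stack.splice(stack.length - arity, arity);

      stack.push(fn(...args));
    } else {
      // anything that is not an operator is a value
      stack.push(item);
    }
  }

  // a well-formed expression leaves exactly one value behind
  if (stack.length !== 1) throw new Error('Malformed reverse-polish representation');

  return stack[0];
}

console.log(evaluateRPN([3, 2, '*', 4, '!', '+']));
  //=> 30
```

Using splice to take the arguments off the end of the stack keeps them in left-to-right order, which matters for operators such as subtraction and division where order is significant.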

Let’s write this in code. We’ll start by writing an infix-to-reverse-polish representation converter. We are not writing a comprehensive arithmetic evaluator, so we will make a number of simplifying assumptions, including:

  • We will only handle single-digit values and single-character operators.
  • We will not allow ambiguous operators. For example, in ordinary arithmetic, the - operator is both a prefix operator that negates integer values, as well as an infix operator for subtraction. In our evaluator, - can be one, or the other, but not both.
  • We’ll only process strings when converting to reverse-polish representation. It’ll be up to the eventual evaluator to know that the string '3' is actually the number 3.
  • We aren’t going to allow whitespace. 1 + 1 will fail, 1+1 will not.

We’ll also parameterize the definitions for operators. This will allow us to reuse our evaluator for regular expressions simply by changing the operator definitions.


converting infix to reverse-polish representation

Here’s our definition for arithmetic operators:

const arithmetic = {
  operators: {
    '+': {
      symbol: Symbol('+'),
      type: 'infix',
      precedence: 1,
      fn: (a, b) => a + b
    },
    '-': {
      symbol: Symbol('-'),
      type: 'infix',
      precedence: 1,
      fn: (a, b) => a - b
    },
    '*': {
      symbol: Symbol('*'),
      type: 'infix',
      precedence: 3,
      fn: (a, b) => a * b
    },
    '/': {
      symbol: Symbol('/'),
      type: 'infix',
      precedence: 2,
      fn: (a, b) => a / b
    },
    '!': {
      symbol: Symbol('!'),
      type: 'postfix',
      precedence: 4,
      fn: function factorial(a, memo = 1) {
        if (a < 2) {
          // base case: 0! and 1! are both 1, so return the accumulator unchanged
          return memo;
        } else {
          return factorial(a - 1, a * memo);
        }
      }
    }
  }
};

Note that for each operator, we define a symbol. We’ll use that when we push things into the output queue so that our evaluator can disambiguate symbols from values (meaning, of course, that these symbols can’t be values). We also define a type, a precedence, and a function, fn, that the evaluator will use later.
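Why a JavaScript Symbol rather than the operator’s character? A short sketch shows the property we are relying on. The names plus, isOperatorSymbol, and rpn are made up for this illustration:

```javascript
const plus = Symbol('+');

// A symbol is a distinct primitive type, so it can never be confused
// with the string '+' appearing as a value.
console.log(typeof plus);
  //=> symbol
console.log(typeof '+');
  //=> string
console.log(plus === '+');
  //=> false

// Every call to Symbol() makes a brand-new symbol, even with the same description.
console.log(plus === Symbol('+'));
  //=> false

// The description is only there for display purposes.
console.log(plus.toString());
  //=> Symbol(+)

// Telling operators apart from values in an output queue is a type check.
const isOperatorSymbol = item => typeof item === 'symbol';
const rpn = ['2', '3', plus];

console.log(rpn.map(isOperatorSymbol));
  //=> [ false, false, true ]
```

Because each call to Symbol() produces a unique value, code that wants to compare against an operator’s symbol has to fetch it from the operator definitions, for example arithmetic.operators['+'].symbol, rather than construct a fresh Symbol('+').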

Armed with this, how do we convert infix expressions to reverse-polish representation? With a “shunting yard.”

The Shunting Yard Algorithm is a method for parsing mathematical expressions specified in infix notation with parentheses. As we implement it here, it will produce a reverse-polish representation without parentheses. The shunting yard algorithm was invented by Edsger Dijkstra, and named the “shunting yard” algorithm because its operation resembles that of a railroad shunting yard.

The shunting yard algorithm is stack-based. Infix expressions are the form of mathematical notation most people are used to, for instance 3 + 4 or 3 + 4 × (2 − 1). For the conversion there are two lists, the input and the output. There is also a stack that holds operators not yet added to the output queue. To convert, the program reads each symbol in order and does something based on that symbol. The result for the above examples would be (in Reverse Polish notation) 3 4 + and 3 4 2 1 − × +, respectively.

The Shunting Yard Algorithm © Salix alba

Here’s our shunting yard implementation. There are a few extra bits and bobs we’ll fill in in a moment:

function shuntingYardFirstCut (infixExpression, { operators }) {
  const operatorsMap = new Map(
    Object.entries(operators)
  );

  const representationOf =
    something => {
      if (operatorsMap.has(something)) {
        const { symbol } = operatorsMap.get(something);

        return symbol;
      } else if (typeof something === 'string') {
        return something;
      } else {
        error(`${something} is not a value`);
      }
    };
  const typeOf =
    symbol => operatorsMap.has(symbol) ? operatorsMap.get(symbol).type : 'value';
  const isInfix =
    symbol => typeOf(symbol) === 'infix';
  const isPostfix =
    symbol => typeOf(symbol) === 'postfix';
  const isCombinator =
    symbol => isInfix(symbol) || isPostfix(symbol);

  const input = infixExpression.split('');
  const operatorStack = [];
  const reversePolishRepresentation = [];
  let awaitingValue = true;

  while (input.length > 0) {
    const symbol = input.shift();

    if (symbol === '(' && awaitingValue) {
      // opening parenthesis case, going to build
      // a value
      operatorStack.push(symbol);
      awaitingValue = true;
    } else if (symbol === '(') {
      // value catenation
      error(`values ${peek(reversePolishRepresentation)} and ${symbol} cannot be catenated`);
    } else if (symbol === ')') {
      // closing parenthesis case, clear the
      // operator stack

      while (operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const op = operatorStack.pop();

        reversePolishRepresentation.push(representationOf(op));
      }

      if (peek(operatorStack) === '(') {
        operatorStack.pop();
        awaitingValue = false;
      } else {
        error('Unbalanced parentheses');
      }
    } else if (isCombinator(symbol)) {
      const { precedence } = operatorsMap.get(symbol);

      // pop higher-precedence operators off the operator stack
      while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

        if (precedence < opPrecedence) {
          const op = operatorStack.pop();

          reversePolishRepresentation.push(representationOf(op));
        } else {
          break;
        }
      }

      operatorStack.push(symbol);
      awaitingValue = isInfix(symbol);
    } else if (awaitingValue) {
      // as expected, go straight to the output

      reversePolishRepresentation.push(representationOf(symbol));
      awaitingValue = false;
    } else {
      // value catenation
      error(`values ${peek(reversePolishRepresentation)} and ${symbol} cannot be catenated`);
    }
  }

  // pop remaining symbols off the stack and push them
  while (operatorStack.length > 0) {
    const op = operatorStack.pop();

    if (operatorsMap.has(op)) {
      const { symbol: opSymbol } = operatorsMap.get(op);
      reversePolishRepresentation.push(opSymbol);
    } else {
      error(`Don't know how to push operator ${op}`);
    }
  }

  return reversePolishRepresentation;
}

Naturally, we need to test our work before moving on:

function deepEqual(obj1, obj2) {
  function isPrimitive(obj) {
      return (obj !== Object(obj));
  }

  if(obj1 === obj2) // it's just the same object. No need to compare.
      return true;

  if(isPrimitive(obj1) && isPrimitive(obj2)) // compare primitives
      return obj1 === obj2;

  if(Object.keys(obj1).length !== Object.keys(obj2).length)
      return false;

  // compare objects with same number of keys
  for(let key in obj1) {
      if(!(key in obj2)) return false; //other object doesn't have this prop
      if(!deepEqual(obj1[key], obj2[key])) return false;
  }

  return true;
}

const pp = list => list.map(x=>x.toString());

function verifyShunter (shunter, tests, ...additionalArgs) {
  try {
    const testList = Object.entries(tests);
    const numberOfTests = testList.length;

    const outcomes = testList.map(
      ([example, expected]) => {
        const actual = shunter(example, ...additionalArgs);

        if (deepEqual(actual, expected)) {
          return 'pass';
        } else {
          return `fail: ${JSON.stringify({ example, expected: pp(expected), actual: pp(actual) })}`;
        }
      }
    )

    const failures = outcomes.filter(result => result !== 'pass');
    const numberOfFailures = failures.length;
    const numberOfPasses = numberOfTests - numberOfFailures;

    if (numberOfFailures === 0) {
      console.log(`All ${numberOfPasses} tests passing`);
    } else {
      console.log(`${numberOfFailures} tests failing: ${failures.join('; ')}`);
    }
  } catch(error) {
    console.log(`Failed to validate the description: ${error}`)
  }
}

verifyShunter(shuntingYardFirstCut, {
  '3': [ '3' ],
  '2+3': ['2', '3', arithmetic.operators['+'].symbol],
  '4!': ['4', arithmetic.operators['!'].symbol],
  '3*2+4!': ['3', '2', arithmetic.operators['*'].symbol, '4', arithmetic.operators['!'].symbol, arithmetic.operators['+'].symbol],
  '(3*2+4)!': ['3', '2', arithmetic.operators['*'].symbol, '4', arithmetic.operators['+'].symbol, arithmetic.operators['!'].symbol]
}, arithmetic);
  //=> All 5 tests passing

handling a default operator

In mathematical notation, it is not always necessary to write a multiplication operator. For example, 2(3+4) is understood to be equivalent to 2 * (3 + 4).

Whenever two values are adjacent to each other in the input, we want our shunting yard to insert the missing * just as if it had been explicitly included. We will call * a “default operator,” as our next shunting yard will default to * if there is a missing infix operator.

shuntingYardFirstCut above has two places where it reports this as an error. Let’s modify it as follows: Whenever it encounters two values in succession, it will re-enqueue the default operator, re-enqueue the second value, and then proceed.
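Before looking at that modification, it may help to see what “inserting the missing operator” means in isolation. The sketch below is a different technique from the one just described: a hypothetical preprocessing pass, insertDefaultOperator, that rewrites the infix string itself instead of re-enqueueing symbols inside the shunting yard. The definitions object is a cut-down stand-in for our operator definitions, carrying only the type field:

```javascript
// Hypothetical preprocessing pass: make every implied default operator explicit.
function insertDefaultOperator (infixExpression, { operators, defaultOperator }) {
  const typeOf = symbol =>
    Object.prototype.hasOwnProperty.call(operators, symbol)
      ? operators[symbol].type
      : 'value';

  // Does this symbol end a value? (a value, a closing parenthesis, or a postfix operator)
  const endsValue = symbol =>
    symbol === ')' ||
    typeOf(symbol) === 'postfix' ||
    (symbol !== '(' && typeOf(symbol) === 'value');

  // Does this symbol begin a value? (a value, an opening parenthesis, or a prefix operator)
  const beginsValue = symbol =>
    symbol === '(' ||
    typeOf(symbol) === 'prefix' ||
    (symbol !== ')' && typeOf(symbol) === 'value');

  let output = '';
  let previous;

  for (const symbol of infixExpression) {
    // two values side by side: the default operator goes between them
    if (previous !== undefined && endsValue(previous) && beginsValue(symbol)) {
      output += defaultOperator;
    }
    output += symbol;
    previous = symbol;
  }

  return output;
}

// A cut-down stand-in for the operator definitions, for illustration only.
const definitions = {
  operators: {
    '+': { type: 'infix' },
    '*': { type: 'infix' },
    '!': { type: 'postfix' }
  },
  defaultOperator: '*'
};

console.log(insertDefaultOperator('2(3+4)', definitions));
  //=> 2*(3+4)
console.log(insertDefaultOperator('(1+2)(3+4)', definitions));
  //=> (1+2)*(3+4)
console.log(insertDefaultOperator('3!4', definitions));
  //=> 3!*4
```

A separate pass like this is easy to reason about, but it walks the input twice. Handling the default operator inside the shunting yard, as we do next, gets the same effect in a single pass.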

We’ll start with a way to denote which is the default operator, and then update our shunting yard code:4

const arithmeticB = {
  operators: arithmetic.operators,
  defaultOperator: '*'
}

function shuntingYardSecondCut (infixExpression, { operators, defaultOperator }) {
  const operatorsMap = new Map(
    Object.entries(operators)
  );

  const representationOf =
    something => {
      if (operatorsMap.has(something)) {
        const { symbol } = operatorsMap.get(something);

        return symbol;
      } else if (typeof something === 'string') {
        return something;
      } else {
        error(`${something} is not a value`);
      }
    };
  const typeOf =
    symbol => operatorsMap.has(symbol) ? operatorsMap.get(symbol).type : 'value';
  const isInfix =
    symbol => typeOf(symbol) === 'infix';
  const isPrefix =
    symbol => typeOf(symbol) === 'prefix';
  const isPostfix =
    symbol => typeOf(symbol) === 'postfix';
  const isCombinator =
    symbol => isInfix(symbol) || isPrefix(symbol) || isPostfix(symbol);
  const awaitsValue =
    symbol => isInfix(symbol) || isPrefix(symbol);

  const input = infixExpression.split('');
  const operatorStack = [];
  const reversePolishRepresentation = [];
  let awaitingValue = true;

  while (input.length > 0) {
    const symbol = input.shift();

    if (symbol === '(' && awaitingValue) {
      // opening parenthesis case, going to build
      // a value
      operatorStack.push(symbol);
      awaitingValue = true;
    } else if (symbol === '(') {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    } else if (symbol === ')') {
      // closing parenthesis case, clear the
      // operator stack

      while (operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const op = operatorStack.pop();

        reversePolishRepresentation.push(representationOf(op));
      }

      if (peek(operatorStack) === '(') {
        operatorStack.pop();
        awaitingValue = false;
      } else {
        error('Unbalanced parentheses');
      }
    } else if (isPrefix(symbol)) {
      if (awaitingValue) {
        const { precedence } = operatorsMap.get(symbol);

        // pop higher-precedence operators off the operator stack
        while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
          const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

          if (precedence < opPrecedence) {
            const op = operatorStack.pop();

            reversePolishRepresentation.push(representationOf(op));
          } else {
            break;
          }
        }

        operatorStack.push(symbol);
        awaitingValue = awaitsValue(symbol);
      } else {
        // value catenation

        input.unshift(symbol);
        input.unshift(defaultOperator);
        awaitingValue = false;
      }
    } else if (isCombinator(symbol)) {
      const { precedence } = operatorsMap.get(symbol);

      // pop higher-precedence operators off the operator stack
      while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

        if (precedence < opPrecedence) {
          const op = operatorStack.pop();

          reversePolishRepresentation.push(representationOf(op));
        } else {
          break;
        }
      }

      operatorStack.push(symbol);
      awaitingValue = awaitsValue(symbol);
    } else if (awaitingValue) {
      // as expected, go straight to the output

      reversePolishRepresentation.push(representationOf(symbol));
      awaitingValue = false;
    } else {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    }
  }

  // pop remaining symbols off the stack and push them
  while (operatorStack.length > 0) {
    const op = operatorStack.pop();

    if (operatorsMap.has(op)) {
      const { symbol: opSymbol } = operatorsMap.get(op);
      reversePolishRepresentation.push(opSymbol);
    } else {
      error(`Don't know how to push operator ${op}`);
    }
  }

  return reversePolishRepresentation;
}

verifyShunter(shuntingYardSecondCut, {
  '3': [ '3' ],
  '2+3': ['2', '3', arithmetic.operators['+'].symbol],
  '4!': ['4', arithmetic.operators['!'].symbol],
  '3*2+4!': ['3', '2', arithmetic.operators['*'].symbol, '4', arithmetic.operators['!'].symbol, arithmetic.operators['+'].symbol],
  '(3*2+4)!': ['3', '2', arithmetic.operators['*'].symbol, '4', arithmetic.operators['+'].symbol, arithmetic.operators['!'].symbol],
  '2(3+4)5': ['2', '3', '4', arithmeticB.operators['+'].symbol, '5', arithmeticB.operators['*'].symbol, arithmeticB.operators['*'].symbol],
  '3!2': ['3', arithmeticB.operators['!'].symbol, '2', arithmeticB.operators['*'].symbol]
}, arithmeticB);
  //=> All 7 tests passing

We now have enough to get started with evaluating the reverse-polish representation produced by our shunting yard.


evaluating the reverse-polish representation with a stack machine

Our first cut at the code for evaluating the reverse-polish representation produced by our shunting yard will take the definition for operators as an argument, along with a function for converting strings to values.

function stateMachine (representationList, {
  operators,
  toValue
}) {
  const functions = new Map(
    Object.entries(operators).map(
      ([key, { symbol, fn }]) => [symbol, fn]
    )
  );

  const stack = [];

  for (const element of representationList) {
    if (typeof element === 'string') {
      stack.push(toValue(element));
    } else if (functions.has(element)) {
      const fn = functions.get(element);
      const arity = fn.length;

      if (stack.length < arity) {
        // element is a Symbol here, so it must be converted explicitly
        error(`Not enough values on the stack to use ${String(element)}`);
      } else {
        const args = [];

        for (let counter = 0; counter < arity; ++counter) {
          args.unshift(stack.pop());
        }

        stack.push(fn.apply(null, args))
      }
    } else {
      error(`Don't know what to do with ${String(element)}`);
    }
  }
  if (stack.length === 0) {
    return undefined;
  } else if (stack.length > 1) {
    error(`should only be one value to return, but there were ${stack.length} values on the stack`);
  } else {
    return stack[0];
  }
}
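
To make the stack discipline concrete, here is a minimal, self-contained sketch of the same idea. It is not the function above: the names PLUS, TIMES, BANG, and evaluatePostfix are hypothetical, and the operator table is hard-coded rather than passed in as a definition. As in the listing above, each function's arity is read from its length property.

```javascript
// Hypothetical operator symbols and functions, for illustration only.
const PLUS = Symbol('+');
const TIMES = Symbol('*');
const BANG = Symbol('!');

const functions = new Map([
  [PLUS, (a, b) => a + b],
  [TIMES, (a, b) => a * b],
  [BANG, function factorial (n) { return n < 2 ? 1 : n * factorial(n - 1); }]
]);

function evaluatePostfix (list) {
  const stack = [];

  for (const element of list) {
    if (typeof element === 'string') {
      // strings are values: convert and push
      stack.push(Number.parseInt(element, 10));
    } else {
      const fn = functions.get(element);

      if (stack.length < fn.length) {
        throw new Error(`Not enough values for ${String(element)}`);
      }

      // take as many values off the top of the stack as the function
      // declares parameters, preserving their order
      const args = stack.splice(stack.length - fn.length, fn.length);

      stack.push(fn(...args));
    }
  }

  return stack[0];
}

// 3*2+4! in reverse-polish order
evaluatePostfix(['3', '2', TIMES, '4', BANG, PLUS]);
  //=> 30
```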

We can then wire the shunting yard up to the postfix evaluator, to make a function that evaluates infix notation:

function evaluateFirstCut (expression, definition) {
  return stateMachine(
    shuntingYardSecondCut(
      expression, definition
    ),
    definition
  );
}

const arithmeticC = {
  operators: arithmetic.operators,
  defaultOperator: '*',
  toValue: string => Number.parseInt(string, 10)
};

verify(evaluateFirstCut, {
  '': undefined,
  '3': 3,
  '2+3': 5,
  '4!': 24,
  '3*2+4!': 30,
  '(3*2+4)!': 3628800,
  '2(3+4)5': 70,
  '3!2': 12
}, arithmeticC);
  //=> All 8 tests passing

This extremely basic function evaluates:

  • infix expressions;
  • with parentheses, and infix operators (naturally);
  • with postfix operators;
  • with a default operator that handles the case when values are catenated.

That is enough to begin work on compiling regular expressions to finite-state recognizers.


Trainspotters


Finite-State Recognizers

If we’re going to compile regular expressions to finite-state recognizers, we need a representation for finite-state recognizers. There are many ways to notate finite-state automata. For example, state diagrams are particularly easy to read for smallish examples:

stateDiagram
  [*] --> start
  start --> zero : 0
  start --> one : 1
  one --> one : 0, 1
  zero --> [*]
  one --> [*]

Of course, diagrams are not particularly easy to work with in JavaScript. If we want to write JavaScript algorithms that operate on finite-state recognizers, we need a language for describing finite-state recognizers that JavaScript is comfortable manipulating.

describing finite-state recognizers in JSON

We don’t need to invent a brand-new format: there is already an accepted formal definition. Mind you, it involves mathematical symbols that are unfamiliar to some programmers, so without dumbing it down, we will create our own language that is equivalent to the full formal definition, but expressed in a subset of JSON.

JSON has the advantage that it is a language in the exact sense we want: An ordered set of symbols.

Now what do we need to encode? Finite-state recognizers are defined as a quintuple of (Σ, S, s, ẟ, F), where:

  • Σ is the alphabet of symbols this recognizer operates upon.
  • S is the set of states this recognizer can be in.
  • s is the initial or “start” state of the recognizer.
  • ẟ is the recognizer’s “state transition function” that governs how the recognizer changes states while it consumes symbols from the sentence it is attempting to recognize.
  • F is the set of “final” states. If the recognizer is in one of these states when the input ends, it has recognized the sentence.

For our immediate purposes, we do not need to encode the alphabet of symbols, and the set of states can always be derived from the rest of the description, so we don’t need to encode that either. This leaves us with describing the start state, transition function, and set of final states.

We can encode these with JSON. We’ll use descriptive words rather than mathematical symbols, but note that if we wanted to use the mathematical symbols, everything we’re doing would work just as well.

Our JSON representation will represent the start state, transition function, and set of final states as a Plain Old JavaScript Object (or “POJO”), rather than an array. This makes it easier to document what each element means:

{
  // elements...
}

The recognizer’s initial, or start state is required. It is a string representing the name of the initial state:

{
  "start": "start"
}

The recognizer’s state transition function, ẟ, is represented as a set of transitions, encoded as a list of POJOs, each of which represents exactly one transition:

{
  "transitions": [

  ]
}

Each transition defines a change in the recognizer’s state. Transitions are formally defined as triples of the form (p, a, q):

  • p is the state the recognizer is currently in.
  • a is the input symbol consumed.
  • q is the state the recognizer will be in after completing this transition. It can be the same as p, meaning that it consumes a symbol and remains in the same state.

We can represent this with POJOs. For readability by those unfamiliar with the formal notation, we will use the words from, consume, and to. This may feel like a lot of typing compared to the formal symbols, but we’ll get the computer to do our writing for us, and it doesn’t care.

Thus, one possible set of transitions might be encoded like this:

{
  "transitions": [
    { "from": "start", "consume": "0", "to": "zero" },
    { "from": "start", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ]
}

The recognizer’s set of final, or accepting states is required. It is encoded as a list of strings representing the names of the final states. If the recognizer is in any of the accepting (or “final”) states when the end of the sentence is reached (or equivalently, when there are no more symbols to consume), the recognizer accepts or “recognizes” the sentence.

{
  "accepting": ["zero", "notZero"]
}

Putting it all together, we have:

const binary = {
  "start": "start",
  "transitions": [
    { "from": "start", "consume": "0", "to": "zero" },
    { "from": "start", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": ["zero", "notZero"]
}

Our representation translates directly to this simplified state diagram:

stateDiagram
  [*] --> start
  start --> zero : 0
  start --> notZero : 1
  notZero --> notZero : 0, 1
  zero --> [*]
  notZero --> [*]

This finite-state recognizer recognizes binary numbers.
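
Earlier we said that the set of states can be derived from the rest of the description. Here is a small sketch of that derivation. The helper names statesOf and alphabetOf are hypothetical, and note that alphabetOf only finds the symbols that some transition actually consumes, which may be a subset of the alphabet the recognizer is meant to operate upon.

```javascript
// A copy of the binary number description, so this sketch stands alone.
const binaryExample = {
  "start": "start",
  "transitions": [
    { "from": "start", "consume": "0", "to": "zero" },
    { "from": "start", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": ["zero", "notZero"]
};

// Every state is either the start state, an accepting state,
// or one end of a transition.
function statesOf ({ start, transitions, accepting }) {
  const states = new Set([start, ...accepting]);

  for (const { from, to } of transitions) {
    states.add(from);
    states.add(to);
  }

  return states;
}

// The symbols mentioned by at least one transition.
function alphabetOf ({ transitions }) {
  return new Set(transitions.map(({ consume }) => consume));
}

[...statesOf(binaryExample)];
  //=> ['start', 'zero', 'notZero']
[...alphabetOf(binaryExample)];
  //=> ['0', '1']
```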


verifying finite-state recognizers

It’s all very well to say that a description recognizes binary numbers (or have any other expectation for it, really). But how do we have confidence that the finite-state recognizer we describe recognizes the language we think it recognizes?

There are formal ways to prove things about recognizers, and there is the informal technique of writing tests we can run. Since we’re emphasizing working code, we’ll write tests.

Here is a function that takes as its input the definition of a recognizer, and returns a JavaScript recognizer function:56

function automate (description) {
  if (description instanceof RegExp) {
    return string => !!description.exec(string)
  } else {
    const {
      stateMap,
      start,
      acceptingSet,
      transitions
    } = validatedAndProcessed(description);

    return function (input) {
      let state = start;

      for (const symbol of input) {
        const transitionsForThisState = stateMap.get(state) || [];
        const transition =
          transitionsForThisState.find(
            ({ consume }) => consume === symbol
          );

        if (transition == null) {
          return false;
        }

        state = transition.to;
      }

      // reached the end. do we accept?
      return acceptingSet.has(state);
    }
  }
}
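
The automate function relies on validatedAndProcessed, which is not shown in this section. As a rough sketch of the kind of preprocessing it needs, here is a hypothetical stand-in named processDescription. The validation rules are assumptions: it insists on a start state and on at most one transition per state and symbol, then indexes the transitions by the state they leave.

```javascript
// Hypothetical stand-in, not the validatedAndProcessed used above.
function processDescription ({ start, transitions = [], accepting = [] }) {
  if (typeof start !== 'string') {
    throw new Error('a description needs a start state');
  }

  // Index transitions by the state they leave.
  const stateMap = new Map();

  for (const transition of transitions) {
    const { from, consume } = transition;
    const fromThisState = stateMap.get(from) || [];

    // A deterministic recognizer has at most one transition
    // for any given state and symbol.
    if (fromThisState.some(t => t.consume === consume)) {
      throw new Error(`${from} has two transitions consuming ${consume}`);
    }

    stateMap.set(from, [...fromThisState, transition]);
  }

  return { start, transitions, stateMap, acceptingSet: new Set(accepting) };
}

const processed = processDescription({
  start: 'start',
  transitions: [
    { from: 'start', consume: '0', to: 'zero' },
    { from: 'start', consume: '1', to: 'notZero' }
  ],
  accepting: ['zero', 'notZero']
});

processed.stateMap.get('start').length;
  //=> 2
processed.acceptingSet.has('zero');
  //=> true
```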

Here we are using automate with our definition for recognizing binary numbers. We’ll use the verify function throughout our exploration to build simple tests-by-example:

function verifyRecognizer (recognizer, examples) {
  return verify(automate(recognizer), examples);
}

const binary = {
  "start": "start",
  "transitions": [
    { "from": "start", "consume": "0", "to": "zero" },
    { "from": "start", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": ["zero", "notZero"]
};

verifyRecognizer(binary, {
  '': false,
  '0': true,
  '1': true,
  '00': false,
  '01': false,
  '10': true,
  '11': true,
  '000': false,
  '001': false,
  '010': false,
  '011': false,
  '100': true,
  '101': true,
  '110': true,
  '111': true,
  '10100011011000001010011100101110111': true
});
  //=> All 16 tests passing

We now have a function, automate, that takes a data description of a finite-state automaton/recognizer, and returns a JavaScript recognizer function we can play with and verify.

Verifying recognizers will be extremely important when we compile regular expressions: we will want to confirm that the finite-state recognizer we produce is equivalent to the regular expression we started with.


Building Blocks

Regular expressions have a notation for the empty set, the empty string, and single characters:

  • The symbol ∅ describes the language with no sentences, also called “the empty set.”
  • The symbol ε describes the language containing only the empty string.
  • Literals such as x, y, or z describe languages containing single sentences, containing single symbols. e.g. The literal r describes the language R, which contains just one sentence: 'r'.

In order to compile such regular expressions into finite-state recognizers, we begin by defining functions that return the empty language, the language containing only the empty string, and languages with just one sentence containing one symbol.

∅ and ε

Here’s a function that returns a recognizer that doesn’t recognize any sentences:

const names = (() => {
  let i = 0;

  return function * names () {
    while (true) yield `G${++i}`;
  };
})();

function emptySet () {
  const [start] = names();

  return {
    start,
    "transitions": [],
    "accepting": []
  };
}

verifyRecognizer(emptySet(), {
  '': false,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

It’s called emptySet, because the set of all sentences this recognizer recognizes is empty. Note that while hand-written recognizers can have any arbitrary names for their states, we’re using the names generator to generate state names for us. This automatically avoids two recognizers ever having state names in common, which makes some of the code we write later a great deal simpler.
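
To see why the generated names never collide, here is a standalone copy of the generator under the hypothetical name freshNames. The counter lives in the enclosing closure rather than in any one generator instance, so every call continues counting where the previous one stopped, and destructuring takes only as many names as it asks for.

```javascript
// A standalone copy of the names generator, renamed for illustration.
const freshNames = (() => {
  let i = 0;

  return function * freshNames () {
    while (true) yield `G${++i}`;
  };
})();

// Each call creates a new generator, but they all share the counter i.
const [first] = freshNames();
const [second, third] = freshNames();

first;
  //=> 'G1'
second;
  //=> 'G2'
third;
  //=> 'G3'
```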

Now, how do we get our evaluator to handle it? Our evaluate function takes a definition object as a parameter, and that’s where we define operators. We’re going to define ∅ as an atomic operator.7

const regexA = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    }
  },
  // ...
};

Next, we need a recognizer that recognizes the language containing only the empty string, ''. Once again, we’ll write a function that returns a recognizer:

function emptyString () {
  const [start] = names();

  return {
    start,
    "transitions": [],
    "accepting": [start]
  };
}

verifyRecognizer(emptyString(), {
  '': true,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

And then we’ll add it to the definition:

const regexA = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    }
  },
  // ...
};

literal

What makes recognizers really useful is recognizing non-empty strings of one kind or another. This use case is so common, regexen are designed to make recognizing strings the easiest thing to write. For example, to recognize the string abc, we write /^abc$/:

verifyRecognizer(/^abc$/, {
  '': false,
  'a': false,
  'ab': false,
  'abc': true,
  '_abc': false,
  '_abc_': false,
  'abc_': false
})
  //=> All 7 tests passing

Here’s an example of a recognizer that recognizes a single zero:

stateDiagram
  [*]-->empty
  empty-->recognized : 0
  recognized-->[*]

We could write a function that returns a recognizer for 0, and then write another for every other symbol we might want to use in a recognizer, and then we could assign them all to atomic operators, but this would be tedious. Instead, here’s a function that makes recognizers that recognize a literal symbol:

function literal (symbol) {
  return {
    "start": "empty",
    "transitions": [
      { "from": "empty", "consume": symbol, "to": "recognized" }
    ],
    "accepting": ["recognized"]
  };
}

verifyRecognizer(literal('0'), {
  '': false,
  '0': true,
  '1': false,
  '01': false,
  '10': false,
  '11': false
});
  //=> All 6 tests passing

Now clearly, this cannot be an atomic operator. But recall that our function for evaluating postfix expressions has a special function, toValue, for translating strings into values. In a calculator, the values were integers. In our compiler, the values are finite-state recognizers.

Our approach to handling constant literals will be to use toValue to perform the translation for us:

const regexA = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    }
  },
  defaultOperator: undefined,
  toValue (string) {
    return literal(string);
  }
};

using ∅, ε, and literal

Now that we have defined operators for ∅ and ε, and now that we have written toValue to use literal, we can use evaluate to generate recognizers from the most basic of regular expressions:

const emptySetRecognizer = evaluate(`∅`, regexA);
const emptyStringRecognizer = evaluate(`ε`, regexA);
const rRecognizer = evaluate('r', regexA);

verifyRecognizer(emptySetRecognizer, {
  '': false,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

verifyRecognizer(emptyStringRecognizer, {
  '': true,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

verifyRecognizer(rRecognizer, {
  '': false,
  'r': true,
  'R': false,
  'reg': false,
  'Reg': false
});
  //=> All 5 tests passing

We’ll do this enough that it’s worth building a helper for verifying our work:

function verifyEvaluateFirstCut (expression, definition, examples) {
  return verify(
    automate(evaluateFirstCut(expression, definition)),
    examples
  );
}

verifyEvaluateFirstCut('∅', regexA, {
  '': false,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

verifyEvaluateFirstCut(`ε`, regexA, {
  '': true,
  '0': false,
  '1': false
});
  //=> All 3 tests passing

verifyEvaluateFirstCut('r', regexA, {
  '': false,
  'r': true,
  'R': false,
  'reg': false,
  'Reg': false
});
  //=> All 5 tests passing

Great! We have something to work with, namely constants. Before we get to building expressions using operators and so forth, let’s solve the little problem we hinted at when making ∅ and ε into operators.


recognizing special characters

There is a bug in our code so far. Or rather, a glaring omission: How do we write a recognizer that recognizes the characters ∅ or ε?

This is not really necessary for demonstrating the general idea that we can compile any regular expression into a finite-state recognizer, but once we start adding operators like * and ?, not to mention extensions like + or ?, the utility of our demonstration code will fall dramatically if we cannot also recognize those characters literally.

Now we’ve already made ∅ and ε into atomic operators, so now the question becomes, how do we write a regular expression with literal ∅ or ε characters in it? And not to mention, literal parentheses?

Let’s go with the most popular approach, and incorporate an escape symbol. In most languages, including regexen, that symbol is a \. We could do the same, but JavaScript already interprets \ as an escape, so our work would be littered with double backslashes to get JavaScript to recognize a single \.

We’ll set it up so that we can choose whatever we like, but by default we’ll use a back-tick:

function shuntingYard (
  infixExpression,
  {
    operators,
    defaultOperator,
    escapeSymbol = '`',
    escapedValue = string => string
  }
) {
  const operatorsMap = new Map(
    Object.entries(operators)
  );

  const representationOf =
    something => {
      if (operatorsMap.has(something)) {
        const { symbol } = operatorsMap.get(something);

        return symbol;
      } else if (typeof something === 'string') {
        return something;
      } else {
        error(`${something} is not a value`);
      }
    };
  const typeOf =
    symbol => operatorsMap.has(symbol) ? operatorsMap.get(symbol).type : 'value';
  const isInfix =
    symbol => typeOf(symbol) === 'infix';
  const isPrefix =
    symbol => typeOf(symbol) === 'prefix';
  const isPostfix =
    symbol => typeOf(symbol) === 'postfix';
  const isCombinator =
    symbol => isInfix(symbol) || isPrefix(symbol) || isPostfix(symbol);
  const awaitsValue =
    symbol => isInfix(symbol) || isPrefix(symbol);

  const input = infixExpression.split('');
  const operatorStack = [];
  const reversePolishRepresentation = [];
  let awaitingValue = true;

  while (input.length > 0) {
    const symbol = input.shift();

    if (symbol === escapeSymbol) {
      if (input.length === 0) {
        error(`Escape symbol ${escapeSymbol} has no following symbol`);
      } else {
        const valueSymbol = input.shift();

        if (awaitingValue) {
          // push the escaped value of the symbol

          reversePolishRepresentation.push(escapedValue(valueSymbol));
        } else {
          // value catenation

          input.unshift(valueSymbol);
          input.unshift(escapeSymbol);
          input.unshift(defaultOperator);
        }
        awaitingValue = false;
      }
    } else if (symbol === '(' && awaitingValue) {
      // opening parenthesis case, going to build
      // a value
      operatorStack.push(symbol);
      awaitingValue = true;
    } else if (symbol === '(') {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    } else if (symbol === ')') {
      // closing parenthesis case, clear the
      // operator stack

      while (operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const op = operatorStack.pop();

        reversePolishRepresentation.push(representationOf(op));
      }

      if (peek(operatorStack) === '(') {
        operatorStack.pop();
        awaitingValue = false;
      } else {
        error('Unbalanced parentheses');
      }
    } else if (isPrefix(symbol)) {
      if (awaitingValue) {
        const { precedence } = operatorsMap.get(symbol);

        // pop higher-precedence operators off the operator stack
        while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
          const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

          if (precedence < opPrecedence) {
            const op = operatorStack.pop();

            reversePolishRepresentation.push(representationOf(op));
          } else {
            break;
          }
        }

        operatorStack.push(symbol);
        awaitingValue = awaitsValue(symbol);
      } else {
        // value catenation

        input.unshift(symbol);
        input.unshift(defaultOperator);
        awaitingValue = false;
      }
    } else if (isCombinator(symbol)) {
      const { precedence } = operatorsMap.get(symbol);

      // pop higher-precedence operators off the operator stack
      while (isCombinator(symbol) && operatorStack.length > 0 && peek(operatorStack) !== '(') {
        const opPrecedence = operatorsMap.get(peek(operatorStack)).precedence;

        if (precedence < opPrecedence) {
          const op = operatorStack.pop();

          reversePolishRepresentation.push(representationOf(op));
        } else {
          break;
        }
      }

      operatorStack.push(symbol);
      awaitingValue = awaitsValue(symbol);
    } else if (awaitingValue) {
      // as expected, go straight to the output

      reversePolishRepresentation.push(representationOf(symbol));
      awaitingValue = false;
    } else {
      // value catenation

      input.unshift(symbol);
      input.unshift(defaultOperator);
      awaitingValue = false;
    }
  }

  // pop remaining symbols off the stack and push them
  while (operatorStack.length > 0) {
    const op = operatorStack.pop();

    if (operatorsMap.has(op)) {
      const { symbol: opSymbol } = operatorsMap.get(op);
      reversePolishRepresentation.push(opSymbol);
    } else {
      error(`Don't know how to push operator ${op}`);
    }
  }

  return reversePolishRepresentation;
}

function evaluate (expression, definition) {
  return stateMachine(
    shuntingYard(
      expression, definition
    ),
    definition
  );
}

And now to test it:

function verifyEvaluate (expression, definition, examples) {
  return verify(
    automate(evaluate(expression, definition)),
    examples
  );
}

verifyEvaluate('∅', regexA, {
  '': false,
  '∅': false,
  'ε': false
});
  //=> All 3 tests passing

verifyEvaluate('`∅', regexA, {
  '': false,
  '∅': true,
  'ε': false
});
  //=> All 3 tests passing

verifyEvaluate('ε', regexA, {
  '': true,
  '∅': false,
  'ε': false
});
  //=> All 3 tests passing

verifyEvaluate('`ε', regexA, {
  '': false,
  '∅': false,
  'ε': true
});
  //=> All 3 tests passing
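
The reason escaping works is that operators are represented by Symbols while values are represented by strings, so an escaped ∅ and an unescaped ∅ are different kinds of thing by the time they reach the stack machine. Here is a much-reduced sketch of just that distinction. The tokenize function and the operatorSymbols table are hypothetical, and the sketch ignores parentheses, precedence, and catenation entirely.

```javascript
// Hypothetical operator table: characters map to Symbols,
// in the same spirit as the definitions above.
const operatorSymbols = new Map([
  ['∅', Symbol('∅')],
  ['ε', Symbol('ε')]
]);

function tokenize (expression, escapeSymbol = '`') {
  const input = expression.split('');
  const tokens = [];

  while (input.length > 0) {
    const symbol = input.shift();

    if (symbol === escapeSymbol) {
      if (input.length === 0) {
        throw new Error(`Escape symbol ${escapeSymbol} has no following symbol`);
      }
      // whatever follows the escape is kept as a plain string,
      // even if it would otherwise be an operator
      tokens.push(input.shift());
    } else if (operatorSymbols.has(symbol)) {
      tokens.push(operatorSymbols.get(symbol));
    } else {
      tokens.push(symbol);
    }
  }

  return tokens;
}

tokenize('∅').map(token => typeof token);
  //=> ['symbol']
tokenize('`∅').map(token => typeof token);
  //=> ['string']
```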

And now it’s time for what we might call the main event: Expressions that use operators.

Composable recognizers and patterns are particularly interesting. Just as human languages are built by layers of composition, all sorts of mechanical languages are structured using composition. JSON is a perfect example: A JSON element like a list is composed of zero or more arbitrary JSON elements, which themselves could be lists, and so forth.

Regular expressions and regexen are both built with composition. If you have two regular expressions, a and b, you can create a new regular expression that is the union of a and b with the expression a|b, and you can create a new regular expression that is the catenation of a and b with ab.

Our evaluate functions don’t know how to do that, and we aren’t going to update them to try. Instead, we’ll write combinator functions that take two recognizers and return the finite-state recognizer representing the alternation, or catenation of their arguments.

We’ll begin with alternation.


Shunting


Alternating Regular Expressions

So far, we only have recognizers for the empty set, the empty string, and any one character. Nevertheless, we will build alternation to handle any two recognizers, because that’s exactly how the rules of regular expressions define it:

  1. The expression x|y describes the union of the languages X and Y, meaning, the sentence w belongs to x|y if and only if w belongs to the language X or w belongs to the language Y. We can also say that x|y represents the alternation of x and y.
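
The rule above can be stated directly in code if we are willing to work with recognizer functions rather than descriptions. This is only a sketch of the definition: the names zeroes, ones, and unionOf are hypothetical, and it uses JavaScript regexen as a shortcut. It tells us what alternation must mean, not how to build a single finite-state recognizer for it, which is the harder problem the rest of this section takes on.

```javascript
// Two hypothetical recognizer functions, built from regexen for brevity.
const zeroes = sentence => /^0+$/.test(sentence);
const ones = sentence => /^1+$/.test(sentence);

// w belongs to x|y if and only if w belongs to X or w belongs to Y.
const unionOf = (x, y) => sentence => x(sentence) || y(sentence);

const zeroesOrOnes = unionOf(zeroes, ones);

zeroesOrOnes('000');
  //=> true
zeroesOrOnes('11');
  //=> true
zeroesOrOnes('01');
  //=> false
```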

We’ll get started with a function that computes the union of the descriptions of two finite-state recognizers, which is built on a very useful operation, taking the product of two finite-state automata.


Taking the Product of Two Finite-State Automata

Consider two finite-state recognizers. The first, a, recognizes a string of one or more zeroes:

stateDiagram
  [*]-->emptyA
  emptyA-->zero : 0
  zero-->zero : 0
  zero-->[*]

The second, b, recognizes a string of one or more ones:

stateDiagram
  [*]-->emptyB
  emptyB-->one : 1
  one--> one : 1
  one-->[*]

Recognizer a has two declared states: 'emptyA' and 'zero'. Recognizer b also has two declared states: 'emptyB' and 'one'. Both also have an undeclared state: they can halt. As a convention, we will refer to the halted state as an empty string, ''.

Thus, recognizer a has three possible states: 'emptyA', 'zero', and ''. Likewise, recognizer b has three possible states: 'emptyB', 'one', and ''.

Now let us imagine the two recognizers are operating concurrently on the same stream of symbols:

stateDiagram
  simultaneous
  state simultaneous {
    [*]-->emptyA
    emptyA-->zero : 0
    zero-->zero : 0
    zero-->[*]
    --
    [*]-->emptyB
    emptyB-->one : 1
    one--> one : 1
    one-->[*]
  }

At any one time, there are nine possible combinations of states the two machines could be in:

a           b
''          ''
''          'emptyB'
''          'one'
'emptyA'    ''
'emptyA'    'emptyB'
'emptyA'    'one'
'zero'      ''
'zero'      'emptyB'
'zero'      'one'

If we wish to simulate the actions of the two recognizers operating concurrently, we could do so if we had a finite-state automaton with nine states, one for each of the pairs of states that a and b could be in.

It will look something like this:

stateDiagram
  state1 : '' and ''
  state2 : '' and 'emptyB'
  state3 : '' and 'one'

stateDiagram
  state4 : 'emptyA' and ''
  state5 : 'emptyA' and 'emptyB'
  state6 : 'emptyA' and 'one'

stateDiagram
  state7 : 'zero' and ''
  state8 : 'zero' and 'emptyB'
  state9 : 'zero' and 'one'

The reason this is called the product of a and b is that the product of the sets { '', 'emptyA', 'zero' } and { '', 'emptyB', 'one' } is the set of tuples { ('', ''), ('', 'emptyB'), ('', 'one'), ('emptyA', ''), ('emptyA', 'emptyB'), ('emptyA', 'one'), ('zero', ''), ('zero', 'emptyB'), ('zero', 'one') }.

There will be (at most) one state in the product state machine for each tuple in the product of the sets of states for a and b.
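
The product of the two sets of states is an ordinary Cartesian product, which takes only a line or two to compute. In this sketch the variable names are hypothetical, and '' stands for the halted state as described above.

```javascript
// The states of a and b, including '' for "halted".
const statesOfA = ['', 'emptyA', 'zero'];
const statesOfB = ['', 'emptyB', 'one'];

// Pair every state of a with every state of b.
const productStates = statesOfA.flatMap(
  stateA => statesOfB.map(stateB => [stateA, stateB])
);

productStates.length;
  //=> 9
productStates[4];
  //=> ['emptyA', 'emptyB'], the pair labelled state5 above
```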

We haven’t decided where such an automaton would start, how it transitions between its states, and which states should be accepting states. We’ll go through those in that order.


starting the product

Now let’s consider a and b simultaneously reading the same string of symbols in parallel. What states would they respectively start in? emptyA and emptyB, of course, therefore our product will begin in state5, which corresponds to emptyA and emptyB:

stateDiagram
  [*] --> state5
  state5 : 'emptyA' and 'emptyB'

This is a rule for constructing products: The product of two recognizers begins in the state corresponding to the start state for each of the recognizers.


transitions

Now let’s think about our product automaton. It begins in state5. What transitions can it make from there? We can work that out by looking at the transitions for emptyA and emptyB.

Given that the product is in a state corresponding to a being in state Fa and b being in state Fb, we’ll follow these rules for determining the transitions from the state (Fa and Fb):

  1. If when a is in state Fa it consumes a symbol S and transitions to state Ta, but when b is in state Fb it does not consume the symbol S, then the product of a and b will consume S and transition to the state (Ta and ''), denoting that were the two recognizers operating concurrently, a would transition to state Ta while b would halt.
  2. If when a is in state Fa it does not consume a symbol S, but when b is in state Fb it consumes the symbol S and transitions to state Tb, then the product of a and b will consume S and transition to ('' and Tb), denoting that were the two recognizers operating concurrently, a would halt while b would transition to state Tb.
  3. If when a is in state Fa it consumes a symbol S and transitions to state Ta, and also if when b is in state Fb it consumes the symbol S and transitions to state Tb, then the product of a and b will consume S and transition to (Ta and Tb), denoting that were the two recognizers operating concurrently, a would transition to state Ta while b would transition to state Tb.
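
The three rules collapse into a single step function once we treat "does not consume the symbol" as "transitions to the halted state ''". The sketch below is a simplified illustration rather than the product function developed later: recognizerA, recognizerB, nextState, and productStep are hypothetical names, and pairs of states are represented as two-element arrays.

```javascript
// Recognizers for one or more zeroes, and one or more ones.
const recognizerA = {
  start: 'emptyA',
  transitions: [
    { from: 'emptyA', consume: '0', to: 'zero' },
    { from: 'zero', consume: '0', to: 'zero' }
  ],
  accepting: ['zero']
};
const recognizerB = {
  start: 'emptyB',
  transitions: [
    { from: 'emptyB', consume: '1', to: 'one' },
    { from: 'one', consume: '1', to: 'one' }
  ],
  accepting: ['one']
};

// Where a single recognizer goes from a state on a symbol;
// '' means it halts (a halted recognizer stays halted).
function nextState (recognizer, state, symbol) {
  const transition = recognizer.transitions.find(
    ({ from, consume }) => from === state && consume === symbol
  );

  return transition ? transition.to : '';
}

// Rules 1, 2, and 3 in one place. When both recognizers halt,
// the product halts too, which we signal with undefined.
function productStep ([fromA, fromB], symbol) {
  const toA = nextState(recognizerA, fromA, symbol);
  const toB = nextState(recognizerB, fromB, symbol);

  return (toA === '' && toB === '') ? undefined : [toA, toB];
}

productStep(['emptyA', 'emptyB'], '0');
  //=> ['zero', '']   (rule 1)
productStep(['emptyA', 'emptyB'], '1');
  //=> ['', 'one']    (rule 2)
productStep(['zero', ''], '1');
  //=> undefined      (both halt)
```

Rule 3 never fires for these two recognizers, because they have no symbol in common.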

When our product is in state 'state5', it corresponds to the states ('emptyA' and 'emptyB'). Well, when a is in state 'emptyA', it consumes 0 and transitions to 'zero', but when b is in 'emptyB', it does not consume 0.

Therefore, by rule 1, when the product is in state 'state5' corresponding to the states ('emptyA' and 'emptyB'), it consumes 0 and transitions to 'state7' corresponding to the states ('zero' and ''):

stateDiagram
  [*] --> state5
  state5 --> state7 : 0
  state5 : 'emptyA' and 'emptyB'
  state7 : 'zero' and ''

And by rule 2, when the product is in state 'state5' corresponding to the states ('emptyA' and 'emptyB'), it consumes 1 and transitions to 'state3', corresponding to the states ('' and 'one'):

stateDiagram
  [*] --> state5
  state5 --> state7 : 0
  state5 --> state3 : 1
  state3: '' and 'one'
  state5 : 'emptyA' and 'emptyB'
  state7 : 'zero' and ''

What transitions take place from state 'state7'? b is halted in 'state7', and therefore b doesn’t consume any symbols in 'state7', and therefore we can apply rule 1 to the case where a consumes a 0 from state 'zero' and transitions to state 'zero':

stateDiagram [*] --> state5 state5 --> state7 : 0 state5 --> state3 : 1 state7 --> state7 : 0 state3: '' and 'one' state5 : 'emptyA' and 'emptyB' state7 : 'zero' and ''

We can always apply rule 1 to any state where b is halted, and it follows that all of the transitions from a state where b is halted will lead to states where b is halted. Now what about state 'state3'?

Well, by similar logic, since a is halted in state 'state3', and b consumes a 1 in state 'one' and transitions back to state 'one', we apply rule 2 and get:

stateDiagram [*] --> state5 state5 --> state7 : 0 state5 --> state3 : 1 state7 --> state7 : 0 state3 --> state3 : 1 state3: '' and 'one' state5 : 'emptyA' and 'emptyB' state7 : 'zero' and ''

We could apply our rules to the other six states, but we don’t need to: The states 'state2', 'state4', 'state6', 'state8', and 'state9' are unreachable from the starting state 'state5'.

And 'state1' need not be included: When both a and b halt, then the product of a and b also halts. So we can leave it out.

Thus, if we begin with the start state and then recursively follow transitions, we will automatically end up with the subset of all possible product states that are reachable given the transitions for a and b.


a function to compute the product of two recognizers

Here is a function that takes the product of two recognizers:

// A state aggregator maps a set of states
// (such as the two states forming part of the product
// of two finite-state recognizers) to a new state.
class StateAggregator {
  constructor () {
    this.map = new Map();
    this.inverseMap = new Map();
  }

  stateFromSet (...states) {
    const materialStates = states.filter(s => s != null);

    if (materialStates.some(ms=>this.inverseMap.has(ms))) {
      error(`Surprise! Aggregating an aggregate!!`);
    }

    if (materialStates.length === 0) {
      error('tried to get an aggregate state name for no states');
    } else if (materialStates.length === 1) {
      // do not need a new state name
      return materialStates[0];
    } else {
      const key = materialStates.sort().map(s=>`(${s})`).join('');

      if (this.map.has(key)) {
        return this.map.get(key);
      } else {
        const [newState] = names();

        this.map.set(key, newState);
        this.inverseMap.set(newState, new Set(materialStates));

        return newState;
      }
    }
  }

  setFromState (state) {
    if (this.inverseMap.has(state)) {
      return this.inverseMap.get(state);
    } else {
      return new Set([state]);
    }
  }
}

function product (a, b, P = new StateAggregator()) {
  const {
    stateMap: aStateMap,
    start: aStart
  } = validatedAndProcessed(a);
  const {
    stateMap: bStateMap,
    start: bStart
  } = validatedAndProcessed(b);

  // R is a collection of states "remaining" to be analyzed
  // it is a map from the product's state name to the individual states
  // for a and b
  const R = new Map();

  // T is a collection of states already analyzed
  // it is a map from a product's state name to the transitions
  // for that state
  const T = new Map();

  // seed R
  const start = P.stateFromSet(aStart, bStart);
  R.set(start, [aStart, bStart]);

  while (R.size > 0) {
    const [[abState, [aState, bState]]] = R.entries();
    const aTransitions = aState != null ? (aStateMap.get(aState) || []) : [];
    const bTransitions = bState != null ? (bStateMap.get(bState) || []) : [];

    let abTransitions = [];

    if (T.has(abState)) {
      error(`Error taking product: T and R both have ${abState} at the same time.`);
    }

    if (aTransitions.length === 0 && bTransitions.length == 0) {
      // dead end for both
      // will add no transitions
      // we put it in T just to avoid recomputing this if it's referenced again
      T.set(abState, []);
    } else if (aTransitions.length === 0) {
      const aTo = null;
      abTransitions = bTransitions.map(
        ({ consume, to: bTo }) => ({ from: abState, consume, to: P.stateFromSet(aTo, bTo), aTo, bTo })
      );
    } else if (bTransitions.length === 0) {
      const bTo = null;
      abTransitions = aTransitions.map(
        ({ consume, to: aTo }) => ({ from: abState, consume, to: P.stateFromSet(aTo, bTo), aTo, bTo })
      );
    } else {
      // both a and b have transitions
      const aConsumeToMap =
        aTransitions.reduce(
          (acc, { consume, to }) => (acc.set(consume, to), acc),
          new Map()
        );
      const bConsumeToMap =
        bTransitions.reduce(
          (acc, { consume, to }) => (acc.set(consume, to), acc),
          new Map()
        );

      for (const { from, consume, to: aTo } of aTransitions) {
        const bTo = bConsumeToMap.has(consume) ? bConsumeToMap.get(consume) : null;

        if (bTo != null) {
          bConsumeToMap.delete(consume);
        }

        abTransitions.push({ from: abState, consume, to: P.stateFromSet(aTo, bTo), aTo, bTo });
      }

      for (const [consume, bTo] of bConsumeToMap.entries()) {
        const aTo = null;

        abTransitions.push({ from: abState, consume, to: P.stateFromSet(aTo, bTo), aTo, bTo });
      }
    }

    T.set(abState, abTransitions);

    for (const { to, aTo, bTo } of abTransitions) {
      // more work remaining?
      if (!T.has(to) && !R.has(to)) {
        R.set(to, [aTo, bTo]);
      }
    }

    R.delete(abState);
  }

  const accepting = [];

  const transitions =
    [...T.values()].flatMap(
      tt => tt.map(
        ({ from, consume, to }) => ({ from, consume, to })
      )
    );

  return { start, accepting, transitions };

}

We can test it with our a and b:

const a = {
  "start": 'emptyA',
  "accepting": ['zero'],
  "transitions": [
    { "from": 'emptyA', "consume": '0', "to": 'zero' },
    { "from": 'zero', "consume": '0', "to": 'zero' }
  ]
};

const b = {
  "start": 'emptyB',
  "accepting": ['one'],
  "transitions": [
    { "from": 'emptyB', "consume": '1', "to": 'one' },
    { "from": 'one', "consume": '1', "to": 'one' }
  ]
};

product(a, b)
  //=>
    {
      "start": "G41",
      "transitions": [
        { "from": "G41", "consume": "0", "to": "G42" },
        { "from": "G41", "consume": "1", "to": "G43" },
        { "from": "G42", "consume": "0", "to": "G42" },
        { "from": "G43", "consume": "1", "to": "G43" }
      ],
      "accepting": []
    }

It doesn’t actually accept anything, so it’s not much of a recognizer. Yet.


From Product to Union

We know how to compute the product of two recognizers, and we see how the product actually simulates having two recognizers simultaneously consuming the same symbols. But what we want is to compute the union of the recognizers.

So let’s consider our requirements. When we talk about the union of a and b, we mean a recognizer that recognizes any sentence that a recognizes, or any sentence that b recognizes.

If the two recognizers were running concurrently, we would want to accept a sentence if a ended up in one of its accepting states or if b ended up in one of its accepting states. How does this translate to the product’s states?

Well, each state of the product represents one state from a and one state from b. If there are no more symbols to consume and the product is in a state where the state from a is in a’s set of accepting states, then this is equivalent to a having accepted the sentence. Likewise, if there are no more symbols to consume and the product is in a state where the state from b is in b’s set of accepting states, then this is equivalent to b having accepted the sentence.

In theory, then, for a and b, the following product states represent the union of a and b:

a          b
''         'one'
'emptyA'   'one'
'zero'     ''
'zero'     'emptyB'
'zero'     'one'

Of course, only two of these ('zero' and '', '' and 'one') are reachable, so those are the ones we want our product to accept when we want the union of two recognizers.

Here’s a union function that makes use of product and some of the helpers we’ve already written:

function union2 (a, b) {
  const {
    states: aDeclaredStates,
    accepting: aAccepting
  } = validatedAndProcessed(a);
  const aStates = [null].concat(aDeclaredStates);

  const {
    states: bDeclaredStates,
    accepting: bAccepting
  } = validatedAndProcessed(b);
  const bStates = [null].concat(bDeclaredStates);

  // P is a mapping from a pair of states (or any set, but in union2 it's always a pair)
  // to a new state representing the tuple of those states
  const P = new StateAggregator();

  const productAB = product(a, b, P);
  const { start, transitions } = productAB;

  const statesAAccepts =
    aAccepting.flatMap(
      aAcceptingState => bStates.map(bState => P.stateFromSet(aAcceptingState, bState))
    );
  const statesBAccepts =
    bAccepting.flatMap(
      bAcceptingState => aStates.map(aState => P.stateFromSet(aState, bAcceptingState))
    );

  const allAcceptingStates =
    [...new Set([...statesAAccepts, ...statesBAccepts])];

  const { stateSet: reachableStates } = validatedAndProcessed(productAB);
  const accepting = allAcceptingStates.filter(state => reachableStates.has(state));

  return { start, accepting, transitions };
}

And when we try it:

union2(a, b)
  //=>
    {
      "start": "G41",
      "transitions": [
        { "from": "G41", "consume": "0", "to": "G42" },
        { "from": "G41", "consume": "1", "to": "G43" },
        { "from": "G42", "consume": "0", "to": "G42" },
        { "from": "G43", "consume": "1", "to": "G43" }
      ],
      "accepting": [ "G42", "G43" ]
    }

Now we can incorporate union2 as an operator:

const regexB = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2
    }
  },
  defaultOperator: undefined,
  toValue (string) {
    return literal(string);
  }
};

verifyEvaluate('a', regexB, {
  '': false,
  'a': true,
  'A': false,
  'aa': false,
  'AA': false
});
  //=> All 5 tests passing

verifyEvaluate('A', regexB, {
  '': false,
  'a': false,
  'A': true,
  'aa': false,
  'AA': false
});
  //=> All 5 tests passing

verifyEvaluate('a|A', regexB, {
  '': false,
  'a': true,
  'A': true,
  'aa': false,
  'AA': false
});
  //=> All 5 tests passing

We’re ready to work on catenation now, but before we do, a digression about product.


the marvellous product

Taking the product of two recognizers is a general-purpose way of simulating the effect of running two recognizers in parallel on the same input. In the case of union(a, b), we obtained product(a, b), and then selected as accepting states, all those states where either a or b reached an accepting state.

But what other ways could we determine the accepting state of the result?

If we accept all those states where both a and b reached an accepting state, we have computed the intersection of a and b. intersection is not a part of formal regular expressions or of most regexen, but it can be useful and we will see later how to add it as an operator.

If we accept all those states where a reaches an accepting state but b does not, we have computed the difference between a and b. This can also be used for implementing regex lookahead features, but this time for negative lookaheads.

We could even accept all those states where either a or b reaches an accepting state, but not both. This would compute the exclusive disjunction (the symmetric difference) of the two recognizers.

We’ll return to some of these other uses for product after we satisfy ourselves that we can generate a finite-state recognizer for any formal regular expression we like.
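To make the distinction concrete, here is a minimal sketch that runs two deterministic recognizers side by side on the same input and varies only the rule for accepting. It does not call product, and finalState, rules, combine, and oneBit are hypothetical names introduced for illustration. The descriptions are assumed to be in the same format used elsewhere in this essay.

```javascript
// Follows one transition per symbol; returns the final state, or null if
// the recognizer halts because no transition consumes the symbol.
function finalState ({ start, transitions }, input) {
  let state = start;
  for (const symbol of input) {
    const transition = transitions.find(
      ({ from, consume }) => from === state && consume === symbol
    );
    if (transition === undefined) return null;
    state = transition.to;
  }
  return state;
}

// The same simulation serves every operation; only the accepting rule changes.
const rules = {
  union: (inA, inB) => inA || inB,
  intersection: (inA, inB) => inA && inB,
  difference: (inA, inB) => inA && !inB,
  exclusiveDisjunction: (inA, inB) => inA !== inB
};

function combine (rule, a, b) {
  return input => rule(
    a.accepting.includes(finalState(a, input)),
    b.accepting.includes(finalState(b, input))
  );
}

// recognizes one or more 0s
const zeroes = {
  start: 'empty',
  accepting: ['zeroes'],
  transitions: [
    { from: 'empty', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zeroes' }
  ]
};

// recognizes exactly one symbol, either 0 or 1
const oneBit = {
  start: 'none',
  accepting: ['bit'],
  transitions: [
    { from: 'none', consume: '0', to: 'bit' },
    { from: 'none', consume: '1', to: 'bit' }
  ]
};

combine(rules.intersection, zeroes, oneBit)('0');          //=> true
combine(rules.difference, zeroes, oneBit)('00');           //=> true
combine(rules.exclusiveDisjunction, zeroes, oneBit)('0');  //=> false
```

Simulating at run time like this is the slow way to get the answer. Selecting accepting states from the product bakes the same rule into a single recognizer ahead of time.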


Connecting


Catenating Regular Expressions

And now we turn our attention to catenating descriptions. Let’s begin by informally defining what we mean by “catenating descriptions:”

Given two recognizers, a and b, the catenation of a and b is a recognizer that recognizes a sentence AB, if and only if A is a sentence recognized by a and B is a sentence recognized by b.

Catenation is very common in composing patterns. It’s how we formally define recognizers that recognize things like “the function keyword, followed by optional whitespace, followed by an optional label, followed by optional whitespace, followed by an open parenthesis, followed by…” and so forth.

A hypothetical recognizer for JavaScript function expressions would be composed by catenating recognizers for keywords, optional whitespace, labels, parentheses, and so forth.
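The definition can be checked by brute force before we build anything clever. The sketch below tries every way of splitting a sentence into a prefix and a suffix. The name recognizesCatenation is hypothetical, and the two recognizers are stood in for by plain predicate functions rather than finite-state descriptions.

```javascript
// Hypothetical brute-force check of the definition: the sentence is
// recognized if some split into a prefix A and a suffix B has A recognized
// by the first recognizer and B recognized by the second.
function recognizesCatenation (recognizesA, recognizesB, sentence) {
  for (let i = 0; i <= sentence.length; ++i) {
    if (recognizesA(sentence.slice(0, i)) && recognizesB(sentence.slice(i))) {
      return true;
    }
  }
  return false;
}

// Stand-ins for two recognizers, written as predicates for brevity.
const isReg = s => s.toLowerCase() === 'reg';
const isExclamations = s => s.length > 0 && [...s].every(c => c === '!');

recognizesCatenation(isReg, isExclamations, 'Reg!!');  //=> true
recognizesCatenation(isReg, isExclamations, 'Reg');    //=> false
recognizesCatenation(isReg, isExclamations, '!!');     //=> false
```

This runs both recognizers once per possible split, which is exactly the repeated work that a single catenated recognizer avoids.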


catenating descriptions with epsilon-transitions

Our finite-state automata are very simple: They are deterministic, meaning that in every state, there is at most one transition for each unique symbol. And they always consume a symbol when they transition.

Some finite-state automata relax the second constraint. They allow a transition between states without consuming a symbol. If a transition with a symbol to be consumed is like an “if statement,” a transition without a symbol to consume is like a “goto.”

Such transitions are called “ε-transitions,” or “epsilon transitions” for those who prefer to avoid Greek letters. As we’ll see, ε-transitions do not add any power to finite-state automata, but they do sometimes help make diagrams a little easier to understand and formulate.

Recall our recognizer that recognizes variations on the name “reg.” Here it is as a diagram:

stateDiagram [*]-->start start-->r : r,R r-->re : e,E re-->reg : g,G reg-->[*]

And here is the diagram for a recognizer that recognizes one or more exclamation marks:

stateDiagram [*]-->empty empty-->bang : ! bang-->bang : ! bang-->[*]

The simplest way to catenate recognizers is to put all their states together in one big diagram, and create an ε-transition from each accepting state of the first recognizer to the start state of the second. The start state of the first recognizer becomes the start state of the result, and the accepting states of the second recognizer become the accepting states of the result.

Like this:

stateDiagram [*]-->start start-->r : r,R r-->re : e,E re-->reg : g,G reg-->empty empty-->bang : ! bang-->bang : ! bang-->[*]

Here’s a function to catenate any two recognizers, using ε-transitions:

function epsilonCatenate (a, b) {
  const joinTransitions =
    a.accepting.map(
      from => ({ from, to: b.start })
    );

  return {
    start: a.start,
    accepting: b.accepting,
    transitions:
      a.transitions
        .concat(joinTransitions)
        .concat(b.transitions)
  };
}

epsilonCatenate(reg, exclamations)
  //=>
    {
      "start": "empty",
      "accepting": [ "bang" ],
      "transitions": [
        { "from": "empty", "consume": "r", "to": "r" },
        { "from": "empty", "consume": "R", "to": "r" },
        { "from": "r", "consume": "e", "to": "re" },
        { "from": "r", "consume": "E", "to": "re" },
        { "from": "re", "consume": "g", "to": "reg" },
        { "from": "re", "consume": "G", "to": "reg" },
        { "from": "reg", "to": "empty2" },
        { "from": "empty2", "to": "bang", "consume": "!" },
        { "from": "bang", "to": "bang", "consume": "!" }
      ]
    }

Of course, our engine for finite-state recognizers doesn’t actually implement ε-transitions. We could add that as a feature, but instead, let’s look at an algorithm for removing ε-transitions from finite-state machines.


removing epsilon-transitions

To remove an ε-transition between any two states, we start by taking all the transitions in the destination state, and copy them into the origin state. Next, if the destination state is an accepting state, we make the origin state an accepting state as well.

We then can remove the ε-transition without changing the recognizer’s behaviour. In our catenated recognizer, we have an ε-transition between the reg and empty states:

stateDiagram reg-->empty empty-->bang : ! bang-->bang : ! bang-->[*]

The empty state has one transition, from empty to bang, while consuming !. If we copy that into reg, we get:

stateDiagram reg-->bang : ! empty-->bang : ! bang-->bang : ! bang-->[*]

Since empty is not an accepting state, we do not need to make reg an accepting state, so we are done removing this ε-transition. We repeat this process for all ε-transitions, in any order we like, until there are no more ε-transitions. In this case, there only was one, so the result is:

stateDiagram [*]-->start start-->r : r,R r-->re : e,E re-->reg : g,G reg-->bang : ! empty-->bang : ! bang-->bang : ! bang-->[*]

This is clearly a recognizer that recognizes the name “reg” followed by one or more exclamation marks. Our catenation algorithm has two steps. In the first, we create a recognizer with ε-transitions:

  1. Connect the two recognizers with an ε-transition from each accepting state from the first recognizer to the start state from the second recognizer.
  2. The start state of the first recognizer becomes the start state of the catenated recognizers.
  3. The accepting states of the second recognizer become the accepting states of the catenated recognizers.

This transformation complete, we can then remove the ε-transitions. For each ε-transition between an origin and destination state:

  1. Copy all of the transitions from the destination state into the origin state.
  2. If the destination state is an accepting state, make the origin state an accepting state as well.
  3. Remove the ε-transition.

(Following this process, we sometimes wind up with unreachable states. In our example above, empty becomes unreachable after removing the ε-transition. This has no effect on the behaviour of the recognizer, and in the next section, we’ll see how to prune those unreachable states.)
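The three removal steps can be sketched for a single ε-transition. This is a deliberately simplified illustration: removeOneEpsilon is a hypothetical name, and it assumes there are no chains, loops, or self-referencing ε-transitions. As in the descriptions above, an ε-transition is a transition with no consume property.

```javascript
// Simplified sketch: removes the first ε-transition it finds.
// Assumes the destination state has no ε-transitions of its own.
function removeOneEpsilon ({ start, accepting, transitions }) {
  const index = transitions.findIndex(({ consume }) => consume == null);
  if (index < 0) return { start, accepting, transitions };

  const { from: origin, to: destination } = transitions[index];

  // 1. copy the destination's transitions into the origin
  const copied = transitions
    .filter(({ from, consume }) => from === destination && consume != null)
    .map(({ consume, to }) => ({ from: origin, consume, to }));

  // 2. if the destination is accepting, the origin becomes accepting too
  const newAccepting =
    accepting.includes(destination) && !accepting.includes(origin)
      ? accepting.concat([origin])
      : accepting;

  // 3. remove the ε-transition itself
  const remaining = transitions.filter((_, i) => i !== index);

  return { start, accepting: newAccepting, transitions: remaining.concat(copied) };
}

const joined = {
  start: 'start',
  accepting: ['bang'],
  transitions: [
    { from: 'start', consume: 'r', to: 'r' },
    { from: 'r', consume: 'e', to: 're' },
    { from: 're', consume: 'g', to: 'reg' },
    { from: 'reg', to: 'empty' },
    { from: 'empty', consume: '!', to: 'bang' },
    { from: 'bang', consume: '!', to: 'bang' }
  ]
};

const withoutEpsilon = removeOneEpsilon(joined);
// withoutEpsilon has no ε-transition, and gains a transition
// from 'reg' to 'bang' consuming '!'
```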


implementing catenation

Here’s a function that implements the steps described above: It takes any finite-state recognizer, and removes all of the ε-transitions, returning an equivalent finite-state recognizer without ε-transitions.

There’s code to handle cases we haven’t discussed–like ε-transitions between a state and itself, and loops in epsilon transitions (bad!)–but at its heart, it just implements the simple algorithm we just described.

function removeEpsilonTransitions ({ start, accepting, transitions }) {
  const acceptingSet = new Set(accepting);
  const transitionsWithoutEpsilon =
    transitions
      .filter(({ consume }) => consume != null);
  const stateMapWithoutEpsilon = toStateMap(transitionsWithoutEpsilon);
  const epsilonMap =
    transitions
      .filter(({ consume }) => consume == null)
      .reduce(
          (acc, { from, to }) => {
            const toStates = acc.has(from) ? acc.get(from) : new Set();

            toStates.add(to);
            acc.set(from, toStates);
            return acc;
          },
          new Map()
        );

  const epsilonQueue = [...epsilonMap.entries()];
  const epsilonFromStatesSet = new Set(epsilonMap.keys());

  const outerBoundsOnNumberOfRemovals = transitions.length;
  let loops = 0;

  while (epsilonQueue.length > 0 && loops++ <= outerBoundsOnNumberOfRemovals) {
    let [epsilonFrom, epsilonToSet] = epsilonQueue.shift();
    const allEpsilonToStates = [...epsilonToSet];

    // special case: We can ignore self-epsilon transitions (e.g. a-->a)
    const epsilonToStates = allEpsilonToStates.filter(
      toState => toState !== epsilonFrom
    );

    // we defer resolving destinations that have epsilon transitions
    const deferredEpsilonToStates = epsilonToStates.filter(s => epsilonFromStatesSet.has(s));
    if (deferredEpsilonToStates.length > 0) {
      // defer them
      epsilonQueue.push([epsilonFrom, deferredEpsilonToStates]);
    } else {
      // if nothing to defer, remove this from the set
      epsilonFromStatesSet.delete(epsilonFrom);
    }

    // we can immediately resolve destinations that themselves don't have epsilon transitions
    const immediateEpsilonToStates = epsilonToStates.filter(s => !epsilonFromStatesSet.has(s));
    for (const epsilonTo of immediateEpsilonToStates) {
      const source =
        stateMapWithoutEpsilon.get(epsilonTo) || [];
      const potentialToMove =
        source.map(
          ({ consume, to }) => ({ from: epsilonFrom, consume, to })
        );
      const existingTransitions = stateMapWithoutEpsilon.get(epsilonFrom) || [];

      // filter out duplicates
      const needToMove = potentialToMove.filter(
        ({ consume: pConsume, to: pTo }) =>
          !existingTransitions.some(
            ({ consume: eConsume, to: eTo }) => pConsume === eConsume && pTo === eTo
          )
      );
      // now add the moved transitions
      stateMapWithoutEpsilon.set(epsilonFrom, existingTransitions.concat(needToMove));

      // special case!
      if (acceptingSet.has(epsilonTo)) {
        acceptingSet.add(epsilonFrom);
      }
    }
  }

  if (loops > outerBoundsOnNumberOfRemovals) {
    error("Attempted to remove too many epsilon transitions. Investigate possible loop.");
  } else {
    return {
      start,
      accepting: [...acceptingSet],
      transitions: [
        ...stateMapWithoutEpsilon.values()
      ].flatMap( tt => tt )
    };
  }
}

removeEpsilonTransitions(epsilonCatenate(reg, exclamations))
  //=>
    {
      "start": "empty",
      "accepting": [ "bang" ],
      "transitions": [
        { "from": "empty", "consume": "r", "to": "r" },
        { "from": "empty", "consume": "R", "to": "r" },
        { "from": "r", "consume": "e", "to": "re" },
        { "from": "r", "consume": "E", "to": "re" },
        { "from": "re", "consume": "g", "to": "reg" },
        { "from": "re", "consume": "G", "to": "reg" },
        { "from": "empty2", "to": "bang", "consume": "!" },
        { "from": "bang", "to": "bang", "consume": "!" },
        { "from": "reg", "consume": "!", "to": "bang" }
      ]
    }

We have now implemented catenating two deterministic finite-state recognizers in such a way that we return a finite-state recognizer. The only things left to do are remove unreachable states, and to deal with a catch that we’ll describe below.


unreachable states

Our “epsilon join/remove epsilons” technique has a small drawback: It can create an unreachable state when the starting state of the second recognizer is not the destination of any other transitions.

Consider:

stateDiagram [*]-->start start-->zero : 0 zero --> [*]

And:

stateDiagram [*]-->empty empty-->one : 1 one --> [*]

When we join them and remove the ε-transitions, we end up with an unreachable state, empty:

stateDiagram [*]-->start start-->zero : 0 zero --> one : 1 empty-->one : 1 one --> [*]

We could implement a very specific fix, but the code to do a general elimination of unreachable states is straightforward:

function reachableFromStart ({ start, accepting: allAccepting, transitions: allTransitions }) {
  const stateMap = toStateMap(allTransitions, true);
  const reachableMap = new Map();
  const R = new Set([start]);

  while (R.size > 0) {
    const [state] = [...R];
    R.delete(state);
    const transitions = stateMap.get(state) || [];

    // this state is reachable
    reachableMap.set(state, transitions);

    const reachableFromThisState =
      transitions.map(({ to }) => to);

    const unprocessedReachableFromThisState =
      reachableFromThisState
        .filter(to => !reachableMap.has(to) && !R.has(to));

    for (const reachableState of unprocessedReachableFromThisState) {
      R.add(reachableState);
    }
  }

  const transitions = [...reachableMap.values()].flatMap(tt => tt);

  // prune unreachable states from the accepting set
  const reachableStates = new Set(
    [start].concat(
      transitions.map(({ to }) => to)
    )
  );

  const accepting = allAccepting.filter( state => reachableStates.has(state) );

  return {
    start,
    transitions,
    accepting
  };
}

And we can test it out:

const zero = {
  "start": "empty",
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zero" }
  ],
  "accepting": ["zero"]
};

const one = {
  "start": "empty",
  "transitions": [
    { "from": "empty", "consume": "1", "to": "one" }
  ],
  "accepting": ["one"]
};

reachableFromStart(removeEpsilonTransitions(epsilonCatenate(zero, one)))
  //=>
    {
      "start":"empty",
      "transitions":[
        {"from":"empty","consume":"0","to":"zero"},
        {"from":"zero","consume":"1","to":"one"}
      ],
      "accepting":["one"]
    }

No unreachable states!


the catch with catenation

We hinted above that catenation came with a “catch.” Consider this recognizer that recognizes one or more 0s:

stateDiagram [*]-->empty empty-->zeroes : 0 zeroes--> zeroes : 0 zeroes --> [*]

And consider this recognizer that recognizes a binary number:

stateDiagram [*] --> empty empty --> zero : 0 empty --> notZero : 1 notZero --> notZero : 0, 1 zero --> [*] notZero --> [*]

What happens when we use our functions to catenate them?

const zeroes = {
  "start": "empty",
  "accepting": [ "zeroes" ],
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zeroes" },
    { "from": "zeroes", "consume": "0", "to": "zeroes" }
  ]
};

const binary = {
  "start": "empty",
  "accepting": ["zero", "notZero"],
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zero" },
    { "from": "empty", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ]
}

reachableFromStart(removeEpsilonTransitions(epsilonCatenate(zeroes, binary)))
  //=>
    {
      "start": "empty",
      "accepting": [ "zero", "notZero" ],
      "transitions": [
        { "from": "empty", "consume": "0", "to": "zeroes" },
        { "from": "zeroes", "consume": "0", "to": "zeroes" },
        { "from": "zeroes", "consume": "0", "to": "zero" },
        { "from": "zeroes", "consume": "1", "to": "notZero" },
        { "from": "start", "to": "zero", "consume": "0" },
        { "from": "start", "to": "notZero", "consume": "1" },
        { "from": "notZero", "to": "notZero", "consume": "0" },
        { "from": "notZero", "to": "notZero", "consume": "1" }
      ]
    }

And here’s a diagram of the result:

stateDiagram [*]-->empty empty-->zeroes : 0 zeroes-->zeroes : 0 zeroes-->zero : 0 zeroes-->notZero : 1 notZero-->notZero : 0 notZero-->notZero : 1 zero-->[*] notZero-->[*]

The problem is that there are two transitions from zeroes when consuming a 0. That makes this transition nondeterministic. Deterministic state machines always have exactly one possible transition from any state for each symbol consumed in that state. Nondeterministic finite-state machines can have multiple transitions for the same symbol from any state.
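Whether a description is deterministic can be checked mechanically. Here is a minimal sketch, where isDeterministic is a hypothetical helper and not part of the code developed in this essay. It reports false as soon as one state has two transitions consuming the same symbol but leading to different states.

```javascript
// Hypothetical check: deterministic means no state has two transitions
// that consume the same symbol and lead to different destinations.
function isDeterministic ({ transitions }) {
  const destinations = new Map();

  for (const { from, consume, to } of transitions) {
    const key = JSON.stringify([from, consume]);

    if (destinations.has(key) && destinations.get(key) !== to) return false;
    destinations.set(key, to);
  }

  return true;
}

isDeterministic({
  start: 'empty',
  accepting: ['zeroes'],
  transitions: [
    { from: 'empty', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zeroes' }
  ]
});
  //=> true

isDeterministic({
  start: 'empty',
  accepting: ['zero'],
  transitions: [
    { from: 'empty', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zero' }
  ]
});
  //=> false
```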

We want to catenate two deterministic finite-state recognizers, and wind up with a deterministic finite-state recognizer. Why? From a theoretical perspective, deterministic finite-state recognizers are easier to reason about. They’re always doing exactly one thing.

From a practical perspective, deterministic finite-state recognizers are always guaranteed to execute in O(n) time: They follow exactly one transition for every symbol consumed. Of course, they trade space for time: We saw that with product, and we’re going to see that again with our next important function, powerset.


Converting Nondeterministic to Deterministic Finite-State Recognizers

As noted, our procedure for joining two recognizers with ε-transitions can create nondeterministic finite-state automata (“NFAs”). We wish to convert these NFAs to deterministic finite-state automata (“DFAs”) so that we end up with a catenation algorithm that can take any two DFA recognizers and return a DFA recognizer for the catenation of the recognizers’ languages.

We have already solved a subset of this problem, in a way. Consider the problem of taking the union of two recognizers. We did this with the product of the two recognizers. The way “product” worked was that it modelled two recognizers being in two different states at a time by creating new states that represented the pair of states each recognizer could be in.

We can use this approach with NFAs as well.


taking the product of a recognizer… with itself

Recall that for computing the union of two recognizers, when we wanted to simulate two recognizers acting in parallel on the same input, we imagined that there was a state for every pair of states the two recognizers could be simultaneously in. This approach was called taking the product of the two recognizers.

Now let’s imagine running a nondeterministic state machine in parallel with itself. It would start with just one copy of itself, like this:

stateDiagram [*]-->empty empty-->zeroes : 0 zeroes-->zeroes : 0 zeroes-->zero : 0 zeroes-->notZero : 1 notZero-->notZero : 0 notZero-->notZero : 1 zero-->[*] notZero-->[*]

It could operate as a single machine as long as every transition it took were deterministic. For example, it could consume the empty string and halt; that would be deterministic. The same goes for the string 0 and for all strings beginning with 01....

But what would happen when it consumed the string 00? The first 0 would take it from state 'empty' to 'zeroes', but the second 0 is nondeterministic: It should transition to both 'zero' and back to 'zeroes'.

If we had a second parallel state machine, we could have one transition to 'zero' while the other transitions back to 'zeroes'. From our implementation of product, we know how to handle this: we need a new state representing the two machines simultaneously being in states 'zero' and 'zeroes', the tuple ('zero', 'zeroes').

Using similar logic as we used with product, we can work out that from our new tuple state, we need to draw in all the transitions from either of its states. In this case, that’s ridiculously easy, since 'zero' doesn’t have any outbound transitions, so ('zero', 'zeroes') would have the same transitions as 'zeroes'.
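The idea of running as many copies as we need can be sketched directly, by tracking the set of states the machine could be in after each symbol. This is an illustration only: nfaRecognizes and zeroesThenBinary are hypothetical names, and the description is the catenated recognizer from the diagram above with its unreachable state left out.

```javascript
// Hypothetical sketch: simulate a nondeterministic recognizer by tracking
// the set of states it could be in after consuming each symbol.
function nfaRecognizes ({ start, accepting, transitions }, input) {
  let current = new Set([start]);

  for (const symbol of input) {
    const next = new Set();

    for (const { from, consume, to } of transitions) {
      if (current.has(from) && consume === symbol) next.add(to);
    }

    current = next; // an empty set means every copy has halted
  }

  return accepting.some(state => current.has(state));
}

const zeroesThenBinary = {
  start: 'empty',
  accepting: ['zero', 'notZero'],
  transitions: [
    { from: 'empty', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zeroes' },
    { from: 'zeroes', consume: '0', to: 'zero' },
    { from: 'zeroes', consume: '1', to: 'notZero' },
    { from: 'notZero', consume: '0', to: 'notZero' },
    { from: 'notZero', consume: '1', to: 'notZero' }
  ]
};

nfaRecognizes(zeroesThenBinary, '0');     //=> false
nfaRecognizes(zeroesThenBinary, '00');    //=> true
nfaRecognizes(zeroesThenBinary, '0010');  //=> true
```

After consuming 00, the set of current states is 'zeroes' and 'zero' together, which is the tuple state described above. Each distinct set this simulation can reach corresponds to one state of an equivalent deterministic recognizer.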

Now this is a very simple example. What is the worst case for using an algorithm like this?

Well, given a state machine with n states, there could be a state for every possible subset of states. Consider this pathological example with three states:

stateDiagram [*]-->one one-->two : 1 one-->three : 2 two-->one : 3 two-->two : 3 two-->one : 4 two-->three : 4 two-->two : 5 two-->three : 5 three-->one : 6 three-->two : 6 three-->three : 6 three-->[*]

If we work our way through it by hand, we see that we need seven states to represent all the possible subsets of states this recognizer can reach: ('one'), ('two'), ('three'), ('one', 'two'), ('one', 'three'), ('two', 'three'), ('one', 'two', 'three').

The set of all possible subsets of a set is called the powerset of a set. The powerset of a set includes the empty set and the set itself. Our diagram and list do not include the empty set, because that represents the machine halting, so it is an implied state of the machine.
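As a quick check on the counting, here is a minimal sketch that lists every subset of a set of state names. The name subsetsOf is hypothetical, and it assumes the elements are distinct.

```javascript
// Hypothetical helper: every subset of an array of distinct elements.
// Each element doubles the count, since every existing subset appears
// once without the element and once with it.
function subsetsOf (elements) {
  return elements.reduce(
    (subsets, element) => subsets.concat(subsets.map(s => s.concat([element]))),
    [[]]
  );
}

const allSubsets = subsetsOf(['one', 'two', 'three']);
const nonEmptySubsets = allSubsets.filter(s => s.length > 0);

allSubsets.length       //=> 8, including the empty set
nonEmptySubsets.length  //=> 7, matching the seven states listed above
```

In general a recognizer with n states has 2 to the power of n subsets of states, which is why the worst case grows so quickly.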

We can also work out all the transitions just as we did with product. It ends up as this plate of spaghetti:

stateDiagram [*]-->one one-->two : 1 one-->three : 2 two-->onetwo : 3 two-->onethree : 4 two-->twothree : 5 three-->onetwothree : 6 onetwo-->two : 1 onetwo-->three : 2 onetwo-->onetwo : 3 onetwo-->onethree : 4 onetwo-->twothree : 5 onethree-->two : 1 onethree-->three : 2 onethree-->onetwothree : 6 twothree-->onetwo : 3 twothree-->onethree : 4 twothree-->twothree : 5 twothree-->onetwothree : 6 three-->[*] onethree-->[*] twothree-->[*] onetwothree-->[*]

But while we may call it the “worst case” as far as the number of states is concerned, it is now a deterministic state machine that has the exact same semantics as its nondeterministic predecessor.

Although it appears to be much more complicated than the NFA at a glance, the truth is that it is merely making the inherent complexity of the behaviour apparent. It’s actually easier to follow along by hand, since we don’t have to keep as many as three simultaneous states in our heads at any one time.


computing the powerset of a nondeterministic finite-state recognizer

Using this approach, our algorithm for computing the powerset of a nondeterministic finite-state recognizer will use a queue of states.

We begin by placing the start state in the queue, and then:

  1. If the queue is empty, we’re done.
  2. Remove the state from the front of the queue, call it “this state.”
  3. If this state is already in the powerset recognizer, discard it and go back to step 1.
  4. If this is the name of a single state in the nondeterministic finite-state recognizer:
    1. Collect the transitions from this state.
    2. If the state is an accepting state in the nondeterministic finite-state recognizer, add this state to the powerset recognizer’s accepting states.
  5. If this is the name of several states in the nondeterministic finite-state recognizer:
    1. Collect the transitions from each of these states.
    2. If any of the states is an accepting state in the nondeterministic finite-state recognizer, add this state to the powerset recognizer’s accepting states.
  6. For each deterministic transition from this state (i.e. there is only one transition for a particular symbol from this state):
    1. Add the transition to the powerset recognizer.
    2. Add the destination set to the queue.
  7. For each nondeterministic transition from this state (i.e. there is more than one transition for a particular symbol from this state):
    1. Collect the set of destination states for this symbol from this state.
    2. Create a name for the set of destination states.
    3. Create a transition from this state to the name for the set of destination states.
    4. Add the transition to the powerset recognizer.
    5. Add the name for the set of destination states to the queue.

We can encode this as a function, powerset:

function powerset (description, P = new StateAggregator()) {
  const {
    start: nfaStart,
    acceptingSet: nfaAcceptingSet,
    stateMap: nfaStateMap
  } = validatedAndProcessed(description, true);

  // the final set of accepting states
  const dfaAcceptingSet = new Set();

  // R is the work "remaining" to be analyzed
  // organized as a set of states to process
  const R = new Set([ nfaStart ]);

  // T is a collection of states already analyzed
  // it is a map from the state name to the transitions
  // from that state
  const T = new Map();

  while (R.size > 0) {
    const [stateName] = [...R];
    R.delete(stateName);

    // all powerset states represent sets of state,
    // with the degenerate case being a state that only represents
    // itself. stateSet is the full set represented
    // by stateName
    const stateSet = P.setFromState(stateName);

    // get the aggregate transitions across all states
    // in the set
    const aggregateTransitions =
      [...stateSet].flatMap(s => nfaStateMap.get(s) || []);

    // a map from a symbol consumed to the set of
    // destination states
    const symbolToStates =
      aggregateTransitions
        .reduce(
          (acc, { consume, to }) => {
            const toStates = acc.has(consume) ? acc.get(consume) : new Set();

            toStates.add(to);
            acc.set(consume, toStates);
            return acc;
          },
          new Map()
        );

    const dfaTransitions = [];

    for (const [consume, toStates] of symbolToStates.entries()) {
      const toStatesName = P.stateFromSet(...toStates);

      dfaTransitions.push({ from: stateName, consume, to: toStatesName });

      const hasBeenDone = T.has(toStatesName);
      const isInRemainingQueue = R.has(toStatesName);

      if (!hasBeenDone && !isInRemainingQueue) {
        R.add(toStatesName);
      }
    }

    T.set(stateName, dfaTransitions);

    const anyStateIsAccepting =
      [...stateSet].some(s => nfaAcceptingSet.has(s));

    if (anyStateIsAccepting) {
      dfaAcceptingSet.add(stateName);
    }

  }

  return {
    start: nfaStart,
    accepting: [...dfaAcceptingSet],
    transitions:
      [...T.values()]
        .flatMap(tt => tt)
  };
}

Let’s try it:

const zeroes = {
  "start": 'empty',
  "accepting": ['zeroes'],
  "transitions": [
    { "from": 'empty', "consume": '0', "to": 'zeroes' },
    { "from": 'zeroes', "consume": '0', "to": 'zeroes' }
  ]
};

const binary = {
  "start": "empty",
  "accepting": ["zero", "notZero"],
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zero" },
    { "from": "empty", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ]
}

const nondeterministic =
  reachableFromStart(removeEpsilonTransitions(epsilonCatenate(zeroes, binary)));

nondeterministic
  //=>
    {
      "start": "empty",
      "accepting": [ "zero", "notZero" ],
      "transitions": [
        { "from": "empty", "consume": "0", "to": "zeroes" },
        { "from": "zeroes", "consume": "0", "to": "zeroes" },
        { "from": "zeroes", "consume": "0", "to": "zero" },
        { "from": "zeroes", "consume": "1", "to": "notZero" },
        { "from": "notZero", "to": "notZero", "consume": "0" },
        { "from": "notZero", "to": "notZero", "consume": "1" }
      ]
    }

const deterministic = powerset(nondeterministic);

deterministic
  //=>
    {
      "start": "empty",
      "transitions": [
        { "from": "empty", "consume": "0", "to": "zeroes" },
        { "from": "zeroes", "consume": "0", "to": "G36" },
        { "from": "zeroes", "consume": "1", "to": "notZero" },
        { "from": "G36", "consume": "0", "to": "G36" },
        { "from": "G36", "consume": "1", "to": "notZero" },
        { "from": "notZero", "consume": "0", "to": "notZero" },
        { "from": "notZero", "consume": "1", "to": "notZero" }
      ],
      "accepting": [ "G36", "notZero" ]
    }

The powerset function converts any nondeterministic finite-state recognizer into a deterministic finite-state recognizer.
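
One way to convince ourselves is to check the two descriptions above mechanically. This sketch assumes ε-transitions have already been removed, and isDeterministic is a hypothetical helper of our own rather than something defined elsewhere in this essay:

```javascript
// isDeterministic is a hypothetical helper for illustration.
// A description is deterministic when no state has two transitions
// that consume the same symbol.
function isDeterministic ({ transitions }) {
  const seen = new Set();

  for (const { from, consume } of transitions) {
    const key = JSON.stringify([from, consume]);

    if (seen.has(key)) return false;
    seen.add(key);
  }

  return true;
}

// the two descriptions printed above, copied verbatim
const beforePowerset = {
  "start": "empty",
  "accepting": [ "zero", "notZero" ],
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zeroes" },
    { "from": "zeroes", "consume": "0", "to": "zeroes" },
    { "from": "zeroes", "consume": "0", "to": "zero" },
    { "from": "zeroes", "consume": "1", "to": "notZero" },
    { "from": "notZero", "to": "notZero", "consume": "0" },
    { "from": "notZero", "to": "notZero", "consume": "1" }
  ]
};

const afterPowerset = {
  "start": "empty",
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zeroes" },
    { "from": "zeroes", "consume": "0", "to": "G36" },
    { "from": "zeroes", "consume": "1", "to": "notZero" },
    { "from": "G36", "consume": "0", "to": "G36" },
    { "from": "G36", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": [ "G36", "notZero" ]
};

isDeterministic(beforePowerset)
  //=> false

isDeterministic(afterPowerset)
  //=> true
```

The first description fails the check because zeroes has two transitions that consume 0. In the second, those two destinations have been folded into the single state G36.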


catenation without the catch

Computing the catenation of any two deterministic finite-state recognizers is thus:

function catenation2 (a, b) {
  return powerset(
    reachableFromStart(
      removeEpsilonTransitions(
        epsilonCatenate(a, b)
      )
    )
  );
}

verifyRecognizer(catenation2(zeroes, binary), {
  '': false,
  '0': false,
  '1': false,
  '00': true,
  '01': true,
  '10': false,
  '11': false,
  '000': true,
  '001': true,
  '010': true,
  '011': true,
  '100': false,
  '101': false,
  '110': false,
  '111': false
});
  //=> All 15 tests passing

Given catenation2, we are now ready to enhance our evaluator:

const regexC = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: catenation2
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

Note that:

  1. We are using an uncommon operator, →, for catenation, to reduce the need for back ticks, and;
  2. We have set it up as a default operator so that we need not include it in formal regular expressions.

Let’s give it a try:

verifyEvaluate('r→e→g', regexC, {
  '': false,
  'r': false,
  're': false,
  'reg': true,
  'reggie': false
});
  //=> All 5 tests passing

verifyEvaluate('reg', regexC, {
  '': false,
  'r': false,
  're': false,
  'reg': true,
  'reggie': false
});
  //=> All 5 tests passing

verifyEvaluate('reg|reggie', regexC, {
  '': false,
  'r': false,
  're': false,
  'reg': true,
  'reggie': true
});
  //=> All 5 tests passing

Great!

We have one more operator to add, *, but before we do, let’s consider what happens when we combine catenation with union.


the fan-out problem

Consider taking the union of a and A:

evaluate('a|A', regexC)
  //=>
    {
      "start": "G83",
      "transitions": [
        { "from": "G83", "consume": "a", "to": "G80" },
        { "from": "G83", "consume": "A", "to": "G82" }
      ],
      "accepting": [ "G80", "G82" ]
    }

The way we’ve written union2, we end up with two equivalent accepting states for a|A, G80 and G82 in this example. This would be a minor distraction, but consider:

evaluate('(a|A)(b|B)(c|C)', regexC)
  //=>
    {
      "start": "G91",
      "transitions": [
        { "from": "G91", "consume": "a", "to": "G88" },
        { "from": "G91", "consume": "A", "to": "G90" },
        { "from": "G88", "consume": "b", "to": "G96" },
        { "from": "G88", "consume": "B", "to": "G98" },
        { "from": "G90", "consume": "b", "to": "G96" },
        { "from": "G90", "consume": "B", "to": "G98" },
        { "from": "G96", "consume": "c", "to": "G104" },
        { "from": "G96", "consume": "C", "to": "G106" },
        { "from": "G98", "consume": "c", "to": "G104" },
        { "from": "G98", "consume": "C", "to": "G106" }
      ],
      "accepting": [ "G104", "G106" ]
    }

When we draw this finite-state recognizer as a diagram, it looks like this:

stateDiagram [*]-->G91 G91-->G88 : a G91-->G90 : A G88-->G96 : b G88-->G98 : B G90-->G96 : b G90-->G98 : B G96-->G104 : c G96-->G106 : C G98-->G104 : c G98-->G106 : C G104-->[*] G106-->[*]

Look at all the duplication! Nearly half of the diagram is a nearly exact copy of the other half. States G88 and G90 are equivalent: They have the exact same set of outgoing transitions. The same is true of G96 and G98, and of G104 and G106.

Ideally, we would merge the equivalent states, and then discard the unnecessary states. This would reduce the number of states from seven to four:

stateDiagram [*]-->G91 G91-->G88 : a, A G88-->G90 : b, B G90-->G92 : c, C G92-->[*]

Here’s a function that repeatedly merges equivalent states until there are no more mergeable states:

const keyS =
  (transitions, accepting) => {
    const stringifiedTransitions =
      transitions
        .map(({ consume, to }) => `${consume}-->${to}`)
        .sort()
        .join(', ');
    const acceptingSuffix = accepting ? '-->*' : '';

    return `[${stringifiedTransitions}]${acceptingSuffix}`;
  };

function mergeEquivalentStates (description) {
  searchForDuplicate: while (true) {
    let {
      start,
      transitions: allTransitions,
      accepting,
      states,
      stateMap,
      acceptingSet
    } = validatedAndProcessed(description);

    const statesByKey = new Map();

    for (const state of states) {
      const stateTransitions = stateMap.get(state) || [];
      const isAccepting = acceptingSet.has(state);
      const key = keyS(stateTransitions, isAccepting);

      if (statesByKey.has(key)) {
        // found a dup!
        const originalState = statesByKey.get(key);

        console.log({ state, originalState, isAccepting });

        if (start === state) {
          // point start to original
          start = originalState;
        }

        // remove duplicate's transitions
        allTransitions = allTransitions.filter(
          ({ from }) => from !== state
        );

        // rewire all former incoming transitions
        allTransitions = allTransitions.map(
          ({ from, consume, to }) => ({
            from, consume, to: (to === state ? originalState : to)
          })
        );

        if (isAccepting) {
          // remove state from accepting
          accepting = accepting.filter(s => s !== state)
        }

        // reset description
        description = { start, transitions: allTransitions, accepting };

        // and then start all over again
        continue searchForDuplicate;
      } else {
        statesByKey.set(key, state);
      }
    }
    // no duplicates found
    break;
  }

  return description;
}
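
To see why the fan-out example collapses so readily, here is a self-contained sketch that groups its states by the same kind of key keyS computes. The function groupBySignature is hypothetical and only reports which states share a key; unlike mergeEquivalentStates, it makes a single pass and does not rewrite the description:

```javascript
// the description of (a|A)(b|B)(c|C) printed earlier, copied verbatim
const fanOut = {
  "start": "G91",
  "transitions": [
    { "from": "G91", "consume": "a", "to": "G88" },
    { "from": "G91", "consume": "A", "to": "G90" },
    { "from": "G88", "consume": "b", "to": "G96" },
    { "from": "G88", "consume": "B", "to": "G98" },
    { "from": "G90", "consume": "b", "to": "G96" },
    { "from": "G90", "consume": "B", "to": "G98" },
    { "from": "G96", "consume": "c", "to": "G104" },
    { "from": "G96", "consume": "C", "to": "G106" },
    { "from": "G98", "consume": "c", "to": "G104" },
    { "from": "G98", "consume": "C", "to": "G106" }
  ],
  "accepting": [ "G104", "G106" ]
};

// groupBySignature is a hypothetical helper for illustration.
// It returns the groups of states whose outgoing transitions and
// accepting status are identical.
function groupBySignature ({ start, transitions, accepting }) {
  const states = new Set([start, ...accepting]);

  for (const { from, to } of transitions) {
    states.add(from);
    states.add(to);
  }

  const groups = new Map();

  for (const state of states) {
    const outgoing = transitions
      .filter(({ from }) => from === state)
      .map(({ consume, to }) => `${consume}-->${to}`)
      .sort()
      .join(', ');
    const signature =
      `[${outgoing}]${accepting.includes(state) ? '-->*' : ''}`;

    groups.set(signature, (groups.get(signature) || []).concat([state]));
  }

  return [...groups.values()].filter(group => group.length > 1);
}

const duplicates = groupBySignature(fanOut);

duplicates
  //=> [ ["G104", "G106"], ["G88", "G90"], ["G96", "G98"] ]
```

Three pairs of states share a key, so merging each pair takes us from seven states to four.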

Armed with this, we can enhance our union2 function:

function union2merged (a, b) {
  return mergeEquivalentStates(
    union2(a, b)
  );
}

const regexD = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2merged
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: catenation2
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

Now let’s compare the old:

function verifyStateCount (definition, examples) {
  function countStates (regex) {
    const fsr = evaluate(regex, definition);

    const states = toStateSet(fsr.transitions);
    states.add(fsr.start);

    return states.size;
  }

  return verify(countStates, examples);
}

const caseInsensitiveABC = "(a|A)(b|B)(c|C)"
const abcde = "(a|b|c|d|e)";
const lowercase =
  "(a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z)";

const fiveABCDEs =
  `${abcde}${abcde}${abcde}${abcde}${abcde}`;
const twoLowercaseLetters =
  `${lowercase}${lowercase}`;

verifyEvaluate(caseInsensitiveABC, regexC, {
  '': false,
  'a': false,
  'z': false,
  'ab': false,
  'kl': false,
  'abc': true,
  'AbC': true,
  'edc': false,
  'abcde': false,
  'abCde': false,
  'dcabe': false,
  'abcdef': false
});
  //=> All 12 tests passing

verifyEvaluate(fiveABCDEs, regexC, {
  '': false,
  'a': false,
  'z': false,
  'ab': false,
  'kl': false,
  'abc': false,
  'AbC': false,
  'edc': false,
  'abcde': true,
  'dcabe': true,
  'abcdef': false,
  'abCde': false
});
  //=> All 12 tests passing

verifyEvaluate(twoLowercaseLetters, regexC, {
  '': false,
  'a': false,
  'z': false,
  'ab': true,
  'kl': true,
  'abc': false,
  'AbC': false,
  'edc': false,
  'abcde': false,
  'dcabe': false,
  'abcdef': false,
  'abCde': false
});
  //=> All 12 tests passing

verifyStateCount(regexC, {
  [caseInsensitiveABC]: 7,
  [fiveABCDEs]: 26,
  [twoLowercaseLetters]: 53
});
  //=> All 3 tests passing

To the new:

verifyEvaluate(caseInsensitiveABC, regexD, {
  '': false,
  'a': false,
  'z': false,
  'ab': false,
  'kl': false,
  'abc': true,
  'AbC': true,
  'edc': false,
  'abcde': false,
  'abCde': false,
  'dcabe': false,
  'abcdef': false
});
  //=> All 12 tests passing

verifyEvaluate(fiveABCDEs, regexD, {
  '': false,
  'a': false,
  'z': false,
  'ab': false,
  'kl': false,
  'abc': false,
  'AbC': false,
  'edc': false,
  'abcde': true,
  'dcabe': true,
  'abcdef': false,
  'abCde': false
});
  //=> All 12 tests passing

verifyEvaluate(twoLowercaseLetters, regexD, {
  '': false,
  'a': false,
  'z': false,
  'ab': true,
  'kl': true,
  'abc': false,
  'AbC': false,
  'edc': false,
  'abcde': false,
  'dcabe': false,
  'abcdef': false,
  'abCde': false
});
  //=> All 12 tests passing

verifyStateCount(regexD, {
  [caseInsensitiveABC]: 4,
  [fiveABCDEs]: 6,
  [twoLowercaseLetters]: 3
});
  //=> All 3 tests passing

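The old union2 function created unnecessary states, and as a result, the number of states created when we catenate unions grows polynomially: in the examples above, it tracks the number of alternatives multiplied by the number of unions catenated (7, 26, and 53 states). Our new union2merged merges equivalent states, and as a result, the number of states grows linearly with the number of unions catenated (4, 6, and 3 states for the same expressions).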
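
The state counts in the two verifyStateCount runs fit a simple pattern. The two formulas below are our own reading of those six numbers, fitted to the examples rather than proven in general, and the helper names are hypothetical:

```javascript
// Hypothetical helpers: formulas fitted to the three examples above,
// not derived for arbitrary regular expressions.
// alternatives: how many symbols are joined by | in each union
// unions: how many such unions are catenated together
const statesWithUnion2 = (alternatives, unions) => alternatives * unions + 1;
const statesWithUnion2merged = (alternatives, unions) => unions + 1;

const examples = [
  { name: 'caseInsensitiveABC', alternatives: 2, unions: 3 },
  { name: 'fiveABCDEs', alternatives: 5, unions: 5 },
  { name: 'twoLowercaseLetters', alternatives: 26, unions: 2 }
];

const predicted = examples.map(({ name, alternatives, unions }) => ({
  name,
  old: statesWithUnion2(alternatives, unions),
  merged: statesWithUnion2merged(alternatives, unions)
}));

predicted
  //=>
    [
      { "name": "caseInsensitiveABC", "old": 7, "merged": 4 },
      { "name": "fiveABCDEs", "old": 26, "merged": 6 },
      { "name": "twoLowercaseLetters", "old": 53, "merged": 3 }
    ]
```

With merging, the width of each union no longer matters: a union of twenty-six letters costs the same single state as a union of two.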


summarizing catenation (and an improved union)

In sum, we have created catenation2, a function that can catenate any two finite-state recognizers, and return a new finite-state recognizer. If ⓐ is a finite-state recognizer that recognizes sentences in the language A, and ⓑ is a finite-state recognizer that recognizes sentences in the language B, then catenation2(ⓐ, ⓑ) is a finite-state recognizer that recognizes sentences in the language AB, where a sentence ab is in the language AB, if and only if a is a sentence in the language A, and b is a sentence in the language B.

We also created an optimized union2merged that merges equivalent states, preventing the fan-out problem when catenating unions.

Before we move on to implement the kleene*, let’s also recapitulate two major results that we demonstrated, namely for every finite-state recognizer with epsilon-transitions, there exists a finite-state recognizer without epsilon-transitions, and, for every finite-state recognizer, there exists an equivalent deterministic finite-state recognizer.


Margie


Quantifying Regular Expressions

Formal regular expressions are made with three constants and three operators. We’ve implemented the three constants:

  • The symbol ∅ describes the language with no sentences, also called “the empty set.”
  • The symbol ε describes the language containing only the empty string.
  • Literals such as x, y, or z describe languages containing single sentences, containing single symbols. e.g. The literal r describes the language R, which contains just one sentence: 'r'.

And we’ve implemented two of the three operators:

  • The expression x|y describes the union of the languages X and Y, meaning, the sentence w belongs to x|y if and only if w belongs to the language X or w belongs to the language Y. We can also say that x|y represents the alternation of x and y.
  • The expression xy describes the language XY, where a sentence ab belongs to the language XY if and only if a belongs to the language X and b belongs to the language Y. We can also say that xy represents the catenation of the expressions x and y.

This leaves one remaining operator to implement, *, the kleene star (see note 8):

  • The expression x* describes the language Z, where the sentence ε (the empty string) belongs to Z, and, the sentence pq belongs to Z if and only if p is a sentence belonging to X, and q is a sentence belonging to Z. We can also say that x* represents a quantification of x, as it allows a regular expression to represent a language containing sentences that match some number of sentences represented by x catenated together.
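
The recursive definition can be transcribed almost word for word. This is a sketch of the definition only, not of the recognizer we are about to build: inStar is a hypothetical name, and isInX stands for any function that decides membership in X.

```javascript
// inStar is a hypothetical helper that follows the definition literally:
// ε belongs to Z, and pq belongs to Z if p belongs to X and q belongs to Z.
function inStar (isInX, sentence) {
  // the empty string always belongs to Z
  if (sentence === '') return true;

  // try every way of splitting the sentence into a non-empty p and a rest q
  for (let i = 1; i <= sentence.length; ++i) {
    const p = sentence.slice(0, i);
    const q = sentence.slice(i);

    if (isInX(p) && inStar(isInX, q)) return true;
  }

  return false;
}

// X is the language described by a|A
const isInAa = sentence => sentence === 'a' || sentence === 'A';

inStar(isInAa, '')
  //=> true

inStar(isInAa, 'aAa')
  //=> true

inStar(isInAa, 'ab')
  //=> false
```

We only try non-empty prefixes for p. An empty p would leave q equal to the whole sentence and the function would recurse forever. Trying every split is also slow in the worst case, which is one reason to compile x* to a finite-state recognizer instead.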

And when we’ve implemented * in our evaluator, we will have a function that takes any formal regular expression and “compiles” it to a finite-state recognizer.

implementing the kleene star

We’ll build a JavaScript operator for the kleene* step-by-step, starting with handling the “one or more” case.

Our strategy will be to take a recognizer, and then add epsilon transitions between its accepting states and its start state. In effect, we will create “loops” back to the start state from all accepting states.

For example, if we have:

stateDiagram [*]-->empty empty-->Aa : A, a Aa-->[*]

We will turn it into this to handle one or more as and As:

stateDiagram [*]-->empty empty-->Aa : A, a Aa-->[*] Aa-->empty

Once again we remove epsilon transitions, unreachable states, and possible nondeterminism:

stateDiagram [*]-->empty empty-->Aa : A, a Aa-->Aa : A, a Aa-->[*]

Presto, a recognizer that handles one or more instances of an upper- or lower-case a! Here’s an implementation in JavaScript:

function oneOrMore (description) {
  const {
    start,
    transitions,
    accepting
  } = description;

  const withEpsilonTransitions = {
    start,
    transitions:
      transitions.concat(
        accepting.map(
          acceptingState => ({ from: acceptingState, to: start })
        )
      ),
    accepting
  };

  const oneOrMore = reachableFromStart(
    mergeEquivalentStates(
      powerset(
        removeEpsilonTransitions(
          withEpsilonTransitions
        )
      )
    )
  );

  return oneOrMore;
}

const Aa = {
  "start": "empty",
  "transitions": [
    { "from": "empty", "consume": "A", "to": "Aa" },
    { "from": "empty", "consume": "a", "to": "Aa" }
  ],
  "accepting": ["Aa"]
};

verifyRecognizer(Aa, {
  '': false,
  'a': true,
  'A': true,
  'aa': false,
  'Aa': false,
  'AA': false,
  'aaaAaAaAaaaAaa': false,
  ' a': false,
  'a ': false,
  'eh?': false
});
  //=> All 10 tests passing

oneOrMore(Aa)
  //=>
    {
      "start": "empty",
      "transitions": [
        { "from": "empty", "consume": "A", "to": "Aa" },
        { "from": "empty", "consume": "a", "to": "Aa" },
        { "from": "Aa", "consume": "A", "to": "Aa" },
        { "from": "Aa", "consume": "a", "to": "Aa" }
      ],
      "accepting": ["Aa"]
    }

verifyRecognizer(oneOrMore(Aa), {
  '': false,
  'a': true,
  'A': true,
  'aa': true,
  'Aa': true,
  'AA': true,
  'aaaAaAaAaaaAaa': true,
  ' a': false,
  'a ': false,
  'eh?': false
});
  //=> All 10 tests passing

Handling one-or-more is nice, and maps directly to a programming regex operator, +. But the kleene star handles zero or more. How do we implement that?

Well, we can directly manipulate a recognizer’s definition, but let’s use what we already have. Given some recognizer x, what is the union of x and ε (the empty string)?

verifyEvaluate('((a|A)|ε)', regexD, {
  '': true,
  'a': true,
  'A': true,
  'aa': false,
  'Aa': false,
  'AA': false,
  'aaaAaAaAaaaAaa': false,
  ' a': false,
  'a ': false,
  'eh?': false
});
  //=> All 10 tests passing

function zeroOrOne (description) {
  return union2merged(description, emptyString());
}

verifyRecognizer(zeroOrOne(Aa), {
  '': true,
  'a': true,
  'A': true,
  'aa': false,
  'Aa': false,
  'AA': false,
  'aaaAaAaAaaaAaa': false,
  ' a': false,
  'a ': false,
  'eh?': false
});
  //=> All 10 tests passing

Matching the empty string or whatever a recognizer matches, is matching zero or one sentences the recognizer recognizes. Now that we have both oneOrMore and zeroOrOne, zeroOrMore is obvious:

function zeroOrMore (description) {
  return zeroOrOne(oneOrMore(description));
}

const formalRegularExpressions = {
  operators: {
    '∅': {
      symbol: Symbol('∅'),
      type: 'atomic',
      fn: emptySet
    },
    'ε': {
      symbol: Symbol('ε'),
      type: 'atomic',
      fn: emptyString
    },
    '|': {
      symbol: Symbol('|'),
      type: 'infix',
      precedence: 10,
      fn: union2merged
    },
    '→': {
      symbol: Symbol('→'),
      type: 'infix',
      precedence: 20,
      fn: catenation2
    },
    '*': {
      symbol: Symbol('*'),
      type: 'postfix',
      precedence: 30,
      fn: zeroOrMore
    }
  },
  defaultOperator: '→',
  toValue (string) {
    return literal(string);
  }
};

verifyRecognizer(zeroOrMore(Aa), {
  '': true,
  'a': true,
  'A': true,
  'aa': true,
  'Aa': true,
  'AA': true,
  'aaaAaAaAaaaAaa': true,
  ' a': false,
  'a ': false,
  'eh?': false
});

verifyEvaluate('(a|A)*', formalRegularExpressions, {
  '': true,
  'a': true,
  'A': true,
  'aa': true,
  'Aa': true,
  'AA': true,
  'aaaAaAaAaaaAaa': true,
  ' a': false,
  'a ': false,
  'eh?': false
});

verifyEvaluate('ab*c', formalRegularExpressions, {
  '': false,
  'a': false,
  'ac': true,
  'abc': true,
  'abbbc': true,
  'abbbbb': false
});

We have now defined every constant and combinator in formal regular expressions. So if we want, we can create regular expressions for all sorts of languages, such as (R|r)eg(ε|gie(ε|ee*!)):

verifyEvaluate('(R|r)eg(ε|gie(ε|ee*!))', formalRegularExpressions, {
  '': false,
  'r': false,
  'reg': true,
  'Reg': true,
  'Regg': false,
  'Reggie': true,
  'Reggieeeeeee!': true
});

And best of all, we know that whatever formal regular expression we devise, we can produce a finite-state recognizer that accepts sentences in the language the formal regular expression describes, simply by feeding it to our evaluate function along with the formalRegularExpressions definition dictionary.


Caboose


What We Have Learned So Far

In Part II, we will go beyond formal regular expressions, exploring other features of regexen that still compile to finite-state recognizers. We’ll also look at features that are not commonly found in regexen, but are nevertheless highly useful. And we’ll use two different constructive demonstrations that pattern-matching expressions built with those features are still equivalent in power to formal regular expressions.

In this essay, we demonstrated that for every formal regular expression, there is an equivalent finite-state recognizer (more on that in our summary below). In Part II, we’ll demonstrate the converse: That for every finite-state recognizer, there is an equivalent formal regular expression.

But before we move on, let’s recapitulate what we’ve established so far.


for every finite-state recognizer with epsilon-transitions, there exists a finite-state recognizer without epsilon-transitions

This function:

function formalRegularExpressionToFiniteStateRecognizer (description) {
  return evaluate(description, formalRegularExpressions);
}

Demonstrates that whatever formal regular expression we devise, we can produce a finite-state recognizer that accepts sentences in the language the formal regular expression describes. It’s not a formal proof: If we wanted a rigorous proof, we’d have to prove all sorts of things about JavaScript programs and the computers they run on. But it is enough to satisfy our intuition.

When building catenation, we added ε-transitions to join two finite-state recognizers, and then used removeEpsilonTransitions to derive an equivalent finite-state recognizer without ε-transitions. removeEpsilonTransitions demonstrates that for every finite-state recognizer with epsilon-transitions, there exists a finite-state recognizer without epsilon-transitions.

Or to put it another way, the set of languages recognized by finite-state recognizers without ε-transitions is equal to the set of languages recognized by finite-state recognizers that do include ε-transitions.

We also established something about non-deterministic finite-state recognizers (and non-deterministic finite-state automata in general):


for every finite-state recognizer, there exists an equivalent deterministic finite-state recognizer

Let’s reflect on what writing powerset told us about finite-state recognizers. Because we can take any finite-state recognizer–whether deterministic or non-deterministic–then pass it to powerset, and get back a deterministic finite-state recognizer, we know that for every finite-state recognizer, there exists an equivalent deterministic finite-state recognizer.

This tells us that the set of all languages recognized by deterministic finite-state recognizers is equal to the set of all languages recognized by all finite-state recognizers, whether deterministic or non-deterministic.

This is not true for other types of automata: In A Brutal Look at Balanced Parentheses, Computing Machines, and Pushdown Automata, we saw that non-deterministic pushdown automata could recognize palindromes, whereas deterministic pushdown automata could not. So the set of languages recognized by deterministic pushdown automata is not equal to the set of languages recognized by pushdown automata.

But it’s a very important result, and this is why:


every regular language can be recognized in linear time

Consider a finite-state recognizer that is deterministic, and without ε-transitions, exactly like the finite-state recognizers we compile from formal regular expressions.

Such a recognizer begins in its start state, and only transitions from state to state when consuming a symbol. If it halts before consuming the entire sentence, it fails to recognize the sentence. If it does not halt before the sentence is complete, it will have performed exactly n transitions, where n is the number of symbols in the sentence.

It cannot have performed more than n transitions, because:

  1. It must consume a symbol to perform a transition, thanks to there being no ε-transitions, and;
  2. It can only perform one transition per symbol, thanks to it being deterministic.
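
We can watch this bound hold with a small sketch that runs a deterministic description and counts the transitions it performs. The function runAndCount is hypothetical and written just for this illustration; the description is the deterministic recognizer that powerset produced earlier:

```javascript
// the deterministic description produced by powerset earlier, copied verbatim
const deterministicExample = {
  "start": "empty",
  "transitions": [
    { "from": "empty", "consume": "0", "to": "zeroes" },
    { "from": "zeroes", "consume": "0", "to": "G36" },
    { "from": "zeroes", "consume": "1", "to": "notZero" },
    { "from": "G36", "consume": "0", "to": "G36" },
    { "from": "G36", "consume": "1", "to": "notZero" },
    { "from": "notZero", "consume": "0", "to": "notZero" },
    { "from": "notZero", "consume": "1", "to": "notZero" }
  ],
  "accepting": [ "G36", "notZero" ]
};

// runAndCount is a hypothetical helper for illustration.
// It performs at most one transition per symbol, and none without a symbol.
function runAndCount ({ start, accepting, transitions }, sentence) {
  let state = start;
  let steps = 0;

  for (const symbol of sentence) {
    const transition = transitions.find(
      ({ from, consume }) => from === state && consume === symbol
    );

    // no transition for this symbol: the recognizer halts early
    if (transition === undefined) return { recognized: false, steps };

    state = transition.to;
    ++steps;
  }

  return { recognized: accepting.includes(state), steps };
}

runAndCount(deterministicExample, '0001')
  //=> { "recognized": true, "steps": 4 }

runAndCount(deterministicExample, '1')
  //=> { "recognized": false, "steps": 0 }
```

The lookup with find scans the list of transitions, but that list has a fixed size that does not depend on the sentence. A production recognizer would more likely index transitions by state and symbol.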

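Now, thanks to powerset, the recognizer itself may well consume exponential space: a nondeterministic recognizer with k states can require as many as 2^k states once it is made deterministic. But recognizing a sentence will only consume time that is linear in the length of that sentence.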
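
A well-known family of recognizers shows the space cost: binary strings whose n-th symbol from the end is 1. The sketch below does not use our powerset function. It is a stripped-down, hypothetical counter that only tallies how many sets of states are reachable, and both function names are our own:

```javascript
// nthFromEndIsOne and countDeterministicStates are hypothetical helpers.

// a nondeterministic recognizer with n + 1 states for binary strings
// whose n-th symbol from the end is 1
function nthFromEndIsOne (n) {
  const transitions = [
    { from: 'q0', consume: '0', to: 'q0' },
    { from: 'q0', consume: '1', to: 'q0' },
    { from: 'q0', consume: '1', to: 'q1' }
  ];

  for (let i = 1; i < n; ++i) {
    transitions.push({ from: `q${i}`, consume: '0', to: `q${i + 1}` });
    transitions.push({ from: `q${i}`, consume: '1', to: `q${i + 1}` });
  }

  return { start: 'q0', accepting: [`q${n}`], transitions };
}

// count the sets of states reachable from the start state, i.e. the
// number of states a subset construction would need for this recognizer
function countDeterministicStates ({ start, transitions }) {
  const symbols = [...new Set(transitions.map(({ consume }) => consume))];
  const seen = new Set();
  const remaining = [[start]];

  while (remaining.length > 0) {
    const stateSet = remaining.pop();
    const name = [...stateSet].sort().join('+');

    if (seen.has(name)) continue;
    seen.add(name);

    for (const symbol of symbols) {
      const destinations = new Set(
        transitions
          .filter(
            ({ from, consume }) => stateSet.includes(from) && consume === symbol
          )
          .map(({ to }) => to)
      );

      if (destinations.size > 0) remaining.push([...destinations]);
    }
  }

  return seen.size;
}

const deterministicStateCounts =
  [1, 2, 3, 4].map(n => countDeterministicStates(nthFromEndIsOne(n)));

deterministicStateCounts
  //=> [2, 4, 8, 16]
```

Each additional state in the nondeterministic recognizer doubles the number of states in the deterministic one, because the deterministic recognizer has to remember which of the last n symbols were 1.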

Many contemporary regex engines, on the other hand, use nondeterministic algorithms that consume much less space, but can exhibit high orders of polynomial time, or even exponential time in the case of backtracking engines. There are a number of reasons for this, including the requirement to support features that recognize non-regular languages (like balanced parentheses and palindromes).

But we know that if a formal regular expression or regex describes a regular language, it is possible to execute it in–at worst–linear time. And our code is the demonstration.

And with that, we bring Part I to a close.



Notes

The code for both the shunting yard and stack machine has been extracted into a Github repository.

  1. We will use the word “equivalent” a lot in this essay. When we say “equivalent,” we don’t mean structurally identical, we mean functionally identical. For example, the regular expressions 0|1(0|1)* and 0|1|1(0|1)*(0|1) are equivalent, because they both describe the same language, the language of binary numbers. We will also sometimes say that an expression is equivalent to a finite-state recognizer. When we do so, we mean that the expression describes the exact same language that the finite-state recognizer recognizes.

  2. In common programming jargon, a “regular expression” refers to any of a family of pattern-matching and extraction languages, that can match a variety of languages. In computer science, a “regular expression” is a specific pattern matching language that recognizes regular languages only. To avoid confusion, in this essay we will use the word “regex” (plural “regexen”) to refer to the programming construct.

  3. Be sure to read this paragraph out loud.

  4. Keen-eyed readers will also note that we’ve added support for prefix operators on top of the existing support for infix and postfix operators. That will come in handy much later. 

  5. automate relies on validatedAndProcessed, a utility function that does some general-purpose processing useful to many of the things we will build along the way. The source code is here. Throughout this essay, we will publish the most important snippets of code, but link to the full source. 

  6. automate can also take a JavaScript RegExp as an argument and return a recognizer function. This is not central to developing finite-state recognizers, but is sometimes useful when comparing JavaScript regexen to our recognizers. 

  7. Atomic operators take zero arguments, as contrasted with postfix operators that take one argument, or infix operators that take two arguments.

  8. The * operator is named the kleene star, after Stephen Kleene. 

https://raganwald.com/2019/09/21/regular-expressions
A Brutal Look at Balanced Parentheses, Computing Machines, and Pushdown Automata

As discussed in Pattern Matching and Recursion, a well-known programming puzzle is to write a function that determines whether a string of parentheses is “balanced,” i.e. each opening parenthesis has a corresponding closing parenthesis, and the parentheses are properly nested.

For example:

Input         Output  Comment
''            true    the empty string is balanced
'()'          true
'(())'        true    parentheses can nest
'()()'        true    multiple pairs are acceptable
'(()()())()'  true    multiple pairs can nest
'((()'        false   missing closing parentheses
'()))'        false   missing opening parentheses
')('          false   close before open


This problem is amenable to all sorts of solutions, from the pedestrian to the fanciful. But it is also an entry point to exploring some of the fundamental questions around computability.

This problem is part of a class of problems that all have the same basic form:

  1. We have a language, by which we mean, we have a set of strings. Each string must be finite in length, although the set itself may have infinitely many members.
  2. We wish to construct a program that can “recognize” (sometimes called “accept”) strings that are members of the language, while rejecting strings that are not.
  3. The “recognizer” is constrained to consume the symbols of each string one at a time.
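
As a point of reference, here is one pedestrian program that fits this form for balanced parentheses. It is a sketch of our own with a hypothetical name, isBalanced, and it quietly relies on a counter that can grow without bound. Whether a recognizer is allowed that kind of memory is exactly the sort of question we are about to explore:

```javascript
// isBalanced is a hypothetical sketch, not one of the machines this essay
// goes on to build. It consumes the string one symbol at a time and keeps
// a count of the parentheses that are open and not yet closed.
function isBalanced (string) {
  let open = 0;

  for (const symbol of string) {
    if (symbol === '(') {
      ++open;
    } else if (symbol === ')') {
      // a close with nothing open can never be repaired later
      if (open === 0) return false;
      --open;
    } else {
      // any other symbol is outside the language's alphabet
      return false;
    }
  }

  // every open parenthesis must have been closed
  return open === 0;
}

isBalanced('(()()())()')
  //=> true

isBalanced('((()')
  //=> false

isBalanced(')(')
  //=> false
```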

Computer scientists study this problem by asking themselves, “Given a particular language, what is the simplest possible machine that can recognize that language?” We’ll do the same thing.

Instead of asking ourselves, “What’s the wildest, weirdest program for recognizing balanced parentheses,” we’ll ask, “What’s the simplest possible computing machine that can recognize balanced parentheses?”

That will lead us on an exploration of formal languages, from regular languages, to deterministic context-free languages, and finally to context-free languages. And as we go, we’ll look at fundamental computing machines, from deterministic finite automata, to deterministic pushdown automata, and finally to pushdown automata.


Robarts Library, the largest library at UofT, looks like a turkey!


prelude: why is this a “brutal” look?

Brutalist architecture flourished from 1951 to 1975, having descended from the modernist architectural movement of the early 20th century. Considered both an ethic and aesthetic, utilitarian designs are dictated by function over form with raw construction materials and mundane functions left exposed. Reinforced concrete is the most commonly recognized building material of Brutalist architecture but other materials such as brick, glass, steel, and rough-hewn stone may also be used.

This essay focuses on the “raw” construction materials of formal languages and the idealized computing machines that recognize them. If JavaScript is ubiquitous aluminum siding, and if Elixir is gleaming steel and glass, pushdown automata are raw, exposed, and brutal concrete.

Let’s get started.


Hurley Building, Boston, MA


Table of Contents

Regular Languages and Deterministic Finite Automata:

Deterministic Context-free Languages and Deterministic Pushdown Automata:

Context-Free Languages and Pushdown Automata:

The End:


Regular Languages and Deterministic Finite Automata

Berkeley Art Museum


formal languages and recognizers

We’ll start by defining a few terms.

A “formal language” is a defined set of strings (or tokens in a really formal argument). For something to be a formal language, there must be an unambiguous way of determining whether a string is or is not a member of the language.

“Balanced parentheses” is a formal language: there is an unambiguous specification for determining whether a string is or is not a member of the language. In computer science, strings containing balanced parentheses are called “Dyck Words,” because they were first studied by Walther von Dyck.

We mentioned “unambiguously specifying whether a string belongs to a language.” A computer scientist’s favourite tool for unambiguously specifying anything is a computing device or machine. And indeed, for something to be a formal language, there must be a machine that acts as its specification.

As alluded to above, we call these machines recognizers. A recognizer takes as its input a series of tokens making up a string, and returns as its output whether it recognizes the string or not. If it does, that string is a member of the language. Computer scientists studying formal languages also study the recognizers for those languages.

There are infinitely many formal languages, but there is an important family of formal languages called regular languages.1

There are a couple of ways to define regular languages, but the one most pertinent to pattern matching is this: A regular language can be recognized by a Deterministic Finite Automaton, or “DFA.” Meaning, we can construct a simple “state machine” to recognize whether a string is valid in the language, and that state machine will have a finite number of states.

Consider the very simple language consisting of the strings Reg and Reggie. This language can be implemented with this deterministic finite automaton:

graph TD
  start(start)-->|R|R
  R-->|e|Re
  Re-->|g|Reg
  Reg-.->|end|recognized(recognized)
  Reg-->|g|Regg
  Regg-->|i|Reggi
  Reggi-->|e|Reggie
  Reggie-.->|end|recognized;

Brutalism


implementing a deterministic finite automaton in javascript

A Deterministic Finite Automaton is the simplest of all possible state machines: It can only store information in its explicit state; there are no other variables such as counters or stacks.2

Since a DFA can only encode state by being in one of a finite number of states, and since a DFA has a finite number of possible states, we know that a DFA can only encode a finite amount of state.

The only thing a DFA recognizer does is respond to tokens as it scans a string, and the only way to query it is to look at what state it is in, or detect whether it has halted.

Here’s a pattern for implementing the “name” recognizer DFA in JavaScript:

// infrastructure for writing deterministic finite automata
const END = Symbol('end');

class DeterministicFiniteAutomaton {
  constructor(internal = 'start') {
    this.internal = internal;
    this.halted = false;
    this.recognized = false;
  }

  transitionTo(internal) {
    this.internal = internal;
    return this;
  }

  recognize() {
    this.recognized = true;
    return this;
  }

  halt() {
    this.halted = true;
    return this;
  }

  consume(token) {
    return this[this.internal](token);
  }

  static evaluate (string) {
    let state = new this();

    for (const token of string) {
      const newState = state.consume(token);

      if (newState === undefined || newState.halted) {
        return false;
      } else if (newState.recognized) {
        return true;
      } else {
        state = newState;
      }
    }

    const finalState = state.consume(END);
    return !!(finalState && finalState.recognized);
  }
}

function test (recognizer, examples) {
  for (const example of examples) {
    console.log(`'${example}' => ${recognizer.evaluate(example)}`);
  }
}

// our recognizer
class Reginald extends DeterministicFiniteAutomaton {
  start (token) {
    if (token === 'R') {
      return this.transitionTo('R');
    }
  }

  R (token) {
    if (token === 'e') {
      return this.transitionTo('Re');
    }
  }

  Re (token) {
    if (token === 'g') {
      return this.transitionTo('Reg');
    }
  }

  Reg (token) {
    if (token === 'g') {
      return this.transitionTo('Regg');
    }
    if (token === END) {
      return this.recognize();
    }
  }

  Regg (token) {
    if (token === 'i') {
      return this.transitionTo('Reggi');
    }
  }

  Reggi (token) {
    if (token === 'e') {
      return this.transitionTo('Reggie');
    }
  }

  Reggie (token) {
    if (token === END) {
      return this.recognize();
    }
  }
}

test(Reginald, [
  '', 'Scott', 'Reg', 'Reginald', 'Reggie'
]);
  //=>
    '' => false
    'Scott' => false
    'Reg' => true
    'Reginald' => false
    'Reggie' => true

This DFA has a constant for its own internal use (END), a state definition consisting of a method for each possible state the DFA can reach, and then a very simple “token scanning machine.” The static evaluate method takes a string as an argument, and returns true if the machine ends up recognized.

Each state function takes a token and returns a state to transition to. If it does not return another state, the DFA halts and the recognizer returns false.

If we can write a recognizer using this pattern for a language, we know it is a regular language. Our “name” language is thus a very small regular language, with just two recognized strings.
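One way to see why this pattern always describes a finite machine is to write the same recognizer as a plain transition table. The following is a minimal sketch rather than part of the implementation above; the names `transitions`, `accepting`, and `recognizes` are invented for illustration.

```javascript
// A DFA as data: each state maps a token to the next state.
// A missing entry means the machine halts without recognizing.
const transitions = {
  start:  { R: 'R' },
  R:      { e: 'Re' },
  Re:     { g: 'Reg' },
  Reg:    { g: 'Regg' },
  Regg:   { i: 'Reggi' },
  Reggi:  { e: 'Reggie' },
  Reggie: {}
};

// States in which reaching the end of the input means "recognized".
const accepting = new Set(['Reg', 'Reggie']);

function recognizes (string) {
  let state = 'start';

  for (const token of string) {
    const row = transitions[state];

    // No transition for this token: halt.
    if (!Object.prototype.hasOwnProperty.call(row, token)) return false;
    state = row[token];
  }

  return accepting.has(state);
}

for (const example of ['', 'Scott', 'Reg', 'Reginald', 'Reggie']) {
  console.log(`'${example}' => ${recognizes(example)}`);
}
```

The table has seven rows and will never need more, no matter how long the input is. That fixed size is what makes the language regular.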

On to infinite regular languages!


Concrete Habour


infinite regular languages

If there are a finite number of finite strings in a language, there must be a DFA that recognizes that language.3

But what if there are an infinite number of finite strings in the language?

For some languages that have an infinite number of strings, we can still construct a deterministic finite automaton to recognize them. For example, here is a deterministic finite automaton that recognizes binary numbers:

graph LR
  start(start)-->|0|zero
  zero-.->|end|recognized(recognized)
  start-->|1|one[one or more]
  one-->|0 or 1|one
  one-.->|end|recognized;

And we can also write this DFA in JavaScript:

class Binary extends DeterministicFiniteAutomaton {
  start (token) {
    if (token === '0') {
      return this.transitionTo('zero');
    }
    if (token === '1') {
      return this.transitionTo('oneOrMore');
    }
  }

  zero (token) {
    if (token === END) {
      return this.recognize();
    }
  }

  oneOrMore (token) {
    if (token === '0') {
      return this.transitionTo('oneOrMore');
    }
    if (token === '1') {
      return this.transitionTo('oneOrMore');
    }
    if (token === END) {
      return this.recognize();
    }
  }
}

test(Binary, [
  '', '0', '1', '00', '01', '10', '11',
  '000', '001', '010', '011', '100',
  '101', '110', '111',
  '10100011011000001010011100101110111'
]);
  //=>
    '' => false
    '0' => true
    '1' => true
    '00' => false
    '01' => false
    '10' => true
    '11' => true
    '000' => false
    '001' => false
    '010' => false
    '011' => false
    '100' => true
    '101' => true
    '110' => true
    '111' => true
    '10100011011000001010011100101110111' => true

Our recognizer is finite, yet it recognizes an infinite number of finite strings, including those that are improbably long. And since the recognizer has a fixed and finite size, it follows that “binary numbers” is a regular language.
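Every language a DFA can recognize can also be described by a formal regular expression. As a hedged illustration, here is one way to write the binary-numbers language as a JavaScript regex that uses only formal features (alternation, a character choice, and the kleene star). The names `binaryPattern` and `isBinary` are ours.

```javascript
// Either a lone 0, or a 1 followed by any number of 0s and 1s.
const binaryPattern = /^(?:0|1[01]*)$/;

function isBinary (string) {
  return binaryPattern.test(string);
}

for (const example of ['', '0', '1', '00', '01', '10', '1010']) {
  console.log(`'${example}' => ${isBinary(example)}`);
}
```

The two branches of the pattern correspond to the two paths out of the start state in the diagram: one through zero, the other through the looping "one or more" state.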

Now we have some examples of regular languages. We see that they can be recognized with finite state automata, and we also see that it is possible for regular languages to have an infinite number of strings, some of which are arbitrarily long (but still finite). This does not, in principle, bar us from creating deterministic finite automata to recognize them.

We can now think a little harder about the balanced parentheses problem. If “balanced parentheses” is a regular language, it must be possible to write a deterministic finite automaton to recognize a string with balanced parentheses.

But if it is not possible to write a deterministic finite automaton to recognize balanced parentheses, then balanced parentheses must be a kind of language that is more complex than a regular language, and must require a more powerful machine for recognizing its strings.


Orșova - biserica catolică "Neprihănita Zămislire"


nested parentheses

Of all the strings that contain zero or more parentheses, there is a set that contains zero or more opening parentheses followed by zero or more closed parentheses, and where the number of opening parentheses exactly equals the number of closed parentheses.

The strings that happen to contain exactly the same number of opening parentheses as closed parentheses can just as easily be described as follows: A string belongs to the language if the string is empty, or if the string is ( and ) wrapped around a string that belongs to the language.

We call these strings “nested parentheses,” and they are related to balanced parentheses: All nested parentheses strings are also balanced parentheses strings. Our approach to determining whether balanced parentheses is a regular language will use nested parentheses.

First, we will assume that there exists a deterministic finite automaton that can recognize balanced parentheses. Let’s call this machine B. Since nested parentheses are also balanced parentheses, B must recognize nested parentheses. Next, we will use nested parentheses strings to show that by presuming that B has a finite number of states, we create a logical contradiction.

This will establish that our assumption that there is a deterministic finite automaton, B, that recognizes balanced parentheses is faulty, which in turn establishes that balanced parentheses is not a regular language.4

Okay, we are ready to prove that a deterministic finite automaton cannot recognize nested parentheses, which in turn establishes that a deterministic finite automaton cannot recognize balanced parentheses.


Mäusenbunker — Exploring Architectural Brutalism in #Berlin Lichterfelde


balanced parentheses is not a regular language

Back to the assumption that there is a deterministic finite automaton that can recognize balanced parentheses, B. We don’t know how many states B has. It might be a very large number, but we know that there are a finite number of these states.

Now let’s consider the set of all strings that consist of one or more open parentheses: (, ((, (((, and so forth. Our DFA will always begin in the start state, and for each one of these strings, when B scans them, it will always end in some state.

There are an infinite number of such strings of open parentheses, but there are only a finite number of states in B, so it follows that there are at least two different strings of open parentheses that–when scanned–end up in the same state. Let’s call the shorter of those two strings p, and the longer of those two strings q.5

We can make a pretend function called state. state takes a DFA, a start state, and a string, and returns the state the machine is in after reading a string, or it returns halt if the machine halted at some point while reading the string.

We are saying that there is at least one pair of strings of open parentheses, p and q, such that p ≠ q, and state(B, start, p) = state(B, start, q). (Actually, there are an infinite number of such pairs, but we don’t need them all to prove a contradiction, a single pair will do.)
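To make the pretend function concrete, here is a minimal sketch of `state` together with a search for such a pair. The machine `B` below is a hypothetical four-state DFA invented for this example (it counts nesting depth and saturates at three), and `findCollision` is likewise our own name. The argument in the text applies to every DFA, not just this one.

```javascript
// A hypothetical four-state DFA that tries to track nesting depth.
// It runs out of states at depth 3, so deeper input saturates there.
const B = {
  d0: { '(': 'd1' },
  d1: { '(': 'd2', ')': 'd0' },
  d2: { '(': 'd3', ')': 'd1' },
  d3: { '(': 'd3', ')': 'd2' }
};

// state(dfa, start, string): the state reached after reading string,
// or 'halt' if the machine has no transition for some token.
function state (dfa, start, string) {
  let current = start;

  for (const token of string) {
    const next = dfa[current][token];
    if (next === undefined) return 'halt';
    current = next;
  }

  return current;
}

// Scan (, ((, (((, ... until two strings land in the same state.
// With finitely many states, this loop must terminate.
function findCollision (dfa, start) {
  const seen = new Map();

  for (let n = 1; ; ++n) {
    const string = '('.repeat(n);
    const reached = state(dfa, start, string);

    if (seen.has(reached)) return { p: seen.get(reached), q: string };
    seen.set(reached, string);
  }
}

const { p, q } = findCollision(B, 'd0');

// For this machine, p is '(((' and q is '((((': both end in d3.
console.log(p, q, state(B, 'd0', p) === state(B, 'd0', q));
```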

Now let us consider the string p'. p' consists of exactly as many closed parentheses as there are open parentheses in p. String pp' consists of p, followed by p'. pp' is a string in the balanced parentheses language, by definition.

String qp' consists of q, followed by p'. Since p has a different number of open parentheses than q, string qp' consists of a different number of open parentheses than closed parentheses, and thus qp' is not a string in the balanced parentheses language.

Now we run B on string pp', pausing after it has read the characters in p. At that point, it will be in state(B, start, p). It then reads the string p', placing it in state(B, state(B, start, p), p').

Since B recognizes strings in the balanced parentheses language, and pp' is a string in the balanced parentheses language, we know that state(B, start, pp') is recognized. And since state(B, start, pp') equals state(B, state(B, start, p), p'), we are also saying that state(B, state(B, start, p), p') is recognized.

What about running B on string qp'? Let’s pause after it reads the characters in q. At that point, it will be in state(B, start, q). It then reads the string p', placing it in state(B, state(B, start, q), p'). Since B recognizes strings in the balanced parentheses language, and qp' is not a string in the balanced parentheses language, we know that state(B, start, qp') must not equal recognized, and that state(B, state(B, start, q), p') must not equal recognized.

But state(B, start, p) is the same state as state(B, start, q)! And by the rules of determinism, state(B, state(B, start, p), p') must be the same as state(B, state(B, start, q), p'). But we have established that state(B, state(B, start, p), p') must be recognized and that state(B, state(B, start, q), p') must not be recognized.

Contradiction! Therefore, our original assumption—that B exists—is false. There is no deterministic finite automaton that recognizes balanced parentheses. And therefore, balanced parentheses is not a regular language.
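The proof covers every possible DFA, but it can help to watch the contradiction bite on a concrete family of machines. The sketch below is an illustration under our own assumptions, not a substitute for the proof: `makeDepthCounter` is a hypothetical helper that builds a finite machine whose states are the depths 0 through `limit`, saturating when it runs out of states.

```javascript
// Build a hypothetical finite machine with states 0..limit that counts
// nesting depth, saturating at `limit` because it has no more states.
function makeDepthCounter (limit) {
  return function recognizes (string) {
    let depth = 0;                       // the machine's current state

    for (const token of string) {
      if (token === '(') {
        depth = Math.min(depth + 1, limit);
      } else if (token === ')') {
        if (depth === 0) return false;   // halt: close before open
        depth = depth - 1;
      } else {
        return false;                    // halt: unknown token
      }
    }

    return depth === 0;
  };
}

for (const limit of [1, 5, 50]) {
  const recognizes = makeDepthCounter(limit);
  const p = '('.repeat(limit);
  const q = '('.repeat(limit + 1);
  const pPrime = ')'.repeat(limit);

  // pp' is balanced and qp' is not, yet both get the same verdict.
  console.log(limit, recognizes(p + pPrime), recognizes(q + pPrime));
}
```

However many states we give the machine, there is always a q one parenthesis longer than p that lands in the same state, so the machine wrongly recognizes qp'.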


Deterministic Context-free Languages and Deterministic Pushdown Automata

HUD Plaza


deterministic pushdown automata

We now know that “balanced parentheses” cannot be recognized with one of the simplest possible computing machines, a finite state automaton. This leads us to ask, “What is the simplest form of machine that can recognize balanced parentheses?” Computer scientists have studied this and related problems, and there are a few ideal machines that are more powerful than a DFA, but less powerful than a Turing Machine.

All of them have some mechanism for encoding an infinite number of states by adding some form of “external state” to the machine’s existing “internal state.” This is very much like a program in a von Neumann machine. Leaving out self-modifying code, the position of the program counter is a program’s internal state, while memory that it reads and writes is its external state.6

The simplest machine that adds external state, which we might think of as being one step more powerful than a DFA, is called a Deterministic Pushdown Automaton, or “DPA.” A DPA is very much like our Deterministic Finite Automaton, but it adds an expandable stack as its external state.

There are several classes of Pushdown Automata, depending upon what they are allowed to do with the stack. A Deterministic Pushdown Automaton has the simplest and least powerful capability:

  1. A DPA can match the current token, the value of the top of the stack, or both.
  2. A DPA can halt or choose the next state, and it can also push a symbol onto the top of the stack, pop the current symbol off the top of the stack, or replace the top symbol on the stack.

If a deterministic pushdown automaton can recognize a language, the language is known as a deterministic context-free language. Is “balanced parentheses” a deterministic context-free language?

Can we write a DPA to recognize balanced parentheses? DPAs have a finite number of internal states. Our proof that balanced parentheses is not a regular language rested on the fact that a DFA has only a finite number of internal states, and no other way to store information.

Does that apply to DPAs too? No.

A DPA still has a finite number of internal states, but because of its external stack, it can encode an infinite number of possible states. With a DFA, we asserted that if it is in a particular internal state, and it reads a string of tokens, it will end up halting or reaching a state, and given that internal state and that series of tokens, the DFA will always end up halting or always end up reaching the same end state.

This is not true of a DPA. A DPA can push tokens onto the stack, pop tokens off the stack, and make decisions based on the top token on the stack. As a result, we cannot determine the destiny of a DPA based on its internal state and sequence of tokens alone. We have to include the state of the stack.

Therefore, our proof that a DFA with a finite number of internal states cannot recognize balanced parentheses does not apply to DPAs. If we can write a DPA to recognize balanced parentheses, then “balanced parentheses” is a deterministic context-free language.


Croydon Brutalism


balanced parentheses is a deterministic context-free language

Let’s start with a recognizer that can implement DPAs. Now we have to track both the current internal state and the external state, in the form of a stack.

const END = Symbol('end');

class DeterministicPushdownAutomaton {
  constructor(internal = 'start', external = []) {
    this.internal = internal;
    this.external = external;
    this.halted = false;
    this.recognized = false;
  }

  push(token) {
    this.external.push(token);
    return this;
  }

  pop() {
    this.external.pop();
    return this;
  }

  replace(token) {
    this.external[this.external.length - 1] = token;
    return this;
  }

  top() {
    return this.external[this.external.length - 1];
  }

  hasEmptyStack() {
    return this.external.length === 0;
  }

  transitionTo(internal) {
    this.internal = internal;
    return this;
  }

  recognize() {
    this.recognized = true;
    return this;
  }

  halt() {
    this.halted = true;
    return this;
  }

  consume(token) {
    return this[this.internal](token);
  }

  static evaluate (string) {
    let state = new this();

    for (const token of string) {
      const newState = state.consume(token);

      if (newState === undefined || newState.halted) {
        return false;
      } else if (newState.recognized) {
        return true;
      } else {
        state = newState;
      }
    }

    const finalState = state.consume(END);
    return !!(finalState && finalState.recognized);
  }
}

function test (recognizer, examples) {
  for (const example of examples) {
    console.log(`'${example}' => ${recognizer.evaluate(example)}`);
  }
}

Now, a stack implemented in JavaScript cannot actually encode an infinite amount of information. The depth of the stack is limited to 2^32 - 1, and there are a finite number of different values we can push onto the stack. And then there are limitations like the memory in our machines, or the number of clock ticks our CPUs will execute before the heat-death of the universe.

But our implementation shows the basic principle, and it’s good enough for any of the test strings we’ll write by hand.

Now how about a recognizer for balanced parentheses? Here is the state diagram:

graph LR
  start(start)-->|"(, [_] v ("|start
  start-->|"), [(] ^"|start
  start-.->|"end, []"|recognized(recognized)

Note that we have added some decoration to the arcs between internal states: Instead of just recognizing a single token, it recognizes a tuple of a token and the top of the stack. [] means it matches when the stack is empty, [(] means match when the top of the stack contains a (, and _ is a wild-card that matches any non-empty token. We have also added two optional instructions for the stack: v to push, and ^ to pop.

And here’s the code for the same DPA:

class BalancedParentheses extends DeterministicPushdownAutomaton {
  start(token) {
    if (token === '(') {
      return this.push(token);
    } else if (token === ')' && this.top() === '(') {
      return this.pop();
    } else if (token === END && this.hasEmptyStack()) {
      return this.recognize();
    }
  }
}

test(BalancedParentheses, [
  '', '(', '()', '()()', '(())',
  '([()()]())', '([()())())',
  '())()', '((())(())'
]);
  //=>
    '' => true
    '(' => false
    '()' => true
    '()()' => true
    '(())' => true
    '([()()]())' => false
    '([()())())' => false
    '())()' => false
    '((())(())' => false

Aha! Balanced parentheses is a deterministic context-free language.
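Notice that this DPA only ever pushes one kind of symbol, so the only thing its stack really records is a depth. As a hedged aside, the same decisions can be sketched with a single number standing in for the stack. The function name `isBalancedByCounting` is ours, and the counter is an ordinary JavaScript number rather than a formal machine.

```javascript
function isBalancedByCounting (string) {
  let depth = 0;                  // stands in for the height of the stack

  for (const token of string) {
    if (token === '(') {
      depth += 1;                 // corresponds to a push
    } else if (token === ')' && depth > 0) {
      depth -= 1;                 // corresponds to a pop
    } else {
      return false;               // no transition: halt
    }
  }

  return depth === 0;             // END with an empty stack
}

for (const example of ['', '(', '()', '()()', '(())', '())()', '((())(())']) {
  console.log(`'${example}' => ${isBalancedByCounting(example)}`);
}
```

This shortcut only works because there is a single kind of parenthesis. With more than one kind, the order of what was opened matters, and a real stack is needed.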

Our recognizer is so simple, we can give in to temptation and enhance it to recognize multiple types of parentheses (we won’t bother with the diagram):

class BalancedParentheses extends DeterministicPushdownAutomaton {
  start(token) {
    if (token === '(') {
      return this.push(token);
    } else if (token === '[') {
      return this.push(token);
    } else if (token === '{') {
      return this.push(token);
    } else if (token === ')' && this.top() === '(') {
      return this.pop();
    } else if (token === ']' && this.top() === '[') {
      return this.pop();
    } else if (token === '}' && this.top() === '{') {
      return this.pop();
    } else if (token === END && this.hasEmptyStack()) {
      return this.recognize();
    }
  }
}

test(BalancedParentheses, [
  '', '(', '()', '()()', '{()}',
  '([()()]())', '([()())())',
  '())()', '((())(())'
]);
  //=>
    '' => true
    '(' => false
    '()' => true
    '()()' => true
    '{()}' => true
    '([()()]())' => true
    '([()())())' => false
    '())()' => false
    '((())(())' => false
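The chain of else-if branches above repeats the same two rules for each kind of parenthesis. One way to make that explicit, sketched here as a standalone function rather than as a subclass of our DPA, is to drive the push and pop decisions from a lookup table. The names `OPENER_FOR`, `OPENERS`, and `isBalancedWithPairs` are assumptions made for this example.

```javascript
// Map each closing token to the opening token it must match.
const OPENER_FOR = new Map([[')', '('], [']', '['], ['}', '{']]);
const OPENERS = new Set(OPENER_FOR.values());

function isBalancedWithPairs (string) {
  const stack = [];

  for (const token of string) {
    const top = stack[stack.length - 1];

    if (OPENERS.has(token)) {
      stack.push(token);                       // any opener: push it
    } else if (OPENER_FOR.has(token) && top === OPENER_FOR.get(token)) {
      stack.pop();                             // matching closer: pop
    } else {
      return false;                            // anything else: halt
    }
  }

  return stack.length === 0;                   // END with an empty stack
}

for (const example of ['', '{()}', '([()()]())', '([()())())', '())()']) {
  console.log(`'${example}' => ${isBalancedWithPairs(example)}`);
}
```

Adding another kind of parenthesis means adding one entry to the table. The table stays finite, which is the property the machine depends on.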

Balanced parentheses with a finite number of kinds of parentheses is also a deterministic context-free language. We’re going to come back to deterministic context-free languages in a moment, but let’s consider a slightly different way to recognize balanced parentheses first.


BNP Paribas Fortis, Brussels


recursive regular expressions

We started this essay by mentioning regular expressions. We then showed that a formal regular expression cannot recognize balanced parentheses, in that formal regular expressions can only define regular languages.

Regular expressions as implemented in programming languages–abbreviated regexen (singular regex)–are a different beast. Various features have been added to make them non-deterministic, and on some platforms, even recursive.

JavaScript’s regexen do not support recursion, but the Oniguruma regular expression engine used by Ruby (and PHP) does support recursion. Here’s an implementation of simple balanced parentheses, written in Ruby:

/^(?'balanced'(?:\(\g'balanced'\))*)$/

It is written using the standard syntax. Standard syntax is compact, but on more complex patterns can make the pattern difficult to read. “Extended” syntax ignores whitespace, which is very useful when a regular expression is complex and needs to be visually structured.

Extended syntax also allows comments. Here’s a version that can handle three kinds of parentheses:

%r{                     # Start of a Regular expression literal.

  ^                     # Match the beginning of the input

  (?'balanced'          # Start a capturing group named 'balanced'

    (?:                 # Start an anonymous non-capturing group

      \(\g'balanced'\)  # Match an open parenthesis, anything matching the 'balanced'
                        # group, and a closed parenthesis. ( and ) are escaped
                        # because they have special meanings in regular expressions.

      |                 # ...or...

      \[\g'balanced'\]  # Match an open bracket, anything matching the 'balanced'
                        # group, and a closed bracket. [ and ] are escaped
                        # because they have special meanings in regular expressions.

      |                 # ...or...

      \{\g'balanced'\}  # Match an open brace, anything matching the 'balanced'
                        # group, and a closed brace. { and } are escaped
                        # because they have special meanings in regular expressions.

    )*                  # End the anonymous non-capturing group, and modify
                        # it so that it matches zero or more times.

  )                     # End the named group 'balanced'

  $                     # Match the end of the input

}x                      # End of the regular expression literal. x is a modifier
                        # indicating "extended" syntax, allowing comments and
                        # ignoring whitespace.

These recursive regular expressions specify a deterministic context-free language, and indeed we already have developed deterministic pushdown automata that perform the same recognizing.

We know that recognizing these languages requires some form of state that is equivalent to a stack with one level of depth for every unclosed parenthesis. That is handled for us by the engine, but we can be sure that somewhere behind the scenes, it is consuming the equivalent amount of memory.
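JavaScript’s regexen cannot recurse, but a JavaScript function can. The following is a rough sketch of what a recursive pattern like the one above asks an engine to do, written as a hand-rolled recursive recognizer. It is not how Oniguruma is implemented; `CLOSER_FOR`, `balancedFrom`, and `isBalancedRecursively` are names invented for this example.

```javascript
const CLOSER_FOR = new Map([['(', ')'], ['[', ']'], ['{', '}']]);

// Consume a balanced run starting at index i and return the index just
// past it. Returns -1 if an opener is never properly closed.
function balancedFrom (string, i) {
  // The loop plays the role of the * in the pattern.
  while (CLOSER_FOR.has(string[i])) {
    const closer = CLOSER_FOR.get(string[i]);

    // The recursive call plays the role of \g'balanced'.
    const afterInner = balancedFrom(string, i + 1);

    if (afterInner === -1 || string[afterInner] !== closer) return -1;
    i = afterInner + 1;
  }

  return i;
}

function isBalancedRecursively (string) {
  // Like ^ and $, insist that the balanced run covers the whole input.
  return balancedFrom(string, 0) === string.length;
}

for (const example of ['', '(', '()', '{()}', '([()()]())', '([()())())']) {
  console.log(`'${example}' => ${isBalancedRecursively(example)}`);
}
```

There is no explicit stack here, but each unclosed parenthesis holds one frame on JavaScript’s call stack. The memory is still being consumed, just somewhere less visible.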

So we know that recursive regular expressions appear to be at least as powerful as deterministic pushdown automata. But are they more powerful? Meaning, is there a language that a recursive regular expression can match, but a DPA cannot?


Context-Free Languages and Pushdown Automata

El CECUT, Centro Cultural Tijuana, la Bola.


nested parentheses

To demonstrate that recursive regular expressions are more powerful than DPAs, let’s begin by simplifying our balanced parentheses language. Here’s a three-state DPA that recognizes nested parentheses only, not all balanced parentheses.

Here’s a simplified state diagram that just handles round parentheses:

graph TD
  start(start)-->|end|recognized(recognized)
  start-->|"(, [] v ("|opening
  opening-->|"(, [] v ("|opening
  opening-->|"), [(] ^"|closing
  closing-->|"), [(] ^"|closing
  closing-.->|"end, []"|recognized

And here’s the full code handling all three kinds of parentheses:

class NestedParentheses extends DeterministicPushdownAutomaton {
  start(token) {
    switch(token) {
      case END:
        return this.recognize();
      case '(':
        return this
          .push(token)
          .transitionTo('opening');
      case '[':
        return this
          .push(token)
          .transitionTo('opening');
      case '{':
        return this
          .push(token)
          .transitionTo('opening');
    }
  }

  opening(token) {
    if (token === '(') {
      return this.push(token);
    } else if (token === '[') {
      return this.push(token);
    } else if (token === '{') {
      return this.push(token);
    } else if (token === ')' && this.top() === '(') {
      return this
        .pop()
        .transitionTo('closing');
    } else if (token === ']' && this.top() === '[') {
      return this
        .pop()
        .transitionTo('closing');
    } else if (token === '}' && this.top() === '{') {
      return this
        .pop()
        .transitionTo('closing');
    }
  }

  closing(token) {
    if (token === ')' && this.top() === '(') {
      return this.pop();
    } else if (token === ']' && this.top() === '[') {
      return this.pop();
    } else if (token === '}' && this.top() === '{') {
      return this.pop();
    } else if (token === END && this.hasEmptyStack()) {
      return this.recognize();
    }
  }
}

test(NestedParentheses, [
  '', '(', '()', '()()', '{()}',
  '([()])', '([))',
  '(((((())))))'
]);
  //=>
    '' => true
    '(' => false
    '()' => true
    '()()' => false
    '{()}' => true
    '([()])' => true
    '([))' => false
    '(((((())))))' => true

And here is the equivalent recursive regular expression:

nested = %r{
    ^
    (?'nested'
      (?:
        \(\g'nested'\)
        |
        \[\g'nested'\]
        |
        \{\g'nested'\}
      )?
    )
    $
  }x

def test pattern, strings
  strings.each do |string|
    puts "'#{string}' => #{!(string =~ pattern).nil?}"
  end
end

test nested, [
  '', '(', '()', '()()', '{()}',
  '([()])', '([))',
  '(((((())))))'
]
  #=>
    '' => true
    '(' => false
    '()' => true
    '()()' => false
    '{()}' => true
    '([()])' => true
    '([))' => false
    '(((((())))))' => true

So far, so good. Of course they both work. Nested parentheses is a subset of balanced parentheses, and since we have a DPA that recognizes it, we know that it’s a deterministic context-free language.

But now let’s modify our program to help with documentation, rather than math. Let’s make it work with quotes.


Rezola cement factory, San Sebastian, Spain


context-free languages

Instead of matching open and closed parentheses, we’ll match quotes, both single quotes like ' and double quotes like ".7

Our first crack is to modify our existing DPA by replacing opening and closing parentheses with quotes. We’ll only need two cases, not three. Here’s our DPA:

class NestedQuotes extends DeterministicPushdownAutomaton {
  start(token) {
    switch(token) {
      case END:
        return this.recognize();
      case '\'':
        return this
          .push(token)
          .transitionTo('opening');
      case '"':
        return this
          .push(token)
          .transitionTo('opening');
    }
  }

  opening(token) {
    if (token === '\'' && this.top() === '\'') {
      return this
        .pop()
        .transitionTo('closing');
    } else if (token === '"' && this.top() === '"') {
      return this
        .pop()
        .transitionTo('closing');
    } else if (token === '\'') {
      return this.push(token);
    } else if (token === '"') {
      return this.push(token);
    }
  }

  closing(token) {
    if (token === '\'' && this.top() === '\'') {
      return this.pop();
    } else if (token === '"' && this.top() === '"') {
      return this.pop();
    } else if (token === END && this.hasEmptyStack()) {
      return this.recognize();
    }
  }
}

test(NestedQuotes, [
  ``, `'`, `''`, `""`, `'""'`,
  `"''"`, `'"'"`, `"''"""`,
  `'"''''''''''''''''"'`
]);
  //=>
    '' => true
    ''' => false
    '''' => true
    '""' => true
    ''""'' => true
    '"''"' => true
    ''"'"' => false
    '"''"""' => false
    ''"''''''''''''''''"'' => false

NestedQuotes does not work: it rejects the last string, even though that string is properly nested. What if we modify our recursive regular expression to work with single and double quotes?

quotes = %r{
    ^
    (?'balanced'
      (?:
        '\g'balanced''
        |
        "\g'balanced'"
      )?
    )
    $
  }x

test quotes, [
  %q{}, %q{'}, %q{''}, %q{""}, %q{'""'},
  %q{"''"}, %q{'"'"}, %q{"''"""},
  %q{'"''''''''''''''''"'}
]
  #=>
    %q{} => true
    %q{'} => false
    %q{''} => true
    %q{""} => true
    %q{'""'} => true
    %q{"''"} => true
    %q{'"'"} => false
    %q{"''"""} => false
    %q{'"''''''''''''''''"'} => true

The recursive regular expression does work! Now, we may think that perhaps we went about writing our deterministic pushed automaton incorrectly, and there is a way to make it work, but no. It will never work on this particular problem.

This particular language–nested single and double quotes–is a very simple example of the “palindrome” problem. We cannot use a deterministic pushdown automaton to write a recognizer for palindromes that have at least two different kinds of tokens.

And that tells us that there is a class of languages that are more complex than deterministic context-free languages. They are context-free languages, and they are a superset of deterministic context-free languages.


brutalism 3 of 3


why deterministic pushdown automata cannot recognize palindromes

Why can’t a deterministic pushdown automaton recognize our nested symmetrical quotes language? For the same reason that deterministic pushdown automata cannot recognize arbitrarily long palindromes.

Let’s consider a simple palindrome language. In this language, there are only two tokens, 0, and 1. In our language, any even-length palindrome is valid, such as 00, 1001, and 000011110000. We aren’t going to do a formal proof here, but let’s imagine that there is a DPA that can recognize this language, we’ll call this DPA P.8

Let’s review for a moment how DPAs (like FSAs) work. There can only be a finite number of internal states. And there can only be a finite set of symbols that it manipulates (there might be more symbols than tokens in the language it recognizes, but still only a finite set.)

Now, depending upon how P is organized, it may push one symbol onto the stack for each token it reads, or it might push a symbol every so many tokens. For example, it could encode four bits of information about the tokens read as a single hexadecimal digit, 0 through F. Or eight bits as the hexadecimal pairs 00 through FF, and so on. Or it might just store one bit of information per stack element, in which case its “words” would only have one bit each.

So the top-most element of P’s external stack can contain an arbitrary, but finite amount of information. And P’s internal state can hold an arbitrary, but finite amount of information. There are fancy ways to get at elements below the topmost element, but to do so, we must either discard the top-most element’s information, or store it in P’s finite internal state temporarily, and then push it back.

That doesn’t actually give P any more recollection than storing state on the top of the stack and in its internal state. We can store more information than can be held in P’s internal state and on the top of P’s external stack, but to access more information, we must permanently discard information from the top of the stack.9


So let’s consider what happens when we feed P a long string of 0s and 1s that is “incompressible,” or random in an information-theoretic sense. Each token in our vocabulary of 1s and 0s represents an additional bit of information. We construct a string long enough that P must store some of the string’s information deep enough in the stack to be inaccessible without discarding information.

Now we begin to feed it the inverse of the string read so far. What does P do?

If P is to recognize a palindrome, it must eventually dig into the “inaccessible” information, which means discarding some information. So let’s feed it enough information such that it discards some information. Now we give it a token that is no longer part of the inverse of the original string. What does P do now? We haven’t encountered the END, so P cannot halt and declare that the string is not a palindrome. After all, this could be the beginning of an entirely new palindrome, we don’t know yet.

But since P has discarded some information in order to know whether to match an earlier possible palindrome, P does not now have enough information to match a larger palindrome. Ok, so maybe P shouldn’t have discarded information to match the shorter possible palindrome. If it did not do so, P would have the information required to match the longer possible palindrome.

But that breaks any time the shorter palindrome is the right thing to recognize.

No matter how we organize P, we can always construct a string large enough that P must discard information to correctly recognize one possible string, but not discard information in order to correctly recognize another possible string.

Since P is deterministic, meaning it always does exactly one thing in response to any token given a particular state, P cannot both discard and simultaneously not discard information, therefore P cannot recognize languages composed of palindromes.

Therefore, no DPA can recognize languages composed of palindromes.
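
The dilemma is easy to see with a concrete pair of strings. The sketch below is only an illustration, not a proof: `reverse` and `isEvenPalindrome` are hypothetical helpers that look at the whole string at once, which is exactly what a DPA reading one token at a time cannot do. It shows that `0110` is a complete palindrome and also the first half of a longer one, so a machine that has just read `011` cannot know whether the next `0` should be matched against the stack or saved on it.

```javascript
// Hypothetical helpers for illustration only: they inspect the entire
// string, unlike an automaton that consumes one token at a time.
const reverse = string => [...string].reverse().join('');
const isEvenPalindrome = string =>
  string.length % 2 === 0 && string === reverse(string);

// A palindrome that is also the first half of a longer palindrome.
const shorter = '0110';
const longer = shorter + reverse(shorter);

console.log(shorter, isEvenPalindrome(shorter));
console.log(longer, isEvenPalindrome(longer));
```

To accept the shorter string, a machine must have started popping at the third token. To accept the longer one, it must still be pushing at that point.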


gottfried böhm, architect: maria königin des friedens pilgrimage church, neviges, germany 1963-1972


pushdown automata

We said that there is no deterministic pushdown automaton that can recognize a palindrome language like our symmetrical quotes language. And to reiterate, we said that this is the case because a deterministic pushdown automaton cannot simultaneously remove information from its external storage and add information to its external storage. If we wanted to make a pushdown automaton that could recognize palindromes, we could go about it in one of two ways:

  1. We could relax the restrictions on what an automaton can do with the stack in one step, e.g. access any element, or store a stack in an element of the stack. Machines that can do more with the stack than a single push, pop, or replace are called Stack Machines.
  2. We could relax the restriction that a deterministic pushdown automaton must do one thing, or another, but not both.

Let’s consider the latter option. Deterministic machines must do one and only one thing in response to a token and the top of the stack. That’s what makes them “deterministic.” In effect, their logic always looks like this:

start (token) {
  if (token === this.top()) {
    return this.pop();
  } else if (token === '0' || token === '1') {
    return this.push(token);
  } else if (token === END && this.hasEmptyStack()) {
    return this.recognize();
  } else if (token === END && !this.hasEmptyStack()) {
    return this.halt();
  }
}

Such logic pops, recognizes, halts, or pushes the current token, but it cannot do more than one of these things simultaneously. But what if the logic looked like this?

* start (token) {
  if (token === this.top()) {
    yield this.pop();
  }
  if (token === '0' || token === '1') {
    yield this.push(token);
  }
  if (token === END && this.hasEmptyStack()) {
    yield this.recognize();
  }
  if (token === END && !this.hasEmptyStack()) {
    yield this.halt();
  }
}

This logic is formulated as a generator that yields one or more outcomes. It can do more than one thing at a time. It expressly can both pop and push the current token when it matches the top of the stack. How will this work?
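
Before wiring this into an automaton, here is a minimal sketch of the mechanics on their own. `actionsFor` is a hypothetical stand-in for a state method: it only names the actions it would take rather than performing them, so we can see that independent `if` statements in a generator may yield more than one outcome for a single token.

```javascript
// Hypothetical stand-in for a state method. It yields the names of the
// actions it would take, instead of mutating a stack.
function* actionsFor(token, top) {
  if (token === top) {
    yield 'pop';
  }
  if (token === '0' || token === '1') {
    yield 'push';
  }
}

// Spreading a generator collects everything it yields into an array.
const whenMatching = [...actionsFor('0', '0')];
const whenDifferent = [...actionsFor('1', '0')];

console.log(whenMatching);
console.log(whenDifferent);
```

When the token matches the top of the stack, both outcomes are produced. When it doesn't, only the push is.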


overhead brutalism


implementing pushdown automata

We will now use this approach to write a recognizer for the “even-length binary palindrome” problem. We’ll use the same general idea as our NestedParentheses language, but we’ll make three changes:

  • Our state methods will be generators;
  • We evaluate all the possible actions and yield each one’s result;
  • We add a call to .fork() for each result, which as we’ll see below, means that we are cloning our state and making changes to the clone.

class BinaryPalindrome extends PushdownAutomaton {
  * start (token) {
    if (token === '0') {
      yield this
        .fork()
        .push(token)
        .transitionTo('opening');
    }
    if (token === '1') {
      yield this
        .fork()
        .push(token)
        .transitionTo('opening');
    }
    if (token === END) {
      yield this
        .fork()
        .recognize();
    }
  }

  * opening (token) {
    if (token === '0') {
      yield this
        .fork()
        .push(token);
    }
    if (token === '1') {
      yield this
        .fork()
        .push(token);
    }
    if (token === '0' && this.top() === '0') {
      yield this
        .fork()
        .pop()
        .transitionTo('closing');
    }
    if (token === '1' && this.top() === '1') {
      yield this
        .fork()
        .pop()
        .transitionTo('closing');
    }
  }

  * closing (token) {
    if (token === '0' && this.top() === '0') {
      yield this
        .fork()
        .pop();
    }
    if (token === '1' && this.top() === '1') {
      yield this
        .fork()
        .pop();
    }
    if (token === END && this.hasEmptyStack()) {
      yield this
        .fork()
        .recognize();
    }
  }
}

Now let’s modify DeterministicPushdownAutomaton to create PushdownAutomaton. We’ll literally copy the code from class DeterministicPushdownAutomaton { ... } and make the following changes:10

class PushdownAutomaton {

   // ... copy-pasta from class DeterministicPushdownAutomaton

  consume(token) {
    return [...this[this.internal](token)];
  }

  fork() {
    return new this.constructor(this.internal, this.external.slice(0));
  }

  static evaluate (string) {
    let states = [new this()];

    for (const token of string) {
      const newStates = states
        .flatMap(state => state.consume(token))
        .filter(state => state && !state.halted);

      if (newStates.length === 0) {
        return false;
      } else if (newStates.some(state => state.recognized)) {
        return true;
      } else {
        states = newStates;
      }
    }

    return states
      .flatMap(state => state.consume(END))
      .some(state => state && state.recognized);
  }
}

The new consume method calls the internal state method as before, but then uses the array spread syntax to turn the elements it yields into an array. The fork method makes a copy of a state object, including its own copy of the stack.11
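
To see why the copy matters, here is a minimal sketch with a hypothetical `ForkDemo` class that has just enough of the machinery (a constructor, `push`, and `fork`) to show that a fork gets its own stack. Copying with `slice(0)` is enough here because the tokens on the stack are plain strings.

```javascript
// Hypothetical, cut-down class for illustration. Only the pieces needed
// to demonstrate fork() are included.
class ForkDemo {
  constructor(internal = 'start', external = []) {
    this.internal = internal;
    this.external = external;
  }

  push(token) {
    this.external.push(token);
    return this;
  }

  fork() {
    // slice(0) copies the array, so the clone's stack is independent.
    return new this.constructor(this.internal, this.external.slice(0));
  }
}

const original = new ForkDemo().push('0');
const clone = original.fork().push('1');

console.log(original.external); // unchanged by the clone's push
console.log(clone.external);
```

Without the copy, every fork would share one stack, and pushing in one branch would corrupt all the others.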

The biggest change is to the evaluate static method. We now start with an array of one state. As we loop over the tokens in the string, we take the set of all states and flatMap them to the states they return, then filter out any states that halt.

If we end up with no states that haven’t halted, the machine fails to recognize the string. Whereas, if any of the states lead to recognizing the string, the machine recognizes the string. If not, we move to the next token. When we finally pass in the END token, if any of the states returned recognize the string, then we recognize the string.
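
To watch the set of states grow and shrink, here is a self-contained sketch of the same idea using plain objects instead of classes. The names `step` and `liveStateCounts` are hypothetical, and the transition rules are a hand transcription of the BinaryPalindrome states above, so treat it as an approximation of the class-based code rather than a replacement for it.

```javascript
const END = Symbol('end');

// Each state is a plain object: { internal, stack }. step returns every
// successor state for one token, which may be none, one, or two.
function step({ internal, stack }, token) {
  const top = stack[stack.length - 1];
  const isBit = token === '0' || token === '1';
  const next = [];

  // start and opening: push the token and stay in (or move to) opening.
  if (isBit && (internal === 'start' || internal === 'opening')) {
    next.push({ internal: 'opening', stack: [...stack, token] });
  }
  // opening and closing: pop when the token matches the top of the stack.
  if (isBit && token === top && internal !== 'start') {
    next.push({ internal: 'closing', stack: stack.slice(0, -1) });
  }
  // start and closing: recognize at END with an empty stack.
  if (token === END && stack.length === 0 && internal !== 'opening') {
    next.push({ internal, stack, recognized: true });
  }
  return next;
}

// Feed the string through every live state, recording how many states
// survive after each token.
function liveStateCounts(string) {
  let states = [{ internal: 'start', stack: [] }];
  const counts = [];

  for (const token of string) {
    states = states.flatMap(state => step(state, token));
    counts.push(states.length);
  }

  const recognized = states
    .flatMap(state => step(state, END))
    .some(state => state.recognized === true);

  return { counts, recognized };
}

console.log(liveStateCounts('0110'));
console.log(liveStateCounts('0101'));
```

For `0110`, a second state appears at the third token, where the machine both pushes the `1` and pops the matching `1`. For `0101`, no token ever matches the top of the stack, so there is only ever one live state and it never reaches an empty stack.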

So does it work?

test(BinaryPalindrome, [
  '', '0', '00', '11', '0110',
  '1001', '0101', '100111',
  '01000000000000000010'
]);
  //=>
    '' => true
    '0' => false
    '00' => true
    '11' => true
    '0110' => true
    '1001' => true
    '0101' => false
    '100111' => false
    '01000000000000000010' => true

Indeed it does, and we leave as “exercises for the reader” to perform either of these two modifications:

  1. Modify BinaryPalindrome to recognize both odd- and even-length palindromes, or;
  2. Modify BinaryPalindrome so that it recognizes nested quotes instead of binary palindromes.

Our pushdown automaton works because when it encounters a token, it both pushes the token onto the stack and compares it to the top of the stack and pops it off if it matches. It forks itself each time, so it uses more space and time than a deterministic machine would, and in general, simulating non-determinism this way can be exponentially expensive. But it does work.

And that shows us that pushdown automata are more powerful than deterministic pushdown automata, because they can recognize languages that deterministic pushdown automata cannot recognize: Context-free languages.


Marseille - Cité Radieuse


deterministic context-free languages are context-free languages

Pushdown automata are a more powerful generalization of deterministic pushdown automata: They can recognize anything a deterministic pushdown automaton can recognize, and using the exact same diagram.

We can see this by writing a pushdown automaton to recognize a deterministic context-free language. Here is the state diagram:

graph LR
  start(start)-->|L|L1[L]
  L1-->|O|LO["L(OL)*O"]
  LO-->|L|LOL["L(OL)+"]
  LOL-->|O|LO
  LOL-.->|end|recognized(recognized)

And here is our code. As noted before, in our implementation, it is almost identical to the implementation we would write for a DPA:

class LOL extends PushdownAutomaton {
  * start (token) {
    if (token === 'L') {
      yield this
        .fork()
        .transitionTo('l');
    }
  }

  * l (token) {
    if (token === 'O') {
      yield this
        .fork()
        .transitionTo('lo');
    }
  }

  * lo (token) {
    if (token === 'L') {
      yield this
        .fork()
        .transitionTo('lol');
    }
  }

  * lol (token) {
    if (token === 'O') {
      yield this
        .fork()
        .transitionTo('lo');
    }
    if (token === END) {
      yield this
        .fork()
        .recognize();
    }
  }
}

test(LOL, [
  '', 'L', 'LO', 'LOL', 'LOLO',
  'LOLOL', 'LOLOLOLOLOLOLOLOLOL',
  'TROLOLOLOLOLOLOLOLO'
]);
  //=>
    '' => false
    'L' => false
    'LO' => false
    'LOL' => true
    'LOLO' => false
    'LOLOL' => true
    'LOLOLOLOLOLOLOLOLOL' => true
    'TROLOLOLOLOLOLOLOLO' => false

But we needn’t ask anyone to “just trust us on this.” Here’s an implementation of PushdownAutomaton that works for both deterministic and general pushdown automata:

class PushdownAutomaton {
  constructor(internal = 'start', external = []) {
    this.internal = internal;
    this.external = external;
    this.halted = false;
    this.recognized = false;
  }

  isDeterministic () {
    return false;
  }

  push(token) {
    this.external.push(token);
    return this;
  }

  pop() {
    this.external.pop();
    return this;
  }

  replace(token) {
    this.external[this.external.length - 1] = token;
    return this;
  }

  top() {
    return this.external[this.external.length - 1];
  }

  hasEmptyStack() {
    return this.external.length === 0;
  }

  transitionTo(internal) {
    this.internal = internal;
    return this;
  }

  recognize() {
    this.recognized = true;
    return this;
  }

  halt() {
    this.halted = true;
    return this;
  }

  consume(token) {
    const states = [...this[this.internal](token)];
    if (this.isDeterministic()) {
      return states[0] || [];
    } else {
      return states;
    }
  }

  fork() {
    return new this.constructor(this.internal, this.external.slice(0));
  }

  static evaluate (string) {
    let states = [new this()];

    for (const token of string) {
      const newStates = states
        .flatMap(state => state.consume(token))
        .filter(state => state && !state.halted);

      if (newStates.length === 0) {
        return false;
      } else if (newStates.some(state => state.recognized)) {
        return true;
      } else {
        states = newStates;
      }
    }

    return states
      .flatMap(state => state.consume(END))
      .some(state => state && state.recognized);
  }
}
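
One detail is worth seeing in isolation. In deterministic mode, consume returns either a single state or an empty array, while in non-deterministic mode it returns an array of states. The evaluate method copes with all three shapes because flatMap flattens returned arrays by one level and keeps non-array values as they are. A small sketch, using strings as hypothetical stand-ins for state objects:

```javascript
// Strings stand in for state objects here. Each entry mimics one possible
// return value from consume().
const consumed = [
  'one state',            // deterministic mode, a transition was found
  [],                     // deterministic mode, no transition
  ['fork a', 'fork b']    // non-deterministic mode, two outcomes
];

// flatMap flattens one level of arrays and leaves other values alone.
const flattened = consumed.flatMap(result => result);

console.log(flattened);
```

The empty array simply disappears, which is how a deterministic machine with no valid transition ends up with no live states.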

Now if we want to show that an automaton written in non-deterministic style has the same semantics as an automaton written for our original DeterministicPushdownAutomaton class, we can write it like this:

class LOL extends PushdownAutomaton {

  isDeterministic () {
    return true;
  }

  // rest of states remain exactly the same

  ...
}

We can even experiment with something like BinaryPalindrome. By implementing isDeterministic() and alternating between having it return true and false, we can see that the language it recognizes is context-free but not deterministically context-free:

class BinaryPalindrome extends PushdownAutomaton {
  isDeterministic () {
    return true;
  }

  ...

}

test(BinaryPalindrome, [
  '', '0', '00', '11', '0110',
  '1001', '0101', '100111',
  '01000000000000000010'
]);
  //=>
    '' => true
    '0' => false
    '00' => false
    '11' => false
    '0110' => false
    '1001' => false
    '0101' => false
    '100111' => false
    '01000000000000000010' => false

class BinaryPalindrome extends PushdownAutomaton {
  isDeterministic () {
    return false;
  }

  ...

}

test(BinaryPalindrome, [
  '', '0', '00', '11', '0110',
  '1001', '0101', '100111',
  '01000000000000000010'
]);
  //=>
    '' => true
    '0' => false
    '00' => true
    '11' => true
    '0110' => true
    '1001' => true
    '0101' => false
    '100111' => false
    '01000000000000000010' => true

Now, since context-free languages are the set of all languages that pushdown automata can recognize, and since pushdown automata can recognize all languages that deterministic pushdown automata can recognize, it follows that the set of all deterministic context-free languages is a subset of the set of all context-free languages.

Which is implied by the name, but it’s always worthwhile to explore some of the ways to demonstrate its truth.


The End

Night View of The Geisel Library, University of California San Diego


summary

We’ve seen that formal languages (those made up of unambiguously defined strings of symbols) come in at least three increasingly complex families, and that one way to quantify that complexity is according to the capabilities of the machines (or “automata”) capable of recognizing strings in the language.

Here are the three families of languages and automata that we reviewed:

| Language Family | Automata Family | Example Language(s) |
|---|---|---|
| Regular | Finite State | Binary Numbers, LOL |
| Deterministic Context-free | Deterministic Pushdown | Balanced Parentheses |
| Context-free | Pushdown | Palindromes |


An obvious question is, Do you need to know the difference between a regular language and a context-free language if all you want to do is write some code that recognizes balanced parentheses?

The answer is, probably not. Consider cooking. A food scientist knows all sorts of things about why certain recipes do what they do. A chef de cuisine (or “chef”) knows how to cook and improvise recipes. Good chefs end up acquiring a fair bit of food science in their careers, and they know how to apply it, but they spend most of their time cooking, not thinking about what is going on inside the food when it cooks.

There are some areas where at least a smattering of familiarity with this particular subject is helpful. Writing parsers, to give one example. Armed with this knowledge, and but little more, the practising programmer knows how to design a configuration file’s syntax or a domain-specific language to be amenable to parsing by an LR(k) parser, and what implications deviating from a deterministic context-free language will have on the performance of the parser.

But on a day-to-day basis, if asked to recognize balanced parentheses?

The very best answer is probably one of three things: /^(?'balanced'(?:\(\g'balanced'\))*)$/x for those whose tools support recursive regular expressions, a simple loop with a counter, or a stack.
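
For those without recursive regular expressions, the loop with a counter might look something like this. It is a minimal sketch, and the name `balanced` is our own. Like the regular expression, it accepts only strings made up entirely of parentheses, including the empty string.

```javascript
// Counts unclosed parentheses while scanning left to right.
function balanced(string) {
  let depth = 0;

  for (const token of string) {
    if (token === '(') {
      depth += 1;
    } else if (token === ')') {
      depth -= 1;
      // A close with nothing open can never be repaired later.
      if (depth < 0) return false;
    } else {
      // Anything other than a parenthesis is not in the language.
      return false;
    }
  }

  // Every open parenthesis must have been closed.
  return depth === 0;
}

console.log(balanced('(()())'));
console.log(balanced('())('));
```

A counter is enough because there is only one kind of parenthesis. With several kinds of brackets, we would need the stack.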


Robarts

Detail of Robarts Library © 2006 Andrew Louis. Used with permission, all rights reserved.


further reading

If you enjoyed reading this introduction to formal languages and automata that recognize them, here are some interesting avenues to pursue:

Formal languages are actually specified with formal grammars, not with informal descriptions like “palindrome,” or, “binary number.” The most well-known formal grammar is the regular grammar, which defines a regular language. Regular grammars begat the original regular expressions.

Balanced parentheses has been discussed in this blog before: Pattern Matching and Recursion discusses building a recognizer out of composable pattern-matching functions, while Alice and Bobbie and Sharleen and Dyck discusses a cheeky little solution to the programming problem.

For those comfortable with code examples written in Ruby, the general subject of ideal computing machines and the things they can compute is explained brilliantly and accessibly in Tom Stuart’s book Understanding Computation.


Beinecke Rare Book & Manuscript Library Interior


discussions

Discuss this essay on hacker news, proggit, or /r/javascript.


Notes
  1. Formal regular expressions were invented by Stephen Kleene.

  2. There are many ways to write DFAs in JavaScript. In How I Learned to Stop Worrying and ❤️ the State Machine, we built JavaScript programs using the state pattern, but they were far more complex than a deterministic finite automaton. For example, those state machines could store information in properties, and those state machines had methods that could be called.

    Such “state machines” are not “finite” state machines, because in principle they can have an infinite number of states. They have a finite number of defined states in the pattern, but their properties allow them to encode state in other ways, and thus they are not finite state machines. 

  3. To demonstrate that “If there are a finite number of strings in a language, there must be a DFA that recognizes that language,” take any syntax for defining a DFA, such as a table. With a little thought, one can imagine an algorithm that takes as its input a finite list of acceptable strings, and generates the appropriate table. 

  4. This type of proof is known as “Reductio Ad Absurdum,” and it is a favourite of logicians, because quidquid Latine dictum sit altum videtur

  5. An interesting thing to note is that for any DFA, there must be an infinite number of pairs of different strings that lead to the same state. This follows from the fact that the DFA has a finite number of states, but we can always find more finite strings than there are states in the DFA.

    But for this particular case, where we are talking about strings that consist of a single symbol, and of different lengths, it follows that the DFA must contain at least one loop. 

  6. No matter how a machine is organized, if it has a finite number of states, it cannot recognize balanced parentheses by our proof above. For example, if we modify our DFA to allow an on/off flag for each state, and we have a finite number of states, our machine is not more powerful than a standard DFA, it is just more compact: Its definition is log2 the size of a standard DFA, but it still has a finite number of possible different states.

  7. For this pattern, we are not interested in properly typeset quotation marks, we mean the single and double quotes that don’t have a special form for opening and closing, the kind you find in programming languages that were designed to be reproducible by telegraph equipment: ' and ". If we could use the proper “quotes,” then our language would be a Dyck Language, equivalent to balanced parentheses.

  8. The even-length palindrome language composed of 0s and 1s is 100% equivalent to the nested quotes language, we’re just swapping 0 for ', and 1 for ", because they’re easier for this author’s eyes to read.

  9. In concatenative programming languages, such as PostScript, Forth, and Joy, there are DUP and SWAP operations that, when used in the right combination, can bring an element of the stack to the top, and then put it back where it came from. One could construct a DPA such that it has the equivalent of DUP and SWAP operations, but since a DPA can only perform one push, pop, or replace for each token it consumes, the depth of stack it can reach is limited by its ability to store the tokens it is consuming in its internal state, which is finite. For any DPA, there will always be some depth of stack that is unreachable without discarding information. 

  10. If we were really strict about OO and inheritance, we might have them both inherit from a common base class of AbstractPushdownAutomaton, but “Aint nobody got time for elaborate class hierarchies.” 

  11. This code makes a number of unnecessary copies of states, we could devise a scheme to use structural sharing and copy-on-write semantics, but we don’t want to clutter up the basic idea right now. 

https://raganwald.com/2019/02/14/i-love-programming-and-programmers
Ayoayo and Linear Recursion

In this essay, we’re going to look at a game called Ayoayo. As we’ll read, Ayoayo is part of the Mancala family of games that has spread throughout Africa, and beyond. We’ll write some code that would be useful if we were implementing an Ayoayo game, and along the way, we’ll look at how we can keep our functions decoupled from each other and themselves with dependency injection.

We’ll then look at how our gratuitous use of linear recursion also helps us keep our code decoupled, and teaches us how to write decoupled code even if we decided to use iteration.

But first, some personal recollections. Feel free to skip it if you want to dive right into code.


Visiting a slave fort in Ghana

I believe this is a picture of my sister and me visiting a fort used to house slaves before their transportation (what a bloodless word) to the Americas, probably in Ghana.


Prelude: A Boy and a Game
africa 1968 – 1971

In 1968 or thereabouts, my mother took my sister and me on a trip to West Africa. We returned to Nigeria a year or so later, and lived there while she developed software for a brand-new IBM 360 that had been installed at the University of Ibadan.

It is difficult for me to explain the magnitude of the culture shock I experienced in Africa. In the 1960s, Toronto was much less visibly multicultural than it is today. I do not recall ever seeing a black police officer, teacher, or newscaster on television. In 1963, my great-uncle Leonard had become the first black MPP in Canada, but in 1968 he was still more of an exception than a minority.

And then I visited Africa. Almost everyone was black, where in Canada, almost everyone was white. I distinctly recall being amazed to visit the flight deck of our Air Afrique flight, and the black pilots let me sit in the co-pilot’s chair. The flight attendants were black as well, and everyone spoke French. I had never encountered such a thing.

In Africa, there was fantastic wealth, and there was wretched poverty as I’d never imagined. Many institutions in Africa were older than the country of Canada. That’s something you find in most of the old world, like England or Europe, but it was particularly stunning for me to look at things like the Great Mosque of Djenné. The first mosque on that site was built in the late 13th century, three centuries before the first Europeans set permanent foot in Canada.

In 1970, we were even in the crowds for the coronation procession of Opoku Ware II, 15th Emperor-King of the Ashanti people. The Empire of Ashanti was formed in 1701, a century and a half before Canada became a country. My brain exploded when we touched down in Africa, and continued to erupt the entire time we were there. Everything I encountered was literally fantastic.

From an early age, my mother had taken our education “in hand,” as they used to say. I was already aware of various important stories and facts about black people like Harriet Tubman. I think I had already read some Anansi stories from books she had imported by mail. You wouldn’t find things like that in a bookstore in those days.

And amongst all those great things, she may have also introduced me to some form of the game Oware in Toronto, but my recollection is that I first encountered it in Africa.


Mancala / Awale / Oware / Ayoayo

This board can be used to play many different kinds of Oware, including Ayoayo.


oware and ayoayo

What is Oware? Wikipedia puts it succinctly and well: Oware is an abstract strategy game among the Mancala family of board games (pit and pebble games) played worldwide with slight variations as to the layout of the game, number of players and strategy of play. Its origin is uncertain, but it is widely believed to be of Ashanti origin.

The games in the Mancala family have spread all over the world. They crossed the Sahara with the gold and salt trade to East Africa, where they then crossed to Southern Asia. I recall a friend visiting my house, who spotted a board and told me that she had played the game as a young girl in Malaysia.

One of its charms is its radical simplicity. It is often played with pebbles and pits scooped out of the ground or sand. It can be played with cups and marbles, pennies, or even tiddlywinks. In that respect, it reminds me of tic-tac-toe. It can be played almost anywhere, almost any time, by almost everyone.

I think that I first learned to play Oware in Nigeria, because what I recall of the rules closely matches the rules of Ayoayo, the variation of the game played by the Yoruba people of Nigeria.


Stone at Tolowa Dunes State Park

One of Ayoayo’s charms is its radical simplicity. It is often played with pebbles and pits scooped out of the ground or sand.


how I learnt ayoayo is played

In Nigeria, bored children would play Ayoayo the way children play tic-tac-toe in the West. If there was a board handy, we’d play on that. If not, we could collect pebbles or coins and set up a makeshift board on paper or even the ground. But it was usually a board, as I recall. Every house had one, often several. Some houses would have a ceremonial board, an older, elaborate hand-carved board, elevated so that two people sitting on stools could play without the need of a table between them.

Ayoayo is played on the most common Oware layout, a board with two rows of six pits. Many boards also provide a pit on each side of the board for “captured” stones, but this is optional when playing Ayoayo.

The players face each other with the board between them, such that each player has a row of six pits in front of them. These pits “belong” to that player. If extra pits are provided for captured stones, each player takes one for themselves, but the extra pits are not in play. (In some other games in the same family, pits for captured stones are used in play.)

We’ve mentioned capturing stones several times, and for good reason: The game play consists of capturing stones, and when the game is completed, the player who has captured the most stones, wins.

The rules are simple:

The game begins with four stones in each of the twelve pits. Thus, the game requires 48 stones. Pebbles, marbles, or even lego pieces can be used to represent the stones. A wooden board is nice, but pits can be scooped out of earth or sand to make a board. This extreme simplicity is part of the game’s charm, much as tic-tac-toe’s popularity stems in part from the fact that you can play a game with little more than a stick and a piece of flat earth.

The players alternate turns, as in many games.

On a player’s turn, they select one of their pits to “sow.” There are some exceptions listed below, but in general if the player has more than one pit with stones, they may select which one to sow. There are many variations on rules for how to sow the stones amongst the Mancala family of games, but in Ayoayo, sowing works like this:

  • The player scoops all of the stones from the starting pit into their hand.
  • Moving counter-clockwise, the player drops one stone into each pit.
  • On their row, they move from left to right.
  • If they reach the end of their row, they move from right to left on their opponent’s row (thus “counter-clockwise”).
  • They always skip the starting pit on their row.
  • The sowing pauses when they have sowed the last stone in their hand.

If the last stone lands in a pit on either side of the board that contains one or more stones, they scoop the stones up (including that last stone), and continue sowing. They continue to skip their original starting pit only, but can sow into any pits that get scooped up in this manner. This is called relay sowing.

If the last stone lands in an empty pit on that player’s own side, the player “captures” any stones that are in the pit on the opponent’s side of the board directly across from their last pit.

If a player has no move on their turn, the game ends, and their opponent captures any remaining stones (which will–of course–be on their side).

If a player ends his or her turn with no seeds left in his or her row, the opponent must (if it is possible) choose his move in such a way to bring one or more seeds into the other’s row. If a player has several such moves (as is usually the case), the player may choose which move to make.

I learned those rules very quickly, and it was pretty easy to knock off game after game while laughing and joking with school mates or friends from the neighbourhood. Good times.


An Altair 8800 on display at the Smithsonian Institute



segue to implementing ayoayo

At that time in my life, I knew about computers, but I didn’t have proper access to a computer. I couldn’t write COBOL or FORTRAN programs and run them on the computer at the University. Nobody had a home computer: The Altair 8800 hadn’t been invented yet, and I had never heard of Douglas Engelbart, much less seen The Mother of All Demos.

If I had, I doubtless would have tried to write a program to play Ayoayo with me. But that was then. How about now? What would be involved in writing a program to play Ayoayo? Let’s explore implementing some of the rules. We won’t build a complete game, but we will explore some ideas around decoupling and recursion as we go.


Natural Mancala Game

A natural Mancala game.


Implementing Ayoayo
a starting point: the board, the stones, and sowing

If we were going to make an Ayoayo program, where would we start?

Let’s start with a simple idea: We’ll have to represent the state of the game. We could use an array for the twelve pits of the game, with an integer representing the number of stones in that pit. We’ll initialize it with four stones in each pit:

const gamePits = Array(12).fill(4);

Associating array elements with pits

Associating the elements of the array with the pits belonging to the players.
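To make the mapping concrete, here is a minimal sketch of a helper that prints the array as a two-row board. The name renderBoard and the layout are assumptions made for illustration: pits 0 through 5 form the near row, left to right, and pits 6 through 11 form the far row, printed right to left so that neighbouring indices stay next to each other in counter-clockwise order.

```javascript
// Hypothetical helper: renders the twelve-pit array as two rows of text.
// Assumes pits 0-5 are the near row (left to right) and pits 6-11 are the
// far row, printed from 11 down to 6 so that the board reads counter-clockwise.
function renderBoard(pits) {
  const cell = count => String(count).padStart(2, " ");
  const farRow = pits.slice(6, 12).reverse().map(cell).join(" ");
  const nearRow = pits.slice(0, 6).map(cell).join(" ");

  return `${farRow}\n${nearRow}`;
}

const numberedBoard = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];

console.log(renderBoard(numberedBoard));
// prints:
// 11 10  9  8  7  6
//  0  1  2  3  4  5
```

Filling each pit with its own index, as above, is a quick way to check which array element ends up where on the printed board.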


In Ayoayo, a full turn involves relay sowing as described above:

When relay sowing, if the last stone during sowing lands in an occupied hole, all the contents of that hole, including the last sown stone, are immediately re-sown from the hole.

The other kind of sowing, as used in other games from the same family, is just called sowing. The sowing stops when the last stone is sown, regardless of whether the last stone lands in an occupied or unoccupied pit.

It’s fairly obvious that if we make a function for sowing, we can use that to make a function for relay sowing, so let’s start with an ordinary sowing function. We’ll make a relay sowing function later.

The first thing we need is a function to scoop up the stones from a pit:

function scoop(fromPit) {
  const stonesInHand = gamePits[fromPit];

  gamePits[fromPit] = 0;

  return stonesInHand;
}

We’ll also need a function to distribute stones. It needs to know which pit to skip, where to start distributing, and how many stones are in the hand that need to be distributed:

function distribute(skipPit, currentPit, stonesInHand) {

  // now what?

}

The first thing our distribute function needs to do is place a stone from the hand into the current pit, with one wrinkle: if the current pit is the skip pit, it moves to the next pit first:

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(skipPit, currentPit, stonesInHand) {

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++gamePits[currentPit];
  --stonesInHand;

  // And now what?
}

What should we do next?

Well, if we don’t have any more stones in hand, we’re done. We should return the current pit so that other code can do things like work out whether to continue sowing, or determine whether to capture any stones from our opponent:

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(skipPit, currentPit, stonesInHand) {

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++gamePits[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return currentPit;
  } else if (stonesInHand > 0) {
    // what goes here?
  }
}

If we still have stones, then what? We need to keep sowing. We could rewrite distribute to have a loop of some kind, maybe do { ... } while (...). But hang on!

distribute is a function that takes a pit to skip, a current pit, and a number of stones in hand. It returns the result of sowing stones. That’s what we need to return, with just one alteration: We need to distribute starting with the next pit. So let’s just do that:

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(skipPit, currentPit, stonesInHand) {

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++gamePits[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return currentPit;
  } else if (stonesInHand > 0) {
    return distribute(skipPit, nextPit(currentPit), stonesInHand);
  }
}

With scoop and distribute, we can write sow:

function sow(skipPit, fromPit = skipPit) {
  return distribute(skipPit, nextPit(fromPit), scoop(fromPit));
}

Let’s try it:

sow(0)
  //=> 4

gamePits
  //=> [0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4]

We can test it by hand.


initial position


And after sowing once, it looks like this:


position after sowing from position zero


Now, there were five stones in the last pit (pit 4). If there are two or more stones in the last pit, it wasn’t empty before we sowed the last stone into it. So when relay sowing, we’d sow again, only this time we’ll tell our function to start at pit 4, where we left off last time:

sow(0, 4)
  //=> 9

Five stones in the last pit (9). Let’s do it again:

sow(0, 9)
  //=> 3

Six stones in pit 3! Again:

sow(0, 3)
  //=> 9

gamePits
  //=> [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5]

Aha, just one stone in pit 9, we’re done. Let’s try it by hand, remembering to skip pit 0, and compare:


position after relay sowing from position zero


Let’s look at our accumulated code all together:

function scoop(fromPit) {
  const stonesInHand = gamePits[fromPit];

  gamePits[fromPit] = 0;

  return stonesInHand;
}

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(skipPit, currentPit, stonesInHand) {

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++gamePits[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return currentPit;
  } else if (stonesInHand > 0) {
    return distribute(skipPit, nextPit(currentPit), stonesInHand);
  }
}

function sow(skipPit, fromPit = skipPit) {
  return distribute(skipPit, nextPit(fromPit), scoop(fromPit));
}

let gamePits = Array(12).fill(4);

sow(0)
  //=> 4
sow(0, 4)
  //=> 9
sow(0, 9)
  //=> 3
sow(0, 3)
  //=> 9

gamePits
  //=> [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5]

We’re off to a reasonable start. And as it happens, the recursive solution is actually straightforward to explain and to implement.
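For comparison, here is a sketch of the loop-based alternative mentioned earlier. distributeWithLoop is a hypothetical name, and it takes the board as an explicit argument purely so that the sketch runs on its own. It mutates that array, just as the version above mutates gamePits, and it should behave the same as the recursive distribute whenever there is at least one stone in hand.

```javascript
function nextPit(pit) {
  return (pit + 1) % 12;
}

// Hypothetical loop-based equivalent of the recursive distribute.
// Mutates `pits` in place, mirroring the mutable version in the text.
function distributeWithLoop(pits, skipPit, currentPit, stonesInHand) {
  let lastPit;

  while (stonesInHand > 0) {
    if (currentPit === skipPit) {
      currentPit = nextPit(currentPit);
    }

    ++pits[currentPit];
    --stonesInHand;

    // Remember where this stone went before stepping to the next pit.
    lastPit = currentPit;
    currentPit = nextPit(currentPit);
  }

  return lastPit;
}

const loopBoard = Array(12).fill(4);
const scoopedStones = loopBoard[0];

loopBoard[0] = 0;

const lastPitSown = distributeWithLoop(loopBoard, 0, 1, scoopedStones);

lastPitSown
  //=> 4

loopBoard
  //=> [0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4]
```

The loop has to track lastPit separately from currentPit, which is bookkeeping the recursive version does not need.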


Toghiz Qumalaq

Toguz korgool (Kyrgyz: тогуз коргоол - “nine sheep droppings”) or toguz kumalak/toghiz qumalaq (Kazakh: тоғыз құмалақ), is a two-player game in the mancala family that is played in Central Asia.


a most practical digression

Let’s look at our calls to sow again:

sow(0)
sow(0, 4)
sow(0, 9)
sow(0, 3)

Shall we try it again?

sow(0)
  //=> undefined

D’oh! We forgot to reset the board. Our sow function relies upon a mutable value from outside of its body–gamePits–and indeed it mutates that value by changing the contents of the array. Thus, when you invoke sow, you don’t know what you’re going to get unless you already know the state of gamePits.

We say that sow is coupled to gamePits. In order to test sow, we have to first carefully set up gamePits to align with our expectations. We see this kind of thing when writing production code for large systems: Many tests do more work setting up and tearing down all of the required initial conditions than they do actually testing functions and methods.

So coupling a function to a mutable value requires us to keep track of that value in order to understand what sow will or won’t do. That adds some complexity to understanding our system. And this coupling is transitive: Not only is sow coupled to gamePits, any other function we write that is coupled to gamePits, becomes coupled to sow. Running sow changes the behaviour of any function coupled to gamePits because sow mutates gamePits.

In fact, sow is coupled to itself! Even if we remember to correctly initialize gamePits before running sow, the order in which we invoke sow affects the results.
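A small experiment makes the order dependence visible. This is a sketch: makeMutableGame is a hypothetical wrapper that gives each experiment its own private copy of gamePits and of the mutable functions above, so that the two runs cannot interfere with each other.

```javascript
// Hypothetical wrapper: each call returns a fresh board plus a sow function
// that mutates only that board, using the same logic as the code above.
function makeMutableGame() {
  const gamePits = Array(12).fill(4);

  function nextPit(pit) {
    return (pit + 1) % 12;
  }

  function scoop(fromPit) {
    const stonesInHand = gamePits[fromPit];

    gamePits[fromPit] = 0;

    return stonesInHand;
  }

  function distribute(skipPit, currentPit, stonesInHand) {
    if (currentPit === skipPit) {
      currentPit = nextPit(currentPit);
    }

    ++gamePits[currentPit];
    --stonesInHand;

    if (stonesInHand === 0) {
      return currentPit;
    } else if (stonesInHand > 0) {
      return distribute(skipPit, nextPit(currentPit), stonesInHand);
    }
  }

  function sow(skipPit, fromPit = skipPit) {
    return distribute(skipPit, nextPit(fromPit), scoop(fromPit));
  }

  return { gamePits, sow };
}

const zeroThenOne = makeMutableGame();
zeroThenOne.sow(0);
zeroThenOne.sow(1);

const oneThenZero = makeMutableGame();
oneThenZero.sow(1);
oneThenZero.sow(0);

zeroThenOne.gamePits
  //=> [0, 0, 6, 6, 6, 5, 5, 4, 4, 4, 4, 4]

oneThenZero.gamePits
  //=> [0, 1, 6, 6, 6, 5, 4, 4, 4, 4, 4, 4]
```

The same two calls, made in a different order, leave the board in different states.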

Having a single variable, gamePits, describing the state of the game in progress does feel intuitively sound. There is only one board in the game, and when we sow stones, we change it. So why shouldn’t we “model the real world accurately?”

That question is easy to answer: I used to own a typewriter. It had no “undo.” It had no “copy” or “paste,” those functions were accomplished by making copies of pages and laboriously cutting sections of text out with an x-acto blade and gluing them onto other pieces of paper. The text editor I am using now is superior to my typewriter precisely because it does not insist on modelling the real world accurately.

Modelling the real world is a tool for exploring ideas in software and for helping others read our software: What is familiar, subjectively feels “intuitive.”1 But when the modelling interferes with our ability to understand our software’s behaviour, we should relax our desire to make everything about modelling. What we seek is maximum understandability and maximum flexibility, not maximum fidelity.

So how can we “decouple” our code from itself and each other?


Two coupled high-speed trains ETR 610 of SBB on the Gotthard line

Picture of a train coupling.


decoupling code from shared mutable values

The easiest thing is the obvious thing: If a function shares mutable values with another function (or itself!), and we wish to remove the coupling caused by changes to the shared mutable values, we rewrite the functions to get rid of the shared mutable values. This is an easy refactoring:2

  • We take all of the references to shared mutable values that functions must read, and replace them with parameters.
  • We take all of the references to shared mutable values that functions must write, and replace them with creating copies of the shared mutable values. We then return those copies along with any other return values the functions already have.

So our code becomes:

function scoop(before, fromPit) {
  const after = before.slice(0);
  const stonesInHand = after[fromPit];

  after[fromPit] = 0;

  return [after, stonesInHand];
}

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(before, skipPit, currentPit, stonesInHand) {
  const after = before.slice(0);

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++after[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return [after, currentPit];
  } else if (stonesInHand > 0) {
    return distribute(after, skipPit, nextPit(currentPit), stonesInHand);
  }
}

function sow(before, skipPit, fromPit = skipPit) {
  const [after, stonesInHand] = scoop(before, fromPit);

  return distribute(after, skipPit, nextPit(fromPit), stonesInHand);
}

And we use it like this:

let gamePits = Array(12).fill(4);

sow(gamePits, 0)
  //=> [
         [0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4],
         4
       ]

If we want to update the game state, we can choose to do that:

let gamePits = Array(12).fill(4);
let lastPit;

[gamePits, lastPit] = sow(gamePits, 0);

gamePits
  //=> [0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4]

At first sight, there seem to be more moving parts with this approach.

But in truth, they were always there, it’s just that we “lifted” them into the interface of the scoop, distribute, and sow functions where we can see them. And now, we can test any arbitrary invocation without having to remember to set things up in advance.3

For example, here’s the second call we made:

sow([0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4], 0, 4)
  //=> [
         [0, 5, 5, 5, 0, 5, 5, 5, 5, 5, 4, 4],
         9
       ]

Checks out, and so do the third and fourth calls we made:

sow([0, 5, 5, 5, 0, 5, 5, 5, 5, 5, 4, 4], 0, 9)
  //=> [
         [0, 6, 6, 6, 0, 5, 5, 5, 5, 0, 5, 5],
         3
       ]

sow([0, 6, 6, 6, 0, 5, 5, 5, 5, 0, 5, 5], 0, 3)
  //=> [
         [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5],
         9
       ]

We get to the same place, but we can run our “tests” in any order and don’t have to do any set up in advance.
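One way to convince ourselves that the rewritten functions leave their input alone is to freeze the input before calling them. This is a sketch rather than part of the original code: a frozen array cannot be changed, so sow can only produce the expected answer by working on copies. The functions are repeated here so that the example runs on its own, and the names frozenStart, afterFirstSow and lastPitOfFirstSow are made up for the illustration.

```javascript
function scoop(before, fromPit) {
  const after = before.slice(0);
  const stonesInHand = after[fromPit];

  after[fromPit] = 0;

  return [after, stonesInHand];
}

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(before, skipPit, currentPit, stonesInHand) {
  const after = before.slice(0);

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++after[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return [after, currentPit];
  } else if (stonesInHand > 0) {
    return distribute(after, skipPit, nextPit(currentPit), stonesInHand);
  }
}

function sow(before, skipPit, fromPit = skipPit) {
  const [after, stonesInHand] = scoop(before, fromPit);

  return distribute(after, skipPit, nextPit(fromPit), stonesInHand);
}

// slice returns an ordinary, unfrozen copy, so only the input is protected.
const frozenStart = Object.freeze(Array(12).fill(4));
const [afterFirstSow, lastPitOfFirstSow] = sow(frozenStart, 0);

afterFirstSow
  //=> [0, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4]

frozenStart
  //=> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
```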

The lesson learned is that decoupling functions from shared mutable values makes them easier to understand and test. And if we want to model the “real world” by having a value representing the current game state, we can still do that:

let gamePits = Array(12).fill(4);

let initialChoice = 0;
let lastPit;

[gamePits, lastPit] = sow(gamePits, initialChoice);
[gamePits, lastPit] = sow(gamePits, initialChoice, lastPit);
[gamePits, lastPit] = sow(gamePits, initialChoice, lastPit);
[gamePits, lastPit] = sow(gamePits, initialChoice, lastPit);

gamePits[lastPit]
  //=> 1

gamePits
  //=> [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5]

We’re now ready to automate the “relay sowing.”


Relayer is an album by Prog Rock progenitors Yes. It is notable for being the first album they recorded after the departure of Rick Wakeman, who wanted to break from the emphasis on long-form compositions.


relay sowing

We have implemented “relay sowing” by hand above. It’s time to write a relaySow function. We could modify sow, but accreting new functionality on top of old is a recipe for bloat.

How do we start relaySow? Well, with sow:

function relaySow(before, skipPit, currentPit = skipPit) {
  let [after, lastPit] = sow(before, skipPit, currentPit);

  // what happens next?
}

Well, what does happen next? I think we know the answer: We need to check and see whether the number of stones in the last pit is one or not:

function relaySow(before, skipPit, currentPit = skipPit) {
  let [after, lastPit] = sow(before, skipPit, currentPit);

  if (after[lastPit] === 1) {
    return [after, lastPit];
  } else {
    // now what?
  }
}

“Now what,” indeed. Now what? We can’t just call sow again and return the result, that won’t work if we end up needing to sow a third time. We’d need all sorts of nested if statements. Yecch.

We could rewrite relaySow to have a loop of some kind, maybe do { ... } while (...). But hang on! With distribute, we recursively called distribute when there was more work to be done. Let’s use the same pattern:

function relaySow(before, skipPit, currentPit = skipPit) {
  let [after, lastPit] = sow(before, skipPit, currentPit);

  if (after[lastPit] === 1) {
    return [after, lastPit];
  } else {
    return relaySow(after, skipPit, lastPit);
  }
}

let gamePits = Array(12).fill(4);
let lastPit;

[gamePits, lastPit] = relaySow(gamePits, 0);

lastPit
  //=> 9

gamePits
  //=> [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5]
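Because every function now takes the board and returns a new one, cheap sanity checks become easy to write. The sketch below checks that relay sowing neither creates nor destroys stones, whichever pit the first player starts from. totalStones is a hypothetical helper; the other functions are repeated from above so that the example runs on its own.

```javascript
function scoop(before, fromPit) {
  const after = before.slice(0);
  const stonesInHand = after[fromPit];

  after[fromPit] = 0;

  return [after, stonesInHand];
}

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(before, skipPit, currentPit, stonesInHand) {
  const after = before.slice(0);

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++after[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return [after, currentPit];
  } else if (stonesInHand > 0) {
    return distribute(after, skipPit, nextPit(currentPit), stonesInHand);
  }
}

function sow(before, skipPit, fromPit = skipPit) {
  const [after, stonesInHand] = scoop(before, fromPit);

  return distribute(after, skipPit, nextPit(fromPit), stonesInHand);
}

function relaySow(before, skipPit, currentPit = skipPit) {
  const [after, lastPit] = sow(before, skipPit, currentPit);

  if (after[lastPit] === 1) {
    return [after, lastPit];
  } else {
    return relaySow(after, skipPit, lastPit);
  }
}

// Hypothetical helper: the total number of stones on a board.
function totalStones(pits) {
  return pits.reduce((sum, stones) => sum + stones, 0);
}

const startingBoard = Array(12).fill(4);

// One total per possible starting pit, 0 through 11.
const totalsAfterRelaySowing = startingBoard.map(
  (_, startPit) => totalStones(relaySow(startingBoard, startPit)[0])
);

totalsAfterRelaySowing.every(total => total === 48)
  //=> true
```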

We have relay sowing working. What else do we need to model a player’s turn?


Game of Go at Club Saarto

Go is a completely different game from the Mancala family, but they share the notion of capturing stones.


capturing stones

Relay sowing ends when the last stone is placed in an empty hole. In Ayoayo, if that empty hole is on the player’s side, they capture all of the stones in the corresponding hole on the opponent’s side:

function capturedStones(before, startPit, endPit) {
  const endedOnThePlayersSide = startPit < 6 === endPit < 6;

  if (endedOnThePlayersSide) {
    const pitOnOpponentsSide = 11 - endPit;
    const after = before.slice(0);
    const stones = after[pitOnOpponentsSide];

    after[pitOnOpponentsSide] = 0;

    return [after, stones];
  } else {
    return [before, 0];
  }
}
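Two details of capturedStones are easy to misread, so here is a small sketch that isolates them. The helper names sameSide and oppositePit are assumptions introduced for illustration. In JavaScript, < binds more tightly than ===, so startPit < 6 === endPit < 6 compares two booleans. And because the code treats pit 11 - n as the one facing pit n, pit 0 faces pit 11 and pit 5 faces pit 6.

```javascript
// Hypothetical helper: true when both pits are in the same player's row.
// The parentheses are redundant, but they make the precedence explicit.
function sameSide(pitA, pitB) {
  return (pitA < 6) === (pitB < 6);
}

// Hypothetical helper: the pit directly across the board.
function oppositePit(pit) {
  return 11 - pit;
}

const facingPits = [0, 1, 2, 3, 4, 5].map(pit => [pit, oppositePit(pit)]);

facingPits
  //=> [[0, 11], [1, 10], [2, 9], [3, 8], [4, 7], [5, 6]]

sameSide(0, 9)
  //=> false

sameSide(3, 0)
  //=> true
```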

Let’s try it:

const gameStart = Array(12).fill(4);
const startPit = 0;

const [afterSowing, endPit] = relaySow(gameStart, startPit);
const [afterTurn, captured] = capturedStones(afterSowing, startPit, endPit);

afterTurn
  //=> [0, 6, 6, 0, 1, 6, 6, 6, 6, 1, 5, 5]

endPit
  //=> 9

captured
  //=> 0

No stones were captured, because the last pit was pit nine on the other side. How can we get a test case that captures stones? Should we try the numbers at random or in sequence? We don’t have to. The starting position of the board is symmetrical under rotation. So if move 0 ends up with the board in a certain layout after sowing, and ending on pit 9, it follows that choosing 1 should lead to exactly the same layout, but rotated one pit counter-clockwise, and the last pit will be 10.

Well this makes things easy. If the player chooses to start with 3, the last pit should be 0, on the player’s own side:

const gameStart = Array(12).fill(4);
const startPit = 3;

const [afterSowing, endPit] = relaySow(gameStart, startPit);
const [afterTurn, captured] = capturedStones(afterSowing, startPit, endPit);

afterSowing
  //=> [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 6]

afterTurn
  //=> [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 0]

endPit
  //=> 0

captured
  //=> 6

state of the board after the first player chooses pit three

The state of the board after the first player chooses pit three.
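The rotation argument can also be checked mechanically. The sketch below relay sows from each of the twelve pits on a fresh board and compares the result with the pit 0 result rotated by the same amount. rotate is a hypothetical helper; the sowing functions are repeated from above so that the example runs on its own.

```javascript
function scoop(before, fromPit) {
  const after = before.slice(0);
  const stonesInHand = after[fromPit];

  after[fromPit] = 0;

  return [after, stonesInHand];
}

function nextPit(pit) {
  return (pit + 1) % 12;
}

function distribute(before, skipPit, currentPit, stonesInHand) {
  const after = before.slice(0);

  if (currentPit === skipPit) {
    currentPit = nextPit(currentPit);
  }

  ++after[currentPit];
  --stonesInHand;

  if (stonesInHand === 0) {
    return [after, currentPit];
  } else if (stonesInHand > 0) {
    return distribute(after, skipPit, nextPit(currentPit), stonesInHand);
  }
}

function sow(before, skipPit, fromPit = skipPit) {
  const [after, stonesInHand] = scoop(before, fromPit);

  return distribute(after, skipPit, nextPit(fromPit), stonesInHand);
}

function relaySow(before, skipPit, currentPit = skipPit) {
  const [after, lastPit] = sow(before, skipPit, currentPit);

  if (after[lastPit] === 1) {
    return [after, lastPit];
  } else {
    return relaySow(after, skipPit, lastPit);
  }
}

// Hypothetical helper: shifts every pit's contents `by` places
// counter-clockwise, so the stones in pit i end up in pit (i + by) % 12.
function rotate(pits, by) {
  return pits.map((_, pit) => pits[(pit - by + 12) % 12]);
}

const freshBoard = Array(12).fill(4);
const [boardFromZero, lastPitFromZero] = relaySow(freshBoard, 0);

const symmetryHolds = freshBoard.every((_, startPit) => {
  const [board, lastPit] = relaySow(freshBoard, startPit);
  const expectedBoard = rotate(boardFromZero, startPit);
  const expectedLastPit = (lastPitFromZero + startPit) % 12;

  return lastPit === expectedLastPit &&
    board.every((stones, pit) => stones === expectedBoard[pit]);
});

symmetryHolds
  //=> true
```

The symmetry only holds for the opening position, where every pit holds the same number of stones.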


If we want to update the game state, we’ll need a notion of a “score,” and we can change capturedStones into handleCaptures:

function ownerOf(pit) {
  return pit > 5 ? 1 : 0;
}

function handleCaptures(beforeBoard, beforeScore, startPit, endPit) {
  const endedOnThePlayersSide = startPit < 6 === endPit < 6;

  if (endedOnThePlayersSide) {
    const pitOnOpponentsSide = 11 - endPit;
    const afterBoard = beforeBoard.slice(0);
    const playerNumber = ownerOf(startPit);
    // Copy the score into a fresh object so that beforeScore is not mutated.
    const afterScore = Object.assign(
      {},
      beforeScore,
      { [playerNumber]: beforeScore[playerNumber] + beforeBoard[pitOnOpponentsSide] }
    );
    afterBoard[pitOnOpponentsSide] = 0;

    return [afterBoard, afterScore];
  } else {
    return [beforeBoard, beforeScore];
  }
}

const gameStart = Array(12).fill(4);
const scoreStart = { 0: 0, 1: 0 };
const startPit = 3;

const [afterSowing, endPit] = relaySow(gameStart, startPit);
const [afterTurn, scoreAfter] = handleCaptures(afterSowing, scoreStart, startPit, endPit);

afterTurn
  //=> [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 0]

scoreAfter
  //=> {0: 6, 1: 0}

Finally, we can assemble sowing and capturing together:

function sowAndCapture(beforeBoard, beforeScore, startPit) {
  const [afterSowing, endPit] = relaySow(beforeBoard, startPit);
  const [afterTurn, scoreAfter] = handleCaptures(afterSowing, beforeScore, startPit, endPit);

  return [afterTurn, scoreAfter];
}

const gameStart = Array(12).fill(4);
const scoreStart = { 0: 0, 1: 0 };
const startPit = 3;

const [afterTurn, scoreAfter] = sowAndCapture(gameStart, scoreStart, startPit);

afterTurn
  //=> [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 0]

scoreAfter
  //=> {0: 6, 1: 0}

We now have almost everything we need to referee a game between two players. Not enough to write a program to play on its own, but we almost have enough to get going on—for example—a web site that would allow players to play each other online. So what are we missing?

Two things: Determining when the game is over, and determining which moves are permissible. The former depends upon the latter, so our next step is to determine which moves a player is allowed to make.


no entry?

In Ayoayo, some moves are permitted, and some are prohibited.


permissible moves

In working out which moves are permissible, we need to think of only three rules. First, a player can only choose one of the six pits on their side. We’ll call this the set of potential moves. Second, a player can only choose a potential move if that pit has one or more stones in it. We’ll call the potential moves whose pits contain stones the set of possible moves:

function pitsBelongingTo(player) {
  if (player === 0) {
    return [0, 1, 2, 3, 4, 5];
  } else {
    return [6, 7, 8, 9, 10, 11];
  }
}

Now what about filtering the potential moves down to those that are possible?

function possible(pits, pit) {
  return pits[pit] > 0;
}

function possibleMoves(pits, moves) {
  if (moves.length === 0) {
    return moves;
  } else {
    const [first, ...rest] = moves;

    if (possible(pits, first)) {
      return [first].concat(possibleMoves(pits, rest));
    } else {
      return possibleMoves(pits, rest);
    }
  }
}

We can see this in action from our previous work:

const gameStart = Array(12).fill(4);
const scoreStart = { 0: 0, 1: 0 };
const startPit = 3;

const [afterSowing, endPit] = relaySow(gameStart, startPit);
const [afterTurn, scoreAfter] = handleCaptures(afterSowing, scoreStart, startPit, endPit);
const potentialMovesForPlayerOne = pitsBelongingTo(1);
const possibleMovesForPlayerOne = possibleMoves(afterTurn, potentialMovesForPlayerOne);

afterTurn
  //=> [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 0]

possibleMovesForPlayerOne
  //=> [7, 8, 9, 10]
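The recursive possibleMoves does the same job as the built-in Array.prototype.filter. As a sketch for comparison (possibleMovesWithFilter is a hypothetical name, and the board is the position reached above):

```javascript
function possible(pits, pit) {
  return pits[pit] > 0;
}

// Hypothetical alternative to the recursive possibleMoves, built on
// Array.prototype.filter. It returns a new array and leaves `moves` untouched.
function possibleMovesWithFilter(pits, moves) {
  return moves.filter(pit => possible(pits, pit));
}

const boardAfterFirstTurn = [1, 5, 5, 0, 6, 6, 0, 1, 6, 6, 6, 0];

possibleMovesWithFilter(boardAfterFirstTurn, [6, 7, 8, 9, 10, 11])
  //=> [7, 8, 9, 10]
```

The recursive version is kept in the text because the same [first, ...rest] shape is reused for the functions that follow.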

We’ve covered rules one and two. Now how about rule three?


three buffalo

Three buffalo. Aren’t they magnificent?


implementing the third rule, with a digression into optimization

The third rule for whether a move is permissible is far more interesting than the first two:

If a player ends his or her turn with no seeds left in his or her row, the opponent must (if it is possible) choose his move in such a way to bring one or more seeds into the other’s row. This scheme is found in many Mancala games, and sometimes referred to as “feeding” the opponent (i.e., save the opponent from starving).

As expressed, this rule is not equivalent to the simpler, “If you can make a move that leaves your opponent with a move, you must.” This rule only applies to the situation where the opponent’s play leaves them with no stones/seeds on their row. Fine, let’s code that.

The first and obvious thing to take into account is that this has no effect if the opponent has one or more pits with stones in them. For that, we’ll need a function to determine whether a player has at least one possible move. We’ll use the [first, ...rest] recursive pattern from above.

After all, a player has at least one possible move out of some set of moves if the first move is possible or if there is at least one possible move out of the remaining moves, right? Here’s our first crack at it. atLeastOnePossibleMove works out which row belongs to a particular player, and then calls atLeastOne, which is recursive in much the same way that possibleMoves was recursive, breaking the list of possible moves down into first and ...rest as it goes:

function atLeastOne(pits, moves) {
  if (moves.length === 0) {
    return false;
  } else {
    const [first, ...rest] = moves;

    return pits[first] > 0 || atLeastOne(pits, rest);
  }
}

function atLeastOnePossibleMove(pits, player) {
  const moves = pitsBelongingTo(player);

  return atLeastOne(pits, moves);
}

Since we’re already familiar with how this works, we can take a moment and consider some of the ways it could be better.

One of the issues with [first, ...rest] = moves is that we’re not just getting the first element of the array, we’re also copying all the other elements of the array into rest. That’s insignificant for five elements, but as a pattern, it can get expensive if we start dealing with really big arrays. And since we do it for every element, we end up performing approximately n-squared over two copies.
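To put a number on that, here is a sketch that tallies how many elements the [first, ...rest] pattern copies while walking an entire list, which is the worst case where nothing short-circuits. countCopiedElements is a hypothetical helper written only for this tally.

```javascript
// Hypothetical helper: walks the whole list with [first, ...rest] and adds up
// the size of every `rest` array created along the way.
function countCopiedElements(moves) {
  if (moves.length === 0) {
    return 0;
  } else {
    const [first, ...rest] = moves;

    return rest.length + countCopiedElements(rest);
  }
}

countCopiedElements([0, 1, 2, 3, 4, 5])
  //=> 15

countCopiedElements(Array.from({ length: 100 }, (_, i) => i))
  //=> 4950
```

For a list of n elements the tally is n(n - 1)/2: fifteen for a six-pit row, but 4,950 for a hundred elements.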

That being said, our code is safe. Let’s say we rewrite atLeastOne to look like this:

function atLeastOne(pits, moves) {
  if (moves.length === 0) {
    return false;
  } else {
    const first = moves.pop();

    return pits[first] > 0 || atLeastOne(pits, moves);
  }
}

Now it destructively modifies moves. Which actually works, and gets rid of the copies! The reason it is fine is that every time pitsBelongingTo is called, it generates a new array for moves, so modifying that array doesn’t break the code.

But one day, somebody has an idea: Why are we making new arrays every time we call pitsBelongingTo? If they don’t look too closely at atLeastOne, they may think this is wasteful. So they write this optimization:

const PLAYER_TO_ROW = {
  0: [0, 1, 2, 3, 4, 5],
  1: [6, 7, 8, 9, 10, 11]
};

function pitsBelongingTo(player) {
  return PLAYER_TO_ROW[player];
}

Now atLeastOnePossibleMove doesn’t need to create a new array every time it’s called. And if we were using the old version of atLeastOne, it would work. But with our new code that modifies the array it is given, the arrays within PLAYER_TO_ROW get modified, and you can’t call atLeastOnePossibleMove more than once.

We now have two optimizations. Either one by themselves works, but if they’re both implemented together, our code is broken!
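Here is the failure in miniature, as a sketch that combines the two optimizations exactly as written above. The only invented part is sparseBoard, a hypothetical position in which pit 0 is the only pit holding a stone.

```javascript
const PLAYER_TO_ROW = {
  0: [0, 1, 2, 3, 4, 5],
  1: [6, 7, 8, 9, 10, 11]
};

function pitsBelongingTo(player) {
  return PLAYER_TO_ROW[player];
}

function atLeastOne(pits, moves) {
  if (moves.length === 0) {
    return false;
  } else {
    const first = moves.pop();

    return pits[first] > 0 || atLeastOne(pits, moves);
  }
}

function atLeastOnePossibleMove(pits, player) {
  const moves = pitsBelongingTo(player);

  return atLeastOne(pits, moves);
}

// Hypothetical position: only pit 0 holds a stone.
const sparseBoard = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

const firstAnswer = atLeastOnePossibleMove(sparseBoard, 0);
const secondAnswer = atLeastOnePossibleMove(sparseBoard, 0);

firstAnswer
  //=> true

secondAnswer
  //=> false, because the first call popped every pit out of PLAYER_TO_ROW[0]
```

The board never changed between the two calls, yet the answer did.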

The general shape of this problem is a dependency between the two functions. We call this a “resource ownership” problem. Both functions think they “own” the array and are entitled to make assumptions about what they can do with it and what they can expect from other functions.

Some other languages have special types and other features to establish constraints, so that if a function expects to modify an array it is given, the compiler can detect whether a piece of code giving it an array expects that array to remain untouched.

In JavaScript, we have a couple of options. One is to Object.freeze() PLAYER_TO_ROW:

const PLAYER_TO_ROW = Object.freeze({
  0: Object.freeze([0, 1, 2, 3, 4, 5]),
  1: Object.freeze([6, 7, 8, 9, 10, 11])
});

function pitsBelongingTo(player) {
  return PLAYER_TO_ROW[player];
}

That prevents the situation where one piece of code thinks the array is immutable, and another tries to mutate it. When code calls pop on a frozen array, JavaScript throws a TypeError (the wording varies by engine; one example is “Unable to delete property.”). And frankly, this is an excellent practice whenever we have entities we think are immutable.
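A small sketch of what that failure looks like in practice. The names FROZEN_ROW and caughtError are made up for illustration, and the only assumption is a standards-conforming engine, where calling pop on a frozen, non-empty array throws a TypeError whose message text differs from engine to engine.

```javascript
const FROZEN_ROW = Object.freeze([0, 1, 2, 3, 4, 5]);

let caughtError = null;

try {
  // pop has to delete the last element, which a frozen array forbids.
  FROZEN_ROW.pop();
} catch (error) {
  caughtError = error;
}

caughtError instanceof TypeError
  //=> true

FROZEN_ROW.length
  //=> 6
```

The mistake surfaces at the moment of the mutation, rather than later as a mysteriously empty row.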

But although it doesn’t apply here, sometimes function A does want some object, O, to be mutable for its own purposes, but when it passes O to function B, it doesn’t expect B to mutate it. Object.freeze(O) is not helpful, A wants to mutate O itself, it just doesn’t want anybody else to mutate O.

What to do?


Ownership Key by Mike Lawrence


managing ownership of mutable data structures with encapsulation

Here’s one pattern that solves both scenarios.

First, we should look at a function like atLeastOne and think about its contract with other functions. It takes an array of moves as a parameter, and reports something else. Clearly, it is not being asked to change moves. That’s an implementation detail. But we want to avoid repeatedly copying arrays for performance purposes.

The solution in many cases is to copy the array, but only once. Something like this:

function atLeastOne(pits, moves) {
  const disposableMoves = moves.slice(0);

  return recursiveAtLeastOne(pits, disposableMoves);
}

function recursiveAtLeastOne(pits, disposableMoves) {
  if (disposableMoves.length === 0) {
    return false;
  } else {
    const first = disposableMoves.pop();

    return pits[first] > 0 || recursiveAtLeastOne(pits, disposableMoves);
  }
}

Now we can call atLeastOne with an array and never worry about whether it will mutate the array, because it makes a copy and passes the copy to recursiveAtLeastOne.

But now we have a different problem. What if someone tries to invoke recursiveAtLeastOne? We’re relying on a naming convention to remind them not to trust it with an array. That’s no good. A better solution is to encapsulate recursiveAtLeastOne so that it isn’t available to any other function.

One way to do that is with an ES6 module. But a simpler way, good enough for our purposes, is to use function scope, like this:

function atLeastOne(pits, moves) {
  const disposableMoves = moves.slice(0);

  return recursiveAtLeastOne(pits, disposableMoves);

  function recursiveAtLeastOne(pits, disposableMoves) {
    if (disposableMoves.length === 0) {
      return false;
    } else {
      const first = disposableMoves.pop();

      return pits[first] > 0 || recursiveAtLeastOne(pits, disposableMoves);
    }
  }
}

By hiding recursiveAtLeastOne inside of atLeastOne, we prevent other functions from calling it. They can only call atLeastOne, which has the same behaviour as the original version–it doesn’t mutate the array passed in–but makes only one copy.

And if we have no use for atLeastOne other than atLeastOnePossibleMove calling it, we can nest it inside atLeastOnePossibleMove:

function atLeastOnePossibleMove(pits, player) {
  const moves = pitsBelongingTo(player);

  return atLeastOne(pits, moves);

  function atLeastOne(pits, moves) {
    const disposableMoves = moves.slice(0);

    return recursiveAtLeastOne(pits, disposableMoves);

    function recursiveAtLeastOne(pits, disposableMoves) {
      if (disposableMoves.length === 0) {
        return false;
      } else {
        const first = disposableMoves.pop();

        return pits[first] > 0 || recursiveAtLeastOne(pits, disposableMoves);
      }
    }
  }
}

This is beginning to look like a Turducken. If atLeastOnePossibleMove ends up owning atLeastOne, we don’t need to make it safe, so let’s revert to:

function atLeastOnePossibleMove(pits, player) {
  const disposableMoves = pitsBelongingTo(player).slice(0);

  return atLeastOne(pits, disposableMoves);

  function atLeastOne(pits, disposableMoves) {
    if (disposableMoves.length === 0) {
      return false;
    } else {
      const first = disposableMoves.pop();

      return pits[first] > 0 || atLeastOne(pits, disposableMoves);
    }
  }
}

The upside is that we do far less copying. The downside is that atLeastOne is no longer nicely decoupled from itself. But since it is now an implementation detail of atLeastOnePossibleMove, that probably doesn’t matter. This pattern comes up a fair bit: Start with the simplest code, and refactor to faster code when the profiler shows you that it matters.

But this optimization is for illustration purposes only, there are far, far faster ways to write the code if performance is all that matters.4
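As one example of a simpler alternative, which may or may not be what the author has in mind, the whole check can be written with the built-in Array.prototype.some. This sketch reads the row without copying it or mutating it, so it is also safe to combine with the frozen PLAYER_TO_ROW. The board lateBoard is a hypothetical position chosen for the illustration.

```javascript
const PLAYER_TO_ROW = Object.freeze({
  0: Object.freeze([0, 1, 2, 3, 4, 5]),
  1: Object.freeze([6, 7, 8, 9, 10, 11])
});

function pitsBelongingTo(player) {
  return PLAYER_TO_ROW[player];
}

// Hypothetical alternative: some() stops at the first pit that holds a stone
// and never modifies the array it iterates over.
function atLeastOnePossibleMove(pits, player) {
  return pitsBelongingTo(player).some(pit => pits[pit] > 0);
}

const lateBoard = [1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0];

atLeastOnePossibleMove(lateBoard, 0)
  //=> true

atLeastOnePossibleMove(lateBoard, 1)
  //=> false
```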

What we’ve really learned is that when we have an implementation detail, we want to hide that detail from the rest of the code. That’s what encapsulation is all about, and we don’t always need objects, methods, or modules to do it: Sometimes a nested function is exactly the right tool for the encapsulation job.


Blendy Coffee Filter by Jonathan Lin


returning to permissible moves

Now let’s move on and get started on permissibleFilter:

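// Assumed helper: otherPlayerFor is used below but never defined in this text.
function otherPlayerFor(player) {
  return player === 0 ? 1 : 0;
}

function permissibleFilter(pits, moves) {
  // the degenerate case
  if (moves.length === 0) {
    return moves;
  }

  const firstMove = moves[0];
  const otherPlayer = otherPlayerFor(ownerOf(firstMove));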
  const otherPlayerHasStones =
    atLeastOnePossibleMove(pits, otherPlayer);

  if (otherPlayerHasStones) {
    return moves;
  }

  // Ok, what do we do here?
}

We’ve handled the obvious: If the other player has stones on their side of the board, any of the supplied moves are permissible. But if they don’t have any stones on their side of the board, we have to figure out if any of the moves will “feed” them by placing at least one stone on their side of the board.

Sounds like we need another filter function:

function movesThatFeedTheOtherPlayer(before, moves) {
  // the degenerate case
  if (moves.length === 0) {
    return moves;
  }

  const [first, ...rest] = moves;
  const irrelevantScore = { 0: 0, 1: 0 };
  const [after] = sowAndCapture(before, irrelevantScore, first);
  const otherPlayer = otherPlayerFor(ownerOf(first));
  const otherPlayerHasStonesAfter =
    atLeastOnePossibleMove(after, otherPlayer);

  if (otherPlayerHasStonesAfter) {
    return [first].concat(movesThatFeedTheOtherPlayer(before, rest));
  } else {
    return movesThatFeedTheOtherPlayer(before, rest);
  }
}

const lateGameBoard = [1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0];
const potentialMovesForPlayerZero = pitsBelongingTo(0);
const possibleMovesForPlayerZero = possibleMoves(lateGameBoard, potentialMovesForPlayerZero);
const movesThatFeedPlayerOne = movesThatFeedTheOtherPlayer(lateGameBoard, possibleMovesForPlayerZero);

movesThatFeedPlayerOne
  //=> [2]

And now we can finish permissibleFilter, then put it all together into permissibleMoves:

function permissibleFilter(pits, moves) {
  // the degenerate case
  if (moves.length === 0) {
    return moves;
  }

  const firstMove = moves[0];
  const otherPlayer = otherPlayerFor(ownerOf(firstMove));
  const otherPlayerHasStones =
    atLeastOnePossibleMove(pits, otherPlayer);

  if (otherPlayerHasStones) {
    return moves;
  } else {
    const permissibleMoves = movesThatFeedTheOtherPlayer(pits, moves);

    if (permissibleMoves.length > 0) {
      return permissibleMoves;
    } else {
      // slightly irrelevant, as the game is about to end
      // no matter which move is chosen
      return moves;
    }
  }
}

function permissibleMoves(before, player) {
  const potentialMovesForPlayer = pitsBelongingTo(player);
  const possibleMovesForPlayer = possibleMoves(before, potentialMovesForPlayer);
  const permissibleMovesForPlayer = permissibleFilter(before, possibleMovesForPlayer);

  return permissibleMovesForPlayer;
}

const lateGameBoard = [1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0];

permissibleMoves(lateGameBoard, 0)
  //=> [2]

Whew! We’re almost done writing the things we would need to handle game mechanics if we were writing an Ayoayo game. The final thing is the end condition. If a player has no stones in their row at the beginning of their turn, their opponent collects all of the stones left on the board, and the game ends.

Let’s implement that.


A whole new view of the Crab Nebula

Everything ends, not just games, but even stars: In 1054 AD, during the Song dynasty, Chinese astronomers spotted a bright new star in the night sky. This newcomer turned out to be a violent explosion within the Milky Way, caused by the spectacular death of a star some 1600 light-years away. This explosion created one of the most well-studied and beautiful objects in the night sky–the Crab Nebula. (More information.)


ending the game

We said above that “If a player has no stones in their row at the beginning of their turn, their opponent collects all of the stones left on the board, and the game ends.” That is equivalent to saying that “If, after making a permissible move, your opponent’s row is empty, you capture all of the stones in your row, and then the game ends.”

Put that way, we don’t need a brand new function, we can update handleCaptures. That function is already big enough and has one clear responsibility, handling the ordinary kind of capture. So let’s extract that and make it a helper, then write a new function to handle captures after completing a turn:

function handleCapturesInTurn(beforeBoard, beforeScore, startPit, endPit) {
  const endedOnThePlayersSide = startPit < 6 === endPit < 6;

  if (endedOnThePlayersSide) {
    const pitOnOpponentsSide = 11 - endPit;
    const afterBoard = beforeBoard.slice(0);
    const playerNumber = ownerOf(startPit);
    // copy into a fresh object so the caller's score is not mutated
    const afterScore = Object.assign(
      {},
      beforeScore,
      { [playerNumber]: beforeScore[playerNumber] + beforeBoard[pitOnOpponentsSide] }
    );
    afterBoard[pitOnOpponentsSide] = 0;

    return [afterBoard, afterScore];
  } else {
    return [beforeBoard, beforeScore];
  }
}

function handleGameEnd(beforeBoard, beforeScore, playerWhoJustMoved) {
  // Write function here!
}

function handleCaptures(beforeBoard, beforeScore, startPit, endPit) {
  const [afterCapturesInTurn, afterScoreInTurn] =
    handleCapturesInTurn(beforeBoard, beforeScore, startPit, endPit);
  const playerWhoJustMoved = ownerOf(startPit);
  const [afterTurn, afterScore] = handleGameEnd(afterCapturesInTurn, afterScoreInTurn, playerWhoJustMoved);
  const isOver = (afterScore[0] + afterScore[1]) === 48;

  return [afterTurn, afterScore, isOver];
}

function sowAndCapture(beforePits, beforeScore, startPit) {
  const [afterSowing, endPit] = relaySow(beforePits, startPit);
  const [afterTurn, scoreAfter, isOver] = handleCaptures(afterSowing, beforeScore, startPit, endPit);

  return [afterTurn, scoreAfter, isOver];
}

And now let’s write handleGameEnd. For starters, we check whether the other player has moves. If they do, there is no effect:

function otherPlayerFor(player) {
  return 1 - player;
}

function handleGameEnd(beforeBoard, beforeScore, playerWhoJustMoved) {
  const otherPlayer = otherPlayerFor(playerWhoJustMoved);
  const otherPlayerHasMoves = atLeastOnePossibleMove(beforeBoard, otherPlayer);

  if (otherPlayerHasMoves) {
    return [beforeBoard, beforeScore];
  } else {
    // Capture everything else!
  }
}

If they don’t, we capture every stone on the player’s side. We can use our encapsulation pattern to avoid [first, rest]:

function handleGameEnd(beforeBoard, beforeScore, playerWhoJustMoved) {
  const otherPlayer = otherPlayerFor(playerWhoJustMoved);
  const otherPlayerHasMoves = atLeastOnePossibleMove(beforeBoard, otherPlayer);

  if (otherPlayerHasMoves) {
    return [beforeBoard, beforeScore];
  } else {
    const disposablePits = pitsBelongingTo(playerWhoJustMoved).slice(0);
    const stonesRemaining = countStones(beforeBoard, disposablePits);

    const playerWhoJustMovedScore = beforeScore[playerWhoJustMoved] + stonesRemaining;

    const finalScore = {
      [playerWhoJustMoved]: playerWhoJustMovedScore,
      [otherPlayer]: beforeScore[otherPlayer]
    };

    return [
      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
      finalScore
    ];
  }

  function countStones(board, disposablePits, runningTotal = 0) {
    if (disposablePits.length === 0) {
      return runningTotal;
    } else {
      // safe to mutate: the caller handed us its own copy of the pits
      const first = disposablePits.shift();

      return countStones(board, disposablePits, runningTotal + board[first]);
    }
  }
}

And we can do an “integration test,” running our code from end to end:

const lateGameBoard = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0];
const lateGameScore = { 0: 40, 1: 5 };

let [afterTurn, afterScore, isOver] = sowAndCapture(lateGameBoard, lateGameScore, 2);

afterTurn
  //=> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

afterScore
  //=> { 0: 43, 1: 5 }

isOver
  //=> true

const notOverBoard = [1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0];
const notOverScore = { 0: 39, 1: 5 };

[afterTurn, afterScore, isOver] = sowAndCapture(notOverBoard, notOverScore, 2);

afterTurn
  //=> [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]

afterScore
  //=> { 0: 39, 1: 5 }

isOver
  //=> false

Success!


taking stock of what we’ve learned

We’ve written a fair amount of code, and along the way taken a few digressions. Let’s take our “learnings” and put them together in a more structured fashion.

It’s time to talk about recursion a little more formally.


Wendeltreppe Cafe Glockenspiel by Renate Dodell


Ayoayo and Linear Recursion
what is linear recursion?

In programming, a recursive function is a function that invokes itself directly or indirectly. For example, a merge sort divides lists into two halves, then calls itself to sort each of the halves before calling merge to merge them back together in order.

function mergeSort (list) {
  const listLength = list.length;

  if (listLength <= 1) {
    return list;
  } else {
    const cutPoint =  Math.floor(listLength/2);

    const half = list.slice(0, cutPoint);
    const otherHalf = list.slice(cutPoint);

    const sortedHalf = mergeSort(half);
    const sortedOtherHalf = mergeSort(otherHalf);

    return merge(sortedHalf, sortedOtherHalf);
  }
}

function merge (...mutableLists) {
  return recursiveMerge(mutableLists, []);

  function recursiveMerge (mutableLists, mutableSoFar) {
    const listsWithElements = mutableLists.filter(l => l.length > 0);

    if (listsWithElements.length === 0) {
      return mutableSoFar;
    } else if (listsWithElements.length === 1) {
      return mutableSoFar.concat(listsWithElements[0]);
    } else {
      const firsts = listsWithElements.map(l => l[0]);
      const lowest = Math.min(...firsts);
      const indexOfLowest = firsts.indexOf(lowest);
      const listWithLowest = listsWithElements[indexOfLowest];

      listWithLowest.shift();
      mutableSoFar.push(lowest);

      return recursiveMerge(listsWithElements, mutableSoFar)
    }
  }
}

console.log(mergeSort([7, 1, 3, 4, 5, 0, 2, 6]))
  //=> [0, 1, 2, 3, 4, 5, 6, 7]

If we draw the call graph for mergeSort, it’s a tree with two branches per node. We say that it’s bi-recursive.

graph TD
  seven1345026["[7, 1, 3, 4, 5, 0, 2, 6]"] --> seven134["[7, 1, 3, 4]"]
  seven134 --> seven1["[7, 1]"]
  seven1 --> seven["[7]"]
  seven1 --> one["[1]"]
  seven134 --> three4["[3, 4]"]
  three4 --> three["[3]"]
  three4 --> four["[4]"]
  seven1345026["[7, 1, 3, 4, 5, 0, 2, 6]"] --> five026["[5, 0, 2, 6]"]
  five026 --> five0["[5, 0]"]
  five0 --> five["[5]"]
  five0 --> zero["[0]"]
  five026 --> two6["[2, 6]"]
  two6 --> two["[2]"]
  two6 --> six["[6]"]

But if we draw the call tree for any one invocation of recursiveMerge within merge, it’s a straight line:

graph TD
  a["[[1,3,4,7],[0,2,5,6]], []"] --> b
  b["[[1,3,4,7],[2,5,6]], [0]"] --> c
  c["[[3,4,7],[2,5,6]], [0,1]"] --> d
  d["[[3,4,7],[5,6]], [0,1,2]"] --> e
  e["[[4,7],[5,6]], [0,1,2,3]"] --> f
  f["[[7],[5,6]], [0,1,2,3,4]"] --> g
  g["[[7],[6]], [0,1,2,3,4,5]"] --> h["[[7],[]], [0,1,2,3,4,5,6]"]

A linearly recursive function is a function where each invocation invokes itself (directly or indirectly) at most once before returning. Therefore, recursiveMerge is a linearly recursive function.

Linear recursion is interesting, and here’s why…


Spiral Bridge of Young Stars Between Two Ancient Galaxies

Linear recursion is interesting, and so is this photograph showing a spiral bridge of young stars between two ancient galaxies.

At the center of the bull’s-eye of blue, gravitationally lensed filaments lies a pair of elliptical galaxies that are also exhibiting some interesting features. A 100,000-light-year-long structure that looks like a string of pearls twisted into a corkscrew shape winds around the cores of the two massive galaxies. The “pearls” are superclusters of blazing, blue-white, newly born stars. These super star clusters are evenly spaced along the chain at separations of 3,000 light-years from one another.


why linear recursion is interesting

Linear recursion is particularly interesting because it is computationally equivalent to iteration. Any function that is linearly recursive can be rewritten as a function that iterates but is not recursive. The converse is also true: Any piece of code that iterates can be rewritten as an invocation of a linearly recursive function that does not iterate.

There are important reasons why this equivalence matters to fundamental computer science theory: It shows that if we have functions and function invocation, we don’t need to add iteration to prove things about computability. But that isn’t what we’re going to look at here.

Consider a hypothetical for loop inside a function:

function doSomethingOrOther (collection) {
  let someVar = somethingOrOther();

  // something happens before the loop

  // frobbish the collection:
  for (const element of collection) {
    // something happens inside the loop to
    // frobbish the collection, element by element
  }

  // something happens after the loop
}

What kinds of things can happen inside the loop to “frobbish the collection?” Almost anything can happen inside the loop! Code can continue, jumping to the next iteration. Code can break from the loop, jumping straight to // something happens after the loop. Code can read variables from outside the loop, and also write to those variables.

Code inside loops can be very tightly coupled to the code outside of the loop. That’s kind of the whole point, code inside a loop is rarely thought of as being “separate” from the rest of the code in a function, so why shouldn’t it be tightly coupled with that code?

We’ve seen a lot of code so far, so let’s continue with this hypothetical and completely code-free example. Let’s imagine we refactor the code to extract the loop into a frobbish function:

function frobbish (collection, someVar) {
  // something happens inside the function to
  // frobbish the collection, element by element
};

function doSomethingOrOther (collection) {
  let someVar = somethingOrOther();

  // something happens before the loop

  // frobbish the collection:
  const frobbishedCollection =
    frobbish(collection, someVar);

  // something happens after the loop
}

We get an immediate win, and I don’t mean that our function is shorter: When we lose the linear flow of a function, it’s harder to read. We have to get something in exchange, and what we get is actually quite powerful: By extracting the function, it is now clear what our frobbish code reads (collection and someVar), and what it writes (just collection).

If we use a fancy editor that has an “extract helper function” refactoring capability, it can extract almost any chunk of code, but if the code was tightly coupled to the rest of the function, we’ll get four or five parameters going in, and another four or five coming out.

Whether we do it by hand or not, extracting the function takes what was always there–the dependencies–and makes them explicit. That can encourage us to rewrite the code to eliminate the dependencies, and if we don’t, at least it can help us understand what effect that code can have on the rest of the function.

But there’s more.

function frobbish (collection, someVar) {
  // something happens inside the function to
  // frobbish the collection, element by element
};

Code inside the function can become tightly coupled to itself. It can read and write anything it wants. It can set up some variables and modify them inside a loop. In fact, we can move all of the coupling that we wanted to get rid of into our extracted function.

But what if we refactor that function–which was designed to replicate a single piece of iteration–to be linearly recursive?

function frobbish (collection, someVar) {
  if (collection.length === 0) {
    // handle the degenerate case
  } else {
    const [first, ...rest] = collection;

    // frobbish the first element, then
    // combine it with frobbish(rest, someVar)
  }
};

Recall above where we espoused the value of pure functions as being decoupled from themselves. Doing the same with a linearly recursive function forces us to decouple frobbish from itself, making any dependencies explicit, and encouraging us to find ways to eliminate as many dependencies as possible.
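To make the skeleton concrete, here is a minimal sketch of one possible frobbish. The name scaleAll and its factor parameter are hypothetical, chosen only for illustration: factor plays the role of someVar, and everything the function reads arrives as a parameter.

```javascript
// A hypothetical instance of the frobbish skeleton: multiply every
// element by `factor`. Nothing is read from or written to an enclosing scope.
function scaleAll(collection, factor) {
  if (collection.length === 0) {
    // the degenerate case
    return [];
  } else {
    const [first, ...rest] = collection;

    // frobbish the first element, then combine it with the
    // result of frobbishing the rest
    return [first * factor].concat(scaleAll(rest, factor));
  }
}

console.log(scaleAll([1, 2, 3], 10));
  //=> [10, 20, 30]
```

Because the only inputs are collection and factor, and the only output is the returned array, there is nothing hidden for the calling function to become coupled to.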

In sum, a tremendous value of expressing code as linear recursion with dependency-free functions, rather than as iteration, is that it makes dependencies explicit. That in turn helps us eliminate them.

Now we could go off on a tangent about how to refactor iteration into linear recursion, or the reverse, how to refactor linear recursion into iteration. But let’s sum up what we’ve gathered from this exercise.


start-finish line


the finish line

We learned that while we can refactor functions with dependencies into functions without dependencies (via the “dependency injection” pattern), we can also use what we learned to write functions without dependencies in the first place. The important thing is not the specifics of the refactoring, but what we are trying to achieve.

It’s the same with linear recursion. The important thing is to understand what we’re trying to achieve: low coupling.

We don’t need to write everything with iteration and then refactor to linear recursion: We can just write it as linear recursion in the first place. When we take a “linear-recursion-first” approach, we naturally end up with code that has high cohesion and low coupling. Linear recursion makes that easy.

Of course, there are real tradeoffs. Constantly making copies of arrays is expensive: We have to select data structures that are less wasteful when we slice them up, or use patterns (like iterators) that allow us to write the equivalent code that doesn’t need to do things like copy arrays.
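As a rough sketch of the iterator idea, assuming nothing beyond the standard iteration protocol, a linearly recursive sum can consume an iterator instead of slicing an array. The name sumWithIterator is hypothetical:

```javascript
// Linear recursion over an iterator: no intermediate arrays are created.
function sumWithIterator(iterable) {
  return sumRemaining(iterable[Symbol.iterator](), 0);

  function sumRemaining(iterator, runningTotal) {
    const { done, value } = iterator.next();

    if (done) {
      return runningTotal;
    } else {
      return sumRemaining(iterator, runningTotal + value);
    }
  }
}

console.log(sumWithIterator([1, 2, 3, 4, 5, 6, 7]));
  //=> 28
console.log(sumWithIterator(new Set([10, 20, 30])));
  //=> 60
```

The trade here is that the iterator itself is stateful, so the sketch keeps it private to sumWithIterator rather than sharing it with other code.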

Or we have to learn how to write tail-recursive functions, and use a language implementation that optimizes our functions for us.

Or we have to get it working with linear recursion, and then refactor it back to iteration. This is very interesting, because if we write it with linear recursion, and then refactor it back to iteration, it will retain its uncoupled form. Of course, over time code can accrete and become coupled.
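Here is a small sketch of both options, using hypothetical function names. The first version is tail-recursive: the recursive call is the very last thing it does. The second is the same function refactored to a loop. Note that although proper tail calls are part of the ECMAScript 2015 specification, few JavaScript engines implement them, so the recursive version should not be assumed to run in constant stack space.

```javascript
// Tail-recursive: all state travels through the parameters.
function countStonesRecursively(board, pits, runningTotal = 0) {
  if (pits.length === 0) {
    return runningTotal;
  } else {
    const [first, ...rest] = pits;

    return countStonesRecursively(board, rest, runningTotal + board[first]);
  }
}

// Refactored to iteration: the accumulator parameter becomes the
// loop's only variable, and the function still touches nothing
// outside its own parameters.
function countStonesIteratively(board, pits) {
  let runningTotal = 0;

  for (const pit of pits) {
    runningTotal = runningTotal + board[pit];
  }

  return runningTotal;
}

const exampleBoard = [1, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0];
const playerZeroPits = [0, 1, 2, 3, 4, 5];

console.log(countStonesRecursively(exampleBoard, playerZeroPits));
  //=> 4
console.log(countStonesIteratively(exampleBoard, playerZeroPits));
  //=> 4
```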

These and other considerations are beyond the scope of this essay. But starting with linear recursion, and then refactoring to iterative code, is an excellent way to ensure that the iteration we do implement is clean and coupling-free.

And in the fullness of time, we will start writing iterative code that is clean and decoupled from the outset, thanks to our familiarity and practice with writing linearly recursive functions. Linear recursion–like dependency-free functions–is a powerful tool for learning what kinds of iterative code are going to be easiest to understand and work with.

THE END


code.close()


Appendix: The completed code
Notes
  1. Intuitive Equals Familiar, Jef Raskin, 1994 

  2. In “How to Deal With Dirty Side Effects in Your Pure Functional Javascript,” James Sinclair calls this Dependency Injection

  3. More precisely, the “moving parts” that we have to think about were always there. This particular implementation does also create a number of temporary arrays via .slice(...) and copy elements into them. That is an implementation detail that does not increase the complexity of the code to be read, but if we are very interested in optimizing the performance of our algorithms, there are ways to implement array slicing using Structural Sharing and Copy-on-Write Semantics

  4. Another pattern for getting rid of excessive copying is Structural Sharing and Copy-on-Write Semantics as mentioned above. 

https://raganwald.com/2019/02/03/ayoayo
Structural Sharing and Copy-on-Write Semantics, Part II: Reduce-Reuse-Recycle

This is Part II of an essay that takes a highly informal look at two related techniques for achieving high performance when using large data structures: Structural Sharing, and Copy-on-Write Semantics.

In Part I, we used recursive functions that operate on lists to explore how we could use Structural Sharing to write code that avoids making copies of objects while still retaining the semantics of code that makes copies.

Here in Part II, we’ll consider resource ownership, starting with using copy-on-write semantics to implement safe mutation while still preserving structural sharing.


The Canonization of Blessed John XXIII and Blessed John Paul II


a brief review of structural sharing

In Part I, we created the Slice class, with its static method Slice.of(...). Instances of Slice implement slices of JS arrays, but use structural sharing to keep the cost of creating a slice constant, rather than order-n:

//
// https://raganwald.com/2019/01/14/structural-sharing-and-copy-on-write.html
//

const SliceHandler = {
  has (slice, property) {
    if (property in slice) {
      return true;
    }

    if (typeof property === 'symbol') {
      return false;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);

      return slice.has(i);
    }

    const matchCarCdr = property.match(/^c([ad]+)r$/);
    if (matchCarCdr != null) {
      return true;
    }
  },

  get (slice, property) {
    if (property in slice) {
      return slice[property];
    }

    if (typeof property === 'symbol') {
      return;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);
      return slice.at(i);
    }

    const matchCarCdr = property.match(/^c([ad]+)r$/);
    if (matchCarCdr != null) {
      const [, accessorString] = matchCarCdr;
      const accessors = accessorString.split('').map(ad => `c${ad}r`);
      return accessors.reduceRight(
        (value, accessor) => Slice.of(value)[accessor],
        slice);
    }
  }
};

function normalizedFrom(arrayIsh, from = 0) {
    if (from < 0) {
      from = from + arrayIsh.length;
    }
    from = Math.max(from, 0);
    from = Math.min(from, arrayIsh.length);

    return from;
}

function normalizedLength(arrayIsh, from, length = arrayIsh.length) {
    from = normalizedFrom(arrayIsh, from);

    length = Math.max(length, 0);
    length = Math.min(length, arrayIsh.length - from);

    return length;
}

function normalizedTo(arrayIsh, from, to) {
    from = normalizedFrom(arrayIsh, from);

    to = Math.max(to, 0);
    to = Math.min(arrayIsh.length, to);

    return to;
}

class Slice {
  static of(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new Slice(object.array, object.from + from, to - from);
    }
    if (object instanceof Array) {
      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      return this.of([...object], from, to);
    }
  }

  constructor(array, from, length) {
    this.array = array;
    this.from = normalizedFrom(array, from);
    this.length = normalizedLength(array, from, length);

    return new Proxy(this, SliceHandler);
  }

  * [Symbol.iterator]() {
    const { array, from, length } = this;

    for (let i = 0; i < length; i++) {
      yield array[i + from];
    }
  }

  join(separator = ",") {
    const { array, from, length } = this;

    if (length === 0) {
      return '';
    } else {
      let joined = array[from];

      for (let i = 1; i < this.length; ++i) {
        joined = joined + separator + array[from + i];
      }

      return joined;
    }
  }

  toString() {
    return this.join();
  }

  slice (from, to = Infinity) {
    from = normalizedFrom(this, from);
    to = normalizedTo(this, from, to);

    return Slice.of(this.array, this.from + from, this.from + to);
  }

  has(i) {
    const { array, from, length } = this;

    if (i >= 0 && i < length) {
      return (from + i) in array;
    } else {
      return false;
    }
  }

  at(i) {
    const { array, from, length } = this;

    if (i >= 0 && i < length) {
      return array[from + i];
    }
  }

  concat(...args) {
    const { array, from, length } = this;

    // Array.prototype.slice takes an end index, not a length
    return Slice.of(array.slice(from, from + length).concat(...args));
  }

  get [Symbol.isConcatSpreadable]() {
    return true;
  }
}

function sum (array) {
  return sumOfSlice(Slice.of(array), 0);

  function sumOfSlice (remaining, runningTotal) {
    if (remaining.length === 0) {
      return runningTotal;
    } else {
      const first = remaining[0];
      const rest = Slice.of(remaining, 1);

      return sumOfSlice(rest, runningTotal + first);
    }
  }
}

const oneToSeven = [1, 2, 3, 4, 5, 6, 7];

sum(oneToSeven)
  //=> 28

This covers the basics, but let’s step back for a moment. In pure functional programming, data is immutable. This makes it much easier for humans and machines to reason about programs. We never have a pesky problem like passing an array to a sum function and having sum modify the array out from under us.

But when programming in a multi-paradigm environment, we need to accommodate code that is written around mutable data structures. In JavaScript, we can write:

const abasement = ['a', 'b', 'a', 's', 'e', 'm', 'e', 'n', 't'];

const bade = abasement.slice(1, 5);
bade[2] = 'd';

bade.join('')
  //=> "bade"

const bad = bade.slice(0, 3);

bad.join('')
  //=> "bad"

Modifying a slice of abasement does not modify the original array. But what happens with our Slice class? We haven’t done anything to handle modifying elements, so as it turns out, we can set properties but they don’t affect the underlying array that we use for things like further slices or joining:

let slice = Slice.of(abasement, 1, 5);
slice[2] = 'd';

slice.join('')
  //=> "base"

Just as we needed our proxy to mediate [...], we also need our proxy to mediate [...] =. Let’s make it so.


renovations


copy on write

In our structural sharing implementation, our slices depend upon the underlying array not mutating out from underneath them. Which seems to imply that the Slices themselves have to be immutable. But not so: The slices do not depend upon each other, but upon the underlying array.

This makes it possible for us to modify a slice without modifying any other slices that depend upon the slice’s original array… by making a copy of the slice’s array before we write to it:

const SliceHandler = {

  // ...

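  set(slice, property, value) {
    if (typeof property === 'string') {
      const matchInt = property.match(/^\d+$/);
      if (matchInt != null) {
        const i = parseInt(property);
        slice.atPut(i, value);

        // a set trap must report success with true; returning a
        // falsy assigned value would throw in strict mode
        return true;
      }
    }

    slice[property] = value;
    return true;
  }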
}

class Slice {

  // ...

  atPut (i, value) {
    const { array, from, length } = this;

    // Array.prototype.slice takes an end index, not a length
    this.array = array.slice(from, from + length);
    this.from = 0;
    this.length = this.array.length;

    return this.array[i] = value;
  }
}

const a1to5 = [1, 2, 3, 4, 5];
const oneToFive = Slice.of(a1to5);

oneToFive.atPut(2, 'three');
oneToFive[0] = 'uno';

a1to5
  //=> [1, 2, 3, 4, 5]
[...oneToFive]
  //=> ['uno', 2, 'three', 4, 5]

When an element of the slice is modified, the slice invokes .slice(...) on the underlying array, and switches to using the value returned as its new underlying array. It then performs the modification of the new array.

graph TD
  a["a1to5: [1, 2, 3, 4, 5]"]
  b["[1, 2, 'three', 4, 5]"]
  c["['uno', 2, 'three', 4, 5]"]
  b-. copy of .->a
  c-. copy of .->b
  d["oneToFive { from: 0, length: 5 }"]
  d-->|array:|c

This prevents the modification from affecting the original array, which may be shared by other slices, or by other code that expected it not to change.

This pattern is called copy on write. In effect, when we took a slice of the original array, we delayed making an actual copy until such time as we needed to make a copy to preserve the original array’s values. When do we actually need the copy? When we write to it, instead of reading from it.

And if we never write to it, we win “bigly” by never making copies. Before we go on to implement other methods like push, pop, unshift, and shift, let’s ask ourselves a question: Must we always make a fresh copy on every write?


Parts


smarter copying on write

Let’s reason about when we need to make a copy.

The first time we write something, we have to make a copy. The array that was passed to Slice in the constructor was provided by another piece of code. Given that we are emulating the protocol of Array.prototype.slice, that piece of code expects that we will not modify the array it passed to Slice.of. Absent a type system that understands mutable and immutable arrays, we must be conservative and assume that we should not modify the original.1

The first time we write to the array, we must make ourselves a copy.

What about after that? Well, after the first write, we have a new array that no other code shares (yet). So we can actually mutate it with abandon. Only when we share it with another piece of code must we revert to making a copy on writes. When do we share that array? When .slice is called, or if another object does a get on our array property.

We need to mediate other objects accessing our array with this scheme, so we’ll store it in a symbol property. That’s private enough to prevent accidental access. And if someone deliberately wants to break our encapsulation, there’s nothing we can do about a determined programmer with a commit bit anyways.

An updated version that only makes copies when necessary:

const arraySymbol = Symbol('array');

class Slice {

  // ...

  constructor(array, from, length) {
    this[arraySymbol] = array;
    this.from = normalizedFrom(array, from);
    this.length = normalizedLength(array, from, length);
    this.makeUnsafe();

    return new Proxy(this, SliceHandler);
  }

  get array() {
    this.makeUnsafe();

    return this[arraySymbol];
  }

  makeUnsafe () {
    this.safe = false;
  }

  makeSafe () {
    const { [arraySymbol]: array, from, length, safe } = this;

    if (!safe) {
      // Array.prototype.slice takes an end index, not a length
      this[arraySymbol] = array.slice(from, from + length);
      this.from = 0;
      this.length = this[arraySymbol].length;
      this.safe = true;
    }
  }

  atPut (i, value) {
    // makeSafe copies only when we might be sharing the array;
    // once we own it, further writes mutate it in place
    this.makeSafe();

    return this[arraySymbol][this.from + i] = value;
  }

  slice (from, to = Infinity) {
    this.makeUnsafe();

    from = normalizedFrom(this, from);
    to = normalizedTo(this, from, to);

    return Slice.of(this.array, this.from + from, this.from + to);
  }
}

const oneToFive = Slice.of([1, 2, 3, 4, 5]);

oneToFive[0] = 'uno';
oneToFive[1] = "zwei";
oneToFive[2] = 'three';

const fourAndFive = oneToFive.slice(3);

oneToFive[3] = 'for';
oneToFive[4] = 'marun';

[...oneToFive]
  //=> ['uno', "zwei", 'three', 'for', 'marun']
[...fourAndFive]
  //=> [4, 5]

If we trace the code, we see that we made a copy when we invoked oneToFive[0] = 'uno', because we can’t make assumptions about the array provided to the constructor. We did not make a copy after oneToFive[1] = "zwei" or oneToFive[2] = 'three', because we knew that we had our copy all to ourselves.

We then invoked oneToFive.slice(3). We didn’t make a copy, but we noted that we were no longer safe, so then when we called oneToFive[3] = 'for', we made another copy. We then were safe again, so invoking oneToFive[4] = 'marun' did not make a third copy.

graph TD
  a["[1, 2, 3, 4, 5]"]
  b["['uno', 'zwei', 'three', 4, 5]"]
  c["['uno', 'zwei', 'three', 'for', 'marun']"]
  b-. copy of .->a
  c-. copy of .->b
  d["oneToFive: { from: 0, length: 5 }"]
  e["fourAndFive: { from: 3, length: 2 }"]
  d-->|array:|c
  e-->|array:|b

The result is identical to the behaviour of making a copy every time we slice, or every time we write, but we’re stingier about making copies when we don’t need them.
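To check that reasoning without the Proxy machinery, here is a stripped-down sketch that counts copies. CowList, ensureOwned, share, and the copies counter are hypothetical stand-ins invented for this illustration (they are not part of Slice), and the class ignores from and length entirely:

```javascript
// A hypothetical, minimal copy-on-write wrapper that counts its copies.
class CowList {
  constructor(array) {
    this.items = array;  // shared with the caller until the first write
    this.owned = false;  // plays the role of `safe`
    this.copies = 0;     // instrumentation only
  }

  ensureOwned() {
    if (!this.owned) {
      this.items = this.items.slice(0);
      this.owned = true;
      this.copies += 1;
    }
  }

  set(i, value) {
    this.ensureOwned();
    this.items[i] = value;
  }

  share() {
    // handing out our array means we no longer own it exclusively
    this.owned = false;
    return new CowList(this.items);
  }

  toArray() {
    return this.items.slice(0);
  }
}

const original = [1, 2, 3, 4, 5];
const list = new CowList(original);

list.set(0, 'uno');    // first write: copies
list.set(1, 'zwei');   // owned: no copy
list.set(2, 'three');  // owned: no copy

const shared = list.share();

list.set(3, 'for');    // shared since the last copy: copies again
list.set(4, 'marun');  // owned: no copy

console.log(list.copies);
  //=> 2
console.log(list.toArray());
  //=> ['uno', 'zwei', 'three', 'for', 'marun']
console.log(shared.toArray());
  //=> ['uno', 'zwei', 'three', 4, 5]
console.log(original);
  //=> [1, 2, 3, 4, 5]
```

Five writes cost two copies, and neither the original array nor the shared view ever sees a write made after it was handed out.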

And now, emulating other Array.prototype methods that modify the underlying array is easy. For example:

class Slice {

  // ...

  push(element) {
    this.makeSafe();

    const value = this[arraySymbol].push(element);
    this.length = this[arraySymbol].length;

    return value;
  }

  pop() {
    this.makeSafe();

    const value = this[arraySymbol].pop();
    this.length = this[arraySymbol].length;

    return value;
  }

  unshift(element) {
    this.makeSafe();

    const value = this[arraySymbol].unshift(element);
    this.length = this[arraySymbol].length;

    return value;
  }

  shift() {
    this.makeSafe();

    const value = this[arraySymbol].shift();
    this.length = this[arraySymbol].length;

    return value;
  }
}

We could go on implementing other array-ish methods for our Slice class, but let’s reëxamine what we have been doing.


Arenberg Mine


resource ownership

Our implementation of copy-on-write semantics is written as if the primary issue is whether it is safe for an instance of Slice to mutate its underlying array. Which is the case. But how do we decide?

  • The slice’s underlying array is safe to mutate if the current slice is the only piece of code that could possibly own a reference to that array.
  • The slice’s underlying array is not safe to mutate if other pieces of code could share references to that array with the current slice.

Another way to look at this is to ask whether the current slice owns the underlying array:

  • If the slice doesn’t share references to the underlying array, the slice owns the underlying array.
  • If the slice might share references to the underlying array, the slice does not own the underlying array.

We don’t have to rename all of our methods, but let’s keep this concept in mind. In a structural sharing environment:

A piece of code is safe to mutate a data structure if that piece of code owns the data structure.

Our code follows this thinking with respect to modifying the slice’s underlying array after the slice has been constructed. But hang on! Recall that when we construct a new slice, we always begin with it being unsafe, meaning that we may share a reference to the array being passed in.

But what if we knew that we were the only ones with a reference to the array? This is extremely difficult to guarantee mechanically, but if we rely on the code that creates an instance of Slice to tell us whether the array being passed is ours to own, we can take that into account when creating a new instance.

So our new version of the constructor will take a parameter, safe, indicating whether the slice is safe. Our first pass at our existing static method, of, will always indicate that the array being passed is unsafe, just as we do now.

class Slice {
  static of(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      const safe = false;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new Slice(object.array, object.from + from, to - from, safe);
    }
    if (object instanceof Array) {
      const safe = false;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from, safe);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      return this.of([...object], from, to);
    }
  }

  constructor(array, from, length, safe = false) {
    this[arraySymbol] = array;
    this.from = normalizedFrom(array, from);
    this.length = normalizedLength(array, from, length);
    this.safe = safe;

    return new Proxy(this, SliceHandler);
  }
}

Wait, what happens when we call Slice.of with an object that is not an array and not another instance of slice? If it’s iterable, we make a new array using [...object], and then pass that to the constructor to make a new slice.

Well, if we’re making an array with [...object], and we don’t do anything else with the array, the new array we’re passing to Slice.of is one that won’t be used anywhere else, so it is safe:

class Slice {
  static of(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      const safe = false;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new Slice(object.array, object.from + from, to - from, safe);
    }
    if (object instanceof Array) {
      const safe = false;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from, safe);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      const safe = true;
      const array = [...object];

      from = normalizedFrom(array, from);
      to = normalizedTo(array, from, to);

      return new Slice(array, from, to - from, safe);
    }
  }

  // ...

}

function * countTo (n) {
  let i = 1;
  while (i <= n) {
    yield i++;
  }
}

const arrayToTen = Slice.of([...countTo(10)]);
const oneToTen = Slice.of(countTo(10));

arrayToTen.safe
  //=> false

oneToTen.safe
  //=> true

There will be other times when a piece of code wants to create a slice of an object, and knows that the object’s current owner no longer needs it. In effect, an object is being given to the Slice class. To represent this, we’ll create a new static method, .given:

class Slice {
  static given(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      const safe = object.safe;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new Slice(object[arraySymbol], object.from + from, to - from, safe);
    }
    if (object instanceof Array) {
      const safe = true;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from, safe);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      return this.given([...object], from, to);
    }
  }

  // ...

}

const unsafeTen = Slice.of([...countTo(10)]);
const safeTen = Slice.given([...countTo(10)]);

unsafeTen.safe
  //=> false

safeTen.safe
  //=> true

const givenUnsafeTen = Slice.given(unsafeTen);
const givenSafeTen = Slice.given(safeTen);

givenUnsafeTen.safe
  //=> false

givenSafeTen.safe
  //=> true

givenUnsafeTen === unsafeTen
  //=> false

givenSafeTen === safeTen
  //=> false

When Slice.given is passed another slice, the slice it returns is safe if the object passed was safe. If that object owned its array, and that object will not be accessed again, then the array is safe for the newly created slice. In effect, we are given the slice, but not necessarily the array backing it.

Hmmmm… If that slice is no longer needed, why are we creating another instance of Slice? We can reuse the one that already exists:

class Slice {
  static given(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      object.from = object.from + from;
      object.length = to - from;

      return object;
    }
    if (object instanceof Array) {
      const safe = true;

      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from, safe);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      return this.given([...object], from, to);
    }
  }

  // ...

}

const unsafeTen = Slice.of([...countTo(10)]);
const safeTen = Slice.given([...countTo(10)]);

unsafeTen.safe
  //=> false

safeTen.safe
  //=> true

const givenUnsafeTen = Slice.given(unsafeTen);
const givenSafeTen = Slice.given(safeTen);

givenUnsafeTen.safe
  //=> false

givenSafeTen.safe
  //=> true

givenUnsafeTen === unsafeTen
  //=> true

givenSafeTen === safeTen
  //=> true

Now when we call Slice.given and pass in another slice, Slice.given mutates the slice that was provided, much as our existing copy-on-write code can mutate the slice’s array when it knows that it does not share it with any other code.


Connex Labyrinth


given in action

The recursive function sum was provided above. It looks like this:

function sum (array) {
  return sumOfSlice(Slice.of(array), 0);

  function sumOfSlice (remaining, runningTotal) {
    if (remaining.length === 0) {
      return runningTotal;
    } else {
      const first = remaining[0];
      const rest = Slice.of(remaining, 1);

      return sumOfSlice(rest, runningTotal + first);
    }
  }
}

When we invoke sum([1, 2, 3, 4, 5, 6, 7]), the outer function creates a new slice out of the original array, and then it calls sumOfSlice, which calls itself 7 times, each time invoking Slice.of and creating a new instance of Slice. The net result is that eight slices are created. This may be much less memory than copying slices out of a long array, but it’s unnecessary.

Here’s how we can stop making all those temporary slices:

function sum (array) {
  return sumOfSlice(Slice.of(array), 0);

  function sumOfSlice (remaining, runningTotal) {
    if (remaining.length === 0) {
      return runningTotal;
    } else {
      const first = remaining[0];
      const rest = Slice.given(remaining, 1);

      return sumOfSlice(rest, runningTotal + first);
    }
  }
}

In this formulation, sumOfSlice invokes Slice.given instead of Slice.of. The only objects passed to sumOfSlice are slices created for this function. And having created a slice, it is not used again after being passed to sumOfSlice. Therefore, we can use .given, and the code recycles the same slice over and over again.

Now when we invoke sum([1, 2, 3, 4, 5, 6, 7]), the outer function creates a new slice out of the original array as before, and then it calls sumOfSlice, which calls itself 7 times as before. But now, each time it is called, it invokes Slice.given. That reuses the existing slice. The net result is that only one slice is created.
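The difference in allocations can be checked with a small experiment. The sketch below does not use the real Slice class: CountingSlice is a stripped-down, hypothetical stand-in (no Proxy, no copy-on-write) whose only job is to count how many instances get constructed under each strategy.

```javascript
// A stripped-down stand-in for Slice, used only to count allocations.
class CountingSlice {
  static created = 0;

  constructor(array, from = 0, length = array.length - from) {
    this.array = array;
    this.from = from;
    this.length = length;
    CountingSlice.created++;
  }

  // Always allocates a new instance, in the manner of Slice.of
  static of(slice, from) {
    return new CountingSlice(slice.array, slice.from + from, slice.length - from);
  }

  // Recycles the instance it is handed, in the manner of Slice.given
  static given(slice, from) {
    slice.from += from;
    slice.length -= from;
    return slice;
  }

  first() {
    return this.array[this.from];
  }
}

// Sums the array recursively, taking "the rest" with the supplied strategy
function sumWith(rest, array) {
  CountingSlice.created = 0;
  const total = sumOfSlice(new CountingSlice(array), 0);
  return { total, created: CountingSlice.created };

  function sumOfSlice(remaining, runningTotal) {
    if (remaining.length === 0) return runningTotal;
    const first = remaining.first();
    return sumOfSlice(rest(remaining, 1), runningTotal + first);
  }
}

const viaOf = sumWith(CountingSlice.of, [1, 2, 3, 4, 5, 6, 7]);
  //=> { total: 28, created: 8 }
const viaGiven = sumWith(CountingSlice.given, [1, 2, 3, 4, 5, 6, 7]);
  //=> { total: 28, created: 1 }
```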


caution


caution

We now have two different mechanisms that manage the mutability of a Slice. The original copy-on-write code is extremely conservative. In conjunction with Slice.of, it assumes that its array is shared with other code, and makes a copy the moment we try to mutate the slice with [...] = or one of the mutating methods like .pop(). Once it makes a copy, it knows that it owns the copy until we share the underlying array with other code via get array() or .slice(...), at which point it becomes conservative again.

Although it’s possible to deliberately circumvent this copy-on-write protocol, for almost all purposes it can be trusted to “just work.” No knowledge of the protocol is needed to understand how to use the Slice class; it is a fully encapsulated implementation detail.
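To recap the conservative protocol in miniature, here is a toy copy-on-write wrapper. It is a sketch under simplifying assumptions: CowList, set, get, and share are names invented for this example, and only makeSafe echoes a name used by the Slice class.

```javascript
// A toy copy-on-write wrapper, not the Slice API.
class CowList {
  constructor(array) {
    this.items = array;
    this.safe = false; // assume the array is shared until we copy it
  }

  makeSafe() {
    if (!this.safe) {
      this.items = [...this.items]; // take a private copy
      this.safe = true;
    }
  }

  set(index, value) {
    this.makeSafe();
    this.items[index] = value;
  }

  get(index) {
    return this.items[index];
  }

  // Handing out the array means we may no longer be its only owner
  share() {
    this.safe = false;
    return this.items;
  }
}

const original = [1, 2, 3];
const list = new CowList(original);

list.set(0, 'one');           // first write copies; original is untouched
const exposed = list.share(); // the private copy is now shared
list.set(1, 'two');           // so the next write copies again
```

Callers of set never need to know whether a copy happened, which is the sense in which the protocol is encapsulated.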

This is not the situation with our other mechanism, Slice.given. Using Slice.given requires that clients of the Slice class understand about resource ownership and the consequences of “aliasing” an array passed to Slice.given. In days of distant yore, programmers managed their own memory with malloc and free. A lot of debugging was devoted to finding places where memory was allocated, but not freed, or freed incorrectly when it was still in use.

The Slice.given mechanism is a return to those days, when programmers would have to manage the ownership of data structures by hand. In general, it is a win that we no longer have to do this, but there are times when the optimization offered by explicitly managing resource ownership is valuable.
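Here is a sketch of the kind of aliasing mistake that hand-managed ownership invites. TinySlice is a hypothetical, stripped-down stand-in for Slice (no Proxy, no copy-on-write), written only to show what happens when code keeps using an object after giving it away.

```javascript
// A stripped-down stand-in for Slice.given, to show the aliasing hazard.
class TinySlice {
  constructor(array, from = 0, length = array.length - from) {
    this.array = array;
    this.from = from;
    this.length = length;
  }

  static given(slice, from = 0) {
    slice.from += from;   // recycle the instance we were handed
    slice.length -= from;
    return slice;
  }

  toArray() {
    return this.array.slice(this.from, this.from + this.length);
  }
}

const all = new TinySlice([1, 2, 3, 4, 5]);
const tail = TinySlice.given(all, 2);

// `all` was given away, yet this code still holds a reference to it.
// Both names now describe the same, shortened slice.
const sameObject = all === tail;
  //=> true
const whatAllSeesNow = all.toArray();
  //=> [3, 4, 5]
```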

In this code, we have only considered the cost of creating and recycling objects. But sometimes there are other resources attached to an object, like an open web socket or file handle. Under such circumstances, protocols such as distinguishing between a resource being shared with an object, and a resource being given to an object can be useful.2
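As a rough illustration of that distinction, the sketch below uses a fake handle in place of a real socket or file. FakeHandle, Reader, and the sharing and given factory names are all assumptions made for this example; the only point is that the object that was given the resource is the one responsible for cleaning it up.

```javascript
// FakeHandle stands in for a socket or file handle; it is purely illustrative.
class FakeHandle {
  constructor() {
    this.open = true;
  }

  read() {
    if (!this.open) throw new Error('handle is closed');
    return 'data';
  }

  close() {
    this.open = false;
  }
}

// A hypothetical reader that records whether it owns its handle.
class Reader {
  static sharing(handle) {
    return new Reader(handle, false);
  }

  static given(handle) {
    return new Reader(handle, true);
  }

  constructor(handle, owns) {
    this.handle = handle;
    this.owns = owns;
  }

  finish() {
    const data = this.handle.read();
    if (this.owns) this.handle.close(); // only the owner cleans up
    return data;
  }
}

const sharedHandle = new FakeHandle();
Reader.sharing(sharedHandle).finish(); // sharedHandle stays open for its owner

const givenHandle = new FakeHandle();
Reader.given(givenHandle).finish();    // givenHandle is closed by the reader
```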

In sum, we have seen that we can create abstractions for data structures that use structural sharing to reduce copying. This is ridiculously easy when all data is immutable, but even when it is not, we can still write code that saves us from making copies, or even entire objects, until we need to.

the end.


The Complete Code
End Notes
  1. JavaScript has the notion of a frozen object, so if we’re passed a frozen array, we certainly don’t need to worry about anyone else modifying the array out from under us. But likewise, we can’t modify a frozen array ourselves, so it doesn’t help us know whether the array that is used to construct the slice is safe to modify or not. So we’ll be paranoid and assume that it is not safe to modify.

  2. If you scratch the surface of JavaScript’s Iterator protocols, you get deep into the weeds of .return() and .throw() methods that exist to allow resource-backed iterators (like an iterator that iterates over lines in an open text file) to perform resource cleanup (like closing the file). Abstractions like .given could be used to help keep track of which piece of code is responsible for telling the iterator to dispose of its resources when it’s done.

https://raganwald.com/2019/01/26/reduce-reuse-recycle
Exploring Structural Sharing and Copy-on-Write Semantics, Part I

This essay takes a highly informal look at two related techniques for achieving high performance when using large data structures: Structural Sharing, and Copy-on-Write Semantics. In Part I, we’ll look at the background of Structural Sharing and start making a Slice class that abstracts the concept of a slice of an array. In Part II, we’ll consider the problem of resource ownership when mutating objects.

To give us some context for exploring these techniques, we’re going to solve a very simple problem: Programming in a Lisp-like recursive style, but using JavaScript arrays. Although not the most practical use case, it’s interesting because even a small function written in Lisp style (like summing a list of integers) can create and recycle a lot of temporary objects.

Fixing that problem gives us an excuse to look at ways to use memory efficiently, minimizing the data we have to copy. Although we’re unlikely to use recursion gratuitously to sum a list of integers, the techniques we’ll use here to make it “not embarrassing” are the exact same techniques needed to make working with large data structures performant.

And now, let’s start at the beginning. The beginning of functional programming, in fact.


The IBM 704


wherein we travel back in time to the dawn of functional programming

Once upon a time, there was a programming language called Lisp, an acronym for LIS(t) P(rocessing).1 Lisp was one of the very first high-level languages, and its very first implementation was written for the IBM 704 computer.2

The 704 had a 36-bit word, meaning that it was very fast to store and retrieve 36-bit values. The CPU’s instruction set featured two important macros: CAR would fetch 15 bits representing the Contents of the Address part of the Register, while CDR would fetch the Contents of the Decrement part of the Register.

In broad terms, this means that a single 36-bit word could store two separate 15-bit values and it was very fast to save and retrieve pairs of values. If you had two 15-bit values and wished to write them to the register, the CONS macro would take the values and write them to a 36-bit word.

Thus, CONS put two values together, CAR extracted one, and CDR extracted the other. Lisp’s basic data type is often said to be the list, but in actuality it was the “cons cell,” the term used to describe two 15-bit values stored in one word. The 15-bit values were used as pointers that could refer to a location in memory, so in effect, a cons cell was a little data structure with two pointers to other cons cells.
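The packing idea can be sketched in JavaScript with bitwise operators. This assumes a simplified layout in which the two 15-bit fields sit side by side in the low 30 bits of an integer; it does not try to reproduce where the 704 actually placed its address and decrement fields within a 36-bit word. The names packCell, addressPart, and decrementPart are invented here.

```javascript
const FIELD_BITS = 15;
const FIELD_MASK = (1 << FIELD_BITS) - 1; // fifteen 1-bits

// Store two 15-bit values in one integer (simplified layout, see above)
const packCell = (address, decrement) =>
  ((decrement & FIELD_MASK) << FIELD_BITS) | (address & FIELD_MASK);

// Pull each field back out with a shift and a mask
const addressPart = (word) => word & FIELD_MASK;
const decrementPart = (word) => (word >>> FIELD_BITS) & FIELD_MASK;

const word = packCell(1234, 5678);

addressPart(word)
  //=> 1234
decrementPart(word)
  //=> 5678
```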

Lists were represented as linked lists of cons cells, with each cell’s head pointing to an element and the tail pointing to another cons cell.

Having these instructions be very fast was important to those early designers: They were working on one of the first high-level languages (COBOL and FORTRAN being the others), and computers in the late 1950s were extremely small and slow by today’s standards. Although the 704 used core memory, it still used vacuum tubes for its logic. Thus, the design of programming languages and algorithms was driven by what could be accomplished with limited memory and performance.

Here’s the scheme in JavaScript, using two-element arrays to represent cons cells:

const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

We can make a list by calling cons repeatedly, and terminating it with null:

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

oneToFive
  //=> [1,[2,[3,[4,[5, null]]]]]

Notice that though JavaScript displays our list as if it is composed of arrays nested within each other like Russian Dolls, in reality the arrays refer to each other with references, so [1,[2,[3,[4,[5,null]]]]] is our way to represent:

graph LR
  one(( ))-->|car|a["1"]
  one-->|cdr|two(( ))
  two-->|car|b["2"]
  two-->|cdr|three(( ))
  three-->|car|c["3"]
  three-->|cdr|four(( ))
  four-->|car|d["4"]
  four-->|cdr|five(( ))
  five-->|car|e["5"]
  five-->|cdr|null["fa:fa-ban null"];

This is a Linked List, it’s just that those early Lispers used the names car and cdr after the hardware instructions, whereas today we use words like element and next. But it works the same way: If we want the head of a list, we call car on it:

car(oneToFive)
  //=> 1

car is very fast, it simply extracts the first element of the cons cell. And what about the rest of the list? cdr does the trick:

cdr(oneToFive)
  //=> [2,[3,[4,[5, null]]]]

That’s another linked list too:

graph LR
  two(( ))-->|car|b["2"]
  two-->|cdr|three(( ))
  three-->|car|c["3"]
  three-->|cdr|four(( ))
  four-->|car|d["4"]
  four-->|cdr|five(( ))
  five-->|car|e["5"]
  five-->|cdr|null["fa:fa-ban null"];

cdr achieves high performance by extracting references from cons cells. In Lisp, it’s blazingly fast because it happens in hardware. There’s no making copies of arrays: the time to get the cdr of a list with five elements is the same as the time to get the cdr of a list with 5,000 elements. In each case, we have a reference to the first cell, and we get a reference to the next cell in one step. No elements need to be copied and the list does not need to be traversed.

In JavaScript, even without the low-level support, it’s still much, much, much faster to get all the elements except the head from a linked list than from an array. Getting one reference to a structure that already exists is faster than copying a bunch of elements.
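One way to see the difference, sketched with the same two-element-array cons cells: cdr returns the cell that is already part of the list, so two calls yield the identical object, while every call to .slice(1) allocates a new array.

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

const list = cons(1, cons(2, cons(3, null)));
const array = [1, 2, 3];

// cdr hands back the very cell the list already contains...
const sameCell = cdr(list) === cdr(list);
  //=> true

// ...while slice(1) builds a fresh array on every call.
const sameArray = array.slice(1) === array.slice(1);
  //=> false
```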

So now we understand that in Lisp, a lot of things use linked lists, and they do that in part because it was what the hardware made fast.


Symbolics "old style" keyboard

Symbolics, Inc. was a computer manufacturer headquartered in Cambridge, Massachusetts, and later in Concord, Massachusetts, with manufacturing facilities in Chatsworth, California. Symbolics designed and manufactured a line of Lisp machines, single-user computers optimized to run the Lisp programming language.


operating on lists

As we can see, it was always fast to get the first element of a list and the rest of a list. Now, you could get every element of a list by traversing the list pointer by pointer. So if you wanted to do something with a list, like sum the elements of a list, you’d write a linearly recursive function like this:

const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

function sum (linkedList, runningTotal = 0) {
  if (linkedList == null) {
    return runningTotal;
  } else {
    const first = car(linkedList);
    const rest = cdr(linkedList);

    return sum(rest, runningTotal + first);
  }
}

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

sum(oneToFive)
  //=> 15

If we ignore the fact that the original cons cells were many many orders of magnitude faster than using arrays with two elements, we have the general idea:

It was ridiculously fast to separate a list into the first and rest, and as a result, many linear algorithms written in Lisp were organized around repeatedly (by recursion or looping) getting the first and rest of a list.
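For completeness, here is a sketch of the looping flavour of the same idea. The function name sumByLooping is an addition for this example; it walks the cells with car and cdr exactly as the recursive version does.

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

function sumByLooping (linkedList) {
  let runningTotal = 0;

  // Walk the cells until we fall off the end of the list
  for (let cell = linkedList; cell != null; cell = cdr(cell)) {
    runningTotal += car(cell);
  }

  return runningTotal;
}

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

sumByLooping(oneToFive)
  //=> 15
```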


Garbage Day


Garbage, Garbage Everywhere

But what about today’s JavaScript? Today, we can write a list with an array. And we can get the first and rest with [0] and .slice(1):

function sum (array, runningTotal = 0) {
  if (array.length === 0) {
    return runningTotal;
  } else {
    const first = array[0];
    const rest = array.slice(1);

    return sum(rest, runningTotal + first);
  }
}

const oneToFive = [1, 2, 3, 4, 5];

sum(oneToFive)
  //=> 15

Like car, calling array[0] is fast. But when we invoke array.slice(1), JavaScript makes a new array that is a copy of the old array, omitting element 0. That is much slower, and since these copies are temporary, hammers away at the garbage collector.

We’re only working with five elements at a time, so we can afford to chuckle at the performance implications. But if we start operating on long lists, all that copying is going to bury us under a mound of garbage. Of course, we could switch to linked lists in JavaScript. But the cure would be worse than the disease.

Nobody wants to read code that looks like cons(1, cons(2, cons(3, cons(4, cons(5, null))))). And sometimes, we want to access arbitrary elements of a list. With a linked list, we have to traverse the list element by element to get it:3

function at (linkedList, index) {
  if (linkedList == null) {
    return undefined;
  } else if (index === 0) {
    return car(linkedList);
  } else {
    return at(cdr(linkedList), index - 1);
  }
}

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

at(oneToFive, 4)
  //=> 5

Accessing arbitrary elements of a linked list is the “Shlemiel The Painter” of Computer Science:

Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. “That’s pretty good!” says his boss, “you’re a fast worker!” and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. “Well, that’s not nearly as good as yesterday, but you’re still a fast worker. 150 yards is respectable,” and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. “Only 30!” shouts his boss. “That’s unacceptable! On the first day you did ten times that much work! What’s going on?” “I can’t help it,” says Shlemiel. “Every day I get farther and farther away from the paint can!”
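To put a number on Shlemiel’s problem, here is a sketch that counts how many cells are visited when every element of a 100-element linked list is fetched by index. The counter and the name countingAt are additions for illustration.

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

let cellsVisited = 0;

// Index into a linked list, counting each cell we pass through
function countingAt (linkedList, index) {
  if (linkedList == null) return undefined;
  cellsVisited++;
  return index === 0
    ? car(linkedList)
    : countingAt(cdr(linkedList), index - 1);
}

// Build a list of 100 elements: 0, 1, ..., 99
let hundred = null;
for (let i = 99; i >= 0; i--) {
  hundred = cons(i, hundred);
}

// Fetch every element by index, the way we would with an array
for (let i = 0; i < 100; i++) {
  countingAt(hundred, i);
}

cellsVisited
  //=> 5050
```

One hundred lookups cost 5,050 cell visits, because fetching index i means walking past i + 1 cells. An array would need one step per lookup.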

If only there was a way to have the elegance of Lisp, and the performance of Arrays when accessing arbitrary elements.

Let’s work our way up to that. Where do we begin?


The Beginning


slicing and structural sharing

Let’s start with a couple of very modest requirements. First, what we’re building is for the case when we want to process arrays in a [0] and .slice(1) style, usually recursively.

(Most of the time, we don’t want to process lists in this style. But when we do–perhaps we are playing with a recursive algorithm we read about in a book like SICP, perhaps we want to refactor such an algorithm step-by-step–we want the performance to be “not embarrassing.”)

Second, we are going to presume that the array we’re dealing with will not be mutated, at least not while we’re working with it. That’s certainly the case when writing functions that fold a list, like sum.

Given those two constraints, what problem are we trying to solve? As we noted, .slice(1) is expensive because it is implemented by copying arrays. Imagine an array with 10,000 elements!!! The first slice creates another array with 9,999 elements, the next with 9,998 elements, and so on.
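A back-of-the-envelope sketch of the total: the helper below (elementsCopied is a name made up for this example) adds up how many elements are copied when a list of n elements is consumed with repeated .slice(1) calls, without actually allocating any arrays.

```javascript
// Each slice(1) on an array of `remaining` elements copies remaining - 1 of them.
function elementsCopied (n) {
  let copied = 0;

  for (let remaining = n; remaining > 0; remaining--) {
    copied += remaining - 1;
  }

  return copied;
}

elementsCopied(5)
  //=> 10
elementsCopied(10000)
  //=> 49995000
```

That is n(n - 1) / 2 element copies, so the work grows with the square of the list’s length.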

So: Our beginning step will be to make .slice less expensive.

The technique we are going to use is called structural sharing. Let’s review our two-element array implementation of linked lists from above:

const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));
const twoToFive = cdr(oneToFive);

The variable twoToFive points to the second element in oneToFive’s list, and both of these lists share the same four elements:

graph LR
  R1(oneToFive)-->one(("[...]"))
  R2(twoToFive)-->two(("[...]"))
  one-->|0|a["1"]
  one-->|1|two
  two-->|0|b["2"]
  two-->|1|three(("[...]"))
  three-->|0|c["3"]
  three-->|1|four(("[...]"))
  four-->|0|d["4"]
  four-->|1|five(("[...]"))
  five-->|0|e["5"]
  five-->|1|null["fa:fa-ban null"];

As long as we don’t want to destructively modify any part of a list that is being shared, this scheme works beautifully.

We are not going to use cons cells or two-element arrays, but we are going to share structure, and as noted, we are going to have to avoid any kind of operation that modifies an existing list in such a way that it affects other variables that are sharing its structure.
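To make the hazard concrete, here is what a destructive modification does to two lists that share structure, using the same two-element-array cons cells as before:

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));
const twoToFive = cdr(oneToFive);

// Destructively change the head of twoToFive...
twoToFive[0] = 'two';

// ...and oneToFive changes too, because the cell is shared.
const secondOfOneToFive = car(cdr(oneToFive));
  //=> "two"
```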

So what will our technique be? Well, we are going to create a data structure that behaves enough like an array that we can write things like const first = arrayLikeDataStructure[0]; and const rest = arrayLikeDataStructure.slice(1), and they will work. But of course, our implementation won’t copy arrays. Instead, it will share the array.

We’ll begin with a class representing a slice of an array. Although we don’t need them directly for our purposes, we’ll implement an iterator, a .join method, and a .toString() method, for debugging purposes:4

class Slice {
  constructor(array, from = 0, length = array.length) {
    if (from < 0) {
      from = from + array.length;
    }
    from = Math.max(from, 0);
    from = Math.min(from, array.length);

    length = Math.max(length, 0);
    length = Math.min(length, array.length - from);

    this.array = array;
    this.from = from;
    this.length = length;
  }

  * [Symbol.iterator]() {
    const { array, from, length } = this;

    for (let i = 0; i < length; i++) {
      yield array[i + from];
    }
  }

  join(separator = ",") {
    const { array, from, length } = this;

    if (length === 0) {
      return '';
    } else {
      let joined = array[from];

      for (let i = 1; i < this.length; ++i) {
        joined = joined + separator + array[from + i];
      }

      return joined;
    }
  }

  toString() {
    return this.join();
  }
}

const a1to5 = [1, 2, 3, 4, 5];
const fromTwo = new Slice(a1to5, 2);

fromTwo.toString()
  //=> "3,4,5"

[...fromTwo]
  //=> [3, 4, 5]

Instances of Slice encapsulate the idea of a slice of an array, without making another array:

graph TD
  b["fromTwo: { from: 2, length: 3 }"]-->|array:|a["a1to5: [1, 2, 3, 4, 5]"];

We’ll now add support for [0] and .slice(1). The function .slice is a little different from the constructor, because the constructor is concerned with initializing the object’s properties, while .slice mimics the semantics of Array.prototype.slice. And we’ll extract some duplication while we’re at it:

function normalizedFrom(arrayIsh, from = 0) {
    if (from < 0) {
      from = from + arrayIsh.length;
    }
    from = Math.max(from, 0);
    from = Math.min(from, arrayIsh.length);

    return from;
}

function normalizedLength(arrayIsh, from, length = arrayIsh.length) {
    from = normalizedFrom(arrayIsh, from);

    length = Math.max(length, 0);
    length = Math.min(length, arrayIsh.length - from);

    return length;
}

function normalizedTo(arrayIsh, from, to) {
    from = normalizedFrom(arrayIsh, from);

    to = Math.max(to, 0);
    to = Math.min(arrayIsh.length, to);

    return to;
}

class Slice {
  constructor(array, from, length) {
    this.array = array;
    this.from = normalizedFrom(array, from, length);
    this.length = normalizedLength(array, from, length);
  }

  * [Symbol.iterator]() {
    const { array, from, length } = this;

    for (let i = 0; i < length; i++) {
      yield array[i + from];
    }
  }

  join(separator = ",") {
    const { array, from, length } = this;

    if (length === 0) {
      return '';
    } else {
      let joined = array[from];

      for (let i = 1; i < this.length; ++i) {
        joined = joined + separator + array[from + i];
      }

      return joined;
    }
  }

  toString() {
    return this.join();
  }

  slice(from, to = Infinity) {
    from = normalizedFrom(this, from);
    to = normalizedTo(this, from, to);

    return new Slice(this.array, this.from + from, to - from);
  }
}

const a1to5 = [1, 2, 3, 4, 5];
const fromZero = new Slice(a1to5, 0);
const fromOne = fromZero.slice(1);
const twoToFour = fromZero.slice(2, 4);

[...fromOne]
  //=> [2, 3, 4, 5]
[...twoToFour]
  //=> [3, 4]

graph TD
  b["fromZero: { from: 0, length: 5 }"]-->|array:|a["a1to5: [1, 2, 3, 4, 5]"]
  c["fromOne { from: 1, length: 4 }"]-->|array:|a;
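For reference, these are some of the Array.prototype.slice behaviours that a from/to style of slicing has to account for, shown on a plain array:

```javascript
const a1to5 = [1, 2, 3, 4, 5];

a1to5.slice(1);
  //=> [2, 3, 4, 5]  (to defaults to the end)
a1to5.slice(-2);
  //=> [4, 5]        (a negative from counts back from the end)
a1to5.slice(2, 100);
  //=> [3, 4, 5]     (to is clamped to the length)
a1to5.slice(3, 1);
  //=> []            (an empty range, never a negative one)
```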

To make it work with [0], we need to implement []. Implementing [] just for 0 is easy, but if we implement just [0], we’re begging for a bug later when somebody thinks they can use [1]. What we want instead is a way to allow any indexed access, and properly access the correct element of the underlying array, and without allowing access beyond our slice’s dimension.

To do that, we’ll use a Proxy to handle indexed access.


remote-control-locomotives-sign


meta-programming with proxies

A Proxy is an object that “stands in” for another object, called the target in JavaScript’s documentation. The idea is that the proxy implements the desired behaviour of the object, so we can interact with the proxy as if it was the original.

Proxies have a number of interesting uses. One is to decorate functionality. If we wanted to log changes to a model object, one way to do that is to decorate the model’s methods with logging code. That’s the “aspect-oriented programming” approach: Add functionality to the target in a structured way.

A proxy approach would be to create a proxy for the target model, and the proxy object could implement the logging while forwarding the method invocations to the target. That separates concerns in a different way than decorating methods separates concerns. The decoration is aggregated in the proxy.
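A minimal sketch of that logging idea, assuming a hypothetical withLogging helper and a toy account object (neither appears elsewhere in this essay). The proxy’s get hook wraps each method so that calls are recorded before being forwarded to the target:

```javascript
// Hypothetical helper: returns a proxy that logs every method call on target.
function withLogging (target, log) {
  return new Proxy(target, {
    get (object, property, receiver) {
      const value = Reflect.get(object, property, receiver);

      // Plain data properties pass straight through
      if (typeof value !== 'function') return value;

      // Methods are wrapped: record the call, then forward it to the target
      return function (...args) {
        log.push(`${String(property)}(${args.join(', ')})`);
        return value.apply(object, args);
      };
    }
  });
}

const log = [];
const account = {
  balance: 0,
  deposit (amount) {
    this.balance += amount;
    return this.balance;
  }
};
const loggedAccount = withLogging(account, log);

loggedAccount.deposit(5);
loggedAccount.deposit(10);

log
  //=> ["deposit(5)", "deposit(10)"]
account.balance
  //=> 15
```

The account object itself contains no logging code; all of it lives in the proxy’s handler.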

Proxies in JavaScript also provide the only way to perform dynamic method dispatch. Famously, the Ruby programming language provides a method_missing hook that allows any class to define code to handle methods that do not have concrete implementations. JavaScript does not bake that into every object. Instead, it makes this type of functionality available in proxies via specific hooks for getting and setting properties.

A proxy associates a target object with a handler object that contains—surprise—handlers for various hooks. Each hook controls a specific type of behaviour.

Initially, we’ll add a has hook and a get hook to our Slice objects. The net effect of the has hook is that every time another piece of code tries to determine whether our slice instances have a particular property, the handler intercepts the detection and can return true or false itself.

The get hook works similarly, only it is responsible for returning a value whenever another piece of code performs a property access. As a rule, it makes sense to implement these two methods in tandem.
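Before applying the hooks to Slice, here is a tiny standalone sketch of the pair working in tandem. The squares object and its handler are invented for this example: the target is an empty object, and the handler pretends it has properties 0, 1, and 2.

```javascript
const seen = [];

const squares = new Proxy({}, {
  has (target, property) {
    if (property in target) return true;
    return typeof property === 'string' && /^[0-2]$/.test(property);
  },

  get (target, property) {
    if (property in target) return target[property];
    if (typeof property === 'symbol') return undefined;

    seen.push(typeof property); // indexes arrive as strings, not numbers
    return /^[0-2]$/.test(property) ? Number(property) ** 2 : undefined;
  }
});

squares[2];
  //=> 4
2 in squares;
  //=> true
squares[7];
  //=> undefined
7 in squares;
  //=> false
```

Note that even though the code writes squares[2] with a number, the hooks receive the string '2'.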

In our case, our Slice instances do not have any properties for 0, 1, 2, &c. So if we want to be able to access the elements of the underlying array with code like someSlice[3], we need to handle the attempt to get(slice, '3') and forward it to a method we’ll write on Slice, at(...).

Of course, methods in JavaScript are functions bound to properties, so our has and get handlers always check to see if the target slice already has a property. If so, they delegate the access back to the target.

const SliceHandler = {
  has (slice, property) {
    if (property in slice) {
      return true;
    }

    if (typeof property === 'symbol') {
      return false;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);

      return slice.has(i);
    }
  },

  get (slice, property) {
    if (property in slice) {
      return slice[property];
    }

    if (typeof property === 'symbol') {
      return;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);
      return slice.at(i);
    }
  }
};

We also modify the Slice class. We implement has and at methods that understand the way indexes work relative to the slice’s boundaries, and we also modify the constructor to return a proxy that uses our handler to mediate access to the slice.

class Slice {
  constructor(array, from, length) {
    this.array = array;
    this.from = normalizedFrom(array, from, length);
    this.length = normalizedLength(array, from, length);

    return new Proxy(this, SliceHandler);
  }

  // ...

  has(i) {
    const { array, from, length } = this;

    if (i >= 0 && i < length) {
      return (from + i) in array;
    } else {
      return false;
    }
  }

  at(i) {
    const { array, from, length } = this;

    if (i >= 0 && i < length) {
      return array[from + i];
    }
  }
}

const a1to5 = [1, 2, 3, 4, 5];
const fromZero = new Slice(a1to5, 0);
const fromLast = new Slice(a1to5, -1);

fromZero[0]
  //=> 1
fromLast[0]
  //=> 5

In effect, our slice is a proxy (lower-case “p”) for the underlying array, and we are now returning a Proxy (upper-case “P”) for the slice. That’s two layers of proxies, and doubtless we are all thinking of the famous aphorism “All problems in computer engineering can be solved by another level of indirection, except for the problem of too many layers of indirection.”5

And now we can implement one last thing, a static factory method for making Slice objects out of other things. With of, we can use Slice to make our recursive functions “not embarrassing.”

class Slice {
  static of(object, from = 0, to = Infinity) {
    if (object instanceof this) {
      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new Slice(object.array, object.from + from, to - from);
    }
    if (object instanceof Array) {
      from = normalizedFrom(object, from);
      to = normalizedTo(object, from, to);

      return new this(object, from, to - from);
    }
    if (typeof object[Symbol.iterator] === 'function') {
      return this.of([...object], from, to);
    }
  }

  // ...
}

function sum (array) {
  return sumOfSlice(Slice.of(array), 0);

  function sumOfSlice (remaining, runningTotal) {
    if (remaining.length === 0) {
      return runningTotal;
    } else {
      const first = remaining[0];
      const rest = remaining.slice(1);

      return sumOfSlice(rest, runningTotal + first);
    }
  }
}

const oneToSix = [1, 2, 3, 4, 5, 6];

sum(oneToSix)
  //=> 21

No more copying entire arrays! And because our .of static method allows us to create a new slice of something and specify the range being sliced, we can also write our function like this:6

function sum (array) {
  return sumOfSlice(Slice.of(array), 0);

  function sumOfSlice (remaining, runningTotal) {
    if (remaining.length === 0) {
      return runningTotal;
    } else {
      const first = remaining[0];
      const rest = Slice.of(remaining, 1);

      return sumOfSlice(rest, runningTotal + first);
    }
  }
}

const oneToSeven = [1, 2, 3, 4, 5, 6, 7];

sum(oneToSeven)
  //=> 28

Naturally, it’s called of, because we use it to take a slice of some list-like object.


List (2007)


more array-ish behaviour

We didn’t need to implement an iterator, but it should be noted that since it has an iterator, we get a lot of JavaScript array-ish behaviour. For example, in strict mode, the iterator is used when destructuring. So if we want to, we can write:

const a1to5 = [1, 2, 3, 4, 5];
const oneToFive = Slice.of(a1to5);
const [first, ...rest] = oneToFive;

first
  //=> 1
rest
  //=> [2, 3, 4, 5]

Unfortunately, destructuring an iterable with the spread operator always creates a new array in JavaScript, so our Slice class can’t help us make const [first, ...rest] = someSlice; not embarrassing. Iterators work with the spread operator in expressions as well:

const abc = ['a', 'b', 'c'];
const oneTwoThree = Slice.of([1, 2, 3]);

[...abc, ...oneTwoThree]
  //=> ["a", "b", "c", 1, 2, 3]

And they get us for...of loops:

const abc = ['a', 'b', 'c'];

const alphabet = {};

for (const letter of Slice.of(abc)) {
  alphabet[letter] = letter;
}

alphabet
  //=> {a: "a", b: "b", c: "c"}
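
As a minimal sketch of why one iterator buys all three behaviours, consider a hypothetical Countdown class (it is not part of Slice) whose only array-ish feature is a [Symbol.iterator] method. Destructuring, spreading, and for...of all work, and the rest element comes back as a freshly allocated plain array:

```javascript
// Hypothetical iterable, used only for illustration.
class Countdown {
  constructor(from) {
    this.from = from;
  }

  // The single method that makes instances iterable.
  *[Symbol.iterator]() {
    for (let n = this.from; n > 0; --n) {
      yield n;
    }
  }
}

const countdown = new Countdown(3);

// Destructuring drives the iterator; `rest` is always a new Array.
const [first, ...rest] = countdown;

// Spreading in an expression drives the iterator again.
const spread = [...countdown, 'liftoff'];

// And so does for...of.
const seen = [];
for (const n of countdown) {
  seen.push(n);
}

console.log(first, rest, Array.isArray(rest));
console.log(spread);
console.log(seen);
```

Each use asks the object for a new iterator, which is why the same countdown can be consumed three times.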

When we dive deeply into the spec, we uncover Symbol.isConcatSpreadable. Forcing it to be true gets us array spread concatenation behaviour. While we’re at it, we can implement .concat:

class Slice {

  // ...

  concat(...args) {
    const { array, from, length } = this;

    // Array.prototype.slice takes (start, end), so the end index is from + length.
    return Slice.of(array.slice(from, from + length).concat(...args));
  }

  get [Symbol.isConcatSpreadable]() {
    return true;
  }
}

const abc = ['a', 'b', 'c'];
const oneTwoThree = Slice.of([1, 2, 3]);

abc.concat(oneTwoThree)
  //=> ["a", "b", "c", 1, 2, 3]
oneTwoThree.concat(abc)
  //=> [ 1, 2, 3, "a", "b", "c"]
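
The flag is not specific to Slice. As a small sketch using hypothetical plain objects, concat treats a non-array argument as a single element unless that argument has Symbol.isConcatSpreadable set to true, in which case it reads length and the indexed properties:

```javascript
// An array-like object without the flag: concat appends it as one element.
const arrayLike = { length: 2, 0: 'x', 1: 'y' };
const withoutFlag = ['a'].concat(arrayLike);

// The same shape with the flag: concat spreads its indexed properties.
const spreadable = {
  length: 2,
  0: 'x',
  1: 'y',
  [Symbol.isConcatSpreadable]: true
};
const withFlag = ['a'].concat(spreadable);

console.log(withoutFlag.length); // the object itself is the second element
console.log(withFlag);
```

This is why the getter only helps an object that also exposes length and integer-keyed properties, as our proxied slices do.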

Of course, the biggest array-like behaviour our slices are missing is that we haven’t implemented any of the methods for modifying our slices. We’ll do that in Part II. But before moving on, let’s summarize what we’ve done so far.


ideas


wrapping up

We set out with the purpose of writing some code that would allow us to use JavaScript arrays in a Lisp-like style, without the heavy penalty of making lots and lots of copies. To do that, we implemented structural sharing. We added a Proxy to give our new class indexed access to the elements of our Slice class.

While these techniques are far too heavyweight for a simple task like writing a sum function in the style favoured by Lisp programmers of the 1960s and 1970s, that task was small enough and simple enough to allow us to focus on the implementation of these techniques, rather than on the problem of the domain.

These techniques may seem exotic at first, but they form the basis for high-performance implementation of large data structures. And many other languages, such as Clojure, bake these semantics right in. If JavaScript worked like Clojure, there would be no need to implement a Slice class, because arrays would already implement structural sharing. Calling .slice would be inexpensive, right out of the box.

Until the day that JavaScript gets such data structures in its standard library, we’ll have to Greenspun the functionality ourselves, or use a library such as David Nolen’s Mori.

Next: Structural Sharing and Copy-on-Write Semantics, Part II: Reduce-Reuse-Recycle

(discuss on hacker news and reddit; portions of this essay have previously appeared in the book JavaScript Allongé)


Bonus Hack!

Lisp programmers used car and cdr in intricate ways. Although we’ve only looked at simple lists, cons cells could be used to make trees of arbitrary complexity, and the right sequence of car and cdr invocations could navigate a path to any element or sub-tree.

To facilitate this, Lisp had a system where any function name that started with c, ended with r, and had one or more a or d characters in between was automatically also a function, and it was implemented as if the functions car and cdr were composed in order.

For example, (cadr list) was equivalent to (car (cdr list)), which is the second element. If we wanted to really get Lisp-y, we would implement the same scheme…
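
To make the naming scheme concrete, here is a minimal sketch on plain arrays rather than slices. The cxr function and the copying car and cdr helpers are hypothetical names introduced for illustration; they are not part of the Slice code:

```javascript
// Illustrative helpers on plain arrays (cdr copies, unlike a Slice).
const car = list => list[0];
const cdr = list => list.slice(1);

// Turn a name such as 'cadr' into the composition car(cdr(list)).
const cxr = name => {
  const match = name.match(/^c([ad]+)r$/);

  if (match === null) {
    throw new Error(`${name} is not a c[ad]+r name`);
  }

  const steps = [...match[1]].map(ad => (ad === 'a' ? car : cdr));

  // The rightmost letter is applied first, just as in (car (cdr list)).
  return list => steps.reduceRight((value, step) => step(value), list);
};

const oneToFive = [1, 2, 3, 4, 5];

console.log(cxr('cadr')(oneToFive));  // second element
console.log(cxr('caddr')(oneToFive)); // third element
console.log(cxr('cddr')(oneToFive));  // everything after the second element
```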

This being JavaScript, we’ll hack this idea with a proxy and synthetic properties. That way, we can destructure slices, like this:

const SliceHandler = {
  has (slice, property) {
    if (property in slice) {
      return true;
    }

    if (typeof property === 'symbol') {
      return false;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);

      return slice.has(i);
    }

    const matchCarCdr = property.match(/^c([ad]+)r$/);
    if (matchCarCdr != null) {
      return true;
    }
  },

  get (slice, property) {
    if (property in slice) {
      return slice[property];
    }

    if (typeof property === 'symbol') {
      return;
    }

    const matchInt = property.match(/^\d+$/);
    if (matchInt != null) {
      const i = parseInt(property);
      return slice.at(i);
    }

    const matchCarCdr = property.match(/^c([ad]+)r$/);
    if (matchCarCdr != null) {
      const [, accessorString] = matchCarCdr;
      const accessors = accessorString.split('').map(ad => `c${ad}r`);
      return accessors.reduceRight(
        (value, accessor) => Slice.of(value)[accessor],
        slice);
    }
  }
};

class Slice {

  // ...

  get car() {
    return this.at(0);
  }

  get cdr() {
    return this.slice(1);
  }

}

const oneToFive = Slice.of([1, 2, 3, 4, 5]);

const { car: first, cadr: second, cddr: rest } = oneToFive;

first
  //=> 1
second
  //=> 2
[...rest]
  //=> [3, 4, 5]

The Complete Code
End Notes
  1. Lisp is still very much alive, and one of the most interesting and exciting programming languages in use today is Clojure, a Lisp dialect that runs on the JVM, along with its sibling ClojureScript, Clojure that transpiles to JavaScript. Clojure and ClojureScript both make extensive use of structural sharing and copy-on-write semantics to achieve high performance. By default. 

  2. Fun fact: The very first FORTRAN implementation was also written for the IBM 704.

  3. When we say, Nobody wants to read code that looks like cons(1, cons(2, cons(3, cons(4, cons(5, null))))), we mean it. Even the Lisp gurus of old didn’t want to deal with that, so in Lisp when you write '(1 2 3 4 5), it is translated directly into a linked list of cons cells. 

  4. All of this code requires the engine to implement strict JavaScript semantics. Some engines can be configured in “loose” mode, where their implementation of things like destructuring may vary from the standard. 

  5. David Wheeler is credited with what is often called The Fundamental Theorem of Software Engineering, “We can solve any problem by introducing an extra level of indirection.” The wording has evolved over time, and the corollary “…except for the problem of too many layers of indirection” is almost always quoted at the same time. 

  6. The version using remaining.slice(1) is going to be more familiar to other programmers, but we will see in Part II how Slice.of(remaining, 1) leads us towards a better understanding of resource ownership. 

https://raganwald.com/2019/01/14/structural-sharing-and-copy-on-write
Alice and Bobbie and Sharleen and Dyck

Alice and Bobbie were comparing notes after interviewing interns for an upcoming work term with their company, HipCo. Their interview process, although often maligned on social media, worked reasonably well for their purposes: They spent an hour with each candidate, devoting twenty minutes to introductions and some basic behavioural questions, about half an hour to a basic programming problem, and the remaining ten minutes or so was turned over to the candidate to ask them questions.

They quickly went over all of the candidates but one. Alice held that one back for the end of their meeting.

“And then there was Sharleen,” said Alice. “Sharleen has decent marks, is in third year, and raised no red flags in the behavioural questions.” Bobbie waited, expecting a revelation with respect to the programming problem, and was not disappointed.

“I asked our usual question, determining whether a string of brackets was a valid Dyck Word, that is, a word in the Dyck language.”


In the theory of formal languages of computer science, mathematics, and linguistics, a Dyck word is a balanced string of square brackets [ and ]. The set of Dyck words forms the Dyck language.

Dyck words and language are named after the mathematician Walther von Dyck. They have applications in the parsing of expressions that must have a correctly nested sequence of brackets, such as arithmetic or algebraic expressions.

[] is a Dyck Word, as are [][], [[]], and [[][]]. ][ is not a Dyck Word, and neither are []], [][][, or ][][.

Dyck words are easily explained to everyone with a basic grasp of arithmetic. If we take a valid arithmetic or algebraic expression that includes parentheses, such as a + (b + 2) - (c / (d - 1)), then remove everything except the parentheses, leaving ()(()), and finally turn them into square brackets, [][[]], what remains is a valid Dyck word if the original expression was properly parenthesized.


“Sharleen did write working code to determine whether a string was a valid Dyck Word. But it was unlike any solution I’ve ever seen before. She obviously didn’t memorize any of the solutions we find on those hack-the-job-interview web sites. Here’s what she presented to handle the simple case of only one type of parenthesis…”


Meaning of the word Dyck


determining whether a string is a valid dyck word

Here is Sharleen’s first solution:

function isDyckWord (before) {
  if (before === '') return true;

  const after = before.replace('[]','');

  return (before !== after) && isDyckWord(after);
}

Having shared the solution with Bobbie, Alice continued.

“I was, I admit, surprised. Normally the first cut has some kind of stack, and then we ask about optimizing to an integer. I asked about performance, and she cheerfully admitted that the solution was nearly maximally pessimum, with n-squared run time as the algorithm scans the string on every call to isDyckWord. But she pointed out that the solution is insanely simple, and she favours simplicity first, optimization second.”

“She was ridiculously complacent about this, so I poked at another problem. What, I asked her, would this do when presented with a worst-case string, say one that consisted of 100,000 [s followed by 100,000 ]s? Was she not concerned with the possibility of a stack overflow?”

“Sharleen thought for a moment, then said that it depended very much upon the level of optimization in the JavaScript engine. A sufficiently smart just-in-time compiler would recognize that the call to isDyckWord(after) was effectively in tail position, and then use Tail Call Optimization to convert this recursive function into a loop.”

“But if it didn’t, Sharleen presented a trivial change:”

function isDyckWord (before) {
  if (before === '') return true;

  const after = before.replace('[]','');

  if (before === after) {
    return false;
  } else {
    return isDyckWord(after);
  }
}

“JavaScript’s specification calls for Tail-Call Optimization, she explained, and thus this recursive function would be executed as a loop and not take up stack space proportional to the depth of the deepest nesting. I wondered aloud about space, and she was ready for my question: She explained that because this duplicated the string at each pass, it consumed space proportional to the size of the input, regardless of the depth of nesting.”

“I pulled out the question of an Extended Dyck Language that consists of multiple types of parentheses. Using the conventional solution of tracking the last seen opening parenthesis on a stack, we can convert the stack to a count when there is only one kind of parenthesis. But if we introduce additional types, e.g. [](){}, the stack must be used, and the solution requires space proportional to the deepest nesting.”

“I asked if she could create a solution for multiple parentheses, and if so, what space would it require? With a few keystrokes, Sharleen added the two new types of parentheses:”

function isExtendedDyckLanguageWord (before) {
  if (before === '') return true;

  const after = before.replace('[]','').replace('()','').replace('{}','');

  if (before === after) {
    return false;
  } else {
    return isExtendedDyckLanguageWord(after);
  }
}

“I could see at once that this solution was likewise in tail call form and would require space proportional to the size of the input, as before.”

Bobbie was amazed at this idiosyncratic solution. “Well,” Bobbie said at last, “at least you weren’t discussing deterministic vs. non-deterministic pattern matching and its relationship to pushdown automata.”

“No,” Alice admitted, “the opposite. This solution was remarkably simple, even if it is the least performant thing I’ve ever seen. Sharleen had an outrageous amount of confidence to present this in an interview and cheerfully admit to its shortcomings.”

“So?” asked Bobbie, “Did you press her to write a faster solution?”

“No,” Alice replied, “By this time I was sure that she was trolling me, and I’ll be damned if I was going to let her whip out a fast solution. So instead, I asked her what about JavaScript implementations–like V8 at this time–that don’t support TCO? In that case, she would still have a stack overflow.”

Bobbie pointed out that Alice had clearly gone beyond what was necessary to determine if Sharleen would be an effective intern. “I know,” said Alice, “But I was now all-in on wrangling this solution with Sharleen. I had well and truly fallen for the troll bait.” Bobbie admitted that she was likewise curious about whether Sharleen had an answer for this problem.

“Well,” Alice said, “Sharleen remembered that the code could be converted to use a trampoline, and asked if she could do a web search for the general approach. I assented, and thank god, she fell into a rabbit hole: She wound up reading some weird article about recursive combinators, and ran out of time before she could finish.”

Bobbie made a wry face. “Interviews are not supposed to be battles for intellectual superiority.” Alice’s face fell. “True, true. I got caught up in the moment. But come on, Sharleen was surely having a little joke at our expense.”

“Yes,” agreed Bobbie, “I expect she was. And why not? Interviewers pull a lot of stupid stunts; if once in a while a candidate trolls us with a joke solution, it’s 100% understandable. So anyway, did you do the standard thing and explain that she was free to finish the refactoring on her own time and email it to you?”

“I sure did,” said Alice, “And about twenty minutes later she had texted me a gist.”


Galápagos Mockingbird ©2012 Ben Tavener


sharleen’s trampolining solution

And here’s Sharleen’s solution, building upon a widowbird, a recursive combinator that implements trampolining:

const widowbird =
  fn => {
    class Thunk {
      constructor (args) {
        this.args = args;
      }

      evaluate () {
        return fn(...this.args);
      }
    }

    return (...initialArgs) => {
      let value = fn(
        (...args) => new Thunk(args),
        ...initialArgs
      );

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

const isExtendedDyckLanguageWord =
  widowbird(
    (myself, before) => {
      if (before === '') return true;

      const after = before.replace('[]','').replace('()','').replace('{}','');

      if (before === after) {
        return false;
      } else {
        return myself(myself, after);
      }
    }
  );

Alice and Bobbie stared at the solution in all its glory. “Well,” said Bobbie, “This is amazingly clever, and amusingly so when you consider how bad its performance is!”

“Yes,” agreed Alice. “There is no question that Sharleen has the horsepower to write code. And she understands the tradeoffs. But do we have any concern that she’ll be too clever?”

Bobbie shook her head. “Interviews are necessarily artificial scenarios. It’s dangerous to jump to conclusions about somebody’s behaviour on-the-job based on a few lines of code written in an interview. That’s what we have the behavioural questions for. You said she was fine with the behavioural questions?”

Alice nodded.

“Well then,” said Bobbie, “The question is not whether we’ll make her an offer, it’s whether you were sufficiently interesting that she will accept.”
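
Outside the story, the mechanics of trampolining can be sketched more plainly. What follows is not Sharleen’s widowbird; it is a simpler, generic trampoline with hypothetical names (trampoline, isDyckWordStep), and it assumes the wrapped function never returns a function as its real answer:

```javascript
// Keep calling returned thunks until a non-function value comes back.
// Assumption: a genuine result is never itself a function.
const trampoline =
  fn =>
    (...args) => {
      let value = fn(...args);

      while (typeof value === 'function') {
        value = value();
      }

      return value;
    };

const isDyckWordStep =
  before => {
    if (before === '') return true;

    const after = before.replace('[]', '');

    if (before === after) {
      return false;
    } else {
      // Instead of recursing, hand back a thunk describing the next call.
      return () => isDyckWordStep(after);
    }
  };

const isDyckWord = trampoline(isDyckWordStep);

console.log(isDyckWord('[[][]]'));
console.log(isDyckWord('[]]['));
```

Because each step returns to the while loop before the next one begins, the call stack stays a constant depth no matter how deeply the brackets nest. The running time and the string copying are unchanged.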


why sharleen’s solution works

Given a balanced string, we can insert () anywhere in that string and the result of the insert will be balanced. For example, given the balanced string (()())(), we can make any of these nine insertions: ()(()())(), (()()())(), ((())())(), (()()())(), (()(()))(), (()()())(), (()())()(), (()())(()), or (()())()(). All are balanced.

Also, every finite balanced string except the empty string contains at least one () pair.
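
Read in reverse, those two facts say that deleting a () pair from a balanced string leaves a balanced string, and that a non-empty balanced string always has a pair to delete. A minimal sketch of that reduction as a plain loop, using the hypothetical names reduceFully and isBalanced:

```javascript
// Repeatedly delete the first '()' pair until none is left.
const reduceFully =
  word => {
    let before = word;

    for (;;) {
      const after = before.replace('()', '');

      if (after === before) return before;

      before = after;
    }
  };

// A string is balanced exactly when it reduces all the way to ''.
const isBalanced = word => reduceFully(word) === '';

console.log(reduceFully('(()())()')); // nothing is left over
console.log(reduceFully('())('));     // the unmatched residue remains
console.log(isBalanced('(()())()'), isBalanced('())('));
```

Whatever survives the reduction is the unmatched residue, which is why an unbalanced input stops shrinking before it reaches the empty string.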

https://raganwald.com/2018/11/14/dyck-joke
Pattern Matching and Recursion

A popular programming “problem” is to determine whether a string of parentheses is “balanced:”

Given a string that consists of open and closed parentheses, write a function that determines whether the parentheses in the string are balanced. “Balanced” parentheses means that each opening symbol has a corresponding closing symbol and the pairs of parentheses are properly nested.

For example:

Input          Output  Comment
'()'           true
'(())'         true    parentheses can nest
'()()'         true    multiple pairs are acceptable
'(()()())()'   true    multiple pairs can nest
'((()'         false   missing closing parentheses
'()))'         false   missing opening parentheses
')('           false   close before open


There are a number of approaches to solving this problem. Some optimize for brevity of the solution, others optimize for space and/or running time.

Naturally, everyone also attempts to optimize for understandability. Most of the time, this means optimizing for understanding what the code does and how it does it. For example, this code is quite readable in the sense of understanding what the code does:

const balanced =
  input => {
    let openParenthesesCount = 0;
    let closeParenthesesCount = 0;

    for (let i = 0; i < input.length; ++i) {
      const c = input[i];

      if (c === '(') {
        ++openParenthesesCount;
      } else if (c === ')') {
        ++closeParenthesesCount;
      } else return false;

      if (closeParenthesesCount > openParenthesesCount) return false;
    }

    return closeParenthesesCount === openParenthesesCount;
  };

There’s a small optimization available to use just one counter that increments and decrements, but it is not too difficult to understand what this code does, and from there we might be able to deduce that a balanced string is one where:

  1. There are an equal number of open and closed parentheses, and;
  2. For any prefix (i.e. a substring that includes the first character), there are at least as many open as closed parentheses.
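
For reference, the single-counter optimization mentioned above might look like the following sketch. The name balancedWithOneCounter is an assumption made for illustration. The counter holds the number of open parentheses minus the number of closed ones, so the two conditions become “never negative” and “zero at the end”:

```javascript
const balancedWithOneCounter =
  input => {
    // opens seen so far minus closes seen so far
    let depth = 0;

    for (const c of input) {
      if (c === '(') {
        ++depth;
      } else if (c === ')') {
        --depth;
      } else return false;

      // Condition 2: no prefix may have more closes than opens.
      if (depth < 0) return false;
    }

    // Condition 1: equal numbers of opens and closes overall.
    return depth === 0;
  };

console.log(balancedWithOneCounter('(()()())()'));
console.log(balancedWithOneCounter('((()'));
console.log(balancedWithOneCounter(')('));
```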

But even if we make this deduction, does that really help us understand that the problem we’re trying to solve is handling well-formed parenthetical expressions? These facts about the counts of parentheses are true of balanced strings, but they aren’t what we’re trying to communicate.

The “shape” of the problem does not really represent the shape of the solution presented.

Let’s consider a different approach–matching the shape of the solution to the shape of the problem–for balanced parentheses.


helvetica Parentheses

the shape of the balanced parentheses problem

If we take a certain kind of “mathematical” approach to defining the problem we’re trying to solve, we can reduce the definition of a balanced string of parentheses to:

  1. () is balanced.
  2. (), followed by a balanced string, is balanced.
  3. (, followed by a balanced string, followed by ), is balanced.
  4. (, followed by a balanced string, followed by ), followed by a balanced string, is balanced.

The “shape” of this definition is that there are four cases. The first is a “base” or “irreducible” case. The second, third, and fourth cases are self-referential: They define ways to build more complex balanced strings from simpler balanced strings.

Definitions like this are declarative: They describe rules or patterns for recognizing a balanced string, not a step-by-step algorithm for determining whether a string is balanced.

So what would a solution look like if we tried to make it the same shape as this definition? It would:

  1. Describe a pattern, not a step-by-step algorithm, and;
  2. It would have four cases.

Of course, there’s one obvious way to implement a pattern that recognizes particular strings.


regex, because a computer is a terrible thing to waste

regular expressions

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.—Jamie Zawinski

When it comes to recognizing strings, regular expressions are the usual go-to tool. However, regular expressions define regular languages, and regular languages cannot match balanced parentheses.

Three of the four cases in our pattern include a recursive self-reference, and standard regular expressions (including JavaScript’s regular expression implementation) cannot implement self-references. Perl and some other languages include non-standard extensions to regular expressions, including features for creating recursive patterns.

In such languages, (?:\((?R)?\))+ matches balanced parentheses, and for people who understand regular expressions well, the shape of the expression matches the shape of the problem we’re solving. That being said, there are two problems with this solution. First, JavaScript isn’t one of those languages, so we can’t use that extended regular expression.

Second, even if it were one, regular expressions optimize for compactness and aren’t always obvious and clear, even when the expression’s shape matches the problem’s shape. So, what can we do?

Greenspun our own pattern-matching, that’s what we can do.


Bletchley Quilt

greenspunning our own pattern-matching

Let’s posit that instead of using an embedded pattern-matching language like regular expressions, we use functions.

We’ll start by defining our “api” for functions that match strings. With regular expressions, patterns can be set up to match anywhere in a string, just the beginning of the string, just the end, or the entire string.

We’ll start with the idea of a function that matches a string based on the beginning of the string. If it matches, it returns what it matched. If it doesn’t match, it returns false. For example:

const isGreeting =
  input =>
    input.startsWith('hello') &&
    'hello';

isGreeting('fubar')
  //=> false

isGreeting('hello world')
  //=> 'hello'

We don’t want to write all that out every time we want to match something, so we can make a higher-order function that makes simple string match functions:

const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

just('(')('((()))')
  //=> '('

just(')')('((()))')
  //=> false

This is not enough, of course.


2222 holes

composing patterns

We have written just, a function that makes a simple pattern matching the start of a string. We’ll also need a way to compose patterns. Let’s review the shape of our problem:

  1. () is balanced.
  2. (), followed by a balanced string, is balanced.
  3. (, followed by a balanced string, followed by ), is balanced.
  4. (, followed by a balanced string, followed by ), followed by a balanced string, is balanced.

For the first case, we can use just right out of the box. This will only match balanced strings at the start of a string, but we’ll address matching the entire string later:

const case1 = just('()');

case1('(()')
  //=> false

case1('(())')
  //=> false

case1('()')
  //=> '()'

Writing the second case involves two new ideas. First, we need to have a way of describing a pattern that matches two or more other patterns in succession:

const follows =
  (...patterns) =>
    input => {
      let matchLength = 0;
      let remaining = input;

      for (const pattern of patterns) {
        const matched = pattern(remaining);

        if (matched === false) return false;

        matchLength = matchLength + matched.length;
        remaining = input.slice(matchLength);
      }

      return input.slice(0, matchLength);
    };

follows(just('fu'), just('bar'))('foobar')
  //=> false

follows(just('fu'), just('bar'))('fubar\'d')
  //=> 'fubar'

Next, we’ll need a way to describe a pattern that is made out of other patterns, each of which represents one case.

There are multiple ways to interpret the semantics of matching multiple cases. For example, if two or more cases match, we could take the first match, or the longest match. For this problem, two or more cases can easily both match, e.g. ()() could match either the first or second cases. We’re going to write our function such that when two or more cases match, it picks the longest match.1

const cases =
  (...patterns) =>
    input => {
      const matches = patterns.map(p => p(input)).filter(m => m !== false);

      if (matches.length === 0) {
        return false;
      } else {
        return matches.sort((a, b) => a.length > b.length ? -1 : +1)[0]
      }
    };

const badNews = cases(
  just('fubar'),
  just('snafu')
)

badNews('snafu')
  //=> 'snafu'
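
To see why the choice of semantics matters, here is a minimal sketch comparing the two interpretations side by side. The names firstOf and longestOf are hypothetical and introduced only for this comparison; longestOf behaves like cases, and just is repeated so that the block stands alone:

```javascript
const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

// "First match" semantics: return the result of the first case that matches.
const firstOf =
  (...patterns) =>
    input => {
      for (const pattern of patterns) {
        const matched = pattern(input);

        if (matched !== false) return matched;
      }

      return false;
    };

// "Longest match" semantics: try every case and keep the longest result.
const longestOf =
  (...patterns) =>
    input =>
      patterns
        .map(p => p(input))
        .filter(m => m !== false)
        .reduce(
          (best, m) => (best === false || m.length > best.length ? m : best),
          false
        );

const onePair = just('()');
const twoPairs = just('()()');

console.log(firstOf(onePair, twoPairs)('()()'));   // stops at the first case
console.log(longestOf(onePair, twoPairs)('()()')); // prefers the longer match
```

With first-match semantics the order in which the cases are listed changes the result, whereas longest-match semantics gives the same answer however they are ordered.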

And now to describe the second case. We’ll use cases to define balanced from the first two cases, and we’ll rely on JavaScript’s name binding to implement recursion in the second case:

const balanced =
  input => cases(
    just('()'),
    follows(just('()'), balanced)
  )(input);

balanced('()')
  //=> '()'

balanced('()()()')
  //=> '()()()'

Adding support for the third and fourth cases is straightforward:

const balanced =
  input => cases(
    just('()'),
    follows(just('()'), balanced),
    follows(just('('), balanced, just(')')),
    follows(just('('), balanced, just(')'), balanced)
  )(input);

balanced('(())(')
  //=> '(())'

balanced('(()())()')
  //=> '(()())()'

And here we come to a place to evaluate the way we’ve formulated our rules.


children's choice

recursion vs iteration

The snippets follows(just('()'), balanced) and follows(just('('), balanced, just(')'), balanced) are very interesting. They handle cases like (), ()(), ()()(), and so forth, without any need for a special higher-order pattern meaning “match one or more of this pattern.”

Our code does this using recursion. That is not surprising: recursion is a way to implement what most programming languages implement with iteration and repetition. What’s interesting is that repeating patterns are very common, and yet regular expressions can’t implement recursion. So how do they handle repeating patterns?

Standard regular expressions use the postfix operators * and + to handle the cases where we need zero or more of a pattern, or one or more of a pattern. So in a regular expression, we would write (?:\(\))+ to define a pattern matching one or more instances of () in succession.

Although it’s not strictly necessary, we could write such pattern modifiers ourselves. For example, here’s “one or more” (equivalent to +):

const oneOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return matchedLength > 0 && input.slice(0, matchedLength);
    };

“Zero or more” is almost exactly the same, only the last line changes:

const zeroOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return input.slice(0, matchedLength);
    };
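
Since the two modifiers differ only in their last line, one way to make the difference visible is to factor the shared loop into a helper. This is a sketch under that assumption, not the essay’s implementation: repeatedLength is a hypothetical name, and just is repeated so that the block stands alone:

```javascript
const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

// Hypothetical helper: the length of the longest prefix of `input`
// made up of back-to-back, non-empty matches of `pattern`.
const repeatedLength =
  (pattern, input) => {
    let matchedLength = 0;

    while (matchedLength < input.length) {
      const matched = pattern(input.slice(matchedLength));

      if (matched === false || matched.length === 0) break;

      matchedLength = matchedLength + matched.length;
    }

    return matchedLength;
  };

// Fails (returns false) when nothing was matched.
const oneOrMore =
  pattern =>
    input => {
      const matchedLength = repeatedLength(pattern, input);

      return matchedLength > 0 && input.slice(0, matchedLength);
    };

// Succeeds with the empty string when nothing was matched.
const zeroOrMore =
  pattern =>
    input => input.slice(0, repeatedLength(pattern, input));

console.log(oneOrMore(just('()'))('()()('));
console.log(oneOrMore(just('()'))('((('));
console.log(zeroOrMore(just('()'))('((('));
```

The distinction between false and the empty string is what lets the two compose differently: follows tests matched === false, so an empty match from zeroOrMore counts as a success and lets the patterns after it proceed.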

The original depiction of the problem requirements was the following four rules:

  1. () is balanced.
  2. (), followed by a balanced string, is balanced.
  3. (, followed by a balanced string, followed by ), is balanced.
  4. (, followed by a balanced string, followed by ), followed by a balanced string, is balanced.

That led to this implementation:

const balanced =
  input => cases(
    just('()'),
    follows(just('()'), balanced),
    follows(just('('), balanced, just(')')),
    follows(just('('), balanced, just(')'), balanced)
  )(input);

If we change the problem statement, a balanced string is a sequence of one or more strings conforming to either of the following cases:

  1. ()
  2. (, followed by a balanced string, followed by )

That leads to this compact implementation:

const balanced =
  input =>
    oneOrMore(
      cases(
        just('()'),
        follows(just('('), balanced, just(')'))
      )
    )(input);

Is this better? Sometimes a more compact definition is considerably better. Sometimes, as with playing code golf, the code is correct, but actually harder to understand. This general problem–how compact is too compact–crops up with recursion all the time. It is mathematically advantageous to be able to implement things like iteration with recursion, and even to implement recursion without name binding, but in practice, our code is clearer with name binding and looping constructs.

The same is true of composing patterns. Sometimes, the most compact form is most elegant, but less readable than one that lists more cases explicitly. Since most of us are familiar with regular expressions, we’ll continue this essay presuming that using oneOrMore or zeroOrMore is advantageous.


nothing is nothing

another look at the degenerate case

Both the original and “compact” implementation included the “base” case of just('()'). With recursive problems, there’s always some kind of base case that is irreducible, and if we presume that an empty string is not balanced, () is our irreducible case.

But why do we assume that? It isn’t one of the cases given at the top of the essay, and as this problem is usually presented, what to do with an empty string is usually not mentioned one way or the other.2

In a production environment, sometimes we are given all of the requirements and have no flexibility. If we aren’t told how to handle something like the empty string, we ask and have to implement whatever answer we are given. Of course, sometimes the missing requirement is entirely up to us to implement as we see fit.

Let’s consider the possibility that we can unilaterally declare that the empty string is balanced (it certainly isn’t unbalanced!). If the empty string is balanced, we can actually make an even more compact rule:

A balanced string is a sequence of zero or more strings conforming to the following rule:

  1. (, followed by a balanced string, followed by )

And our implementation becomes:

const balanced =
  input =>
    zeroOrMore(
      follows(just('('), balanced, just(')'))
    )(input);

Notice that although we introduced the oneOrMore and zeroOrMore higher-order patterns, and although we interpreted an unstated requirement in such a way as to produce a more compact implementation, we are still employing the same approach of determining the shape of the problem and then creating an implementation that matches the shape of the problem as we understand it.
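To see what the compact rule accepts, here is a small sketch that repeats just, follows, and zeroOrMore as they appear in the complete solution later in this essay, so that it runs on its own. The inputs are made up for illustration. Note that matching the empty string returns '', which is falsy but is not false, so results need to be compared against false explicitly.

```javascript
// Copies of the supporting patterns from the complete solution.
const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

const follows =
  (...patterns) =>
    input => {
      let matchLength = 0;
      let remaining = input;

      for (const pattern of patterns) {
        const matched = pattern(remaining);

        if (matched === false) return false;

        matchLength = matchLength + matched.length;
        remaining = input.slice(matchLength);
      }

      return input.slice(0, matchLength);
    };

const zeroOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return input.slice(0, matchedLength);
    };

const balanced =
  input =>
    zeroOrMore(
      follows(just('('), balanced, just(')'))
    )(input);

balanced('');
  //=> '' (a successful match of zero characters, not false)

balanced('(())()');
  //=> '(())()'

balanced('(()');
  //=> '' (no balanced prefix longer than the empty string)
```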


construction

extending our pattern to handle multiple types of parentheses

A common extension to the problem is to match multiple types of parentheses. We can handle this requirement with two more cases:

const balanced =
  input =>
    zeroOrMore(
      cases(
        follows(just('('), balanced, just(')')),
        follows(just('['), balanced, just(']')),
        follows(just('{'), balanced, just('}'))
      )
    )(input);

We’ll need one more thing to complete our solution: We need to match strings that are entirely balanced, not just strings that have a balanced prefix:

const entirely =
  pattern =>
    input => {
      const matched = pattern(input);

      return matched !== false &&
        matched === input &&
        matched;
    };

const fubar = entirely(just('fubar'));

fubar('fubar stands for effed up beyond recognition')
  //=> false

fubar('fubar')
  //=> 'fubar'

And putting it all together:

const entirelyBalanced = entirely(balanced);

entirelyBalanced('({}(()))(()')
  //=> false

entirelyBalanced('({()[]})[[(){}]]')
  //=> '({()[]})[[(){}]]'

Success!


Finish

the complete solution

The supporting functions we need to implement our pattern-matching abstraction are:

const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

const cases =
  (...patterns) =>
    input => {
      const matches = patterns.map(p => p(input)).filter(m => m !== false);

      if (matches.length === 0) {
        return false;
      } else {
        return matches.sort((a, b) => b.length - a.length)[0];
      }
    };

const follows =
  (...patterns) =>
    input => {
      let matchLength = 0;
      let remaining = input;

      for (const pattern of patterns) {
        const matched = pattern(remaining);

        if (matched === false) return false;

        matchLength = matchLength + matched.length;
        remaining = input.slice(matchLength);
      }

      return input.slice(0, matchLength);
    };

const zeroOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return input.slice(0, matchedLength);
    };

const oneOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return matchedLength > 0 && input.slice(0, matchedLength);
    };

const entirely =
  pattern =>
    input => {
      const matched = pattern(input);

      return matched !== false &&
        matched === input &&
        matched;
    };

With these in hand, we implement our solution with:

const balanced =
  input =>
    zeroOrMore(
      cases(
        follows(just('('), balanced, just(')')),
        follows(just('['), balanced, just(']')),
        follows(just('{'), balanced, just('}'))
      )
    )(input);

const entirelyBalanced = entirely(balanced);

Is this good? Bad? Terrible?


In the balance

the good, the bad, and the ugly

The very good news about our solution is that the form of the solution exactly replicates the form of the problem statement as we defined it.

The bad news is that we require much more supporting code for our abstraction than code describing the solution. This is generally thought to be fine when we reuse this abstraction multiple times, amortizing the cost of the implementation across multiple uses. But for a one-off, it requires the reader of the code to grok our solution and the implementation of pattern-matching. That can be a bit much.

The ugly is that this particular implementation of pattern matching is slow and wasteful of memory. There is a silver lining, though: If we write some code in one place, and it is slow, when we optimize that code, it gets faster.

When we write an abstraction layer that is used by many pieces of code, and it is slow, all of those pieces of code are slow. Terrible! But that same leverage applies when we optimize that abstraction layer’s code. All of the code that uses it gets faster, “for free.”

At the end of the day, when we have a problem that looks like a pattern, we should at least consider writing a solution structured to match the structure of the pattern. And if the structure of the problem is recursive, then we should likewise consider making the structure of our solution recursive.

the end

(discuss on reddit)


Notes
  1. A pattern matching engine that always handles cases like this the same way is called deterministic. Another way is to presume that when the engine encounters a case where a pattern could match a rule in multiple ways, it is allowed to pick any of the options in order that the entire input match. That is called a non-deterministic engine, and non-deterministic engines can handle a richer variety of possible inputs. For example, consider the regular expression /a+b/. It matches one or more as followed by a single b. Our engine can handle that too. But what about /a+ab/? Our engine would fail, as it would greedily match all of the as, and then the pattern ab would fail. Non-deterministic engines can choose to match fewer than all of the as and succeed in that match.

  2. This discussion of treating the empty string as balanced was provoked by pizzarollexpert’s excellent comment on Reddit. 
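The greedy behaviour described in note 1 can be reproduced with the patterns from this essay. This is a sketch only: it repeats just, follows, and oneOrMore from the complete solution so that it runs on its own, and the name aPlusAB is made up for illustration.

```javascript
// Copies of the supporting patterns from the complete solution.
const just =
  target =>
    input =>
      input.startsWith(target) &&
      target;

const follows =
  (...patterns) =>
    input => {
      let matchLength = 0;
      let remaining = input;

      for (const pattern of patterns) {
        const matched = pattern(remaining);

        if (matched === false) return false;

        matchLength = matchLength + matched.length;
        remaining = input.slice(matchLength);
      }

      return input.slice(0, matchLength);
    };

const oneOrMore =
  pattern =>
    input => {
      let matchedLength = 0;
      let remaining = input;

      while (remaining.length > 0) {
        const matched = pattern(remaining);

        if (matched === false || matched.length === 0) break;

        matchedLength = matchedLength + matched.length;
        remaining = remaining.slice(matched.length);
      }

      return matchedLength > 0 && input.slice(0, matchedLength);
    };

// Our equivalent of /a+ab/ (hypothetical name).
const aPlusAB = follows(oneOrMore(just('a')), just('a'), just('b'));

aPlusAB('aaab');
  //=> false, because oneOrMore consumed all three as and left none for just('a')

/^a+ab/.test('aaab');
  //=> true, because the regular expression engine backtracks and gives one a back
```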

https://raganwald.com/2018/10/17/recursive-pattern-matching
Ruby's Hashes and Perl's Autovivification, in JavaScript

The Ruby programming language has the notion of a Hash. A Hash is a dictionary-like collection of unique keys and their values. Ruby hashes have most of the semantics of an ES6 Map, but also have the syntactic conveniences of Plain-Old-JavaScript-Objects (“POJOs”).

Interestingly, Ruby hashes also have the notion of programmatically determining a default value to be returned when accessing keys that have not been set. In JavaScript, the default value is always undefined.

In this essay we will look at rolling our own Hash class with Ruby-like semantics, and then we’ll examine one of the most interesting things that can be built on top of a Hash: Autovivification. We’ll go into a more thorough explanation below, but autovivifying hashes can be summed up as, Hashes that are recursively hashes, all the way down.

If that doesn’t whet our curiosity, nothing will!


©2007 Eagan Snow


Ruby Hashes

As noted, the Ruby programming language has the notion of a Hash. A Hash is a dictionary-like collection of unique keys and their values.1

Ruby hash literals have several syntaxes, including:

grades = { "Jane Doe" => 10, "Jim Doe" => 6 }
options = { :font_size => 10, :font_family => "Arial" }
options2 = { font_size: 10, font_family: "Arial" }

Ruby hashes are thus a little like JavaScript’s Map, because they permit the use of any object as a key, not just strings. On the other hand, you can access the values of a Ruby hash using square brackets, like this:

grades = { "Jane Doe" => 10, "Jim Doe" => 6 }
options = { :font_size => 10, :font_family => "Arial" }
options2 = { font_size: 10, font_family: "Arial" }

grades["Jane Doe"]
  #=> 10
options[:font_family]
  #=> "Arial"

That’s more like a JavaScript object. With a JavaScript Map, we have to use .get and .set, instead of [] and []=.

Ruby hashes have a default value that is returned when accessing keys that do not exist in the hash. If no default is set, nil is used. That is like JavaScript objects, which return undefined when we access a key that was not set. But in Ruby, you can set a different default value by passing it as an argument to Hash.new:

grades = Hash.new(0)
grades["Dorothy Doe"] = 9

grades["Tom Swift"]
  #=> 0
grades["Dorothy Doe"]
  #=> 9

Another way to provide a default value that is returned when accessing keys that do not exist in the hash is to supply a block. If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.

h = Hash.new { |hash, key| hash[key] = "Go Fish: #{key}" }
h["c"]           #=> "Go Fish: c"
h["d"]           #=> "Go Fish: d"

JavaScript objects do not have any notion of a default value that we can set. It’s always undefined.
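Without a configurable default, the usual workaround in plain JavaScript is to supply the fallback at every point of use. A small sketch of that pattern, with made-up data and a made-up variable name, shows the boilerplate a default-aware dictionary would save us:

```javascript
// Tallying with a plain object: every read of a possibly-missing key
// has to supply its own fallback, because the built-in default is undefined.
const tally = {};

for (const name of ["Dorothy Doe", "Tom Swift", "Dorothy Doe"]) {
  tally[name] = (tally[name] === undefined ? 0 : tally[name]) + 1;
}

tally["Dorothy Doe"];
  //=> 2
tally["Tom Swift"];
  //=> 1
tally["Jane Doe"];
  //=> undefined
```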


implementing hash in javascript

Given that JavaScript already has Object and Map, the only motivation to snarf any of Hash’s behaviour is going to be the ability to set our own default values. This is rather handy in Ruby, and it might be handy in JavaScript too. So let’s come up with a toy implementation we can play with.

The first thing we have to decide is whether we’ll base our implementation on Object or Map. For the purposes of this essay, Object has the nicer syntax, and using objects as dictionaries is the usual case in JavaScript. And a Map implementation will be trivial once the basic pattern is articulated. (The HashMap implementation, based on delegating to a Map, is below.)

When we create an instance of Hash, we’ll wrap it in a Proxy2 to handle access.

What behaviour do we want?

// use case zero
const obj = new Hash();

obj instanceof Hash
  //=> true

// use case one
const ages = new Hash();
ages["Dorothy Doe"] = 23;

ages["Tom Swift"]
  //=> undefined
ages["Dorothy Doe"]
  //=> 23

// use case two
const grades = new Hash(0);
grades["Dorothy Doe"] = 9;

grades["Tom Swift"]
  //=> 0
grades["Dorothy Doe"]
  //=> 9

// use case three
const h = new Hash((hash, key) => hash[key] = `Go Fish: ${key}`);

h["c"]
  //=> "Go Fish: c"
h["d"]
  //=> "Go Fish: d"

Since classes derive from Object by default, and since JavaScript objects all support [] notation, we can just use an empty class to handle use cases zero and one:

class Hash {
  // T.B.D.
}

const obj = new Hash();

obj instanceof Hash
  //=> true

const ages = new Hash();
ages["Dorothy Doe"] = 23;

ages["Tom Swift"]
  //=> undefined
ages["Dorothy Doe"]
  //=> 23

Use case two allows us to pass a non-function value as a default. We’ll make a constructor function, and incorporate a Proxy. Note that JavaScript allows us to return something other than the object created from a constructor. That is ripe for abuse, but returning decorated instances from a constructor is perfectly cromulent.

class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get: (target, key) =>
        Reflect.has(target, key)
          ? Reflect.get(target, key)
          : defaultValue
    });
  }
}

const grades = new Hash(0);
grades["Dorothy Doe"] = 9;

grades["Tom Swift"]
  //=> 0
grades["Dorothy Doe"]
  //=> 9
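The remark about constructors returning something other than the newly created object is easy to check in isolation. In this sketch (the class names are made up for illustration), a returned object replaces the instance, while a returned primitive is ignored:

```javascript
class ReturnsObject {
  constructor () {
    // Returning an object overrides the instance that `new` created.
    return { replaced: true };
  }
}

class ReturnsPrimitive {
  constructor () {
    // Returning a primitive from a base-class constructor is ignored,
    // and the freshly created instance is returned as usual.
    return 42;
  }
}

new ReturnsObject() instanceof ReturnsObject;
  //=> false
new ReturnsPrimitive() instanceof ReturnsPrimitive;
  //=> true
```

Hash relies on the first behaviour. Because the proxy it returns wraps this, and the proxy defines no trap for prototype lookups, obj instanceof Hash still holds.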

Our third use case involves checking whether defaultValue is an ordinary value, or a function.3

We could check every time it’s accessed, but instead we’ll assign different function bodies to the proxy’s get key. That way, it’s only checked at (open air quotes) compile time (close air quotes):4

class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get: (defaultValue instanceof Function)
        ? ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue(target, key))
        : ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue)
    });
  }
}

const h = new Hash((hash, key) => hash[key] = `Go Fish: ${key}`);

h["c"]
  //=> "Go Fish: c"
h["d"]
  //=> "Go Fish: d"
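One difference between the two forms is worth seeing directly: a plain default value is returned without being stored, while the function form stores whatever it chooses to, just as a Ruby block is responsible for storing its value. A sketch, repeating the Hash class above so that it runs on its own:

```javascript
class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get: (defaultValue instanceof Function)
        ? ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue(target, key))
        : ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue)
    });
  }
}

// Value form: reading a missing key returns the default but stores nothing.
const grades = new Hash(0);

grades["Tom Swift"];
  //=> 0
Object.keys(grades);
  //=> []

// Function form: this particular function assigns the value it generates.
const h = new Hash((hash, key) => hash[key] = `Go Fish: ${key}`);

h["c"];
  //=> "Go Fish: c"
Object.keys(h);
  //=> ["c"]
```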

As noted, we can make a Map-like Hash with even less hackery: we don’t need a proxy! But most idiomatic JavaScript uses objects, so that’s what we’ll use. This is enough to set the stage for the next bit of snarfing.


Bride of Frankenstein


Autovivifying Hashes

The Perl language also has hashes, and they have an interesting feature called autovivification. As explained in Implementing autovivification in Ruby hashes:

In Perl, the following line will successfully run:

$h{'a'}{'b'}{'c'} = 1;

even if $h was previously undefined. Perl will automatically set $h to be an empty hash, it will assign $h{'a'} to be a reference to an empty hash, $h{'a'}{'b'} to be an empty hash, and then finally assign $h{'a'}{'b'}{'c'} to 1, all automatically.

This is called autovivification in Perl: hash and array variables automatically “come to life” as necessary. This is incredibly convenient for working with multidimensional arrays, for example.

The syntax is a little different than Ruby or JavaScript, but the example snippet shows two things:

  1. $h is a variable that has not yet been bound. In JavaScript, it would be undefined. If we tried to assign one of its properties, it would break. In Perl, trying to assign a property of an undefined variable turns it into a hash. So $h{'a'} = 1 would “autovivify” $h and then assign 1 to {'a'}.
  2. Given a hash $h, the code $h{'a'}{'b'}{'c'} = 1; assigns hashes to {'a'}, and then {'a'}{'b'}, and then it assigns 1 to {'a'}{'b'}{'c'}.

This would be like writing this JavaScript:

const h['a']['b']['c'] = 1;

And having the interpreter execute the code as if we had written:

const h = { a: { b: { c: 1 } } };

Can we do this? Almost. We can’t autovivify a new variable as a hash, but given a hash, we can certainly autovivify its values. And we can tear a page out of Ruby’s book, as inspired by Implementing autovivification in Ruby hashes.


attempting to autovivify ruby-style hashes

Let’s review the Hash pseudo-class we created above. One of the things we can do is provide a default value for a hash. What if the default value is another Hash? BTW, for shits and giggles, we’ll use property-based notation in these examples, just to show how confusing JS can be for people coming from a more disciplined OO language:

const h2 = new Hash(new Hash());

h2.a.b = 1;

h2.a
  //=> a new hash

h2.a.b
  //=> 1

This arrangement looks promising, but it has two bugs. Rubyists have been bitten by the first one so often, they probably spotted it before I could mention that this code has a bug. I’ll show you the failure case:

h2.c.d = 2;

h2.a.d
  //=> 2

We have passed a single hash as the default value, so all of the keys that get ‘autovivified’ share the same hash. We really need to generate a new one every time we want a default value. For that, we need to use the function form:

const h3 = new Hash((target, key) => target[key] = new Hash());

h3.a.b = 1;
h3.c.d = 2;

h3.a.b
  //=> 1

h3.c.d
  //=> 2

h3.a.d
  //=> undefined
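The sharing can be confirmed by comparing identities. In this sketch (repeating the Hash class so that it runs on its own, with made-up variable names), the value form hands back one and the same hash for every missing key, while the function form creates and stores a separate hash per key:

```javascript
class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get: (defaultValue instanceof Function)
        ? ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue(target, key))
        : ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue)
    });
  }
}

// Value form: every missing key yields the very same default hash.
const shared = new Hash(new Hash());

shared.a === shared.c;
  //=> true

// Function form: each missing key gets its own hash, stored on first access.
const separate = new Hash((target, key) => target[key] = new Hash());

separate.a === separate.c;
  //=> false
separate.a === separate.a;
  //=> true
```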

That’s what we expect. And it leads us to the next problem. This only goes one level deep. It “vivifies” h3.a and h3.c as separate hashes, but when we type h3.a.d, we want another hash. But that’s two levels deep, so it doesn’t work. We can fix it to handle two levels:

const h4 = new Hash(
  (target, key) =>
    target[key] = new Hash(
      (target, key) =>
        target[key] = new Hash()
    )
);

Or three:

const h5 = new Hash(
  (target, key) =>
    target[key] = new Hash(
      (target, key) =>
        target[key] = new Hash(
          (target, key) =>
            target[key] = new Hash()
        )
    )
);

But we can only type so much of that. How do we make it work for an arbitrary number of levels?
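The depth limit is easy to hit. A sketch using the two-level h4 from above (with the Hash class repeated so that it runs on its own) shows vivification working for two levels and stopping at the third:

```javascript
class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get: (defaultValue instanceof Function)
        ? ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue(target, key))
        : ((target, key) =>
            Reflect.has(target, key)
              ? Reflect.get(target, key)
              : defaultValue)
    });
  }
}

const h4 = new Hash(
  (target, key) =>
    target[key] = new Hash(
      (target, key) =>
        target[key] = new Hash()
    )
);

// h4.a and h4.a.b are vivified, so assigning a third-level key works.
h4.a.b.c = 3;

h4.a.b.c;
  //=> 3

// The innermost hashes have no default, so vivification stops here.
h4.x.y.z;
  //=> undefined

// Consequently, h4.x.y.z.w = 1 would throw a TypeError.
```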


autovivifying hashes in Javascript, the classical approach

We’ve been doing everything in an “OO” style so far, let’s take things to their natural OO conclusion:

class AutovivifyingHash extends Hash {
  constructor () {
    super(
      (target, key) => target[key] = new AutovivifyingHash()
    );
  }
}

const avh = new AutovivifyingHash();

avh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z = 'alpha beta';

avh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z
  //=> "alpha beta"

It works! We’ve introduced recursion by having our constructor use a reference to the name of the class.

That being said, maybe we don’t want a brand new class, maybe we want to use our Hash, but do something recursive with the function we use to generate default values. Something like:

const autovivifyingHash = () =>
  new Hash(
    (target, key) => target[key] = autovivifyingHash()
  );

const fh = autovivifyingHash();

fh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z = 'alpha beta';

fh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z
  //=> "alpha beta"

We’re still performing recursion by name. Which is fine, JavaScript has names and scopes, we ought to make use of them. But that being said, it’s good to know how to make an autovivifying hash without requiring a reference to a class or a function.

And we remember how to do that from the essays on recursive combinators: To Grok a Mockingbird, and Why Y? Deriving the Y Combinator in JavaScript:

const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );


const yh = new Hash(
  why((myself, target, key) => target[key] = new Hash(myself))
);

yh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z = 'alpha beta';

yh.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z
  //=> "alpha beta"

This style allows us to make autovivifying hashes wherever we like, without having to set up a new class and a new module. This is exactly the approach explained in Implementing autovivification in Ruby hashes.
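To see what why contributes on its own, it can help to apply it to an ordinary recursive function. This is only a sketch: factorial is an illustrative example and has nothing to do with the hash code, but it shows the same mechanism, namely that the function receives itself as its first argument instead of referring to itself by name.

```javascript
const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

// The function body never mentions `factorial`; it recurses through `myself`.
const factorial = why(
  (myself, n) => n <= 1 ? 1 : n * myself(n - 1)
);

factorial(5);
  //=> 120
```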

Of course, in Ruby the Hash class comes baked in, so there’s a good incentive to build upon a standard and very common data structure. In JavaScript, we have to build our own. If we’re not that interested in classical OO, maybe we can back up and strip things down to their essentials?

autovivifying hashes in Javascript, the idiomatic approach

If we pare things down to their essentials, we can drop the entire Hash class and just use a function. Here it is calling itself by name:

const autovivifying = () => new Proxy({}, {
  get: (target, key) =>
    Reflect.has(target, key)
      ? Reflect.get(target, key)
      : target[key] = autovivifying()
});

const ah = autovivifying();

ah.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z = 'alpha beta';

ah.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z
  //=> "alpha beta"

And if we want to go “pure” and avoid any issues with binding, we’ll use why as above, but without any classes involved:

const ph = why(
  myself => new Proxy({}, {
    get: (target, key) =>
      Reflect.has(target, key)
        ? Reflect.get(target, key)
        : target[key] = myself()
  })
)();

ph.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z = 'alpha beta';

ph.a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z
  //=> "alpha beta"

Yowza, our code is not in Kansas any more.


Scottish Parliament ©2006 Martin Pettitt


Of Course! But Maybe…

Most of the time, we do not debate whether the things in this blog belong in production–or any, really–code. The point is to explore ideas. What matters is not “here is something we can use tomorrow,” as much as, “here are some ideas that change the way we think about code.”

When we integrate those changes to the way we think about code with the various forces acting upon our code decisions, we may end up with something that on the surface is entirely different from the code in these posts, but has been influenced by the journey we take working things out.

But that being said, this particular post touches on various ways to build this feature, from heavyweight OO to lightweight functions, with exotica like proxies tossed around at will. When we discuss whether and/or when to use such techniques, we also discuss ideas of general application around abstraction and pragmatism.

So here goes.


do we need a hash class?

We began by writing a Hash class to imitate what Ruby provides “out-of-the-box:”

class Hash {
  constructor (defaultValue = undefined) {
    return new Proxy(this, {
      get:
        (defaultValue instanceof Function)
          ? ((target, key) =>
              Reflect.has(target, key)
                ? Reflect.get(target, key)
                : defaultValue(target, key))
          : ((target, key) =>
              Reflect.has(target, key)
                ? Reflect.get(target, key)
                : defaultValue)
    });
  }
}

And we still need to put autovivification on top of this:

class AutovivifyingHash extends Hash {
  constructor () {
    super((target, key) => target[key] = new AutovivifyingHash());
  }
}

This seems like overkill if all we want is autovivification. If that’s all we need, better to write the simplest thing that could possibly work:

const autovivifying = () => new Proxy({}, {
  get: (target, key) =>
    Reflect.has(target, key)
      ? Reflect.get(target, key)
      : target[key] = autovivifying()
});

It may come down to whether a particular code base leans towards OO or lightweight FP. Both are fine approaches (as are mixed-paradigm approaches), so it could be that regardless of the number of lines of code, the approach that resembles the rest of the code base is most correct.


the rule of three

When might it be sensible to write Hash? Well, in programming there is a rule of three. It is usually applied to removing duplication: When you first write something, you obviously don’t worry about duplication. If you write it a second time, make a note of the duplication, but don’t rush to refactor things. Only when you need to write it for the third time do you refactor everything to eliminate duplication.

Why wait for the third use? And why do we even count the first use? Well, there is a cost to de-duplication, often in the form of generalization and/or abstraction. Take the Hash class. If we write the entire class but only use it to make the AutovivifyingHash class, we are incurring the costs of de-duplication before we’ve even used it twice.

In essence, we’re deciding that at some point we will use Hash again, and at that time we can benefit from a single class that multiple pieces of code can share, but we’d like to pay that cost now. This is called Premature Abstraction.

Of course, it could be that we spot multiple uses for the Hash class. In that case, there is a benefit to bundling it up on its own. And the rule of three helps us with this decision. If there are two other pieces of code that would benefit from being written with (or refactored to use) Hash, just the way it is, then having a separate Hash class is a win.

If not, we shouldn’t bother. If and when we have another use for it, we can refactor. This isn’t the kind of decision where we fear that if we fail to make the perfect choice today, we’ll be stuck with our mistake forever.


should we be even more oo?

And as long as we are not being fanatic about functions being superior to classes, we might want to also consider whether a Hash based on Map is superior to one based on Object. Consider:

const DEFAULT_KEY = Symbol("default-key");
const MAP = Symbol("map");

class HashMap {
  constructor (defaultValue = undefined) {
    this[MAP] = new Map();
    this[DEFAULT_KEY] = defaultValue;
  }

  has(key) {
    return this[MAP].has(key);
  }

  get (key) {
    if (this[MAP].has(key)) {
      return this[MAP].get(key);
    } else {
      const defaultValue = this[DEFAULT_KEY];

      if (defaultValue instanceof Function) {
        return defaultValue(this, key);
      } else {
        return defaultValue;
      }
    }
  }

  set (key, value) {
    return this[MAP].set(key, value);
  }
}

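class AutovivifyingHashMap extends HashMap {
  constructor () {
    super((target, key) => {
      // HashMap's .set returns the underlying Map rather than the value,
      // so we return the freshly vivified hash map explicitly.
      const value = new AutovivifyingHashMap();

      target.set(key, value);
      return value;
    });
  }
}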

const hm = new AutovivifyingHashMap();
hm.get(1).get(2).set(3, 123);

hm.get(1).get(2).get(3)
  //=> 123

HashMap as given here delegates to an instance of Map, while allowing for a custom default value. The obvious advantage is that since it’s based on Map, we can use arbitrary values as keys (including primitives and object references), not just strings. If you need that, you need HashMap, not Hash, period.56
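A sketch of why the choice of keys matters, repeating the HashMap class so that it runs on its own. The alice, bob, visits, and pojo names are made up for illustration. With a plain object, both object keys are converted to the same string, so they collide; with HashMap they stay distinct, and the default value still applies:

```javascript
const DEFAULT_KEY = Symbol("default-key");
const MAP = Symbol("map");

class HashMap {
  constructor (defaultValue = undefined) {
    this[MAP] = new Map();
    this[DEFAULT_KEY] = defaultValue;
  }

  has (key) {
    return this[MAP].has(key);
  }

  get (key) {
    if (this[MAP].has(key)) {
      return this[MAP].get(key);
    } else {
      const defaultValue = this[DEFAULT_KEY];

      if (defaultValue instanceof Function) {
        return defaultValue(this, key);
      } else {
        return defaultValue;
      }
    }
  }

  set (key, value) {
    return this[MAP].set(key, value);
  }
}

const alice = { name: "Alice" };
const bob = { name: "Bob" };

// Object references as keys, with a default of 0 for unseen keys.
const visits = new HashMap(0);

visits.set(alice, visits.get(alice) + 1);
visits.set(alice, visits.get(alice) + 1);

visits.get(alice);
  //=> 2
visits.get(bob);
  //=> 0

// A plain object converts both keys to the string "[object Object]".
const pojo = {};

pojo[alice] = 1;

pojo[bob];
  //=> 1
```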

Its advantage from an architectural perspective is that there’s no Proxy magic. We are not against metaprogramming of any kind, but sometimes in a code base we make the decision to prefer explicit to implicit. We can generally expect that if we call a .get method on a HashMap instance, it will decorate the basic functionality of Map. In JavaScript, we don’t normally expect the behaviour of [] or .foo to be customized.

The idea of overriding methods is canon in OOP, so overriding .get to autovivify another dictionary is colouring well within the lines. A certain type of OO purist would prefer this approach, even if it means giving up the [] and []= syntax. If the remainder of the code base leans towards this philosophy, HashMap may be superior to Hash.

On the other hand, if the code base leans heavily on using POJOs as dictionaries, and using [] and []= to access them, Hash may be the better choice.


but libraries

There’s another special consideration. If we are writing code for others, such as when writing a library, then the rule of three doesn’t apply. If our library is successful, then even the least commonly used classes and functions will be used in dozens or even hundreds of code bases. Conversely, with a library, change is hard: We can’t reach out and refactor our downstream dependents if we decide in the future to increase the level of abstraction.

That makes fairly obvious sense. Our architectural decisions around our application code should favour pragmatism, while our architectural decisions around libraries encourage more forward-thinking.

And there’s another way in which libraries influence our choices. If we have something like the Hash class tucked away in a library, it’s a lot easier to justify building on it. We have some idea that maintaining it is “free.” Whereas, every line of code we write carries a cost of some kind.

If we have to write our own Hash or HashMap class, fine, but we need good reasons to add the abstraction and maintenance cost to our code. But if we can get it “for free,” then of course we still need to understand how it works, but it’s easier to justify building on it if we never have to worry about maintaining Hash or HashMap itself.


does autovivification make sense in javascript?

Now here’s another question. Hash is part of Ruby’s standard idioms, so an autovivifying hash isn’t a big leap away from how Ruby already works. There is precedent for a hash to have default values, even dynamically generated default values.

But JavaScript doesn’t have anything like Ruby’s Hash to begin with. So whether we’re building a brand new Hash/HashMap class, or using the lightweight, idiomatic approach, we’re taking two steps forward with autovivifying hashes: we’re promoting the idea of a dictionary generating default values, and also promoting the idea that the entire process is recursive. Well, maybe one-and-a-half steps forward, a sesqui-leap outside of the comfort zone.

What do we get in return? We get to write:

const h = autovivifying();

h.a.b.c = "Poirot's Famous Case";

And we do so to avoid writing:

const h = { a: { b: { c: "Poirot's Famous Case" } } };

It’s not gi-normously more compact, but there is some value in the autovivifying syntax if conceptually we are trying to think of a path to some data in a tree. And in all fairness, if we only look at initializing data (which is the normal case for working in a strongly functional style), we ignore the benefits of autovivification in a more imperative style.

If we are given an existing autovivifying hash, we can easily add anything we want, anywhere in its tree, with h.a.b.c = "Poirot's Famous Case";. But we cannot write h = { a: { b: { c: "Poirot's Famous Case" } } }; for an existing data structure, because we might overwrite existing hashes.

An autovivifying hash might be a win, more so if we expect to update existing hashes.
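A sketch of that imperative style, using the lightweight autovivifying function from earlier. The extra keys and values are placeholders made up for illustration:

```javascript
const autovivifying = () => new Proxy({}, {
  get: (target, key) =>
    Reflect.has(target, key)
      ? Reflect.get(target, key)
      : target[key] = autovivifying()
});

const h = autovivifying();

h.a.b.c = "Poirot's Famous Case";

// Later updates reach into the same tree without replacing h.a or h.a.b.
h.a.b.d = "second value";
h.a.x.y = "third value";

h.a.b.c;
  //=> "Poirot's Famous Case"
h.a.b.d;
  //=> "second value"
h.a.x.y;
  //=> "third value"

// A caveat of this implementation: merely reading a missing path creates it.
h.q.r;

"q" in h;
  //=> true
```

With a plain object, the equivalent of h.a.x.y = "third value" would throw a TypeError, because a.x would be undefined. The price of the convenience is the caveat shown at the end: reads are no longer free of side effects.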


let’s debate performance

This space left intentionally blank.

Ok, seriously, it’s important to know things like, “Proxy is dog-slow compared to ordinary object access.” But after that, most things don’t need to be fast. If something does need to be fast, you profile it.

Wait! Stop!! We don’t mean that we should run speed tests on the various snippets of code to determine which idiom is the fastest. We mean that in production, when we identify an honest-to-goodness bottleneck that has meaningful impact on outcomes like user experience, only then do we drill down and figure out which slow piece of code is holding things up.

Until then, we prioritize ease of writing and maintaining code. And since we have confidence that we can refactor our code safely, we know that if and when we discover that Proxy is a problem, for example, we can easily rewrite our code then.


Vulcan & two Lancasters formation ©2014 Alan Wilson


Looking Back

We started with a Ruby data structure and idiom, the Hash class and its ability to customize the default value for missing keys. We implemented a version in JavaScript, and then following the suggestion of Implementing autovivification in Ruby hashes, we looked at a few ways to implement autovivification, both on top of Hash and without it.

Finally, we looked at some of the considerations before adopting these ideas:

  1. We should not abstract based on one or two applications; we should wait until we have three uses for an abstraction. That applies to Hash and AutovivifyingHash.
  2. It’s slightly easier to adopt something like Hash if we can get it from a library.
  3. The rule of three does not apply if we ourselves are writing a library: It’s our job to guess that dozens or hundreds of downstream users will adopt our abstraction.
  4. If we need arbitrary objects as keys, or if we prefer a more pure OO approach, a Map-based approach may be preferred.
  5. Auto-vivification is not much of a win for immutable data, but may be useful if we are constantly adding data to tree-like structures.
  6. Premature optimization is the root of all evil, but it’s not wrong to be aware of the performance of our implementation.

ttfn!

(discuss on /r/javascript and Hacker News)


Notes
  1. Programming language libraries have an awful track record for naming things. A “hash” is generally, of course, a way of implementing a “dictionary” or “associative array.” But we generally use the word “Hash” to describe a dictionary, even if we don’t particularly care how it’s implemented. 

  2. The Proxy object is used to define custom behaviour for fundamental operations (e.g. property lookup, assignment, enumeration, function invocation, etc). It is enormously flexible, but extremely slow compared to “native” behaviour. Does that mean we should never use it? No, it means we should use our judgment. 

  3. We are hand-waving over the possibility that we’d ever want a hash that returns a function by default. This design is fine for the purposes of exposition, but if we ever consider shipping such a thing to the world in a library, we might reconsider our design choices. 

  4. This is a questionable optimization. It’s not excessively clever code, but the performance benefit is negligible given the costs of using a proxy for these instances, and it precludes us from implementing another feature of Ruby’s hashes, the ability to mutate the default value of an existing instance. 

  5. For example, in this exact blog you can find a memoize function decorator. It uses an object-based dictionary to store a mapping from keys to result values. Quite obviously, a Map-based implementation would be more generally useful. 

  6. In general, we prefer delegation/composition to extension (aka “inheritance”). This is discussed at length in Mixins, Forwarding, and Delegation in JavaScript. But it should be noted that with respect to the built-in Map class, we should be careful. Extending Map generally works in environments that provide a native Map class, but can break when transpiling ES6 to ES5 for compatibility. 

https://raganwald.com/2018/09/12/auto-vivifying-hash
Why Y? Deriving the Y Combinator in JavaScript

…and two practical applications…


The Y Combinator is an important result in theoretical computer science.1

In this essay, after a brief review of the work we’ve already done on the Mockingbird, we’ll derive the Why Bird, known most famously as the Y Combinator. The why bird provides all the benefits of the mockingbird, but allows us to write more idiomatic JavaScript. We’ll see that one of the benefits of writing recursive functions in “why bird form” is that we can compose and decorate them easily.

We’ll then derive the “Decoupled Trampoline,” a/k/a “Long-Tailed Widowbird.” The decoupled trampoline builds on the why bird and Y Combinator to allow us to write tail-recursive functions that execute in constant stack space, while hewing closely to idiomatic JavaScript.

While this use case is admittedly rare in production code, it does arise from time to time and it is pleasing to contemplate a direct connection between one of programming’s most cerebrally theoretical constructs, and a tool for overcoming the limitations of today’s JavaScript implementations.


Hood Mockingbird copyright 2007


Preamble: Revisiting the mockingbird

To review what we saw in To Grok a Mockingbird, a typical recursive function calls itself by name, like this:2

function exponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
}

exponent(2, 7)
  //=> 128

Because it calls itself by name, it is tightly coupled to itself. This means that if we want to decorate it–such as by memoizing its return values, or if we want to change its implementation strategy–like employing trampolining–we have to rewrite the function.
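To see the coupling concretely, here is a small sketch. The logged decorator and the seen array are our own illustrative names, not something from the earlier essay. We wrap exponent from the outside and record every call the wrapper actually observes:

```javascript
// `seen` and `logged` are hypothetical helpers for this sketch.
const seen = [];

// A decorator that records the arguments of every call passing through it.
const logged = fn => (...args) => {
  seen.push(args);
  return fn(...args);
};

function exponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
}

const loggedExponent = logged(exponent);

const result = loggedExponent(2, 7);
// result is 128, but seen.length is 1:
// the recursive calls go straight to `exponent` by name and bypass the wrapper.
```

Only the outermost call passes through the decorator. The three recursive calls never see it, which is exactly the problem that decoupling is meant to solve.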

We saw that we can decouple a recursive function from itself. Instead of calling itself by name, we arrange to pass the recursive function to itself as a parameter. We begin by rewriting our function to take itself as a parameter, and also to pass itself as a parameter.

We call that writing a recursive function in mockingbird form. It looks like this:

(myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

Given a function written in mockingbird form, we use a JavaScript implementation of the mockingbird to turn it into a recursive function:3

const mockingbird =
  fn =>
    (...args) =>
      fn(fn, ...args);

const exponent =
  mockingbird(
    (myself, x, n) => {
      if (n === 0) {
        return 1;
      } else if (n % 2 === 1) {
        return x * myself(myself, x * x, Math.floor(n / 2));
      } else {
        return myself(myself, x * x, n / 2);
      }
    }
  );

exponent(3, 3)
  //=> 27

Because the recursive function has been decoupled from itself, we can do things like memoize it:

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

  return function (...args) {
    const key = keymaker.call(this, args);

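    // Test for the key itself, so that falsy results (0, false, '') are cached too.
    return key in lookupTable
      ? lookupTable[key]
      : (lookupTable[key] = fn.apply(this, args));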
  }
};

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

const exponent =
  mockingbird(
    memoized(
      (myself, x, n) => {
        if (n === 0) {
          return 1;
        } else if (n % 2 === 1) {
          return x * myself(myself, x * x, Math.floor(n / 2));
        } else {
          return myself(myself, x * x, n / 2);
        }
      },
      ignoreFirst
    )
  );

Memoizing our recursive function does not require any changes to its code. We can easily reuse it elsewhere if we wish.
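As a rough illustration of that reuse, the sketch below takes one function written in mockingbird form and uses it twice: once plain, once memoized. The bodyRuns counter and the names exponentBody, plain, and cached are assumptions made for this example only:

```javascript
const mockingbird =
  fn =>
    (...args) =>
      fn(fn, ...args);

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

  return function (...args) {
    const key = keymaker.call(this, args);

    return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
  };
};

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

// Hypothetical counter: how many times the function body actually runs.
let bodyRuns = 0;

const exponentBody = (myself, x, n) => {
  bodyRuns += 1;
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

// The same body, composed two different ways.
const plain = mockingbird(exponentBody);
const cached = mockingbird(memoized(exponentBody, ignoreFirst));

bodyRuns = 0;
plain(2, 8);
plain(2, 8);
const plainRuns = bodyRuns;   // 10: five runs per call, nothing is remembered

bodyRuns = 0;
cached(2, 8);
cached(2, 8);
const cachedRuns = bodyRuns;  // 5: the second call is answered from the lookup table
```

The body of exponentBody is identical in both compositions; only the wrapping differs.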


Curlicue copyright 2017 Anja Pietsch


why the mockingbird needs improvement

The mockingbird is a useful tool, but it has a drawback: In addition to rewriting our functions to take themselves as a parameter, we also have to rewrite them to pass themselves along. So in addition to this:

(myself, x, n) => ...

We must also write this:

myself(myself, x * x, n / 2)

The former is the point of decoupling. The latter is nonsense!

The idea behind the way we use the mockingbird (as opposed to a literal interpretation of the M combinator) is to write idiomatic JavaScript. But there’s nothing “idiomatic” about a function invoking itself with myself(myself, ...).

What we want is a function like the mockingbird, but it must support functions calling themselves idiomatically, e.g. myself(x * x, n / 2).

Let’s visualize exactly what we want. With the mockingbird, we write:

const exponent =
  mockingbird(
    (myself, x, n) => {
      if (n === 0) {
        return 1;
      } else if (n % 2 === 1) {
        return x * myself(myself, x * x, Math.floor(n / 2));
      } else {
        return myself(myself, x * x, n / 2);
      }
    }
  );

We want a better recursive combinator, one that lets us write:

const exponent =
  _____(
    (myself, x, n) => {
      if (n === 0) {
        return 1;
      } else if (n % 2 === 1) {
        return x * myself(x * x, Math.floor(n / 2));
      } else {
        return myself(x * x, n / 2);
      }
    }
  );

Let’s build that!


Sage Grouse Lek © 2006 BLM Wyoming


Part I: Deriving the Y Combinator from the Mockingbird

Before we begin, there are some rules we have to follow if we are to take the mockingbird and derive another combinator from it. Every combinator has the following properties:

  1. It is a function.
  2. It can only use its parameters or previously defined combinators that have these same properties. This means it cannot refer to non-combinators, like constants, ordinary functions, or objects from the global namespace.
  3. It cannot be a named function declaration or named function expression.
  4. It cannot create a named function expression.
  5. It cannot declare any bindings other than via parameters.
  6. It can invoke a function in its implementation.

In combinatory logic, all combinators take exactly one parameter, as do all of the functions that combinators create. Combinatory logic also eschews all gathering and spreading of parameters. When creating an idiomatic JavaScript combinator (like mockingbird), we eschew these limitations. Idiomatic JavaScript combinators can:

  • Gather parameters, and;
  • Spread parameters.
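A few small examples may make these rules easier to apply. This is only a sketch: K and the bluebird are well-known combinators, but the names idiomaticBluebird and notACombinator are ours, chosen for illustration:

```javascript
// Follows the rules: a function built from nothing but its parameters,
// taking exactly one parameter at a time (the formal style).
const K = x => y => x;

// The formal bluebird composes two one-parameter functions.
const B = f => g => x => f(g(x));

// An idiomatic JavaScript variation: still built only from its parameters,
// but it takes two parameters at once and gathers and spreads the rest.
const idiomaticBluebird = (f, g) => (...args) => f(g(...args));

// Breaks rule 2: it reaches out to `Math`, an object from the global namespace.
const notACombinator = x => Math.abs(x);

const increment = n => n + 1;
const multiply = (a, b) => a * b;

const formalResult = B(increment)(increment)(1);                   // 3
const idiomaticResult = idiomaticBluebird(increment, multiply)(3, 4); // 13
```

The idiomatic version is the kind of relaxation we allow ourselves for the mockingbird and, below, for the why bird.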

Sage Thrasher © 2016 Bettina Arrigoni


from mockingbird to why bird in seven easy pieces

Step Zero: We begin with the mockingbird:

const mockingbird =
  fn =>
    (...args) =>
      fn(fn, ...args);

Step One, we name our combinator. Honouring Raymond Smullyan’s choice, we shall call it the why bird:

const why =
  fn =>
    (...args) =>
      fn(fn, ...args);

Step Two, we identify the key change we have to make:

const why =
  fn =>
    (...args) =>
      fn(?, ...args);

We’ve replaced that one fn with a placeholder. Why? Well, our fn is a function that looks like this: (myself, arg0, arg1, ..., argn) => .... But whatever we pass in for myself will look like this: (arg0, arg1, ..., argn) => .... So it can’t be fn.

But what will it be?

Well, the approach we are going to take is to think about the mockingbird. What does it do? It takes a function like (myself, arg0, arg1, ..., argn) => ..., and returns a function that looks like (arg0, arg1, ..., argn) => ....

The mockingbird isn’t what we want, but let’s airily assume that there is such a function. We’ll call it maker, because it makes the function we want.

Step Three, we replace ? with maker(??). We know it will make the function we want, but we don’t yet know what we must pass to it:

const why =
  fn =>
    (...args) =>
      fn(maker(??), ...args);

This leaves us two things to figure out:

  1. Where do we get maker, and;
  2. What parameter(s) do we pass to it.

For 1, we could define maker as an anonymous function expression. Another option arises. In the days before ES6, if we wanted to define variables within a scope smaller than a function, we created an immediately invoked function expression.

Step Four, after some experimentation, we could consider this format, which binds a function expression to maker with an immediately invoked function expression:

const why =
  fn =>
    (
      maker =>
        (...args) =>
          fn(maker(??), ...args)
    )(???);

This still leaves us two things to work out: ?? is what we pass to maker, and ??? is maker’s expression. Here’s the “Eureka!” moment:

maker is a function that takes one or more parameters and returns a function that looks like (...args) => fn(maker(??), ...args). That’s the function we want to pass to fn as myself. The mockingbird isn’t such a function, but we can see one right in front of us:

maker => (...args) => fn(maker(??), ...args) is a function that takes one parameter (maker) and returns a function that looks like (...args) => fn(maker(??), ...args)!

Step Five, let’s fill that in for ???:

const why =
  fn =>
    (
      maker =>
        (...args) =>
          fn(maker(??), ...args)
    )(
      maker =>
        (...args) =>
          fn(maker(??), ...args)
    );

That leaves ??, the parameter we pass to maker. Well, we have just decided two things:

  1. maker takes one or more parameters and returns a function that looks like (...args) => fn(maker(??), ...args), and;
  2. maker => (...args) => fn(maker(??), ...args) is a function that takes one parameter (maker) and returns a function that looks like (...args) => fn(maker(??), ...args).

Conclusion: maker takes one parameter, maker, and returns a function that looks like (...args) => fn(maker(??), ...args). Therefore, the expression we want is maker(maker), and ?? is nothing more than maker!

Step Six:

const why =
  fn =>
    (
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    )(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

Let’s test it:

const exponent =
  why(
    (myself, x, n) => {
      if (n === 0) {
        return 1;
      } else if (n % 2 === 1) {
        return x * myself(x * x, Math.floor(n / 2));
      } else {
        return myself(x * x, n / 2);
      }
    }
  );

exponent(2, 9)
  //=> 512

Voila! A working why bird!!


Underwood Typewriter Keys ©2010 Steve Depolo


from why bird to y combinator

Our why bird is written in–and for–idiomatic JavaScript, especially with respect to employing functions that take more than one parameter. A direct implementation of a formal combinator only takes one parameter and only works with functions that take one parameter.

We can translate our why bird to its formal combinator, the y combinator. To aid us, let’s first imagine a recursive function:

const isEven =
  n =>
    (n === 0) || !isEven(n - 1);

In why bird form, it becomes:

const _isEven =
  (myself, n) =>
    (n === 0) || !myself(n - 1);

Alas, it now takes two parameters. We fix this by currying it:

const __isEven =
  myself =>
    n =>
      (n === 0) || !myself(n - 1);

Instead of taking two parameters (myself and n), it is now a function taking one parameter, myself, and returning a function that takes another parameter, n.
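The conversion itself is mechanical. As a sketch, a hypothetical helper (we call it curry2 here; it is not part of the essay’s toolkit) can turn the two-parameter form into the curried form, and we can poke at the result by handing it a stand-in for myself:

```javascript
const _isEven =
  (myself, n) =>
    (n === 0) || !myself(n - 1);

// Hypothetical helper: turns a two-parameter function into
// a function of one parameter that returns a function of one parameter.
const curry2 = fn => a => b => fn(a, b);

// Behaves like the hand-written curried version.
const __isEven = curry2(_isEven);

// A stand-in for `myself`, just so we can call the curried function directly.
const alwaysTrue = n => true;

const baseCase = __isEven(alwaysTrue)(0);
// true: the base case never consults `myself`

const oneStep = __isEven(alwaysTrue)(1);
// false: the result is the negation of whatever `myself` answers
```

Nothing here is recursive yet. Supplying a real value for myself is the combinator’s job.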

To accommodate functions in this form, we take our why bird and perform some similar modifications. We’ll start as above by renaming it:

const Y =
  fn =>
    (
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    )(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

Next, we observe that (...args) => fn(maker(maker), ...args) is not allowed: we do not gather and spread parameters. First, we change ...args into just arg, since only one parameter is allowed:

const Y =
  fn =>
    (
      maker =>
        arg => fn(maker(maker), arg)
    )(
      maker =>
        arg => fn(maker(maker), arg)
    );

fn(maker(maker), arg) is also not allowed: we do not pass two parameters to any function. Instead, we pass one parameter, get a function back, and pass the second parameter to that function. Like this:

const Y =
  fn =>
    (
      maker =>
        arg => fn(maker(maker))(arg)
    )(
      maker =>
        arg => fn(maker(maker))(arg)
    );

Let’s try it:

Y(
  myself =>
    n =>
      (n === 0) || !myself(n - 1)
)(1962)
  //=> true

It works too, and now we have derived one of the most important results in theoretical computer science. The Y Combinator matters deeply, because in the kind of formal computation models that are simple enough to prove results (like the Lambda Calculus and Combinatory Logic), we do not have any iterative constructs, and must use recursion for nearly everything non-trivial.4

The Y Combinator makes recursion possible without requiring variable declarations. As we showed above, we can even make an anonymous function recursive, which is necessary in systems where functions do not have names.5
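To underline the point about anonymous functions, here is a sketch in which nothing in the recursive machinery has a name. The Y combinator derived above is written inline and applied directly to an anonymous curried factorial. The only binding, result, exists so that we can look at the answer, and the choice of factorial is ours:

```javascript
const result =
  (fn =>
    (
      maker =>
        arg => fn(maker(maker))(arg)
    )(
      maker =>
        arg => fn(maker(maker))(arg)
    )
  )(
    // An anonymous function in curried why bird form.
    myself =>
      n =>
        (n === 0) ? 1 : n * myself(n - 1)
  )(5);
// result is 120
```

Neither the combinator nor the factorial function is bound to a name, yet the function still manages to call itself.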


Dame Judi Dench as Lady Miles Messervy


if a forest contains a mockingbird, it also contains a why bird

Looking at this expression of the Y combinator, we can see why it was named after the letter “Y”: the code literally looks like a forking branch:

const Y =
  fn =>
    (m => a => fn(m(m))(a))(
      m => a => fn(m(m))(a)
    );

Since we’re talking direct implementations of formal combinators, let’s have a look at the M combinator:

const M =
  fn => fn(fn);

We can combine M and Y to create a more compact expression of the Y combinator:

const Y =
  fn =>
    M(m => a => fn(m(m))(a));

The compact expression of the Y combinator is usually expressed with the M combinator “reduced” to (x => x(x)):

const Y =
  fn =>
    (x => x(x))(m => a => fn(m(m))(a));

We can use the reduced M combinator to make a compact why bird, too:

const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

And with that, we have derived compact implementations of both the Y combinator and its idiomatic JavaScript equivalent, the why bird, from our mockingbird implementation.
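One detail in these expressions deserves a second look: the a => fn(m(m))(a) wrapper. In the Lambda Calculus, the Y combinator is usually written without it. The sketch below (the names Z, textbookY, and isEvenBody are ours) suggests why an eagerly evaluated language like JavaScript needs the wrapper:

```javascript
// The form derived above: the self-application m(m) is delayed
// inside `a => ...`, so it only runs when the function is called.
const Z =
  fn =>
    (x => x(x))(m => a => fn(m(m))(a));

// A direct transliteration of the textbook form, without the wrapper.
// JavaScript evaluates m(m) eagerly, before fn is ever invoked.
const textbookY =
  fn =>
    (x => x(x))(m => fn(m(m)));

const isEvenBody =
  myself =>
    n =>
      (n === 0) || !myself(n - 1);

const viaZ = Z(isEvenBody)(10);
// true

let overflowed = false;
try {
  // Merely constructing the recursive function recurses without end.
  textbookY(isEvenBody);
} catch (error) {
  // Engines report this differently; V8 throws a RangeError.
  overflowed = true;
}
// overflowed is true
```

Wrapping the self-application in a function is what lets an eager language put off evaluating m(m) until an argument actually arrives.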


Spiral ©2012 Renzo Borgatti


Part II: Two practical applications for the Y Combinator

This function for determining whether a number is even is extremely slow:

const _isEven =
  (myself, n) =>
    (n === 0) || !myself(n - 1);

const isEven = why(_isEven);

isEven(1000)
  //=> Go for coffee,
  //   then return to find out that the answer is `true`

isEven(1001)
  //=> Go for another coffee,
  //   then return to find out that the answer is `false`

As discussed in To Grok a Mockingbird, one of the things we can do to improve performance is to memoize our function:

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

  return function (...args) {
    const key = keymaker.call(this, args);

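    // Test for the key itself, so that falsy results (such as `false`) are cached too.
    return key in lookupTable
      ? lookupTable[key]
      : (lookupTable[key] = fn.apply(this, args));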
  }
};

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

const isEvenFast = why(memoized(_isEven, ignoreFirst));

isEvenFast(1000)
  //=> Go for coffee,
  //   then return to find out that the answer is `true`

isEvenFast(1001)
  //=> false, immediately

Because the why bird decouples our _isEven function from itself, we can compose it with our memoized decorator or not as we see fit. Nothing in _isEven is coupled to whether we are memoizing the function or not.

Thus, the first benefit of functions written in “why bird form” is that they can be easily composed with decorators.6
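Memoization is only one decorator. As a sketch of another, here is a hypothetical counted decorator (the name and the counter object are assumptions for this example) composed with the same _isEven function:

```javascript
const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

const _isEven =
  (myself, n) =>
    (n === 0) || !myself(n - 1);

// Hypothetical decorator: counts every call that passes through it.
const counted =
  (fn, counter) =>
    (...args) => {
      counter.calls += 1;
      return fn(...args);
    };

const counter = { calls: 0 };

const isEvenCounted = why(counted(_isEven, counter));

const answer = isEvenCounted(10);
// answer is true, and counter.calls is 11:
// the decorator sees the initial call and all ten recursive calls.
```

Because every recursive call is routed through myself, the decorator observes all of them, not just the first.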


Stack ©2014 alemjusic


trampolining tail-recursive functions

There’s another–albeit rarer–benefit to rewriting functions in why bird form. Consider this extreme use case:

why(
  (myself, n) =>
    (n === 0) || !myself(n - 1)
)(1000042)
  //=> Maximum call stack exceeded

Our function consumes stack space equal to the magnitude of the argument n. Naturally, this is a contrived example, but recursive functions that consume the entire stack do occur from time to time, and it is not always appropriate to rewrite them in iterative form.

One solution to this problem is to rewrite the function in tail-recursive form. If the JavaScript engine supports tail-call optimization, the function will execute in constant stack space:

// Safari Browser, c. 2018

why(
  (myself, n) => {
    if (n === 0)
      return true;
    else if (n === 1)
      return false;
    else return myself(n - 2);
  }
)(1000042)
  //=> true

However, not all engines support tail-call optimization, despite it being part of the JavaScript specification. If we wish to execute such a function in constant stack space, one of our options is to “greenspun” tail-call optimization ourselves by implementing a trampoline:7

A trampoline is a loop that iteratively invokes thunk-returning functions (continuation-passing style). A single trampoline is sufficient to express all control transfers of a program; a program so expressed is trampolined, or in trampolined style; converting a program to trampolined style is trampolining. Trampolined functions can be used to implement tail-recursive function calls in stack-oriented programming languages.–Wikipedia

As we saw in To Grok a Mockingbird, this necessitates having our recursive function become tightly coupled to its execution strategy. In other words, above and beyond being rewritten in tail-recursive form, it must explicitly return thunks rather than call myself:

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const trampoline =
  fn =>
    (...initialArgs) => {
      let value = fn(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

const isEven =
  trampoline(
    function myself (n, parity = 0) {
      if (n === 0) {
        return parity === 0;
      } else {
        return new Thunk(() => myself(n - 1, 1 - parity));
      }
    }
  );

isEven(1000001)
  //=> false

In To Grok a Mockingbird, we solved this problem for functions written “in mockingbird form” with the Jackson’s Widowbird function. We created a function with the same contract as the mockingbird, but its implementation used a trampoline to execute recursive functions in constant stack space.

Functions written “in why bird form” are more idiomatically JavaScript than functions written in mockingbird form. If we can create a similar function that has the same contract as the why bird, but uses a trampoline to evaluate the recursive function, we could execute tail-recursive functions in constant stack space.

We will call this function the “Long-tailed Widowbird.” Let’s derive it.


Long-Tailed Widowbird


deriving the long-tailed widowbird from the why bird

Our goal is to create a trampolining function. So let’s start with the basic outline of a trampoline, and call it longtailed:

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value = fn(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

We won’t even bother trying this: we know that fn(...initialArgs) is not going to work without injecting a function for myself. But we do know a function that we can call with ...initialArgs:

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value = why(fn)(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

This works, but never actually creates any thunks. To do that, let’s reduce why(fn):

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn(maker(maker), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

Now we see where the value for myself comes from: it’s maker(maker). Let’s replace that with a function that, given some arguments, returns a new thunk that, when evaluated, returns maker(maker) invoked with those arguments:

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn((...argsmm) => new Thunk(() => maker(maker)(...argsmm)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

longtailed(
  (myself, n) => {
    if (n === 0)
      return true;
    else if (n === 1)
      return false;
    else return myself(n - 2);
  }
)(1000042)
  //=> true

It works! And it executes in constant stack space, as we wanted.8
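The same function should work for any tail-recursive function written in why bird form, including ones that carry an accumulator. As a sketch (sumTo and its total parameter are our own example, not from the essay), here is a sum of the integers from 1 to n:

```javascript
class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn((...argsmm) => new Thunk(() => maker(maker)(...argsmm)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

// Tail-recursive: the call to `myself` is the last thing the function does.
// `total` is an accumulator carrying the partial sum forward.
const sumTo =
  longtailed(
    (myself, n, total = 0) =>
      (n === 0) ? total : myself(n - 1, total + n)
  );

const sum = sumTo(100000);
// sum is 5000050000, computed without growing the stack
```

A hundred thousand nested calls would exhaust the stack in many engines; here each call returns a thunk to the loop instead.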


Ink & Water ©2010 Gagneet Parmar


from long-tailed widowbird to decoupled trampoline

The long-tailed widowbird works, but it is code that only its author could love.

Let’s begin our cleanup by moving Thunk inside our function. This has certain technical advantages if we ever create a recursive program that itself returns thunks. Since it is now a special-purpose class that only ever invokes a single function, we’ll give it a more specific implementation:

const longtailed =
  fn => {
    class Thunk {
      constructor (fn, ...args) {
        this.fn = fn;
        this.args = args;
      }

      evaluate () {
        return this.fn(...this.args);
      }
    }

    return (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn((...argsmm) => new Thunk(maker(maker), ...argsmm), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

Next, let’s extract the creation of a function that delays the invocation of maker(maker):

const longtailed =
  fn => {
    class Thunk {
      constructor (fn, ...args) {
        this.fn = fn;
        this.args = args;
      }

      evaluate () {
        return this.fn(...this.args);
      }
    }

    const thunkify =
      fn =>
        (...args) =>
          new Thunk(fn, ...args);

    return (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn(thunkify(maker(maker)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

And now we have a considerably less ugly long-tailed widowbird. Well, actually, we are ignoring “the elephant in the room,” the name of the function. “Long-tailed Widowbird” is a touching tribute to the genius of Raymond Smullyan, and there is an amusing correlation between its long tail and the business of optimizing tail-recursive functions.

Nevertheless, if we are to work with others, we might want to consider the possibility that they would prefer a less poetic approach:

const decoupledTrampoline =
  fn => {
    class Thunk {
      constructor (fn, ...args) {
        this.fn = fn;
        this.args = args;
      }

      evaluate () {
        return this.fn(...this.args);
      }
    }

    const thunkify =
      fn =>
        (...args) =>
          new Thunk(fn, ...args);

    return (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn(thunkify(maker(maker)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

And there we have our decoupled trampoline in its final form.


Clouds at the end ©2008 Richard Hammond


summary

The mockingbird that we explored in To Grok a Mockingbird is easy to understand, and decouples recursive functions from themselves. This provides us with more ways to compose recursive functions with other functions like decorators. However, it requires us to pass myself along when making recursive calls.

This is decidedly not idiomatic, so in Part I we derived the why bird, an idiomatic JavaScript recursive combinator that enables recursive functions to call themselves without any additional parameters. We then derived a JavaScript implementation of the Y combinator from the why bird, and finished by using a reduced version of the M combinator to produce “compact” implementations of both the why bird and the Y combinator.

In Part II, we revisited the need for optimizing tail-recursive functions such that they operate in constant stack space. We derived the long-tailed widowbird from the why bird, then polished its implementation, naming the finished function the Decoupled Trampoline.

To recapitulate the use case for the decoupled trampoline, in the rare but nevertheless valid case where we wish to refactor a singly recursive function into a trampolined function to ensure that it does not consume the stack, we previously had to:

  1. Refactor the function into tail-recursive form;
  2. Refactor the tail-recursive version to explicitly invoke a trampoline;
  3. Wrap the result in a trampoline function.

With the decoupled trampoline, we can:

  1. Refactor the function into tail-recursive form;
  2. Refactor the function into “why bird form,” then;
  3. Wrap the result in the decoupled trampoline.

Why is this superior? We’re going to refactor into tail-recursive form either way, and we’re going to wrap the function either way, however:

  1. Refactoring into “why bird form” is less intrusive than rewriting the code to explicitly return thunks, and;
  2. The refactored code is decoupled from trampolining, so it is easier to reverse the procedure if need be, or even just to use the function with the why bird.

If we compare and contrast:

const isEven =
  trampoline(
    function myself (n) {
      if (n === 0)
        return true;
      else if (n === 1)
        return false;
      else return new Thunk(() => myself(n - 2));
    }
  );

With:

const isEven =
  decoupledTrampoline(
    (myself, n) => {
      if (n === 0)
        return true;
      else if (n === 1)
        return false;
      else return myself(n - 2);
    }
  );

The latter has clearer separation of concerns and is thus easier to grok at first sight. And thus, we have articulated a practical (albeit infrequently needed) use for the Y Combinator.
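To illustrate the point about reversibility, the sketch below passes one and the same function to both the why bird and the decoupled trampoline. The names isEvenBody, viaWhy, and viaTrampoline are assumptions made for this example:

```javascript
const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

const decoupledTrampoline =
  fn => {
    class Thunk {
      constructor (fn, ...args) {
        this.fn = fn;
        this.args = args;
      }

      evaluate () {
        return this.fn(...this.args);
      }
    }

    const thunkify =
      fn =>
        (...args) =>
          new Thunk(fn, ...args);

    return (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn(thunkify(maker(maker)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

// Written once, in why bird form, with no knowledge of thunks or trampolines.
const isEvenBody =
  (myself, n) => {
    if (n === 0)
      return true;
    else if (n === 1)
      return false;
    else return myself(n - 2);
  };

const viaWhy = why(isEvenBody);                        // ordinary recursion
const viaTrampoline = decoupledTrampoline(isEvenBody); // constant stack space

const small = viaWhy(42);              // true
const large = viaTrampoline(1000042);  // true
const largeOdd = viaTrampoline(1000043); // false
```

Switching execution strategy is a matter of changing which combinator wraps the function; the function itself is untouched.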

That’s all!

(discuss on hacker news or reddit)


The essays in this series on recursive combinators are: To Grok a Mockingbird and Why Y? Deriving the Y Combinator in JavaScript. Enjoy them both!


Notes
  1. Not to mention that there’s a famous technology investment firm and startup incubator that takes its name from the Y Combinator, likely because the Y Combinator acts as a kind of “bootstrap” to allow a function to build upon itself. 

  2. The paradox of instructional explorations is that if we wish to illustrate a mechanism like recursive combinators, choosing trivial functions like exponentiation makes it easier to focus on the thing we’re exploring, the combinators. The tradeoff is that with such simple functions, it will always feel over-complicated to use recursive combinators. Whereas, if we work with functions with real-world implications, the mechanism we’re exploring gets lost in the complexity of the functions it operates upon. 

  3. The mockingbird is more formally known as the M Combinator. Our naming convention is that when discussing formal combinators from combinatory logic, or direct implementations in JavaScript, we will use the formal name. But when using variations designed to work more idiomatically in JavaScript (such as versions that work with functions taking more than one argument), we will use Raymond Smullyan’s ornithological nicknames.

    For a formalist, the M Combinator’s direct translation is const M = fn => fn(fn). This is only useful if fn is implemented in “curried” form, e.g. const isEven = myself => n => n === 0 || !myself(n - 1). If we wish to use a function written in idiomatic JavaScript form, such as const isEven = (myself, n) => n === 0 || !myself(n - 1), we use the mockingbird, which is given later as const mockingbird = fn => (...args) => fn(fn, ...args). This is far more practical for programming purposes. 

  4. Well, actually, what we have derived is the applicative form of the Y Combinator, often called the Z Combinator. The difference between this combinator and the version you will find in the Lambda Calculus is driven by the fact that JavaScript is an eagerly evaluated language, and the Lambda Calculus is lazily evaluated. 

  5. As alluded to, there is an enormous significance to the Y combinator beyond writing recursive JavaScript functions that are decoupled from themselves. Deriving the Y combinator is interesting in its own right, and highlighting the relationship between the M combinator and the Y combinator is something that is rarely mentioned in casual blogs.

    If the subject piques your interest, be sure to look into point-free programming, fixed point functions, recursion theory, … and most especially, read Raymond Smullyan’s To Mock a Mockingbird.

  6. The use of the why bird or Y Combinator to implement recursive memoization has been discussed a number of times before, including here and here and here, and most especially here

  7. A more complete exploration of ways to convert recursive functions to non-recursive functions can be found in Recursion? We don’t need no stinking recursion!, and its follow-up, A Trick of the Tail

  8. Well, actually: If we use the original trampoline implementation, we explicitly create the thunks that cause the trampolining, so if we return a thunk, we explicitly expect this call to be in tail position. With this function, it’s all done behind the scenes by a function that has nearly the same contract as the why bird.

    However, the long-tailed widowbird is not exactly the same as the why bird. The why bird works with any function (even a non-recursive function). If it happens to be tail-recursive, the why bird works just fine, although it leaves optimization up to the JavaScript engine.

    The long-tailed widowbird, on the other hand, does not work with every function: it works with functions that are not recursive, and with functions that are tail-recursive. If we pass it a function that is recursive but not tail-recursive, it will not work. We have decoupled the function from its implementation, but not the implementation from the function.
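To make that last note concrete, here is a sketch of what “will not work” can look like in practice. The factorial functions are our own example. The first is recursive but not tail-recursive, so it tries to multiply by a thunk that has not been evaluated yet:

```javascript
class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const longtailed =
  fn =>
    (...initialArgs) => {
      let value =
        (x => x(x))(
          maker =>
            (...args) =>
              fn((...argsmm) => new Thunk(() => maker(maker)(...argsmm)), ...args)
        )(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

// Not tail-recursive: the multiplication happens after `myself` returns,
// but `myself` hands back a Thunk rather than a number.
const brokenFactorial =
  longtailed(
    (myself, n) =>
      (n === 0) ? 1 : n * myself(n - 1)
  );

const broken = brokenFactorial(3);
// broken is NaN, because a number was multiplied by a Thunk object

// Tail-recursive rewrite: the running product travels in an accumulator.
const factorial =
  longtailed(
    (myself, n, product = 1) =>
      (n === 0) ? product : myself(n - 1, product * n)
  );

const fixed = factorial(5);
// fixed is 120
```

The failure is silent rather than loud, which is a good reason to be deliberate about which functions we hand to a trampolining combinator.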

https://raganwald.com/2018/09/10/why-y
To Grok a Mockingbird

Using recursive combinators to enhance functional composition, with special guests the Mockingbird, Widowbird, and Why Bird


In this essay we’re going to look at recursive combinators. A recursive combinator is a function that takes another function that is not recursive, and returns a function that is recursive. Recursive combinators make it possible to create recursive functions that are not tightly coupled to themselves.

Recursive combinators have important theoretical implications, but for the working programmer they decouple recursive functions from the mechanism that implements recursion. This makes it easier to compose recursive functions with decorators and to implement recursion strategies like trampolining.

We’ll begin our exploration with a look at the mockingbird, also called the M Combinator.1 We’ll then move on to examine the widowbird, a combinator that executes tail-recursive functions in constant space. We’ll finish with a brief look at the famous Why Bird, or Y Combinator.


Eye in the Sky ©2011 Ian Sane


recursion and binding

As the number of people discussing recursion in an online forum increases, the probability that someone will quote the definition for recursion as Recursion: see ‘recursion’, approaches one.

This is a function that computes exponentiation. If we want to compute something like 2^8 (two to the power of eight), we can compute it like this: 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2, which requires O(n) operations. Our function exploits basic arithmetic and recursion to obtain the same result in O(log₂ n) operations:2

function exponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
}

exponent(2, 7)
  //=> 128

Question: How does our exponent function actually perform recursion? The immediate answer is, “It calls itself when the work to be performed is not the base case” (the base case for this function is an exponent of 0).

How does it call itself? Well, when we have a function declaration (like above), or a named function expression, the function is bound to its own name within the body of the function automatically.

So within the body of the exponent function, the function itself is bound to the name exponent, and that’s what it calls. This is obvious to most programmers, and it’s how we nearly always implement recursion.

But it’s not always exactly what we want. If we want even more performance, we might consider memoizing the function.

Here’s a memoization decorator, snarfed from Time, Space, and Life As We Know It:

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

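  return function (...args) {
    const key = keymaker.call(this, args);

    // test for the key itself, so that falsy results such as 0 or false are cached too
    if (!(key in lookupTable)) {
      lookupTable[key] = fn.apply(this, args);
    }

    return lookupTable[key];
  }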
};

We can make a memoized version of our exponent function:

const mExponent = memoized(exponent);

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 8)
  //=> 256, returns the memoized result without further multiplications

There is a hitch with this solution: Although we are invoking mExponent, internally exponent is invoking itself directly, without memoization. Consider what happens when we write:

const mExponent = memoized(exponent);

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs four multiplications

When we invoke exponent(2, 8), we also end up invoking exponent(4, 4), exponent(16, 2), and exponent(256, 1). We want those memoized. That way, when we invoke exponent(2, 9), and it invokes exponent(4, 4), the result is already memoized and no further computation is needed.

Our problem here is that exponent is “hard-wired” to call exponent, not mExponent. So it never invokes the memoized version of the function.
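To see the hitch concretely, here is a minimal sketch that counts how often the body of exponent actually runs. The calls counter and the bodyRuns helper are hypothetical instrumentation added for illustration; they are not part of the original code:

```javascript
// Hypothetical instrumentation: `calls` counts how often the body of exponent runs.
let calls = 0;

function exponent (x, n) {
  ++calls;

  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
}

// The memoization decorator from above, repeated so the sketch is self-contained.
const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

  return function (...args) {
    const key = keymaker.call(this, args);

    return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
  };
};

const mExponent = memoized(exponent);

// Hypothetical helper: run a thunk and report how many times exponent's body ran.
const bodyRuns = thunk => {
  calls = 0;
  thunk();
  return calls;
};

const firstRun = bodyRuns(() => mExponent(2, 8));  // nothing is cached yet
const secondRun = bodyRuns(() => mExponent(2, 9)); // inner calls bypass the cache
const repeatRun = bodyRuns(() => mExponent(2, 8)); // only the outermost call was cached

console.log(firstRun, secondRun, repeatRun);
  //=> 5 5 0
```

Only the outermost invocation is remembered. The second computation starts from scratch, even though exponent(4, 4) was already computed during the first.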

We can work around that like this:

const mExponent = memoized((x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * mExponent(x * x, Math.floor(n / 2));
  } else {
    return mExponent(x * x, n / 2);
  }
});

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

In many cases this is fine. But conceptually, writing it this way means that our exponent function needs to know whether it is memoized or not. This runs counter to our “Allongé” style of writing things that can be composed without them needing to know anything about each other.

For example, if we wanted a non-memoized exponentiation function, we’d have to duplicate all of the code, with a minor variation:

const exponent = (x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
};

That is not composing things, at all. What we want is to have one exponentiation function, and find a way to use it with or without decoration (such as with or without memoization). And we can do this.


Penrose tiling Oxford ©2014 Kelbv


composeable recursive functions

The sticking point is that to have full memoization, our exponentiation function needs to have a hard-coded reference to the memoized version of itself, which means it can’t be used without memoization. This is a specific case of a more general problem where things that have hard-coded references to each other become tightly coupled, and are thus difficult to compose in different ways. Only in this case, we’ve made the thing tightly coupled to itself!

So let’s attack the hard-coded reference problem, decoupling our recursive function from itself. Since it doesn’t have to be a named function, we can make it a “fat arrow” expression. If we want a function to have a reference to another function in JavaScript, we can pass it in as a parameter. So the ‘signature’ for our new function expression will look like this:

(myself, x, n) => // ...

In this case, our function assumes that myself is going to be bound to the function itself. Now what about the body of the function? We can change exponent to myself:

(myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(x * x, Math.floor(n / 2));
  } else {
    return myself(x * x, n / 2);
  }
};

One little hitch: Our function signature is (myself, x, n), but when we invoke myself, we’re only passing in x and n. So we can pass myself in as well:

(myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

Now this seems very contrived, and it doesn’t even work yet. How can we make it work?


Galápagos Mockingbird ©2012 Ben Tavener


the mockingbird

Behold, the JavaScript mockingbird:

const mockingbird = fn => (...args) => fn(fn, ...args);

The mockingbird is a function that takes another function, and returns a function. That function takes a bunch of arguments, and invokes the original function with itself and the arguments.3

So now we can write:

mockingbird((myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
})(2, 8)
  //=> 256

That is all very well and good, but we’ve added some extra bookkeeping. Do we have any wins? Let’s try composing it with the memoization function. Although we didn’t use it above, our memoized function does allow us to customize the function used to create a key. Here’s a key making function that deliberately ignores the first argument:

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

And now we can create a memoized version of our anonymous function. First, here it is step-by-step:

const _exponent = (myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

mockingbird(memoized(_exponent, ignoreFirst))(2, 8)
  //=> 256

But now for the big question: Does it memoize everything? Let’s test it:

const mExponent = mockingbird(memoized(_exponent, ignoreFirst));

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

Yes it does properly memoize everything. And best of all, our function need have absolutely NO reference to the name of our memoized function. It doesn’t know whether it’s memoized or not.4
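One way to check that claim is to count how often the body of _exponent runs. This is a sketch under assumptions: the calls counter and the bodyRuns helper are hypothetical instrumentation, and memoized is written here with a key-presence check so that the block is self-contained:

```javascript
const mockingbird = fn => (...args) => fn(fn, ...args);

// A self-contained variant of the memoization decorator (checks for key presence).
const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = Object.create(null);

  return function (...args) {
    const key = keymaker.call(this, args);

    if (!(key in lookupTable)) {
      lookupTable[key] = fn.apply(this, args);
    }

    return lookupTable[key];
  };
};

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

// Hypothetical instrumentation: `calls` counts how often the body of _exponent runs.
let calls = 0;

const _exponent = (myself, x, n) => {
  ++calls;

  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

const mExponent = mockingbird(memoized(_exponent, ignoreFirst));

// Hypothetical helper: run a thunk and report how many times _exponent's body ran.
const bodyRuns = thunk => {
  calls = 0;
  thunk();
  return calls;
};

const firstRun = bodyRuns(() => mExponent(2, 8));  // fills the cache at every level
const secondRun = bodyRuns(() => mExponent(2, 9)); // finds (4, 4) already cached

console.log(firstRun, secondRun);
  //=> 5 1
```

Because every recursive call goes through myself, and myself is the memoized function, the intermediate results land in the lookup table as well.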

Because we’ve separated the function from the mockingbird that implements recursion, we can compose our exponentiation function with memoization or not as we see fit. When the exponentiation function was responsible for directly calling itself, if we wanted one version memoized and one not, we’d have to write two nearly identical versions of the same code.

But with the mockingbird separating how a function calls itself from the function, we can now write:

const _exponent = (myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

const mExponent = mockingbird(memoized(_exponent, ignoreFirst));
const exponent = mockingbird(_exponent);

We have our composeability and reuse!


Chrysler Imperial ©2017 pyntofmyld


tail recursion

Here’s a function that determines whether a whole number is even (true) or odd (false).

It is highly pessimum, and its use of recursion is completely gratuitous. But we’ll experiment with it, as it provides a good demonstration of the perils of deeply recursive functions:

const isEven =
  n => {
    if (n === 0) {
      return true;
    } else {
      return !isEven(n - 1);
    }
  };

isEven(47)
  //=> false

For any number n, this function makes n recursive calls. So, what happens if we write:

isEven(1000000)

We know the answer is true, but will it actually return at all? No. For technical reasons, JavaScript engines place a hard limit on the maximum depth of the call stack. Meaning, if too many function calls are nested–including recursive calls–we get Maximum call stack size exceeded.

One fix for this is to rewrite the function in tail-recursive form, and then allow the engine to automatically execute the function without consuming excess stack space. Here it is again, with its recursive call “in tail position:”5

const isEven =
  (n, parity = 0) => {
    if (n === 0) {
      return parity === 0;
    } else {
      return isEven(n - 1, 1 - parity);
    }
  };

isEven(42)
  //=> true

This code works just fine on the Safari browser, which in addition to being far more thrifty with battery life on OS X and iOS devices, implements Tail Call Optimization, as specified in the JavaScript standard. Alas, most other implementations refuse to implement TCO.

There’s a workaround for engines that don’t support TCO: As discussed in Trampolines in JavaScript, we can get around the call stack problem ourselves with a technique called trampolining:

A trampoline is a loop that iteratively invokes thunk-returning functions (continuation-passing style). A single trampoline is sufficient to express all control transfers of a program; a program so expressed is trampolined, or in trampolined style; converting a program to trampolined style is trampolining. Trampolined functions can be used to implement tail-recursive function calls in stack-oriented programming languages.–Wikipedia

Like our mockingbird, the trampoline pattern separates the code into a function that defines the work to be done, and a trampoline function that calls the recursive function. The trampoline function checks the function’s return value. If it’s a thunk, the trampoline evaluates the thunk, usually invoking the function again. Since the recursive function always returns before evaluating the next call, the stack does not grow.

Here’s the simplest trampoline good enough to criticize. Because it represents thunks with a class rather than with plain functions, it also works for functions that are supposed to return functions, which approaches that use bare functions as thunks cannot handle.

We include a version of isEven designed to work with it:

class Thunk {
  constructor (delayed) {
    this.delayed = delayed;
  }

  evaluate () {
    return this.delayed();
  }
}

const trampoline =
  fn =>
    (...initialArgs) => {
      let value = fn(...initialArgs);

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };

const isEven =
  trampoline(
    function myself (n, parity = 0) {
      if (n === 0) {
        return parity === 0;
      } else {
        return new Thunk(() => myself(n - 1, 1 - parity));
      }
    }
  );

isEven(1000001)
  //=> false

This works, but suffers from the recursive sins that our mockingbird fixed. The isEven function is coupled to itself, and it is coupled to the implementation of trampolining. Why should it know what a Thunk is?


Mockingbird nabs a berry ©2011 Kim Taylor


a new kind of passerine science

A mockingbird cannot directly solve our problem, but we can learn from the mockingbird to write a new kind of trampolining function based on the mockingbird. We start by rewriting our isEven function in both tail-recursive and decoupled form.

Note that this works just fine with our mockingbird:

const _isEven =
  (myself, n, parity = 0) => {
    if (n === 0) {
      return parity === 0;
    } else {
      return myself(myself, n - 1, 1 - parity);
    }
  };

mockingbird(_isEven)(1001)
  //=> false

Now we’ve decoupled the form of the function from the mechanism of recursion. So, let’s swap the mechanism of recursion for a trampoline without altering the recursive function to suit the new implementation.

We’ll call our new “combinator” a Jackson’s Widowbird:6

const widowbird =
  fn => {
    class Thunk {
      constructor (args) {
        this.args = args;
      }

      evaluate () {
        return fn(...this.args);
      }
    }

    return (...initialArgs) => {
      let value = fn(
        (...args) => new Thunk(args),
        ...initialArgs
      );

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

widowbird(_isEven)(1001)
  //=> false

Since we’re passing the function to be called recursively into our recursive function, we can place the thunk mechanism in our widowbird, instead of in the recursive function. Thus, the recursive function is completely decoupled from the mechanism for recursing without consuming the stack.

And what about our naive isEven that broke the stack earlier?

widowbird(_isEven)(1000000)
  //=> true

It works just fine, even on engines that don’t support tail call optimization. The widowbird has shown us another benefit of separating the recursive computation to be done from the mechanism for performing the recursion.7
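The same swap should work for any function written in this decoupled, tail-recursive form. As a sketch, here is a hypothetical _sumTo function (not from the essay) run first with the mockingbird and then with the widowbird:

```javascript
const mockingbird = fn => (...args) => fn(fn, ...args);

// The widowbird from above, repeated so the sketch is self-contained.
const widowbird =
  fn => {
    class Thunk {
      constructor (args) {
        this.args = args;
      }

      evaluate () {
        return fn(...this.args);
      }
    }

    return (...initialArgs) => {
      let value = fn(
        (...args) => new Thunk(args),
        ...initialArgs
      );

      while (value instanceof Thunk) {
        value = value.evaluate();
      }

      return value;
    };
  };

// Hypothetical example: sums the whole numbers from 1 to n, accumulating in `total`.
// The recursive call is in tail position, and myself is passed along as usual.
const _sumTo =
  (myself, n, total = 0) => {
    if (n === 0) {
      return total;
    } else {
      return myself(myself, n - 1, total + n);
    }
  };

const small = mockingbird(_sumTo)(100);
const large = widowbird(_sumTo)(100000);

console.log(small, large);
  //=> 5050 5000050000
```

The function itself is unchanged between the two calls. Only the combinator that supplies myself differs, and with the widowbird each "recursive call" just returns a thunk for the loop to evaluate.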


y? ©2012 Newtown grafitti


the why bird

The mockingbird has the advantage of being the very simplest recursive combinator. But it can be enhanced. One of the annoying things about it is that when we write our functions to use with a mockingbird, not only do we need a myself parameter, but we need to remember to pass it on as well.

This isn’t a bad tradeoff, but logicians searched for a combinator that could implement recursion with a parameter, like the mockingbird, but avoid having to pass that parameter on. This had important theoretical consequences, but for us, the value of such a combinator is that the functions we write are more natural.

The combinator that decouples recursion using a parameter, but doesn’t require passing that parameter along, is called the Y Combinator.

A compact JavaScript implementation looks like this:

const Y =
  fn =>
    (x => x(x))(m => a => fn(m(m))(a));

Without getting into exactly how it works, we can see that the disadvantage of the Y combinator is that it assumes that all functions are curried to take only one argument.8

Here’s an idiomatic JavaScript version, called the Why Bird. It handles functions with more than one argument:

const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

Armed with our why bird, we can write recursive functions that look a little more idiomatic. This implementation of map is gratuitously recursive, but demonstrates that using the why bird, we need not pass myself along when map calls itself recursively:

const _map =
  (myself, fn, input) => {
    if (input.length === 0) {
      return [];
    } else {
      const [first, ...rest] = input;

      return [fn(first)].concat(myself(fn, rest));
    }
  };

why(_map)(x => x * x, [1, 2, 3])
  //=> [1, 4, 9]

No more myself(myself, ...)!

The why bird makes the code we write much simpler. And like the mockingbird, it allows us to separate the mechanism for recursion from the function we wish to make recursive.
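To make the contrast between the two concrete, here is a small sketch. The factorial and exponent definitions are illustrative examples rather than code from the essay: the first is curried to suit Y, the second takes several arguments and uses the why bird.

```javascript
const Y =
  fn =>
    (x => x(x))(m => a => fn(m(m))(a));

const why =
  fn =>
    (x => x(x))(
      maker =>
        (...args) =>
          fn(maker(maker), ...args)
    );

// With Y, the function must be curried: it takes myself and
// returns a function of exactly one argument.
const factorial = Y(myself => n => (n === 0 ? 1 : n * myself(n - 1)));

// With the why bird, myself is an ordinary first parameter, and
// recursive calls pass only the "real" arguments.
const exponent = why((myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(x * x, Math.floor(n / 2));
  } else {
    return myself(x * x, n / 2);
  }
});

console.log(factorial(5), exponent(2, 8));
  //=> 120 256
```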

(We look at how to derive the Why bird and Y combinator from the Mockingbird and M combinator in the literally named Why Y? Deriving the Y Combinator in JavaScript)


The Summary Key ©2017 Mike Lawrence


summary

In summary, the mockingbird is a recursive combinator: It takes a function that is not directly recursive, and makes it recursive by passing the subject function to itself as a parameter. This has the effect of removing a hard-coded dependency between the subject function and itself, which allows us to decorate it with functionality like memoization.

We’ve also seen that having performed this separation, we can swap the mockingbird out for other functions implementing recursion, such as the widowbird. We’ve seen that the widowbird is superior to other approaches, because it does not require the function being trampolined to “know” that it is being trampolined.

And finally, we saw the why bird, or Y Combinator. We saw that it makes our functions a little more idiomatic, and once again delivers the value of separating function from recursion mechanism.

Recursive combinators like mockingbirds, widowbirds, and why birds are a few more tools in our “composeable functions” toolbox, increasing reuse by decoupling recursive functions from themselves.

(discuss on reddit here, or here, or on hacker news)


The essays in this series on recursive combinators are: To Grok a Mockingbird and Why Y? Deriving the Y Combinator in JavaScript. Enjoy them both!


Notes
  1. The mockingbird or “M combinator” is also sometimes called ω, or “little omega”. The full explanation for ω, as well as its relation to Ω (“big omega”), can be found on David C Keenan’s delightful To Dissect a Mockingbird page.

    In Combinatory Logic, the fundamental combinators are named after birds, following the example of Raymond Smullyan’s famous book To Mock a Mockingbird. Needless to say, the title of the book and its central character is the inspiration for this essay! 

  2. This basic pattern was originally discussed in an essay about a different recursive function, writing a matrix multiplication implementation of fibonacci

  3. In proper combinatorial logic, the mockingbird is actually defined as M x = x x. However, this presumes that all combinators are “curried” and only take one argument. Our mockingbird is more “idiomatically JavaScript.”

    But it’s certainly possible to use const M = fn => fn(fn);, we would just need to also rewrite our exponentiation function to have a signature of myself => x => n => ..., and so forth. That typically clutters JavaScript up, so we’re using const mockingbird = fn => (...args) => fn(fn, ...args);, which amounts to the same thing. 

  4. In JavaScript, like almost all programming languages, we can bind values to names with parameters, or with variable declarations, or with named functions. So having something like the M Combinator is optional, as we can choose to have a function refer to itself via a function name or variable binding. However, in Combinatory Logic and the Lambda calculus, there are no variable declarations or named functions.

    Therefore, recursive combinators are necessary, as they are the only way to implement recursion. And since they don’t have iteration either, recursion is the only way to do a lot of things we take for granted in JavaScript, like mapping lists. So recursive combinators are deeply important to the underlying building blocks of computer science. 

  5. See A Trick of the Tail for a fuller explanation of how to perform this refactoring. 

  6. The Jackson’s Widowbird, Euplectes jacksoni, is a passerine bird in the family Ploceidae. As notably portrayed in BBC Planet Earth II, when attempting to attract females to nest in their territory, the males repeatedly jump to show off their fitness. If we exercise our vivid imaginations, we can think of this as resembling the behaviour of a trampolining tail-recursive function. Instead of “drilling deeper and deeper,” it repeatedly bounces back up to the top. 

  7. Although the widowbird works just fine, it should be noted that it does not work in conjunction with memoization. This is unsurprising, as memoization relies on functions returning values, and trampolining hacks functions to return thunks. So memoization will memoize thunks rather than values.

    All things considered, that may be acceptable, as the widowbird is designed to simulate an optimization that hacks a tail-recursive function to behave as if it was iterative. 

  8. There are lots of essays deriving the Y Combinator step-by-step. Here’s one in JavaScript, and here’s another
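
As a companion to note 3, here is a minimal sketch of what the curried form might look like with the strict M combinator. The name curriedExponent is hypothetical, chosen only for this illustration:

```javascript
// The mockingbird in its strict combinatory form: M x = x x
const M = fn => fn(fn);

// The recursive function is fully curried: myself => x => n => ...
// Each recursive call must first apply myself to itself.
const curriedExponent = M(myself => x => n => {
  if (n === 0) {
    return 1;
  } else if (n % 2 === 1) {
    return x * myself(myself)(x * x)(Math.floor(n / 2));
  } else {
    return myself(myself)(x * x)(n / 2);
  }
});

console.log(curriedExponent(2)(8));
  //=> 256
```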

https://raganwald.com/2018/08/30/to-grok-a-mockingbird
The Eight Queens Problem... and Raganwald's Unexpected Nostalgia

A few weeks ago, I ordered a copy of the Sesquicentennial Edition of The Annotated Alice.

As is their wont, Amazon’s collaborative filters showed me other books that might be of interest to me, and I spotted a copy of Knots and Borromean Rings, Rep-Tiles, and Eight Queens: Martin Gardner’s Unexpected Hanging.

I nearly gasped out loud, savouring the memory of one of the earliest computer programs that I ever wrote from scratch, a program that searched for solutions to the eight queens puzzle.


Knots and Borromean Rings, Rep-Tiles, and Eight Queens: Martin Gardner's Unexpected Hanging
Prelude: 1972 – 1977

This prelude is long on nostalgia and short on programming. If that does not interest you, feel free to skip straight to the description of the eight queens puzzle.

In the nineteen-seventies, I spent a lot of time in Toronto’s libraries. My favourite hangouts were the Sanderson Branch (which was near my home in Little Italy), and the “Spaced Out Library,” a non-circulating collection of science fiction and fantasy donated by Judith Merril, that was housed within the St. George and College Street branch.1

I especially enjoyed reading back issues of Scientific American, and like many, I was captivated by Martin Gardner’s “Mathematical Games” columns. My mother had sent me to a day camp for gifted kids once, and it was organized like a university. The “students” self-selected electives, and I picked one called “Whodunnit.” It turned out to be a half-day exercise in puzzles and games, and I was hooked.

Where else would I learn about playing tic-tac-toe in a hypercube? Or about liars and truth-tellers? Or, as it happened, about Martin Gardner? I suspect the entire material was lifted from his collections of columns, and that suited me down to the ground.


Scientific American

One day we had a field trip to the University of Toronto’s High-Speed Job Stream, located in the Sandford Fleming Building.2 This was a big room that had a line printer on one side of it, a punch card reader on the other, and lots and lots of stations for punching your own cards.

To run a job, you typed out your program, one line per card, and then stuck a header on the front that told the batch job what kind of interpreter or compiler to use. Those cards were brightly coloured, and had words like WATFIV or SNOBOL printed on them in huge letters.

You put header plus program into the hopper at the back, waited, and when it emerged from the reader, collected your punch cards and headed over to the large and noisy line printer. When the IBM 360 got around to actually running your job, it would print the results for you, and you would head over to a table to review the output and–nearly all of the time for me–find the typo or bug, update your program, and start all over again.


IBM Keypunch Machine

You can see equipment like this in any computer museum, so I won’t go into much more detail. Besides, the mechanics of running programs as batch jobs was not the interesting thing about the High Speed Job Stream. The interesting thing about the High Speed Job Stream was that there was no restriction on running jobs. You didn’t need an account or a password. Nobody stood at the door asking for proof that you were an undergrad working on an assignment.

So I’d go over there on a summer day and write software, and sometimes, I’d try to write programs to solve puzzles.


Raganwald at S.A.C.


school

In the autumn of 1976, I packed my bags and went to St. Andrew’s College, a boarding school. One of the amazing things about “SAC” was that they had an actual minicomputer on the campus. For the time, this was unprecedented. In Ontario’s public school system, it was possible to take courses in programming, but they nearly all involved writing programs by filling in “bubble cards” with a pencil and submitting jobs overnight.

At SAC, there was a Nova 1220 minicomputer in a room with–oh glorious day–four ancient teletype machines hooked up to it with what I now presume were serial ports. It had various operating modes that were set by loading a 5MB removable hard disk (It was a 12” platter encased in a big protective plastic shell), and rebooting the machine by toggling bootstrap instructions into the front panel.

The mode set up for student use was a four-user BASIC interpreter. It had 16KB of RAM (yes, you read that right), and its simple model partitioned the memory so that each user got 4KB to themselves. You could type your program into the teletype, and its output would print on a roll of paper.

Saving programs on disc was not allowed. The teletypes had paper tape interfaces on the side, so to save a program we would LIST the source with the paper tape on, and it would punch ASCII or EBCDIC codes onto the tape. We’d tear it off and take it with us. Later, to reload a program, we’d feed the tape into the teletype and it would act as if we were typing the program anew.

4KB was enough for assignments like writing a simple bubble sort, but I had discovered David Ahl by this time, and programs like “Super Star Trek” did not fit in 4KB. There was a 16KB single-user disc locked in a cabinet alongside programs for tabulating student results.

In defiance of all regulation, I would go in late, pick the cupboard’s lock, remove the disc I wanted, and boot up single-user mode. I could then work on customizing Super Star Trek or write programs to solve puzzles. Curiously, I never tampered with the student records. I was a morally vacant vessel at that point in my life: I’m not going to tell you that I had a moral code about these things. I think the truth is that I just didn’t care about marks.


Eight Queens Puzzle


The Eight Queens Puzzle

One of the things I worked on at school was writing new games. I made a Maharajah and the Sepoys program that would play the Maharajah while I played the standard chess pieces. It could beat me, which was enough AI for my purposes.

This got me thinking about something I’d read in a Martin Gardner book, the Eight Queens Puzzle. As Wikipedia explains, “The eight queens puzzle is the problem of placing eight chess queens on an 8×8 chessboard so that no two queens threaten each other. Thus, a solution requires that no two queens share the same row, column, or diagonal.”
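That rule translates fairly directly into a test. The following is a minimal sketch, assuming queens are represented as [row, column] pairs (the same representation used by the code later in this essay); isSolution and the two sample placements are hypothetical names introduced for illustration:

```javascript
// Hypothetical helper: true when no two queens share a row, column, or diagonal.
// Each queen is a [row, column] pair with indices from 0 to 7.
const isSolution = queens => {
  for (let a = 0; a < queens.length; ++a) {
    for (let b = a + 1; b < queens.length; ++b) {
      const [i1, j1] = queens[a];
      const [i2, j2] = queens[b];

      const sameRow = i1 === i2;
      const sameColumn = j1 === j2;
      // two squares share a diagonal when they are equally far apart
      // vertically and horizontally
      const sameDiagonal = Math.abs(i1 - i2) === Math.abs(j1 - j2);

      if (sameRow || sameColumn || sameDiagonal) {
        return false;
      }
    }
  }

  return true;
};

// One placement that satisfies the rule...
const solved = [
  [0, 0], [1, 4], [2, 7], [3, 5], [4, 2], [5, 6], [6, 1], [7, 3]
];

// ...and one that does not: every queen sits on the same diagonal.
const allOnOneDiagonal = [
  [0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 7]
];

console.log(isSolution(solved), isSolution(allOnOneDiagonal));
  //=> true false
```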

By this time I knew a little about writing “generate and test” algorithms, as well as a little about depth-first search from writing games (like “Maharajah and the Sepoys”) that performed basic minimax searches for moves to make. So I set about writing a BASIC program to search for solutions. I had no formal understanding of computational complexity and running time, but what if I wrote a program and left it running all night?

The “most pessimum” approach looks something like this (BASIC has a for... next construct, but close enough):

for (let i0 = 0; i0 <= 7; ++i0) {
  for (let j0 = 0; j0 <= 7; ++j0) {
    for (let i1 = 0; i1 <= 7; ++i1) {
      for (let j1 = 0; j1 <= 7; ++j1) {

        // ...lots of loops elided...

          for (let i7 = 0; i7 <= 7; ++i7) {
            inner: for (let j7 = 0; j7 <= 7; ++j7) {
              const board = [
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."],
                [".", ".", ".", ".", ".", ".", ".", "."]
              ];

              const queens = [
                [i0, j0],
                [i1, j1],
                [i2, j2],
                [i3, j3],
                [i4, j4],
                [i5, j5],
                [i6, j6],
                [i7, j7]
              ];

              for (const [i, j] of queens) {
                if (board[i][j] != '.') {
                  // square is occupied or threatened
                  continue inner;
                }

                for (let k = 0; k <= 7; ++k) {
                  // fill row and column
                  board[i][k] = board[k][j] = "x";

                  const vOffset = k - i;
                  const hDiagonal1 = j - vOffset;
                  const hDiagonal2 = j + vOffset;

                  // fill diagonals
                  if (hDiagonal1 >= 0 && hDiagonal1 <= 7) {
                    board[k][hDiagonal1] = "x";
                  }

                  if (hDiagonal2 >= 0 && hDiagonal2 <= 7) {
                    board[k][hDiagonal2] = "x";
                  }
                }
              }

              console.log(diagramOf(queens));
            }
          }

        // ...lots of loops elided...

      }
    }
  }
}

function diagramOf (queens) {
  const board = [
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."]
  ];

  for (const [i, j] of queens) {
    board[i][j] = "Q";
  }

  return board.map(row => row.join('')).join("\n");
}

I believe I tried that, left the program running overnight, and when I came in the next morning before school it was still running. It was searching 8^16 (or more accurately, 64^8) candidates for a solution: that’s 281,474,976,710,656 loops. Given the speed of that minicomputer, I suspect the program would still be running today.

This will seem very obvious today, but one broad classification of algorithms for solving a problem like this is that of searching for solutions. It’s not the only one, but it’s the one I tried back then, and the one we’re going to focus on today. When you have a search problem, there are two ways to solve it more quickly: Search faster, and search a smaller problem space.

Although I didn’t use such words, I grasped that my first priority was searching a smaller space. So I thought about it for a bit. Then I had an insight of sorts: If I could think of the board as a one-dimensional ordered list of squares, I could reason as follows. If I pick a square for the leftmost queen, every other queen would have to come to the right of that queen.

By induction that would follow for the third and every subsequent queen. That is different than the worst-case brute force algorithm: After it picks a square for the first queen, each of the other queens can be in any position before or after it. But if we’re iterating through all of the possible positions for the first queen, it follows that we will already have iterated over any position with a queen before the first.

So this approach would eliminate a lot of duplicate positions to consider.
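To illustrate the idea on a much smaller scale, here is a sketch that places two pieces on a four-square list, requiring each piece to come after the previous one. The combinations generator is a hypothetical illustration, not the BASIC code of the time:

```javascript
// Hypothetical generator: yields every way of choosing `count` squares from a
// list of `size` squares, always in increasing order, so that the same set of
// squares is never produced twice in a different order.
function* combinations (size, count, start = 0, chosen = []) {
  if (chosen.length === count) {
    yield chosen;
    return;
  }

  // each subsequent choice must come after the previous one
  for (let square = start; square < size; ++square) {
    yield* combinations(size, count, square + 1, [...chosen, square]);
  }
}

const pairs = [...combinations(4, 2)];

console.log(JSON.stringify(pairs));
  //=> [[0,1],[0,2],[0,3],[1,2],[1,3],[2,3]]
```

There are six such pairs, against the sixteen candidates that two independent loops over four squares would generate.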

Although I didn’t have the education to articulate the idea properly, I was reasoning that what I wanted to search was the space of combinations: the ways of choosing 8 squares from 64 possibilities. That reduces the search space from 64^8 down to 4,426,165,368 candidate positions. That’s roughly 63,593 times smaller, a big deal.
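The arithmetic can be checked with a few lines of JavaScript. This is just a sketch: choose is a hypothetical helper, and BigInt is used so the intermediate products stay exact:

```javascript
// Every queen on any of 64 squares, independently: 64^8 candidates.
const bruteForce = 64n ** 8n;

// Hypothetical helper: the number of ways to choose k items from n,
// built up one factor at a time so every intermediate value is a whole number.
const choose = (n, k) => {
  let result = 1n;

  for (let i = 1n; i <= k; ++i) {
    result = result * (n - k + i) / i;
  }

  return result;
};

const combos = choose(64n, 8n);
const ratio = Number(bruteForce) / Number(combos);

console.log(bruteForce, combos, Math.round(ratio));
  //=> 281474976710656n 4426165368n 63593
```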

Before we get into the code that implements the “combinations” approach, I’ll share what happened when I tested my conjecture using my clumsy code of the time…


Boiler explosion throws one steam locomotive onto another


digression: disaster strikes

As above, I had chosen not to halt the program when it found a solution. Perhaps I wanted to print all of the solutions. As it happened, my test code had a bug, but it didn’t manifest itself until the program was deeper into its search, and my “optimization” took it to the failure case more quickly.

But I didn’t know this, so I left the updated program running overnight, and once again returned before breakfast to see if it had found any solutions. When I entered the room, there was a horrible smell and a deafening clacking sound. The test function had failed at some point, and it was passing thousands of positions in rapid order.

The paper roll on the teletype had jammed at some point in the night and was no longer advancing, but the teletype had hammered through the paper and was hammering on the roller behind. Rolls of paper had emerged from the machine and lay in a heap around it. I consider it a very lucky escape that a spark hadn’t ignited the paper or its dust that hung in the air.

I shut everything down, cleaned up as best I could, and then set about finding the bug. Although I never did cause another “physical crash,” it took me days (or possibly weeks, I don’t quite remember, and I did have other things going on at the time) before I had improved my program to the point where it found solutions.


The Royal Ontario Museum, ©2009 Steve Harris


refactoring before rewriting

Now back to writing out an improved algorithm based on combinations rather than the most pessimum approach.

One of our go-to techniques for modifying programs is to begin by making sure that the thing we wish to change is refactored into its own responsibility, so that we can make a change to just one thing. The code from above has the generating loops and testing code thrown together all higgledy-piggledy. That makes it awkward to change the generation or the testing independently.

We might begin by refactoring the code into a generator and consumer pattern. The generator lazily enumerates the search space, and the consumer filters it to select solutions:[3]

function * mostPessimumGenerator () {
  for (let i0 = 0; i0 <= 7; ++i0) {
    for (let j0 = 0; j0 <= 7; ++j0) {
      for (let i1 = 0; i1 <= 7; ++i1) {
        for (let j1 = 0; j1 <= 7; ++j1) {

          // ...lots of loops elided...

            for (let i7 = 0; i7 <= 7; ++i7) {
              for (let j7 = 0; j7 <= 7; ++j7) {
                const queens = [
                  [i0, j0],
                  [i1, j1],
                  [i2, j2],
                  [i3, j3],
                  [i4, j4],
                  [i5, j5],
                  [i6, j6],
                  [i7, j7]
                ];

                yield queens;
              }
            }

          // ...lots of loops elided...

        }
      }
    }
  }
}

function test (queens) {
  const board = [
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."],
    [".", ".", ".", ".", ".", ".", ".", "."]
  ];

  for (const [i, j] of queens) {
    if (board[i][j] != '.') {
      // square is occupied or threatened
      return false;
    }

    for (let k = 0; k <= 7; ++k) {
      // fill row and column
      board[i][k] = board[k][j] = "x";

      const vOffset = k-i;
      const hDiagonal1 = j - vOffset;
      const hDiagonal2 = j + vOffset;

      // fill diagonals
      if (hDiagonal1 >= 0 && hDiagonal1 <= 7) {
        board[k][hDiagonal1] = "x";
      }

      if (hDiagonal2 >= 0 && hDiagonal2 <= 7) {
        board[k][hDiagonal2] = "x";
      }

      board[i][j] = "Q";
    }
  }

  return true;
}

function * filterWith (predicateFunction, iterable) {
  for (const element of iterable) {
    if (predicateFunction(element)) {
      yield element;
    }
  }
}

function first (iterable) {
  const [value] = iterable;

  return value;
}

const solutionsToEightQueens = filterWith(test, mostPessimumGenerator());

diagramOf(first(solutionsToEightQueens))
  //=> ...go to bed and catch some 💤...

With this in hand, we can make a faster “combinations” generator, and we won’t have to work around any of the other code.
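To see that the generator-and-consumer split really is lazy, here is a small sketch using an infinite generator. The helpers naturals and counted are made up for this illustration; filterWith and first are repeated from above so the snippet stands alone. Because nothing is computed until it is asked for, first pulls only as many candidates as it needs:

```javascript
// hypothetical helper: an endless stream of 0, 1, 2, ...
function * naturals () {
  let n = 0;
  while (true) {
    yield n++;
  }
}

function * filterWith (predicateFunction, iterable) {
  for (const element of iterable) {
    if (predicateFunction(element)) {
      yield element;
    }
  }
}

function first (iterable) {
  const [value] = iterable;

  return value;
}

// hypothetical helper: counts how many elements were actually generated
let generated = 0;
function * counted (iterable) {
  for (const element of iterable) {
    ++generated;
    yield element;
  }
}

const firstBigSquare = first(
  filterWith(n => n * n > 50, counted(naturals()))
);

console.log(firstBigSquare); // 8
console.log(generated);      // 9, the numbers 0 through 8
```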


Choose your colour ©2014 jaros


the “combinations” algorithm

An easy way to implement choosing combinations of squares is to work with numbers from 0 to 63 instead of pairs of indices. Here’s a generator that does the exact thing we want:

function * mapWith (mapFunction, iterable) {
  for (const element of iterable) {
    yield mapFunction(element);
  }
}

function * choose (n, k, offset = 0) {
  if (k === 1) {
    for (let i = 0; i <= (n - k); ++i) {
      yield [i + offset];
    }
  } else if (k > 1) {
    for (let i = 0; i <= (n - k); ++i) {
      const remaining = n - i - 1;
      const otherChoices = choose(remaining, k - 1, i + offset + 1);

      yield * mapWith(x => [i + offset].concat(x), otherChoices);
    }
  }
}

choose(5, 3)
  //=>
    [0, 1, 2]
    [0, 1, 3]
    [0, 1, 4]
    [0, 2, 3]
    [0, 2, 4]
    [0, 3, 4]
    [1, 2, 3]
    [1, 2, 4]
    [1, 3, 4]
    [2, 3, 4]

We can now write choose(64, 8) to get all the ways to choose eight squares, and [Math.floor(n/8), n % 8] to convert a number from 0 to 63 into a pair of indices between 0 and 7:

const numberToPosition = n => [Math.floor(n/8), n % 8];
const numbersToPositions = queenNumbers => queenNumbers.map(numberToPosition);

const combinationCandidates = mapWith(numbersToPositions, choose(64, 8));

const solutionsToEightQueens = filterWith(test, combinationCandidates);

diagramOf(first(solutionsToEightQueens))
  //=> ...go to bed and catch some 💤...

4,426,165,368 candidate solutions is still a tremendous space to search. It was definitely beyond my 1977 hardware.

But we can get faster. If we list the candidates out, we can see some of the problems right away. For example, the very first combination it wants to test is [[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [0, 6], [0, 7]]. The queens are all on the same row!
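We can confirm that with a couple of lines. This sketch repeats numberToPosition from above and inspects the candidate built from the numbers 0 through 7, which is the first array choose(64, 8) yields. The names firstCandidate and rowsUsed are introduced just for this check:

```javascript
const numberToPosition = n => [Math.floor(n / 8), n % 8];

// the first combination is simply the eight smallest square numbers
const firstCandidate = [0, 1, 2, 3, 4, 5, 6, 7].map(numberToPosition);
const rowsUsed = new Set(firstCandidate.map(([row]) => row));

console.log(firstCandidate[7]); // [0, 7]
console.log(rowsUsed.size);     // 1, every queen sits on row 0
```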

There is an easy fix for this and as a bonus, it gets us solutions really fast.


Huge Tree ©2009 Mitch Bennett


tree searching

In my original BASIC program way back in 1977, I built the board as I went, and marked the “threatened” squares. But instead of iterating over all the possible queen positions, as I added queens to the board I iterated over all the open positions. So after placing the first queen in the first open space, my board looked conceptually like this:

Qxxxxxxx
xx......
x.x.....
x..x....
x...x...
x....x..
x.....x.
x......x

The next queen I would try would be in the first “open” square, like this:

Qxxxxxxx
xxQxxxxx
xxxx....
x.xxx...
x.x.xx..
x.x..xx.
x.x...xx
x.x....x

I’d continue like this until there were eight queens, or I ran out of empty spaces. If I failed, I’d backtrack and try a different position for the last queen. If I ran out of different positions for the last queen, I’d try a different position for the second-to-last queen, and so on. This eliminated the problem of testing candidate solutions like [[0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], [0, 6], [0, 7]], because having placed [0,0], the next queen could not be placed on any square until [1, 2].

I did not know the words for it, but I was performing a depth-first search of a “tree” of positions. I was trying to find a path that was eight queens deep. And I was keeping the board updated to do so.

This method is interesting, because it is an “inductive” method that lends itself to recursive thinking. We begin with the solution for zero queens, an empty board. Then we successively search for ways to add one more queen to whatever we already have, backtracking if we run out of available spaces.

We can be clever in some ways. For example, we need only ever mark squares that come after the queen we are placing, as we never check any square earlier than the last queen we placed. Here’s an implementation similar to my 1977 approach, implemented as a class just to prove that we aren’t dogmatic about using functions for everything:

const OCCUPATION_HELPER = Symbol("occupationHelper");

class Board {
  constructor () {
    this.threats = [
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0
    ];
    this.queenIndices = [];
  }

  isValid (index) {
    return index >= 0 && index <= 63;
  }

  isAvailable (index) {
    return this.threats[index] === 0;
  }

  isEmpty () {
    return this.queenIndices.length === 0;
  }

  isOccupiable (index) {
    if (this.isEmpty()) {
      return this.isValid(index);
    } else {
      return this.isValid(index) && index > this.lastQueen() && this.isAvailable(index);
    }
  }

  numberOfQueens () {
    return this.queenIndices.length;
  }

  hasQueens () {
    return this.numberOfQueens() > 0;
  }

  queens () {
    return this.queenIndices.map(index => [Math.floor(index / 8), index % 8]);
  }

  lastQueen () {
    if (this.queenIndices.length > 0) {
      return this.queenIndices[this.queenIndices.length - 1];
    }
  }

  * availableIndices () {
    for (let index = (this.isEmpty() ? 0 : this.lastQueen() + 1); index <= 63; ++index) {
      if (this.isAvailable(index)) {
        yield index;
      }
    }
  }

  [OCCUPATION_HELPER] (index, action) {
    const [row, col] = [Math.floor(index / 8), index % 8];

    // the rest of the row
    const endOfTheRow = row * 8 + 7;
    for (let iThreatened = index + 1; iThreatened <= endOfTheRow; ++iThreatened) {
      action(iThreatened);
    }

    // the rest of the column
    const endOfTheColumn = 56 + col;
    for (let iThreatened = index + 8; iThreatened <= endOfTheColumn; iThreatened += 8) {
      action(iThreatened);
    }

    // diagonals to the left
    const lengthOfLeftDiagonal = Math.min(col, 7 - row);
    for (let i = 1; i <= lengthOfLeftDiagonal; ++i) {
      const [rowThreatened, colThreatened] = [row + i, col - i];
      const iThreatened = rowThreatened * 8 + colThreatened;

      action(iThreatened);
    }

    // diagonals to the right
    const lengthOfRightDiagonal = Math.min(7 - col, 7 - row);
    for (let i = 1; i <= lengthOfRightDiagonal; ++i) {
      const [rowThreatened, colThreatened] = [row + i, col + i];
      const iThreatened = rowThreatened * 8 + colThreatened;

      action(iThreatened);
    }

    return this;
  }

  occupy (index) {
    const occupyAction = index => {
      ++this.threats[index];
    };

    if (this.isOccupiable(index)) {
      this.queenIndices.push(index);
      return this[OCCUPATION_HELPER](index, occupyAction);
    }
  }

  unoccupy () {
    const unoccupyAction = index => {
      --this.threats[index];
    };

    if (this.hasQueens()) {
      const index = this.queenIndices.pop();

      return this[OCCUPATION_HELPER](index, unoccupyAction);
    }
  }
}

The Board class does nearly all the work. But here’s the function that finds solutions:

function * inductive (board = new Board()) {
  if (board.numberOfQueens() === 8) {
    yield board.queens();
  } else {
    for (const index of board.availableIndices()) {
      board.occupy(index);
      yield * inductive(board);
      board.unoccupy();
    }
  }
}

Very simple, and it shows at a high level exactly how things work. This inductive approach is a big step forward over combinations: With no further improvements or pruning, it tries 118,968 queen placements, a roughly 37,000-fold improvement over the 4,426,165,368 candidates of the combinations approach. It still is a little wasteful. For example, more than 42,000 of those placements involve placing the first queen on indices 8 or later, which means the first row is empty, and none of them will ever work. It’s just spinning its wheels after finding all 92 positions.

Also, and unlike our true generate-and-test approach, it interleaves partial generation with testing, so it’s not possible to break it into two separate pieces. A more subtle problem is this: By identifying the places in which we were trying to “choose” positions or look for “permutations” of positions, we were able to extract single responsibilities, and make them explicit with names.

It’s also a lot more complex, mostly because it’s mutating a single board in place rather than working with immutable data structures.
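To illustrate that trade-off, here is a minimal sketch of the same depth-first search that passes an immutable array of queen indices down the recursion instead of mutating a board. The names attacks and inductiveImmutable are assumptions made for this example, not code from my 1977 program. It is shorter, but it pays for that by re-checking every earlier queen for each square instead of consulting a table of threats:

```javascript
// hypothetical helper: do queens on squares a and b (numbered 0..63) attack each other?
const attacks = (a, b) => {
  const [rowA, colA] = [Math.floor(a / 8), a % 8];
  const [rowB, colB] = [Math.floor(b / 8), b % 8];

  return rowA === rowB ||
    colA === colB ||
    Math.abs(rowA - rowB) === Math.abs(colA - colB);
};

function * inductiveImmutable (queenIndices = []) {
  if (queenIndices.length === 8) {
    yield queenIndices.map(index => [Math.floor(index / 8), index % 8]);
  } else {
    // only consider squares after the last queen placed
    const start = queenIndices.length === 0
      ? 0
      : queenIndices[queenIndices.length - 1] + 1;

    for (let index = start; index <= 63; ++index) {
      if (queenIndices.every(queen => !attacks(queen, index))) {
        // concat builds a new array, so there is nothing to undo on backtracking
        yield * inductiveImmutable(queenIndices.concat([index]));
      }
    }
  }
}

console.log(Array.from(inductiveImmutable()).length); // 92
```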

But even so, checking 118,968 placements is quite achievable, even on 1977 hardware. We’ve broken out of theory and into practice. But we can get faster!


Castle


the “rooks” algorithm

Let’s digress and consider a simpler problem. What are all the ways that eight rooks can be placed on a chessboard such that they don’t threaten each other?

Obviously, no two rooks can be on the same column or row. So the “aha!” realization is that we want all the combinations of eight positions which have a unique column and a unique row.

Let’s start with the unique rows. Every time we generate a set of rooks, one will be on row 0, one on row 1, one on row 2, and so forth. So the candidate solutions can always be arranged to look like this:

[
  [0, ?], [1, ?], [2, ?], [3, ?], [4, ?], [5, ?], [6, ?], [7, ?]
]

Now what about the columns? Since no two rooks can share the same column, the candidate solutions must all have a unique permutation of the numbers 0 through 7, something like this:

[
  [?, 3], [?, 1], [?, 5], [?, 6], [?, 4], [?, 2], [?, 0], [?, 7]
]

We’ll need to be able to generate the permutations of the column numbers from 0 to 7, and assign them to the rows 0 through 7 in order. That way, each candidate will look something like this:

[
  [0, 3], [1, 1], [2, 5], [3, 6], [4, 4], [5, 2], [6, 0], [7, 7]
]

It’s fairly easy to generate arbitrary permutations[4] if we don’t mind splicing and reassembling arrays:

function * permutations (arr, prefix = []) {
  if (arr.length === 1) {
    yield prefix.concat(arr);
  } else if (arr.length > 1) {
    for (let i = 0; i < arr.length; ++i) {
      const chosen = arr[i];
      const remainder = arr.slice(0, i).concat(arr.slice(i+1, arr.length))

      yield * permutations(remainder, prefix.concat([chosen]));
    }
  }
}

permutations([1, 2, 3])
//=> [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]

And now we can apply our permutations generator to generating candidate solutions to the rooks problem:

const solutionsToEightRooks = mapWith(
  ii => ii.map((i, j) => [i, j]),
  permutations([0, 1, 2, 3, 4, 5, 6, 7])
);

Array.from(solutionsToEightRooks).length
  //=> 40320

How do we apply this to solving the eight queens problem? Well, the set of all solutions to the eight queens problem is a subset of the set of all solutions to the eight rooks problem, so let’s search the solutions to the eight rooks problem, and cut the search space down from 118,968 to 40,320!

const solutionsToEightQueens = filterWith(test, solutionsToEightRooks);

diagramOf(first(solutionsToEightQueens))
//=>
  Q.......
  ......Q.
  ....Q...
  .......Q
  .Q......
  ...Q....
  .....Q..
  ..Q.....

This is great! We’ve made a huge performance improvement simply by narrowing the “search space.” We’re down to 8! permutations of queens on unique rows and columns, just 40,320 different permutations to try.


Spatial Cardioidal Variations ©2013


programming digression: speeding up the testing

We’ve certainly sped things up by being smarter about the candidates we submit for testing. But what about the testing itself? The algorithm of filling in squares on a chess board very neatly matches how we might do this mentally, but it is quite slow. How can we make it faster?

For starters, if we know that we are only submitting solutions to the “eight rooks” problem, we need never test whether queens threaten each other on rows and columns. That cuts our testing workload roughly in half!

But what about diagonal attacks? Observe:

 0  1  2  3  4  5  6  7
 1  2  3  4  5  6  7  8
 2  3  4  5  6  7  8  9
 3  4  5  6  7  8  9 10
 4  5  6  7  8  9 10 11
 5  6  7  8  9 10 11 12
 6  7  8  9 10 11 12 13
 7  8  9 10 11 12 13 14

If we sum the row and column number (row + col), we get a number representing the position of one of a queen’s diagonals. We can thus use:

 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14

Instead of the entire chessboard! Simply compute the diagonal number for each queen and put an ‘x’ in this one-dimensional array. That’s much faster than putting an ‘x’ in every square of a diagonal.

What about the other diagonal?

 7  6  5  4  3  2  1  0
 8  7  6  5  4  3  2  1
 9  8  7  6  5  4  3  2
10  9  8  7  6  5  4  3
11 10  9  8  7  6  5  4
12 11 10  9  8  7  6  5
13 12 11 10  9  8  7  6
14 13 12 11 10  9  8  7

Ah! We can sum the row with the inverse of the column number (row + 7 - col). If we use two of these one-dimensional arrays, we can check both diagonal attacks much more quickly than tediously marking chessboard squares. Like this:

function testDiagonals (queens) {
  const nesw = [".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", "."];
  const nwse = [".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", ".", "."];

  if (queens.length < 2) return true;

  for (const [i, j] of queens) {
    if (nwse[i + j] !== '.' || nesw[i + 7 - j] !== '.') return false;

    nwse[i + j] = 'x';
    nesw[i + 7 - j] = 'x';
  }

  return true;
}

const solutionsToEightQueens = filterWith(testDiagonals, solutionsToEightRooks);

diagramOf(first(solutionsToEightQueens))
//=>
  Q.......
  ......Q.
  ....Q...
  .......Q
  .Q......
  ...Q....
  .....Q..
  ..Q.....
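Since each diagonal number is just a small integer from 0 to 14, the two arrays can be collapsed further into a pair of integers used as bit sets, which is the direction the bitwise approach mentioned in the notes takes. This is a hedged sketch of that idea, not the code from that article; the name testDiagonalsBitwise and its variable names are assumptions made for this example:

```javascript
function testDiagonalsBitwise (queens) {
  let sums = 0;        // bit n is set when some queen has row + col === n
  let differences = 0; // bit n is set when some queen has row + 7 - col === n

  for (const [row, col] of queens) {
    const sumBit = 1 << (row + col);
    const differenceBit = 1 << (row + 7 - col);

    // a bit that is already set means another queen shares this diagonal
    if ((sums & sumBit) !== 0 || (differences & differenceBit) !== 0) {
      return false;
    }

    sums |= sumBit;
    differences |= differenceBit;
  }

  return true;
}

console.log(testDiagonalsBitwise([[0, 0], [1, 1]])); // false, same diagonal
console.log(testDiagonalsBitwise([[0, 0], [1, 2]])); // true
```

Like testDiagonals, this checks diagonals only, so it still relies on the candidates having unique rows and columns.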

Checking diagonals without filling in squares is a specialized optimization, of course.[5] Now we have coupled the test with the generation algorithm. In a larger software project, we might decouple things so that we can use them in different places in different ways.

But when we come along to optimize something like this, the coupling makes it harder to reuse components, and it makes the program harder to change. Luckily for us, this isn’t an essay about writing large software projects.


Time rusting away their thin hinges ©2016 Derek Σωκράτης Finch


tree searching solutions to the rooks problem

The code we wrote for generating solutions to the rooks problem enumerates every permutation of eight squares that don’t share a common row or column. But many of those, of course, share a diagonal. For example, here are the first eight solutions it generates:

[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [7, 6], [6, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [6, 5], [5, 6], [7, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [6, 5], [7, 6], [5, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [7, 5], [5, 6], [6, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [7, 5], [6, 6], [5, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [5, 4], [4, 5], [6, 6], [7, 7]]
[[0, 0], [1, 1], [2, 2], [3, 3], [5, 4], [4, 5], [7, 6], [6, 7]]

We can see at a glance that any solution beginning with [0,0], [1,1] is not going to work, so why bother generating all of the myriad candidates that are disqualified from the very first thing we check?

If we think of the rooks code as generating a tree of candidate positions rather than a flat list, we can adapt it for the eight queens problem by checking partial solutions as we go, and pruning entire subtrees that couldn’t possibly work. In essence, we’re combining my original tree search with the rooks solution that I didn’t know about.

This algorithm builds solutions one row at a time, iterating over the open columns, and checking for diagonal attacks. If there are none, it recursively calls itself to add another row. When it reaches eight rows, it yields the solution. It finds all 92 solutions by searching just 5,508 positions (of which eight are the degenerate case of having just one queen on the first row):

const without = (array, element) =>
  array.filter(x => x !== element);

function * inductiveRooks (
  queens = [],
  candidateColumns = [0, 1, 2, 3, 4, 5, 6, 7]
) {
  if (queens.length === 8) {
    yield queens;
  } else {
    for (const chosenColumn of candidateColumns) {
      const candidateQueens = queens.concat([[queens.length, chosenColumn]]);
      const remainingColumns = without(candidateColumns, chosenColumn);

      if (testDiagonals(candidateQueens)) {
        yield * inductiveRooks(candidateQueens, remainingColumns);
      }
    }
  }
}

This has the best of both worlds: It takes advantage of the “rooks” optimization and the “inductive” approach to pruning subtrees. And although it isn’t a pure pipeline, it does break the generation apart from the testing. So it’s not only faster, it’s also simpler.
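If you would like to check the count of positions searched for yourself, here is a sketch that instruments the same row-by-row search. It is a condensed restatement rather than the code above: countedInductiveRooks and the Set-based testDiagonals are assumptions made so that the snippet stands alone, and the counter is bumped once per candidate position handed to the diagonal test:

```javascript
const without = (array, element) =>
  array.filter(x => x !== element);

// condensed stand-in for the diagonal test above, using Sets instead of arrays
function testDiagonals (queens) {
  const sums = new Set();
  const differences = new Set();

  for (const [row, col] of queens) {
    if (sums.has(row + col) || differences.has(row - col)) return false;

    sums.add(row + col);
    differences.add(row - col);
  }

  return true;
}

function countedInductiveRooks () {
  let tested = 0;
  let solutions = 0;

  const search = (queens, candidateColumns) => {
    if (queens.length === 8) {
      ++solutions;
      return;
    }

    for (const chosenColumn of candidateColumns) {
      const candidateQueens = queens.concat([[queens.length, chosenColumn]]);

      ++tested; // one more candidate position examined
      if (testDiagonals(candidateQueens)) {
        search(candidateQueens, without(candidateColumns, chosenColumn));
      }
    }
  };

  search([], [0, 1, 2, 3, 4, 5, 6, 7]);

  return { tested, solutions };
}

console.log(countedInductiveRooks()); // { tested: 5508, solutions: 92 }
```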

I wish I’d thought of this approach in 1977!


Corner Office ©2016 Michael Pardo


bonus: exploiting symmetry

Something else comes to mind when thinking about reducing the size of the tree to search. There is symmetry to the queen moves, and as a consequence, the positions we find have rotational symmetry, and they also have reflective symmetry on either horizontal or vertical axes.

One way to exploit this begins with noting that every valid arrangement also has another valid arrangement that is symmetrical under vertical reflection, like these two mirror images of each other:

Q.......  .......Q
......Q.  .Q......
....Q...  ...Q....
.......Q  Q.......
.Q......  ......Q.
...Q....  ....Q...
.....Q..  ..Q.....
..Q.....  .....Q..

Thus, every time we discover a valid arrangement, we can go ahead and make a vertical mirror image of it. That saves us work if we can also avoid generating and testing that mirror image arrangement.
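Here is a small sketch of that claim, using the left-hand arrangement from the diagram. The helpers reflect and isSolution are names made up for this check; isSolution is a deliberately naive pairwise test, not the optimized one from earlier:

```javascript
// mirror each queen across the vertical axis of the board
const reflect = queens => queens.map(([row, col]) => [row, 7 - col]);

// naive check: no two distinct queens share a row, column, or diagonal
const isSolution = queens =>
  queens.every(([r1, c1], a) =>
    queens.every(([r2, c2], b) =>
      a === b ||
      (r1 !== r2 && c1 !== c2 && Math.abs(r1 - r2) !== Math.abs(c1 - c2))
    )
  );

// the left-hand board above, one [row, col] pair per row
const solution = [
  [0, 0], [1, 6], [2, 4], [3, 7], [4, 1], [5, 3], [6, 5], [7, 2]
];
const mirrored = reflect(solution);

console.log(isSolution(solution)); // true
console.log(isSolution(mirrored)); // true
console.log(mirrored[0]);          // [0, 7], the top-right corner
```

Reflecting twice gets us back where we started, which is why solutions pair up neatly.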

So the $64,000 question is, “Can we avoid the work of generating both positions and their mirror images?”

Note the following numbered positions:

1234....
........
........
........
........
........
........
........

The “inductive” approach calculates every possible arrangement that has a queen in position 1 before computing those with a queen in position 2, then 3, then 4. When it has done so, it has computed half of the possible arrangements. But as we noted above, we can simply make a mirror image copy of each solution found, and thus we do not need to search all of the possible mirror arrangements.

Therefore, when we have searched all of the arrangements with a queen in positions one through four, we have essentially already searched all of these arrangements as well:

....4321
........
........
........
........
........
........
........

We thus know every possible solution that has a queen in one of the first four squares, plus every possible solution that does not have a queen in any of the first four squares. This is every possibility, and we need compute no further. Therefore, we can cut the search in half simply by only doing half the work, and then reflecting the solutions:

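function * halfInductive () {
  for (const candidateQueens of [[[0, 0]], [[0, 1]], [[0, 2]], [[0, 3]]]) {
    // inductiveRooks needs the columns that are still open, so remove
    // the column already taken by the queen on the first row
    const [[, firstColumn]] = candidateQueens;
    const remainingColumns = without([0, 1, 2, 3, 4, 5, 6, 7], firstColumn);

    yield * inductiveRooks(candidateQueens, remainingColumns);
  }
}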

function verticalReflection (queens) {
  return queens.map(
    ([row, col]) => [row, 7 - col]
  );
}

function * flatMapWith (fn, iterable) {
  for (const element of iterable) {
    yield * fn(element);
  }
}

const withReflections = flatMapWith(
  queens => [queens, verticalReflection(queens)], halfInductive());

Array.from(withReflections).length
  //=> 92

Now we’re really getting lazy: We only have to evaluate 2,750 candidate positions, a far, far smaller number than the original worst-case, most-pessimum, 281,474,976,710,656 tests. How much smaller? One hundred billion times smaller!

Mind you, a fairer comparison is to the combinations approach, which required 4,426,165,368 tests. A tree of 2,750 candidate positions is more than 1.5 million times smaller. We’ll take it![6]


©2009 Matteo


obtaining fundamental solutions

Now that we’ve had a look at exploiting vertical symmetry to do less work but still generate all of the possible solutions, including those that are reflections and rotations of each other, what about going the other way?

As Wikipedia explains, “If solutions that differ only by the symmetry operations of rotation and reflection of the board are counted as one, the puzzle has 12 solutions. These are called fundamental solutions.”

If we only want the fundamental solutions, we can filter the solutions we generate by testing them against a set that includes reflections and rotations. We obviously won’t actually output reflections and rotations, we’re just using them to filter the results:

const sortQueens = queens =>
  queens.reduce(
    (acc, [row, col]) => (acc[row] = [row, col], acc),
    [null, null, null, null, null, null, null, null]
  );

const rotateRight = queens =>
  sortQueens( queens.map(([row, col]) => [col, 7 - row]) );

const rotations = solution => {
  const rotations = [null, null, null];
  let temp = rotateRight(solution);

  rotations[0] = temp;
  temp = rotateRight(temp);
  rotations[1] = temp;
  temp = rotateRight(temp);
  rotations[2] = temp;

  return rotations;
}

const indexQueens = queens => queens.map(([row, col]) => `${row},${col}`).join(' ');

function * fundamentals (solutions) {
  const solutionsSoFar = new Set();

  for (const solution of solutions) {
    const iSolution = indexQueens(solution);

    if (solutionsSoFar.has(iSolution)) continue;

    solutionsSoFar.add(iSolution);
    const rSolutions = rotations(solution);
    const irSolutions = rSolutions.map(indexQueens);
    for (let irSolution of irSolutions) {
      solutionsSoFar.add(irSolution);
    }

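    const vSolution = verticalReflection(solution);

    // record the mirror image itself as well as its rotations
    solutionsSoFar.add(indexQueens(vSolution));

    const rvSolutions = rotations(vSolution);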
    const irvSolutions = rvSolutions.map(indexQueens);

    for (let irvSolution of irvSolutions) {
      solutionsSoFar.add(irvSolution);
    }

    yield solution;
  }
}

function niceDiagramOf (queens) {
  const board = [
    ["⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️"],
    ["⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️"],
    ["⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️"],
    ["⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️"],
    ["⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️"],
    ["⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️"],
    ["⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️"],
    ["⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️", "⬛️", "⬜️"]
  ];

  for (const [row, col] of queens) {
    board[7 - row][col] = "👸🏾";
  }

  return board.map(row => row.join('')).join("\n");
}

mapWith(niceDiagramOf, fundamentals(halfInductive()))

Success!!! If we want to check them against the fundamental solutions listed in Wikipedia, our algorithm outputs them in this order (some are rotations or reflections of the solution as displayed in Wikipedia):

 3  2  1  4  5  6  7  8  9 12 10 11

The Twelve Fundamental Solutions
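A different route to the same twelve, offered as a sketch rather than the approach used above, is to give every solution a canonical form: apply all eight symmetries of the square, turn each result into a string, and keep the smallest. Two solutions are then equivalent exactly when their canonical strings match. The names solve, symmetries, keyOf, and canonicalKey are all assumptions made for this example, and solve is a compact stand-in for the generators above:

```javascript
// compact row-by-row solver; a solution is an array of columns indexed by row
function * solve (columns = []) {
  if (columns.length === 8) {
    yield columns;
    return;
  }

  const row = columns.length;
  for (let col = 0; col < 8; ++col) {
    const safe = columns.every(
      (c, r) => c !== col && Math.abs(c - col) !== row - r
    );
    if (safe) yield * solve(columns.concat([col]));
  }
}

// the eight symmetries of the square, acting on [row, col] pairs
const symmetries = [
  ([r, c]) => [r, c],         // identity
  ([r, c]) => [c, 7 - r],     // rotate a quarter turn
  ([r, c]) => [7 - r, 7 - c], // rotate a half turn
  ([r, c]) => [7 - c, r],     // rotate three quarter turns
  ([r, c]) => [r, 7 - c],     // vertical reflection
  ([r, c]) => [7 - r, c],     // horizontal reflection
  ([r, c]) => [c, r],         // reflection across the main diagonal
  ([r, c]) => [7 - c, 7 - r]  // reflection across the other diagonal
];

// order the queens by row and read off their columns as a string of digits
const keyOf = pairs =>
  pairs
    .slice()
    .sort(([r1], [r2]) => r1 - r2)
    .map(([, c]) => c)
    .join('');

const canonicalKey = columns => {
  const pairs = columns.map((c, r) => [r, c]);
  const keys = symmetries.map(symmetry => keyOf(pairs.map(symmetry)));

  return keys.sort()[0];
};

const fundamentalKeys = new Set();
for (const solution of solve()) {
  fundamentalKeys.add(canonicalKey(solution));
}

console.log(fundamentalKeys.size); // 12
```

The design choice here is to trade the bookkeeping of a "seen so far" set of rotations and reflections for a little extra work per solution.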


And so to bed…

It was a lot of fun to revisit Martin Gardner’s column on the eight queens problem, and especially to rewrite these algorithms forty years later. It was neat to look at everything again with fresh eyes, and to see how we could go from searching 281,474,976,710,656, to 4,426,165,368, to 40,320, to 5,508, and finally to 2,750 candidate positions.

This post doesn’t have a deep insight into program design, and thus there’s no major point to summarize. Just as there can be recreational mathematics, there can be recreational programming. And that’s a very fine thing to enjoy.

Thank you, and good night!


Code
Notes
  1. Of particular interest to me was that the Spaced Out Library also contained a collection of sci-fi themed wargames. At the time, these games were quite expensive and nearly always out of my financial reach. I recall going to the library with some like-minded neighbourhood friends and playing games like BattleFleet Mars, Starforce: Alpha Centauri and StarSoldier, or just reading the rules and fantasizing about the universe described in the games. 

  2. There’s a nice history of the Sanford Fleming Building on Skulepedia, including an account of the infamous fire that engulfed the building in the Spring of 1977. 

  3. You’ll find the code for all of the solutions in this gist

  4. One of the benefits of having some exposure to math and computer science is this: If you recognize that something is a formal concept, you can extract it, and name it after the “term of art” that is well-understood. Without that exposure, you may reinvent the concept, but you are less likely to know to extract it independently and probably won’t give it a name that everyone recognizes at a glance. Thus, we can create explicit functions like choose and permutations, and that is superior to having the exact same functionality performed implicitly in the code. 

  5. But far from the only optimization! Once we grasp that the interesting thing about a queen is the number of its row and column, and then add the concept of the numbers of its NESW and NWSE diagonals, it is a fairly obvious step to go right to encoding queen positions as bits. This allows us to use very fast bitwise operations instead of reading and writing from arrays. Here’s an example in JavaScript that exploits bitwise operators and inductive search: A bitwise solution to the n-Queens problem in Javascript

  6. There are some other optimizations available around exploiting horizontal or rotational symmetry to reduce the search space. For example, consider an approach where we also generate the horizontal reflection of a solution we find. In that case, after completing all the possible solutions that include a Queen in position [0, 0], we can eliminate trying any solutions that have a queen in position [7,0]. Alas, this is a leaf, so we don’t get to prune an entire subtree, and the ratio of code complexity to gains is marginal. I found my attempt inelegant. You might want to play around with searching in different ways so that there is an elegant way to exploit horizontal and rotational symmetry to reduce the search space. 

https://raganwald.com/2018/08/03/eight-queens
A Trick of the Tail

In Recursion? We don’t need no stinking recursion!, we looked at seven different techniques for turning recursive functions into iterative functions. In this post, we’re going to take a deeper look at technique #3, convert recursion to iteration with tail calls.

Before we dive into it, here’s a quick recap of what we explored in the previous post:


recursion, see recursion

The shallow definition of a recursive algorithm is a function that directly or indirectly calls itself. For example, the factorial of an integer:

In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n. For example,

5! = 5 * 4 * 3 * 2 * 1 = 120.

The value of 0! is 1, according to the convention for an empty product.

In JavaScript, we can write it as:

function factorial (n) {
  if (n === 0) {
    return 1;
  } else {
    return n * factorial(n - 1);
  }
}

Our factorial function clearly calls itself. And because of the way almost every implementation of JavaScript we encounter is designed, every time it calls itself, it creates a frame on the program stack. The stack is a limited resource, and for a sufficiently large number, our function will exhaust the stack.

This is sometimes given as a reason to convert recursive calls to iteration. That is true in theory, but in practice it is unusual to have to worry about the stack being exhausted. For example, if we wanted to compute 5000!, rewriting our function to avoid exhausting the stack is the least of our worries. We’d also have to convert our function to work with some kind of Big Integer data type, as we are going to end up working with some huge integers, and JavaScript does not support arbitrarily large numbers “out of the box.”
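For what it’s worth, JavaScript engines that implement ES2020 do ship a built-in BigInt type, which takes care of the large-integer half of that problem. Here is a minimal sketch, assuming such an engine. The name factorialBig is made up for this example, and it uses a plain loop so that the stack is not a concern:

```javascript
// hypothetical helper: factorial computed exactly with BigInt
function factorialBig (n) {
  let product = 1n;

  for (let i = 2n; i <= BigInt(n); ++i) {
    product *= i;
  }

  return product;
}

console.log(factorialBig(5));  // 120n
console.log(factorialBig(21)); // 51090942171709440000n, beyond Number.MAX_SAFE_INTEGER
```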

However, exploring the process of converting a recursive function to a function that is tail recursive is interesting in its own right, and furthermore, exploring how to make a function that is tail recursive avoid exhausting the stack is even more interesting in its own right, so that’s what we’re going to do.

tail calls

In computer science, a tail call is a subroutine call performed as the final action of a procedure. If a tail call might lead to the same subroutine being called again later in the call chain, the subroutine is said to be tail-recursive, which is a special case of recursion. Tail recursion (or tail-end recursion) is particularly useful, and often easy to handle in implementations.

The TL;DR is that if a function calls another function and then does nothing with the result except return it, it is said to be making a tail call. Here’s a simplified version of a function from JavaScript Allongé:

function whenNotNull(fn, ...args) {
  if (args.length === 0) {
    return;
  }
  for (const arg of args) {
    if (arg == null) {
      return;
    }
  }
  return fn(...args);
}

factorial(5)
  //=> 120
factorial(null)
  //=> Maximum call stack size exceeded.

whenNotNull(factorial, 5)
  //=> 120
whenNotNull(factorial, null)
  //=> undefined

whenNotNull is a higher-order function closely related to the maybe decorator. We call it with the name of a function and one or more arguments. If none of the arguments are null, it returns the result of calling that function with the arguments. But if no arguments are supplied, or any of them are null, it simply returns without calling the function.

The key thing to observe is that when whenNotNull calls fn, it returns the result with no further calculations or computations. The statement return fn(...args); is a tail call. By way of contrast, the statement return n * factorial(n - 1); is not a tail call, because after invoking factorial(n - 1), our factorial function proceeds to multiply the result by n.

A tail recursive function is simply a function that only makes calls in tail position, and that as a result of making a call in tail position, directly or indirectly calls itself.

Tail recursive functions are interesting for several reasons. Let’s look at the first practical reason why they are interesting:


Piet Mondrian, Composition, 1921 by Sharon Mollerus

converting simple recursive functions to tail recursive functions using functional composition

There is a large class of recursive functions that can be converted into tail recursive functions. Tom Moertel gives a procedure for performing this conversion in his Tricks of the trade: Recursion to Iteration series:

  1. Find a recursive call that’s not a tail call.
  2. Identify what work is being done between that call and its return statement.
  3. Extend the function with a secret feature to do that work, as controlled by a new accumulator argument with a default value that causes it to do nothing.
  4. Use the secret feature to eliminate the old work.
  5. You’ve now got a tail call!
  6. Repeat until all recursive calls are tail calls.
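As a minimal sketch of these steps in their standard form, where the accumulator is plain data, here is the factorial function from above with an extra parameter. The name factorialTailRecursive and the acc parameter are assumptions made for illustration, they are not taken from Moertel's series:

```javascript
// Step 1: the non-tail call is `n * factorial(n - 1)`.
// Step 2: the work done after that call returns is "multiply by n".
// Steps 3 and 4: carry the pending work in `acc`. Its default of 1,
// the multiplicative identity, causes it to do nothing.
function factorialTailRecursive (n, acc = 1) {
  if (n === 0) {
    return acc;
  } else {
    // Step 5: the recursive call is now the last thing the function does.
    return factorialTailRecursive(n - 1, acc * n);
  }
}

factorialTailRecursive(5);
  //=> 120
```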

We’ll use a contrived version of it here. Let’s start with a ridiculous recursive function:

function isEven (n) {
  if (n === 0) {
    return true;
  } else {
    return !isEven(n - 1);
  }
}

isEven(13)
  //=> false

There’s a really obvious transformation into a tail-recursive form,1 but let’s follow Moertel’s steps, sort of. First, we identify the recursive call that is not in tail position:

return !isEven(n - 1);

Next, we identify the work that is being done between that call and the return statement. It’s the !, which is the prefix operator for logical negation. Since this is JavaScript, and we prefer function-oriented programming, we can refactor it into an immediately invoked anonymous function:

function isEven (n) {
  if (n === 0) {
    return true;
  } else {
    return (x => !x)(isEven(n - 1));
  }
}

Now we’ve identified that the work to be done is the function x => !x. Next, we extend the isEven function to do any extra work it is passed in an “accumulator argument with a default value that causes it to do nothing”. In the standard form, we would pass data, but in our function-oriented style, we will pass a function that does the work.

And the default value that does no work is the infamous “Identity function,” or “Idiot Bird:”

function isEven (n, accFn = x => x) {
  if (n === 0) {
    return true;
  } else {
    return isEven(n - 1, x => !x);
  }
}

Now how shall our function make use of accFn? In the case of n === 0, it is obvious:

function isEven (n, accFn = x => x) {
  if (n === 0) {
    return accFn(true);
  } else {
    return isEven(n - 1, x => !x);
  }
}

But what about our recursive call? Let’s temporarily do the same thing:

function isEven (n, accFn = x => x) {
  if (n === 0) {
    return accFn(true);
  } else {
    return accFn(isEven(n - 1, x => !x));
  }
}

This works, although maddeningly we still have a non-tail recursive call, we’ve just swapped accFn(isEven(n - 1, x => !x)) for !isEven(n - 1). However, we have a semi-secret weapon: Naïve function composition. If we compose accFn with x => !x, we can pass it into the function to be done later:

function compose (a, b) {
  return (...args) => a(b(...args));
}

function isEven (n, accFn = x => x) {
  if (n === 0) {
    return accFn(true);
  } else {
    return isEven(n - 1, compose(accFn, x => !x));
  }
}

And now, our call to isEven is in tail position. What’s the difference? Let’s rename our functions so we won’t get them confused:

function isEvenNotTailRecursive (n) {
  if (n === 0) {
    return true;
  } else {
    return !isEvenNotTailRecursive(n - 1);
  }
}

isEvenNotTailRecursive(100000)
  //=> Maximum call stack size exceeded.

function isEvenTailRecursive (n, accFn = x => x) {
  if (n === 0) {
    return accFn(true);
  } else {
    return isEvenTailRecursive(n - 1, compose(accFn, x => !x));
  }
}

isEvenTailRecursive(100000)
  //=> true

In implementations that support tail call optimization, our tail recursive version does not consume space on the call stack. That is very interesting!
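Whether a particular engine is one of those implementations can be probed at run time. What follows is a rough sketch rather than a robust feature test: the name supportsTailCalls and the depth of one million are hypothetical choices, and the sketch relies on the fact that the language standard only specifies proper tail calls for strict mode code.

```javascript
function supportsTailCalls () {
  'use strict'; // proper tail calls are only specified for strict mode code

  // Assumption: one million calls is far deeper than a typical stack allows.
  const depth = 1000000;

  // Both branches of the conditional are in tail position.
  const countDown = n => (n === 0 ? true : countDown(n - 1));

  try {
    return countDown(depth);
  } catch (error) {
    // Without tail call optimization, the engine runs out of stack.
    return false;
  }
}
```

If supportsTailCalls() returns false, examples like isEvenTailRecursive(100000) can be expected to exhaust the stack in that engine.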


3D view of May072013lja1a by Kent Schimke

the problem with our naïve functional composition approach

To summarize, when we wish to convert a recursive function to a tail recursive function, we follow these steps:

  1. Find a recursive call that’s not a tail call.
  2. Identify what work is being done between that call and its return statement, and make it a work function. In our example, the work function was x => !x.
  3. Extend the recursive function with a new accumulator function argument with a default value that causes it to do nothing: the identity function, x => x.
  4. Wherever we are returning a result, run it through the accumulator function.
  5. Wherever we are making a recursive call, pass in the composition of the accumulator function and the work function.
  6. We’ve now got a tail call!
  7. Repeat until all recursive calls are tail calls. Now we have a tail-recursive function.

Having done so, implementations that optimize for tail calls are able to avoid consuming space on the call stack.

We’ll now layer in a little more complexity with the sumTo function (It’s deliberately very similar to factorial).

function sumTo (n) {
  if (n === 0) {
    return 0;
  } else {
    return n + sumTo(n - 1);
  }
}

sumTo(100000)
  //=> Maximum call stack size exceeded.

Unlike isEven, the work to be done between the recursive call and the return statement is not fixed, so for our work function, we will need a closure:

function sumToTailRecursive (n, accFn = x => x) {
  if (n === 0) {
    return accFn(0);
  } else {
    return sumToTailRecursive(n - 1, compose(accFn, x => n + x));
  }
}

sumToTailRecursive(100000)
  //=> 5000050000

Excellent! Now for a trick question: How much space do our tail recursive functions take up? Well, it is going to be on the order of the size of our input.

It’s possible that every time we evaluate an expression like x => !x, we get a new function object. It would be nice if our implementation knew enough to hoist it out of our function for us and make it a constant, but even if it did, when we evaluate compose(accFn, x => !x), we are absolutely creating a new function object.

And of course, functions like x => n * x are going to be created fresh every time sumToTailRecursive is called. So one way or the other, we are going to end up with a lot of function objects when we use the function composition method for transforming recursive functions into tail recursive functions.
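To get a feel for the numbers, here is a small sketch that instruments compose with a counter. The names countingCompose, isEvenCounted, and functionsCreated are hypothetical, and the input is kept small so that the example also runs in engines without tail call optimization:

```javascript
let functionsCreated = 0;

// The same naive compose, instrumented to count the functions it returns.
function countingCompose (a, b) {
  functionsCreated += 1;
  return (...args) => a(b(...args));
}

function isEvenCounted (n, accFn = x => x) {
  if (n === 0) {
    return accFn(true);
  } else {
    return isEvenCounted(n - 1, countingCompose(accFn, x => !x));
  }
}

isEvenCounted(1000);
  //=> true

functionsCreated;
  //=> 1000
```

The counter only tracks the functions returned by countingCompose. Any fresh x => !x function objects the engine creates along the way come on top of that.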

Let’s address the problem that compose creates. It’s a naïve function that composes any two functions, so it’s the right default choice. But sometimes, the functions we’re composing have some kind of special property that allows us to avoid profligately creating new functions.

In the case of composing x => !x with x => !x or with x => x, there is a special set of rules. Here’s a composition table we can make:

a          b          compose(a, b)
x => x     x => x     x => x
x => !x    x => !x    x => x
x => x     x => !x    x => !x
x => !x    x => x     x => !x

If we ideologically stay with functions, we can start with extracting our anonymous functions to bind them to specific variables, and then write our own composition function:

const I = x => x;
const not = x => !x;

function composeNotI (a, b) {
  if (a === b) {
    return I;
  } else {
    return not;
  }
}

function isEvenConstantSpace (n, accFn = I) {
  if (n === 0) {
    return accFn(true);
  } else {
    return isEvenConstantSpace(n - 1, composeNotI(accFn, not));
  }
}
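As a sanity check, we can compare the special-purpose composition against naïve composition for every pairing of I and not. This is only a sketch, and agreesWithNaiveCompose is a hypothetical helper written for the purpose:

```javascript
const I = x => x;
const not = x => !x;

// Naive composition, used here only as the reference behaviour.
function compose (a, b) {
  return (...args) => a(b(...args));
}

function composeNotI (a, b) {
  if (a === b) {
    return I;
  } else {
    return not;
  }
}

// Try every pairing of I and not, on both boolean values.
function agreesWithNaiveCompose () {
  for (const a of [I, not]) {
    for (const b of [I, not]) {
      for (const value of [true, false]) {
        if (composeNotI(a, b)(value) !== compose(a, b)(value)) {
          return false;
        }
      }
    }
  }
  return true;
}

agreesWithNaiveCompose();
  //=> true
```

The check depends on a and b being the very same function objects bound to I and not. composeNotI compares them by identity, so it cannot be handed arbitrary functions.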

This is a little bit of wrangling, but what it comes down to is recognizing how operations compose. Now let’s look at sumTo.


Mondrian 3D by Felipe Salgado

converting simple recursive functions to tail recursive functions using data composition

Let’s look at sumTo again:

function sumTo (n) {
  if (n === 0) {
    return 0;
  } else {
    return n + sumTo(n - 1);
  }
}

We’ll do the usual refactoring, but note the choice of default accFn:

function sumTo (n, accFn = x => 0 + x) {
  if (n === 0) {
    return accFn(0);
  } else {
    return sumTo(n - 1, compose(accFn, x => n + x));
  }
}

Now, what happens if, instead of accFn, we pass a number into sumTo? We’ll need to expand accFn in place:

function sumTo (n, acc = 0) {
  if (n === 0) {
    return acc + 0;
  } else {
    return sumTo(n - 1, compose(acc, n));
  }
}

compose doesn’t work with integers, but we do know how to compose integers for addition:

function sumToConstantSpace (n, acc = 0) {
  if (n === 0) {
    return acc + 0;
  } else {
    return sumToConstantSpace(n - 1, acc + n);
  }
}

sumToConstantSpace(100000)
  //=> 5000050000

And this is how conversion to tail recursive form is usually handled, by finding a way to compose the data instead of explicitly composing functions. But working through functional composition highlights what we’re really doing: converting the work to be done later into a function, applying it later, and composing work to be done.
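The same move works for isEven. In this sketch, which uses the hypothetical name isEvenWithData, the accumulator holds the boolean that the accumulated function would eventually have produced from true, so composing in one more negation amounts to flipping it:

```javascript
function isEvenWithData (n, acc = true) {
  if (n === 0) {
    return acc;
  } else {
    // Composing one more `x => !x` is just flipping the boolean.
    return isEvenWithData(n - 1, !acc);
  }
}

isEvenWithData(13);
  //=> false
isEvenWithData(10);
  //=> true
```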

Now let’s take another look at converting a tail-recursive form to an iterative form.


Infinite Loop Apple by Mario Antonio Pena Zapatería

converting tail-recursive functions to iterative loops

Above, we gave this example of a tail recursive function executing without consuming the call stack:

function sumToConstantSpace (n, acc = 0) {
  if (n === 0) {
    return acc + 0;
  } else {
    return sumToConstantSpace(n - 1, acc + n);
  }
}

// Safari

sumToConstantSpace(100000)
  //=> 5000050000

That worked on the Safari browser, which in addition to being far more thrifty with battery life on OS X and iOS devices, implements Tail Call Optimization, as specified in the JavaScript standard. Alas, most other implementations refuse to implement TCO.

If we run the same code in Chrome…

// Chrome

sumToConstantSpace(100000)
  //=> Maximum call stack size exceeded

Ugh.

Well, the good news is that we can fix this problem. There’s a simple transformation to turn any tail recursive function into an iterative function with a loop:

  1. Wrap everything in an infinite loop
  2. Transform all tail recursive calls to rebind the function’s parameters, followed by a continue statement

Here’s sumTo wrapped in a loop:

function sumToLoop (n, acc = 0) {
  while (true) {
    if (n === 0) {
      return acc + 0;
    } else {
      return sumToLoop(n - 1, acc + n);
    }
  }
}

And now we transform the tail call:

function sumToLoop (n, acc = 0) {
  while (true) {
    if (n === 0) {
      return acc + 0;
    } else {
      [n, acc] = [n - 1, acc + n];
      continue;
    }
  }
}

// Chrome

sumToLoop(100000)
  //=> 5000050000

Unfortunately, this will not work with tail recursive functions built with naïve functional composition:

function compose (a, b) {
  return (...args) => a(b(...args));
}

function isEvenLoop (n, accFn = x => x) {
  while (true) {
    if (n === 0) {
      return accFn(true);
    } else {
      [n, accFn] = [n - 1, compose(accFn, x => !x)];
      continue;
    }
  }
}

// Safari

isEvenLoop(100000)
  //=> true

// Chrome

isEvenLoop(100000)
  //=> Maximum call stack size exceeded

What happened? Well, in addition to using up a lot of space, the functions we built with compose are like a Matryoshka doll. Each one calls a function that calls a function and so on down the line. Luckily, each composed function calls the next one in tail position, so Safari manages to invoke them without using up the call stack.

Chrome doesn’t optimize that, so isEvenLoop breaks on Chrome.

Now what if we find another way to compose functions that doesn’t involve nesting functions like Matryoshka dolls? We did discuss using special-purpose composition:

const I = x => x;
const not = x => !x;

function composeNotI (a, b) {
  if (a === b) {
    return I;
  } else {
    return not;
  }
}

function isEvenLoopCompose (n, accFn = I) {
  while (true) {
    if (n === 0) {
      return accFn(true);
    } else {
      [n, accFn] = [n - 1, composeNotI(accFn, not)];
      continue;
    }
  }
}

// Chrome

isEvenLoopCompose(100000)
  //=> true

And purely as an exercise, we can also conceive of a different way to compose functions, one that relies on linked lists of functions:

class Composition {
  constructor (a, b = undefined) {
    this.a = a;
    this.b = b;
  }

  call (pseudoThis, ...args) {
    let cell = this;

    while (true) {
      const result = cell.a.apply(pseudoThis, args);
      if (cell.b === undefined) {
        return result;
      } else {
        [args, cell] = [[result], cell.b];
      }
    }
  }
}

function isEven (n, accComposition = new Composition(x => x)) {
  while (true) {
    if (n === 0) {
      return accComposition.call(null, true);
    } else {
      [n, accComposition] = [
        n - 1,
        new Composition(x => !x, accComposition)
      ];
      continue;
    }
  }
}

// Chrome

console.log(isEven(100000))
  //=> true

Our conclusion is that the conversion to an iterative loop is fine, provided that we use the data composition method, or some function composition method that does not rely on nesting functions. If we use naïve functional composition, we will still wind up with a recursive function hidden in our accumulator.
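One more non-nesting option, sketched here under the hypothetical name isEvenWithPendingWork, is to keep the pending work functions in a plain array and apply them in a flat loop. This is an illustration of the idea, not a replacement for the Composition class:

```javascript
function isEvenWithPendingWork (n) {
  const pending = []; // work to be done later, oldest first

  while (n !== 0) {
    pending.push(x => !x);
    n = n - 1;
  }

  // Apply the newest work first, mirroring compose(accFn, work).
  // Each function returns before the next is called, so nothing nests.
  let result = true;
  for (let i = pending.length - 1; i >= 0; i -= 1) {
    result = pending[i](result);
  }
  return result;
}

isEvenWithPendingWork(100000);
  //=> true
```

The array still grows with the size of the input, so this trades stack space for heap space rather than running in constant space.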


notes
  1. const isEven = n => n === 0 ? true : n === 1 ? false : isEven(n - 2); 

https://raganwald.com/2018/05/27/tail
Recursion? We don't need no stinking recursion!

Interviewer: “Please whiteboard an algorithm that Counts the leaves in a tree/Solves Towers of Hanoi/Random pet recursion problem.”

Interviewee: “Ok… Scribble, scribble… That should do it.”

Interviewer: “That looks like it works, but can you convert it to an iterative solution?”

Interviewee: “Hmmmm…”

The good news is that every recursive algorithm can be implemented with iteration. Whether we should implement a recursive algorithm with iteration is, as they say, “an open problem,” and the factors going into that decision are so varied that there is no absolute “Always favour recursion” or “Never use recursion” answer.

However, what we can say with certainty is that knowing how to implement a recursive algorithm with iteration is deeply interesting! And as a bonus, this knowledge is useful when we do encounter one of the situations where we want to convert an algorithm that is normally recursive into an iterative algorithm.

Even if we never encounter this exact question in an interview.


Rhombic Dodecahedron

recursion, see recursion

The shallow definition of a recursive algorithm is a function that directly or indirectly calls itself. Digging a little deeper, most recursive algorithms address the situation where the solution to a problem can be obtained by dividing it into smaller pieces that are similar to the initial problem, solving them, and then combining the solution.

This is known as “divide and conquer,” and here is an example: Counting the number of leaves in a tree. Our tree is represented as an object of type Tree that contains one or more children. Each child is either a single leaf, represented as an object of class Leaf, or a subtree, represented as another object of type Tree.

So a tree that contains a single leaf is just new Tree(new Leaf()), while a tree that contains three leaves might be new Tree(new Leaf(), new Leaf(), new Leaf()):

class Leaf {}

class Tree {
  constructor(...children) {
    this.children = children;
  }
}

function countLeaves(tree) {
  if (tree instanceof Tree) {
    return tree.children.reduce(
      (runningTotal, child) => runningTotal + countLeaves(child),
      0
    )
  } else if (tree instanceof Leaf) {
    return 1;
  }
}

const sapling = new Tree(
  new Leaf()
);

countLeaves(sapling)
  //=> 1

const tree = new Tree(
  new Tree(
    new Leaf(), new Leaf()
  ),
  new Tree(
    new Tree(
      new Leaf()
    ),
    new Tree(
      new Leaf(), new Leaf()
    ),
    new Leaf()
  )
);

countLeaves(tree)
  //=> 6

This is a classic divide-and-conquer: Divide a tree up into its children, count the leaves in each child, then sum them to get the count of leaves in the tree.

For the vast majority of cases, recursive algorithms are just fine. This is especially true when the form of the algorithm matches the form of the data being manipulated. A recursive algorithm to “fold” the elements of a tree makes a certain amount of sense because the definition of a tree is itself recursive: A tree is either a leaf or another tree. And the function we just saw either returns 1 or the count of leaves in a tree.

But sometimes people want iterative algorithms. It could be that recursion eats up too much stack space, and they would rather consume heap space. It could be that they just don’t like recursive algorithms. Or… Who knows? Our purpose here is not to fall down a hole of discussing performance and optimization, we’d rather fall down a hole of exploring what kinds of interesting techniques we might use to transform recursion into iteration.

Let’s start with the simplest, and one that works remarkably well:


Ferris Wheel

1. hide the recursion with iterators

Our algorithm above interleaves the mechanics of visiting every node in a tree with the particulars of what we want to do with leaf nodes. Let’s say that we want to be able to express that countLeaves is all about counting the number of nodes that are instances of Leaf, but we don’t want to clutter that up with recursion.

The easiest way to do that is to keep using recursion, but separate the concern of how to visit all the nodes in a tree from the concern of what we want to do with them. In JavaScript, we can make our trees into Iterables:

class Tree {
  constructor(...children) {
    this.children = children;
  }

  *[Symbol.iterator]() {
    for (const child of this.children) {
      yield child;
      if (child instanceof Tree) {
        yield* child;
      }
    }
  }
}

function countLeaves(tree) {
  let runningTotal = 0;

  for (const child of tree) {
    if (child instanceof Leaf) {
      ++runningTotal;
    }
  }

  return runningTotal
}

Notice now that countLeaves is iterative. The recursion has been pushed into Tree’s iterator. It’s still recursive in the whole, but the code that knows how to count leaves is certainly iterative.

Furthermore, this lets us mix-and-match different collection types. Here’s a basket of yard waste:

class Twig {}

const yardWaste = new Set([
  new Leaf(),
  new Leaf(),
  new Leaf(),
  new Twig(),
  new Twig()
])

countLeaves(yardWaste)
  //=> 3

And of course, we can apply functions over iterables, for example:

function count (iterable) {
  let runningTotal = 0;

  for (const _ of iterable) {
    ++runningTotal;
  }

  return runningTotal;
}

function * filter (predicateFunction, iterable) {
  for (const element of iterable) {
    if (predicateFunction(element)) {
      yield element;
    }
  }
}

function countLeaves (iterable) {
  return count(
    filter(
      (element) => element instanceof Leaf,
      iterable
    )
  );
}

This is great because our functions over iterables apply to a wide class of problems, not just recursive problems. So the takeaway is, one technique for turning recursive algorithms into iterative algorithms is to see whether we can recursively iterate. If so, separate the recursive iteration from the rest of the algorithm by writing an iterator.
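As a sketch of that reuse, the same count and filter helpers can answer other questions about a tree without any new traversal code. The isA helper and the shrub example are hypothetical, and the classes are repeated so that the example stands alone:

```javascript
class Leaf {}
class Twig {}

class Tree {
  constructor (...children) {
    this.children = children;
  }

  // Depth-first traversal: the only recursive code in this sketch.
  *[Symbol.iterator] () {
    for (const child of this.children) {
      yield child;
      if (child instanceof Tree) {
        yield* child;
      }
    }
  }
}

function count (iterable) {
  let runningTotal = 0;
  for (const _ of iterable) {
    ++runningTotal;
  }
  return runningTotal;
}

function * filter (predicateFunction, iterable) {
  for (const element of iterable) {
    if (predicateFunction(element)) {
      yield element;
    }
  }
}

// Hypothetical helper: builds an "is an instance of" predicate.
const isA = clazz => element => element instanceof clazz;

const shrub = new Tree(
  new Tree(new Leaf(), new Twig()),
  new Leaf()
);

count(filter(isA(Leaf), shrub));
  //=> 2
count(filter(isA(Twig), shrub));
  //=> 1
count(filter(isA(Tree), shrub));
  //=> 1
```

The last count is 1 rather than 2 because the iterator yields a tree's descendants, not the tree itself.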


Acid Tower 9

2. abstract the recursion with higher-order recursive functions

Divide-and-conquer comes up a lot, but it’s not always directly transferrable to iteration. For example, we might want to recursively rotate a square. If we want to separate the mechanics of recursion from the “business logic” of rotating a square, we could move some of the logic into a higher-order function, multirec.

multirec is a template function that implements n-ary recursion:1

function mapWith (fn) {
  return function * (iterable) {
    for (const element of iterable) {
      yield fn(element);
    }
  };
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

To use multirec, we plug four functions into its template:

  1. An indivisible predicate function. It should report whether the problem we’re solving is too small to be divided. It’s simplicity itself for our counting leaves problem: node => node instanceof Leaf.
  2. A value function that determines what to do with a value that is indivisible. For counting, we return 1: leaf => 1
  3. A divide function that breaks a divisible problem into smaller pieces. tree => tree.children
  4. A combine function that puts the result of dividing a problem up, back together. counts => [...counts].reduce((acc, c) => acc + c, 0)

Here we go:

const countLeaves = multirec({
  indivisible: node => node instanceof Leaf,
  value: leaf => 1,
  divide: tree => tree.children,
  combine: counts => [...counts].reduce((acc, c) => acc + c, 0)
})

Now, this does separate the implementation of a divide-and-conquer recursive algorithm from what we want to accomplish, but it’s still obvious that we’re doing a divide and conquer algorithm. Balanced against that, multirec can do a lot more than we can accomplish with a recursive iterator, like rotating squares, implementing HashLife, or even finding a solution to the Towers of Hanoi.
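As a small illustration of plugging different functions into the same template, here is a sketch that computes the depth of a tree instead of counting its leaves. The depth function and the lopsided example are hypothetical, and the recursive multirec and the classes are repeated so that the example stands alone:

```javascript
class Leaf {}

class Tree {
  constructor (...children) {
    this.children = children;
  }
}

function mapWith (fn) {
  return function * (iterable) {
    for (const element of iterable) {
      yield fn(element);
    }
  };
}

function multirec ({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  };
}

// A leaf has depth zero. A tree is one level deeper than its deepest child.
const depth = multirec({
  indivisible: node => node instanceof Leaf,
  value: leaf => 0,
  divide: tree => tree.children,
  combine: depths => 1 + Math.max(0, ...depths)
});

const lopsided = new Tree(
  new Leaf(),
  new Tree(
    new Tree(new Leaf())
  )
);

depth(lopsided);
  //=> 3
depth(new Tree(new Leaf()));
  //=> 1
```

Only value and combine changed. The indivisible and divide functions are the same ones used for counting leaves.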


Towers of Hanoi


Speaking of the Towers of Hanoi… Not all recursive algorithms map neatly to recursive data structures. The recursive solution to the Towers of Hanoi is a good example. We’ll use multirec, demonstrating that we can separate the mechanics of recursion from our code for anything recursive, not just algorithms that work with trees. Let’s start with the function signature:

function hanoi (params) { // params = {disks, from, to, spare}
  // ...
}

Our four elements will be:

  1. An indivisible predicate function. In our case, ({disks}) => disks == 1
  2. A value function that determines what to do with a value that is indivisible: ({from, to}) => [from + " -> " + to]
  3. A divide function that breaks a divisible problem into smaller pieces. ({disks, from, to, spare}) => [{disks: disks - 1, from, to: spare, spare: to}, {disks: 1, from, to, spare}, {disks: disks - 1, from: spare, to, spare: from}]
  4. A combine function that puts the result of dividing a problem up, back together. moves => [...moves].reduce((acc, move) => acc.concat(move), [])

Like so:

const hanoi = multirec({
  indivisible: ({disks}) => disks == 1,
  value: ({from, to}) => [from + " -> " + to],
  divide: ({disks, from, to, spare}) => [
    {disks: disks - 1, from, to: spare, spare: to},
    {disks: 1, from, to, spare},
    {disks: disks - 1, from: spare, to, spare: from}
  ],
  combine: moves => [...moves].reduce(
                      (acc, move) => acc.concat(move), [])
});

hanoi({disks: 3, from: 1, to: 3, spare: 2})
  //=>
    ["1 -> 3", "1 -> 2", "3 -> 2",
     "1 -> 3", "2 -> 1", "2 -> 3",
     "1 -> 3"]

Tails

3. convert recursion to iteration with tail calls

Some recursive algorithms are much simpler than traversing a tree or generating solutions to the Towers of Hanoi. For example, the algorithm for computing Fibonacci numbers.

No, not that algorithm, the one we are thinking of involves matrix exponentiation, and you can read all about it here. In the middle of that algorithm, we have the need to multiply matrices by each other. We’ll repeat the same logic here, only using integers so that we can focus on the recursion.

Let’s start with a generic function for multiplying one or more numbers:

const multiply = (...numbers) => numbers.reduce((x, y) => x * y);

multiply(1, 2, 3, 4, 5)
  //=> 120

If we want to raise a number to a power, the naïve algorithm is to multiply it by itself repeatedly. We’ll use unnecessarily clever code to implement the trick, like this:

const multiply = (...numbers) => numbers.reduce((x, y) => x * y);
const repeat = (times, value) => new Array(times).fill(value);
const naivePower = (exponent, number) =>
                     multiply(...repeat(exponent, number));

naivePower(3, 2)
  //=> 8

Besides the obvious option of dropping down into the language’s library routines, there is a valuable optimization available. Exponentiation should not require O(n) multiplications, it should require O(log n). We get there with recursion:

const power = (exponent, number) => {
  if (exponent === 0) {
    return 1;
  } else {
    const halfExponent = Math.floor(exponent / 2);
    const extraMultiplier = exponent %2 === 0 ? 1 : number;

    const halves = power(halfExponent, number);

    return halves * halves * extraMultiplier;
  }
}

power(16, 2)
  //=> 65536

Instead of performing 15 multiplications, the recursive algorithm performs just four squarings of intermediate results (its other multiplications all involve 1). That saves a lot of stack, if that’s our concern with recursion. But there is an interesting opportunity here. The stack is needed because after power calls itself, it does a bunch more work before returning a result.

If we can find a way for power to avoid doing anything except returning the result of calling itself, we have a couple of optimizations available to ourselves. So let’s arrange things such that when power calls itself, it returns the result right away. The “one weird trick” is to supply an extra parameter, so that the work gets done eventually, but not after power calls itself:2

const power = (exponent, number, acc = 1) => {
  if (exponent === 0) {
    return acc;
  } else {
    const halfExponent = Math.floor(exponent / 2);
    const extraMultiplier = exponent %2 === 0 ? 1 : number;

    return power(
      halfExponent,
      number * number,
      acc * extraMultiplier
    );
  }
}

Our power function’s recursion is now a tail call, meaning that when it calls itself, it returns that result right away; it doesn’t do anything else with it. Because it doesn’t do anything with the result, behind the scenes the language doesn’t have to store a bunch of information in a stack frame. In essence, it can treat the recursive call like a GOTO instead of a CALL.

Many functional languages optimize this case. While our code may look like recursion, in such a language, our implementation of power would execute in constant stack space, just like a loop.

Alas, tail calls in JavaScript have turned out to be a contentious issue. At the time of this writing, Safari is the only browser implementing tail call optimization. So, while this is a valid optimization in some languages, and might be useful for a Safari-only application in JavaScript, it would be nice if there was something useful we could do now that we’ve done all the work of transforming power into tail-recursive form.

Like… Convert it to a loop ourselves.


foffa03

4. convert tail-recursive functions into loops

Here’s power again:

const power = (exponent, number, acc = 1) => {
  if (exponent === 0) {
    return acc;
  } else {
    const halfExponent = Math.floor(exponent / 2);
    const extraMultiplier = exponent %2 === 0 ? 1 : number;

    return power(
      halfExponent,
      number * number,
      acc * extraMultiplier
    );
  }
}

If we don’t want to leave the tail call optimization up to the compiler, with a tail-recursive function there’s a simple transformation we can perform: We wrap the whole thing in a loop, we reassign the parameters rather than passing parameters, and we use continue instead of invoking ourselves.

Like this:

const power = (exponent, number, acc = 1) => {
  while (true) {
    if (exponent === 0) {
      return acc;
    } else {
      const halfExponent = Math.floor(exponent / 2);
      const extraMultiplier = exponent %2 === 0 ? 1 : number;

      [exponent, number, acc] =
        [halfExponent, number * number, acc * extraMultiplier];
      continue;
    }
  }
}

The continue is superfluous, but when converting other functions it becomes essential. With a bit of cleaning up, we get:

const power = (exponent, number, acc = 1) => {
  while (exponent > 0) {
    const halfExponent = Math.floor(exponent / 2);
    const extraMultiplier = exponent %2 === 0 ? 1 : number;

    [exponent, number, acc] =
      [halfExponent, number * number, acc * extraMultiplier];
  }

  return acc;
}

So we know that for a certain class of recursive function, we can convert it to iteration in two steps:

  1. Convert it to tail-recursive form.
  2. Convert the tail-recursive function into a loop that rebinds its parameters.

And presto, we get an algorithm that executes in constant stack space! Or do we?

Unfortunately, it is not always possible to execute a recursive function in constant space. power worked, because it is an example of linear recursion: At each step of the way, we execute on one piece of the bigger problem and then combine the result in a simple way with the rest of the problem.

Linearly recursive algorithms are often associated with lists. Every fold, unfold, filter and other algorithm can be expressed as linear recursion, and it can also be expressed as an iteration in constant space. But not all recursive algorithms are linear. Some are “n-ary,” where n > 1. In other words, at each step of the way they must call themselves more than once before combining the results.

Such algorithms can be transformed into loops, but the mechanism by which we store data to be worked on later will necessarily grow in some relation to the size of the problem.
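For the linear case, the two-step conversion can be sketched on a fold over a list. The sumList family of names is hypothetical, and the destructuring is chosen for clarity rather than speed:

```javascript
// Linear recursion: one recursive call, then a little work (the addition).
function sumList (list) {
  if (list.length === 0) {
    return 0;
  } else {
    const [first, ...rest] = list;
    return first + sumList(rest);
  }
}

// Step 1: tail-recursive form, with the pending additions folded into acc.
function sumListTailRecursive (list, acc = 0) {
  if (list.length === 0) {
    return acc;
  } else {
    const [first, ...rest] = list;
    return sumListTailRecursive(rest, acc + first);
  }
}

// Step 2: a loop that rebinds its parameters instead of calling itself.
function sumListLoop (list, acc = 0) {
  while (list.length > 0) {
    const [first, ...rest] = list;
    [list, acc] = [rest, acc + first];
  }
  return acc;
}

sumListLoop([1, 2, 3, 4]);
  //=> 10
```

The loop uses constant stack space. Copying rest on every step still allocates on the heap, so a production version would more likely walk the list with an index.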


Stacked

5. implement multirec with our own stack

It is possible to convert n-ary recursive algorithms to iterative versions, but not in constant space. What we can do, however, is move the need for “stacking” our intermediate results and work to be done out of the system stack and into our own explicit stack. Let’s give it a try.

We could look at directly implementing something like Towers of Hanoi with a stack, but let’s maintain our separation of concerns. Instead of implementing Towers of Hanoi directly, we’ll implement multirec with a stack, and then trust that our existing Towers of Hanoi implementation will work without any changes.

That particular choice gives us the power of being able to express Towers of Hanoi as a divide-and-conquer algorithm, while implementing it using a stack on the heap. That’s terrific if our only objection to multirec is the possibility of a stack overflow.3

Our new multirec has its own stack, and is clearly iterative. It’s one big loop:

function multirec({ indivisible, value, divide, combine }) {
  return function (input) {
    const stack = [];

    if (indivisible(input)) {
      // the degenerate case
      return value(input);
    } else {
      // the iterative case
      const parts = divide(input);
      const solutions = [];
      stack.push({parts, solutions});
    }

    while (true) {
      const {parts, solutions} = stack[stack.length - 1];

      if (parts.length > 0) {
        const subproblem = parts.pop()

        if (indivisible(subproblem)) {
          solutions.unshift(value(subproblem));
        } else {
          // going deeper
          stack.push({
            parts: divide(subproblem),
            solutions: []
          })
        }
      } else {
        stack.pop(); // done with this frame

        const solution = combine(solutions);

        if (stack.length === 0) {
          return solution;
        } else {
          stack[stack.length - 1].solutions.unshift(solution);
        }
      }
    }
  }
}

With this version of multirec, our Towers of Hanoi and countLeaves algorithms both have switched from being recursive to being iterative with their own stack. That’s the power of separating the specification of the divide-and-conquer algorithm from its implementation.


Pigeons

6. implementing depth-first iterators with our own stack

We can use a similar approach for iteration. Recall our Tree class:

class Tree {
  constructor(...children) {
    this.children = children;
  }

  *[Symbol.iterator]() {
    for (const child of this.children) {
      yield child;
      if (child instanceof Tree) {
        yield* child;
      }
    }
  }
}

Writing an iterator for a recursive data structure was useful for hiding recursion in the implementation, and it was also useful because we have many patterns, library functions, and language features (like for..of) that operate on iterables. However, the recursive implementation uses the system’s stack.

What about using our own stack, as we did with multirec? That would produce a nominally iterative solution:

class Tree {
  constructor(...children) {
    this.children = children;
  }

  *[Symbol.iterator]() {
    let stack = [...this.children].reverse();

    while (stack.length > 0) {
      const child = stack.pop();

      yield child;

      if (child instanceof Tree) {
        stack = stack.concat([...child.children].reverse());
      }
    }
  }
}

And once again, we have no need to change any function relying on Tree being iterable to get a solution that does not consume the system stack.
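As a rough check of that claim, the sketch below repeats the stack-based Tree from above and walks a deliberately lopsided tree. The depth of 100,000 and the names deep and visited are arbitrary choices for this illustration.

```javascript
// The stack-based Tree, repeated so that this example stands alone.
class Tree {
  constructor(...children) {
    this.children = children;
  }

  *[Symbol.iterator]() {
    let stack = [...this.children].reverse();

    while (stack.length > 0) {
      const child = stack.pop();

      yield child;

      if (child instanceof Tree) {
        stack = stack.concat([...child.children].reverse());
      }
    }
  }
}

// Build a tree that is 100,000 levels deep, with a single leaf at the bottom.
let deep = new Tree('leaf');
for (let i = 0; i < 100000; i++) {
  deep = new Tree(deep);
}

// An ordinary for..of loop, unaware of how the iterator is implemented.
let visited = 0;
for (const node of deep) {
  visited += 1;
}
// visited is 100001: the 100,000 nested trees, plus the one leaf
```

The depth of the tree shows up as the length of our own stack array, which lives on the heap, rather than as nested calls.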


Six in a row

7. implementing breadth-first iterators with a queue

Our stack-based iterator performs a depth-first search:

Depth-first search

By Alexander Drichel - Own work, CC BY-SA 3.0

But for certain algorithms, we want to perform a breadth-first search:

Breadth-first search

By Alexander Drichel - Own work, CC BY 3.0

We can accomplish this using a queue instead of a stack:

class Tree {
  constructor(...children) {
    this.children = children;
  }

  *[Symbol.iterator]() {
    let queue = [...this.children];

    while (queue.length > 0) {
      const child = queue.shift();

      yield child;

      if (child instanceof Tree) {
        queue = queue.concat([...child.children]);
      }
    }
  }
}

And once again, we have no need to change any function relying on Tree being iterable: this time we get a breadth-first traversal that does not consume the system stack.
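To see the two orders side by side, here is a small sketch that pulls both traversals out into free-standing generator functions. The names depthFirst, breadthFirst, and leavesOnly are assumptions made for this comparison; the loops are the same ones used in the two Tree classes above.

```javascript
class Tree {
  constructor(...children) {
    this.children = children;
  }
}

// Same loop as the stack-based iterator: last in, first out.
function* depthFirst (tree) {
  let stack = [...tree.children].reverse();

  while (stack.length > 0) {
    const child = stack.pop();

    yield child;

    if (child instanceof Tree) {
      stack = stack.concat([...child.children].reverse());
    }
  }
}

// Same loop as the queue-based iterator: first in, first out.
function* breadthFirst (tree) {
  let queue = [...tree.children];

  while (queue.length > 0) {
    const child = queue.shift();

    yield child;

    if (child instanceof Tree) {
      queue = queue.concat(child.children);
    }
  }
}

const tree = new Tree(1, new Tree(2, 3), 4);

// Both generators also yield the subtrees themselves; keep only the leaves.
const leavesOnly = iterable => [...iterable].filter(node => !(node instanceof Tree));

const depthFirstLeaves = leavesOnly(depthFirst(tree));
  //=> [1, 2, 3, 4]
const breadthFirstLeaves = leavesOnly(breadthFirst(tree));
  //=> [1, 4, 2, 3]
```

The only difference between the two functions is which end of the array we take the next item from.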


Wrapped

wrap-up

We’ve just seen seven different ways to get recursion out of our functions. The first two (“hide the recursion with iterators” and “abstract the recursion with higher-order recursive functions”) didn’t eliminate recursion altogether, but moved the recursion out of our code.

“Convert recursion to iteration with tail calls” arranged our code such that a suitable programming language implementation would convert our recursive code into iteration automatically. “Convert tail-recursive functions into loops” showed how to do this manually for the case where the language wouldn’t do it for us, or if we’re allergic to recursion for other reasons.

The last three (“implement multirec with our own stack,” “implementing depth-first iterators with our own stack,” and “implementing breadth-first iterators with a queue”) showed how to manage our own storage when working with n-ary recursive algorithms. And we saw that by converting our iterators and multirec to iteration, we’d get iteration “for free” for all our other code, thanks to the refactoring in the first two approaches.

So now, if we’re ever in an interview and our interlocutor asks, “Can you convert this algorithm to use iteration,” we can reply, “Sure! But there are at least seven different ways to do that, depending upon what we want to accomplish…”4

(Discuss on reddit and hacker news. If you like this kind of thing, JavaScript Allongé is exactly the kind of thing you’ll like.)


notes
  1. There is more about multirec, linrec, and another function, binrec, in From Higher-Order Functions to Libraries And Frameworks

  2. Refactoring recursive functions into “tail recursive functions” has been practiced from the early days of Lisp, and there are a number of practical techniques we can learn to apply. An excellent guide that touches on both tail-recursive refactoring and on converting recursion to iteration is Tom Moertel’s Tricks of the trade: Recursion to Iteration series. 

  3. Many, many years ago, I wanted to implement a Towers of Hanoi solver in BASIC. The implementation I was using allowed calling subroutines, however subroutines were non-reëntrant. So it was impossible for a subroutine to invoke itself. I wound up implementing a stack of my own in an array, and that, as they say, was that. (You read this note correctly: Forty years ago, non-reëntrant subroutines were a thing in widely available implementations.) 

  4. Actually there are at least two more, but this blog post is already long enough. I’ve written elsewhere about using trampolines to implement tail-call optimization in JavaScript, and then there is the deeply fascinating subject of conversion to continuation-passing style.

https://raganwald.com/2018/05/20/we-dont-need-no-stinking-recursion
More State Machine ❤️: From Reflection to Statecharts

In “How I Learned to Stop Worrying and ❤️ the State Machine,” we built an extremely basic state machine to model a bank account.

State machines, as we discussed, are a very useful tool for organizing the behaviour of domain models, representations of meaningful real-world concepts pertinent to a sphere of knowledge, influence or activity (the “domain”) that need to be modelled in software.

A state machine is an object, but it has a distinctive “behaviour.” It is always in exactly one of a finite number of states, and its behaviour is determined by its state, right down to what methods it has in each state and which other states a method may or may not transition the state machine to.

This is interesting! So interesting, that we are going to spend a few minutes looking strictly at state machine behaviour, and more specifically, at the interface a state machine has with the entities that use it.

Let’s get started.


Reflections

our bank account state machine

Here’s a small variation on the “bank account” code we wrote:12

// The naïve state machine extracted from https://raganwald.com/2018/02/23/forde.html
// Modified to use weak maps for "private" state

const STATES = Symbol("states");
const STARTING_STATE = Symbol("starting-state");
const RESERVED = [STARTING_STATE, STATES];

const MACHINES_TO_CURRENT_STATE_NAMES = new WeakMap();
const MACHINES_TO_STARTING_STATES = new WeakMap();
const MACHINES_TO_NAMES_TO_STATES = new WeakMap();

function getStateName (machine) {
  return MACHINES_TO_CURRENT_STATE_NAMES.get(machine);
}

function getState (machine) {
  return MACHINES_TO_NAMES_TO_STATES.get(machine)[getStateName(machine)];
}

function setState (machine, stateName) {
  MACHINES_TO_CURRENT_STATE_NAMES.set(machine, stateName);
}

function transitionsTo (stateName, fn) {
  return function (...args) {
    const returnValue = fn.apply(this, args);
    setState(this, stateName);
    return returnValue;
  };
}

function BasicStateMachine (description) {
  const machine = {};

  // Handle all the initial states and/or methods
  const propertiesAndMethods = Object.keys(description).filter(property => !RESERVED.includes(property));
  for (const property of propertiesAndMethods) {
    machine[property] = description[property];
  }

  // now its states
  MACHINES_TO_NAMES_TO_STATES.set(machine, description[STATES]);

  // what event handlers does it have?
  const eventNames = Object.entries(MACHINES_TO_NAMES_TO_STATES.get(machine)).reduce(
    (eventNames, [state, stateDescription]) => {
      const eventNamesForThisState = Object.keys(stateDescription);

      for (const eventName of eventNamesForThisState) {
        eventNames.add(eventName);
      }
      return eventNames;
      },
    new Set()
  );

  // define the delegating methods
  for (const eventName of eventNames) {
    machine[eventName] = function (...args) {
      const handler = getState(machine)[eventName];
      if (typeof handler === 'function') {
        return getState(machine)[eventName].apply(this, args);
      } else {
        throw `invalid event ${eventName}`;
      }
    }
  }

  // set the starting state
  MACHINES_TO_STARTING_STATES.set(machine, description[STARTING_STATE]);
  setState(machine, MACHINES_TO_STARTING_STATES.get(machine));

  // we're done
  return machine;
}

const account = BasicStateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },
      withdraw (amount) { this.balance = this.balance - amount; },
      availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; },
      placeHold: transitionsTo('held', () => undefined),
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    held: {
      removeHold: transitionsTo('open', () => undefined),
      deposit (amount) { this.balance = this.balance + amount; },
      availableToWithdraw () { return 0; },
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    closed: {
      reopen: transitionsTo('open', function () {
        // ...restore balance if applicable
      })
    }
  }
});

(code)

It’s simple, and it seems to work. What’s the problem?


Evidence

reflecting on state machines

In computer science, reflection is the ability of a computer program to examine, introspect, and modify its own structure and behaviour at runtime.–Wikipedia

Remember we talked about objects exposing their interface? In JavaScript, we can use code to examine the methods a bank account responds to:

function methodsOf (obj) {
  const list = [];

  for (const key in obj) {
    if (typeof obj[key] === 'function') {
      list.push(key);
    }
  }
  return list;
}

methodsOf(account)
  //=> deposit, withdraw, availableToWithdraw, placeHold, close, removeHold, reopen

This is technically correct, because we wrote methods that delegated all of these “events” to the current state. But this is semantically wrong, because the whole idea behind a state machine is that the methods it responds to vary according to what state it is in.

For example, when an account is created, it is in the ‘open’ state, and removeHold and reopen are both invalid methods. Our interface is lying to the outside world about what methods the object truly supports. This is an artefact of our design: We chose to implement a method for every event, and then throw an “invalid event” error if an object in a particular state is not supposed to respond to that event.

The ideal would be for it not to have these methods at all, so that the standard way that we use our programming language to determine whether an object responds to a method–testing for a member that is a function–just works.

One way to go about this is to replace all the delegation methods with prototype mongling. First, the new setState method:3

function setState (machine, stateName) {
  MACHINES_TO_CURRENT_STATE_NAMES.set(machine, stateName);
  Object.setPrototypeOf(machine, getState(machine));
}

Now we can remove all the code that writes the delegation methods:

function ReflectiveStateMachine (description) {
  const machine = {};

  // Handle all the initial states and/or methods
  const propertiesAndMethods = Object.keys(description).filter(property => !RESERVED.includes(property));
  for (const property of propertiesAndMethods) {
    machine[property] = description[property];
  }

  // now its states
  MACHINES_TO_NAMES_TO_STATES.set(machine, description[STATES]);

  // set the starting state
  MACHINES_TO_STARTING_STATES.set(machine, description[STARTING_STATE]);
  setState(machine, MACHINES_TO_STARTING_STATES.get(machine));

  // we're done
  return machine;
}

How well does it work? Let’s try it with account = ReflectiveStateMachine({ ... }) more-or-less as before:

methodsOf(account)
  //=> deposit, withdraw, availableToWithdraw, placeHold, close

account.placeHold()
methodsOf(account)
  //=> removeHold, deposit, availableToWithdraw, close

(code)
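The effect of swapping prototypes can be seen in isolation, without any of the state machine plumbing. In this minimal sketch, openState, heldState, and machine are stand-in objects invented for the illustration; only methodsOf and the call to Object.setPrototypeOf mirror the code above.

```javascript
function methodsOf (obj) {
  const list = [];

  // for..in walks the object's own enumerable properties,
  // and then those it inherits from its prototype chain
  for (const key in obj) {
    if (typeof obj[key] === 'function') {
      list.push(key);
    }
  }
  return list;
}

// Two hypothetical state objects, each holding only the methods valid in that state.
const openState = { deposit () {}, placeHold () {} };
const heldState = { deposit () {}, removeHold () {} };

const machine = { balance: 0 };

Object.setPrototypeOf(machine, openState);
const whenOpen = methodsOf(machine);
  //=> ['deposit', 'placeHold']

Object.setPrototypeOf(machine, heldState);
const whenHeld = methodsOf(machine);
  //=> ['deposit', 'removeHold']
```

The machine object itself never gains or loses a property. What changes is the object it inherits from, and that is enough for the usual “is there a function with this name?” test to give the right answer.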

Now we have a state machine that correctly exposes the shallowest part of its interface, the methods that it responds to at any one time. What else might other code be interested in? And why?


Valves

descriptions and diagrams for code

Two things have been proven to be consistently true since the dawn of human engineering:

  1. Using a diagram, schematic, blueprint, or other symbolic representation of work to be done helps us plan our work, do our work, verify that our work is correctly done, and understand our work.
  2. Diagrams, schematics, blueprints, and other symbolic representations of work invariably drift from the work over time, until their inaccuracies present more harm than good.

This is especially true of programming, where change happens rapidly and “documentation” lags woefully behind. In early days, researchers toyed with various ways of making executable diagrams for programs: Humans would draw a diagram that communicated the program’s behaviour, and the computer would interpret it directly.

Another approach has been to dynamically generate diagrams and comments of one form or another. Many modern programming frameworks can generate documentation from the source code itself, sometimes using special annotations as a kind of markup. The value of this approach is that when the code changes, so does the documentation.

Can we generate state transition diagrams from our source code?

Well, we’re not going to write an entire graphics generation engine, although that would be a pleasant diversion. But what we will do is generate a kind of program that another engine can consume to produce our documentation. The diagrams in How I Learned to Stop Worrying and ❤️ the State Machine were generated with Graphviz, free software that generates graphs specified with the DOT graph description language.

The dot file to generate the transition graph for our bank account looks like this:

digraph Account {

  start [label="", fixedsize="false", width=0, height=0, shape=none];
  start -> open [color=darkslategrey];

  open;

  open -> open [color=blue, label="deposit, withdraw, availableToWithdraw"];
  open -> held [color=blue, label="placeHold"];
  open -> closed [color=blue, label="close"];

  held;

  held -> held [color=blue, label="deposit, availableToWithdraw"];
  held -> open [color=blue, label="removeHold"];
  held -> closed [color=blue, label="close"];

  closed;

  closed -> open [color=blue, label="reopen"];
}

We could generate this DOT file if we have a list of states, events, and the states those events transition to. Getting a list of states and events is easy. But what we don’t have is the starting state, nor do we have the states these methods (a/k/a “events”) transition to.
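For the easy half, a short sketch shows how the states and events fall out of a description. The sampleStates object below is a cut-down stand-in for the [STATES] portion of a description, and stateNames and eventNames are names chosen for this example.

```javascript
// A stand-in for description[STATES]: state names map to objects of event handlers.
const sampleStates = {
  open: { deposit () {}, withdraw () {}, placeHold () {}, close () {} },
  held: { deposit () {}, removeHold () {}, close () {} },
  closed: { reopen () {} }
};

// The states are simply the keys of the description...
const stateNames = Object.keys(sampleStates);
  //=> ['open', 'held', 'closed']

// ...and the events are the keys of each state, with duplicates removed.
const eventNames = [
  ...new Set(stateNames.flatMap(name => Object.keys(sampleStates[name])))
];
  //=> ['deposit', 'withdraw', 'placeHold', 'close', 'removeHold', 'reopen']
```

Nothing in these keys tells us where placeHold or close will take the machine, which is the harder half of the problem.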

We could easily store the starting state for each state machine in a weak map. But what to do about the transitions? This is a deep problem. Throughout our programming explorations, we have repeatedly feasted on JavaScript’s ability for functions to consume other functions as arguments and return new functions. Using this, we have written many different kinds of decorators, including transitionsTo.

The beauty of functions returning functions is that closures form a hard encapsulation: The closure wrapping a function is available only to functions created within its scope, not to any other scope. The drawback is that when we want to do some inspection, we cannot pierce the closure. We simply cannot tell from the function that transitionsTo returns what state it will transition to.

We have a few options. One is to use a different form of description that encodes the destination states without a transitionsTo function.
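For comparison, here is a sketch of one of the other options, which is not the route this post takes: keep the decorator, but have it note the destination somewhere inspectable. Everything here is hypothetical, including the names DESTINATIONS, inspectableTransitionsTo, and destinationOf, and the currentState property that stands in for a call to setState.

```javascript
// A hypothetical registry mapping decorated functions to their destination states.
const DESTINATIONS = new WeakMap();

function inspectableTransitionsTo (stateName, fn) {
  const decorated = function (...args) {
    const returnValue = fn.apply(this, args);
    this.currentState = stateName; // stand-in for setState(this, stateName)
    return returnValue;
  };

  // The closure still hides stateName, so we record it on the side.
  DESTINATIONS.set(decorated, stateName);

  return decorated;
}

function destinationOf (fn) {
  return DESTINATIONS.get(fn);
}

const placeHold = inspectableTransitionsTo('held', () => undefined);

const whereTo = destinationOf(placeHold);
  //=> 'held'
const unknown = destinationOf(() => undefined);
  //=> undefined, because this function was never decorated
```

This keeps the existing notation, at the cost of a second source of truth that has to stay in step with what the decorated function really does.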


Bank of Montréal, in Toronto

a new-old kind of notation for bank accounts

When we first formulated a notation for state machines, we considered a more declarative format that encoded states and transitions using nested objects. It looked a little like this:

const TRANSITIONS = Symbol("transitions");
const STARTING_STATE = Symbol("starting-state");

const account = TransitionOrientedStateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [TRANSITIONS]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },
      withdraw (amount) { this.balance = this.balance - amount; },
      availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; },

      held: {
        placeHold () {}
      },

      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      }
    },

    held: {
      deposit (amount) { this.balance = this.balance + amount; },
      availableToWithdraw () { return 0; },

      open: {
        removeHold () {}
      },

      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      }
    },

    closed: {
      open: {
        reopen () {
          // ...restore balance if applicable
        }
      }
    }
  }
});

This is a touch more verbose, but we can write a StateMachine to do all the interpretation work. It will keep the description but translate the methods to use transitionsTo for us:

const MACHINES_TO_TRANSITIONS = new WeakMap();

function TransitionOrientedStateMachine (description) {
  const machine = {};

  // Handle all the initial states and/or methods
  const propertiesAndMethods = Object.keys(description).filter(property => !RESERVED.includes(property));
  for (const property of propertiesAndMethods) {
    machine[property] = description[property];
  }

  // set the transitions for later reflection
  MACHINES_TO_TRANSITIONS.set(machine, description[TRANSITIONS]);

  // create its top-level state prototypes
  MACHINES_TO_NAMES_TO_STATES.set(machine, Object.create(null));

  for (const state of Object.keys(MACHINES_TO_TRANSITIONS.get(machine))) {
    const stateObject = Object.create(null);
    const stateDescription = MACHINES_TO_TRANSITIONS.get(machine)[state];
    const nonTransitioningMethods = Object.keys(stateDescription).filter(name => typeof stateDescription[name] === 'function');
    const destinationStates = Object.keys(stateDescription).filter(name => typeof stateDescription[name] !== 'function');

    for (const nonTransitioningMethodName of nonTransitioningMethods) {
      const nonTransitioningMethod = stateDescription[nonTransitioningMethodName];

      stateObject[nonTransitioningMethodName] = nonTransitioningMethod;
    }

    for (const destinationState of destinationStates) {
      const destinationStateDescription = stateDescription[destinationState];
      const transitioningMethodNames = Object.keys(destinationStateDescription).filter(name => typeof destinationStateDescription[name] === 'function');

      for (const transitioningMethodName of transitioningMethodNames) {
        const transitioningMethod = destinationStateDescription[transitioningMethodName];

        stateObject[transitioningMethodName] = transitionsTo(destinationState, transitioningMethod);
      }
    }

    MACHINES_TO_NAMES_TO_STATES.get(machine)[state] = stateObject;
  }

  // set the starting state
  MACHINES_TO_STARTING_STATES.set(machine, description[STARTING_STATE]);
  setState(machine, MACHINES_TO_STARTING_STATES.get(machine));

  // we're done
  return machine;
}

And now we can write a getTransitions function that extracts the structure of the transitions:

function getTransitions (machine) {
  const description = { [STARTING_STATE]: MACHINES_TO_STARTING_STATES.get(machine) };
  const transitions = MACHINES_TO_TRANSITIONS.get(machine);

  for (const state of Object.keys(transitions)) {
    const stateDescription = transitions[state];

    description[state] = Object.create(null);
    const selfTransitions = [];

    for (const descriptionKey of Object.keys(stateDescription)) {
      const innerDescription = stateDescription[descriptionKey];

      if (typeof innerDescription === 'function' ) {
        selfTransitions.push(descriptionKey);
      } else {
        const destinationState = descriptionKey;
        const transitionEvents = Object.keys(innerDescription);

        description[state][destinationState] = transitionEvents;
      }
    }

    if (selfTransitions.length > 0) {
      description[state][state] = selfTransitions;
    }
  }

  return description;
}

getTransitions(account)
  //=> {
    open: {
      open: ["deposit", "withdraw", "availableToWithdraw"],
      held: ["placeHold"],
      closed: ["close"]
    },
    held: {
      open: ["removeHold"],
      held: ["deposit", "availableToWithdraw"],
      closed: ["close"]
    },
    closed: {
      open: ["reopen"]
    },
    Symbol(starting-state): "open"
  }

🎉‼️

With getTransitions in hand, we’re ready to generate a DOT file from the symbolic description:

function dot (machine, name) {
  const transitionsForMachine = getTransitions(machine);
  const startingState = transitionsForMachine[STARTING_STATE];
  const dot = [];

  dot.push(`digraph ${name} {`);
  dot.push('');
  dot.push('  start [label="", fixedsize="false", width=0, height=0, shape=none];');
  dot.push(`  start -> ${startingState} [color=darkslategrey];`);

  for (const state of Object.keys(transitionsForMachine)) {
    dot.push('');
    dot.push(`  ${state}`);
    dot.push('');

    const stateDescription = transitionsForMachine[state];

    for (const destinationState of Object.keys(stateDescription)) {
      const events = stateDescription[destinationState];

      dot.push(`  ${state} -> ${destinationState} [color=blue, label="${events.join(', ')}"];`);
    }
  }

  dot.push('}');

  return dot.join("\n");
}

dot(account, "Account")
  //=>
    digraph Account {

      start [label="", fixedsize="false", width=0, height=0, shape=none];
      start -> open [color=darkslategrey];

      open

      open -> open [color=blue, label="deposit, withdraw, availableToWithdraw"];
      open -> held [color=blue, label="placeHold"];
      open -> closed [color=blue, label="close"];

      held

      held -> open [color=blue, label="removeHold"];
      held -> held [color=blue, label="deposit, availableToWithdraw"];
      held -> closed [color=blue, label="close"];

      closed

      closed -> open [color=blue, label="reopen"];
    }

(code)

We can feed this .dot file to Graphviz, and it will produce the image we see right in this blog post, and that’s exactly how it was generated:

Bank account diagram

So. We now have a way of drawing state transition diagrams for state machines. Being able to extract the semantic structure of an object–like the state transitions for a state machine–is a useful kind of reflection, and one that exists at a higher semantic level than simply reporting on things like the methods an object responds to or the properties it has.


contract

should a state machine hide the fact that it’s a state machine?

People talk about information hiding as separating the responsibility for an object’s “implementation” from its “interface.” The question here is whether an object’s states, transitions between states, and methods in each state are part of its implementation, or whether they are part of its interface, its contracted behaviour.

One way to tell is to ask ourselves whether changing those things will break other code. If we can change the transitions of a state machine without changing any other code in an app, it’s fair to say that the transitions are “just implementation details,” just as changing a HashMap class to use a Cuckoo Hash instead of a Hash Table doesn’t require us to change any of the code that interacts with our HashMap.

So how about our bank accounts? Consider the following sequence:

account.availableToWithdraw()
  //=> 0

account.deposit(42);

account.availableToWithdraw()
  //=> ???

Does the second .availableToWithdraw() return forty-two? Or zero? That depends upon whether the account is open or held, and that isn’t a private secret that only the account knows about. We know this, because people openly talk about whether accounts are open, held, or closed. People build processes around the account’s state, and it’s certainly part of the account’s design that the .placeHold method transitions an account into held state.

From this, we get that accounts should certainly behave like state machines. And from that, it’s reasonable that other pieces of code ought to be able to dynamically inspect an account’s current state, as well as its complete graph of states and transitions. That’s as much a part of its public interface as any particular method.4

It follows that in addition to our ReflectiveStateMachine function, we ought to also make both getTransitions and getStateName “public” functions that every other entity can interact with. setState, on the other hand, is best left as a “private” function: To preserve the proper business logic, other entities should invoke methods that perform the appropriate transitions, not directly change state.

We can even create public functions like:

function isStateMachine (object) {
  return MACHINES_TO_NAMES_TO_STATES.has(object);
}

So, now we have an understanding of a state machine, the behaviour it contractually guarantees, and how it might be organized such that it can expose its contract dynamically to other code.

But speaking of behaviour…


ATMs in Thailand

it’s never as simple as it seems in a blog post

One of the reasons we don’t see as many explicit state machines “in the wild” as we’d expect is that there is a perception that state machines are great when we’ve performed an exhaustive “Big Design Up Front” analysis, and have perfectly identified all of the states and transitions.

Many programmers believe that in a more agile, incremental process, we’re constantly discovering requirements and methods. We may not know that we have a state machine until later in the process, and things being messy, a domain model may not fit neatly into the state machine model even if it appears to have states.

For example, instead of being told that a bank account might be open, held, or closed, we might be told something different:

  • A bank account is either open or closed;
  • An open bank account is either held or not held.

So now we are told that an account doesn’t have one single state, it has two flags: open/closed and held/not-held. That doesn’t fit the state machine model at all, and refactoring it into open/held/closed may not be appropriate. This is an extremely simple example, but impedance mismatches like this are common, and over time a model may accrete a half-dozen or more toggles and enumerations that represent little states within the domain model’s larger state.
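A few lines of code make the mismatch visible. This sketch enumerates every setting of the two flags and applies the rule implied by the bullets above, namely that only an open account can be held. The names isOpen, isHeld, and isValid are assumptions made for the example.

```javascript
// Every possible setting of the two flags.
const combinations = [];

for (const isOpen of [true, false]) {
  for (const isHeld of [true, false]) {
    combinations.push({ isOpen, isHeld });
  }
}

// Assumed business rule, read off the two bullets: a closed account is never held.
const isValid = ({ isOpen, isHeld }) => isOpen || !isHeld;

const valid = combinations.filter(isValid);
const invalid = combinations.filter(flags => !isValid(flags));

// valid has three entries: open and held, open and not held, closed and not held
// invalid has one entry: closed and held
```

Two independent booleans describe four situations, and it falls to every piece of code that touches them to remember that one of the four must never occur.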

This is maddening, because we know how to model open/closed as a state machine, and if we didn’t have to worry about closed accounts, we also know how to model held/not-held as a state machine.

What to do?


Fractal Fun

i heard you liked state machines, so…

If we look at a bank account as having two flags, and between their possible settings there are three valid states, we don’t see a state machine. We see a stateful object. But hidden in our implementation is a clue as to how we can solve this semantically.

As implemented, state machines are objects that delegate their behaviour to a state object. There’s one state object for each possible state.

The state object for closed state has just one method, reopen. Our problem with the state object for open state is that it has two different behaviours, depending on whether the account is held or not-held.

That sounds very familiar!

What we are describing is that a bank account has two states. One of them, closed, is a simple object. But the other, open, is itself a state machine! It has two states, held, and not-held.

If we allow state objects to be state machines, we can actually model our bank account exactly as our business users describe it.

But first, we’re going to need a bigger notation.5


Composer's score for Don Giovanni

a bigger notation for state machines

Let’s look at what we want to model. For starters, we want an account that has open and closed states, and we want to describe what is guaranteed to be the case for each state:

const account = HierarchalStateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [TRANSITIONS]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },

      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      }
    },
    closed: {
      open: {
        reopen () {
          // ...restore balance if applicable
        }
      }
    }
  }
});

And if we were thinking about a state machine nested within the open state, it would look like this:

HierarchalStateMachine({
  [STARTING_STATE]: 'not-held',
  [TRANSITIONS]: {
    ['not-held']: {
      withdraw (amount) { this.balance = this.balance - amount; },
      availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; },

      held: {
        placeHold () {}
      }
    },
    held: {
      availableToWithdraw () { return 0; },

      ['not-held']: {
        removeHold () {}
      }
    }
  }
});

It’s a state machine with two states. In not-held, it supports withdraw, availableToWithdraw, and placeHold. In held, it supports availableToWithdraw and removeHold. It does not need to support deposit, because our top-level state machine does that.

So now, we need a way to put the second notation inside the first notation, perhaps like this:

const INNER = Symbol("inner");

const account = HierarchalStateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [TRANSITIONS]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },

      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      },

      [INNER]: {
        [STARTING_STATE]: 'not-held',
        [TRANSITIONS]: {
          ['not-held']: {
            withdraw (amount) { this.balance = this.balance - amount; },
            availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; },

            held: {
              placeHold () {}
            }
          },
          held: {
            availableToWithdraw () { return 0; },

            ['not-held']: {
              removeHold () {}
            }
          }
        }
      }
    },

    closed: {
      open: {
        reopen () {
          // ...restore balance if applicable
        }
      }
    }
  }
});

Tree

using prototypical inheritance to model hierarchal state machines

Here’s what we want to happen: When an account enters open state, as per usual, its prototype will be set to its open state object. That will have a deposit and close method. But what will the open state object’s prototype be? Ah! That will be the inner state machine that has held and not-held states. And every time the object enters the open state, the inner state machine will enter the not-held state.

So the prototype chain will look something like this:

account -> open -> open.inner -> open.inner.not-held

If the account is held, it will change to:

account -> open -> open.inner -> open.inner.held

And if it is closed, it will change to:

account -> closed
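Here is a minimal sketch of those three chains, assuming plain objects stand in for the state objects our state machine would build. The names open, inner, notHeld, held, and closed are hypothetical stand-ins for illustration, not the implementation that follows. It shows how re-pointing one link with Object.setPrototypeOf changes which methods the account responds to:

```javascript
// Hypothetical stand-ins for the state objects; not the essay's implementation.
const notHeld = {
  withdraw (amount) { this.balance = this.balance - amount; },
  availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; }
};
const held = {
  availableToWithdraw () { return 0; }
};
const closed = {
  reopen () {}
};

// account -> open -> inner -> notHeld
const inner = Object.create(notHeld);
const open = Object.create(inner);
open.deposit = function (amount) { this.balance = this.balance + amount; };

const account = Object.create(open);
account.balance = 0;

account.deposit(100);
account.withdraw(30);
const availableWhileNotHeld = account.availableToWithdraw();

// Placing a hold re-points only the innermost link:
// account -> open -> inner -> held
Object.setPrototypeOf(inner, held);
const availableWhileHeld = account.availableToWithdraw();
const canWithdrawWhileHeld = typeof account.withdraw === 'function';
const canDepositWhileHeld = typeof account.deposit === 'function';

// Closing re-points the outermost link: account -> closed
Object.setPrototypeOf(account, closed);
const canDepositWhileClosed = typeof account.deposit === 'function';
```

Because every method is invoked on account, this always refers to the account itself, so balance lives in one place no matter which object in the chain supplies the method.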

The first thing we’re going to need is to slightly modify our state machine to take a prototype as an argument, and then make sure that every stateObject we create for it points to that prototype:

function HierarchalStateMachine (description, prototype = Object.prototype) {
  // ...

  for (const state of Object.keys(MACHINES_TO_TRANSITIONS.get(machine))) {
    const stateObject = Object.create(prototype);

    // ...
  }
  // ...
}

Armed with this, we can look for [INNER] in any state description. If we don’t find one, the state is going to be whatever we construct for our stateObject.

But if we do find an [INNER], we will recursively construct a state machine from its description, but have its prototype be the stateObject we have constructed. And our state is going to be the inner state machine:

function HierarchalStateMachine (description, prototype = Object.prototype) {

  // ...

  for (const state of Object.keys(MACHINES_TO_TRANSITIONS.get(machine))) {
    const stateObject = Object.create(prototype);

    // ...

    const innerStateMachineDescription = stateDescription[INNER];

    // ...

    if (innerStateMachineDescription == null) {
      MACHINES_TO_NAMES_TO_STATES.get(machine)[state] = stateObject;
    } else {
      const innerStateMachine = HierarchalStateMachine(
        innerStateMachineDescription,
        stateObject
      );

      MACHINES_TO_NAMES_TO_STATES.get(machine)[state] = innerStateMachine;
    }
  }

  // ...
}

One final thing. When we set the state for a state machine, we have to figure out whether the state machine can handle it, or whether it should delegate that to a nested state machine:

function setState (machine, stateName) {
  const currentState = getState(machine);

  if (hasState(machine, stateName)) {
    setDirectState(machine, stateName);
  } else if (isStateMachine(currentState)) {
    setState(currentState, stateName);
  } else {
    console.log(`illegal transition to ${stateName}`, machine);
  }
}

function setDirectState (machine, stateName) {
  MACHINES_TO_CURRENT_STATE_NAMES.set(machine, stateName);

  const newState = getState(machine);
  Object.setPrototypeOf(machine, newState);

  if (isStateMachine(newState)) {
    setState(newState, MACHINES_TO_STARTING_STATES.get(newState));
  }
}
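To watch the delegation work end to end, here is a self-contained sketch. The helpers getState, hasState, and isStateMachine are elided above, so the versions below are guesses at their behaviour, and register is a hypothetical helper that wires up two machines by hand instead of compiling a description. This sketch also throws on an illegal transition rather than logging it:

```javascript
// Hypothetical bookkeeping: one guess at the helpers the essay elides.
const MACHINES_TO_CURRENT_STATE_NAMES = new WeakMap();
const MACHINES_TO_STARTING_STATES = new WeakMap();
const MACHINES_TO_NAMES_TO_STATES = new WeakMap();

const isStateMachine = object => MACHINES_TO_NAMES_TO_STATES.has(object);
const hasState = (machine, stateName) =>
  Object.prototype.hasOwnProperty.call(MACHINES_TO_NAMES_TO_STATES.get(machine), stateName);
const getState = machine =>
  MACHINES_TO_NAMES_TO_STATES.get(machine)[MACHINES_TO_CURRENT_STATE_NAMES.get(machine)];

function register (machine, startingState, namesToStates) {
  MACHINES_TO_STARTING_STATES.set(machine, startingState);
  MACHINES_TO_NAMES_TO_STATES.set(machine, namesToStates);
  return machine;
}

function setState (machine, stateName) {
  const currentState = getState(machine);

  if (hasState(machine, stateName)) {
    setDirectState(machine, stateName);
  } else if (isStateMachine(currentState)) {
    setState(currentState, stateName);   // delegate to the nested machine
  } else {
    throw new Error(`illegal transition to ${stateName}`);
  }
}

function setDirectState (machine, stateName) {
  MACHINES_TO_CURRENT_STATE_NAMES.set(machine, stateName);

  const newState = getState(machine);
  Object.setPrototypeOf(machine, newState);

  if (isStateMachine(newState)) {
    setState(newState, MACHINES_TO_STARTING_STATES.get(newState));
  }
}

// Hand-built states: the inner states inherit from the outer open state object.
const openState = { deposit (amount) { this.balance = this.balance + amount; } };
const closedState = { reopen () {} };
const notHeld = Object.assign(Object.create(openState), {
  withdraw (amount) { this.balance = this.balance - amount; }
});
const held = Object.assign(Object.create(openState), {
  availableToWithdraw () { return 0; }
});

const inner = register({}, 'not-held', { 'not-held': notHeld, held });
const account = register({ balance: 0 }, 'open', { open: inner, closed: closedState });

setState(account, MACHINES_TO_STARTING_STATES.get(account)); // open, then not-held
account.deposit(50);
account.withdraw(20);
setState(account, 'held');   // account has no held state, so inner handles it
```

In this sketch, a method lookup on account passes through the inner machine, then the current inner state, and only then reaches the open state object. The event methods that would call setState are left out to keep the example short.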

Ta da!

(code)

There is, of course, much more work to be done. What should getCurrentState actually return? How will it show that an account might be in both open and not-held states? What about drawing DOT diagrams? How should our getTransitions and dot functions work to produce a nested state transitions diagram?

But before we do that work, perhaps we should look at the work that already exists:


Excerpt from Harel's Paper

statecharts

In the 1980s, David Harel invented the statechart notation for specifying and programming reactive systems. In addition to formalizing hierarchal state machines, statechart notation covers many other important topics like concurrency and parallel execution.

Harel’s original paper explains the entire idea.

Statecharts have a formal SCXML notation, and a number of actively maintained libraries implementing executable statecharts, including xstate, which bills itself as “functional, stateless JavaScript finite state machines and statecharts.” You may also want to take a look at the delightfully readable (but unfinished) statecharts.github.io project. There’s plenty to review in Wikipedia’s article on UML State Machines as well.

There’s a lot to digest!

But we’ve gotten the general idea, and that is enough for now. Now that we’ve done the work, let’s turn to the most pressing question: What is it good for? Why should we care?


Rubber Duck Tour in HK

rubber duck design

Hierarchal state machines seem like an awful lot of work for a model with a handful of methods, but let’s take a step back and think about the general principles involved. First, hierarchal state machines help us model the way people think about states more closely than if we had to translate what they are saying to a simple “flat” state machine.

Second, hierarchal state machines open up opportunities for composing state machines out of smaller parts. Being able to decompose and recompose models is a programming superpower.

Third, having a “meta-model”–whether it be hierarchal state machines or flat state machines–for constructing domain models provides an unexpected process benefit we might call “Rubber Duck Design.”

In software engineering, rubber duck debugging or rubber ducking is a method of debugging code. The name is a reference to a story in the book The Pragmatic Programmer in which a programmer would carry around a rubber duck and debug their code by forcing themselves to explain it, line-by-line, to the duck. Many other terms exist for this technique, often involving different inanimate objects.

Many programmers have had the experience of explaining a problem to someone else, possibly even to someone who knows nothing about programming, and then hitting upon the solution in the process of explaining the problem.

Rubber duck design works just like rubber duck debugging: The act of explaining the design prompts our brains to think through the thing we’re modelling more thoroughly than if we just stare at the code. What makes state machines–whether hierarchal or flat–particularly effective for this is that they force us to “explain to the computer” all of the states and transitions the entity we’re modelling will have.

As an incredibly fecund bonus, we also get dynamically generated diagrams and a cleaner form of self-documenting code, in the form of descriptions that make it easier for others to read and modify our domain objects.

Whether we use the complete and rather heavyweight SCXML or build ourselves a lightweight alternative, in many cases, modelling domain objects as state machines is a win.



javascript allongé, the six edition

If you enjoyed this essay, you’ll ❤️ JavaScript Allongé, the Six Edition. It’s 100% free to read online!


Water Calligraphy

notes
  1. Banking software is not actually written with objects that have methods like .deposit for soooooo many reasons, but this toy example describes something most people understand on a basic level, even if they aren’t familiar with the needs of banking infrastructure, correctness, and so forth. 

  2. We’re also using a slightly different pattern for associating state machines with their states, based on weak maps.

  3. mongle: v, to molest or disturb. 

  4. Of course, we could implement a state machine in some other way, such that it behaves like a state machine but is implemented in some other fashion. That would be changing its implementation and not its interface. We could, for example, rewrite our StateMachine function to generate a collection of actors communicating with asynchronous message passing. The value of our current approach is that the implementation strongly mirrors the interface, which has certain benefits for readability. 

  5. Or a bigger boat

https://raganwald.com/2018/03/03/reflections
How I Learned to Stop Worrying and ❤️ the State Machine

“Any sufficiently complicated model class contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a state machine.”–former colleague1

Domain models are representations of meaningful real-world concepts pertinent to a sphere of knowledge, influence or activity (the “domain”) that need to be modelled in software. Domain models can represent concrete real-world objects, or more abstract things like meetings and incidents.

My colleague’s insight was that most domain models end up having a representation of various states. Over time, they build up a lot of logic around how and when they transition between these states, and that logic is smeared across various methods where it becomes difficult to understand and modify.

By recognizing when domain models should be represented first and foremost as state machines–or recognizing when to refactor domain models into state machines–we keep our models understandable and workable. We tame their complexity.

So, what are state machines? And how do they help?


A rube goldberg machine

finite state machines

A finite state machine, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time.

A state machine can change from one state to another in response to some external inputs; the change from one state to another is called a transition. A state machine is defined by a list of its states, its initial state, and the conditions for each transition.–Wikipedia

Well, that’s a mouthful. To put it in context from a programming (as opposed to idealized model of computation) point of view, when we program with objects, we build little machines. They have internal state and respond to messages or events from the rest of the program via the mechanism of their methods. (JavaScript also permits direct access to properties, but let’s consider the subset of programs that only interact with objects via methods.)

Most objects are designed to encapsulate some kind of state. Stateless objects are certainly a thing, but let’s put them aside and consider only objects that have state. Some such objects can be considered State Machines, some cannot. What distinguishes them?

First, a state machine has a notion of a state. All stateful objects have some kind of “state,” but a state machine reifies this and gives it a name. Furthermore, there are a finite number of possible states that a state machine can be in, and it is always in exactly one of these states.

For example, let’s say that a bank account has a balance and it can be one of open or closed. Its balance certainly is stateful, but its “state” from the perspective of a state machine is either open or closed.

Second, a state machine formally defines a starting state, and allowable transitions between states. The aforementioned bank account starts in open state. It can transition from open to closed, and from time to time, from closed back to open. Sometimes, it transitions from a state back to itself, which we see in an arrow from open back to open. This can be displayed with a diagram, and such diagrams are a helpful way to document or brainstorm behaviour:

Bank account with Open and Closed states

Third, a state machine transitions between states in response to events. Our bank account starts in open state. When open, a bank account responds to a close event by transitioning to the closed state. When closed, a bank account responds to the reopen event by transitioning to the open state. As noted, some transitions do not involve changes in state. When open, a bank account responds to deposit and withdraw events, and it transitions back to open state.


Strathfield Bends

transitions

The events that trigger transitions are noted by labeling the transitions on the diagram:

Bank account with events

We now have enough to create a naïve JavaScript object to represent what we know so far about a bank account. We will make each event a method, and the state will be a string:

let account = {
  state: 'open',
  balance: 0,

  deposit (amount) {
    if (this.state === 'open') {
      this.balance = this.balance + amount;
    } else {
      throw 'invalid event';
    }
  },

  withdraw (amount) {
    if (this.state === 'open') {
      this.balance = this.balance - amount;
    } else {
      throw 'invalid event';
    }
  },

  close () {
    if (this.state === 'open') {
      if (this.balance > 0) {
        // ...transfer balance to suspension account
      }
      this.state = 'closed';
    } else {
      throw 'invalid event';
    }
  },

  reopen () {
    if (this.state === 'closed') {
      // ...restore balance if applicable
      this.state = 'open';
    } else {
      throw 'invalid event';
    }
  }
}

account.state
  //=> open

account.close();
account.state
  //=> closed

The first thing we observe is that our bank account handles different events in different states. In the open state, it handles the deposit, withdraw, and close events. In the closed state, it only handles the reopen event. The natural way of implementing events is as methods on the object, but now we are embedding in each method a responsibility for knowing what state the account is in and whether that transition can be allowed or not.

It’s also not clear from the code alone what all the possible states are, or how we transition. What is clear is what each method does, in isolation. This is one of the “affordances” of a typical object-with-methods design: It makes it very easy to see what an individual method does, but not to get a higher view of how the methods are related to each other.


Let’s add a little functionality: A “hold” can be placed on accounts. Held accounts can accept deposits, but not withdrawals. And naturally, the hold can be removed. The new diagram looks like this:

Bank account with events and self and hold

And the code we end up with looks like this:

let account = {
  state: 'open',
  balance: 0,

  deposit (amount) {
    if (this.state === 'open' || this.state === 'held') {
      this.balance = this.balance + amount;
    } else {
      throw 'invalid event';
    }
  },

  withdraw (amount) {
    if (this.state === 'open') {
      this.balance = this.balance - amount;
    } else {
      throw 'invalid event';
    }
  },

  placeHold () {
    if (this.state === 'open') {
      this.state = 'held';
    } else {
      throw 'invalid event';
    }
  },

  removeHold () {
    if (this.state === 'held') {
      this.state = 'open';
    } else {
      throw 'invalid event';
    }
  },

  close () {
    if (this.state === 'open' || this.state === 'held') {
      if (this.balance > 0) {
        // ...transfer balance to suspension account
      }
      this.state = 'closed';
    } else {
      throw 'invalid event';
    }
  },

  reopen () {
    if (this.state === 'closed') {
      // ...restore balance if applicable
      this.state = 'open';
    } else {
      throw 'invalid event';
    }
  }
}

To accommodate the new state, we had to update a number of different methods. This is not difficult when the requirements are in front of us, and it’s often a mistake to overemphasize whether it is easy or difficult to implement something when the requirements and the code are both well-understood.

However, we can see that the code does not do a very good job of documenting what is or isn’t possible for a held account. This organization makes it easy to see exactly what a deposit or withdraw does, at the expense of making it harder to see how held accounts work or the overall flow of accounts from state to state.

If we wanted to emphasize states, what could we do?


A bank balance

executable state descriptions

Directly compiling diagrams has been–so far–highly unproductive for programming. But there’s another representation of a state machine that can prove helpful: A transition table. Here’s our transition table for the naïve bank account:

|        | open              | held      | closed |
|--------|-------------------|-----------|--------|
| open   | deposit, withdraw | placeHold | close  |
| held   | removeHold        | deposit   | close  |
| closed | reopen            |           |        |

In the leftmost column, we have the current state of the account. Each subsequent column is a destination state. At the intersection of the current state and a destination state, we have the event or events that transition the object from current to destination state. Thus, deposit and withdraw transition from open to open, while placeHold transitions the object from open to held. The start state is arbitrarily taken as the first state listed.
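Reading the table can be sketched as a lookup. Here the rows and columns are plain nested objects, and destinationFor is a hypothetical helper introduced only for illustration: given a current state and an event, it scans that state’s row for the column listing the event.

```javascript
// The transition table as data: current state -> destination state -> events.
const transitionTable = {
  open: {
    open: ['deposit', 'withdraw'],
    held: ['placeHold'],
    closed: ['close']
  },
  held: {
    open: ['removeHold'],
    held: ['deposit'],
    closed: ['close']
  },
  closed: {
    open: ['reopen']
  }
};

// Hypothetical helper: returns the destination state, or undefined when the
// current state does not handle the event.
function destinationFor (currentState, eventName) {
  const row = transitionTable[currentState] || {};

  for (const [destination, eventNames] of Object.entries(row)) {
    if (eventNames.includes(eventName)) {
      return destination;
    }
  }
  return undefined;
}

const afterPlaceHold = destinationFor('open', 'placeHold');
const afterDeposit = destinationFor('open', 'deposit');
const withdrawWhileHeld = destinationFor('held', 'withdraw');
```

An empty cell in the table shows up here as a missing key, which is why withdrawing from a held account has no destination.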

Like the state diagram, the transition table shows clearly which events are handled by which state, and the transitions between them. We can take this idea to our executable code: Here’s a version of our account that uses objects to represent table rows and columns.

const STATES = Symbol("states");
const STARTING_STATE = Symbol("starting-state");

const Account = {
  balance: 0,
  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {
      open: {
        deposit (amount) { this.balance = this.balance + amount; },
        withdraw (amount) { this.balance = this.balance - amount; },
      },
      held: {
        placeHold () {}
      },
      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      }
    },

    held: {
      open: {
        removeHold () {}
      },
      held: {
        deposit (amount) { this.balance = this.balance + amount; }
      },
      closed: {
        close () {
          if (this.balance > 0) {
            // ...transfer balance to suspension account
          }
        }
      }
    },

    closed: {
      open: {
        reopen () {
          // ...restore balance if applicable
        }
      }
    }
  }
};

This description isn’t executable, but it doesn’t take much to write an implementation organized along the same lines:


Code

implementing a state machine that matches our description

In Mixins, Forwarding, and Delegation in JavaScript, we briefly touched on using late-bound delegation to create state machines. The principle is that instead of using strings for state, we’ll use objects that contain the methods we’re interested in. First, we’ll write out what those objects will look like:

const STATE = Symbol("state");
const STATES = Symbol("states");

const open = {
  deposit (amount) { this.balance = this.balance + amount; },
  withdraw (amount) { this.balance = this.balance - amount; },
  placeHold () {
    this[STATE] = this[STATES].held;
  },
  close () {
    if (this.balance > 0) {
      // ...transfer balance to suspension account
    }
    this[STATE] = this[STATES].closed;
  }
};

const held = {
  removeHold () {
    this[STATE] = this[STATES].open;
  },
  deposit (amount) { this.balance = this.balance + amount; },
  close () {
    if (this.balance > 0) {
      // ...transfer balance to suspension account
    }
    this[STATE] = this[STATES].closed;
  }
};

const closed = {
  reopen () {
    // ...restore balance if applicable
    this[STATE] = this[STATES].open;
  }
};

Now our actual account object stores a state object rather than a state string, and delegates all methods to it. When an event is invalid, we’ll get an exception. That can be “fixed,” but let’s not worry about it now:

const account = {
  balance: 0,

  [STATE]: open,
  [STATES]: { open, held, closed },

  deposit (...args) { return this[STATE].deposit.apply(this, args); },
  withdraw (...args) { return this[STATE].withdraw.apply(this, args); },
  close (...args) { return this[STATE].close.apply(this, args); },
  placeHold (...args) { return this[STATE].placeHold.apply(this, args); },
  removeHold (...args) { return this[STATE].removeHold.apply(this, args); },
  reopen (...args) { return this[STATE].reopen.apply(this, args); }
};
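To see what “we’ll get an exception” means in practice, here is a trimmed sketch with only two states and three events. It assumes nothing beyond the pattern above: the held state has no withdraw method, so the delegating method ends up calling apply on undefined.

```javascript
const STATE = Symbol("state");
const STATES = Symbol("states");

const open = {
  withdraw (amount) { this.balance = this.balance - amount; },
  placeHold () { this[STATE] = this[STATES].held; }
};

const held = {
  removeHold () { this[STATE] = this[STATES].open; }
};

const account = {
  balance: 100,

  [STATE]: open,
  [STATES]: { open, held },

  withdraw (...args) { return this[STATE].withdraw.apply(this, args); },
  placeHold (...args) { return this[STATE].placeHold.apply(this, args); },
  removeHold (...args) { return this[STATE].removeHold.apply(this, args); }
};

account.placeHold();

let caught = null;
try {
  // held has no withdraw, so this[STATE].withdraw is undefined
  account.withdraw(10);
} catch (error) {
  caught = error;
}
```

The error is a generic TypeError about undefined rather than anything mentioning an invalid event, which is one reason we might want to “fix” it later.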

Unfortunately, this regresses: We’re littering the methods with state assignments. One of the benefits of transition tables and state diagrams is that they communicate both the from and to states of each transition. Assigning state within methods does not make this clear, and introduces an opportunity for error.

To fix this, we’ll write a transitionsTo decorator to handle the state changes.2

const STATE = Symbol("state");
const STATES = Symbol("states");

function transitionsTo (stateName, fn) {
  return function (...args) {
    const returnValue = fn.apply(this, args);
    this[STATE] = this[STATES][stateName];
    return returnValue;
  };
}

const open = {
  deposit (amount) { this.balance = this.balance + amount; },
  withdraw (amount) { this.balance = this.balance - amount; },
  placeHold: transitionsTo('held', () => undefined),
  close: transitionsTo('closed', function () {
    if (this.balance > 0) {
      // ...transfer balance to suspension account
    }
  })
};

const held = {
  removeHold: transitionsTo('open', () => undefined),
  deposit (amount) { this.balance = this.balance + amount; },
  close: transitionsTo('closed', function () {
    if (this.balance > 0) {
      // ...transfer balance to suspension account
    }
  })
};

const closed = {
  reopen: transitionsTo('open', function () {
    // ...restore balance if applicable
  })
};

const account = {
  balance: 0,

  [STATE]: open,
  [STATES]: { open, held, closed },

  deposit (...args) { return this[STATE].deposit.apply(this, args); },
  withdraw (...args) { return this[STATE].withdraw.apply(this, args); },
  close (...args) { return this[STATE].close.apply(this, args); },
  placeHold (...args) { return this[STATE].placeHold.apply(this, args); },
  removeHold (...args) { return this[STATE].removeHold.apply(this, args); },
  reopen (...args) { return this[STATE].reopen.apply(this, args); }
};

Now we have made it quite clear which methods belong to which states, and which states they transition to.
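A small sketch of how the decorator behaves, trimmed to two states. The close handler here returns the amount it would transfer, which is an assumption made purely to show that transitionsTo runs the wrapped function first, changes state second, and passes the return value through:

```javascript
const STATE = Symbol("state");
const STATES = Symbol("states");

function transitionsTo (stateName, fn) {
  return function (...args) {
    const returnValue = fn.apply(this, args);
    this[STATE] = this[STATES][stateName];
    return returnValue;
  };
}

const open = {
  // Hypothetical body: report what would go to the suspension account.
  close: transitionsTo('closed', function () {
    const transferred = this.balance;
    this.balance = 0;
    return transferred;
  })
};

const closed = {
  reopen: transitionsTo('open', () => undefined)
};

const account = {
  balance: 25,

  [STATE]: open,
  [STATES]: { open, closed },

  close (...args) { return this[STATE].close.apply(this, args); },
  reopen (...args) { return this[STATE].reopen.apply(this, args); }
};

const transferred = account.close();
const isClosedAfterClose = account[STATE] === closed;

account.reopen();
const isOpenAfterReopen = account[STATE] === open;
```

Note that the wrapped function must be a function expression rather than an arrow function whenever it needs this, as close does here.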

We could stop right here if we wanted to: This is a pattern that is remarkably easy to write by hand, and for many cases, it is far easier to read and maintain than having various if and/or switch statements littering every method. But since we’re enjoying ourselves, what would it take to automate the process of implementing this naïve state machine pattern from descriptions?


A machine that types on a typewriter

compiling descriptions into state machines

Code that writes code does add a certain complexity, but it also enables us to arrange our code such that it is organized more appropriately for our problem domain. Managing stateful entities is one of the hardest problems in programming3, so it’s often worth investing in a little infrastructure work to arrive at an easier to understand and extend program.

The first thing we’ll do is “begin with the end in mind.” We wish to be able to write something like this:

const STATE = Symbol("state");
const STATES = Symbol("states");
const STARTING_STATE = Symbol("starting-state");

function transitionsTo (stateName, fn) {
  return function (...args) {
    const returnValue = fn.apply(this, args);
    this[STATE] = this[STATES][stateName];
    return returnValue;
  };
}

const account = StateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },
      withdraw (amount) { this.balance = this.balance - amount; },
      placeHold: transitionsTo('held', () => undefined),
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    held: {
      removeHold: transitionsTo('open', () => undefined),
      deposit (amount) { this.balance = this.balance + amount; },
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    closed: {
      reopen: transitionsTo('open', function () {
        // ...restore balance if applicable
      })
    }
  }
});

What does StateMachine do?

const RESERVED = [STARTING_STATE, STATES];

function StateMachine (description) {
  const machine = {};

  // Handle all the initial states and/or methods
  const propertiesAndMethods = Object.keys(description).filter(property => !RESERVED.includes(property));
  for (const property of propertiesAndMethods) {
    machine[property] = description[property];
  }

  // now its states
  machine[STATES] = description[STATES];

  // what event handlers does it have?
  const eventNames = Object.entries(description[STATES]).reduce(
    (eventNames, [state, stateDescription]) => {
      const eventNamesForThisState = Object.keys(stateDescription);

      for (const eventName of eventNamesForThisState) {
        eventNames.add(eventName);
      }
      return eventNames;
    },
    new Set()
  );

  // define the delegating methods
  for (const eventName of eventNames) {
    machine[eventName] = function (...args) {
      const handler = this[STATE][eventName];
      if (typeof handler === 'function') {
        return this[STATE][eventName].apply(this, args);
      } else {
        throw `invalid event ${eventName}`;
      }
    }
  }

  // set the starting state
  machine[STATE] = description[STATES][description[STARTING_STATE]];

  // we're done
  return machine;
}
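Here is a sketch of the result in use. To keep the example self-contained it restates StateMachine in condensed form and trims the account to two states, so treat it as an illustration of the behaviour rather than a replacement for the code above:

```javascript
const STATE = Symbol("state");
const STATES = Symbol("states");
const STARTING_STATE = Symbol("starting-state");

function transitionsTo (stateName, fn) {
  return function (...args) {
    const returnValue = fn.apply(this, args);
    this[STATE] = this[STATES][stateName];
    return returnValue;
  };
}

// Condensed restatement of StateMachine, for self-containment only.
function StateMachine (description) {
  const machine = {};

  // Object.keys skips symbol keys, so only ordinary properties are copied
  for (const property of Object.keys(description)) {
    machine[property] = description[property];
  }
  machine[STATES] = description[STATES];

  // collect every event name handled by any state
  const eventNames = new Set();
  for (const stateDescription of Object.values(description[STATES])) {
    for (const eventName of Object.keys(stateDescription)) {
      eventNames.add(eventName);
    }
  }

  // one delegating method per event
  for (const eventName of eventNames) {
    machine[eventName] = function (...args) {
      const handler = this[STATE][eventName];
      if (typeof handler === 'function') {
        return handler.apply(this, args);
      }
      throw `invalid event ${eventName}`;
    };
  }

  machine[STATE] = description[STATES][description[STARTING_STATE]];
  return machine;
}

const account = StateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },
      withdraw (amount) { this.balance = this.balance - amount; },
      placeHold: transitionsTo('held', () => undefined)
    },
    held: {
      deposit (amount) { this.balance = this.balance + amount; },
      removeHold: transitionsTo('open', () => undefined)
    }
  }
});

account.deposit(100);
account.placeHold();

let rejected = null;
try {
  account.withdraw(10);   // held does not handle withdraw
} catch (error) {
  rejected = error;
}

account.removeHold();
account.withdraw(10);
```

Unlike the hand-written delegation, an invalid event now produces a message that names the event, because the generated method checks for a handler before calling it.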

A clock mechanism

let’s summarize

We began with this simple code for a bank account that behaved like a state machine:

let account = {
  state: 'open',

  close () {
    if (this.state === 'open') {
      if (this.balance > 0) {
        // ...transfer balance to suspension account
      }
      this.state = 'closed';
    } else {
      throw 'invalid event';
    }
  },

  reopen () {
    if (this.state === 'closed') {
      // ...restore balance if applicable
      this.state = 'open';
    } else {
      throw 'invalid event';
    }
  }
};

Encumbering this simple example with meta-programming to declare a state machine may not have been worthwhile, so we won’t jump to the conclusion that we ought to have written it differently. However, code being code, requirements were discovered, and we ended up writing:

let account = {
  state: 'open',
  balance: 0,

  deposit (amount) {
    if (this.state === 'open') {
      this.balance = this.balance + amount;
    } else if (this.state === 'held') {
      this.balance = this.balance + amount;
    } else {
      throw 'invalid event';
    }
  },

  withdraw (amount) {
    if (this.state === 'open') {
      this.balance = this.balance - amount;
    } else {
      throw 'invalid event';
    }
  },

  placeHold () {
    if (this.state === 'open') {
      this.state = 'held';
    } else {
      throw 'invalid event';
    }
  },

  removeHold () {
    if (this.state === 'held') {
      this.state = 'open';
    } else {
      throw 'invalid event';
    }
  },

  close () {
    if (this.state === 'open') {
      if (this.balance > 0) {
        // ...transfer balance to suspension account
      }
      this.state = 'closed';
    } else if (this.state === 'held') {
      if (this.balance > 0) {
        // ...transfer balance to suspension account
      }
      this.state = 'closed';
    } else {
      throw 'invalid event';
    }
  },

  reopen () {
    if (this.state === 'open') {
      throw 'invalid event';
    } else if (this.state === 'closed') {
      // ...restore balance if applicable
      this.state = 'open';
    }
  }
}

Faced with more complexity, and the dawning realization that things are going to inexorably become more complex over time, we refactored our code from an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a state machine to this example:

const account = StateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {
      deposit (amount) { this.balance = this.balance + amount; },
      withdraw (amount) { this.balance = this.balance - amount; },
      placeHold: transitionsTo('held', () => undefined),
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    held: {
      removeHold: transitionsTo('open', () => undefined),
      deposit (amount) { this.balance = this.balance + amount; },
      close: transitionsTo('closed', function () {
        if (this.balance > 0) {
          // ...transfer balance to suspension account
        }
      })
    },
    closed: {
      reopen: transitionsTo('open', function () {
        // ...restore balance if applicable
      })
    }
  }
});

Shasta Dam. Trashracks as seen from a boat on reservoir

how does this help?

Let’s finish our examination of state machines with a small change. We wish to add an availableToWithdraw method. It returns the balance (if positive and for accounts that are open and not on hold). The old way would be to write a single method with an if statement:

let account = {

  // ...

  availableToWithdraw () {
    if (this.state === 'open') {
      return (this.balance > 0) ? this.balance : 0;
    } else if (this.state === 'held') {
      return 0;
    } else {
      throw 'invalid method';
    }
  }
}

As discussed, this optimizes for understanding availableToWithdraw, but makes it harder to understand how open and held accounts differ from each other. It combines multiple responsibilities in the availableToWithdraw method: Understanding everything about account states, and implementing the functionality for each of the applicable states.

The “state machine way” is to write:

const account = StateMachine({
  balance: 0,

  [STARTING_STATE]: 'open',
  [STATES]: {
    open: {

      // ...

      availableToWithdraw () { return (this.balance > 0) ? this.balance : 0; }
    },
    held: {

      // ...

      availableToWithdraw () { return 0; }
    }
  }
});

This emphasizes the different states and the characteristics of an account in each state, and it separates the responsibilities: States and transitions are handled by our “state machine” organization, and each of the two functions handles just the mechanics of reporting the correct amount available for withdrawal.

“So much complexity in software comes from trying to make one thing do two things.”–Ryan Singer

We can obviously extend our code to generate ES2015 classes, incorporate before- and after- method decorations, tackle validating models in general and transitions in particular, and so forth. Our code here was written to illustrate the basic idea, not to power the next Startup Unicorn.

But for now, our lessons are:

  1. It isn’t always necessary to build architecture to handle every possible future requirement. But it is a good idea to recognize that as the code grows and evolves, we may wish to refactor to the correct choice in the future. While we don’t want to do it too early, we also don’t want to do it too late.

  2. Organizing things that behave like state machines along state machine lines separates responsibilities and decomposes methods into smaller pieces, each of which is focused on implementing the correct behaviour for a single state.


New!: More State Machine ❤️: From Reflection to Statecharts

p.s. State was popularized as one of the twenty-three design patterns articulated by the “Gang of Four.” The implementation examples tend to be somewhat specific to a certain style of OOP language, but it is well worth a review.




A clock mechanism

notes
  1. My colleague may or may not have said those exact words, but they introduced me to the idea of considering domain models as state machines by default, and my understanding has been crisper ever since. 

  2. At this moment, syntactic sugar for decorators is not yet fully accepted as a JavaScript standard. In this essay we’ll stick with ES2015 syntax, but feel free to use @transitionsTo('open') syntax if your toolchain supports the proposed syntactic sugar. 

  3. Programmers are congenitally unable to resist quoting Phil Karlton when they hear the words “hard,” “problem,” and “computer.” Let’s rise above that today. 

https://raganwald.com/2018/02/23/forde
Truncatable Primes in JavaScript

In number theory, a right-truncatable prime is a prime number which, in a given base, contains no 0, and if the last (“right”) digit is successively removed, then all resulting numbers are prime. 7393 is an example of a right-truncatable prime, since 7393, 739, 73, and 7 are all prime.

Wikipedia
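
To make the definition concrete, here is a minimal sketch that lists the successive right truncations of a number and checks each one with plain trial division. The names `rightTruncations`, `isPrimeByTrialDivision`, and `isRightTruncatablePrime` are made up for this illustration; this is not the code the essay goes on to develop.

```javascript
// Hypothetical helpers, for illustration only.

// 7393 -> [7393, 739, 73, 7]
function rightTruncations (n) {
  const truncations = [];

  for (let digits = n.toString(); digits.length > 0; digits = digits.slice(0, -1)) {
    truncations.push(Number(digits));
  }

  return truncations;
}

// Slow but easy to verify: try every divisor up to the square root.
function isPrimeByTrialDivision (n) {
  if (n < 2) return false;

  for (let divisor = 2; divisor * divisor <= n; ++divisor) {
    if (n % divisor === 0) return false;
  }

  return true;
}

function isRightTruncatablePrime (n) {
  return rightTruncations(n).every(isPrimeByTrialDivision);
}

console.log(rightTruncations(7393));
  //=> [ 7393, 739, 73, 7 ]
console.log(isRightTruncatablePrime(7393));
  //=> true
console.log(isRightTruncatablePrime(19));
  //=> false, because 1 is not a prime
```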

In this essay, we’re going to write some code to generate truncatable primes. Along the way, we’ll get some practice working with JavaScript generators implementing lazily generated lists, we’ll get a chance to look at some of the ways a naïve algorithm might have terrible runtime performance, and we’ll get a chance to explore how pipelining data through functions helps us to separate concerns.

That, and we’ll get a chance to play with an esoteric concept in number theory, truncatable primes. Let’s start with a simple question: Are there an infinite number of truncatable primes? Or is the number of truncatable primes finite?


prime numbers less than 250,000

the infinitude of primes

Now, we know that there are an infinite number of primes. The reductio ad absurdum proof of this is easy to follow along:

To prove that there are an infinite number of primes, we first assume the opposite–that there are a finite number of primes–and then show that this presumption leads to a contradiction.

If there are a finite number of primes, there is some finite list of primes, call them p0, p1, p2, …, pN, where “pN” is the largest prime. Now we know from other work in number theory that every number can be decomposed into a set of prime factors, even primes. The only thing special about primes in this respect is that their only prime factor is themselves.

It follows then that every integer has one or more factors from the list p0, p1, p2, …, pN and only this list. So now let us consider the number p0 times p1 times p2, …, times pN. It is the product of all of the primes, and we will call it pN*. This is obviously not a prime. But what about the number pN* + 1?

We also know from other work that if some number x is divisible by some prime p, the numbers x + y and x - y are not divisible by p unless y is divisible by p. The most trivial example is when y = 1, because 1 is not divisible by any prime. For example, the number 10 is divisible by 2 and 5, but the numbers 9 and 11 are not divisible by either 2 or 5.

From this, we know that pN* + 1 is not divisible by any prime on our list, be it p0, p1, p2, …, or pN. But that contradicts our knowledge that all numbers have one or more prime factors! So we conclude that pN* + 1 must be divisible by some prime other than p0, p1, p2, …, pN.

This tells us that any finite list of primes is necessarily incomplete.
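
The argument can be checked on a small case. The sketch below pretends that 2, 3, 5, 7, 11, and 13 are the only primes; the variable and function names are ours, chosen for illustration.

```javascript
// Pretend, for the sake of argument, that this is the complete list of primes.
const pretendPrimes = [2, 3, 5, 7, 11, 13];

const product = pretendPrimes.reduce((acc, p) => acc * p, 1);
const productPlusOne = product + 1;

// Dividing by any prime on the list always leaves a remainder of 1.
const remainders = pretendPrimes.map(p => productPlusOne % p);

// Trial division finds the smallest factor greater than 1, which is always prime.
function smallestPrimeFactor (n) {
  for (let divisor = 2; divisor * divisor <= n; ++divisor) {
    if (n % divisor === 0) return divisor;
  }
  return n;
}

console.log(product);
  //=> 30030
console.log(remainders);
  //=> [ 1, 1, 1, 1, 1, 1 ]
console.log(smallestPrimeFactor(productPlusOne));
  //=> 59, which is not on the list
```

Note that 30031 is not itself prime (it is 59 times 509). The proof only needs it to have some prime factor that is missing from the list.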

Ok, that is middle-school mathematics. What about truncatable primes? Are there an infinite number of them?


Monks Cellarium

are there an infinite number of truncatable primes?

One way to settle this question is with a clever bit of reasoning, like the proof that there are an infinite number of primes. But while that’s the elegant way, it’s not the only way.

Some mathematical problems can be solved by brute force. If you have an abbey full of mathematically minded monks, you can solve a brute force problem in a couple of decades. It’s a matter of figuring out how to enumerate all of the cases, divide up the work, and wait.

No abbey? No problem, today we have computers. How can we put a computer to work to brute-force the problem?

the naïve brute force approach

The naïve thing to do is to lazily generate a list of primes, checking each one to see if it’s a truncatable prime. This generates a lazy list of truncatable primes.

That seems like a terrible idea, because we just established that there are an infinite number of primes. Whether there are a finite number of truncatable primes or not, our algorithm will never terminate.

But we can combine brute force with a modicum of reasoning. If there are an infinite number of truncatable primes, our algorithm will never stop. But what if there are a finite number of truncatable primes?

What do we know about truncatable primes? One thing we can use is the deduction that any two consecutive truncatable primes must either have the same number of digits, or differ by at most one digit.

Consider some right truncatable prime p, with d digits. The next largest truncatable prime might also have d digits, as might the next. Or it might have d + 1 digits, as might the next. But for any truncatable prime, the next largest truncatable prime cannot have d + 2 digits, because if you removed a digit, there would have to be some truncatable prime with d + 1 digits.

It follows then that if we are testing consecutive primes for “truncatability,” and if we know the length of the largest truncatable prime that we’ve seen so far, any prime that has two more digits than the largest truncatable prime must necessarily not be truncatable, and we would know there would be no larger truncatable primes. Which would mean that there would be a finite number of right truncatable primes.

Of course, if our algorithm keeps finding truncatable primes that have the same length as the previous truncatable prime, or are at most one digit longer, we will only know that there are more of them than we have patience to test.

But maybe we should try it? There may be a reasonably tractable finite number of truncatable primes.

PDP-8

computering truncatable primes

In job interviews, it seems they always ask you to implement something from scratch, whereas in real life you just DuckDuckGo for TEH CODEZ. Let’s do that: We want some code that lazily generates prime numbers in ascending order. Like this code from The Hubris of Impatient Sieves of Eratosthenes:

function * multiplesOf (startingWith, n) {
  let number = startingWith;

  while (true) {
    yield number;
    number = number + n;
  }
}

function destructure (iterable) {
  const iterator = iterable[Symbol.iterator]();
  const { done, value } = iterator.next();

  if (!done) {
    return { first: value, rest: iterator }
  }
}

class HashSieve {
  constructor () {
    this._hash = Object.create(null);
  }

  addAll (iterable) {
    const { first, rest } = destructure(iterable);

    if (this._hash[first]) {
      this._hash[first].push(rest);
    }
    else this._hash[first] = [rest];

    return this;
  }

  has (number) {
    if (this._hash[number]) {
      this._remove(number);
      return true;
    }
    else return false;
  }

  _remove (number) {
    const iterables = this._hash[number];

    if (iterables == null) return false;

    delete this._hash[number];
    iterables.forEach((iterable) => this.addAll(iterable));

    return number;
  }
}

function * Primes () {
  let prime = 2;
  const composites = new HashSieve();

  while (true) {
    yield prime;
    composites.addAll(multiplesOf(prime * prime, prime));

    while (composites.has(++prime)) {
      // do nothing
    }
  }
}

We’ll need to iterate over all the primes, checking each one to see if it is truncatable. That would normally involve a lot of checking whether numbers are prime. But a lazy list of primes doesn’t help with that. We could save them as we generate them, but that might take up a lot of space. If we presume that there are a lot fewer truncatable primes than all primes, maybe we can get away with just storing truncatable primes.

In fact, we don’t need to save every truncatable prime, just those that are the same length or one digit smaller than whatever prime we’re currently examining. If there aren’t any that are the same length or one digit smaller, it means our current prime is at least two digits larger than the largest truncatable prime we’ve found, and we’re done.

Here’s a first cut at a brute-force check for right truncatable primes. Although we’re generating the primes, what we’re really doing is a brute-force search for a gap that would indicate that there can be no larger right truncatable primes:

// Depends upon Primes() from https://gist.github.com/raganwald/78b086166c0712b49e5160edca5ebadd

const rightTruncatablePrimeStrings = [];

for (const primeInt of Primes()) {
  const prime = primeInt.toString();
  const isRightTruncatablePrime = isRightTruncatablePrimeString(prime);

  if (isRightTruncatablePrime === true) {
    rightTruncatablePrimeStrings.push(prime);
    console.log(prime);
  } else if (isRightTruncatablePrime === null) {
    console.log('There are no more right truncatable primes.');
    break;
  }
}

// returns:
//
//   true,  indicating that the string passed represents a right truncatable prime;
//
//   false, indicating that the string passed does not represent a right truncatable prime,
//          but more right truncatable primes may yet exist
//
//   null,  indicating the string passed does not represent a right truncatable prime,
//          and there are no larger right truncatable primes
function isRightTruncatablePrimeString(prime) {
  if (prime.length === 1) {
    return true;
  } else {
    const remainder = prime.slice(0, -1);
    // a stored truncatable prime can only match if it is as long as the remainder
    const remainderLength = remainder.length;

    // remove our existing truncatable primes that are too short
    while (rightTruncatablePrimeStrings.length > 0 && rightTruncatablePrimeStrings[0].length < remainderLength) {
      rightTruncatablePrimeStrings.shift();
    }

    if (rightTruncatablePrimeStrings.length === 0) {
      return null;
    } else {
      return rightTruncatablePrimeStrings.includes(remainder);
    }
  }
}

//=>
  2
  3
  5
  7
  23

  ...

  23399339
  29399999
  37337999
  59393339
  73939133
  There are no more right truncatable primes.

Success! Of a kind…


Mechanical Adding Machine

evaluating our naïve brute force algorithm

If you physically babysit this algorithm while it runs, you’ll see that it gets slower and slower as it goes. If we count how many primes it has to check to discover each truncatable prime, we find that although the number gyrates back and forth a lot, the number of primes to be tested grows rapidly as the algorithm finds longer and longer truncatable primes.

For example, having found the second-last truncatable prime (59,393,339), it has to check 807,690 more primes before it discovers 73,939,133, the last truncatable prime.

How many primes do you suppose it has to check before it reaches 1,000,000,007, the first prime with ten digits? That’s when it realizes that there can be no more truncatable primes.

And it’s worse than this. Generating the consecutive primes with our “sieve” algorithm is itself a process that gets slower and slower as each prime is generated. And we need to generate more and more primes to discover each truncatable prime.

So there’s no surprise that it is painfully slow. But at least we made it work. Can we make it faster?


fractal fun

generate-and-test

Our algorithm above is a classic “generate-and-test” brute-force approach. One algorithm generates candidate solutions, the other tests them. As it happens, the “generate” is itself a variation of generate-and-test: It generates integers, tests to see whether the integers are prime, and then tests the primes to see if they are truncatable.

Another approach is to flip things around. Instead of testing whether prime numbers are truncatable, what if we test truncatable numbers to see if they are prime?

Here’s a little prime testing function. It uses our lazily generated primes to come up with factors to test:

const primeIterable = Primes();
const factors = [];

function isPrime(n) {
  const squareRoot = Math.floor(Math.sqrt(n));

  // make sure we have generated every prime up to the square root of n
  while (factors.length === 0 || factors[factors.length - 1] < squareRoot) {
    factors.push(primeIterable.next().value);
  }

  for (const factor of factors) {
    if (factor > squareRoot) {
      break;
    } else if (n % factor === 0) {
      return false;
    }
  }
  return true;
}

This works, but there’s an obvious refactoring: The function is mixing two concerns. factors is clearly a list of primes, but we’re faffing about with an array backed by an iterable to save recomputing primes from 2 every time we test a number.

What we want is an iterable over primes that is memoized. DuckDuckGo to the rescue again… And this gist has just the thing!

function memoize (generator) {
  const memos = {},
        iterators = {};

  return function * (...args) {
    const key = JSON.stringify(args);
    let i = 0;

    if (memos[key] == null) {
      memos[key] = [];
      iterators[key] = generator(...args);
    }

    while (true) {
      if (i < memos[key].length) {
        yield memos[key][i++];
      }
      else {
        const { done, value } = iterators[key].next();

        if (done) {
          return;
        } else {
          yield memos[key][i++] = value;
        }
      }
    }
  }
}

And now we can write:

// requires `memoize` from https://gist.github.com/raganwald/9714874740ec0048e3bc

const factors = memoize(Primes);

function isPrime(n) {
  const squareRoot = Math.floor(Math.sqrt(n));

  for (const factor of factors()) {
    // check the bound first, so that 2 is not rejected for being divisible by itself
    if (factor > squareRoot) {
      return true;
    } else if (n % factor === 0) {
      return false;
    }
  }
}

Much better. Ok, we can test whether some arbitrary number is a prime. How do we generate truncatables?

2, 3, 5, and 7 are truncatables. Given a truncatable, we can try appending a 1, 3, 7, or 9 to it. If the result is a prime, it too is truncatable. (We could try 2, 4, 6, and 8, but it’s obvious that any number ending in an even digit is not a prime, and any number ending in a 0 or a 5 is also not a prime.)

If we visualize the truncatable numbers as a tree, we can perform a search of the tree for primes. 2 has as its children 21, 23, 27, and 29. Only 23 and 29 are prime. 23 has as its children 231, 233, 237, and 239, of which 233 and 239 are prime. And so forth, and so forth…
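
The first couple of levels of that tree can be checked with a small sketch. `childrenOf` and `isPrimeByTrialDivision` are hypothetical names used only here, and the primality test is deliberately the simplest thing that works rather than the memoized one above.

```javascript
// Slow but easy to verify: try every divisor up to the square root.
function isPrimeByTrialDivision (n) {
  if (n < 2) return false;

  for (let divisor = 2; divisor * divisor <= n; ++divisor) {
    if (n % divisor === 0) return false;
  }

  return true;
}

// The children of a truncatable: append 1, 3, 7, or 9 and keep the primes.
function childrenOf (truncatable) {
  return [1, 3, 7, 9]
    .map(digit => truncatable * 10 + digit)
    .filter(isPrimeByTrialDivision);
}

console.log(childrenOf(2));
  //=> [ 23, 29 ]
console.log(childrenOf(23));
  //=> [ 233, 239 ]
console.log(childrenOf(53));
  //=> [], so this branch of the tree ends at 53
```

Branches that end, like the one at 53, are what give the search a chance of terminating.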

Here’s an implementation of the above “depth-first” search. It won’t be in numerical order, but if it terminates, we know that the number of truncatables is finite:

function * truncatables(...bases) {
  for (const base of bases) {
    yield base;

    const baseTimesTen = base * 10;

    for (const digit of [1, 3, 7, 9]) {
      const candidate = baseTimesTen + digit;

      if (isPrime(candidate)) {
        yield * truncatables(candidate);
      }
    }
  }
}

for (const truncatable of truncatables(2, 3, 5, 7)) {
  console.log(truncatable);
}

console.log('there are a finite number of truncatables');

And as we expected, this is lightning-quick compared to our “generate primes and test them for truncatability” algorithm.
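
As a cross-check, here is a sketch of a breadth-first variant. It is not part of the essay's code: `truncatablesInOrder` and `isPrimeByTrialDivision` are hypothetical names, and plain trial division stands in for the memoized `isPrime`. Working one digit-length at a time yields the truncatables in ascending order, and the search stops as soon as a whole generation comes up empty, which is the same "gap" we reasoned about earlier.

```javascript
// Slow but easy to verify: try every divisor up to the square root.
function isPrimeByTrialDivision (n) {
  if (n < 2) return false;

  for (let divisor = 2; divisor * divisor <= n; ++divisor) {
    if (n % divisor === 0) return false;
  }

  return true;
}

function * truncatablesInOrder () {
  // each generation holds every truncatable with the same number of digits
  let generation = [2, 3, 5, 7];

  while (generation.length > 0) {
    yield * generation;

    const nextGeneration = [];

    for (const base of generation) {
      for (const digit of [1, 3, 7, 9]) {
        const candidate = base * 10 + digit;

        if (isPrimeByTrialDivision(candidate)) nextGeneration.push(candidate);
      }
    }

    generation = nextGeneration;
  }
}

const allTruncatables = [...truncatablesInOrder()];

console.log(allTruncatables.length);
  //=> 83
console.log(allTruncatables[allTruncatables.length - 1]);
  //=> 73939133
```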


Traveller's Notebook

what have we learned?

First, we’ve learned that brute-force, while it has its limitations, can answer questions for us, or at least rule out certain possibilities.

Second, we’ve learned that even when choosing to “brute force” a solution to a problem, carefully choosing how we go about brute forcing the solution can have a tremendous impact on the performance of our programs.

And third, we’ve learned (by osmosis) that lazy computations like using generators can help us structure our code in a reasonable manner, separating concerns.



the final source
https://raganwald.com/2017/12/14/73939133
Closing Iterables is a Leaky Abstraction
iterators and iterables, a quick recapitulation

In JavaScript, iterators and iterables provide an abstract interface for sequentially accessing values, such as we might find in collections like arrays or priority queues.1

An iterator is an object with a .next() method. When you call it, you get a Plain Old JavaScript Object (or “POJO”) that has a done property. If the value of done is false, you are also given a value property that represents, well, a value. If the value of done is true, you may or may not be given a value property.

Iterators are stateful by design: Repeatedly invoking the .next() method usually results in a series of values until done (although some iterators continue indefinitely).

Here’s an iterator that counts down:

const iCountdown = {
  value: 10,
  done: false,
  next() {
    this.done = this.done || this.value < 0;

    if (this.done) {
      return { done: true };
    } else {
      return { done: false, value: this.value-- };
    }
  }
};

iCountdown.next()
  //=> { done: false, value: 10 }

iCountdown.next()
  //=> { done: false, value: 9 }

iCountdown.next()
  //=> { done: false, value: 8 }

// ...

iCountdown.next()
  //=> { done: false, value: 0 }

iCountdown.next()
  //=> { done: true }

An iterable is an object with a [Symbol.iterator] method. When invoked, [Symbol.iterator]() returns an iterator. Semantically, the iterator returned by [Symbol.iterator]() represents an iteration over the values associated with the iterable collection.

For example:

const countdown = {
  [Symbol.iterator]() {
    const iterator = {
      value: 10,
      done: false,
      next() {
        this.done = this.done || this.value < 0;

        if (this.done) {
          return { done: true };
        } else {
          return { done: false, value: this.value-- };
        }
      }
    };

    return iterator;
  }
};

We can do interesting things with iterables, like iterate over them using a for... of loop:

for (const count of countdown) {
  console.log(count);
}
  //=>
    10
    9
    8
    ...
    1
    0

Or destructure them:

const [ten, nine, eight, ...rest] = countdown;

ten
  //=> 10
nine
  //=> 9
eight
  //=> 8
rest
  //=> [7, 6, 5, 4, 3, 2, 1, 0]

And now, let’s get started. We’ll begin with a simple problem: How do we iterate over the lines of a text file?


reading lines from a file

We wish to create an iterable that successively yields the lines from a text file. Presuming we have some kind of library for opening, reading from, and closing files, we might write something a little like this:

function lines (path) {
  return {
    [Symbol.iterator]() {
      return {
        done: false,
        fileDescriptor: File.open(path),
        next() {
          if (this.done) return { done: true };
          const line = this.fileDescriptor.readLine();

          this.done = line == null;

          if (this.done) {
            this.fileDescriptor.close();
            return { done: true };
          } else {
            return { done: false, value: line };
          }
        }
      };
    }
  };
}

Whenever we want to iterate over all the lines of a file, we call our function, e.g. lines('./README.md'), and we get an iterable for the lines in the file.

When we invoke [Symbol.iterator]() on our iterable, we get an iterator that opens the file, reads the file line by line when we call .next(), and then closes the file when there are no more lines to be read.

So we could output all the lines containing a particular word like this:

for (const line of lines('./README.md')) {
  if (line.match(/raganwald/)) {
    console.log(line);
  }
}

The expression lines('./README.md') would create a new iterable, the for... of loop would extract an iterator with an open file from it, we’d iterate over each line, and eventually we’d run out of lines, close the file, and exit the loop.

What if we only want to find the first line with a particular word in it?

for (const line of lines('./README.md')) {
  if (line.match(/raganwald/)) {
    console.log(line);
    break;
  }
}

Now we have a problem. How are we going to close the file? The only way it will exhaust the iterations and invoke this.fileDescriptor.close() is if the file doesn’t contain raganwald. If the file does contain raganwald, our program will happily carry on while leaving the file open.
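
We can watch the leak happen with a stand-in for the file library. Everything named `openFakeFile`, `fakeLines`, and `openFileCount` below is hypothetical scaffolding (the `File` API in the essay is presumed, not real), so treat this as a sketch of the problem rather than real file handling.

```javascript
// Hypothetical in-memory stand-in for a file library; counts open "files".
let openFileCount = 0;

function openFakeFile (contents) {
  const pending = contents.split('\n');

  openFileCount++;
  return {
    readLine () { return pending.length > 0 ? pending.shift() : null; },
    close () { openFileCount--; }
  };
}

// Same shape as lines(path) above, reading from a string instead of a path.
function fakeLines (contents) {
  return {
    [Symbol.iterator]() {
      return {
        done: false,
        fileDescriptor: openFakeFile(contents),
        next() {
          if (this.done) return { done: true };
          const line = this.fileDescriptor.readLine();

          this.done = line == null;

          if (this.done) {
            this.fileDescriptor.close();
            return { done: true };
          } else {
            return { done: false, value: line };
          }
        }
      };
    }
  };
}

const text = 'alpha\nraganwald\nomega';

for (const line of fakeLines(text)) {
  // read every line
}
const openAfterExhausting = openFileCount;

for (const line of fakeLines(text)) {
  if (line.match(/raganwald/)) break;
}
const openAfterBreaking = openFileCount;

console.log(openAfterExhausting, openAfterBreaking);
  //=> 0 1
```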

This is not good. And it’s not the only case. We might write iterators that act as coroutines, communicating with other processes over ports. Once again, we’d want to explicitly close the port when we are done with the iterator. We don’t want to just garbage-collect the memory we’re using.

What we need is some way to explicitly “close” iterators, and then each iterator could dispose of any resources it is holding. Then we could exercise a little caution, and explicitly close every iterator when we were done with them. We wouldn’t need to know whether the iterator was holding on to an open file or socket or whatever, the iterator would deal with that.

Fortunately, there is a mechanism for closing iterators, and it was designed for the express purpose of dealing with iterators that must hold onto some kind of resource like a file descriptor, an open port, a tremendous amount of memory, anything at all, really.

Iterables that need to dispose of resources introduce a problem. To solve it, the language introduced a mechanism for closing iterators, but we will still need to work out patterns and protocols of our own.

Let’s take a look at the mechanism.


return to forever

We’ve seen that the interface for iterators includes a mandatory .next() method. It also includes an optional .return() method. The contract for .return(optionalReturnValue) is that when invoked:

  • it should return { done: true } if no optional return value is provided, or { done: true, value: optionalReturnValue } if an optional return value is provided.
  • thereafter, the iterator should permanently return { done: true } should .next() be called.
  • as a consequence of the above, the iterator can and should dispose of any resources it is holding.

Looking back at our countdown iterable, we can implement .return() for it:

const countdown = {
  [Symbol.iterator]() {
    const iterator = {
      value: 10,
      done: false,
      next() {
        this.done = this.done || this.value < 0;

        if (this.done) {
          return { done: true };
        } else {
          return { done: false, value: this.value-- };
        }
      },
      return(value) {
        this.done = true;
        if (arguments.length === 1) {
          return { done: true, value };
        } else {
          return { done: true };
        }
      }
    };

    return iterator;
  }
};

There is some duplication of logic around returning { done: true } and setting this.done = true, and this duplication will be more acute when we deal with disposing of resources, so let’s clean it up:

const countdown = {
  [Symbol.iterator]() {
    const iterator = {
      value: 10,
      done: false,
      next() {
        if (this.done) {
          return { done: true };
        } else if (this.value < 0) {
          return this.return();
        } else {
          return { done: false, value: this.value-- };
        }
      },
      return(value) {
        this.done = true;
        if (arguments.length === 1) {
          return { done: true, value };
        } else {
          return { done: true };
        }
      }
    };

    return iterator;
  }
};

Now we can see how to write a loop that breaks before it exhausts the entire iteration:

const iCountdown = countdown[Symbol.iterator]();

while (true) {
  const { done, value: count } = iCountdown.next();

  if (done) break;

  console.log(count);

  if (count === 6) {
    iCountdown.return();
    break;
  }
}

Calling .return() ensures that iCountdown disposes of any resources it has or otherwise cleans itself up. Of course, this is a PITA if we have to work directly with iterators and give up the convenience of things like for... of loops and destructuring.

It would be really nice if they followed the same pattern. Do they? Let’s find out. We can use a breakpoint in the .return() method, or insert an old-school console.log statement:

return(value) {
  if (!this.done) {
    console.log('Return to Forever');
    this.done = true;
  }
  if (arguments.length === 1) {
    return { done: true, value };
  } else {
    return { done: true };
  }
}

And now let’s try:

for (const count of countdown) {
  console.log(count);
  if (count === 6) break;
}
  //=>
    10
    9
    8
    7
    6
    Return to Forever

And also:

const [ten, nine, eight] = countdown;
  //=> Return to Forever

JavaScript’s built-in constructs for consuming iterators from iterables invoke .return() if we don’t consume the entire iteration.

Also, we can see that the .return() method is optional: JavaScript’s built-in constructs will not call .return() on an iterator that doesn’t implement .return().
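
Applying this to the file example, here is a sketch of a line iterator that implements .return(). It reads from an in-memory stand-in rather than a real file: `openFakeFile`, `closableLines`, and `openFileCount` are hypothetical names introduced for this illustration.

```javascript
// Hypothetical in-memory stand-in for a file library; counts open "files".
let openFileCount = 0;

function openFakeFile (contents) {
  const pending = contents.split('\n');

  openFileCount++;
  return {
    readLine () { return pending.length > 0 ? pending.shift() : null; },
    close () { openFileCount--; }
  };
}

function closableLines (contents) {
  return {
    [Symbol.iterator]() {
      return {
        done: false,
        fileDescriptor: openFakeFile(contents),
        next() {
          if (this.done) return { done: true };

          const line = this.fileDescriptor.readLine();

          if (line == null) {
            return this.return();
          } else {
            return { done: false, value: line };
          }
        },
        return(value) {
          // close the file exactly once, however we got here
          if (!this.done) {
            this.done = true;
            this.fileDescriptor.close();
          }
          if (arguments.length === 1) {
            return { done: true, value };
          } else {
            return { done: true };
          }
        }
      };
    }
  };
}

const text = 'alpha\nraganwald\nomega';

for (const line of closableLines(text)) {
  if (line.match(/raganwald/)) break;
}
const openAfterBreaking = openFileCount;

const [firstLine] = closableLines(text);
const openAfterDestructuring = openFileCount;

console.log(openAfterBreaking, firstLine, openAfterDestructuring);
  //=> 0 alpha 0
```

Because both `break` and incomplete destructuring invoke .return(), the stand-in file is closed in either case.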


invoking return isn’t always simple

So, now we see that we should write our iterators to have a .return() method when they have resources that need to be disposed of, and that we can use this method ourselves or rely on JavaScript’s built-in constructs to call it for us.

This can be tricky. Here’s a function that returns the first value (if any) of an iterable:

function first (iterable) {
  const [value] = iterable;

  return value;
}

Because destructuring always closes the iterator extracted from the iterable it is given, this flavour of first can be counted on to close its parameter. If we get fancy and try to do everything by hand:

function first (iterable) {
  const iterator = iterable[Symbol.iterator]();
  const { done, value } = iterator.next();

  if (!done) return value;
}

We might neglect closing the iterator we extracted. We have to do everything ourselves:

function first (iterable) {
  const iterator = iterable[Symbol.iterator]();
  const { done, value } = iterator.next();

  if (typeof iterator.return === 'function') {
    iterator.return();
  }

  if (!done) return value;
}

A good heuristic is, If we can use JavaScript’s built-in constructs to close the iterator extracted from an iterable, we should.

As we can see, destructuring handles closing an iterator for us. We’ve already seen that a for... of loop does the same when we break from inside it; if the loop runs to exhaustion instead, the iterator has already reported that it is done and has had its chance to clean up.

This is also true if we yield from inside a for... of loop within a generator. For example, we have previously seen functions like mapWith:

function * mapWith (mapFn, iterable) {
  for (const value of iterable) {
    yield mapFn(value);
  }
}

This is a generator that takes an iterable as an argument and returns an iterable. We can see that if we exhaust the iterable it returns, it will exhaust the iterable it is passed. But what happens if we terminate the iteration prematurely? For example, what if we break from inside a for... of loop?

We can test this directly:

const countdownInWords = mapWith(n => words[n], countdown);

for (const word of countdownInWords) {
  break;
}
  //=> Return to Forever

Invoking break inside this for... of loop is also invoking break inside of mapWith’s for... of loop, because that is where execution pauses when it invokes yield. So this will close the iterator that mapWith’s for... of loop extracts.

Unfortunately, we cannot always arrange for JavaScript’s built-in constructs to close our iterators for us.


more about closing iterators explicitly

The zipWith function takes two or more iterables, and “zips” them together with a function. If we write this as a generator, there is no easy way to rely on JavaScript’s built-in constructs to close all of the iterators we extract from its parameters.

function * zipWith (zipper, ...iterables) {
  const iterators = iterables.map(i => i[Symbol.iterator]());

  while (true) {
    const pairs = iterators.map(j => j.next()),
          dones = pairs.map(p => p.done),
          values = pairs.map(p => p.value);

    if (dones.indexOf(true) >= 0) {
      for (const iterator of iterators) {
        if (typeof iterator.return === 'function') {
          iterator.return();
        }
      }
      return;
    }

    yield zipper(...values);
  }
}

const fewWords = ['alper', 'bethe', 'gamow'];

for (const pair of zipWith((l, r) => [l, r], countdown, fewWords)) {
  //... diddley
}
  //=> Return to Forever

This code will explicitly close every iterator if and when any one of them is exhausted. However, if we prematurely terminate the iteration, such as using incomplete destructuring or invoking break from inside a loop, it will not close any of the iterators:

const [[firstCount, firstWord]] = zipWith((l, r) => [l, r], countdown, fewWords);
  //=>

This snippet does not log Return to Forever. Although JavaScript’s built-in behaviour attempts to close the iterator created by our generator function, it never invokes the code we wrote to close all the iterators.

As suggested by jaffathecake, we can make sure the iterables are closed within a generator using a try... finally construct:

function * zipWith (zipper, ...iterables) {
  const iterators = iterables.map(i => i[Symbol.iterator]());

  try {
    while (true) {
      const pairs = iterators.map(j => j.next()),
            dones = pairs.map(p => p.done),
            values = pairs.map(p => p.value);

      if (dones.indexOf(true) >= 0) {
        for (const iterator of iterators) {
          if (typeof iterator.return === 'function') {
            iterator.return();
          }
        }
        return;
      }

      yield zipper(...values);
    }
  }
  finally {
    for (const iterator of iterators) {
      if (typeof iterator.return === 'function') {
        iterator.return();
      }
    }
  }
}

Now, when we close the iterable returned by zipWith, we are going to explicitly close each and every one of the iterables we pass into it, provided that they implement .return(). Let’s try it:

const [[firstCount, firstWord]] = zipWith((l, r) => [l, r], countdown, fewWords);
  //=> Return to Forever
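
The reason this works is that closing a suspended generator resumes it as if a return statement had been executed at the yield where it paused, so the rest of the body is skipped and only finally blocks get a chance to run. A minimal sketch, with hypothetical names (`oneTwo`, `events`):

```javascript
const events = [];

function * oneTwo () {
  try {
    yield 1;
    yield 2;
    // only reached when the consumer asks for a third value
    events.push('ran to completion');
  } finally {
    events.push('finally');
  }
}

// Closed early: the generator is paused at `yield 1` when we break.
for (const n of oneTwo()) {
  break;
}
const eventsAfterBreaking = [...events];

// Exhausted: the generator runs off the end of its body.
events.length = 0;
const allValues = [...oneTwo()];
const eventsAfterExhausting = [...events];

console.log(eventsAfterBreaking);
  //=> [ 'finally' ]
console.log(eventsAfterExhausting);
  //=> [ 'ran to completion', 'finally' ]
```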

Another sure way to close all the iterators is to take 100% control of zipWith. Instead of writing it as a generator function, we can write it as a function that returns an iterable object:

function zipWith (zipper, ...iterables) {
  return {
    [Symbol.iterator]() {
      return {
        done: false,
        iterators: iterables.map(i => i[Symbol.iterator]()),
        zipper,
        next() {
          const pairs = this.iterators.map(j => j.next()),
                dones = pairs.map(p => p.done),
                values = pairs.map(p => p.value);

          if (dones.indexOf(true) >= 0) {
            return this.return();
          } else {
            return { done: false, value: this.zipper(...values) };
          }
        },
        return(optionalValue) {
          if (!this.done) {
            this.done = true;

            for (const iterable of this.iterators) {
              if (typeof iterable.return === 'function') {
                iterable.return();
              }
            }
          }

          if (arguments.length === 1) {
            return { done: true, value:optionalValue };
          } else {
            return { done: true };
          }
        }
      };
    }
  };
}

And that also works:2

const [[firstCount, firstWord]] = zipWith((l, r) => [l, r], countdown, fewWords);
  //=> Return to Forever

Either way, we must explicitly arrange things such that zipWith closes its iterators when its own iterator is closed.


hidden affordances

We’ve seen that iterators need to be closed. We’ve also seen that the affordance for closing an iterator is invisible. There’s a .return() method we may need to invoke. We also may need to implement it. But it’s usually invisible, and the most convenient way to work with iterables–writing generators and using constructs like for... of loops or destructuring–hides .return() from us.

This conscious design choice does make learning about iterables particularly easy. It’s easy to write generators, and when we encounter code like this in a blog post:

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

We can grasp the fundamental idea of what the code is trying to accomplish very quickly. But it’s not obvious to the untrained eye why this code is preferred:

function * take (numberToTake, iterable) {
  let i = 0;

  for (const value of iterable) {
    if (i++ === numberToTake) {
      return;
    } else {
      yield value;
    }
  }
}

And then there is the eternal debate of explicit versus implicit:

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  try {
    for (let i = 0; i < numberToTake; ++i) {
      const { done, value } = iterator.next();
      if (!done) yield value;
    }
  }
  finally {
    if (typeof iterator.return === 'function') {
      iterator.return();
    }
  }
}

Is the for... of loop more elegant? What if for (let i = 0; i < numberToTake; ++i) is faster? Is the try... finally code better because it explicitly handles closing the iterator? Or is it worse because it introduces extra code not central to the purpose of the function?
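
One way to compare the three versions is to hand each of them a source whose iterator records whether its .return() method was called. The sketch below is only an illustration under stated assumptions: makeSource, closesSource, and the names takeManual, takeForOf, and takeExplicit are hypothetical, and the three generator bodies are copied from the versions above.

```javascript
// A hypothetical endless source whose iterator records whether it was closed.
const makeSource = () => {
  const status = { closed: false };
  const iterable = {
    [Symbol.iterator]() {
      let n = 0;
      return {
        next() { return { done: false, value: n++ }; },
        return(value) { status.closed = true; return { done: true, value }; }
      };
    }
  };
  return { iterable, status };
};

// The first version: drives the iterator by hand, never closes it.
function * takeManual (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

// The second version: relies on for... of to close the iterator.
function * takeForOf (numberToTake, iterable) {
  let i = 0;

  for (const value of iterable) {
    if (i++ === numberToTake) {
      return;
    } else {
      yield value;
    }
  }
}

// The third version: closes the iterator explicitly in a finally clause.
function * takeExplicit (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  try {
    for (let i = 0; i < numberToTake; ++i) {
      const { done, value } = iterator.next();
      if (!done) yield value;
    }
  }
  finally {
    if (typeof iterator.return === 'function') {
      iterator.return();
    }
  }
}

// Destructuring takes one value, then closes the generator it was given.
// We report whether that close reached the underlying source.
const closesSource = take => {
  const { iterable, status } = makeSource();
  const [first] = take(3, iterable);
  return status.closed;
};

console.log(closesSource(takeManual));   //=> false
console.log(closesSource(takeForOf));    //=> true
console.log(closesSource(takeExplicit)); //=> true
```

All three produce the same values. They differ only in whether closing the generator early also closes the source, which is the part that is invisible on a first read.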


chesterton’s fence and leaky abstractions

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.” –G.K. Chesterton

Imagine that some code included the “implicit” implementation of take. An engineer sees it, decides it could be faster, and optimizes it, therein removing its ability to correctly close an iterable it is handed. Is this wrong? How would anybody know, if there is no mechanism to discover why it was written that way to begin with?

Many patterns are like this. They include code for solving problems that are not evident on the first read: We have to have encountered the problem in the past in order to appreciate why the code we’re looking at solves it. And to be fair, the problem being solved may not apply to us today.

The implementation of take given in the blog post is fine for the code in the blog post, and for most code. But when it isn’t fine, it’s broken.

This is the state of affairs with all code, whether functional, OO, whatever. We have “leaky abstractions” like iterables. They are fine as long as we are well within the most common case, but when we stray near the edges, we need to understand what is going on “under the hood.” If we can’t see beneath the abstraction, we won’t appreciate interactions such as whether a for... of loop inside a generator closes its iterator if the enclosing iterator is closed while it yields.


in closing

We have found that closing iterables introduces an aspect of how to write correct iterables code that is not always visibly evident. Sometimes we must choose one construct, sometimes another. Sometimes we are safe to use generators, sometimes we must write functions that return iterable objects.

In the end, the safest way to proceed is to understand our tools really, really well. Abstractions are useful for writing code that eliminates accidental complexity, but that does not mean that we as programmers do not need to understand what is happening beneath the abstractions. It is just that we don’t always need to clutter our code up with what is happening beneath the abstractions.

(discuss on reddit)


notes
  1. For a more thorough discussion of iterators and iterables, have a look at the Collections chapter of JavaScript Allongé 

  2. There’s another case not discussed in this post: handling exceptions. That is deliberate, as the point of this post is to illustrate how iterators are a leaky abstraction, not to dictate patterns for robustly handling all of their leaks.

https://raganwald.com/2017/07/22/closing-iterables-is-a-leaky-abstraction
A Sequence Problem

Here are the first sixteen elements of a sequence:

.
*
(*)
(*.)
((*))
(*..)
(**)
(*...)
((*.))
((*).)
(*.*)
(*....)
(*(*))
(*.....)
(*..*)
(**.)

And the next sixteen:

(((*)))
(*......)
((*)*)
(*.......)
(*.(*))
(*.*.)
(*...*)
(*........)
(*(*.))
((*)..)
(*....*)
((*.).)
(*..(*))
(*.........)
(***)
(*..........)

What is the next element in the sequence?

solving sequence problems

Problems with this rough form appear regularly in “intelligence” tests. To solve them, you need a couple of different things:

First, you need some facility with manipulating abstract relationships and patterns. You need a modicum of deductive and inductive logical thought. This is probably what people are talking about when they talk about using a problem like this to measure “intelligence.” It’s not the only kind of intelligence, of course, but it certainly is one or more kinds of intelligence.

But intelligence (whether intelligence-singular or multiple types of specialized intelligence) is not the only thing you need to figure this out. Intelligence is necessary but not sufficient. You also need experience with tools.

tooling

What kinds of tools are we talking about?

Imagine we try to prove the Pythagorean Theorem from scratch, with no mathematical training. This requires intelligence, obviously, but there is something else: It is much harder to prove a geometry theorem from first principles, than if we’ve already had some exposure to plane geometry and can use the notation and methods taught in school.

The fact is, tools matter for solving problems, and experience with the tools greatly influences your ability to solve problems in a particular domain.

Solving sequence problems requires intelligence, yes, but it is greatly assisted by experience solving sequence problems and exposure to the tools for solving sequence problems.

One tool is to make a hypothesis about how the sequence is constructed, test it against the examples given, and if it fits, derive the next value from your hypothetical rules for constructing the sequence.

The more experience you have with sequence problems, the more hypotheses you are likely to consider and the faster you will generate and test them.

One hypothesis to try is that this is a sequence where there is a fixed transformation on each element to derive the next element. If that was the case, we would look for f where:

f. -> *
f* -> (*)
f(*) -> (*.)

...

f(***) -> (*..........)

If we solve for f, we can then apply f(*..........) and we’d have our answer.

The recognition that sequences have patterns like “repeated application of a function” is a powerful insight. You could, of course, make a tremendous leap unassisted and work this out. Or, you could have been exposed to the idea in a book or in school, in which case you are demonstrating your experience with the tools of mathematical thinking.

But of course, these are two different things. And if you want to measure one, you may wind up accidentally measuring the other. This is the main problem with “brain teasers” as programming interview questions. We often want to measure “smarts,” but we instead measure “experience with abstract problems.”

The argument about whether the ability to solve sequence problems applies to the ability to write software often comes down to the difference between raw intelligence, which may very well apply to programming, and experience with specific math tools, which may not.1

But back to our sequence. You can stop here if you haven’t solved the problem and care to work on it yourself.


Numbers (c) 2012 Morebyless

the sequence

Another general form for sequences is that they are a mapping from some well-known sequence to another. The sequence above could be a code or representation for the words of the American Declaration of Independence. Or Pantone colours. Or more likely, some well-known sequence of numbers.

In that case, the sequence above could be something like:

f0 -> .
f1 -> *
f2 -> (*)
f3 -> (*.)

...

f31 -> (*..........)

If that was the case, the sequence would be a representation of the Natural Numbers (also called the non-negative integers), in order, from 0 through 15 in the first list, and 16 through 31 in the second list. If we knew that this was a list of the natural numbers, we would know that the next number is going to be 32, and if we know f, we can apply f32 -> and derive the next item in the list.

How can we verify our hypothesis? Well, the natural numbers have some patterns, and we could see if the sequence we have has similar patterns. For example, do all the even or odd items have something in common?

To make things easier, let’s play with the sequence in JavaScript. Here’s some code that “prints” each element along with our hypothetical relationship:

const s = ['.', '*', '(*)', '(*.)', '((*))', '(*..)',
  '(**)', '(*...)', '((*.))', '((*).)', '(*.*)', '(*....)',
  '(*(*))', '(*.....)', '(*..*)', '(**.)', '(((*)))',
  '(*......)', '((*)*)', '(*.......)', '(*.(*))',
  '(*.*.)', '(*...*)', '(*........)', '(*(*.))', '((*)..)',
  '(*....*)', '((*.).)', '(*..(*))', '(*.........)',
  '(***)', '(*..........)'];

for (let i = 0; i < s.length; i = i + 1)
  console.log(`f${i} -> `+ s[i]);

f0 -> .
f1 -> *
f2 -> (*)
f3 -> (*.)
f4 -> ((*))
f5 -> (*..)
f6 -> (**)
f7 -> (*...)
f8 -> ((*.))
f9 -> ((*).)
f10 -> (*.*)
f11 -> (*....)
f12 -> (*(*))
f13 -> (*.....)
f14 -> (*..*)
f15 -> (**.)
f16 -> (((*)))
f17 -> (*......)
f18 -> ((*)*)
f19 -> (*.......)
f20 -> (*.(*))
f21 -> (*.*.)
f22 -> (*...*)
f23 -> (*........)
f24 -> (*(*.))
f25 -> ((*)..)
f26 -> (*....*)
f27 -> ((*.).)
f28 -> (*..(*))
f29 -> (*.........)
f30 -> (***)
f31 -> (*..........)

If you haven’t solved it yet, feel free to stop here and take advantage of these two tools: The hypothesis that this is a mapping from the natural numbers to some representation, and a snippet of JavaScript that facilitates playing with the elements of the sequence.


Sequence #1073 © 2010 fdecomite

some observations

Shall we continue? One thing we can do is look at the even elements:

for (let i = 0; i < s.length; i = i + 2)
  console.log(`f${i} -> `+ s[i]);

f0 -> .
f2 -> (*)
f4 -> ((*))
f6 -> (**)
f8 -> ((*.))
f10 -> (*.*)
f12 -> (*(*))
f14 -> (*..*)
f16 -> (((*)))
f18 -> ((*)*)
f20 -> (*.(*))
f22 -> (*...*)
f24 -> (*(*.))
f26 -> (*....*)
f28 -> (*..(*))
f30 -> (***)

And the odd elements:

for (let i = 1; i < s.length; i = i + 2)
  console.log(`f${i} -> `+ s[i]);

f1 -> *
f3 -> (*.)
f5 -> (*..)
f7 -> (*...)
f9 -> ((*).)
f11 -> (*....)
f13 -> (*.....)
f15 -> (**.)
f17 -> (*......)
f19 -> (*.......)
f21 -> (*.*.)
f23 -> (*........)
f25 -> ((*)..)
f27 -> ((*.).)
f29 -> (*.........)
f31 -> (*..........)

Interesting. The first odd is *, which we hypothesize is 1. All subsequent odds end in .), while none of the evens end in .). A bunch of the odds are even “more extreme,” they start with (*, then have one or more dots ending in .).

We can express that with a regular expression. Annoyingly, we have to escape everything because this sequence consists entirely of characters that have a special meaning in regular expressions: /^\(\*\.+\)$/.

Let’s use it:

for (let i = 0; i < s.length; i = i + 1)
  if (s[i].match(/^\(\*\.+\)$/))
    console.log(`f${i} -> `+ s[i]);

f3 -> (*.)
f5 -> (*..)
f7 -> (*...)
f11 -> (*....)
f13 -> (*.....)
f17 -> (*......)
f19 -> (*.......)
f23 -> (*........)
f29 -> (*.........)
f31 -> (*..........)

This sequence looks very familiar, but it’s missing something. Where is f2? If we modify our regular expression to match zero or more dots instead of one or more, we get:

for (let i = 0; i < s.length; i = i + 1)
  if (s[i].match(/^\(\*\.*\)$/))
    console.log(`f${i} -> `+ s[i]);

f2 -> (*)
f3 -> (*.)
f5 -> (*..)
f7 -> (*...)
f11 -> (*....)
f13 -> (*.....)
f17 -> (*......)
f19 -> (*.......)
f23 -> (*........)
f29 -> (*.........)
f31 -> (*..........)

Aha! These are prime numbers. (*) is the first prime, (*.) is the second, (*..) is the third, and so forth, up to (*..........) being the eleventh prime. If our hypothesis is correct, . is a zero and * is a one.

Our special “exception”–two–fits, it’s an exception in number theory as well: Two is the only even prime number. This discovery is encouraging, let’s observe something else:

Each prime is indicated by a one in its position and a zero in the previous positions. We know something like this. Our standard numerical notation (e.g. base ten) uses positions. A number in a particular position indicates how much to multiply the position’s value, and we sum all the values together.

Maybe this uses the same method, but with primes instead of powers of a base like ten? Let’s check it out. If that were the case, then given (*.) for three and (*) for two, we would expect (**) to be five (“three plus two”). But no, it’s six. Which is three times two.

Let’s try another. (*...) is seven, and (*.) is three. (*.*.) is 21, seven times three, not ten. It looks like this is a multiplicative scheme. And if we check all the numbers that don’t have nested parentheses, that’s exactly what we have.

This all works out, even (***) for thirty (five times three times two). But what about nested parentheses? Well, four appears to be ((*)) if our hypothesis is correct. That would be two times two, and there’s no way to derive four from multiple primes.

So ((*)) must be some way of multiplying two by itself. We know that, it’s two to the power of two. If we stare at it a bit, we see that one is *, and two is (*), so that’s a little like saying that two is (1). So if we look at ((1)), we can take the inner (1) and turn it into two: (2), which is like saying two times two. So when we have nested parentheses, we are substituting a parenthesized expression anywhere a dot or asterisk could go.

This explains nine, ((*).), and twenty-five, ((*)..). But how about eight? That’s ((*.)), which is like (3). Obviously we can’t say that (3) means multiplying two by three. It must mean raising two to the third power.

And now the whole thing is bare.

prime factorization

This notation expresses the numbers zero and one as special cases. Everything larger uses the parentheses to represent numbers as their prime factorization. For example, twenty-eight is seven to the power of one times two to the power of two (7ꜛ1 ⨉ 2ꜛ2). Seven is (*...), two is (*), and thus twenty-eight is (*..(*)).

Each position is the exponent for that prime, also called its multiplicity. It looks a little weird because everything is smooshed together, but if we use Lisp’s s-exprs, it’s easier to see how it works when an exponent is itself an expression:

f28 -> (* . . (*))

This representation is recursive, so if (*..(*)) is 28, then ((*..(*)).) is 3ꜛ28, or 22,876,792,454,961. Likewise, consider:

f2 -> (*)
f4 -> ((*))
f16 -> (((*)))
f65536 -> ((((*))))

...

This is 2ꜛ1, 2ꜛ2ꜛ1, 2ꜛ2ꜛ2ꜛ1, 2ꜛ2ꜛ2ꜛ2ꜛ1 and so forth ad infinitum.
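
To make the recursive reading concrete, here is a minimal sketch of a decoder. It assumes the rules exactly as described above: . is zero, * is one, and a parenthesized list holds exponents for the primes in descending order, ending with the exponent of two. The names decode and firstPrimes are hypothetical, and BigInt is used only so that large values such as 3ꜛ28 come out exactly.

```javascript
// Returns the first `count` primes in ascending order, by trial division.
const firstPrimes = count => {
  const found = [];
  for (let n = 2; found.length < count; ++n) {
    if (found.every(p => n % p !== 0)) found.push(n);
  }
  return found;
};

// Decodes a string in the notation into a BigInt.
const decode = notation => {
  let pos = 0;

  const parse = () => {
    const ch = notation[pos++];
    if (ch === '.') return 0n;
    if (ch === '*') return 1n;
    if (ch !== '(') throw new SyntaxError(`unexpected input at position ${pos - 1}`);

    // Collect one exponent per position until the closing parenthesis.
    const exponents = [];
    while (notation[pos] !== ')') {
      if (pos >= notation.length) throw new SyntaxError('missing )');
      exponents.push(parse());
    }
    pos++; // consume ')'

    // The leftmost position belongs to the largest prime, the rightmost to two.
    const primes = firstPrimes(exponents.length).reverse();
    return exponents.reduce(
      (product, exponent, i) => product * BigInt(primes[i]) ** exponent,
      1n
    );
  };

  const result = parse();
  if (pos !== notation.length) throw new SyntaxError('trailing input');
  return result;
};

console.log(decode('(*..(*))'));    //=> 28n
console.log(decode('((*..(*)).)')); //=> 22876792454961n
console.log(decode('((((*))))'));   //=> 65536n
```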

Many of the numeric properties derived from factorizing numbers are obvious from direct inspection of this notation. For example:

  • Prime numbers have just one prime factor raised to the power of one, thus they all have the form (* followed by zero or more .s followed by ).
  • Composite numbers have any other form beginning with ( and ending with ).
  • Odd numbers greater than one do not have two as a factor, so they must end with .).
  • Even numbers greater than zero have two as a factor, so they must end with *) or )).
  • A semiprime is a composite number consisting of two primes multiplied by each other or one prime squared. Thus, it is either:
    1. ((*) followed by zero or more .s, followed by ), or;
    2. (*, followed by zero or more .s, followed by *, followed by zero or more .s, followed by )
  • A square number has even multiplicity for all prime factors. * is a square, and also every number of the form (...) where each position is either a . or a parenthesized representation of an even number (see above).
  • A powerful number has multiplicity above one for every prime factor, therefore a powerful number is represented as a (, followed by either .s or parenthesized expressions, followed by ).
  • A square-free number is represented as a (*, followed by zero or more .s or *s, followed by ).
  • A prime power is represented as a (, either a * or a parenthesized expression, followed by zero or more .s, followed by ).

There are many more. What they all have in common is that they can be determined with fairly simple pattern matching from this representation.
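
As a rough illustration, a few of these properties can be written as regular expressions and checked against the first thirty-two elements. This is a sketch rather than a complete treatment: the pattern names and the matching helper are hypothetical, and the patterns are direct transcriptions of the prime, semiprime, and square-free descriptions above.

```javascript
const s = ['.', '*', '(*)', '(*.)', '((*))', '(*..)',
  '(**)', '(*...)', '((*.))', '((*).)', '(*.*)', '(*....)',
  '(*(*))', '(*.....)', '(*..*)', '(**.)', '(((*)))',
  '(*......)', '((*)*)', '(*.......)', '(*.(*))',
  '(*.*.)', '(*...*)', '(*........)', '(*(*.))', '((*)..)',
  '(*....*)', '((*.).)', '(*..(*))', '(*.........)',
  '(***)', '(*..........)'];

// (* followed by zero or more dots: one prime, raised to the power of one.
const isPrime = /^\(\*\.*\)$/;

// Either one prime squared, or exactly two distinct primes each to the power of one.
const isSemiprime = /^\(\(\*\)\.*\)$|^\(\*\.*\*\.*\)$/;

// Every position is a dot or an asterisk, so no exponent exceeds one.
const isSquareFree = /^\(\*[.*]*\)$/;

// Lists the numbers whose representation matches a pattern.
const matching = pattern =>
  s.flatMap((notation, n) => pattern.test(notation) ? [n] : []);

console.log(matching(isPrime));
  //=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]

console.log(matching(isSemiprime));
  //=> [4, 6, 9, 10, 14, 15, 21, 22, 25, 26]

console.log(matching(isSquareFree));
  //=> [2, 3, 5, 6, 7, 10, 11, 13, 14, 15, 17, 19, 21, 22, 23, 26, 29, 30, 31]
```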

closing thought

Representations are celebrated for what they make easy. As we saw above, this notation makes all sorts of questions based on factorization easy. And it is much more compact than our base-n representation, the built-in exponentiation scales, well, exponentially.

However, to be useful as a general-purpose representation, it would have to be easy to work with for routine tasks like addition and subtraction. And while converting from this representation seems straightforward, requiring only multiplication and exponentiation, converting to this representation is one of the hardest problems in number theory!

A fun exercise is to write a program to generate the sequence. A FizzBuzz with a side of curiosity, so to speak.
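
For readers who would like something to compare their own attempt against, here is one naive sketch of such a generator. It is not the program linked from this post: it uses ordinary integers and trial division, and the names isPrime and encode are hypothetical.

```javascript
// Trial division is slow, but adequate for small numbers.
const isPrime = n => {
  for (let d = 2; d * d <= n; ++d) {
    if (n % d === 0) return false;
  }
  return n > 1;
};

// Encodes a non-negative integer in the notation described above.
const encode = n => {
  if (n === 0) return '.';
  if (n === 1) return '*';

  // exponents[k] is the multiplicity of the k-th prime, in ascending order.
  const exponents = [];
  let remaining = n;

  for (let p = 2; remaining > 1; ++p) {
    if (!isPrime(p)) continue;

    let exponent = 0;
    while (remaining % p === 0) {
      remaining /= p;
      ++exponent;
    }
    exponents.push(exponent);
  }

  // The notation lists the largest prime first, and each exponent
  // is itself encoded recursively.
  return '(' + exponents.reverse().map(encode).join('') + ')';
};

for (let i = 0; i < 8; i = i + 1)
  console.log(`f${i} -> ` + encode(i));

encode(28)
  //=> "(*..(*))"

encode(65536)
  //=> "((((*))))"
```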

(discuss on hacker news)


Children at Hiawatha Playfield, 1912 Item 29278, Don Sherwood Parks History Collection (Record Series 5801-01), Seattle Municipal Archives.

author’s afterword

Earlier in this essay, I touched on the problem with using questions like this to test “intelligence.” The crux of the argument was that besides testing for intelligence, it also tests for exposure to the tooling.

The conclusion is that we really shouldn’t draw conclusions about someone’s intelligence, much less fitness for programming, from their ability to solve a problem like this in the context of a job interview.

A similar dynamic is in play when we compare someone who has seen the problem to someone who hasn’t. If Alice is posing the problem to Bob, Alice can easily appear to be smarter than Bob!

It seems silly when I write it out like this, obviously Alice posing the problem to Bob doesn’t mean Alice is smarter or more capable than Bob. But all too often, this is the exact dynamic in job interviews. It’s easy for interviewers to arrogantly fixate on an interviewee’s struggle as evidence of them being unable to come up with the “obvious” solution.

And the interviewee can contract a bad case of intimidation, feeling they are not smart enough or good enough to work in a place full of smart people who do nothing but solve math problems for fun.

This generalizes to all interview problems, whether mathematical or not. Never assume that struggling with a problem implies that the interviewee must not be as smart as the interviewer, who has the advantage of having studied the problem at leisure.

And while you’re thinking about that, ask yourself this question: If Donovan writes a blog post about math, or programming, or anything at all, and Carol finds it unfamiliar, should she presume that Donovan is smarter and more experienced than she is?

No, for the same reasons. Writing a blog post is evidence that Donovan carefully selected something he felt he knew, and then spent an undetermined amount of time writing, researching and polishing his words.

Carol, reading it extemporaneously, should not worry that she is in any way less intelligent or even less experienced than Donovan. Writing a blog post is a scenario where the author picks the problem and the tools necessary to solve the problem.

It is not evidence of anything other than an enthusiasm for sharing, and I encourage everyone to enjoy blog posts in that spirit. Please do not worry that you may be less gifted or unworthy in any way.


an example program to generate the sequence without using integers or arrays
notes
  1. There’s another conjecture that organic exposure to mathematics is strongly correlated with programming ability. The archetype from my generation is the nerd who subscribed to Scientific American just for Martin Gardner’s “Mathematical Recreations” column, and who reads Raymond Smullyan for fun. This may or may not be a reasonable conjecture, but modern thought is that while it may have some positive signal, it has many false negatives. Another, even more glaring flaw is that when there are financial incentives for pretending to have organic exposure to mathematics, people will fake this by purchasing entire books devoted to learning how to solve math problems, just to pass job interviews. In which case, you are testing someone’s ability to cram for exams, which is not the same thing at all, and may end up excluding someone who chose to read about combinatorial logic instead of solving sequence problems. 

https://raganwald.com/2017/06/04/sequences
What's a Transducer?

a matrix dream

In Using iterators to write highly composeable code, we saw that the staged approach to data transformation is decomposed but duplicates the entire data set, whereas the single pass approach is more efficient but its code is entangled and monolithic.

Now we’re going to look at an interesting approach for building composeable pipelines of transformations without incurring a memory penalty: transducers.

Let’s start with a review of reducing (a/k/a “folding”):


reducers

A reducer is a function that takes an accumulation and a value, and folds the value into the accumulation. For example, if [1, 2, 3] is an accumulation, and 4 is a value, (acc, val) => acc.concat([val]) is a reducer that returns [1, 2, 3, 4]:

const acc = [1, 2, 3];
const val = 4;
const reducer = (acc, val) => acc.concat([val]);

reducer(acc, val)
  //=> [1, 2, 3, 4]

(acc, val) => acc.concat([val]) is a reducer that returns the catenation of a list and a value.

Likewise, (acc, val) => acc.add(val) is a reducer that .adds a value to an accumulation. It works for any object that has a .add method and returns itself from .add, like Set.prototype.add:

const acc = new Set([1, 2, 3]);
const val = 4;
const reducer = (acc, val) => acc.add(val);

reducer(acc, val)
  //=> Set{1, 2, 3, 4}

Here is a function that makes an array out of any iterable using our catenation reducer:

const toArray = iterable => {
  const reducer = (acc, val) => acc.concat([val]);
  const seed = [];
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = reducer(accumulation, value);
  }

  return accumulation;
}

toArray([1, 2, 3])
  //=> [1, 2, 3]

We can extract our reducer and seed variables as parameters to create a reduction function:

const reduce = (iterable, reducer, seed) => {
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = reducer(accumulation, value);
  }

  return accumulation;
}

reduce([1, 2, 3], (acc, val) => acc.concat([val]), [])
  //=> [1, 2, 3]

Thankfully, JavaScript is evolving towards a convention of writing functions like reduce to take the reducer first. In JavaScript Allongé-style nomenclature, we can write:

const reduceWith = (reducer, seed, iterable) => {
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = reducer(accumulation, value);
  }

  return accumulation;
}

reduce([1, 2, 3], (acc, val) => acc.concat([val]), [])
  //=> [1, 2, 3]

// becomes:

reduceWith((acc, val) => acc.concat([val]), [], [1, 2, 3])
  //=> [1, 2, 3]

In JavaScript, arrays have a .reduce method built in, and when given a seed it behaves just like our reduce or reduceWith functions:

[1, 2, 3].reduce((acc, val) => acc.concat([val]), [])
  //=> [1, 2, 3]

Now, (acc, val) => acc.concat([val]) makes a lot of excess copies of things, and in JavaScript, we can substitute (acc, val) => { acc.push(val); return acc; }.1

Either way, what we get is a reducer that accumulates values into an array. Let’s give it a name:

const arrayOf = (acc, val) => { acc.push(val); return acc; };

reduceWith(arrayOf, [], [1, 2, 3])
  //=> [1, 2, 3]

Here’s yet another reducer:

const sumOf = (acc, val) => acc + val;

reduceWith(sumOf, 0, [1, 2, 3])
  //=> 6

We can write reducers that reduce an iterable of one type (such as an array) into another type (such as a number).


decorating reducers

JavaScript makes it easy to write functions that return functions. Here’s a function that makes a reducer for us:

const joinedWith =
  separator =>
    (acc, val) =>
      acc === '' ? val : `${acc}${separator}${val}`;

reduceWith(joinedWith(', '), '', [1, 2, 3])
  //=> "1, 2, 3"

reduceWith(joinedWith('.'), '', [1, 2, 3])
  //=> "1.2.3"

JavaScript also makes it easy to write functions that take functions as arguments.

Decorators are JavaScript functions that take a function as an argument and return another function that is semantically related to its argument. For example, this function takes a binary function and decorates it by adding one to its second input:

const incrementSecondArgument =
  binaryFn =>
    (x, y) => binaryFn(x, y + 1);

const power =
  (base, exponent) => base ** exponent;

const higherPower = incrementSecondArgument(power);

power(2, 3)
  //=> 8

higherPower(2, 3)
  //=> 16

higherPower is power, decorated to add one to its exponent. Thus, higherPower(2, 3) produces the same result as power(2, 4). We have been working with binary functions already, of course. Reducers are binary functions. Can we decorate them? Yes!

reduceWith(incrementSecondArgument(arrayOf), [], [1, 2, 3])
  //=> [2, 3, 4]

const incremented =
  iterable =>
    reduceWith(incrementSecondArgument(arrayOf), [], iterable);

incremented([1, 2, 3])
  //=> [2, 3, 4]

mappers

We have produced a mapper, a function that takes an iterable and returns a mapping from the iterable’s values to the incremented iterable’s values. We map values all the time in JavaScript, but of course we want to do more than just increment. Let’s take another look at incrementSecondArgument:

const incrementSecondArgument =
  binaryFn =>
    (x, y) => binaryFn(x, y + 1);

Since we’re using it to decorate reducers, let’s give it some more relevant names:

const incrementValue =
  reducer =>
    (acc, val) => reducer(acc, val + 1);

Now we see at a glance that incrementValue takes a reducer as an argument and returns a reducer that increments its value before reducing it further. We can extract the “incrementing” logic into a parameter:

const map =
  fn =>
    reducer =>
      (acc, val) => reducer(acc, fn(val));

const incrementValue = map(x => x + 1);

reduceWith(incrementValue(arrayOf), [], [1, 2, 3])
  //=> [2, 3, 4]

Although it looks unfamiliar to people not used to the idea of a function taking a function as an argument and returning a function that takes a function as an argument, we can write map(x => x + 1) anywhere we can write incrementValue, therefore we can write:

reduceWith(map(x => x + 1)(arrayOf), [], [1, 2, 3])
  //=> [2, 3, 4]

And because our map decorator can decorate any reducer, we can also join the increments of the numbers from one to three into a string or sum them:

reduceWith(map(x => x + 1)(joinedWith('.')), '', [1, 2, 3])
  //=> "2.3.4"

reduceWith(map(x => x + 1)(sumOf), 0, [1, 2, 3])
  //=> 9

Armed with all we’ve seen so far, what is the sum of the squares of the numbers from one to ten?

const squares = map(x => power(x, 2));
const one2ten = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

reduceWith(squares(sumOf), 0, one2ten)
  //=> 385

Pythagoras Tree


filters

Let’s go back to our first reducer:

const arrayOf = (acc, val) => { acc.push(val); return acc; };

reduceWith(arrayOf, [], one2ten)
  //=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

What if we want an array of just the numbers greater than five? Easily done:

const bigUns = (acc, val) => {
  if (val > 5 ) {
    acc.push(val);
  }
  return acc;
};

reduceWith(bigUns, [], one2ten)
  //=> [6, 7, 8, 9, 10]

Naturally, we can combine what we already have to produce an array of the squares of the numbers greater than five:

reduceWith(squares(bigUns), [], one2ten)
  //=> [9, 16, 25, 36, 49, 64, 81, 100]

This is not what we wanted! We have the squares that are greater than five, rather than the squares of the numbers that are greater than five. We want to do the selecting of numbers before we do the squaring, not after. This is easily done, and the insight is that what we want is a decorator that selects numbers, and we can use that to decorate the reducer:

reduceWith(squares(arrayOf), [], one2ten)
  //=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

const bigUnsOf =
  reducer =>
    (acc, val) =>
      (val > 5) ? reducer(acc, val) : acc;

reduceWith(bigUnsOf(squares(arrayOf)), [], one2ten)
  //=> [36, 49, 64, 81, 100]

bigUnsOf is rather specific. Just as we did with map, let’s extract the predicate function:

reduceWith(squares(arrayOf), [], one2ten)
  //=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

const filter =
  fn =>
    reducer =>
      (acc, val) =>
        fn(val) ? reducer(acc, val) : acc;

reduceWith(filter(x => x > 5)(squares(arrayOf)), [], one2ten)
  //=> [36, 49, 64, 81, 100]

We can make all kinds of filters, and name them if we want. Or not:

reduceWith(filter(x => x % 2 === 1)(arrayOf), [], one2ten)
  //=> [1, 3, 5, 7, 9]

With all this in hand, the sum of the squares of the odd numbers from one to ten is:

reduceWith(filter(x => x % 2 === 1)(squares(sumOf)), 0, one2ten)
  //=> 165

“transformers” and composition

Denizens of other programming communities have a word for a function that takes an argument and transforms it into something else: They call such functions “transformers.” What we call decorators are a special case of transformers, and so if someone talks about a “transformer” function that transforms a reducer into another reducer, we know that they are talking about the same thing as when we talk about a function that “decorates” a reducer with some additional functionality such as mapping or filtering.

The mappers and filters we have discussed so far are transformers. Within the context of this programming pattern, an essential characteristic of transformers is that they compose to produce a new transformer. As a refresher, here’s a function that composes any two functions:

const plusFive = x => x + 5;
const divideByTwo = x => x / 2;

plusFive(3)
  //=> 8

divideByTwo(8)
  //=> 4

const compose2 =
  (a, b) =>
    (...c) =>
      a(b(...c));

const plusFiveDividedByTwo = compose2(divideByTwo, plusFive);

plusFiveDividedByTwo(3)
  //=> 4

What does it mean to say that transformers compose to make a new transformer? Just that if we compose2 any two transformers, we get a new transformer that transforms a reducer. Thus:

const squaresOfTheOddNumbers = compose2(
  filter(x => x % 2 === 1),
  squares
);

reduceWith(squaresOfTheOddNumbers(sumOf), 0, one2ten)
  //=> 165

squaresOfTheOddNumbers is a transformer we created by composing a filter with a mapper.

Being able to compose decorators lets us decompose complex and highly coupled code into smaller units with a single responsibility that we can name if we choose.


composition with transformers

Now that we know how to compose2, what if we want to compose an arbitrary number of functions? There’s a reduction for that!

Let’s start by rewriting compose2 as a reducer, compositionOf:

const compositionOf = (acc, val) => (...args) => val(acc(...args));

Now we can write compose as a reduction of its arguments:

const compose = (...fns) =>
  reduceWith(compositionOf, x => x, fns);
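
It may help to check the order in which this compose applies its functions. The sketch below repeats reduceWith, compositionOf, and compose as written above, borrows plusFive and divideByTwo from the earlier refresher, and adds a hypothetical square function. Tracing the reduction by hand, each function wraps the accumulated one, so the functions run left to right, whereas compose2(a, b) runs b first.

```javascript
const reduceWith = (reducer, seed, iterable) => {
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = reducer(accumulation, value);
  }

  return accumulation;
};

const compositionOf = (acc, val) => (...args) => val(acc(...args));

const compose = (...fns) =>
  reduceWith(compositionOf, x => x, fns);

const plusFive = x => x + 5;
const divideByTwo = x => x / 2;
const square = x => x * x; // hypothetical, for illustration only

// plusFive runs first, then divideByTwo.
const plusFiveThenHalved = compose(plusFive, divideByTwo)(3);
console.log(plusFiveThenHalved);
  //=> 4

// Reversing the arguments reverses the order of application.
const halvedThenPlusFive = compose(divideByTwo, plusFive)(3);
console.log(halvedThenPlusFive);
  //=> 6.5

// Any number of functions can be supplied.
const threeSteps = compose(plusFive, divideByTwo, square)(3);
console.log(threeSteps);
  //=> 16

// With no functions at all, we get the seed back: the identity function.
const untouched = compose()(42);
console.log(untouched);
  //=> 42
```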

so what’s a transducer?

Given reductions written in this style:

reduceWith(squaresOfTheOddNumbers(sumOf), 0, one2ten)

We can note that we have four separate elements: A transformer for the reducer (which may be a composition of transformers), the reducer itself, a seed, and an iterable. If we tease these into separate parameters, we get:2

const transduce = (transformer, reducer, seed, iterable) => {
  const transformedReducer = transformer(reducer);
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = transformedReducer(accumulation, value);
  }

  return accumulation;
}

transduce(squaresOfTheOddNumbers, sumOf, 0, one2ten)
  //=> 165

And there you have it: A reducer is the kind of function you’d pass to .reduce—it takes an accumulated result and a new input, and returns a new accumulated result. A transformer is a function that transforms a reducer into another reducer. And a transducer (“transformer” plus “reducer,” get it?) is a function that takes a transformer, a reducer, a seed, and an iterable and reduces it to a value.

The elegance of the transducer pattern is that transformers compose naturally to produce new transformers. So we can chain as many transformers together as we like, and since we end up with one transformed reducer, we only iterate over the collection once. We don’t need to create intermediate copies of the data or iterate over it multiple times.
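
One way to convince ourselves of the single pass is to instrument the source. The sketch below is an illustration under assumptions: countedRange and pulls are hypothetical names introduced only to count how many values are drawn from the iterable, and the transformers are composed by hand so that the order is explicit.

```javascript
const map = fn => reducer => (acc, val) => reducer(acc, fn(val));

const filter = fn => reducer => (acc, val) =>
  fn(val) ? reducer(acc, val) : acc;

const arrayOf = (acc, val) => { acc.push(val); return acc; };

const transduce = (transformer, reducer, seed, iterable) => {
  const transformedReducer = transformer(reducer);
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = transformedReducer(accumulation, value);
  }

  return accumulation;
};

// Hypothetical instrumented source: counts every value pulled from it.
let pulls = 0;

function * countedRange (from, to) {
  for (let i = from; i <= to; ++i) {
    ++pulls;
    yield i;
  }
}

const oddsOnly = filter(x => x % 2 === 1);
const squared = map(x => x * x);

// oddsOnly wraps squared, so each value is filtered first and squared second.
const squaresOfTheOdds = reducer => oddsOnly(squared(reducer));

const result = transduce(squaresOfTheOdds, arrayOf, [], countedRange(1, 10));

result
  //=> [1, 9, 25, 49, 81]

pulls
  //=> 10
```

Ten values go in, each is pulled exactly once, and the only array that is ever built is the result.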

Transducers come to us from the Clojure programming community, but as you can see they “cut with JavaScript’s grain” and are a natural fit for what JavaScript makes easy.

So, if someone asks us what a “transducer” is, we can now reply:

What's the problem?


afterward

The code we’ve written to explore transducers is quite compact and elegant:

const arrayOf = (acc, val) => { acc.push(val); return acc; };

const sumOf = (acc, val) => acc + val;

const setOf = (acc, val) => acc.add(val);

const map =
  fn =>
    reducer =>
      (acc, val) => reducer(acc, fn(val));

const filter =
  fn =>
    reducer =>
      (acc, val) =>
        fn(val) ? reducer(acc, val) : acc;

const compose = (...fns) =>
  fns.reduce((acc, val) => (...args) => val(acc(...args)), x => x);

const transduce = (transformer, reducer, seed, iterable) => {
  const transformedReducer = transformer(reducer);
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = transformedReducer(accumulation, value);
  }

  return accumulation;
}

It covers all of the cases that we would currently use .map, .filter, and .reduce for with arrays, and the composable transducers don’t make multiple copies of the data set. Transducers as developed for production code bases cover more use cases, such as replicating the functionality of .find.

Another case libraries cover is this: Our transduce function assumes that the collection is iterable, and it demands that we provide the seed and reducer functions. In most cases, the seed and reducer functions are the same for all collections of the same type.

OOP has solved this problem with polymorphism, of course. Collections have methods, so if you invoke the right method, you get the right thing back. Production-class libraries provide an interface for collection types to operate gracefully with transducers.
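
To make the idea concrete, here is a rough sketch of a convenience wrapper that picks the reducer from the kind of seed it is given. It uses plain type checks rather than polymorphic methods, and the names reducerFor and transduceInto are hypothetical: this is not the interface of any particular library, just one way the gap could be papered over.

```javascript
const map = fn => reducer => (acc, val) => reducer(acc, fn(val));

const transduce = (transformer, reducer, seed, iterable) => {
  const transformedReducer = transformer(reducer);
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = transformedReducer(accumulation, value);
  }

  return accumulation;
};

// Hypothetical helper: choose a reducer from the type of the seed.
const reducerFor = seed => {
  if (Array.isArray(seed)) return (acc, val) => { acc.push(val); return acc; };
  if (seed instanceof Set) return (acc, val) => acc.add(val);
  if (typeof seed === 'number') return (acc, val) => acc + val;
  throw new TypeError('no default reducer for this kind of seed');
};

// Hypothetical wrapper: the caller supplies only the seed.
const transduceInto = (seed, transformer, iterable) =>
  transduce(transformer, reducerFor(seed), seed, iterable);

const lengths = map(word => word.length);

transduceInto([], lengths, ['one', 'three', 'five'])
  //=> [3, 5, 4]

transduceInto(new Set(), lengths, ['one', 'two', 'three'])
  //=> Set { 3, 5 }

transduceInto(0, lengths, ['one', 'two', 'three'])
  //=> 11
```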

But this is enough to grasp the pattern behind transducers, and once again to embrace the elegant possibilities when a language provides functions as first-class values.


the transducer approach to tracking user transitions

(see Using iterators to write highly composeable code for context.)

const logContents = `1a2ddc2, 5f2b932
f1a543f, 5890595
3abe124, bd11537
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, bd11537
1a2ddc2, 5890595
3abe124, 5f2b932
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, 5f2b932
1a2ddc2, bd11537
1a2ddc2, 5890595`;

const asStream = function * (iterable) { yield * iterable; };

const lines = str => str.split('\n');
const streamOfLines = asStream(lines(logContents));

const datums = str => str.split(', ');
const datumize = map(datums);

const userKey = ([user, _]) => user;

const pairMaker = () => {
  let wip = [];

  return reducer =>
    (acc, val) => {
      wip.push(val);

      if (wip.length === 2) {
        const pair = wip;
        wip = wip.slice(1);
        return reducer(acc, pair);
      } else {
        return acc;
      }
    };
}

const sortedTransformation =
  (xfMaker, keyFn) => {
    const decoratedReducersByKey = new Map();

    return reducer =>
      (acc, val) => {
        const key = keyFn(val);
        let decoratedReducer;

        if (decoratedReducersByKey.has(key)) {
          decoratedReducer = decoratedReducersByKey.get(key);
        } else {
          decoratedReducer = xfMaker()(reducer);
          decoratedReducersByKey.set(key, decoratedReducer);
        }

        return decoratedReducer(acc, val);
      }
  }

const userTransitions = sortedTransformation(pairMaker, userKey);

const justLocations = map(([[u1, l1], [u2, l2]]) => [l1, l2]);

const stringify = map(transition => transition.join(' -> '));

const transitionKeys = compose(
  stringify, justLocations, userTransitions, datumize
);

const countsOf =
  (acc, val) => {
    if (acc.has(val)) {
      acc.set(val, 1 + acc.get(val));
    } else {
      acc.set(val, 1);
    }
    return acc;
  }

const greatestValue = inMap =>
  Array.from(inMap.entries()).reduce(
    ([wasKeys, wasCount], [transitionKey, count]) => {
      if (count < wasCount) {
        return [wasKeys, wasCount];
      } else if (count > wasCount) {
        return [new Set([transitionKey]), count];
      } else {
        wasKeys.add(transitionKey);
        return [wasKeys, wasCount];
      }
    }
    , [new Set(), 0]
  );

greatestValue(
  transduce(transitionKeys, countsOf, new Map(), streamOfLines)
)
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4
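
The stateful piece of this solution is pairMaker, and it can be exercised on its own. The following is a small sketch using made-up inputs. It also shows why sortedTransformation is handed a maker rather than a ready-made transformer: a single instance keeps its state between runs.

```javascript
const arrayOf = (acc, val) => { acc.push(val); return acc; };

const transduce = (transformer, reducer, seed, iterable) => {
  const transformedReducer = transformer(reducer);
  let accumulation = seed;

  for (const value of iterable) {
    accumulation = transformedReducer(accumulation, value);
  }

  return accumulation;
};

const pairMaker = () => {
  let wip = [];

  return reducer =>
    (acc, val) => {
      wip.push(val);

      if (wip.length === 2) {
        const pair = wip;
        wip = wip.slice(1);
        return reducer(acc, pair);
      } else {
        return acc;
      }
    };
};

const freshPairs = transduce(pairMaker(), arrayOf, [], ['a', 'b', 'c', 'd']);

freshPairs
  //=> [['a', 'b'], ['b', 'c'], ['c', 'd']]

// Reusing one instance carries its state from one run into the next.
const shared = pairMaker();
const firstRun = transduce(shared, arrayOf, [], [1, 2, 3]);
const secondRun = transduce(shared, arrayOf, [], [4, 5]);

firstRun
  //=> [[1, 2], [2, 3]]

secondRun
  //=> [[3, 4], [4, 5]]
```

The pair [3, 4] in the second run straddles the two inputs, which is exactly the kind of leak we avoid by calling the maker once per user.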

notes
  1. (acc, val) => (acc.push(val), acc) is more pleasing semantically, but the comma operator is confusing to those who haven’t seen its regular use, and usually best avoided in production code. 

  2. In some programming communities, there is a strong sense of conservation with respect to characters, so transformer is abbreviated to xform or even xf. Don’t be surprised if you see writing like (xf, reduce, seed, coll), or xf((val, acc) => acc) -> (val, acc) => acc. We’re not going to do that here, but we have no problem with a name like xf or xform in production code.

https://raganwald.com/2017/04/30/transducers
Having our cake and eating it too: "Using iterators to write highly composeable code"

Network

Consider this problem: We have a hypothetical startup that, like so many other unimaginative clones of each other, provides some marginal benefit in exchange for tracking user locations. We want to mine that location data.

For the purposes of this brief blog post, we might have a file that looks like this:

1a2ddc2, 5f2b932
f1a543f, 5890595
3abe124, bd11537
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, bd11537
1a2ddc2, 5890595
3abe124, 5f2b932
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, 5f2b932
1a2ddc2, bd11537
1a2ddc2, 5890595

...

The first column is a pseudo-anonymous hash identifying a user. The second is a pseudo-anonymous hash representing a location. If we eyeball these 15 lines, we can see that user 1a2ddc2 visited 5f2b932, bd11537, 5890595, 5f2b932, bd11537, then 5890595. Meanwhile, user f1a543f visited 5890595, 5f2b932, bd11537, 5890595, 5f2b932, bd11537, and then 5890595. And so forth.

Let’s say we’re interested in learning where people tend to go. We are looking for the most popular transitions. So given that user 1a2ddc2 visited 5f2b932, bd11537, 5890595, 5f2b932, bd11537, then 5890595, we count the transitions as:

  • 5f2b932 -> bd11537
  • bd11537 -> 5890595
  • 5890595 -> 5f2b932
  • 5f2b932 -> bd11537
  • bd11537 -> 5890595

Notice that we have to track the locations by user in order to get the correct transitions. Next, we’re interested in the most popular transitions, so we’ll count them:

  • 5f2b932 -> bd11537 appears twice
  • bd11537 -> 5890595 also appears twice
  • 5890595 -> 5f2b932 only appears once

Now all we have to do is count all the transitions across all users, and report the most popular transition. We’ll look at three different approaches:

  1. The staged approach
  2. The single pass approach
  3. The stream approach

Highway 401 and the DVP

The staged approach

The most obvious thing to do is to write this as a series of transformations on the data. We’ve already seen one: Given the initial data, let’s get a list of locations for each user.

We can read the data from a file line-by-line, but to make it easy to follow along in a browser, let’s pretend our file is actually a multiline string. So the first thing is to convert it to an array:

const logContents = `1a2ddc2, 5f2b932
f1a543f, 5890595
3abe124, bd11537
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, bd11537
1a2ddc2, 5890595
3abe124, 5f2b932
f1a543f, 5f2b932
f1a543f, bd11537
f1a543f, 5890595
1a2ddc2, 5f2b932
1a2ddc2, bd11537
1a2ddc2, 5890595`;

const lines = str => str.split('\n');
const logLines = lines(logContents);

const datums = str => str.split(', ');
const datumize = arr => arr.map(datums);

const data = datumize(logLines);
  //=>
    [["1a2ddc2", "5f2b932"],
     ["f1a543f", "5890595"],
     ["3abe124", "bd11537"],
     ["f1a543f", "5f2b932"],
     ["f1a543f", "bd11537"],
     ["f1a543f", "5890595"],
     ["1a2ddc2", "bd11537"],
     ["1a2ddc2", "5890595"],
     ["3abe124", "5f2b932"],
     ["f1a543f", "5f2b932"],
     ["f1a543f", "bd11537"],
     ["f1a543f", "5890595"],
     ["1a2ddc2", "5f2b932"],
     ["1a2ddc2", "bd11537"],
     ["1a2ddc2", "5890595"]]

Next we convert these to lists of locations grouped by user. We’ll create a map:

const listize = arr => arr.reduce(
  (map, [user, location]) => {
    if (map.has(user)) {
      map.get(user).push(location);
    } else {
      map.set(user, [location]);
    }
    return map;
  }, new Map());

const locationsByUser = listize(data);
  //=>
    Map{
      "1a2ddc2": [
        "5f2b932",
        "bd11537",
        "5890595",
        "5f2b932",
        "bd11537",
        "5890595"
      ],
      "f1a543f": [
        "5890595",
        "5f2b932",
        "bd11537",
        "5890595",
        "5f2b932",
        "bd11537",
        "5890595"
      ],
      "3abe124": [
        "bd11537",
        "5f2b932"
      ]
    }

We’ll convert these to transitions. slicesOf is a handy function for that:

const slicesOf = (sliceSize, array) =>
  Array(array.length - sliceSize + 1).fill().map((_,i) => array.slice(i, i+sliceSize));

const transitions = list => slicesOf(2, list);

const transitionsByUser = Array.from(locationsByUser.entries()).reduce(
  (map, [user, listOfLocations]) => {
    map.set(user, transitions(listOfLocations));
    return map;
  }, new Map());
  //=>
    Map{
      "1a2ddc2": [
          ["5f2b932", "bd11537"],
          ["bd11537", "5890595"],
          ["5890595", "5f2b932"],
          ["5f2b932", "bd11537"],
          ["bd11537", "5890595"]
        ],
      "f1a543f": [
          ["5890595", "5f2b932"],
          ["5f2b932", "bd11537"],
          ["bd11537", "5890595"],
          ["5890595", "5f2b932"],
          ["5f2b932", "bd11537"],
          ["bd11537", "5890595"]
        ],
      "3abe124": [
          ["bd11537", "5f2b932"]
        ]
    }
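
slicesOf is easy to check in isolation. Here is a small sketch with made-up inputs, including the edge case of a list that is too short to contain a slice.

```javascript
const slicesOf = (sliceSize, array) =>
  Array(array.length - sliceSize + 1).fill().map((_, i) => array.slice(i, i + sliceSize));

slicesOf(2, ['a', 'b', 'c', 'd'])
  //=> [['a', 'b'], ['b', 'c'], ['c', 'd']]

slicesOf(3, ['a', 'b', 'c', 'd'])
  //=> [['a', 'b', 'c'], ['b', 'c', 'd']]

// A single location yields no transitions at all.
slicesOf(2, ['a'])
  //=> []
```

One caveat: as written, slicesOf(2, []) asks for an array of length -1 and throws a RangeError. That cannot happen with our data, because listize only creates an entry for a user once it has a location to put in it.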

Before we move on, let’s extract something from transitionsByUser. It does two things: one is computing transitions, the other is applying a function to each of the values in a map. We can separate them:

const mapValues = (fn, inMap) => Array.from(inMap.entries()).reduce(
  (outMap, [key, value]) => {
    outMap.set(key, fn(value));
    return outMap;
  }, new Map());

const transitionsByUser = mapValues(transitions, locationsByUser);

This is very interesting. We can take it a step further, and use partial application. We could write or borrow a leftPartialApply function, but just to show our hardcore JS creds, let’s use .bind:

const mapValues = (fn, inMap) => Array.from(inMap.entries()).reduce(
  (outMap, [key, value]) => {
    outMap.set(key, fn(value));
    return outMap;
  }, new Map());

const transitionize = mapValues.bind(null, transitions);

const transitionsByUser = transitionize(locationsByUser);
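
If .bind used this way is unfamiliar, the sketch below shows that binding the first argument is equivalent to writing a small wrapper by hand. The names countOf, countize, countizeByHand, and visits are made up for the example.

```javascript
const mapValues = (fn, inMap) => Array.from(inMap.entries()).reduce(
  (outMap, [key, value]) => {
    outMap.set(key, fn(value));
    return outMap;
  }, new Map());

const countOf = list => list.length; // hypothetical, for illustration only

// These two definitions behave the same way:
const countizeByHand = inMap => mapValues(countOf, inMap);
const countize = mapValues.bind(null, countOf);

const visits = new Map([
  ['1a2ddc2', ['5f2b932', 'bd11537']],
  ['3abe124', ['bd11537']]
]);

Array.from(countize(visits))
  //=> [['1a2ddc2', 2], ['3abe124', 1]]
```

The first argument to .bind is the value for this, which an arrow function ignores, so null is as good as anything.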

Now we have each step in our process consisting of applying a single function to the return value of the previous function application. But let’s take the next step. We have a mapping from users to their transitions, but we don’t care about the users, just the transitions, so let’s fold them back together:

const reduceValues = (mergeFn, inMap) =>
  Array.from(inMap.entries())
    .map(([key, value]) => value)
      .reduce(mergeFn);

const concatValues = reduceValues.bind(null, (a, b) => a.concat(b));

const allTransitions = concatValues(transitionsByUser);
  //=>
    [
      ["5f2b932", "bd11537"],
      ["bd11537", "5890595"],
      ["5890595", "5f2b932"],
      ["5f2b932", "bd11537"],
      ["bd11537", "5890595"],
      ["5890595", "5f2b932"],
      ["5f2b932", "bd11537"],
      ["bd11537", "5890595"],
      ["5890595", "5f2b932"],
      ["5f2b932", "bd11537"],
      ["bd11537", "5890595"],
      ["bd11537", "5f2b932"]
    ]

Now we want to count the occurrences of each transition. We’ll reduce our new list to a pairing between the highest count and a list of transitions that match. To facilitate this, we’ll turn the arrays for each transition into a string:1

const stringifyTransition = transition => transition.join(' -> ');
const stringifyAllTransitions = arr => arr.map(stringifyTransition);

const stringTransitions = stringifyAllTransitions(allTransitions);
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595",
      "5890595 -> 5f2b932",
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595",
      "5890595 -> 5f2b932",
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595",
      "5890595 -> 5f2b932",
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595",
      "bd11537 -> 5f2b932"
    ]

Now we can count them with ease:

const countTransitions = arr => arr.reduce(
  (transitionsToCounts, transitionKey) => {
    if (transitionsToCounts.has(transitionKey)) {
      transitionsToCounts.set(transitionKey, 1 + transitionsToCounts.get(transitionKey));
    } else {
      transitionsToCounts.set(transitionKey, 1);
    }
    return transitionsToCounts;
  }
  , new Map());

const counts = countTransitions(stringTransitions);
  //=>
    Map{
      "5f2b932 -> bd11537": 4,
      "bd11537 -> 5890595": 4,
      "5890595 -> 5f2b932": 3,
      "bd11537 -> 5f2b932": 1
    }

And which is/are the most common?

const greatestValue = inMap =>
  Array.from(inMap.entries()).reduce(
    ([wasKeys, wasCount], [transitionKey, count]) => {
      if (count < wasCount) {
        return [wasKeys, wasCount];
      } else if (count > wasCount) {
        return [new Set([transitionKey]), count];
      } else {
        wasKeys.add(transitionKey);
        return [wasKeys, wasCount];
      }
    }
    , [new Set(), 0]
  );

greatestValue(counts);
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4

pipelining this solution

One of the nice things about this solution is that it forms a pipeline. A chunk of data moves through the pipeline, being transformed at each stage. Leaving the definitions out, the pipeline is:

const theStagedSolution = logContents =>
  greatestValue(
    countTransitions(
      stringifyAllTransitions(
        concatValues(
          transitionize(
            listize(
              datumize(
                lines(
                  logContents
                )
              )
            )
          )
        )
      )
    )
  );

theStagedSolution(logContents)
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4

We can write this using pipeline:

const pipeline = (...fns) => fns.reduceRight((a, b) => c => a(b(c)));

const theStagedSolution = pipeline(
  lines,
  datumize,
  listize,
  transitionize,
  concatValues,
  stringifyAllTransitions,
  countTransitions,
  greatestValue
);
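
To see the order in which pipeline applies its stages, here is a toy example. The three stage functions are invented for the illustration and have nothing to do with our log data.

```javascript
const pipeline = (...fns) => fns.reduceRight((a, b) => c => a(b(c)));

// Hypothetical stages, for illustration only.
const words = str => str.split(' ');
const lengthsOf = arr => arr.map(word => word.length);
const largest = arr => Math.max(...arr);

const longestWordLength = pipeline(words, lengthsOf, largest);

longestWordLength('having our cake and eating it too')
  //=> 6
```

The stages run in the order they are listed: words first, largest last. Note that because reduceRight is called without an initial value, pipeline() with no functions at all throws a TypeError, and pipeline(fn) simply returns fn.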

And here is the complete staged solution:

const lines = str => str.split('\n');
const logLines = lines(logContents);

const datums = str => str.split(', ');
const datumize = arr => arr.map(datums);

const listize = arr => arr.reduce(
  (map, [user, location]) => {
    if (map.has(user)) {
      map.get(user).push(location);
    } else {
      map.set(user, [location]);
    }
    return map;
  }, new Map());

const slicesOf = (sliceSize, array) =>
  Array(array.length - sliceSize + 1).fill().map((_,i) => array.slice(i, i+sliceSize));
const transitions = list => slicesOf(2, list);

const mapValues = (fn, inMap) => Array.from(inMap.entries()).reduce(
  (outMap, [key, value]) => {
    outMap.set(key, fn(value));
    return outMap;
  }, new Map());

const transitionize = mapValues.bind(null, transitions);

const reduceValues = (mergeFn, inMap) =>
  Array.from(inMap.entries())
    .map(([key, value]) => value)
      .reduce(mergeFn);

const concatValues = reduceValues.bind(null, (a, b) => a.concat(b));

const stringifyTransition = transition => transition.join(' -> ');
const stringifyAllTransitions = arr => arr.map(stringifyTransition);

const countTransitions = arr => arr.reduce(
  (transitionsToCounts, transitionKey) => {
    if (transitionsToCounts.has(transitionKey)) {
      transitionsToCounts.set(transitionKey, 1 + transitionsToCounts.get(transitionKey));
    } else {
      transitionsToCounts.set(transitionKey, 1);
    }
    return transitionsToCounts;
  }
  , new Map());

const greatestValue = inMap =>
  Array.from(inMap.entries()).reduce(
    ([wasKeys, wasCount], [transitionKey, count]) => {
      if (count < wasCount) {
        return [wasKeys, wasCount];
      } else if (count > wasCount) {
        return [new Set([transitionKey]), count];
      } else {
        wasKeys.add(transitionKey);
        return [wasKeys, wasCount];
      }
    }
    , [new Set(), 0]
  );

const pipeline = (...fns) => fns.reduceRight((a, b) => c => a(b(c)));

const theStagedSolution = pipeline(
  lines,
  datumize,
  listize,
  transitionize,
  concatValues,
  stringifyAllTransitions,
  countTransitions,
  greatestValue
);

theStagedSolution(logContents)
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4

The very nice thing is that we have decomposed our solution into a simple pipe that takes some data in at one end, and performs a succession of transformations on it, until what emerges at the other end is the result we want.

Each step can be easily checked and tested, and each step has a well-understood and explicit input, followed by an explicit and well-understood output. There are no side-effects to confuse our reasoning.

But there is a dark side, of course. If we care very deeply about memory, at each step but the last, we construct a data structure of roughly equal size to the input data.

We would use much less memory if we wrote a single fold that had a lot of internal moving parts, but only iterated over the data in one pass. Let’s try it.

Speed

The single pass approach

In production systems, memory and performance can matter greatly, especially for an algorithm that may be analyzing data at scale. We can transform our “staged” solution into a single pass with a bit of care.

Let’s start with a for...of loop. We’ll fill in the obvious bit first:2

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);

  for (const line of logLines) {
    const row = datums(line);
     // ...
  }
  // ...
}

Now we’ll hand-code a reduction to get locations by users:

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);
  const locationsByUser = new Map();

  for (const line of logLines) {
    const [user, location] = datums(line);

    if (locationsByUser.has(user)) {
      const locations = locationsByUser.get(user);
      locations.push(location);
    } else {
      locationsByUser.set(user, [location]);
    }
  }

  // ...
}

What about obtaining transitions from the locations for each user? Strictly speaking, we don’t have to worry about slicing the list if we know that the current set of locations has at least two elements. So we’ll just take a transition for granted, then we’ll discard the oldest location we’ve seen for this user, as it can no longer figure in any future transitions:3

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);
  const locationsByUser = new Map();

  for (const line of logLines) {
    const [user, location] = datums(line);

    if (locationsByUser.has(user)) {
      const locations = locationsByUser.get(user);
      locations.push(location);

      const transition = locations;
      locationsByUser.set(user, locations.slice(1));
    } else {
      locationsByUser.set(user, [location]);
    }
  }

  // ...
}

Folding the transitions per user back into one stream would be sheer simplicity, but we can actually skip it since we have the transition we care about. What’s the next step that matters? Getting a string from the transition:

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);
  const locationsByUser = new Map();

  for (const line of logLines) {
    const [user, location] = datums(line);

    if (locationsByUser.has(user)) {
      const locations = locationsByUser.get(user);
      locations.push(location);

      const transition = locations;
      locationsByUser.set(user, locations.slice(1));

      const transitionKey = stringifyTransition(transition);
    } else {
      locationsByUser.set(user, [location]);
    }
  }

  // ...
}

Now we count them, again performing a reduce by hand:

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);
  const locationsByUser = new Map();
  const transitionsToCounts = new Map();

  for (const line of logLines) {
    const [user, location] = datums(line);

    if (locationsByUser.has(user)) {
      const locations = locationsByUser.get(user);
      locations.push(location);

      const transition = locations;
      locationsByUser.set(user, locations.slice(1));

      const transitionKey = stringifyTransition(transition);
      let count;
      if (transitionsToCounts.has(transitionKey)) {
        count = 1 + transitionsToCounts.get(transitionKey);
      } else {
        count = 1;
      }
      transitionsToCounts.set(transitionKey, count);
    } else {
      locationsByUser.set(user, [location]);
    }
  }

  // ...
}

There is no need to iterate over transitionsToCounts in a separate pass to obtain the highest count. We’ll do that in this pass as well, and wind up with the greatest count and entries:

const theSinglePassSolution = (logContents) => {
  const lines = str => str.split('\n');
  const logLines = lines(logContents);
  const locationsByUser = new Map();
  const transitionsToCounts = new Map();
  let wasKeys = new Set();
  let wasCount = 0;

  for (const line of logLines) {
    const [user, location] = datums(line);

    if (locationsByUser.has(user)) {
      const locations = locationsByUser.get(user);
      locations.push(location);

      const transition = locations;
      locationsByUser.set(user, locations.slice(1));

      const transitionKey = stringifyTransition(transition);
      let count;
      if (transitionsToCounts.has(transitionKey)) {
        count = 1 + transitionsToCounts.get(transitionKey);
      } else {
        count = 1;
      }
      transitionsToCounts.set(transitionKey, count);

      if (count > wasCount) {
        wasKeys = new Set([transitionKey]);
        wasCount = count;
      } else if (count === wasCount) {
        wasKeys.add(transitionKey);
      }
    } else {
      locationsByUser.set(user, [location]);
    }
  }

  return [wasKeys, wasCount];
}

theSinglePassSolution(logContents)
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4

We get the same solution, but with a single pass through the data and requiring working space proportional to the number of users and distinct transitions, not a multiple of the size of the data. But note that although the code now looks somewhat different, it actually does the exact same steps as the staged approach, in the same order.

That’s because we wrote (and debugged!) the pipeline, and then refactored it to a single pass. We did all of the hard reasoning while working with the easier-to-reason-about and factor code, before we wrote the everything-entangled code.

Obviously, there’s a trade-off involved. Maximum readability and easiest to reason about? Or performance? Or is it obvious?

What if we could have it both ways?

Beetle Assembly Line at Volkswagen

The stream approach

Our staged approach sets up a pipeline of functions, each of which has a well-defined input and a well-defined output:

const theStagedSolution = pipeline(
  lines,
  datumize,
  listize,
  transitionize,
  concatValues,
  stringifyAllTransitions,
  countTransitions,
  greatestValue
);

This is an excellent model of computation: it’s decomposed nicely, it’s easy to test, it’s easy to reuse the components, and we get names for things that matter. The drawback is that the inputs and outputs of each function are bundles of data the size of the entire input data.

If this were a car factory, we would have an assembly line, but instead of making one frame at a time in the first stage, then adding one engine at a time in the second stage, and so on, this pipeline makes frames for all the cars at the first stage before passing the frames to have all the engines added at the second, and so forth.

Terrible!

Ideally, an automobile factory passes the cars along one at a time, so that at each station, inputs are arriving continuously and outputs are being passed to the next station continuously. We can do the same thing in JavaScript, but instead of working with lists, we work with iterables.

So instead of starting with a massive string that we split into lines, we would start with an iterator over the lines in the log. This could be a library function that reads a physical file a line at a time, or it could be a series of log lines arriving asynchronously from a service that monitors our servers. For testing purposes, we’ll take our string and wrap it in a little function that returns an iterable over its lines, but won’t let us treat it like a list:

function * asStream (iterable) { yield * iterable; };

const lines = str => str.split('\n');
const streamOfLines = asStream(lines(logContents));

asStream has no functional purpose; it exists merely to constrain us to work with a stream of values rather than with lists.
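
Here is a small sketch of what that constraint looks like in practice, using a made-up three-element list. The names stream, firstPass, and secondPass are introduced only for the illustration.

```javascript
function * asStream (iterable) { yield * iterable; }

const stream = asStream(['a', 'b', 'c']);

// A generator object is not an array and has no length to consult.
Array.isArray(stream)
  //=> false

stream.length
  //=> undefined

// It can be consumed exactly once.
const firstPass = Array.from(stream);
const secondPass = Array.from(stream);

firstPass
  //=> ['a', 'b', 'c']

secondPass
  //=> []
```

Anything we write against streamOfLines therefore has to get its work done in one trip through the values.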

With this in hand, we can follow the same general path that we did with writing a one pass algorithm: We go through our existing staged approach and rewrite each step. Only instead of combining them all into one function, we’ll turn them from ordinary functions into generators, functions that generate streams of values. Let’s get cracking!

Our original staged approach mapped its inputs several times. We can’t call .map on an iterable, so let’s write a convenience function to do it for us:

function * mapIterableWith (mapFn, iterable) {
  for (const value of iterable) {
    yield mapFn(value);
  }
}

const datums = str => str.split(', ');
const datumizeStream = iterable => mapIterableWith(datums, iterable);

Or the equivalent:

const datumizeStream = mapIterableWith.bind(null, datums);

Are you tired of repeating this pattern? Let’s (finally) write a left partial application function:

const leftPartialApply = (fn, ...values) => fn.bind(null, ...values);

const datumizeStream = leftPartialApply(mapIterableWith, datums);
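
It is worth noticing that a mapping built this way is lazy: nothing is mapped until somebody asks for a value. The sketch below demonstrates this with a mapping function that records its calls. noisyDouble, calls, and the other names outside mapIterableWith and leftPartialApply are hypothetical.

```javascript
function * mapIterableWith (mapFn, iterable) {
  for (const value of iterable) {
    yield mapFn(value);
  }
}

const leftPartialApply = (fn, ...values) => fn.bind(null, ...values);

// Hypothetical mapping function that records every call it receives.
const calls = [];
const noisyDouble = x => { calls.push(x); return x * 2; };

const doubleStream = leftPartialApply(mapIterableWith, noisyDouble);
const doubled = doubleStream([1, 2, 3]);

// Creating the stream has not mapped anything yet.
const callsBeforePulling = calls.length;
  //=> 0

// Pulling one value maps exactly one value.
const firstValue = doubled.next().value;
  //=> 2
const callsAfterOnePull = calls.length;
  //=> 1

// Draining the rest maps the remainder.
const remaining = Array.from(doubled);
  //=> [4, 6]
```

This is the property that lets each station on our assembly line work on one item at a time.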

Now we’re ready for something interesting. Our original code performed a reduce, folding a list into a map from users to locations. We are working with a stream, of course, and we absolutely do not want to reduce all the elements of the stream to a single object.

IBM Card Sorter

collating our locations

Consider the metaphor of the assembly line. Log lines enter at the beginning, and are converted into arrays by datumizeStream. Instead of bundling everything up into a box, we want to process the lines, but we need to collate the items so we can process them in order for each user.

One way to do this while processing one line at a time is to create a series of parallel streams, one per user. We direct each line into the appropriate stream and do some processing on it. We then merge the outputs back into a single stream for more processing.

If we stop and think about it, this is what we actually wanted to do when we created a map to begin with. We just need to code that intention directly. So we will write a function that takes a stream and divides it (metaphorically) into multiple streams according to a function that takes each value and returns a string key.

The key function is simplicity itself:

const userKey = ([user, _]) => user;

We will apply this to each value as it comes in, and streams will be created for each distinct key. Then, a transforming function will be applied to each stream. Our mapping functions so far were stateless, and mapped one value to another. But we’re going to do both of these things differently. Our transforming functions will have state, and they will map each value into a list of zero or one value, which will then be merged to form our resulting stream.

Our function looks a lot like the code we wrote for extracting transitions from our single pass solution, only we don’t keep the locations per user in a map, and we either return a transition in a list, or an empty list:

let locations = [];

([_, location]) => {
  locations.push(location);

  if (locations.length === 2) {
    const transition = locations;
    locations = locations.slice(1);
    return [transition];
  } else {
    return [];
  }
}

This function takes one location at a time, and returns either an empty list or a transition in a list. We can use it to iterate over locations one by one, and get transitions. Which is exactly what we’re going to do. Mind you, it isn’t quite ready, because while it does maintain state (in the locations variable), we will need a different state for each user. In order to have as many of these as we like, we’ll wrap the whole thing in a function:

const transitionsMaker = () => {
  let locations = [];

  return ([_, location]) => {
    locations.push(location);

    if (locations.length === 2) {
      const transition = locations;
      locations = locations.slice(1);
      return [transition];
    } else {
      return [];
    }
  }
}

Now we can call transitionsMaker for each user, and get a function that can map the locations for that user into transitions.
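
As a sanity check that each call really does get its own state, here is a sketch that interleaves two made-up users. The users, locations, and the names forAlice, forBob, and seen are invented for the example.

```javascript
const transitionsMaker = () => {
  let locations = [];

  return ([_, location]) => {
    locations.push(location);

    if (locations.length === 2) {
      const transition = locations;
      locations = locations.slice(1);
      return [transition];
    } else {
      return [];
    }
  };
};

const forAlice = transitionsMaker();
const forBob = transitionsMaker();

// Interleave the two users and collect what each call returns.
const seen = [
  forAlice(['alice', 'home']),
  forBob(['bob', 'gym']),
  forAlice(['alice', 'work']),
  forBob(['bob', 'cafe']),
  forAlice(['alice', 'home'])
];

seen
  //=>
  //  [
  //    [],
  //    [],
  //    [['home', 'work']],
  //    [['gym', 'cafe']],
  //    [['work', 'home']]
  //  ]
```

Neither mapper ever sees the other’s locations, so no transition mixes the two users.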

Armed with a function for turning a user and location into a key, and transitionsMaker, we can write our collating function. It takes a function that makes a stateful mapping function and a function that extracts keys from values as arguments, and returns a function that transforms a stream of values:

const sortedFlatMap = (mapFnMaker, keyFn) =>
  function * (values) {
    const mappersByKey = new Map();

    for (const value of values) {
      const key = keyFn(value);
      let mapperFn;

      if (mappersByKey.has(key)) {
        mapperFn = mappersByKey.get(key);
      } else {
        mapperFn = mapFnMaker();
        mappersByKey.set(key, mapperFn);
      }

      yield * mapperFn(value);
    }
  };

const transitionsStream = sortedFlatMap(transitionsMaker, userKey);

Why is sortedFlatMap called a “flat map”? A function that maps a value to zero or more values is called a flat map. There’s more to this idea if we dive a little more deeply into functional programming: we can think of putting values in lists as “wrapping” them, and if we have an operation that takes a value and returns a wrapped value, “flat map” is a function that performs the operation on a value and unwraps the result.

In our case, we take values and map them to zero or one transition, which we represent with an empty list or a list with a transition. sortedFlatMap “flattens” or “unwraps” these lists using yield *, which yields the contents of an iterable, in our case, a list with zero or one element.
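
Nothing in sortedFlatMap is specific to transitions. As a sketch of that generality, here it is with a different stateful mapper, the hypothetical changesMaker, which emits a value only when a user’s location differs from that user’s previous one. The data is made up.

```javascript
const sortedFlatMap = (mapFnMaker, keyFn) =>
  function * (values) {
    const mappersByKey = new Map();

    for (const value of values) {
      const key = keyFn(value);
      let mapperFn;

      if (mappersByKey.has(key)) {
        mapperFn = mappersByKey.get(key);
      } else {
        mapperFn = mapFnMaker();
        mappersByKey.set(key, mapperFn);
      }

      yield * mapperFn(value);
    }
  };

const userKey = ([user, _]) => user;

// Hypothetical mapper: zero values for a repeat, one value for a change.
const changesMaker = () => {
  let last;

  return ([user, location]) => {
    if (location === last) return [];
    last = location;
    return [[user, location]];
  };
};

const visits = [
  ['a', 'x'], ['b', 'x'], ['a', 'x'],
  ['a', 'y'], ['b', 'x'], ['b', 'z']
];

const changes = Array.from(sortedFlatMap(changesMaker, userKey)(visits));

changes
  //=> [['a', 'x'], ['b', 'x'], ['a', 'y'], ['b', 'z']]
```

The repeats for each user vanish because their mappers returned empty lists, and yield * had nothing to unwrap.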

Continuing our practise of writing our “stream” solution with the same steps as our “pipeline” solution, we transform the transitions into strings we can use as keys:

const stringifyTransition = transition => transition.join(' -> ');

const stringifyStream = leftPartialApply(mapIterableWith, stringifyTransition);

If we stop and debug our work, we’ll see that we now have a stream of transitions represented as strings, and we have the same memory footprint as our single pass solution:

stringifyStream(transitionsStream(datumizeStream(streamOfLines)))
  //=>
    "5890595 -> 5f2b932"
    "5f2b932 -> bd11537"
    "bd11537 -> 5890595"
    "5f2b932 -> bd11537"
    "bd11537 -> 5890595"
    "bd11537 -> 5f2b932"
    "5890595 -> 5f2b932"
    "5f2b932 -> bd11537"
    "bd11537 -> 5890595"
    "5890595 -> 5f2b932"
    "5f2b932 -> bd11537"
    "bd11537 -> 5890595"
counting transitions

Our original function for counting transitions performed a .reduce on a list of transitions:

const countTransitions = arr => arr.reduce(
  (transitionsToCounts, transitionKey) => {
    if (transitionsToCounts.has(transitionKey)) {
      transitionsToCounts.set(transitionKey, 1 + transitionsToCounts.get(transitionKey));
    } else {
      transitionsToCounts.set(transitionKey, 1);
    }
    return transitionsToCounts;
  }
  , new Map());

It’s straightforward to transform this into an iteration over the transitions we receive:

const countTransitionStream = transitionKeys => {
  const transitionsToCounts = new Map();

  for (const transitionKey of transitionKeys) {
    if (transitionsToCounts.has(transitionKey)) {
      transitionsToCounts.set(transitionKey, 1 + transitionsToCounts.get(transitionKey));
    } else {
      transitionsToCounts.set(transitionKey, 1);
    }
  }
  return transitionsToCounts;
}

And then we can reüse:

const greatestValue = inMap =>
  Array.from(inMap.entries()).reduce(
    ([wasKeys, wasCount], [transitionKey, count]) => {
      if (count < wasCount) {
        return [wasKeys, wasCount];
      } else if (count > wasCount) {
        return [new Set([transitionKey]), count];
      } else {
        wasKeys.add(transitionKey);
        return [wasKeys, wasCount];
      }
    }
    , [new Set(), 0]
  );

And now we can get our result “the old fashioned way:”

greatestValue(
  countTransitionStream(
    stringifyStream(
      transitionsStream(
        datumizeStream(
          streamOfLines
        )
      )
    )
  )
)

Or use a pipeline again:

const pipeline = (...fns) => fns.reduceRight((a, b) => c => a(b(c)));

const theStreamSolution = pipeline(
  datumizeStream,
  transitionsStream,
  stringifyStream,
  countTransitionStream,
  greatestValue
);

theStreamSolution(streamOfLines)
  //=>
    [
      "5f2b932 -> bd11537",
      "bd11537 -> 5890595"
    ],
    4

Voila!

To recap what we have accomplished: We are processing the data step by step, just like our original staged approach, but we are also handling the locations one by one without processing the entire data set in each step, just like our single pass approach.

We have harvested the best parts of each approach.
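One way to convince ourselves that values really do flow through the stages one at a time is to trace the calls. This is a sketch only: traced and the two toy stages are assumptions made for the demonstration, and mapIterableWith is repeated from the solution so the snippet stands alone:

```javascript
const log = [];

function * mapIterableWith (mapFn, iterable) {
  for (const value of iterable) {
    yield mapFn(value);
  }
}

// hypothetical helper: records each call before delegating to fn
const traced = (name, fn) => value => {
  log.push(`${name}(${value})`);
  return fn(value);
};

const doubled = mapIterableWith(traced('double', n => n * 2), [1, 2, 3]);
const incremented = mapIterableWith(traced('increment', n => n + 1), doubled);

const result = Array.from(incremented);

console.log(result);
  //=> [3, 5, 7]
console.log(log.join(', '));
  //=> double(1), increment(2), double(2), increment(4), double(3), increment(6)
```

The calls alternate between the two stages. If each stage had processed the whole data set before handing it on, we would see all three double calls before the first increment.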

Now, it’s true that we have done a bunch of things that people call “functional programming,” but that wasn’t the goal. The goal, the benefit we can inspect, is that we have decomposed the algorithm into a series of steps, each of which has well-defined inputs and outputs. And, we have arranged our code such that we are not making copies of the entire data set with each of our steps.

The end goal, as always, is to decompose the algorithm into smaller parts that can be named, tested, and perhaps reused elsewhere. Using iterables and generators to implement a stream approach can help us achieve our goals without compromising practical considerations like memory footprint.


further reading
appendix: the full code
const logContents =`1a2ddc2db4693cfd16d534cde5572cc1, 5f2b9323c39ee3c861a7b382d205c3d3
f1a543f5a2c5d49bc5dde298fcf716e4, 5890595e16cbebb8866e1842e4bd6ec7
3abe124ecc82bf2c2e22e6058f38c50c, bd11537f1bc31e334497ec5463fc575e
f1a543f5a2c5d49bc5dde298fcf716e4, 5f2b9323c39ee3c861a7b382d205c3d3
f1a543f5a2c5d49bc5dde298fcf716e4, bd11537f1bc31e334497ec5463fc575e
f1a543f5a2c5d49bc5dde298fcf716e4, 5890595e16cbebb8866e1842e4bd6ec7
1a2ddc2db4693cfd16d534cde5572cc1, bd11537f1bc31e334497ec5463fc575e
1a2ddc2db4693cfd16d534cde5572cc1, 5890595e16cbebb8866e1842e4bd6ec7
3abe124ecc82bf2c2e22e6058f38c50c, 5f2b9323c39ee3c861a7b382d205c3d3
f1a543f5a2c5d49bc5dde298fcf716e4, 5f2b9323c39ee3c861a7b382d205c3d3
f1a543f5a2c5d49bc5dde298fcf716e4, bd11537f1bc31e334497ec5463fc575e
f1a543f5a2c5d49bc5dde298fcf716e4, 5890595e16cbebb8866e1842e4bd6ec7
1a2ddc2db4693cfd16d534cde5572cc1, 5f2b9323c39ee3c861a7b382d205c3d3
1a2ddc2db4693cfd16d534cde5572cc1, bd11537f1bc31e334497ec5463fc575e
1a2ddc2db4693cfd16d534cde5572cc1, 5890595e16cbebb8866e1842e4bd6ec7`;

const asStream = function * (iterable) { yield * iterable; };

const lines = str => str.split('\n');
const streamOfLines = asStream(lines(logContents));

function * mapIterableWith (mapFn, iterable) {
  for (const value of iterable) {
    yield mapFn(value);
  }
}

const leftPartialApply = (fn, ...values) => fn.bind(null, ...values);

const datums = str => str.split(', ');
const datumizeStream = leftPartialApply(mapIterableWith, datums);

const userKey = ([user, _]) => user;

const transitionsMaker = () => {
  let locations = [];

  return ([_, location]) => {
    locations.push(location);

    if (locations.length === 2) {
      const transition = locations;
      locations = locations.slice(1);
      return [transition];
    } else {
      return [];
    }
  }
}

const sortedFlatMap = (mapFnMaker, keyFn) =>
  function * (values) {
    const mappersByKey = new Map();

    for (const value of values) {
      const key = keyFn(value);
      let mapperFn;

      if (mappersByKey.has(key)) {
        mapperFn = mappersByKey.get(key);
      } else {
        mapperFn = mapFnMaker();
        mappersByKey.set(key, mapperFn);
      }

      yield * mapperFn(value);
    }
  };

const transitionsStream = sortedFlatMap(transitionsMaker, userKey);

const stringifyTransition = transition => transition.join(' -> ');
const stringifyStream = leftPartialApply(mapIterableWith, stringifyTransition);

const countTransitionStream = transitionKeys => {
  const transitionsToCounts = new Map();

  for (const transitionKey of transitionKeys) {
    if (transitionsToCounts.has(transitionKey)) {
      transitionsToCounts.set(transitionKey, 1 + transitionsToCounts.get(transitionKey));
    } else {
      transitionsToCounts.set(transitionKey, 1);
    }
  }
  return transitionsToCounts;
}

const greatestValue = inMap =>
  Array.from(inMap.entries()).reduce(
    ([wasKeys, wasCount], [transitionKey, count]) => {
      if (count < wasCount) {
        return [wasKeys, wasCount];
      } else if (count > wasCount) {
        return [new Set([transitionKey]), count];
      } else {
        wasKeys.add(transitionKey);
        return [wasKeys, wasCount];
      }
    }
    , [new Set(), 0]
  );

const pipeline = (...fns) => fns.reduceRight((a, b) => c => a(b(c)));

const theStreamSolution = pipeline(
  datumizeStream,
  transitionsStream,
  stringifyStream,
  countTransitionStream,
  greatestValue
);
notes
  1. It would be nice if JavaScript gave us a Deep JSON Equality function, but it doesn’t. We could go down a rabbit-hole of writing our own comparison functions and maps and what-not, but it’s simpler to convert the transitions to strings before counting them. That’s because JavaScript acts as if strings are canonicalized, so they make great keys for objects and maps. 

  2. Note that if we are reading the file from disc, we can actually iterate over the lines directly, instead of calling .split('\n') on the contents. 

  3. We could also tidy up some extra variable references, but we’re trying to make this code map somewhat reasonably to our staged approach, and the extra names make it more obvious. Compared to the overhead of making multiple copies of the data, the extra work for these is negligible. 

https://raganwald.com/2017/04/19/incremental
foldl, foldr, and associative order

New And Improved

This essay originally appeared in 2017. Eagle-eyed readers pointed out that the original implementation of foldr had incorrect semantics. The essay has now been substantially revised to provide an implementation of foldr that is much closer to the one we find in lazy languages like Haskell.


When talking with people in the functional programming community, we often hear the term fold. Folding is an abstraction of operations to be carried out on linear collections, and it’s nearly always implemented as a higher-order function.

In this essay we’re going to look at what “folding” does by writing our own implementations. In addition to exploring its basic purpose, we’ll look at variations on folding that use different associativities (“left-associative folds” and “right-associative folds”) and explore the use of right-associative folds for performing lazy operations on possibly unbounded iterables.

By the end of the essay, we’ll have a basic grasp of common terms of art such as foldl and foldr, as well as a grasp of when each pattern should be applied–and when they cut against JavaScript’s grain and should be eschewed in favour of simpler constructs.

Here we go.


Pine Trees


preamble: Array.prototype.reduce

JavaScript has a method on arrays called reduce. It’s used for “reducing” a collection to a value of some kind. Here we use it to “reduce” an array of numbers to the sum of the numbers:

[1, 2, 3, 4, 5].reduce((x, y) => x + y, 0)
  //=> 15

Reduce folds the array into a single value. Mapping can be implemented as folding. Here we fold an array of numbers into an array of the squares of the numbers:

[1, 2, 3, 4, 5].reduce(
  (acc, n) => acc.concat([n*n]),
  []
)
  //=> [1, 4, 9, 16, 25]

And if we can map an array with a fold, we can also filter an array with a fold:

[1, 2, 3, 4, 5].reduce(
  (acc, n) => n % 2 === 0 ? acc.concat([n]) : acc,
  []
)
  //=> [2, 4]

Folding is a very fundamental kind of operation on collections. It can be used in many other ways, but let’s move along and talk about what kinds of collections we might want to fold.
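As a small sketch of those “other ways,” here are two more folds written with reduce. The sample data is made up for illustration:

```javascript
// fold a list of words into the longest word
const longest = ['fold', 'reduce', 'map'].reduce(
  (soFar, word) => word.length > soFar.length ? word : soFar,
  ''
);

// fold a list into its reverse by prepending each element to the accumulator
const reversed = [1, 2, 3].reduce(
  (acc, n) => [n].concat(acc),
  []
);

console.log(longest);
  //=> "reduce"
console.log(reversed);
  //=> [3, 2, 1]
```

In each case the folding function combines the value accumulated so far with the current element, and the second argument to reduce supplies the starting value.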

foldl

Let’s write our own fold, foldl:

let foldl = (fn, valueSoFar, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  if (done) {
    return valueSoFar;
  } else {
    return foldl(fn, fn(valueSoFar, current), iterator);
  }
};

foldl((valueSoFar, current) => valueSoFar + current, 0, [1, 2, 3, 4, 5])
  //=> 15

This is a recursive implementation. Thanks to almost every implementation of JavaScript punting on tail recursion optimization, there’s no easy way to write a recursive version that doesn’t consume the stack in proportion to the number of elements being folded. Nevertheless, we’ll work with the recursive implementation for now.1

With foldl in hand, we can look at its associative property.

the associative property

foldl consumes the elements from the left of the collection. But foldl is not called foldl because it consumes its elements from the left: It’s called foldl because it associates its folding function from the left.

When we write foldl((valueSoFar, current) => valueSoFar + current, 0, [1, 2, 3, 4, 5]), we’re computing the sum as if we wrote (((((0 + 1) + 2) + 3) + 4) + 5):

foldl((valueSoFar, current) => valueSoFar + current, 0, [1, 2, 3, 4, 5])
  //=> 15

(((((0 + 1) + 2) + 3) + 4) + 5)
  //=> 15

Addition is associative, meaning that it makes no difference how we group the operations: We get the same result. But not all operators are associative. For example, subtraction is not associative, so we get different results depending upon how we group the operations:

(((((0 - 1) - 2) - 3) - 4) - 5)
  //=> -15

(0 - (1 - (2 - (3 - (4 - 5)))))
  //=> -3

Because subtraction is not associative, if we write an expression without explicitly forcing the order of operations with parentheses, we leave the order up to rules we arbitrarily establish about how we evaluate expressions.

How does JavaScript associate expressions by default? That’s easy to test:

0 - 1 - 2 - 3 - 4 - 5
  //=> -15

We say that subtraction in JavaScript is left-associative, meaning that given an expression like 0 - 1 - 2 - 3 - 4 - 5, JavaScript always evaluates it as if we wrote (((((0 - 1) - 2) - 3) - 4) - 5).

And likewise, we say that foldl is left-associative, because when we write foldl((valueSoFar, current) => valueSoFar - current, 0, [1, 2, 3, 4, 5]), we get the same result as if we wrote (((((0 - 1) - 2) - 3) - 4) - 5):

(((((0 - 1) - 2) - 3) - 4) - 5)
  //=> -15

foldl((valueSoFar, current) => valueSoFar - current, 0, [1, 2, 3, 4, 5])
  //=> -15

But that’s not always what we want. Sometimes, we want to fold an iterable with right-associative semantics.

With right-associative semantics, 0 - 1 - 2 - 3 - 4 - 5 would be evaluated as if we wrote (0 - (1 - (2 - (3 - (4 - 5))))). And if we had a fold function with right-associative semantics–let’s call it foldr–then we would expect:

(0 - (1 - (2 - (3 - (4 - 5)))))
  //=> -3

foldr((current, valueToCompute) => current - valueToCompute, 5, [0, 1, 2, 3, 4])
  //=> -3

Note that with foldl, we supplied 0 as an initial value and [1, 2, 3, 4, 5] as the iterable, because we associate from the left. With foldr, we supplied 5 as the initial value, and [0, 1, 2, 3, 4] as the iterable, because foldr associates from the right, and thus the first thing we want it to evaluate will be 4 - 5.

Here’s foldr:

let foldr = (fn, valueToCompute, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  if (done) {
    return valueToCompute;
  } else {
    valueToCompute = foldr(fn, valueToCompute, iterator);

    return fn(current, valueToCompute);
  }
};

foldr((current, valueToCompute) => current - valueToCompute, 5, [0, 1, 2, 3, 4])
  //=> -3

We can see that like foldl, we consume the elements from the left. But because we write:

valueToCompute = foldr(fn, valueToCompute, iterator);

return fn(current, valueToCompute);

The remainder of the computation is evaluated first using recursion, and then it’s passed to the folding function fn. This is what makes it right-associative: Given 0 - 1 - 2 - 3 - 4 - 5, it computes 1 - (2 - (3 - (4 - 5))) => 3 first, then returns 0 - 3 as the final result.

Although it consumes its elements from the left, foldr associates its operations from the right.
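We can make the grouping visible by folding with a function that builds a parenthesized string instead of subtracting. This is only a sketch: the two folds are condensed versions of the recursive implementations above, repeated so that the snippet runs by itself:

```javascript
// condensed from the recursive foldl above
const foldl = (fn, valueSoFar, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  return done ? valueSoFar : foldl(fn, fn(valueSoFar, current), iterator);
};

// condensed from the eager recursive foldr above
const foldr = (fn, valueToCompute, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  return done ? valueToCompute : fn(current, foldr(fn, valueToCompute, iterator));
};

const leftGrouping = foldl(
  (valueSoFar, current) => `(${valueSoFar} - ${current})`,
  '0',
  [1, 2, 3, 4, 5]
);

const rightGrouping = foldr(
  (current, valueToCompute) => `(${current} - ${valueToCompute})`,
  '5',
  [0, 1, 2, 3, 4]
);

console.log(leftGrouping);
  //=> "(((((0 - 1) - 2) - 3) - 4) - 5)"
console.log(rightGrouping);
  //=> "(0 - (1 - (2 - (3 - (4 - 5)))))"
```

Both folds read their elements from the left, but the parentheses nest in opposite directions.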

foldr also passes the arguments into fn in the opposite order from the way foldl does. It’s easy to remember which function works which way: The argument representing the aggregated computation is on the left with foldl and on the right with foldr, which matches the associativity.

hard work pays off in the future;
laziness pays off right away

When we’re working with finite iterables, foldl can be used to implement map:

let mapl = (fn, iterable) =>
  foldl(
    (valueSoFar, current) => valueSoFar.concat([fn(current)]),
    [],
    iterable
  );

mapl(current => current * current, [1, 2, 3, 4, 5])
  //=> [1, 4, 9, 16, 25]

foldl is eager, meaning it consumes every element before it returns, so mapl cannot be used on an infinite iterable. With care, we can make a fold that handles infinite iterables, but we begin with foldr rather than foldl.

What we’ll do is structure the code as with our eager version of foldr, but instead of passing the remainder of the computation to the folding function, we’ll pass a memoized thunk of the remainder of the computation.

The folding function will explicitly invoke the thunk when it needs to evaluate the computation.2

This allows us to create a lazy foldr:

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = new Map();

  return function (...args) {
    const key = keymaker.call(this, args);

    // test for the key itself, so that falsy results such as 0 are cached too
    if (!lookupTable.has(key)) {
      lookupTable.set(key, fn.apply(this, args));
    }

    return lookupTable.get(key);
  };
};

let foldr = (fn, valueToCompute, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  if (done) {
    return valueToCompute;
  } else {
    const toComputeThunk = memoized(
      () => foldr(fn, valueToCompute, iterator)
    );

    return fn(current, toComputeThunk);
  }
};

foldr(
  (current, toComputeThunk) => current + toComputeThunk(), 0, [1, 2, 3, 4, 5])
  //=> 15

foldr((current, toComputeThunk) => current - toComputeThunk(), 5, [0, 1, 2, 3, 4])
  //=> -3

Laziness won’t help us sum an infinite series, so we won’t try that. But we could use laziness to search a possibly infinite iterable:

let first = (predicate, iterable) =>
  foldr(
    (current, toComputeThunk) =>
      predicate(current) ? current : toComputeThunk(),
    undefined,
    iterable
  );

let fibonacci = function * () {
  let a = 0;
  let b = 1;

  while (true) {
    yield a;

    ([a, b] = [b, a + b]);
  }
}

first(n => n > 0 && n % 7 === 0, fibonacci())
  //=> 21

It’s counter-intuitive that associating operations to the right makes working with infinite iterables possible, but foldr is quite explicit about separating the current value from the remainder of the computations to be performed, and since it consumes elements from the left, it can stop at any time by not evaluating the thunk representing the remainder of the computation.
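We can check the claim that a lazy foldr stops early by counting how many elements it pulls from an unbounded generator. This is a sketch under a few assumptions: once is a simplified stand-in for memoized, and naturals and the pulled counter exist only for this demonstration:

```javascript
// once is a hypothetical stand-in for memoized: it runs fn at most one time
const once = fn => {
  let called = false;
  let result;

  return () => {
    if (!called) {
      result = fn();
      called = true;
    }
    return result;
  };
};

const foldr = (fn, valueToCompute, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  const { value: current, done } = iterator.next();

  if (done) {
    return valueToCompute;
  } else {
    return fn(current, once(() => foldr(fn, valueToCompute, iterator)));
  }
};

// counts how many values the generator has been asked for
let pulled = 0;

const naturals = function * () {
  let n = 1;

  while (true) {
    ++pulled;
    yield n++;
  }
};

const firstOverThree = foldr(
  (current, toComputeThunk) => current > 3 ? current : toComputeThunk(),
  undefined,
  naturals()
);

console.log(firstOverThree, pulled);
  //=> 4 4
```

Only four values are ever requested. Once the folding function declines to invoke the thunk, nothing asks the generator for a fifth.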

And now to implement a lazy map. Our lazy map can take any iterable (whether bounded or unbounded) as an argument, and it always returns an iterable:3

const lazycons = (value, iterableThunk) => {
  return function * conscell () {
    yield value;
    yield * iterableThunk();
  }();
};

let lazymap = (mapper, iterable) =>
  foldr(
    (current, toComputeThunk) =>
      lazycons(mapper(current), toComputeThunk),
    [],
    iterable
  );

[a, b, c, d, e, f, g] = lazymap(c => c * c, fibonacci());

[a, b, c, d, e, f, g]
  //=> [0, 1, 1, 4, 9, 25, 64]

As we can see, the unique combination of consuming from the left and associating from the right makes the lazy version of foldr very useful for working with short-circuit semantics like first, or working with unbounded iterables like fibonacci.

what we’ve learned about foldl and foldr

As we’ve seen, the order of consuming values and the order of association are independent, and because they are independent, we get different semantics:

  • Both foldl and foldr consume from the left. And thus, they can be written to consume iterables.
  • foldl associates its folding function from the left.
  • foldr associates its folding function from the right.
  • Because foldr consumes from the left and associates from the right, a lazy implementation can provide short-circuit semantics and/or manage unbounded iterables.

In sum, the order of consuming values and the order of associating a folding function are two separate concepts, and they both matter.

We’ve also learned that foldl is best used when we want to eagerly evaluate a fold.

Likewise, we’ve learned that foldr’s semantics of consuming from the left but associating from the right make it ideal for lazy computations, such as working with short-circuit semantics or unbounded iterables. We are likely then to prefer to use the lazy implementation for foldr.

The penultimate “rule of thumb” is this: Use foldl for eager computations, foldr for lazy computations.4


Dr. Ian Malcolm, Jurassic Park


just because we could, doesn’t mean we should

The examples in this essay were chosen to be simple enough that we could focus on the mechanisms of foldl and foldr. This helps us communicate with members of the larger programming community whose experience is grounded in orthodox functional programming.

However, writing all of our code as if we are thinking in Scheme/Haskell/ML and then translating those functions directly into JavaScript is often “cutting against the grain.” We initially wrote foldl recursively, but since most implementations of JavaScript cannot optimize linear recursion, we should usually implement foldl with a loop:5

let foldl = (fn, valueSoFar, iterable) => {
  for (const current of iterable) {
    valueSoFar = fn(valueSoFar, current);
  }

  return valueSoFar;
};

This implementation is faster and consumes less memory than the recursive implementation. Likewise, for working with lazy iterables, there are some benefits to the foldr approach, but for many linear higher-order operations on iterables, there are simpler higher-order functions we can write in a non-recursive style using iterators and generators.
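As a rough illustration of the difference, the loop version folds a million values without trouble, whereas the recursive version would need a stack frame per element and is likely to exhaust the stack on typical engines. The upTo generator is a made-up helper standing in for a large data source:

```javascript
const foldl = (fn, valueSoFar, iterable) => {
  for (const current of iterable) {
    valueSoFar = fn(valueSoFar, current);
  }

  return valueSoFar;
};

// hypothetical helper: yields 1, 2, 3, ... up to and including limit
function * upTo (limit) {
  for (let n = 1; n <= limit; ++n) {
    yield n;
  }
}

const total = foldl((sum, n) => sum + n, 0, upTo(1000000));

console.log(total);
  //=> 500000500000
```

Because upTo is a generator, the million values are never held in memory at once either.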

We can map iterables lazily:

function * mapIterableWith (mapper, iterable) {
  for (const element of iterable) {
    yield mapper(element);
  }
}

[a, b, c, d, e, f, g] = mapIterableWith(c => c * c, fibonacci());

[a, b, c, d, e, f, g]
  //=> [0, 1, 1, 4, 9, 25, 64]

Filter iterables lazily:

function * filterIterableWith (predicate, iterable) {
  for (const element of iterable) {
    if (predicate(element)) yield element;
  }
}

first(n => n > 0 && n % 7 === 0, fibonacci())

[a, b, c, d, e, f, g] = filterIterableWith(n => n % 7 === 0, fibonacci());

[a, b, c, d, e, f, g]
  //=> [0, 21, 987, 46368, 2178309, 102334155, 4807526976]

And short-circuit semantics like first are also easy to write directly:

function first (predicate, iterable) {
  const [head] = filterIterableWith(predicate, iterable);

  return head;
}

first(n => n > 0 && n % 7 === 0, mapIterableWith(c => c * c, fibonacci()))
  //=> 441

The final rule of thumb is thus: Our code should make the simple things easy, and the complex things possible. Therefore, we write simple functions for mapping, filtering, or finding things in a lazy way, and we use tools like foldr when we encounter something that doesn’t neatly correspond to one of our simple tools.

(Discuss on /r/javascript)


Notes
  1. Languages like Haskell and Scheme that support tail call optimization can automatically transform linear recursion into a loop. Since loops are not difficult to write, that doesn’t seem like a big deal. But as we’ll see when we write foldr below, having two different functions share the same general shape communicates their design and relationship. Writing one as a loop and the other recursively conceals that which should be manifest. 

  2. Languages like Haskell that have lazy evaluation don’t need any special treatment to make this work, but since JavaScript is an “eager” language, we have to make some adjustments. 

  3. lazycons implements a linked list of sorts, with each conscell generator yielding a single value and then invoking a thunk to get a generator for the remainder of its values. This arrangement creates a lazily computed iterable that optimizes for the simplicity of prepending elements to the list. 

  4. The English phrase rule of thumb refers to a principle with broad application that is not intended to be strictly accurate or reliable for every situation. It refers to an easily learned and easily applied procedure or standard, based on practical experience rather than theory. This usage of the phrase can be traced back to the seventeenth century and has been associated with various trades where quantities were measured by comparison to the width or length of a thumb. –Wikipedia 

  5. There’s another consideration for code bases where for...of loops are unreliable because of the behaviour of Symbol shims, but this consideration is well outside the scope of this essay. Throughout this collection of essays, we presume that all features of ES2015 are available except for TCO. 

https://raganwald.com/2017/04/10/foldl-foldr
Turing Machines and Tooling, Part I

monk at work

Note well: This is an unfinished work-in-progress.


Turing Machines and Tooling, Part I

Much is made of “functional” programming in JavaScript. People get very excited talking about how to manage, minimize, or even eliminate mutation and state. But what if, instead of trying to avoid state and mutation, we embrace it? What if we “turn mutation up to eleven?”

We know the rough answer without even trying. We’d need a lot of tooling to manage our programs. Which is interesting in its own right, because we might learn a lot about tools for writing software.

So with that in mind, let’s begin at the beginning.

turing machines

In 1936, Alan Turing invented what we now call the Turing Machine in his honour. He called it an “a-machine,” or “automatic machine.”1 Turing machines are mathematical models of computation reduced to an almost absurd simplicity. Computer scientists like to work with very simple models of computation, because that allows them to prove important things about computability in general.

Turing had worked with other models of computation. For example, he was facile with combinatory logic, and discovered what is now called the Turing Combinator: (λx. λy. (y (x x y))) (λx. λy. (y (x x y))).

Turing’s “a-machine” model of computation differed greatly from combinatory logic and the other historically significant model, the lambda calculus. Where combinatory logic and the lambda calculus both modelled computation as expressions without side-effects, mutation, or state, his “a-machine” modelled computation as side-effects, mutation, and state without expressions.

This new model allowed him to prove some very important results about computability, and as we’ll see, it inspired John von Neumann’s design for the stored-program computer, the architecture we use to this day.

Turing Machine Model

So what was this “a-machine” of his? Well, he never actually built one. It was a thought experiment. He imagined an infinite paper tape. In his model, the tape had a beginning, but no end. (If you have heard Turing machines described as operating on a tape that stretches to infinity in both directions, be patient, we’ll explain the difference later.)

The tape is divided into cells, each of which is either empty/blank, or contains a mark.

Moving along this tape is the a-machine (or equivalently, moving the tape through the a-machine). The machine is, at any one time, positioned over exactly one cell. The machine is capable of reading the mark in a cell, writing a mark into a cell, and moving one cell in either direction.

Turing machines have a finite and predetermined set of states that they can be in. One state is marked as the start state, and when a Turing machine begins to operate it begins in that state, and begins positioned over the first cell of the tape. While operating, a Turing machine always has a current state.

An abstract Turing machine

When a Turing machine is started, and continuously thereafter, it reads the contents of the cell in its position, and depending on the current state of the machine, it:

  • Writes a mark, moves left, or moves right, and;
  • Either changes to a different state, or remains in the same state.

A Turing machine is defined such that given the same mark in the current cell, and the same state, it always performs the same action. Turing machines are deterministic.

The machine continues to operate until one of two things happens:

  1. It moves off the left edge of the tape, or;
  2. There is no defined behaviour for its current state and the symbol in the current cell.

If either of these things happens, the machine halts.

our first turing machine

Given the definition above, we can write a Turing machine emulator. We will represent the tape as an array. The 0th element will be the start of the tape. As the machine moves to the right, if it moves past the end of our array, we will append another element. Thus, the tape appears to be infinite (or as large as the implementation of arrays allows). Marks on the tape will be represented by numbers and strings. By default, 0 will represent an empty cell (although anything will do).

Turing used tables to represent the definitions for his a-machines. We’ll use an array of arrays, as the “description” of the Turing machine. Each element of the description will be an array containing:

  1. The current state, represented as a string.
  2. The current mark, represented as a string or number.
  3. The next state for the machine (it can be the same as the current state).
  4. An action to perform (a mark to write, or instructions to move left, or move right).

Naturally, we’ll write it as a function that takes a description and a tape as input, and—if the emulated machine halts—outputs the state of the tape.

const ERASE = 0;
const PRINT = 1;
const LEFT = 2;
const RIGHT = 3;

function aMachine({ description, tape: _tape = [0] }) {
  const tape = Array.from(_tape);

  let tapeIndex = 0;
  let currentState = description[0][0];

  while (true) {
    const currentMark = tape[tapeIndex];

    if (![0, 1].includes(currentMark)) {
      // illegal mark on tape
      return tape;
    }

    const rule = description.find(([state, mark]) => state === currentState && mark === currentMark);

    if (rule == null) {
      // no defined behaviour for this state and mark
      return tape;
    }

    const [_, __, nextState, action] = rule;

    if (action === LEFT) {
      --tapeIndex;
      if (tapeIndex < 0) {
        // moved off the left edge of the tape
        return tape;
      }
    } else if (action === RIGHT) {
      ++tapeIndex;
      if (tape[tapeIndex] == null) tape[tapeIndex] = 0;
    } else if ([ERASE, PRINT].includes(action)) {
      tape[tapeIndex] = action;
    } else {
      // illegal action
      return tape;
    }

    currentState = nextState;
  }
}

Our “a-machine” has a “vocabulary” of 0 and 1: These are the only marks allowed on the tape. If it encounters another mark, it halts. These are also the only marks it is allowed to put on the tape, via the ERASE and PRINT actions. It selects as the start state the state of the first instruction.

Any finite number of states are permitted. Here is a program that prints a 1 in the first position of the tape, and then halts:

const description = [
  ['start', 0, 'halt', PRINT]
];

aMachine({ description })
  //=> [1]

It starts in state start because it is the first (and only) instruction. It is for our convenience that the state is called “start.” The instruction matches a 0, and the action is to PRINT a 1. So given a blank tape, that’s what it does. It then transitions to the halt state.

What happens next? Well, it is in the halt state. It is positioned over a 1. But there is no instruction matching a state of halt and a mark of 1 (actually, there is no instruction matching a state of halt at all). So it halts.

Given a tape that already has a 1 in the first position, it halts without doing anything, because there is no instruction matching a state of start and a 1. We can add one to our program:

const description = [
  ['start', 0, 'halt', PRINT],
  ['start', 1, 'halt', ERASE]
];

aMachine({ description, tape: [0] })
  //=> [1]

aMachine({ description, tape: [1] })
  //=> [0]

We’ve written a “not” function. Let’s write another. Let’s say that we have a number on the tape, represented as a string of 1s. So the number zero is an empty tape, the number one would be represented as [1], two as [1, 1], three as [1, 1, 1] and so forth.
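This unary representation can be sketched with a pair of conversion helpers. Both numberToTape and tapeToNumber are names invented for this illustration, and representing zero as [0] is an assumption that follows from the emulator’s default blank tape:

```javascript
// numberToTape and tapeToNumber are hypothetical helpers, not part of the emulator

// zero is a blank tape, which the emulator represents as a single empty cell
const numberToTape = n =>
  n === 0 ? [0] : Array.from({ length: n }, () => 1);

// counts the unbroken run of 1s at the start of the tape
const tapeToNumber = tape => {
  let count = 0;

  while (tape[count] === 1) {
    ++count;
  }

  return count;
};

console.log(numberToTape(3));
  //=> [1, 1, 1]
console.log(tapeToNumber([1, 1, 0]));
  //=> 2
console.log(tapeToNumber(numberToTape(0)));
  //=> 0
```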

Here’s a program that adds one to a number:

const description = [
  ['start', 0, 'halt', PRINT],
  ['start', 1, 'start', RIGHT]
];

aMachine({ description, tape: [0] })
  //=> [1]

aMachine({ description, tape: [1] })
  //=> [1, 1]

aMachine({ description, tape: [1, 1] })
  //=> [1, 1, 1]

If it encounters a 0, it prints a mark and halts. If it encounters a 1, it moves right and remains in the same state. Thus, it moves right over any 1s it finds, until it reaches the end, at which point it writes a 1 and halts.

All of our machines so far have one “real” state, start, and one deliberately “undefined” state, halt. We can write programs with more than one state. This one prints a 1 in the third position on the tape:

const description = [
  ['start', 0, 'one', RIGHT],
  ['start', 1, 'one', RIGHT],
  ['one', 0, 'two', RIGHT],
  ['one', 1, 'two', RIGHT],
  ['two', 0, 'halt', PRINT]
];

aMachine({ description })
  //=> [0, 0, 1]

It has three different states (plus “halt”).

expressiveness and power

Our “a-machine” is very simple. It does allow for as many states as we like, but only two symbols. Each instruction can only print, erase, or move. Despite its simplicity, Alan Turing proved that anything that can be computed, can be computed by an a-machine. This is not an essay about computer science, so we won’t concern ourselves with the formal proof.

Instead, we will follow the path of demonstrating why an a-machine is much more powerful than it may appear. Our method will be this:

First, we designate the a-machine as being the simplest possible type of Turing machine. Meaning, it has the least possible “expressiveness” of descriptions. Next, we think of a Turing machine that is more expressive than an a-machine. How do we demonstrate that despite being more expressive, the new machine is no more powerful than an a-machine?

We show how to transform any input for our more expressive machine into input for an a-machine. And we show how to transform the output of our a-machine into the output for our more expressive machine. If we can do both of these things, we can grasp that the two machines have equivalent power. Meaning, that both can compute exactly the same things.

the sequence-machine

Here is a Turing machine that is undeniably more expressive than an a-machine. Its principal advantage is that it permits any sequence of actions to be associated with a single instruction:

function sequenceMachine({ description, tape: _tape = [0] }) {
  const tape = Array.from(_tape);

  let tapeIndex = 0;
  let currentState = description[0][0];

  while (true) {

    const currentMark = tape[tapeIndex];

    if (![0, 1].includes(currentMark)) {
      // illegal mark on tape
      return tape;
    }

    const rule = description.find(([state, mark]) => state === currentState && mark === currentMark);

    if (rule == null) {
      // no defined behaviour for this state and mark
      return tape;
    }

    const [_, __, nextState, ...actions] = rule;

    for (const action of actions) {
      if (action === LEFT) {
        --tapeIndex;
        if (tapeIndex < 0) {
          // moved off the left edge of the tape
          return tape;
        }
      } else if (action === RIGHT) {
        ++tapeIndex;
        if (tape[tapeIndex] == null) tape[tapeIndex] = 0;
      } else if ([ERASE, PRINT].includes(action)) {
        tape[tapeIndex] = action;
      } else {
        // illegal action
        return tape;
      }
    }

    currentState = nextState;
  }
}

It runs all the programs written for an a-machine:

const description = [
  ['start', 0, 'one', RIGHT],
  ['start', 1, 'one', RIGHT],
  ['one', 0, 'two', RIGHT],
  ['one', 1, 'two', RIGHT],
  ['two', 0, 'halt', PRINT]
];

sequenceMachine({ description })
  //=> [0, 0, 1]

But it can also run a new kind of program that an a-machine cannot run:

const description = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

sequenceMachine({ description })
  //=> [0, 0, 1]

This is a much more convenient way to run programs. Is it more powerful? No.

demonstrating that a sequence-machine is no more powerful than an a-machine

We started with a program for an a-machine that looked like this:

const description = [
  ['start', 0, 'one', RIGHT],
  ['start', 1, 'one', RIGHT],
  ['one', 0, 'two', RIGHT],
  ['one', 1, 'two', RIGHT],
  ['two', 0, 'halt', PRINT]
];

And we transformed it into a program for a sequence-machine that looked like this:

const description = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

To demonstrate that a sequence-machine is no more powerful than an a-machine, we will do the reverse: We will show that we can transform any description of a sequence-machine into a description of an a-machine that produces the same result.

Here is our demonstration written in JavaScript:

// prologue: some handy functions

const flatMap = (arr, lambda) => {
  const inLen = arr.length;
  const mapped = new Array(inLen);

  let outLen = 0;

  arr.forEach((e, i) => {
    const these = lambda(e);

    mapped[i] = these;
    outLen = outLen + these.length;
  });

  const out = new Array(outLen);

  let outIndex = 0;
  for (const these of mapped) {
    for (const e of these) {
      out[outIndex++] = e;
    }
  }

  return out;
};

const gensym = (()=> {
  let n = 1;

  return (prefix = 'G') => `${prefix}-${n++}`;
})();

const times = n =>
  Array.from({ length: n }, (_, i) => i);

// flatten, transform any description for a sequence-machine into
// a description for an a-machine

const flatten = ({ description: _description, tape }) => {
  const description = flatMap(_description, ([currentState, currentMark, nextState, ...instructions]) => {
    if (instructions.length === 0) {
      // pathological case
      return [];
    } else {
      const len = instructions.length;
      const nextStates = [];

      let destinationState = nextState;

      times(len).forEach( () => {
        nextStates.unshift(destinationState);
        const match = destinationState.match(/^\*(.*)-\d+$/)

        if (match) {
          destinationState = gensym(`*${match[1]}`);
        } else destinationState = gensym(`*${destinationState}`);
      });

      const currentStates = nextStates.slice(0, len - 1);
      currentStates.unshift(currentState);

      let possibleMarks = [currentMark];

      const compiled = flatMap(times(len), i => {
        const instruction = instructions[i];

        const mappedInstructions = possibleMarks.map(
          mark => [currentStates[i], mark, nextStates[i], instruction]
        );

        if ([LEFT, RIGHT].includes(instruction)) {
          possibleMarks = [0, 1];
        } else if ([ERASE, PRINT].includes(instruction)) {
          // ERASE and PRINT are themselves the marks they leave on the tape
          possibleMarks = [instruction];
        }

        return mappedInstructions;
      });

      return compiled;
    }
  });

  return { description, tape };
}

We can “flatten” any description for a sequence-machine into a description for an a-machine:

const description = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

flatten({ description, tape: [0] })
  //=>
    {
      description: [
        ["start", 0, "*halt-2", 3],
        ["*halt-2", 0, "*halt-1", 3],
        ["*halt-2", 1, "*halt-1", 3],
        ["*halt-1", 0, "halt", 1],
        ["*halt-1", 1, "halt", 1],
        ["start", 1, "*halt-5", 3],
        ["*halt-5", 0, "*halt-4", 3],
        ["*halt-5", 1, "*halt-4", 3],
        ["*halt-4", 0, "halt", 1],
        ["*halt-4", 1, "halt", 1]
      ],
      tape: [0]
    }

Although it has a few moving parts, what it does at its simplest is turn any instruction with more than one action into a series of instructions, one per action. To chain them together, it creates “synthetic” states like *halt-2. It also tries to make moving robust to account for any marks the machine may encounter.

If we run our flattened description, we see it produces the same output:

aMachine(flatten({ description, tape: [0]}))
  //=> [0, 0, 1]

Now, it’s not exactly the same program that we originally wrote. Our program had fewer states, because we optimized for having a single state for each cell we moved over. The flatten function is very conservative. But it will produce a description for an a-machine that performs the same computation, so we can convince ourselves informally that anything a sequence-machine can compute, so can an a-machine, because any member of the set of all possible sequence-machine descriptions maps to at least one member of the set of all possible a-machines.
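An informal spot check of that claim can be run on a handful of tapes. This is a sketch under stated assumptions rather than a proof: `run` is a hypothetical compact runner that accepts one or many actions per instruction, the numeric action encodings are assumed, and the flattened description is copied from the output shown above.

```javascript
// Assumed encodings: the marks 0 and 1 double as the ERASE and PRINT actions.
const ERASE = 0, PRINT = 1, LEFT = 2, RIGHT = 3;

// Hypothetical runner: handles both a-machine and sequence-machine descriptions.
const run = ({ description, tape: _tape = [0] }) => {
  const tape = Array.from(_tape);
  let tapeIndex = 0;
  let currentState = description[0][0];

  while (true) {
    const currentMark = tape[tapeIndex];
    const rule = description.find(
      ([state, mark]) => state === currentState && mark === currentMark
    );
    if (rule == null) return tape; // no matching instruction: halt

    const [, , nextState, ...actions] = rule;
    for (const action of actions) {
      if (action === LEFT) {
        if (--tapeIndex < 0) return tape;
      } else if (action === RIGHT) {
        ++tapeIndex;
        if (tape[tapeIndex] == null) tape[tapeIndex] = 0;
      } else if (action === ERASE || action === PRINT) {
        tape[tapeIndex] = action;
      } else {
        return tape; // illegal action
      }
    }
    currentState = nextState;
  }
};

const sequenced = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

// Copied from the flattened output: 3 is RIGHT and 1 is PRINT.
const flattened = [
  ['start', 0, '*halt-2', 3],
  ['*halt-2', 0, '*halt-1', 3],
  ['*halt-2', 1, '*halt-1', 3],
  ['*halt-1', 0, 'halt', 1],
  ['*halt-1', 1, 'halt', 1],
  ['start', 1, '*halt-5', 3],
  ['*halt-5', 0, '*halt-4', 3],
  ['*halt-5', 1, '*halt-4', 3],
  ['*halt-4', 0, 'halt', 1],
  ['*halt-4', 1, 'halt', 1]
];

const sampleTapes = [[0], [1], [0, 0, 1], [1, 1, 1], [0, 1, 0, 1]];

// Both descriptions should leave identical tapes for every sample input.
const allAgree = sampleTapes.every(
  (tape) =>
    JSON.stringify(run({ description: sequenced, tape })) ===
    JSON.stringify(run({ description: flattened, tape }))
);
// allAgree is true
```

Agreement on five tapes is only evidence, of course. The argument in the text rests on the structure of flatten, not on sampling.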

from computer science to tooling

That is an interesting result in Computer Science, and we will follow the same reasoning to work with other types of Turing machines. But our focus is on something else, tooling. Did you notice that flatten is a tool? It’s a compiler, and it is no different in principle than something like ClojureScript or Babel. It compiles a program written in a more-expressive language into one written in a less-expressive language.

And that is very interesting.

We earlier defined what we meant by two machines having equivalent power. Let’s quickly say what we mean by being “more expressive.” There are a few different concepts in play, but one of the most obvious is that, for a given computation, the more expressive description is the one that contains less accidental complexity.

So when looking at this description of an a-machine:

const description = [
  ['start', 0, 'one', RIGHT],
  ['start', 1, 'one', RIGHT],
  ['one', 0, 'two', RIGHT],
  ['one', 1, 'two', RIGHT],
  ['two', 0, 'halt', PRINT]
];

We can take the position that the states start and halt are “essential” complexity, but the states one and two are artefacts of the solution domain: they’re how we implement moving right twice. So they are “accidental” complexity. This description of a sequence-machine:

const description = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

Appears to eliminate the one and two states. Naturally, we know that it actually creates at least as much actual complexity as the original hand-coded description:

[
  ["start", 0, "*halt-2", 3],
  ["*halt-2", 0, "*halt-1", 3],
  ["*halt-2", 1, "*halt-1", 3],
  ["*halt-1", 0, "halt", 1],
  ["*halt-1", 1, "halt", 1],
  ["start", 1, "*halt-5", 3],
  ["*halt-5", 0, "*halt-4", 3],
  ["*halt-5", 1, "*halt-4", 3],
  ["*halt-4", 0, "halt", 1],
  ["*halt-4", 1, "halt", 1]
]

But as programmers, we don’t appear to see that complexity, so it feels like it has been eliminated.
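One crude way to put numbers on this is to count the distinct states each description names. The sketch below assumes that state count is a reasonable proxy for accidental complexity, which is this example's simplification rather than a definition from the essay; `statesOf` is a hypothetical helper and the action encodings are assumed.

```javascript
const PRINT = 1, RIGHT = 3; // assumed encodings, matching the flattened output

// Hypothetical helper: every state that appears as a source or a destination.
const statesOf = (description) =>
  new Set(description.flatMap(([state, , nextState]) => [state, nextState]));

const handCoded = [
  ['start', 0, 'one', RIGHT],
  ['start', 1, 'one', RIGHT],
  ['one', 0, 'two', RIGHT],
  ['one', 1, 'two', RIGHT],
  ['two', 0, 'halt', PRINT]
];

const sequenced = [
  ['start', 0, 'halt', RIGHT, RIGHT, PRINT],
  ['start', 1, 'halt', RIGHT, RIGHT, PRINT]
];

const flattened = [
  ['start', 0, '*halt-2', RIGHT],
  ['*halt-2', 0, '*halt-1', RIGHT],
  ['*halt-2', 1, '*halt-1', RIGHT],
  ['*halt-1', 0, 'halt', PRINT],
  ['*halt-1', 1, 'halt', PRINT],
  ['start', 1, '*halt-5', RIGHT],
  ['*halt-5', 0, '*halt-4', RIGHT],
  ['*halt-5', 1, '*halt-4', RIGHT],
  ['*halt-4', 0, 'halt', PRINT],
  ['*halt-4', 1, 'halt', PRINT]
];

const stateCounts = {
  handCoded: statesOf(handCoded).size, // start, one, two, halt
  sequenced: statesOf(sequenced).size, // start, halt
  flattened: statesOf(flattened).size  // start, halt, and four synthetic states
};
// stateCounts is { handCoded: 4, sequenced: 2, flattened: 6 }
```

The sequence-machine description is the smallest to read, while the description that actually runs on an a-machine is the largest.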

declension

comparing compilers to interpreters

We actually have two different implementations of the “sequence-machine.” The first one we looked at is basically an interpreter that understands and executes instructions with multiple actions. The second one we looked at compiles a sequence-machine description into an a-machine description.

We have analogues for this in the “real world” as well. We have tools like Babel that translate various cutting-edge dialects of JavaScript into a more basic flavour of JavaScript. But we also have browsers like Safari and Chrome gradually incorporating more expressive flavours of JavaScript into their underlying engines.

What are the relative tradeoffs of the two approaches? In broad terms, it often comes down to how we want to manage the complexity we’re trying to hide. We are going to implement more expressive Turing machines over the coming posts. With each one, we have to manage more and more complexity.

When we push that complexity into the interpreter (or “engine”), we often find the engine getting a little more complicated. In our very simple example, both the “a-machine” and the “sequence-machine” have while loops to iterate step-by-step through the instructions as the machine operates. But to handle multiple actions within a single instruction, the “sequence-machine” adds a for loop within the while loop.

Whereas, the “flatten” compiler we wrote obviously doesn’t add anything to the “a-machine,” it translates programs for the sequence-machine into programs for the a-machine and then runs them on the unmodified a-machine code. That being said, compilers often appear more complicated than the code they add to interpreters. “flatten” certainly has lots more code than the changes we made to transform an a-machine into a sequence-machine.

so why build compilers?

The compiler solution has one very powerful advantage over the interpreter, despite its size. The compiler absolutely and positively separates the concern of handling sequences from the concern of interpreting instructions for an a-machine. If we want to know what we have to do to handle sequences, all the code we need to read is in “flatten.”

Whereas, in the sequence-machine, the code handling sequences is embedded. It’s not factored out into a separate function or module. As we add more features to interpreters, our interpreters tend to grow higgledy-piggledy, with the considerations for each feature coupled to every other feature’s code.

We can and should work to separate these concerns, but that does not come “for free.” When writing interpreters, the default or easiest path is for them to grow in complexity. But what about compilers?

As we add new functionality, if we add it to “flatten” to make a super-compiler, we have the same problem. But compilers don’t have to grow this way by default.

In Part II, we are going to add more functionality, and look at how we can pipeline compilers, a pattern that naturally separates and modularizes concerns.


notes
  1. In this essay, we will call his model an “a-machine” when discussing his work in historical context, but call the machines we build in the present tense, “Turing machines.” 

https://raganwald.com/2017/04/06/turing-machines
The Lumberjane Song

SINGER
I’m a lumberjane and I’m OK
I sleep all night and I code all day

NERD CHOIR

She’s a lumberjane and she’s OK
She sleeps all night and codes all day

I write clean code, I mentor youth
I go to the lavatory
On Wednesdays I race bicycles
Mine’s espresso, if you please

She writes clean code, mentors youth
She goes to the lavatory
On Wednesdays she races bicycles
Hers is espresso, if you please

She’s a lumberjane and she’s OK
She sleeps all night and codes all day

I write clean code, I mentor youth
I march against injustice
I can’t stand wearing high heels
And play roller derby

She writes clean code, she mentors youths
She marches against injustice
She can’t stand wearing high heels
And plays roller derby…?

(The nerds begin to look uncomfortable, but brighten up as they go into the chorus.)

She’s a lumberjane and she’s OK
She sleeps all night and codes all day

I write clean code, don’t give a damn
What reddit thinks is right
I’m taking back, my human rights
Just like my dear mama

She writes clean code, doesn’t give a damn
What reddit thinks is right???

(The nerd choir mumbles angrily, and the song breaks down. The singer skates off, whistling the tune cheerfully.)

https://raganwald.com/2017/03/08/the-lumberjane-song
Time, Space, and Life As We Know It

In Why Recursive Data Structures? we used multirec, a recursive combinator, to implement quadtrees and coloured quadtrees (The full code for creating and rotating quadtrees and coloured quadtrees is below).

Our focus was on the notion of an isomorphism between the data structures and the algorithms, more than on the performance of quadtrees. Today, we’ll take a closer look at taking advantage of their recursive structure to optimize for time and space, using memoization and canonicalization.

Finally, we will look at the application of recursive algorithms and quadtrees to simulate the universe. Really.


Recursion

Recursion is a pile of dog faeces, © 2006 Robin Corps, some rights reserved


optimization

Performance-wise, naïve array algorithms are O(n), and naïve quadtree algorithms are O(n log n). Coloured quadtrees are worst-case O(n log n), but are faster than naïve quadtrees whenever there are regions that are entirely white or entirely black, because the entire region can be handled in one ‘operation.’

They can even be faster than naïve array algorithms if the image contains enough blank regions. But can we find even more opportunities to optimize their behaviour?

The general idea behind coloured quadtrees is that if we know a way to compute the result of an operation (whether rotation or superimposition) on an entire region, we don’t need to recursively drill down and do the operation on every cell in the region. We save O(n log n) operations, where n is the size of the region.

We happen to know that all-white or all-black regions are a special case for rotation and superimposition, so coloured quadtrees optimize that case. But if we could find an even more common case, we could go even faster.

One interesting special case is this: If we’ve done the operation on an identical quadrant before, we could remember the result instead of recomputing it.

For example, if we want to rotate:

⚪️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️⚪️⚫️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚫️
⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚫️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚪️⚫️⚪️⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚫️⚫️⚪️⚪️⚪️

We would divide it into:

⚪️⚪️⚪️⚫️ ⚫️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️ ⚪️⚫️⚪️⚪️
⚪️⚫️⚫️⚪️ ⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️ ⚫️⚪️⚪️⚫️

⚫️⚪️⚪️⚫️ ⚫️⚪️⚪️⚫️
⚪️⚫️⚫️⚪️ ⚪️⚫️⚫️⚪️
⚪️⚪️⚫️⚪️ ⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚫️⚪️⚪️⚪️

We would compute the first, second, third, and fourth quadrants. They’re all different. However, consider computing the first quadrant.

We divide it into:

⚪️⚪️ ⚪️⚫️
⚪️⚪️ ⚫️⚪️

⚪️⚫️ ⚫️⚪️
⚫️⚪️ ⚪️⚫️

Its first and third sub-quadrants are unique, but the second and fourth are identical, so after we have rotated:

⚪️⚪️
⚪️⚪️

⚪️⚫️
⚫️⚪️

⚫️⚪️
⚪️⚫️

If we have saved our work, we don’t need to rotate

⚪️⚫️
⚫️⚪️

Again, because we have already done it and have saved the result. So we save 25% of the work to compute the first quadrant.

And it gets better. The second quadrant subdivides into:

⚫️⚪️ ⚪️⚪️
⚪️⚫️ ⚪️⚪️

⚪️⚫️ ⚫️⚪️
⚫️⚪️ ⚪️⚫️

We’ve seen all four of these sub-quadrants already, so we can rotate them in one step, looking up the saved result. The same goes for the third and fourth quadrants, we’ve seen their sub-quadrants before, so we can do each of their sub-quadrants in a single step as well.

Once we have rotated three sub-quadrants, we have done all the computation needed. Everything else is saving and looking up the results.
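The count of three can be confirmed by collecting the distinct aligned blocks of the 8x8 image above. In this sketch, `distinctBlocks` is a hypothetical helper, and the characters `.` and `#` stand in for white and black cells (a substitution made only to keep string slicing simple).

```javascript
// '.' stands in for a white cell and '#' for a black cell.
const image = [
  '...##...',
  '..#..#..',
  '.##..##.',
  '#..##..#',
  '#..##..#',
  '.##..##.',
  '..#..#..',
  '...##...'
];

// Hypothetical helper: the set of distinct aligned size-by-size blocks.
const distinctBlocks = (rows, size) => {
  const seen = new Set();

  for (let r = 0; r < rows.length; r += size) {
    for (let c = 0; c < rows[r].length; c += size) {
      const block = rows
        .slice(r, r + size)
        .map((row) => row.slice(c, c + size))
        .join('/');

      seen.add(block);
    }
  }

  return seen;
};

const twoByTwo = distinctBlocks(image, 2);
// twoByTwo.size is 3, although the image contains 16 aligned 2x2 blocks

const fourByFour = distinctBlocks(image, 4);
// fourByFour.size is 4: the four quadrants are all different
```

So a memoized rotation would do real work on three 2x2 blocks and four 4x4 quadrants, and look up the other thirteen 2x2 results.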

Let’s write ourselves a simple implementation.


Index Cards

Index cards in the public library of Trinidad, Cuba, © 2008 Paul Keller, some rights reserved


memoization

We are going to memoize the results of an operation. We’ll use rotation again, as it’s a simple case, and thus we can focus on the memoization code. We won’t worry about colouring our quadtrees (although implementing colour and memoization optimizations can be valuable for some cases).

Here’s our naïve quadtree rotation again:

const rotateQuadTree = multirec({
    indivisible : isString,
    value : itself,
    divide: quadTreeToRegions,
    combine: regionsToRotatedQuadTree
  });

In principle, our algorithm will consist of, “Do we already know how to rotate this? Yes? Return the answer. No? Rotate it, save the answer, and return the answer.”

As seen elsewhere, we can use memoize to take any function and give it this exact behaviour. Unfortunately, this won’t work:

const memoized = (fn, keymaker = JSON.stringify) => {
    const lookupTable = Object.create(null);

    return function (...args) {
      // spread the arguments, so key functions receive them as fn does
      const key = keymaker.apply(this, args);

      return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
    }
  };

const rotateQuadTree = memoized(
    multirec({
        indivisible : isString,
        value : itself,
        divide: quadTreeToRegions,
        combine: regionsToRotatedQuadTree
    })
  );

As explained, the memoization has to be applied to the function that is being called recursively. In other words, we need to memoize the function inside multirec. If we don’t, we memoize the first call (for the entire image), but none of the others.
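The difference is easy to measure by counting how often the function body runs. The following is a simplified sketch, not the essay's memoized or multirec: `memoize`, `plainRotate`, and `innerMemoized` are hypothetical names, leaves are the strings 'W' and 'B', and JSON.stringify is assumed to be an adequate key for such small trees.

```javascript
// Hypothetical single-argument memoizer, keyed on the tree's JSON text.
const memoize = (fn) => {
  const table = new Map();

  return (tree) => {
    const key = JSON.stringify(tree);

    if (!table.has(key)) table.set(key, fn(tree));
    return table.get(key);
  };
};

const isLeaf = (tree) => typeof tree === 'string';

// Version 1: memoize only the outermost call.
let plainCalls = 0;
const plainRotate = (tree) => {
  ++plainCalls;
  return isLeaf(tree)
    ? tree
    : {
        ul: plainRotate(tree.ll), ur: plainRotate(tree.ul),
        lr: plainRotate(tree.ur), ll: plainRotate(tree.lr)
      };
};
const outerMemoized = memoize(plainRotate);

// Version 2: the recursion itself goes through the memoized function.
let innerCalls = 0;
const innerMemoized = memoize((tree) => {
  ++innerCalls;
  return isLeaf(tree)
    ? tree
    : {
        ul: innerMemoized(tree.ll), ur: innerMemoized(tree.ul),
        lr: innerMemoized(tree.ur), ll: innerMemoized(tree.lr)
      };
});

// A 4x4 image whose four quadrants all have the same content.
const quadrant = { ul: 'W', ur: 'B', lr: 'W', ll: 'B' };
const image = { ul: quadrant, ur: quadrant, lr: quadrant, ll: quadrant };

const outerResult = outerMemoized(image);
const innerResult = innerMemoized(image);
// plainCalls is 21: the root, four quadrants, and sixteen leaves
// innerCalls is 4: the root, one quadrant, and the two distinct leaves
```

Both versions produce the same rotated image. Only the second avoids repeating work on the identical quadrants.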

We can do this properly1 with a new combinator and a function to generate keys:

function memoizedMultirec({ indivisible, value, divide, combine, key }) {
  const myself = memoized((input) => {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }, key);

  return myself;
}

const catenateKeys = (keys) => keys.join('');

const simpleKey = multirec({
    indivisible : isString,
    value : itself,
    divide: quadTreeToRegions,
    combine: catenateKeys
  });

const memoizedRotateQuadTree = memoizedMultirec({
      indivisible : isString,
      value : itself,
      divide: quadTreeToRegions,
      combine: regionsToRotatedQuadTree,
      key: simpleKey
  });

And now, we are able to take advantage of redundancy within our images: Whenever two quadrants of any size have identical content, we need only rotate one. We get the other from our lookup table.

Of course, we have an additional overhead involved in checking our cache, and we require additional space for the cache. And we are handwaving over the work involved in computing keys. But for the moment, we grasp that we can take advantage of a recursive data structure to exploit redundancy in our data.

Let’s exploit it even more.


canonicalization

One of the benefits of our key function is that when two different quadrants have the same content, we return the same result for rotating them. We’re exploiting redundancy to reduce the time required to perform an operation like rotating a square.

But that isn’t the only redundancy we can exploit. What about redundancy in the space required to represent a square? Why do we ever need two different quadtrees to have the same content?

If we reused the same quadtree whenever we needed the same content, we would be able to save space as well as time. Our quadtree would go from being a strict tree to being a DAG, but it would be smaller in many cases. In the most extreme case of an image entirely white or entirely black, it would become equivalent to a linked list with four identical links at each level of the quadtree.

To make this work, we have to isolate the creation of new quadtrees. Let’s start by extracting them into a function:

const quadtree = (ul, ur, lr, ll) =>
  ({ ul, ur, lr, ll });

const regionsToQuadTree = ([ul, ur, lr, ll]) =>
  quadtree(ul, ur, lr, ll);

const regionsToRotatedQuadTree = ([ur, lr, ll, ul]) =>
  quadtree(ul, ur, lr, ll);

Next, we memoize our new quadtree function. We compute the key of a quadtree we want to create much the same as how we compute the key of a finished quadtree, although that is not strictly necessary for the function to work:

const compositeKey = (...regions) => regions.map(simpleKey).join('');

const quadtree = memoized(
    (ul, ur, lr, ll) => ({ ul, ur, lr, ll }),
    compositeKey
  );

const regionsToQuadTree = ([ul, ur, lr, ll]) =>
  quadtree(ul, ur, lr, ll);

const regionsToRotatedQuadTree = ([ur, lr, ll, ul]) =>
  quadtree(ul, ur, lr, ll);

Now redundant quadtrees optimize for space as well as for time.
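The space saving shows up as object identity: building the same content twice yields the very same object. Here is a minimal self-contained sketch of that idea. The names `keyOf` and `canonical` are hypothetical, the constructor uses its own Map instead of the essay's memoized, and the parentheses in the keys are an assumption added to keep nested keys unambiguous.

```javascript
const isLeaf = (something) => typeof something === 'string';

// Hypothetical key function: recursive, with parentheses around each node.
const keyOf = (something) =>
  isLeaf(something)
    ? something
    : `(${keyOf(something.ul)}${keyOf(something.ur)}${keyOf(something.lr)}${keyOf(something.ll)})`;

const canonical = new Map();

// Hypothetical canonicalizing constructor: same content, same object.
const quadtree = (ul, ur, lr, ll) => {
  const key = [ul, ur, lr, ll].map(keyOf).join('');

  if (!canonical.has(key)) canonical.set(key, { ul, ur, lr, ll });
  return canonical.get(key);
};

// Each call asks for a "new" all-white 2x2 square.
const white2x2 = () => quadtree('W', 'W', 'W', 'W');
const white4x4 = quadtree(white2x2(), white2x2(), white2x2(), white2x2());

const sharesQuadrants =
  white4x4.ul === white4x4.ur &&
  white4x4.ur === white4x4.lr &&
  white4x4.lr === white4x4.ll;
// sharesQuadrants is true, and canonical.size is 2:
// one 2x2 node and one 4x4 node represent all sixteen cells
```

This is the linked-list shape described above: each level holds four references to one shared child.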


computing keys

The one thing that is terribly wrong with our work so far is that we are always recursively computing keys to achieve our “efficiencies.” This is, as noted, O(n log n) every time we do it.

We could go with a straight memoization of the key computation function, but there is another path. Since we know we will always need to compute keys for every quadtree we make, why not compute them ahead of time, just like we did with colours in our coloured quadtrees?

After all, we might or might not need to rotate a square, but we will always need to check keys for canonicalization purposes. So we’ll put the key in a property. Since we’re trying to advance our code incrementally, we’ll use a symbol for the property key, instead of a string.

Our new simpleKey and quadtree functions will look like this:

const KEY = Symbol('key');

const simpleKey = (something) =>
  isString(something)
  ? something
  : something[KEY];

const quadtree = memoized(
    (ul, ur, lr, ll) => ({ ul, ur, lr, ll, [KEY]: compositeKey(ul, ur, lr, ll) }),
    compositeKey
  );

Thus, the keys are memoized, but explicitly within each canonicalized quadtree instead of in a separate lookup table. Now the redundant computation of keys is gone.
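A small sketch shows why a symbol is a convenient place to keep the key. It uses a hypothetical uncached constructor named `node`, so that the only thing being demonstrated is the stored key, and it wraps keys in parentheses, which is an assumption of this example.

```javascript
const KEY = Symbol('key');

// Reading a key never recurses: leaves are their own key, nodes carry theirs.
const keyOf = (something) =>
  typeof something === 'string' ? something : something[KEY];

// Hypothetical uncached constructor that stores the key at creation time.
const node = (ul, ur, lr, ll) => ({
  ul, ur, lr, ll,
  [KEY]: `(${[ul, ur, lr, ll].map(keyOf).join('')})`
});

const small = node('W', 'B', 'W', 'B');
const large = node(small, small, small, small);

const largeKey = large[KEY];
// largeKey is '((WBWB)(WBWB)(WBWB)(WBWB))', assembled from the children's stored keys

const visibleProperties = Object.keys(large);
// visibleProperties is ['ul', 'ur', 'lr', 'll']: the symbol is not enumerated as a string key

const serialized = JSON.stringify(small);
// serialized is '{"ul":"W","ur":"B","lr":"W","ll":"B"}': symbol-keyed properties are skipped
```

Code that iterates over a quadtree's regions or serializes it therefore keeps working unchanged when the key is added.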


summary of memoization and canonicalization

We have seen that recursive data structures like quadtrees offer opportunities to take advantage of redundancy, and that we can exploit this to save both time and space. The complete code for our memoized and canonicalized quadtrees is below.

These are important optimizations, and they flow directly from what we investigated in Why Recursive Data Structures?: The isomorphism between the shape of the data structure and the run-time behaviour of the algorithm allows us to create optimizations that are not otherwise possible.

And again, the opportunity to exploit these optimizations very much depends upon the amount of redundancy in the “flat” representation of the underlying data.

But now let us consider a completely different kind of operation on quadtrees.


Composition in Red, Blue, and Yellow

Photograph of Mondrian’s “Composition in Red, Blue, and Yellow,” © Davis Staedtler, some rights reserved


averaging

Operations like rotation, superimposition, and reflection are all self-contained. For example, the result for rotating a square or region is always exactly the same size as the square. Furthermore, these operations scale: They can be defined from a smallest possible size and up. As a result, they have a natural “fit” with multirec, or to be more precise, with divide-and-conquer algorithms.

But not all operations are self-contained. Let us take, as an example, an image filter we will call average. To average an image, each pixel takes on a colour based on the weighted average of the colours of the pixels in its immediate neighbourhood.

A black pixel surrounded by mostly white pixels becomes white, and a white pixel surrounded by mostly black pixels becomes black. If there are an equal number of black and white neighbours, the pixel stays the same colour.

We’ll say that if a black pixel has five or more white “neighbours,” it becomes white, while if a white pixel has five or more black neighbours, it becomes black. Consider only the centre pixel in these diagrams:

This one stays black, it has an equal number of black and white neighbours:

⚪️⚫️⚫️
⚫️⚫️⚪️
⚫️⚪️⚪️

This one becomes white, it has more white neighbours than black:

⚪️⚪️⚪️
⚫️⚫️⚪️
⚫️⚪️⚪️

This stays white, it has more white neighbours than black:

⚪️⚫️⚫️
⚪️⚪️⚪️
⚪️⚪️⚪️

This white pixel becomes black, it has five black neighbours and only three white neighbours:

⚪️⚫️⚫️
⚫️⚪️⚪️
⚪️⚫️⚫️

Applied to a larger image, this:

⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️

Becomes this after “averaging” it once:

⚪️⚫️⚪️⚪️
⚫️⚪️⚪️⚪️
⚫️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️

And we can average it again, just as we can rotate images more than once:

⚫️⚪️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚫️⚪️

And again:

⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️

And again:

⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️
⚫️⚫️⚪️⚪️

We’ve reached an equilibrium, further averaging operations will have no effect on our image. This crude “average” operation is not particularly useful for graphics, but it is simple enough that we can explore more of the ramifications of working with memoized and canonicalized algorithms, so let’s carry on.


the problem

We could easily code a function to determine the result of “averaging” a pixel, something like this:

function averagedPixel (pixel, blackNeighbours) {
  if (pixel === '⚪️') {
    return [5, 6, 7, 8].includes(blackNeighbours) ? '⚫️' : '⚪️';
  } else {
    return [4, 5, 6, 7, 8].includes(blackNeighbours) ? '⚫️' : '⚪️';
  }
}

But there is a bug! This only works for interior pixels, edges and corners would have different numbers. We could fix this, but before we do, let’s realize something significant. We had to make no such adjustment for rotating images. Edges and corners weren’t special.

Because edges and corners are special, the behaviour of averaging a square is different depending upon whether the edge of the square is the edge of the entire image or not. If it’s not the entire image, we get different values for our edges and corners depending upon the square’s neighbours.

Consider, for example:

⚫️⚪️⚪️⚫️ ⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️ ⚪️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️ ⚪️⚫️⚪️⚫️
⚪️⚪️⚫️⚪️ ⚫️⚪️⚫️⚪️

⚪️⚫️⚪️⚪️ ⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️ ⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️ ⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️ ⚪️⚪️⚪️⚫️

Our square in the upper-right now has a different outcome for its lower and left edges, because they have neighbours from the upper-left and lower-right quadrants:

 ⚪️⚫️⚪️⚪️
 ⚫️⚪️⚪️⚪️
 ⚪️⚫️⚫️⚪️
 ⚪️⚫️⚫️⚫️

This is different than “rotate.” With rotate, rotating a square had no dependency on any adjacent squares. That’s what makes our “divide and conquer” algorithms work, and especially our memoization work: rotating a square was rotating a square was rotating a square.

As it happens, averaging a square is not always the same as averaging a square. So what do we do? What can we salvage?


averaging the centre of a 4x4 square

No matter what neighbours a 4x4 square has or doesn’t have, the result of averaging the square will always be the same for the centre four pixels. They are only affected by the pixels that are in the square, not by its neighbours.

Those centre four pixels make up a square that is half the size of the entire square. So here’s a very conservative conjecture: Perhaps we can write an algorithm for averaging that only tells us what happens to the centre of a square.

We know how to write a function that gives us the average for the centre pixels of a 4x4 square, so let’s start with that, it will be the “indivisible case” for our multirec. We’ll enumerate the list of neighbours for each of the four squares in the centre, clockwise from the neighbour to the upper-left of the pixel we’re averaging:

const sq = arrayToQuadTree([
  ['0', '1', '2', '3'],
  ['4', '5', '6', '7'],
  ['8', '9', 'A', 'B'],
  ['C', 'D', 'E', 'F']
]);

const neighboursOfUlLr = (square) => [
    square.ul.ul, square.ul.ur, square.ur.ul, square.ur.ll,
    square.lr.ul, square.ll.ur, square.ll.ul, square.ul.ll
  ];

neighboursOfUlLr(sq).join('')
  //=> "0126A984"

const neighboursOfUrLl = (square) => [
    square.ul.ur, square.ur.ul, square.ur.ur, square.ur.lr,
    square.lr.ur, square.lr.ul, square.ll.ur, square.ul.lr
  ];

neighboursOfUrLl(sq).join('')
  //=> "1237BA95"

const neighboursOfLrUl = (square) => [
    square.ul.lr, square.ur.ll, square.ur.lr, square.lr.ur,
    square.lr.lr, square.lr.ll, square.ll.lr, square.ll.ur
  ];

neighboursOfLrUl(sq).join('')
  //=> "567BFED9"

const neighboursOfLlUr = (square) => [
    square.ul.ll, square.ul.lr, square.ur.ll, square.lr.ul,
    square.lr.ll, square.ll.lr, square.ll.ll, square.ll.ul
  ];

neighboursOfLlUr(sq).join('')
  //=> "456AEDC8"
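The quadtree paths are easy to get wrong, so it can help to cross-check the expected strings against plain row and column arithmetic. In the sketch below, `CLOCKWISE` and `neighboursAt` are hypothetical names, and the grid is the same labelled 4x4 array used above, kept as a flat array of rows.

```javascript
const grid = [
  ['0', '1', '2', '3'],
  ['4', '5', '6', '7'],
  ['8', '9', 'A', 'B'],
  ['C', 'D', 'E', 'F']
];

// [row, column] offsets, clockwise from the upper-left neighbour.
const CLOCKWISE = [
  [-1, -1], [-1, 0], [-1, 1], [0, 1],
  [1, 1], [1, 0], [1, -1], [0, -1]
];

// Hypothetical helper: the eight neighbours of an interior cell.
const neighboursAt = (rows, r, c) =>
  CLOCKWISE.map(([dr, dc]) => rows[r + dr][c + dc]);

// The four centre cells, named after the quadtree paths that reach them.
const byCoordinates = {
  ulLr: neighboursAt(grid, 1, 1).join(''), // cell '5'
  urLl: neighboursAt(grid, 1, 2).join(''), // cell '6'
  lrUl: neighboursAt(grid, 2, 2).join(''), // cell 'A'
  llUr: neighboursAt(grid, 2, 1).join('')  // cell '9'
};
// byCoordinates is
//   { ulLr: '0126A984', urLl: '1237BA95', lrUl: '567BFED9', llUr: '456AEDC8' }
```

These are the same four strings that the quadtree versions are expected to produce.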

We can count the number of black neighbouring pixels:

const countNeighbouringBlack = (neighbours) =>
  neighbours.reduce((c, n) => n === '⚫️' ? c + 1 : c, 0);

We already have a function for determining the result of averaging a pixel with its neighbours, we’ll extract the arrays to make it more compact:

const B = [5, 6, 7, 8];
const S = [4, 5, 6, 7, 8];

const averagedPixel = (pixel, blackNeighbours) =>
  (pixel === '⚪️')
  ? B.includes(blackNeighbours) ? '⚫️' : '⚪️'
  : S.includes(blackNeighbours) ? '⚫️' : '⚪️';
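As a sanity check, the rule can be applied to the four 3x3 diagrams shown earlier. This sketch is self-contained and uses hypothetical names (`W`, `K`, `averagedCentre`); it expresses the rule as thresholds, five or more black neighbours for a white pixel and four or more for a black pixel, which should be equivalent to the two arrays.

```javascript
const W = '⚪️'; // white
const K = '⚫️'; // black

// Hypothetical helper: average only the centre pixel of a 3x3 grid.
const averagedCentre = (rows) => {
  const neighbours = [
    rows[0][0], rows[0][1], rows[0][2], rows[1][2],
    rows[2][2], rows[2][1], rows[2][0], rows[1][0]
  ];
  const blackNeighbours = neighbours.filter((n) => n === K).length;

  // a white pixel needs 5 or more black neighbours to turn black,
  // a black pixel needs 4 or more to stay black
  const threshold = rows[1][1] === W ? 5 : 4;

  return blackNeighbours >= threshold ? K : W;
};

const diagrams = [
  [[W, K, K], [K, K, W], [K, W, W]], // black, four black neighbours
  [[W, W, W], [K, K, W], [K, W, W]], // black, two black neighbours
  [[W, K, K], [W, W, W], [W, W, W]], // white, two black neighbours
  [[W, K, K], [K, W, W], [W, K, K]]  // white, five black neighbours
];

const outcomes = diagrams.map(averagedCentre);
// outcomes is [K, W, W, K]: stays black, becomes white, stays white, becomes black
```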

Now we have everything we need to compute the average of the centre four pixels of a 4x4 square:

const averageOf4x4 = (sq) => quadtree(
    averagedPixel(sq.ul.lr, countNeighbouringBlack(neighboursOfUlLr(sq))),
    averagedPixel(sq.ur.ll, countNeighbouringBlack(neighboursOfUrLl(sq))),
    averagedPixel(sq.lr.ul, countNeighbouringBlack(neighboursOfLrUl(sq))),
    averagedPixel(sq.ll.ur, countNeighbouringBlack(neighboursOfLlUr(sq)))
  );

averageOf4x4(
  arrayToQuadTree([
    ['⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚫️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚫️'],
    ['⚫️', '⚪️', '⚫️', '⚪️']
  ])
)
  //=>
    { "ul": "⚪️", "ur": "⚪️",
      "ll": "⚫️", "lr": "⚫️"   }

Can we build from here? Yes, and with some interesting manoeuvres. But first, a necessary digression. It won’t take long.


Peart Elementary

Peart Elementary, via Kim Siever, in the public domain


subdividing quadtrees

Quadtrees make it obviously easy to subdivide a square into its upper-left, upper-right, lower-right, and lower-left regions. Given:

⚪️⚪️⚪️⚫️⚪️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️⚫️⚪️⚪️⚪️
⚪️⚫️⚪️⚫️⚫️⚪️⚪️⚪️
⚫️⚪️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚪️⚫️
⚪️⚪️⚪️⚫️⚫️⚪️⚫️⚪️
⚪️⚪️⚪️⚫️⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚪️⚫️⚪️⚪️⚪️

We extract:

⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️ ⚫️⚪️⚪️⚪️
⚪️⚫️⚪️⚫️ ⚫️⚪️⚪️⚪️
⚫️⚪️⚫️⚪️ ⚪️⚫️⚫️⚪️

⚪️⚫️⚫️⚪️ ⚪️⚫️⚪️⚫️
⚪️⚪️⚪️⚫️ ⚫️⚪️⚫️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️

const upperleft = (square) =>
  square.ul;

const upperright = (square) =>
  square.ur;

const lowerright = (square) =>
  square.lr;

const lowerleft = (square) =>
  square.ll;

There are other regions we can easily extract. For example, the upper-centre and lower-centre:

⚪️⚪️ ⚪️⚫️⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚫️⚪️⚫️⚪️ ⚪️⚪️
⚪️⚫️ ⚪️⚫️⚫️⚪️ ⚪️⚪️
⚫️⚪️ ⚫️⚪️⚪️⚫️ ⚫️⚪️

⚪️⚫️ ⚫️⚪️⚪️⚫️ ⚪️⚫️
⚪️⚪️ ⚪️⚫️⚫️⚪️ ⚫️⚪️
⚪️⚪️ ⚪️⚫️⚪️⚫️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️⚫️⚪️ ⚪️⚪️

And the left-middle and right-middle:

⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️ ⚫️⚪️⚪️⚪️

⚪️⚫️⚪️⚫️ ⚫️⚪️⚪️⚪️
⚫️⚪️⚫️⚪️ ⚪️⚫️⚫️⚪️
⚪️⚫️⚫️⚪️ ⚪️⚫️⚪️⚫️
⚪️⚪️⚪️⚫️ ⚫️⚪️⚫️⚪️

⚪️⚪️⚪️⚫️ ⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️

And the middle-centre:

⚪️⚪️ ⚪️⚫️⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚫️⚪️⚫️⚪️ ⚪️⚪️

⚪️⚫️ ⚪️⚫️⚫️⚪️ ⚪️⚪️
⚫️⚪️ ⚫️⚪️⚪️⚫️ ⚫️⚪️
⚪️⚫️ ⚫️⚪️⚪️⚫️ ⚪️⚫️
⚪️⚪️ ⚪️⚫️⚫️⚪️ ⚫️⚪️

⚪️⚪️ ⚪️⚫️⚪️⚫️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️⚫️⚪️ ⚪️⚪️

The code is a tad more involved, as we must compose these regions from the subregions of our square’s regions:

const uppercentre = (square) =>
  quadtree(square.ul.ur, square.ur.ul, square.ur.ll, square.ul.lr);

const rightmiddle = (square) =>
  quadtree(square.ur.ll, square.ur.lr, square.lr.ur, square.lr.ul);

const lowercentre = (square) =>
  quadtree(square.ll.ur, square.lr.ul, square.lr.ll, square.ll.lr);

const leftmiddle = (square) =>
  quadtree(square.ul.ll, square.ul.lr, square.ll.ur, square.ll.ul);

const middlecentre = (square) =>
  quadtree(square.ul.lr, square.ur.ll, square.lr.ul, square.ll.ur);
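It is easy to transpose a `ul` and an `lr` in functions like these, so here is a self-contained sketch that checks them against plain array slicing. It uses an unmemoized `quadtree` and two small conversion helpers, `fromArray` and `toArray`, which are stand-ins invented for this example rather than the essay's `arrayToQuadTree` and `quadTreeToArray`. Each cell is labelled with its row and column so we can see where it came from.

```javascript
// Plain quadtrees for illustration; regions are listed clockwise from upper-left.
const quadtree = (ul, ur, lr, ll) => ({ ul, ur, lr, ll });

// Build a quadtree from a square array whose side is a power of two.
const fromArray = (rows) => {
  if (rows.length === 1) return rows[0][0];
  const half = rows.length / 2;
  const region = (r, c) =>
    fromArray(rows.slice(r, r + half).map((row) => row.slice(c, c + half)));
  return quadtree(region(0, 0), region(0, half), region(half, half), region(half, 0));
};

// Flatten a quadtree back into an array of rows.
const toArray = (qt) => {
  if (typeof qt === 'string') return [[qt]];
  const [ul, ur, lr, ll] = [qt.ul, qt.ur, qt.lr, qt.ll].map(toArray);
  const beside = (left, right) => left.map((row, i) => row.concat(right[i]));
  return beside(ul, ur).concat(beside(ll, lr));
};

const uppercentre = (square) =>
  quadtree(square.ul.ur, square.ur.ul, square.ur.ll, square.ul.lr);
const rightmiddle = (square) =>
  quadtree(square.ur.ll, square.ur.lr, square.lr.ur, square.lr.ul);
const lowercentre = (square) =>
  quadtree(square.ll.ur, square.lr.ul, square.lr.ll, square.ll.lr);
const leftmiddle = (square) =>
  quadtree(square.ul.ll, square.ul.lr, square.ll.ur, square.ll.ul);
const middlecentre = (square) =>
  quadtree(square.ul.lr, square.ur.ll, square.lr.ul, square.ll.ur);

// An 8x8 grid whose cells are labelled "row,col".
const grid = Array.from({ length: 8 }, (_, r) =>
  Array.from({ length: 8 }, (_, c) => `${r},${c}`));

// The 4x4 region whose top-left corner is at (top, left), by array slicing.
const slice4x4 = (rows, top, left) =>
  rows.slice(top, top + 4).map((row) => row.slice(left, left + 4));

const sq = fromArray(grid);
const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);

console.log(same(toArray(uppercentre(sq)), slice4x4(grid, 0, 2)));  // true
console.log(same(toArray(rightmiddle(sq)), slice4x4(grid, 2, 4)));  // true
console.log(same(toArray(lowercentre(sq)), slice4x4(grid, 4, 2)));  // true
console.log(same(toArray(leftmiddle(sq)), slice4x4(grid, 2, 0)));   // true
console.log(same(toArray(middlecentre(sq)), slice4x4(grid, 2, 2))); // true
```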

Of course, these regions we extract and compose will benefit from canonicalization. Ok, back to averaging!


averaging the centre of an 8x8 square

Now let’s consider an 8x8 square, something like this:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️⚪️⚫️
⚪️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚪️⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

We can, of course, write a function that calculates the average for the centre 6x6 square, using methods very much like the ones we used for calculating the centre 2x2 of a 4x4 square. However, what we want to do is make the calculation based on the function we already have for computing the average of a 4x4 square.

If we can use the function we already have as a building block, we can build a recursive function that memoizes and canonicalizes.

To start with, let’s imagine we are computing averages of the 4x4 regions. To show the geometry, we will colour the averages blue. Here’s what we get when we average each of the four regions:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️🔵🔵⚪️⚪️🔵🔵⚪️
⚪️🔵🔵⚫️⚪️🔵🔵⚫️
⚪️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚪️⚪️⚫️⚪️⚫️
⚫️🔵🔵⚪️⚫️🔵🔵⚪️
⚪️🔵🔵⚪️⚪️🔵🔵⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

Here’s a 2x2 gap we didn’t average:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️🔵🔵⚫️⚫️⚪️
⚪️⚫️⚪️🔵🔵⚫️⚪️⚫️
⚪️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚪️⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

How do we get its average? By averaging this 4x4 square:

⚫️⚪️🔴🔴🔴🔴⚪️⚪️
⚪️⚫️🔴🔵🔵🔴⚫️⚪️
⚪️⚫️🔴🔵🔵🔴⚪️⚫️
⚪️⚪️🔴🔴🔴🔴⚫️⚪️
⚪️⚫️⚪️⚪️⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

Luckily, we know how to get that 4x4 square from an 8x8 square: it’s the uppercentre function we wrote earlier. And we can fill in the rest of the gaps using our rightmiddle, lowercentre, leftmiddle, and middlecentre functions:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️🔴🔴🔴🔴
⚪️⚪️⚫️⚪️🔴🔵🔵🔴
⚪️⚫️⚪️⚪️🔴🔵🔵🔴
⚫️⚪️⚫️⚪️🔴🔴🔴🔴
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️⚪️⚫️
⚪️⚪️⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️🔴🔴🔴🔴⚪️⚫️
⚫️⚪️🔴🔵🔵🔴⚫️⚪️
⚪️⚫️🔴🔵🔵🔴⚫️⚪️
⚫️⚪️🔴🔴🔴🔴⚪️⚫️

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
🔴🔴🔴🔴⚪️⚫️⚪️⚫️
🔴🔵🔵🔴⚫️⚪️⚫️⚪️
🔴🔵🔵🔴⚪️⚫️⚪️⚫️
🔴🔴🔴🔴⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️🔴🔴🔴🔴⚪️⚫️
⚪️⚪️🔴🔵🔵🔴⚫️⚪️
⚪️⚫️🔴🔵🔵🔴⚪️⚫️
⚫️⚪️🔴🔴🔴🔴⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

This gives us the centre 6x6 average of an 8x8 square:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️🔵🔵🔵🔵🔵🔵⚪️
⚪️🔵🔵🔵🔵🔵🔵⚫️
⚪️🔵🔵🔵🔵🔵🔵⚪️
⚪️🔵🔵🔵🔵🔵🔵⚫️
⚫️🔵🔵🔵🔵🔵🔵⚪️
⚪️🔵🔵🔵🔵🔵🔵⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

Let’s write it:

const from8x8to6x6 = (sq) => ({
    ul: averageOf4x4(upperleft(sq)),
    uc: averageOf4x4(uppercentre(sq)),
    ur: averageOf4x4(upperright(sq)),
    lm: averageOf4x4(leftmiddle(sq)),
    mc: averageOf4x4(middlecentre(sq)),
    rm: averageOf4x4(rightmiddle(sq)),
    ll: averageOf4x4(lowerleft(sq)),
    lc: averageOf4x4(lowercentre(sq)),
    lr: averageOf4x4(lowerright(sq))
  });

This is an ungainly beast. It doesn’t look like our quadtrees at all, so we obviously don’t have an algorithm that is isomorphic to our data structure. What we need is a way to get from an 8x8 square to a 4x4 averaged centre. That would have the same “shape” as going from a 4x4 to a 2x2 averaged centre.

Can we go from a 6x6 square to a 4x4 square? Yes. First, note that a 6x6 square can be decomposed into four overlapping 4x4 squares:

🔵🔵🔵🔵⚫️⚫️
🔵🔵🔵🔵⚫️⚪️
🔵🔵🔵🔵⚪️⚫️
🔵🔵🔵🔵⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

⚫️⚫️🔵🔵🔵🔵
⚫️⚪️🔵🔵🔵🔵
⚪️⚫️🔵🔵🔵🔵
⚫️⚪️🔵🔵🔵🔵
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️🔵🔵🔵🔵
⚫️⚪️🔵🔵🔵🔵
⚪️⚫️🔵🔵🔵🔵
⚫️⚫️🔵🔵🔵🔵

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️
🔵🔵🔵🔵⚪️⚫️
🔵🔵🔵🔵⚫️⚪️
🔵🔵🔵🔵⚪️⚫️
🔵🔵🔵🔵⚫️⚫️

And we can decompose them quite easily:

const decompose = ({ ul, uc, ur, lm, mc, rm, ll, lc, lr }) =>
  ({
    ul: quadtree(ul, uc, mc, lm),
    ur: quadtree(uc, ur, rm, mc),
    lr: quadtree(mc, rm, lr, lc),
    ll: quadtree(lm, mc, lc, ll)
  });

If we average those individually, we get:

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️🔵🔵⚪️⚫️⚪️
⚪️🔵🔵⚫️⚪️⚫️
⚫️⚪️⚪️⚪️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️⚪️⚫️🔵🔵⚪️
⚪️⚫️⚪️🔵🔵⚫️
⚫️⚪️⚪️⚪️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️⚪️⚪️🔵🔵⚪️
⚪️⚫️⚪️🔵🔵⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚫️
⚫️🔵🔵⚪️⚫️⚪️
⚪️🔵🔵⚫️⚪️⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

Here’s that code:

const averages = ({ ul, ur, lr, ll }) =>
  ({
    ul: averageOf4x4(ul),
    ur: averageOf4x4(ur),
    lr: averageOf4x4(lr),
    ll: averageOf4x4(ll)
  });

And we can compose them back into a quadtree, giving us:

⚫️⚫️⚪️⚪️⚫️⚫️
⚫️🔵🔵🔵🔵⚪️
⚪️🔵🔵🔵🔵⚫️
⚫️🔵🔵🔵🔵⚪️
⚪️🔵🔵🔵🔵⚫️
⚫️⚫️⚪️⚪️⚫️⚫️

const centre4x4 = ({ ul, ur, lr, ll }) =>
  quadtree(ul, ur, lr, ll);

If we superimpose it on our 8x8 square, we see that we have:

⚫️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚪️⚫️🔵🔵🔵🔵⚪️⚫️
⚪️⚪️🔵🔵🔵🔵⚫️⚪️
⚪️⚫️🔵🔵🔵🔵⚪️⚫️
⚫️⚪️🔵🔵🔵🔵⚫️⚪️
⚪️⚫️⚫️⚪️⚪️⚫️⚫️⚪️
⚫️⚪️⚪️⚫️⚪️⚪️⚪️⚫️

Aha! This is the averaged centre 4x4 of an 8x8 square. It has the same “shape” as getting the averaged 2x2 of a 4x4 square. We’ve averaged twice to get here, but hold that thought.
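To make "averaged twice" concrete, here is a brute-force sketch on plain arrays rather than quadtrees. It averages every interior cell of the 8x8 square once, giving a 6x6, and then again, giving a 4x4. The names `nextCell` and `averageInterior`, and the `#`/`.` shorthand for the grid, are inventions of this example.

```javascript
const BLACK = '⚫️';
const WHITE = '⚪️';
const B = [5, 6, 7, 8];
const S = [4, 5, 6, 7, 8];

// Next state of the cell at (r, c); assumes the cell is not on the edge.
const nextCell = (grid, r, c) => {
  let black = 0;
  for (const dr of [-1, 0, 1]) {
    for (const dc of [-1, 0, 1]) {
      if ((dr !== 0 || dc !== 0) && grid[r + dr][c + dc] === BLACK) black++;
    }
  }
  return (grid[r][c] === WHITE ? B : S).includes(black) ? BLACK : WHITE;
};

// Average every interior cell once; the result loses one row or column per side.
const averageInterior = (grid) =>
  grid.slice(1, -1).map((row, i) =>
    row.slice(1, -1).map((_, j) => nextCell(grid, i + 1, j + 1)));

// The 8x8 square from above, with '#' for black and '.' for white.
const eightByEight = [
  '#..##...',
  '.##..##.',
  '.#.#.#.#',
  '..#.#.#.',
  '.#...#.#',
  '#.#.#.#.',
  '.##..##.',
  '#..#...#'
].map((row) => [...row].map((ch) => ch === '#' ? BLACK : WHITE));

const once = averageInterior(eightByEight);  // the centre 6x6, averaged once
const twice = averageInterior(once);         // the centre 4x4, averaged twice

console.log(once.length, twice.length);      // 6 4
console.log(twice.map((row) => row.join('')).join('\n'));
```

For this particular square, the twice-averaged centre comes out entirely white.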

If we can turn this into a generalized algorithm, we can write a multirec to average quadtrees of any size.


generalizing averaging

Here’s our memoized multirec again:

function memoizedMultirec({ indivisible, value, divide, combine, key }) {
  const myself = memoized((input) => {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }, key);

  return myself;
}

Just like multirec, we need indivisible, value, divide, combine, and key. Since the smallest square we want to average is 4x4, the test for indivisible is simple. value is our existing averageOf4x4 function, and we’ll use our simpleKey for the key:

const is4x4 = (square) => isString(square.ul.ul);

const average = memoizedMultirec({
    indivisible: is4x4,
    value: averageOf4x4,
    // divide: ???
    // combine: ???
    key: simpleKey
  });

What about dividing a square that is larger than 4x4? We wrote that code: we divide it into nine regions, not four. We’ll adjust it to just do the division:

const divideQuadtreeIntoNine = (square) => [
    upperleft(square),
    uppercentre(square),
    upperright(square),
    leftmiddle(square),
    middlecentre(square),
    rightmiddle(square),
    lowerleft(square),
    lowercentre(square),
    lowerright(square)
  ];

And given the averages of those nine squares, we can recombine them into a “nonettree.” A nonettree of 2x2 squares is a 6x6 square, but larger nonettrees are possible too:

const combineNineIntoNonetTree = ([ul, uc, ur, lm, mc, rm, ll, lc, lr]) =>
  ({ ul, uc, ur, lm, mc, rm, ll, lc, lr });

As discussed, that isn’t enough. If we were recursively computing the averages of nonettrees, we would extract the four overlapping quadtrees from a nonettree:

const divideNonetTreeIntoQuadTrees = ({ ul, uc, ur, lm, mc, rm, ll, lc, lr }) =>
  [
    quadtree(ul, uc, mc, lm), // ul
    quadtree(uc, ur, rm, mc), // ur
    quadtree(mc, rm, lr, lc), // lr
    quadtree(lm, mc, lc, ll)  // ll
  ];

And we know exactly how to combine four quadtrees into a bigger quadtree: we use regionsToQuadTree.

Harrumph, another problem. Are we dividing with divideQuadtreeIntoNine? Or divideNonetTreeIntoQuadTrees? And are we combining the results using combineNineIntoNonetTree? Or regionsToQuadTree?

The problem is, memoizedMultirec is predicated on every step of the recursion involving a single division followed by a single combine of the results. But our average algorithm requires two steps.

We divide a quadtree into nine, and run our algorithm on each piece. Then we subcombine those results into a nonet. Then we subdivide the nonet, and run our algorithm on each piece. Then we combine those results into a final result.

So let’s make ourselves a new combinator:

function memoizedDoubleMultirec({ indivisible, value, divide, subcombine, subdivide, combine, key }) {
  const myself = memoized((input) => {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);
      const subcombined = subcombine(solutions);

      const subparts = subdivide(subcombined);
      const subsolutions = mapWith(myself)(subparts);

      return combine(subsolutions);
    }
  }, key);

  return myself;
}

const average = memoizedDoubleMultirec({
    indivisible: is4x4,
    value: averageOf4x4,
    divide: divideQuadtreeIntoNine,
    subcombine: combineNineIntoNonetTree,
    subdivide: divideNonetTreeIntoQuadTrees,
    combine: regionsToQuadTree
  });

const eightByEight = arrayToQuadTree([
    ['⚫️', '⚪️', '⚪️', '⚫️', '⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚫️', '⚪️', '⚪️', '⚫️', '⚫️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚫️', '⚪️', '⚫️', '⚪️', '⚫️'],
    ['⚪️', '⚪️', '⚫️', '⚪️', '⚫️', '⚪️', '⚫️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚪️', '⚪️', '⚫️', '⚪️', '⚫️'],
    ['⚫️', '⚪️', '⚫️', '⚪️', '⚫️', '⚪️', '⚫️', '⚪️'],
    ['⚪️', '⚫️', '⚫️', '⚪️', '⚪️', '⚫️', '⚫️', '⚪️'],
    ['⚫️', '⚪️', '⚪️', '⚫️', '⚪️', '⚪️', '⚪️', '⚫️']
  ]);

quadTreeToArray(average(eightByEight))
  //=>
    [
      ["⚪️", "⚪️", "⚪️", "⚪️"],
      ["⚪️", "⚪️", "⚪️", "⚪️"],
      ["⚪️", "⚪️", "⚪️", "⚪️"],
      ["⚪️", "⚪️", "⚪️", "⚪️"]
    ]

Excellent! Our memoizedDoubleMultirec can be used to implement algorithms—like average—where the result that can be memoized is smaller than the square itself, and with some care, we can accomplish the entire thing using memoized operations on squares.
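If you would like some evidence that dividing into nine, recombining, dividing into four, and recombining again really does advance the centre by the right number of generations, the following sketch compares the recursion against brute force on arrays. It is a compressed, self-contained restatement, not the essay's code: `qt` is an unmemoized quadtree, the cache is a `Map` keyed on `JSON.stringify`, and `pattern` is an arbitrary deterministic test grid. All of those names are assumptions of this example.

```javascript
const BLACK = '⚫️';
const WHITE = '⚪️';
const B = [5, 6, 7, 8];
const S = [4, 5, 6, 7, 8];

const qt = (ul, ur, lr, ll) => ({ ul, ur, lr, ll });

const fromArray = (rows) => {
  if (rows.length === 1) return rows[0][0];
  const h = rows.length / 2;
  const part = (r, c) =>
    fromArray(rows.slice(r, r + h).map((row) => row.slice(c, c + h)));
  return qt(part(0, 0), part(0, h), part(h, h), part(h, 0));
};

const toArray = (q) => {
  if (typeof q === 'string') return [[q]];
  const [ul, ur, lr, ll] = [q.ul, q.ur, q.lr, q.ll].map(toArray);
  const beside = (l, r) => l.map((row, i) => row.concat(r[i]));
  return beside(ul, ur).concat(beside(ll, lr));
};

// Brute force: one generation for every interior cell of an array.
const nextCell = (g, r, c) => {
  let n = 0;
  for (const dr of [-1, 0, 1]) {
    for (const dc of [-1, 0, 1]) {
      if ((dr || dc) && g[r + dr][c + dc] === BLACK) n++;
    }
  }
  return (g[r][c] === WHITE ? B : S).includes(n) ? BLACK : WHITE;
};
const stepArray = (g) =>
  g.slice(1, -1).map((row, i) =>
    row.slice(1, -1).map((_, j) => nextCell(g, i + 1, j + 1)));

// Base case: the centre 2x2 of a 4x4, one generation on.
const averageOf4x4 = (sq) => {
  const [[ul, ur], [ll, lr]] = stepArray(toArray(sq));
  return qt(ul, ur, lr, ll);
};

// Nine overlapping half-size regions: ul, uc, ur, lm, mc, rm, ll, lc, lr.
const divideIntoNine = (s) => [
  s.ul, qt(s.ul.ur, s.ur.ul, s.ur.ll, s.ul.lr), s.ur,
  qt(s.ul.ll, s.ul.lr, s.ll.ur, s.ll.ul),
  qt(s.ul.lr, s.ur.ll, s.lr.ul, s.ll.ur),
  qt(s.ur.ll, s.ur.lr, s.lr.ur, s.lr.ul),
  s.ll, qt(s.ll.ur, s.lr.ul, s.lr.ll, s.ll.lr), s.lr
];

const cache = new Map();
const average = (sq) => {
  const key = JSON.stringify(sq);
  if (!cache.has(key)) {
    if (typeof sq.ul.ul === 'string') {
      cache.set(key, averageOf4x4(sq));
    } else {
      const [ul, uc, ur, lm, mc, rm, ll, lc, lr] = divideIntoNine(sq).map(average);
      const quads = [
        qt(ul, uc, mc, lm), qt(uc, ur, rm, mc), qt(mc, rm, lr, lc), qt(lm, mc, lc, ll)
      ];
      cache.set(key, qt(...quads.map(average)));
    }
  }
  return cache.get(key);
};

// An arbitrary but repeatable n x n test pattern.
const pattern = (n) => Array.from({ length: n }, (_, r) =>
  Array.from({ length: n }, (_, c) =>
    (r * 7 + c * 13 + r * c) % 3 === 0 ? BLACK : WHITE));

const bruteForce = (grid, generations) =>
  generations === 0 ? grid : bruteForce(stepArray(grid), generations - 1);

// An n x n square should yield its centre, n / 4 generations on.
const agrees = (n) =>
  JSON.stringify(toArray(average(fromArray(pattern(n))))) ===
  JSON.stringify(bruteForce(pattern(n), n / 4));

console.log(agrees(8), agrees(16)); // true true
```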

As interesting as this is, we have two problems compared to an operation like rotation. First, we only wind up with half of the result we want. Second, we have a problem of time.

Let’s solve the half problem before we worry about the time.


getting the whole result

As long as edges have a special set of rules, e.g. fewer neighbours, we will not be able to use our recursive algorithm to compute the average of any arbitrary square, just the centre.

But what if edges don’t have a special behaviour? One way to eliminate the problem of edges is to define the problem such that it has no edges. Specifically, we can say that the behaviour of edge cells is as if the entire square was padded on all sides by “⚪️” cells.

So when given:

⚪️⚫️⚪️⚫️
⚫️⚪️⚫️⚪️
⚪️⚫️⚫️⚪️
⚪️⚪️⚪️⚫️

We compute it as if it was padded on all sides with blanks:

⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚫️⚪️⚫️⚪️
⚪️⚫️⚪️⚫️⚪️⚪️
⚪️⚪️⚫️⚫️⚪️⚪️
⚪️⚪️⚪️⚪️⚫️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️

Of course, it won’t fit our “square algorithm” if we only pad it with one column and row. But if we double its size, we will have all the padding we need, and the result will be the same size as the input square, like this:

⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚫️⚪️⚪️
⚪️⚪️⚫️⚪️⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️

The necessary code is straightforward:

const is2x2 = (square) => isString(square.ul);

const divideQuadTreeIntoRegions = ({ ul, ur, lr, ll }) =>
  [ul, ur, lr, ll];

const blank2x2 = quadtree('⚪️', '⚪️', '⚪️', '⚪️');

const blankCopy = memoizedMultirec({
    indivisible: is2x2,
    value: () => blank2x2,
    divide: divideQuadTreeIntoRegions,
    combine: regionsToQuadTree
  });

const double = (sqre) => {
  const padding = blankCopy(sqre.ul);

  const ul = quadtree(padding, padding, sqre.ul, padding);
  const ur = quadtree(padding, padding, padding, sqre.ur);
  const lr = quadtree(sqre.lr, padding, padding, padding);
  const ll = quadtree(padding, sqre.ll, padding, padding);

  return quadtree(ul, ur, lr, ll);
}
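Here is a self-contained sketch of the same idea that we can run and inspect. It is not the code above: it uses an unmemoized `qt`, and a simple recursive `blankLike` in place of `blankCopy`. Both names, along with `fromArray`, `toArray`, and the `#`/`.` shorthand, are assumptions of this example.

```javascript
const BLACK = '⚫️';
const WHITE = '⚪️';

const qt = (ul, ur, lr, ll) => ({ ul, ur, lr, ll });

const fromArray = (rows) => {
  if (rows.length === 1) return rows[0][0];
  const h = rows.length / 2;
  const part = (r, c) =>
    fromArray(rows.slice(r, r + h).map((row) => row.slice(c, c + h)));
  return qt(part(0, 0), part(0, h), part(h, h), part(h, 0));
};

const toArray = (q) => {
  if (typeof q === 'string') return [[q]];
  const [ul, ur, lr, ll] = [q.ul, q.ur, q.lr, q.ll].map(toArray);
  const beside = (l, r) => l.map((row, i) => row.concat(r[i]));
  return beside(ul, ur).concat(beside(ll, lr));
};

// A blank square (or a single blank cell) of the same size as its argument.
const blankLike = (square) =>
  typeof square === 'string'
    ? WHITE
    : qt(blankLike(square.ul), blankLike(square.ur),
         blankLike(square.lr), blankLike(square.ll));

// Each quadrant of the original is tucked into the inner corner of a
// quadrant of the result; the other three corners are padding.
const double = (square) => {
  const padding = blankLike(square.ul);

  return qt(
    qt(padding, padding, square.ul, padding),
    qt(padding, padding, padding, square.ur),
    qt(square.lr, padding, padding, padding),
    qt(padding, square.ll, padding, padding)
  );
};

const fourByFour = ['.#.#', '#.#.', '.##.', '...#']
  .map((row) => [...row].map((ch) => ch === '#' ? BLACK : WHITE));

const doubled = toArray(double(fromArray(fourByFour)));

console.log(doubled.map((row) => row.join('')).join('\n'));
```

The original square ends up in rows and columns 2 through 5 of the 8x8 result, with two blank rows and columns on every side.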

Now we can think about time.


TIME

Time, © 2010 Sean MacEntee, some rights reserved


time and cellular automata

When we rotate a square of any size, we rotate it once. We rotate and move about many parts of it, but when we conclude, it has only been rotated ninety degrees. But this is not the case with our average algorithm.

If we average a 4x4 square, the centre 2x2 pixels have been averaged once. But when we average an 8x8 square, the centre 4x4 square is composed of 2x2 squares that have been averaged twice, as we saw above.

If we average a 16x16 square, we would wind up averaging the centre 8x8 square four times. And up it goes exponentially. Were we to average a 1024x1024 square, we would get as a result a 512x512 square, representing the result of averaging the pixels 256 times!
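The arithmetic follows directly from the shape of the recursion: a 4x4 square is averaged once, and every larger square runs the half-size algorithm twice in succession. A tiny sketch, with `generations` being a name made up here:

```javascript
// Generations advanced by one pass over a square with the given side length
// (assumed to be a power of two, and at least 4).
const generations = (size) =>
  size === 4 ? 1 : 2 * generations(size / 2);

for (const size of [4, 8, 16, 1024]) {
  console.log(`${size}x${size}: centre ${size / 2}x${size / 2}, ` +
    `${generations(size)} generation(s)`);
}
// 4x4: centre 2x2, 1 generation(s)
// 8x8: centre 4x4, 2 generation(s)
// 16x16: centre 8x8, 4 generation(s)
// 1024x1024: centre 512x512, 256 generation(s)
```

The count doubles with each level of the quadtree, which works out to a quarter of the side length.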

This turns out to be not very useful for operations that are only meant to be performed once. On the other hand, if we want to iteratively perform an operation many, many, many times, it is very useful and can be very fast.

So perhaps averaging is not a good domain for memoizing and canonicalizing. What kind of operation would benefit from being run dozens, hundreds, thousands, or even millions and in some cases billions of times?


So far, we’ve talked about quadtrees storing image information. This was nice because algorithms like rotate are very visual. But images aren’t the only thing we can represent with a quadtree, and images don’t often benefit from repeated operations.

But let’s look at averagedPixel one more time:

const B = [5, 6, 7, 8];
const S = [4, 5, 6, 7, 8];

const averagedPixel = (pixel, blackNeighbours) =>
  (pixel === '⚪️')
  ? B.includes(blackNeighbours) ? '⚫️' : '⚪️'
  : S.includes(blackNeighbours) ? '⚫️' : '⚪️';

If we think of our pixels as state machines, what we are describing is a state machine with two states (‘⚫️’ and ‘⚪️’), and a rule for determining the next state it will take based on the number of neighbours in the ‘⚫️’ state.

We have, in fact, a two-dimensional cellular automaton, and our averagedPixel state machine encodes one particular set of rules. There are many others.

A cellular automaton consists of a regular grid of cells, each in one of a finite number of states, such as on and off (in contrast to a coupled map lattice). The grid can be in any finite number of dimensions. For each cell, a set of cells called its neighbourhood is defined relative to the specified cell.

An initial state (time t = 0) is selected by assigning a state for each cell. A new generation is created (advancing t by 1), according to some fixed rule (generally, a mathematical function) that determines the new state of each cell in terms of the current state of the cell and the states of the cells in its neighbourhood.

Typically, the rule for updating the state of cells is the same for each cell and does not change over time, and is applied to the whole grid simultaneously.

The usual vernacular is to call the ‘⚫️’ state “alive,” and the ‘⚪️’ state “dead.” With those two names, the B and S variables can now be called “born” and “survives.” B or “born” describes a set of conditions for a dead cell being born, or changing to the alive state. S or “survives” describes a set of conditions for an alive cell remaining alive.

Every iteration or application of “average” is simultaneously advancing the states of all the cells by one generation, using average’s rules. For compactness, “average” is called “B5678S45678.”
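Since the compact name is just the digits of B followed by the digits of S, it is easy to convert back into the two arrays. A minimal sketch, where `parseRule` is a hypothetical helper and the optional slash (as in "B3/S23", a spelling commonly seen elsewhere) is an extra this essay does not use:

```javascript
// Turn a rule name such as "B5678S45678" into { B, S } arrays of counts.
const parseRule = (rule) => {
  const match = /^B(\d*)\/?S(\d*)$/.exec(rule);

  if (match === null) {
    throw new Error(`Unrecognized rule: ${rule}`);
  }

  const digits = (text) => [...text].map(Number);

  return { B: digits(match[1]), S: digits(match[2]) };
};

console.log(parseRule('B5678S45678'));
// { B: [ 5, 6, 7, 8 ], S: [ 4, 5, 6, 7, 8 ] }
```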

We can explore other rule sets by refactoring our average function to accept a rule set as a parameter. Here it is refactored:

function automaton ({ B, S }) {
  const applyRuleToCell = (pixel, blackNeighbours) =>
    (pixel === '⚪️')
    ? B.includes(blackNeighbours) ? '⚫️' : '⚪️'
    : S.includes(blackNeighbours) ? '⚫️' : '⚪️';

  // Build the result with quadtree, as averageOf4x4 does, so that it is
  // canonicalized and carries the key the other quadtrees expect.
  const applyRuleTo4x4 = (sq) => quadtree(
      applyRuleToCell(sq.ul.lr, count(neighboursOfUlLr(sq))),
      applyRuleToCell(sq.ur.ll, count(neighboursOfUrLl(sq))),
      applyRuleToCell(sq.lr.ul, count(neighboursOfLrUl(sq))),
      applyRuleToCell(sq.ll.ur, count(neighboursOfLlUr(sq)))
    );

  return memoizedDoubleMultirec({
      indivisible: is4x4,
      value: applyRuleTo4x4,
      divide: divideQuadtreeIntoNine,
      subcombine: combineNineIntoNonetTree,
      subdivide: divideNonetTreeIntoQuadTrees,
      combine: regionsToQuadTree
    });
}

const average = automaton({ B: [5, 6, 7, 8], S: [4, 5, 6, 7, 8] });

Alas, “average” is an uninteresting set of rules. “Interesting” rules are those that give rise to a balance between growth and destruction and provide a rich set of interactions between patterns. Sufficiently interesting rules permit many exotic patterns and have been proven to be Turing complete.


Crab Nebula

Crab Nebula, © 2005 NASA Goddard Space Flight Center, some rights reserved


life, the universe, and everything

The most famous rule set for two-dimensional automata is “B3S23,” known as Conway’s Game of Life:

const conwaysGameOfLife = automaton({ B: [3], S: [2, 3]});
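To get a feel for B3S23 without any quadtree machinery, here is a brute-force sketch on a small array, treating everything beyond the edge as dead. It steps a "blinker", three live cells in a row, which flips between horizontal and vertical. The names `step` and `show` and the `#`/`.` shorthand are assumptions of this example.

```javascript
const ALIVE = '⚫️';
const DEAD = '⚪️';
const life = { B: [3], S: [2, 3] };

// One generation for every cell; cells beyond the edge count as dead.
const step = ({ B, S }, grid) =>
  grid.map((row, r) => row.map((cell, c) => {
    let alive = 0;
    for (const dr of [-1, 0, 1]) {
      for (const dc of [-1, 0, 1]) {
        if ((dr || dc) && (grid[r + dr] || [])[c + dc] === ALIVE) alive++;
      }
    }
    return (cell === DEAD ? B : S).includes(alive) ? ALIVE : DEAD;
  }));

const show = (grid) =>
  grid.map((row) => row.map((cell) => cell === ALIVE ? '#' : '.').join(''));

const blinker = ['.....', '.....', '.###.', '.....', '.....']
  .map((row) => [...row].map((ch) => ch === '#' ? ALIVE : DEAD));

console.log(show(step(life, blinker)).join('\n'));
// .....
// ..#..
// ..#..
// ..#..
// .....

console.log(show(step(life, step(life, blinker))).join('\n'));
// .....
// .....
// .###.
// .....
// .....
```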

John von Neumann was interested in life. One of the characteristics that people used to distinguish “life” from “non-life” in the natural world was the ability to replicate itself from a blueprint. Crystals “replicate” themselves by forces in the natural world, but not from a description of a crystal.

Higher orders of life, including plants and animals, replicate themselves from a representation in the form of genes. Some people claimed that this capability was somehow special, conferred by divinity. Von Neumann wondered whether a machine could replicate itself from a description.

Lots of machines replicate things from descriptions: computing engines trace their lineage back to Jacquard Looms that used punch cards to describe the patterns to weave. But looms do not use punch cards to build more looms.

Von Neumann did not attempt to build such a machine. As a mathematician, he would be satisfied if he could simply prove that the rules of the universe did not preclude building such a machine. His strategy was to create a very simple simulation of the universe, and within that simulation, devise a self-replicating machine.

He ended up with a two-dimensional cellular automaton that had twenty-nine states for each cell. And he constructed a clever proof that it was possible to build a self-replicating pattern of cells within that automaton that relied upon an encoded description of the device.

From this, he reasoned, the laws of our physical universe would also permit a mechanical device to self-replicate from a description encoded within the machine. And while this does not prove that life like ourselves is mechanical, it disproves the notion that life like ourselves cannot be mechanical.

The physics of our universe are much more complicated than a twenty-nine state automaton. However, it is natural to wonder, “How simple can a universe be and still permit self-replicating patterns?”

The simplest possible universe would have only two states, and John Horton Conway (along with a very talented team) set out to find a two-dimensional automaton with only two states that could support self-replicating machines.

B3S23 turned out to be such an automaton. While Conway did not build such a machine, proofs that such a machine was possible followed, and B3S23 has been studied by mathematicians, computer scientists, and hobbyists ever since.


Universal Turing Machine in B3S23, Initial Configuration

Universal Turing Machine, Initial Configuration, via Giulio Prisco


studying life

In order to investigate really complicated patterns, we need fast hardware and an algorithm for performing a stupendous amount of computation. In the 1980s, Bill Gosper discovered that memoized and canonicalized quadtrees could be used to simulate tremendously large patterns over enormous numbers of generations. He called the algorithm hashlife, and what we have built here is a toy implementation, but it includes hashlife’s essential features.

But what does hashlife afford us? Why is it important? One practical application of hashlife is to help us understand the significance of the original proof that self-replication is possible.

The written proof is difficult for a layperson (like myself) to follow. But thanks to high-speed algorithms like hashlife, people have built actual self-replicating machines, including machines that replicate an instruction tape. You don’t need to prove it is possible when you can simply observe it replicating itself.

Equally interestingly, people have proven that anything that can be computed, can be computed in B3S23 in a remarkably easy-to-grasp format: They built Turing Machines.

The screenshot above is of a Turing Machine running in B3S23. It shows the 6,366,548,773,467,669,985,195,496,000th generation. Computations like this are only possible on commodity hardware when we can use algorithms like our memoized and canonicalized quadtrees.

From this we grasp that all computation can–in principle–be performed by remarkably simple devices.

Studying B3S23 helps us understand how much of the complexity we observe in our universe can actually arise from the interactions between very simple parts with very simple behaviour.

And B3S23 isn’t the only automaton to study. Most rule sets produce boring universes. “Average” from above tends towards empty space. Others fill their universes with chaos. But a few, like B3S23, support the creation of independent patterns that interact with each other.

In 1994, Nathan Thompson devised Highlife, described by the rules B36S23. In B3S23, a self-replicating pattern was proven to exist, but not devised until 2013. In B36S23, there are a number of trivial replicator patterns that can be used to engineer other patterns.

It’s easy to play with using our code:

const highlife = automaton({ B: [3, 6], S: [2, 3]});

And there are many others to try.


what do recursive algorithms tell us?

Our algorithm is, of course, a toy. We use strings for cells and keys. We have no way to evict squares from the cache, so on patterns with a non-trivial amount of entropy, we will quickly exhaust the space available to the JavaScript engine.
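As one illustration of how the cache problem might be approached, here is a sketch of a memoizer that keeps at most `limit` results and evicts the least recently used one. It is a hypothetical variation, not part of this essay's code: `memoizedWithLimit` and its `limit` parameter are invented here, and its keymaker receives the arguments individually.

```javascript
// Memoize fn, holding at most `limit` results. A Map iterates its keys in
// insertion order, so the first key is always the least recently used.
const memoizedWithLimit = (
  fn,
  keymaker = (...args) => JSON.stringify(args),
  limit = 1000
) => {
  const lookupTable = new Map();

  return function (...args) {
    const key = keymaker(...args);

    if (lookupTable.has(key)) {
      const value = lookupTable.get(key);

      // Delete and re-insert to mark this entry as most recently used.
      lookupTable.delete(key);
      lookupTable.set(key, value);
      return value;
    }

    const value = fn.apply(this, args);

    lookupTable.set(key, value);
    if (lookupTable.size > limit) {
      lookupTable.delete(lookupTable.keys().next().value);
    }
    return value;
  };
};

let calls = 0;
const square = memoizedWithLimit((n) => { calls++; return n * n; }, String, 2);

square(1); square(2);  // two misses
square(1);             // hit; 1 is now the most recently used
square(3);             // miss; evicts 2
square(1);             // still cached
square(2);             // miss again, because 2 was evicted

console.log(calls); // 4
```

Evicting entries trades memory for recomputation. Applied to canonicalized quadtrees it would also mean that two equal squares might occasionally be distinct objects. That should reduce sharing rather than break anything as long as keys are strings, but treat that as a hunch rather than a guarantee.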

We haven’t constructed any way to advance an arbitrary number of generations, we can only advance a number of generations driven by the size of our square when doubled. These and other problems are all fixable in one way or another, and many non-trivial implementations have been written.

But the existence of an algorithm that runs in logarithmic time tells us that many things that seem impractical, can actually be implemented if we just find the right representation. When Conway and his students were simulating life by hand using a go board and coloured stones, nobody thought that one day you could buy a machine in a retail store that could run a Turing Machine or self-replicating pattern in a few minutes.

Breakthroughs like hashlife are more than just “optimizations,” even though that’s the word used in this very essay. For problems suited to their domain, some algorithms are so much faster that they fundamentally change the way we approach solving problems.

We can and should take this thinking outside of algorithms and mathematical proofs. So many of the things we take for granted today are artefacts of constraints that no longer exist and/or can be removed if we put our minds to it. Our programming languages, our frameworks, our development processes, all of these things are driven by choices that were made when computation cycles were expensive and communication was slow.

We can and should ask ourselves what software would look like if it was many, many, many orders of magnitude faster. We can and should ask ourselves how our tools and the things we create with them would be deeply and fundamentally changed.

And then we should make it happen.


end

End, © 2011 Brownpau, some rights reserved


appendix: memoized and canonicalized quad trees

function mapWith (fn) {
  return (mappable) => mappable.map(fn);
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

const memoized = (fn, keymaker = JSON.stringify) => {
    const lookupTable = Object.create(null);

    return function (...args) {
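      // Spread the arguments, so that keymakers such as simpleKey(square)
      // and compositeKey(ul, ur, lr, ll) receive them individually.
      const key = keymaker.apply(this, args);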

      return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
    }
  };

function memoizedMultirec({ indivisible, value, divide, combine, key }) {
  const myself = memoized((input) => {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }, key);

  return myself;
}

const isOneByOneArray = (something) =>
  Array.isArray(something) && something.length === 1 &&
  Array.isArray(something[0]) && something[0].length === 1;

const contentsOfOneByOneArray = (array) => array[0][0];

const firstHalf = (array) => array.slice(0, array.length / 2);
const secondHalf = (array) => array.slice(array.length / 2);

const divideSquareIntoRegions = (square) => {
    const upperHalf = firstHalf(square);
    const lowerHalf = secondHalf(square);

    const upperLeft = upperHalf.map(firstHalf);
    const upperRight = upperHalf.map(secondHalf);
    const lowerRight = lowerHalf.map(secondHalf);
    const lowerLeft= lowerHalf.map(firstHalf);

    return [upperLeft, upperRight, lowerRight, lowerLeft];
  };

const KEY = Symbol('key');

const simpleKey = (something) =>
  isString(something)
  ? something
  : something[KEY];

const compositeKey = (...regions) => regions.map(simpleKey).join('');

const quadtree = memoized(
    (ul, ur, lr, ll) => ({ ul, ur, lr, ll, [KEY]: compositeKey(ul, ur, lr, ll) }),
    compositeKey
  );

const regionsToQuadTree = ([ul, ur, lr, ll]) =>
  quadtree(ul, ur, lr, ll);

const arrayToQuadTree = multirec({
    indivisible: isOneByOneArray,
    value: contentsOfOneByOneArray,
    divide: divideSquareIntoRegions,
    combine: regionsToQuadTree
  });

const isString = (something) => typeof something === 'string';

const itself = (something) => something;

const quadTreeToRegions = (qt) =>
  [qt.ul, qt.ur, qt.lr, qt.ll];

const regionsToRotatedQuadTree = ([ur, lr, ll, ul]) =>
  quadtree(ul, ur, lr, ll);

const memoizedRotateQuadTree = memoizedMultirec({
      indivisible : isString,
      value : itself,
      divide: quadTreeToRegions,
      combine: regionsToRotatedQuadTree,
      key: simpleKey
  });


appendix: naïve quad trees and coloured quad trees

function mapWith (fn) {
  return (mappable) => mappable.map(fn);
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

const isOneByOneArray = (something) =>
  Array.isArray(something) && something.length === 1 &&
  Array.isArray(something[0]) && something[0].length === 1;

const contentsOfOneByOneArray = (array) => array[0][0];

const firstHalf = (array) => array.slice(0, array.length / 2);
const secondHalf = (array) => array.slice(array.length / 2);

const divideSquareIntoRegions = (square) => {
    const upperHalf = firstHalf(square);
    const lowerHalf = secondHalf(square);

    const upperLeft = upperHalf.map(firstHalf);
    const upperRight = upperHalf.map(secondHalf);
    const lowerRight = lowerHalf.map(secondHalf);
    const lowerLeft= lowerHalf.map(firstHalf);

    return [upperLeft, upperRight, lowerRight, lowerLeft];
  };

const regionsToQuadTree = ([ul, ur, lr, ll]) =>
  ({ ul, ur, lr, ll });

const arrayToQuadTree = multirec({
    indivisible: isOneByOneArray,
    value: contentsOfOneByOneArray,
    divide: divideSquareIntoRegions,
    combine: regionsToQuadTree
  });

const isString = (something) => typeof something === 'string';

const itself = (something) => something;

const quadTreeToRegions = (qt) =>
  [qt.ul, qt.ur, qt.lr, qt.ll];

const regionsToRotatedQuadTree = ([ur, lr, ll, ul]) =>
  ({ ul, ur, lr, ll });

const rotateQuadTree = multirec({
    indivisible : isString,
    value : itself,
    divide: quadTreeToRegions,
    combine: regionsToRotatedQuadTree
  });

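// Compare the colours of the regions rather than the regions themselves,
// so that this also works when the regions are coloured quadtrees.
const combinedColour = (...elements) =>
  elements
    .map(colour)
    .reduce((acc, element) => acc === element ? element : '❓');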

const regionsToColouredQuadTree = ([ul, ur, lr, ll]) => ({
    ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
  });

const arrayToColouredQuadTree = multirec({
  indivisible: isOneByOneArray,
  value: contentsOfOneByOneArray,
  divide: divideSquareIntoRegions,
  combine: regionsToColouredQuadTree
});

const colour = (something) => {
    if (something.colour != null) {
      return something.colour;
    } else if (something === '⚪️') {
      return '⚪️';
    } else if (something === '⚫️') {
      return '⚫️';
    } else {
      throw "Can't get the colour of this thing";
    }
  };

const isEntirelyColoured = (something) =>
  colour(something) !== '❓';

const rotateColouredQuadTree = multirec({
    indivisible : isEntirelyColoured,
    value : itself,
    divide: quadTreeToRegions,
    combine: regionsToRotatedQuadTree
  });


afterward

There is more to read about multirec in the previous essays, From Higher-Order Functions to Libraries And Frameworks and Why recursive data structures?.



notes
  1. Actually, there is another, far more delightful way to memoize recursive functions: You can read about it in Fixed-point combinators in JavaScript: Memoizing recursive functions

https://raganwald.com/2017/01/12/time-space-life-as-we-know-it
Why Recursive Data Structures?

In this essay, we are going to look at recursive algorithms, and how sometimes, we can organize an algorithm so that it resembles the data structure it manipulates, and organize a data structure so that it resembles the algorithms that manipulate it.

When algorithms and the data structures they manipulate are isomorphic,1 the code we write is easier to understand for exactly the same reason that code like template strings and regular expressions are easy to understand: The code resembles the data it consumes or produces.

We’ll finish up by observing that we also can employ optimizations that are only possible when algorithms and the data structures they manipulate are isomorphic.

Here we go.


GEB recursive

GEB Recursive, © 2006 Alexandre Duret-Lutz, some rights reserved


an exercise: rotating a square

Here is a square2 composed of elements, perhaps pixels or cells that are on or off. We could write them out like this:

⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚫️⚪️⚪️⚪️
⚪️⚪️⚫️⚫️⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️

Consider the problem of rotating our square. There is an uncommon, but particularly delightful way to do this. First, we cut the square into four smaller squares:

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️

⚪️⚪️⚫️⚫️ ⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️

Now, we rotate each of the four smaller squares 90 degrees clockwise:

⚪️⚪️⚪️⚪️ ⚫️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚫️⚪️⚪️ ⚪️⚪️⚪️⚪️

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️

Finally, we move the squares as a whole, 90 degrees clockwise:

⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️ ⚪️⚫️⚪️⚪️

⚪️⚪️⚪️⚫️ ⚫️⚪️⚪️⚪️ 
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️ ⚪️⚪️⚪️⚪️

Then reassemble:

⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️⚪️⚫️⚪️⚪️
⚪️⚪️⚪️⚫️⚫️⚪️⚪️⚪️ 
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️⚪️⚪️⚪️⚪️

How do we rotate each of the four smaller squares? Exactly the same way. For example,

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚫️

Becomes:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚫️
⚪️⚪️ ⚪️⚫️

By rotating each smaller square, it becomes:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚫️⚫️

And we rotate all four squares to finish with:

⚪️⚪️ ⚪️⚪️
⚪️⚪️ ⚪️⚪️

⚪️⚪️ ⚪️⚪️
⚫️⚫️ ⚪️⚪️

Reassembled, it becomes this:

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚫️⚫️⚪️⚪️

How would we rotate the next size down?

⚪️⚪️
⚫️⚫️

Becomes:

⚪️ ⚪️

⚫️ ⚫️

Rotating an individual dot is a NOOP, so all we have to do is rotate the four dots around, just like we do above:

⚫️ ⚪️

⚫️ ⚪️

Reassembled, it becomes this:

⚫️⚪️
⚫️⚪️

Voila! Rotating a square consists of dividing it into four “region” squares, rotating each one clockwise, then moving the regions one position clockwise. It brings whirling dervishes to mind.3

Here’s the algorithm in action:4


recursion, see recursion

In From Higher-Order Functions to Libraries And Frameworks, we had a look at multirec, a recursive combinator.

function mapWith (fn) {
  return function * (iterable) {
    for (const element of iterable) {
      yield fn(element);
    }
  };
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

With multirec, we can write functions that perform computation using divide-and-conquer algorithms. multirec handles the structure of divide-and-conquer; we just have to write four smaller functions that implement the parts specific to the problem we are solving.

In computer science, divide and conquer (D&C) is an algorithm design paradigm based on multi-branched recursion. A divide and conquer algorithm works by recursively breaking down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.—Wikipedia
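
As a small illustration of that division of labour, here is a sketch that sums a list by repeatedly halving it. The name sumByHalves and the choice of problem are assumptions made for this example, and multirec is restated with Array.prototype.map in place of mapWith so that the snippet runs on its own:

```javascript
// multirec restated with Array.prototype.map so the snippet stands alone
function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = parts.map((part) => myself(part));

      return combine(solutions);
    }
  };
}

// sumByHalves is a hypothetical example, not part of the original essay
const sumByHalves = multirec({
  // a list of zero or one numbers cannot usefully be split
  indivisible: (list) => list.length <= 1,
  // the sum of an empty list is 0, the sum of [n] is n
  value: (list) => list.length === 0 ? 0 : list[0],
  // split the list into a front half and a back half
  divide: (list) => {
    const middle = Math.floor(list.length / 2);

    return [list.slice(0, middle), list.slice(middle)];
  },
  // add the sums of the two halves
  combine: ([front, back]) => front + back
});

sumByHalves([42, 3, -1, 10]);
  //=> 54
```

Only the four small functions know anything about lists or addition; the recursion itself lives entirely in multirec.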

We’ll implement rotating a square using multirec. Let’s begin with a naïve representation for squares, a two-dimensional array. For example, we would represent the square:

⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚪️
⚪️⚪️⚪️⚫️
⚪️⚪️⚪️⚫️

With this array:

[['⚪️', '⚪️', '⚪️', '⚪️'],
 ['⚪️', '⚪️', '⚪️', '⚪️'],
 ['⚪️', '⚪️', '⚪️', '⚫️'],
 ['⚪️', '⚪️', '⚪️', '⚫️']]

To use multirec, we need four pieces:

  1. An indivisible predicate function. It should report whether an array is too small to be divided up. It’s simplicity itself: (square) => square.length === 1.
  2. A value function that determines what to do with a value that is indivisible. For rotation, we simply return what we are given: (something) => something
  3. A divide function that breaks a divisible problem into smaller pieces. Our function will break a square into four regions. We’ll see how that works below.
  4. A combine function that puts the result of rotating the smaller pieces back together. Our function will take four region squares and put them back together into a big square.

As noted, indivisible and value are trivial. We’ll call our functions hasLengthOne and itself:5

const hasLengthOne = (square) => square.length === 1;
const itself = (something) => something;

divide involves no more than breaking arrays into halves, and then those halves again. We’ll write a divideSquareIntoRegions function for this:

const firstHalf = (array) => array.slice(0, array.length / 2);
const secondHalf = (array) => array.slice(array.length / 2);

const divideSquareIntoRegions = (square) => {
  const upperHalf = firstHalf(square);
  const lowerHalf = secondHalf(square);

  const upperLeft = upperHalf.map(firstHalf);
  const upperRight = upperHalf.map(secondHalf);
  const lowerRight = lowerHalf.map(secondHalf);
  const lowerLeft = lowerHalf.map(firstHalf);

  return [upperLeft, upperRight, lowerRight, lowerLeft];
};

Our combine function, rotateAndCombineArrays, makes use of a little help from some functions we saw in an essay about generators:

function split (iterable) {
  const iterator = iterable[Symbol.iterator]();
  const { done, value: first } = iterator.next();

  if (done) {
    return { rest: [] };
  } else {
    return { first, rest: iterator };
  }
};

function * join (first, rest) {
  yield first;
  yield * rest;
};

function * zipWith (fn, ...iterables) {
  const asSplits = iterables.map(split);

  if (asSplits.every((asSplit) => asSplit.hasOwnProperty('first'))) {
    const firsts = asSplits.map((asSplit) => asSplit.first);
    const rests = asSplits.map((asSplit) => asSplit.rest);

    yield * join(fn(...firsts), zipWith(fn, ...rests));
  }
}

const concat = (...arrays) => arrays.reduce((acc, a) => acc.concat(a));

const rotateAndCombineArrays = ([upperLeft, upperRight, lowerRight, lowerLeft]) => {
  // rotate
  [upperLeft, upperRight, lowerRight, lowerLeft] =
    [lowerLeft, upperLeft, upperRight, lowerRight];

  // recombine
  const upperHalf = [...zipWith(concat, upperLeft, upperRight)];
  const lowerHalf = [...zipWith(concat, lowerLeft, lowerRight)];

  return concat(upperHalf, lowerHalf);
};
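
To see what the recombining step does on its own, here is a sketch for plain arrays. sideBySide is a hypothetical, array-only stand-in for [...zipWith(concat, left, right)], and stack stands in for the final concat; neither name appears in the original code:

```javascript
// sideBySide and stack are hypothetical helpers for illustration only.
// sideBySide glues each row of `left` to the matching row of `right`.
const sideBySide = (left, right) =>
  left.map((row, index) => row.concat(right[index]));

// stack places the rows of `upper` above the rows of `lower`.
const stack = (upper, lower) => upper.concat(lower);

const upperLeft  = [['a', 'b'], ['e', 'f']];
const upperRight = [['c', 'd'], ['g', 'h']];
const lowerLeft  = [['i', 'j'], ['m', 'n']];
const lowerRight = [['k', 'l'], ['o', 'p']];

const whole = stack(
  sideBySide(upperLeft, upperRight),
  sideBySide(lowerLeft, lowerRight)
);
  //=>
  //  [['a', 'b', 'c', 'd'],
  //   ['e', 'f', 'g', 'h'],
  //   ['i', 'j', 'k', 'l'],
  //   ['m', 'n', 'o', 'p']]
```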

Armed with hasLengthOne, itself, divideSquareIntoRegions, and rotateAndCombineArrays, we can use multirec to write rotate:

const rotate = multirec({
  indivisible : hasLengthOne,
  value : itself,
  divide: divideSquareIntoRegions,
  combine: rotateAndCombineArrays
});

rotate(
   [['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚪️', '⚪️', '⚫️'],
    ['⚪️', '⚪️', '⚪️', '⚫️']]
  )
  //=>
    ([
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚫️', '⚫️', '⚪️', '⚪️']
    ])

Voila!
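
One way to gain confidence in a recursive rotation like this is to compare its output against a direct, index-based rotation. The following sketch is an assumption for checking purposes only (rotateByIndex is a hypothetical name): each element of the output at a given row and column is read from row n - 1 - column, column row of the input.

```javascript
// rotateByIndex is a hypothetical reference implementation.
// It rotates a square two-dimensional array 90 degrees clockwise:
// output[row][column] is taken from input[n - 1 - column][row].
const rotateByIndex = (square) => {
  const n = square.length;

  return square.map((_, row) =>
    square.map((_, column) => square[n - 1 - column][row])
  );
};

rotateByIndex(
  [['⚪️', '⚪️', '⚪️', '⚪️'],
   ['⚪️', '⚪️', '⚪️', '⚪️'],
   ['⚪️', '⚪️', '⚪️', '⚫️'],
   ['⚪️', '⚪️', '⚪️', '⚫️']]
);
  //=>
  //  [['⚪️', '⚪️', '⚪️', '⚪️'],
  //   ['⚪️', '⚪️', '⚪️', '⚪️'],
  //   ['⚪️', '⚪️', '⚪️', '⚪️'],
  //   ['⚫️', '⚫️', '⚪️', '⚪️']]
```

The result agrees with the output of the multirec version shown above.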


accidental complexity

Rotating a square in this recursive manner is intellectually stimulating, but our code is encumbered with some accidental complexity. Here’s a flashing strobe-and-neon hint of what it is:

const firstHalf = (array) => array.slice(0, array.length / 2);
const secondHalf = (array) => array.slice(array.length / 2);

const divideSquareIntoRegions = (square) => {
  const upperHalf = firstHalf(square);
  const lowerHalf = secondHalf(square);

  const upperLeft = upperHalf.map(firstHalf);
  const upperRight = upperHalf.map(secondHalf);
  const lowerRight = lowerHalf.map(secondHalf);
  const lowerLeft = lowerHalf.map(firstHalf);

  return [upperLeft, upperRight, lowerRight, lowerLeft];
};

divideSquareIntoRegions is all about extracting region squares from a bigger square, and while we’ve done our best to make it readable, it is rather busy. Likewise, here’s the same thing in rotateAndCombineArrays:

const rotateAndCombineArrays = ([upperLeft, upperRight, lowerRight, lowerLeft]) => {
  // rotate
  [upperLeft, upperRight, lowerRight, lowerLeft] =
    [lowerLeft, upperLeft, upperRight, lowerRight];

  // recombine
  const upperHalf = [...zipWith(concat, upperLeft, upperRight)];
  const lowerHalf = [...zipWith(concat, lowerLeft, lowerRight)];

  return concat(upperHalf, lowerHalf);
};

rotateAndCombineArrays is a very busy function. The core thing we want to talk about is actually the rotation: Having divided things up into four regions, we want to rotate the regions. The zipping and concatenating is all about the implementation of regions as arrays.

We can argue that this is necessary complexity, because squares are arrays, and that’s just what we programmers do for a living, write code that manipulates basic data structures to do our bidding.

But what if our implementation wasn’t an array of arrays? Maybe divide and combine could be simpler? Maybe that complexity would turn out to be unnecessary after all?



Recursive Chessboard, © 2007 fdecomite, some rights reserved


isomorphic data structures

When we have what ought to be an elegant algorithm, but the interface between the algorithm and the data structure ends up being as complicated as the rest of the algorithm put together, we can always ask ourselves, “What data structure would make this algorithm stupidly simple?”

The answer can often be found by imagining a data structure that looks like the algorithm’s basic form. If we follow that heuristic, our data structure would be recursive, rather than ‘flat.’ Since we do all kinds of work sorting out which squares form the four regions of a bigger square, our data structure would describe a square as being composed of four region squares.

Such a data structure already exists: it’s called a quadtree.6 Squares are represented as four regions, each of which is a smaller square or a cell. A simple implementation is a “Plain Old JavaScript Object” (or “POJO”) with properties for each of the regions. If the property contains a string, it’s a cell. If it contains another POJO, it’s a quadtree.

A square that looks like this:

⚪️⚫️⚪️⚪️
⚪️⚪️⚫️⚪️
⚫️⚫️⚫️⚪️
⚪️⚪️⚪️⚪️

Is composed of four regions, the ul (“upper left”), ur (“upper right”), lr (“lower right”), and ll (“lower left”), something like this:

ul | ur
---+---
ll | lr

Thus, for example, the ul is:

⚪️⚫️
⚪️⚪️

And the ur is:

⚪️⚪️
⚫️⚪️

And so forth. Each of those regions is itself composed of four regions. Thus, the ul of the ul is ⚪️, and the ur of the ul is ⚫️.

The quadtree could be expressed in JavaScript like this:

const quadTree = {
  ul: { ul: '⚪️', ur: '⚫️', lr: '⚪️', ll: '⚪️' },
  ur: { ul: '⚪️', ur: '⚪️', lr: '⚪️', ll: '⚫️' },
  lr: { ul: '⚫️', ur: '⚪️', lr: '⚪️', ll: '⚪️' },
  ll: { ul: '⚫️', ur: '⚫️', lr: '⚪️', ll: '⚪️' }
};
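
To make the nesting concrete, here is a sketch of a lookup that walks such a POJO down to a single cell. cellAt and its size, row, and column parameters are assumptions for illustration; the sketch assumes the side length is a power of two and that every region is subdivided all the way down to strings:

```javascript
const quadTree = {
  ul: { ul: '⚪️', ur: '⚫️', lr: '⚪️', ll: '⚪️' },
  ur: { ul: '⚪️', ur: '⚪️', lr: '⚪️', ll: '⚫️' },
  lr: { ul: '⚫️', ur: '⚪️', lr: '⚪️', ll: '⚪️' },
  ll: { ul: '⚫️', ur: '⚫️', lr: '⚪️', ll: '⚪️' }
};

// cellAt is a hypothetical helper: it finds the cell at (row, column)
// in a quadtree whose side is `size` cells long (a power of two).
const cellAt = (qt, size, row, column) => {
  if (typeof qt === 'string') return qt;

  const half = size / 2;
  const vertical = row < half ? 'u' : 'l';      // upper or lower
  const horizontal = column < half ? 'l' : 'r'; // left or right

  // descend into the chosen region, with coordinates relative to it
  return cellAt(qt[vertical + horizontal], half, row % half, column % half);
};

cellAt(quadTree, 4, 0, 1);
  //=> '⚫️'
```

Each step of the lookup discards three quarters of the square, which is the same shape of decision the rotation algorithm makes when it divides a square into regions.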

Now to our algorithm. Rotating a quadtree is simpler than rotating an array of arrays. First, our test for indivisibility is now whether something is a string or not:

const isString = (something) => typeof something === 'string';

The value of an indivisible cell remains the same, itself.

Our divide function is simple: quadtrees are already divided in the manner we require, we just have to turn them into an array of regions:

const quadTreeToRegions = (qt) =>
  [qt.ul, qt.ur, qt.lr, qt.ll];

And finally, our combine function reassembles the rotated regions into a POJO, rotating them in the process:

const regionsToRotatedQuadTree = ([ur, lr, ll, ul]) =>
  ({ ul, ur, lr, ll });

And here’s our function for rotating a quadtree:

const rotateQuadTree = multirec({
  indivisible : isString,
  value : itself,
  divide: quadTreeToRegions,
  combine: regionsToRotatedQuadTree
});

Let’s put it to the test:

rotateQuadTree(quadTree)
  //=>
    ({
       ul: { ll: "⚪️", lr: "⚫️", ul: "⚪️", ur: "⚫️" },
       ur: { ll: "⚪️", lr: "⚫️", ul: "⚪️", ur: "⚪️" },
       lr: { ll: "⚪️", lr: "⚪️", ul: "⚫️", ur: "⚪️" },
       ll: { ll: "⚪️", lr: "⚪️", ul: "⚪️", ur: "⚫️" }
     })

If we reassemble the square by hand, it’s what we expect:

⚪️⚫️⚪️⚪️
⚪️⚫️⚪️⚫️
⚪️⚫️⚫️⚪️
⚪️⚪️⚪️⚪️

Now we can be serious about the word “Isomorphic.” Isomorphic means, fundamentally, “having the same shape.” Obviously, a quadtree doesn’t look anything like the code in rotateQuadTree or multirec. So how can a quadtree “look like” an algorithm? The answer is that the quadtree’s data structure looks very much like the way rotateQuadTree behaves at run time.

More precisely, the elements of the quadtree and the relationships between them can be put into a one-to-one correspondence with the call graph of rotateQuadTree when acting on that quadtree.
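
One informal way to check that correspondence is to count. The sketch below is an illustration rather than a proof, and countNodes and the invocations counter are assumptions. Because multirec calls indivisible exactly once per invocation of the recursive function, wrapping it in a counter tallies the nodes of the call graph, which can then be compared with the number of nodes in the quadtree. multirec is restated with Array.prototype.map so the snippet runs on its own:

```javascript
function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      return combine(divide(input).map((part) => myself(part)));
    }
  };
}

const quadTree = {
  ul: { ul: '⚪️', ur: '⚫️', lr: '⚪️', ll: '⚪️' },
  ur: { ul: '⚪️', ur: '⚪️', lr: '⚪️', ll: '⚫️' },
  lr: { ul: '⚫️', ur: '⚪️', lr: '⚪️', ll: '⚪️' },
  ll: { ul: '⚫️', ur: '⚫️', lr: '⚪️', ll: '⚪️' }
};

const isString = (something) => typeof something === 'string';

// countNodes is hypothetical: one node per quadtree plus one per cell
const countNodes = (qt) =>
  isString(qt)
    ? 1
    : 1 + [qt.ul, qt.ur, qt.lr, qt.ll]
            .map((region) => countNodes(region))
            .reduce((a, b) => a + b, 0);

let invocations = 0;

const countingRotateQuadTree = multirec({
  indivisible: (something) => {
    invocations = invocations + 1; // runs once per recursive invocation
    return isString(something);
  },
  value: (something) => something,
  divide: (qt) => [qt.ul, qt.ur, qt.lr, qt.ll],
  combine: ([ur, lr, ll, ul]) => ({ ul, ur, lr, ll })
});

countingRotateQuadTree(quadTree);

const nodes = countNodes(quadTree);
  //=> 21, and invocations is also 21
```

One root, four regions, and sixteen cells make twenty-one nodes, and the rotation makes twenty-one calls, one per node.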


separation of concerns

But back to our code. All we’ve done so far is moved the “faffing about” out of our code and we’re doing it by hand. That’s bad: we don’t want to retrain our eyes to read quadtrees instead of flat arrays, and we don’t want to sit at a computer all day manually translating quadtrees to flat arrays and back.

If only we could write some code to do it for us… Some recursive code…

Here’s a function that recursively turns a two-dimensional array into a quadtree:

const isOneByOneArray = (something) =>
  Array.isArray(something) && something.length === 1 &&
  Array.isArray(something[0]) && something[0].length === 1;

const contentsOfOneByOneArray = (array) => array[0][0];

const regionsToQuadTree = ([ul, ur, lr, ll]) =>
  ({ ul, ur, lr, ll });

const arrayToQuadTree = multirec({
  indivisible: isOneByOneArray,
  value: contentsOfOneByOneArray,
  divide: divideSquareIntoRegions,
  combine: regionsToQuadTree
});

arrayToQuadTree([
  ['⚪️', '⚪️', '⚪️', '⚪️'],
  ['⚪️', '⚫️', '⚪️', '⚪️'],
  ['⚫️', '⚪️', '⚪️', '⚪️'],
  ['⚫️', '⚫️', '⚫️', '⚪️']
])
  //=>
    ({
      ul:  { ul: "⚪️", ur: "⚪️", lr: "⚫️", ll: "⚪️" },
      ur:  { ul: "⚪️", ur: "⚪️", lr: "⚪️", ll: "⚪️" },
      lr:  { ul: "⚪️", ur: "⚪️", lr: "⚪️", ll: "⚫️" },
      ll:  { ul: "⚫️", ur: "⚪️", lr: "⚫️", ll: "⚫️" }
    })

Naturally, we can also write a function to convert quadtrees back into two-dimensional arrays again:

const isSmallestActualSquare = (square) => isString(square.ul);

const asTwoDimensionalArray = ({ ul, ur, lr, ll }) =>
  [[ul, ur], [ll, lr]];

const regions = ({ ul, ur, lr, ll }) =>
  [ul, ur, lr, ll];

const combineFlatArrays = ([upperLeft, upperRight, lowerRight, lowerLeft]) => {
  const upperHalf = [...zipWith(concat, upperLeft, upperRight)];
  const lowerHalf = [...zipWith(concat, lowerLeft, lowerRight)];

  return concat(upperHalf, lowerHalf);
}

const quadTreeToArray = multirec({
  indivisible: isSmallestActualSquare,
  value: asTwoDimensionalArray,
  divide: regions,
  combine: combineFlatArrays
});

quadTreeToArray(
  arrayToQuadTree([
    ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚪️'],
    ['⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚫️', '⚫️', '⚫️', '⚪️']
  ])
)
  //=>
    ([
      ["⚪️", "⚪️", "⚪️", "⚪️"],
      ["⚪️", "⚫️", "⚪️", "⚪️"],
      ["⚫️", "⚪️", "⚪️", "⚪️"],
      ["⚫️", "⚫️", "⚫️", "⚪️"]
    ])

And thus, we can take a two-dimensional array, turn it into a quadtree, rotate the quadtree, and convert it back to a two-dimensional array again:

quadTreeToArray(
  rotateQuadTree(
    arrayToQuadTree([
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚪️', '⚫️', '⚪️', '⚪️'],
      ['⚫️', '⚪️', '⚪️', '⚪️'],
      ['⚫️', '⚫️', '⚫️', '⚪️']
    ])
  )
)
  //=>
    ([
      ["⚫️", "⚫️", "⚪️", "⚪️"],
      ["⚫️", "⚪️", "⚫️", "⚪️"],
      ["⚫️", "⚪️", "⚪️", "⚪️"],
      ["⚪️", "⚪️", "⚪️", "⚪️"]
    ])

but why?

Now, we argued above that we’ve neatly separated the concerns by making three separate functions, instead of interleaving dividing two-dimensional squares into regions, rotating regions, and then reassembling two-dimensional squares.

But the converse side of this is that what we’re doing is now a lot less efficient: We’re recursing through our data structures three separate times, instead of once. And frankly, multirec was designed such that the divide function breaks things up, and the combine function puts them back together, so these concerns are already mostly separate once we use multirec instead of a bespoke7 recursive function.

One reason to break the logic up into three separate functions would be if we want to do lots of different kinds of things with quadtrees. Besides rotating quadtrees, what else might we do?

Well, we might want to superimpose one image on top of another. This could be part of an image editing application, where we have layers of images and want to superimpose all the layers to derive the finished image for the screen. Or we might be implementing Conway’s Game of Life, and might want to ‘paste’ a pattern like a glider onto a larger universe.

Let’s go with a very simple implementation: We’re only editing black-and-white images, and each ‘pixel’ is either a ⚪️ or ⚫️. If we use two-dimensional arrays to represent our images, we need to iterate over every ‘pixel’ to perform the superimposition:

const superimposeCell = (left, right) =>
  (left === '⚫️' || right === '⚫️') ? '⚫️' : '⚪️';

const superimposeRow = (left, right) =>
  [...zipWith(superimposeCell, left, right)];

const superimposeArray = (left, right) =>
  [...zipWith(superimposeRow, left, right)];

const canvas =
  [ ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚪️', '⚪️', '⚫️'],
    ['⚪️', '⚪️', '⚪️', '⚫️']];

const glider =
  [ ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚪️'],
    ['⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚫️', '⚫️', '⚫️', '⚪️']];

superimposeArray(canvas, glider)
  //=>
    ([
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚪️', '⚫️', '⚪️', '⚪️'],
      ['⚫️', '⚪️', '⚪️', '⚫️'],
      ['⚫️', '⚫️', '⚫️', '⚫️']
    ])

Seems simple enough. How about superimposing a quadtree on a quadtree?



Two trees, © 2013 Jon Bunting, some rights reserved


recursive operations on pairs of quadtrees

We can use multirec to superimpose one quadtree on top of another: Our function will take a pair of quadtrees, using destructuring to extract one called left and the other called right:

const superimposeQuadTrees = multirec({
  indivisible: ({ left, right }) => isString(left),
  value: ({ left, right }) => right === '⚫️'
                              ? right
                              : left,
  divide: ({ left, right }) => [
      { left: left.ul, right: right.ul },
      { left: left.ur, right: right.ur },
      { left: left.lr, right: right.lr },
      { left: left.ll, right: right.ll }
    ],
  combine: ([ul, ur, lr, ll]) => ({  ul, ur, lr, ll })
});

quadTreeToArray(
  superimposeQuadTrees({
    left: arrayToQuadTree(canvas),
    right: arrayToQuadTree(glider)
  })
)
  //=>
    ([
      ['⚪️', '⚪️', '⚪️', '⚪️'],
      ['⚪️', '⚫️', '⚪️', '⚪️'],
      ['⚫️', '⚪️', '⚪️', '⚫️'],
      ['⚫️', '⚫️', '⚫️', '⚫️']
    ])

Again, this feels like faffing about just so we can be recursive. But we are in position to do something interesting!


optimizing recursive algorithms with isomorphic data structures

Many images have large regions that are entirely white or black. When superimposing one region on another, if either region is entirely white, we know the result must be the same as the other region. When superimposing one region on another, if either region is entirely black, the result must be entirely black.

We can use the quadtree’s hierarchical representation to exploit this. We’ll store some extra information in each quadtree, its colour: If it is entirely white, its colour will be white. If it is entirely black, its colour will be black. And if it contains a mix of white and black cells, its colour will be a question mark.

const isOneByOneArray = (something) =>
  Array.isArray(something) && something.length === 1 &&
  Array.isArray(something[0]) && something[0].length === 1;

const contentsOfOneByOneArray = (array) => array[0][0];

const divideSquareIntoRegions = (square) => {
  const upperHalf = firstHalf(square);
  const lowerHalf = secondHalf(square);

  const upperLeft = upperHalf.map(firstHalf);
  const upperRight = upperHalf.map(secondHalf);
  const lowerRight = lowerHalf.map(secondHalf);
  const lowerLeft = lowerHalf.map(firstHalf);

  return [upperLeft, upperRight, lowerRight, lowerLeft];
};

const colour = (something) => {
  if (something.colour != null) {
    return something.colour;
  } else if (something === '⚪️') {
    return '⚪️';
  } else if (something === '⚫️') {
    return '⚫️';
  } else {
    throw "Can't get the colour of this thing";
  }
};

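// compare the colours of the regions, not the regions themselves:
// two distinct all-white quadtrees are different objects, but share a colour
const combinedColour = (...elements) =>
  elements
    .map(colour)
    .reduce((acc, element) => acc === element ? element : '❓');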

const regionsToQuadTree = ([ul, ur, lr, ll]) => ({
    ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
  });

const arrayToQuadTree = multirec({
  indivisible: isOneByOneArray,
  value: contentsOfOneByOneArray,
  divide: divideSquareIntoRegions,
  combine: regionsToQuadTree
});

arrayToQuadTree(
  [ ['⚪️', '⚪️'],
    ['⚪️', '⚪️'] ]
).colour
  //=> "⚪️"

arrayToQuadTree(
  [ ['⚪️', '⚪️'],
    ['⚪️', '⚫️'] ]
).colour
  //=> "❓"

arrayToQuadTree(
  [ ['⚫️', '⚫️'],
    ['⚫️', '⚫️'] ]
).colour
  //=> "⚫️"

arrayToQuadTree(
  [ ['⚪️', '⚪️', '⚪️', '⚪️'],
    ['⚪️', '⚫️', '⚪️', '⚪️'],
    ['⚫️', '⚪️', '⚪️', '⚪️'],
    ['⚫️', '⚫️', '⚫️', '⚪️']]
).colour
  //=> "❓"

Now, we can take advantage of every region’s computed colour when we superimpose “coloured” quadtrees:

const eitherAreEntirelyColoured = ({ left, right }) =>
  colour(left) !== '❓' || colour(right) !== '❓' ;

const superimposeColoured = ({ left, right }) => {
    if (colour(left) === '⚪️' || colour(right) === '⚫️') {
      return right;
    } else if (colour(left) === '⚫️' || colour(right) === '⚪️') {
      return left;
    } else {
      throw "Can't superimpose these things";
    }
  };

const divideTwoQuadTrees = ({ left, right }) => [
    { left: left.ul, right: right.ul },
    { left: left.ur, right: right.ur },
    { left: left.lr, right: right.lr },
    { left: left.ll, right: right.ll }
  ];

const combineColouredRegions = ([ul, ur, lr, ll]) => ({
    ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
  });

const superimposeColouredQuadTrees = multirec({
  indivisible: eitherAreEntirelyColoured,
  value: superimposeColoured,
  divide: divideTwoQuadTrees,
  combine: combineColouredRegions
});

We get the same output, but now instead of comparing every cell whenever we superimpose quadtrees, we compare entire regions at a time. If either is “entirely coloured,” we can return the other one without recursively drilling down to the level of individual pixels.

There is no savings if both quadtrees are composed of a fairly evenly spread mix of black and white pixels (e.g. a checkerboard pattern), but in cases where there are large expanses of white or black, the difference is substantial.

In the case of comparing the 4x4 canvas and glider images above, the superimposeArray function requires sixteen comparisons. The superimposeQuadTrees function requires twenty comparisons. But the superimposeColouredQuadTrees function requires just seven comparisons.

If we were writing an image manipulation application, we’d provide much snappier behaviour using coloured quadtrees to represent images on screen.

The interesting thing about this optimization is that it is tuned to the characteristics of both the data structure and the algorithm: It is not something that is easy to perform in the algorithm without the data structure, or in the data structure without the algorithm.

And it’s not the only optimization. Remember our ‘whirling regions’ implementation of rotateQuadTree? Here’s rotateColouredQuadTree:

const isEntirelyColoured = (something) =>
  colour(something) !== '❓' ;

const rotateColouredQuadTree = multirec({
  indivisible : isEntirelyColoured,
  value : itself,
  divide: quadTreeToRegions,
  combine: regionsToRotatedQuadTree
});

Any region that is entirely white or entirely black is its own rotation, so no further dividing and conquering need be done. For images that have large areas of blank space, the “whirling regions” algorithm is not just aesthetically delightful, it’s faster than a brute-force transposition of array elements.
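
One detail worth noting: regionsToRotatedQuadTree as written above builds plain { ul, ur, lr, ll } objects, so a rotated tree does not carry a colour property and could not be handed back to colour. If rotated trees need to stay coloured, a combine along these lines might work. This is a sketch under that assumption; regionsToRotatedColouredQuadTree and the sample tree are hypothetical, and multirec is restated with Array.prototype.map so the snippet runs on its own:

```javascript
function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      return combine(divide(input).map((part) => myself(part)));
    }
  };
}

const colour = (something) => {
  if (something.colour != null) {
    return something.colour;
  } else if (something === '⚪️' || something === '⚫️') {
    return something;
  } else {
    throw new Error("Can't get the colour of this thing");
  }
};

// a region's colour is its children's colour if they all agree, else '❓'
const combinedColour = (...elements) =>
  elements.map(colour).reduce((acc, c) => acc === c ? c : '❓');

// hypothetical combine: rotate the regions and recompute the colour
const regionsToRotatedColouredQuadTree = ([ur, lr, ll, ul]) => ({
  ul, ur, lr, ll, colour: combinedColour(ul, ur, lr, ll)
});

const rotateColouredQuadTree = multirec({
  indivisible: (something) => colour(something) !== '❓',
  value: (something) => something,
  divide: (qt) => [qt.ul, qt.ur, qt.lr, qt.ll],
  combine: regionsToRotatedColouredQuadTree
});

// a hand-built coloured quadtree for the 2x2 square
//   ⚪️⚫️
//   ⚪️⚪️
const sample = { ul: '⚪️', ur: '⚫️', lr: '⚪️', ll: '⚪️', colour: '❓' };

const rotated = rotateColouredQuadTree(sample);
  //=> { ul: '⚪️', ur: '⚪️', lr: '⚫️', ll: '⚪️', colour: '❓' }
```

Recomputing the colour costs one extra comparison per combined region, and keeps the rotated tree usable as input to superimposeColouredQuadTrees.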

Optimizations like this can only be implemented when the algorithm and the data structure are isomorphic to each other.



Detail from “Game of Life 6,” © Windell Oskay, some rights reserved


why!

So back to, “Why convert data into a structure that is isomorphic to our algorithm?”

The first reason to do so is that the code is clearer and easier to read if we convert, then perform operations on the data structure, and then convert it back (if need be).

The second reason to do so is that if we want to do lots of different operations on the data structure, it is much more efficient to keep it in the form that is isomorphic to the operations we are going to perform on it.

The example we saw was that if we were building a hypothetical image processing application, we could convert an image into quadtrees, then rotate or superimpose images at will. We would only need to convert our quadtrees when we need to save or display the image in a rasterized (i.e. array-like) format.

And third, we saw that once we embraced a data structure that was isomorphic to the form of the algorithm, we could employ elegant optimizations that are impossible (or ridiculously inconvenient) when the algorithm and data structure do not match.

Separating conversion from operation allows us to benefit from all three reasons for ensuring that our algorithms and data structures are isomorphic to each other.


afterward

There is more to read about multirec in the previous essay, From Higher-Order Functions to Libraries And Frameworks, and in the follow-up, Time, Space, and Life As We Know It.


p.s. Thank you for reading this far. Here is your reward, An Algorithm for Compressing Space and Time. And hey! If you like this kind of thing, JavaScript Allongé is exactly the kind of thing you’ll like.


notes
  1. In biology, two things are isomorphic if they resemble each other. In mathematics, two things are isomorphic if there is a structure-preserving map between them in both directions. In computer science, two things are isomorphic if the person explaining a concept wishes to seem educated. 

  2. To maintain a laser-focus on the principles being discussed, we will make a huge number of simplifying assumptions in this essay, starting with the constraint that all squares will have sides that are a “power of two” in length, e.g. 2x2, 4x4, 8x8, 16x16, and so forth. Every single function discussed can be adjusted to deal with other cases, but we will omit those adjustments as our goal is understanding principles, not writing production code.

  3. There are other interesting and elegant ways to rotate a square 90 degrees clockwise, the simplest being zip(square). They each have their own set of trade-offs to consider. For example, the ‘whirling regions’ approach can also be generalized to handle rotating squares in 180- and 270-degree increments, not to mention reflections on either axis. But for the purpose of this essay, ‘whirling regions’ is the one we will consider most interesting.

  4. 🎩 teddyh on Hacker News. 

  5. itself is known formally as the I Combinator, and also fondly nicknamed “The Idiot Bird” using Raymond Smullyan’s ornithological taxonomy.

  6. More specifically, the data structure we are going to use is called a region quadtree. But we’ll just call it a quadtree. 

  7. In American English, bespoke typically refers to a garment that is hand-crafted for its wearer. “Bespoke” has, in the last decade, been associated with various hipster endeavours, to the point where its use has become ironic. The turning point was likely when a popped-collar founder of a pre-revenue startup boasted of having two iPhones running a bespoke time management application.

    Today, “bespoke” often refers to an item where the owner obtains more value from the status conferred by having a bespoke item, than from the item’s fitness for their personalized purpose. Calling a function “bespoke” implies that it was written to display the author’s trendy use of functional programming, rather than to efficiently rotate a square. 

https://raganwald.com/2016/12/27/recursive-data-structures
From Higher-Order Functions to Libraries And Frameworks

In this essay, we will take a look at some higher-order functions, with an eye to seeing how they can be used to make our programs more expressive, while balancing that against the need to limit the perceived complexity of our programs.


introduction: expressiveness

One of the most basic ideas in programming is that functions can invoke other functions.1

When a function invokes other functions, and when one function can be invoked by more than one other function, we have a very good thing. When we have a many-to-many relationship between functions, we have more expressive power than when we have a one-to-many relationship. We have the ability to give each function a single responsibility, and name that responsibility. We also have the ability to ensure that one and only one function has that responsibility.

A many-to-many relationship between functions is what enables us to create a one-to-one relationship between functions and responsibilities.

Programmers often speak of languages as being expressive. Although there is no single universal definition for this word, most programmers agree that an important aspect of “expressiveness” is that the language makes it easy to write programs that are not unnecessarily verbose. If functions have many responsibilities, they become large and unwieldy. If the same responsibility needs to be implemented more than once, there is de facto redundancy.

Programs where functions have single responsibilities, and where responsibilities are implemented by single functions, avoid unnecessary verbosity.

Thus, facilitating the many-to-many relationship between functions makes it possible to write programs that are more expressive than those that do not have a many-to-many relationship between functions.


the dark side: perceived complexity

However, “With great power comes great responsibility.”2 The downside of a many-to-many relationship between functions is that the ‘space of things a program might do’ grows very rapidly as the size increases. “Expressiveness” is often in tension with “Perceived Complexity.”

One way to think about this by analogy is to imagine we are drawing a graph. Each function is a vertex, and the calling relationship between them is an edge. Assuming that there is no “dead code,” every structured program forms a connected graph.

connected graph

Given a known number of nodes, the number of different ways to draw a connected graph between them is the A001187 integer sequence. Its first eleven terms are: 1, 1, 1, 4, 38, 728, 26704, 1866256, 251548592, 66296291072, 34496488594816. That means there are more than thirty-four trillion ways to organize a program with just ten functions.
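
The terms can be reproduced with a short program. The sketch below is an assumption about how one might compute them, not something from the original essay: it uses the standard recurrence for labelled connected graphs, counting all graphs on n vertices and subtracting those in which one fixed vertex sits in a smaller connected component. The names binomial, allGraphs, and connectedGraphs are hypothetical.

```javascript
// binomial(n, k) as a BigInt, using the multiplicative formula; every
// intermediate value is itself a binomial coefficient, so the integer
// division is exact.
const binomial = (n, k) => {
  let result = 1n;

  for (let i = 0; i < k; i++) {
    result = result * BigInt(n - i) / BigInt(i + 1);
  }
  return result;
};

// the number of labelled graphs on n vertices: each of the
// n(n-1)/2 possible edges is either present or absent
const allGraphs = (n) => 2n ** BigInt(n * (n - 1) / 2);

// connected graphs = all graphs, minus those where a fixed vertex lies in
// a connected component of only k < n vertices: choose its k - 1
// companions, connect that component, and let the rest be any graph
const connectedGraphs = (n) => {
  if (n === 0) return 1n; // first term of the sequence, by convention

  let disconnected = 0n;

  for (let k = 1; k < n; k++) {
    disconnected +=
      binomial(n - 1, k - 1) * connectedGraphs(k) * allGraphs(n - k);
  }
  return allGraphs(n) - disconnected;
};

connectedGraphs(10).toString();
  //=> '34496488594816'
```

BigInt is used so that the same sketch stays exact for larger n, where the counts quickly exceed what a double-precision number can represent.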

This explosion of flexibility is so great that programmers have to temper it. The benefits of creating one-to-one relationships between functions and responsibilities can become overwhelmed by the difficulty of understanding programs with unconstrained potential complexity.

JavaScript has tools to help. Its blocks create namespaces, and so do its formal modules. It may soon have private object properties.

Namespaces constrain large graphs into many smaller graphs, each of which has a constrained set of ways they can be connected to other graphs. It’s still a large graph, but the number of possible ways to draw it is smaller, and by analogy, it is easier to sort out what it does, and how.
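
As a small illustration of a block acting as a namespace, consider the following sketch. The functions total, count, and mean are hypothetical: the two helpers can only be called from inside the block, so the only edge that can be drawn into this part of the graph from outside is a call to mean.

```javascript
// mean is declared outside the block so that the rest of the program
// can see it; it is assigned inside the block
let mean;

{
  // total and count are visible only inside this block
  const total = (numbers) => numbers.reduce((acc, n) => acc + n, 0);
  const count = (numbers) => numbers.length;

  mean = (numbers) => total(numbers) / count(numbers);
}

mean([1, 2, 3, 6]);
  //=> 3

typeof total;
  //=> 'undefined'
```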

What we have described is a heuristic for designing good software systems: Provide the flexibility to use many-to-many relationships between entities, while simultaneously providing ways for programmers to intentionally limit the ways that entities can be connected.

But notice that we’re not saying that one mechanism does both jobs. No, we’re saying that one tool helps us increase expressivity, while another helps us limit the perceived complexity of our programs, and the two work in tension with each other.

Now that we’ve established our heuristic, let’s look at some higher-order functions, and see what they can tell us about expressiveness and perceived complexity.


brown Mandelbrot

Brown Mandelbrot, © 2008 docnic, Some rights reserved


higher-order functions

Functions that accept functions as parameters, and/or return functions as values, are called Higher-Order Functions, or “HOFs.” Languages that support HOFs also support the idea of functions as first-class values, and nearly always support the idea of dynamically creating functions.

HOFs give programmers even more ways to decompose and compose programs, and thus more ways to write programs where there is a one-to-one relationship between functions and responsibilities. Let’s look at an example.

Rumour has it that there are excellent companies that ask coöp students to write code as part of the interview process. A typical problem will ask the student to demonstrate their facility solving a problem that ought to be familiar to a computer science or computer engineering student.

For example, merging two sorted lists. This is something a student will have at least looked at, and it does have some applicability to modern service architectures. Here’s a naïve implementation:

function merge ({ list1, list2 }) {
  if (list1.length === 0 || list2.length === 0) {
    return list1.concat(list2);
  } else {
    let atom, remainder;

    if (list1[0] < list2[0]) {
      atom = list1[0];
      remainder = {
        list1: list1.slice(1),
        list2
      };
    } else {
      atom = list2[0];
      remainder = {
        list1,
        list2: list2.slice(1)
      };
    }
    const left = atom;
    const right = merge(remainder);

    return [left, ...right];
  }
}

merge({
  list1: [1, 2, 5, 8],
  list2: [3, 4, 6, 7]
})
  //=> [1, 2, 3, 4, 5, 6, 7, 8]

And here’s a function that finds the sum of a list of numbers:

function sum(list) {
  if (list.length === 0) {
    return 0;
  } else {
    const [atom, ...remainder] = list;
    const left = atom;
    const right = sum(remainder);

    return left + right;
  }
}

sum([42, 3, -1])
  //=> 44

We’ve written them so that both have the same structure: they are linearly recursive. Can we extract this structure?


linrec

Linear recursion has a simple form:

  1. Look at the input. Can we break an element off?
  2. If not, what value do we return?
  3. If we can break an element off, divide the input into the element and a remainder
  4. Run our linearly recursive function on the remainder, then
  5. Combine our element with the result of our linearly recursive function on the remainder

Both of our examples above have this form, and we will write a higher-order function to implement linear recursion. To get started with our extraction, it helps to take an example of the function we want to implement, and extract its future parameters as constants:

function sum(list) {
  const indivisible = (list) => list.length === 0;
  const value = () => 0;
  const divide = (list) => {
    const [atom, ...remainder] = list;

    return { atom, remainder };
  };
  const combine = ({ left, right }) => left + right;

  if (indivisible(list)) {
    return value(list);
  } else {
    const { atom, remainder } = divide(list);
    const left = atom;
    const right = sum(remainder);

    return combine({ left, right });
  }
}

We’re just about ready to make our higher-order function. Our penultimate step is to rename sum to myself, and list to input:

function myself (input) {
  const indivisible = (list) => list.length === 0;
  const value = () => 0;
  const divide = (list) => {
    const [atom, ...remainder] = list;

    return { atom, remainder };
  };
  const combine = ({ left, right }) => left + right;

  if (indivisible(input)) {
    return value(input);
  } else {
    const { atom, remainder } = divide(input);
    const left = atom;
    const right = myself(remainder);

    return combine({ left, right });
  }
}

The final step is to turn our constant functions into parameters of a function that returns our myself function:

function linrec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const { atom, remainder } = divide(input);
      const left = atom;
      const right = myself(remainder);

      return combine({ left, right });
    }
  }
}

const sum = linrec({
  indivisible: (list) => list.length === 0,
  value: () => 0,
  divide: (list) => {
    const [atom, ...remainder] = list;

    return { atom, remainder };
  },
  combine: ({ left, right }) => left + right
});

And now we can exploit the similarities between sum and merge, by using linrec to write merge as well:

const merge = linrec({
  indivisible: ({ list1, list2 }) => list1.length === 0 || list2.length === 0,
  value: ({ list1, list2 }) => list1.concat(list2),
  divide: ({ list1, list2 }) => {
    if (list1[0] < list2[0]) {
      return {
        atom: list1[0],
        remainder: {
          list1: list1.slice(1),
          list2
        }
      };
    } else {
      return {
        atom: list2[0],
        remainder: {
          list1,
          list2: list2.slice(1)
        }
      };
    }
  },
  combine: ({ left, right }) => [left, ...right]
});
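
Nothing in linrec is specific to lists. As a small sketch of that point, here is a hypothetical factorial (our own example, not from the essay’s code) where the input is a number, the “element” is the number itself, and the “remainder” is the next smaller number. linrec is repeated so the snippet runs on its own.

```javascript
function linrec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const { atom, remainder } = divide(input);
      const left = atom;
      const right = myself(remainder);

      return combine({ left, right });
    }
  }
}

// Hypothetical example: the same four roles, applied to integers.
const factorial = linrec({
  indivisible: (n) => n <= 1,
  value: () => 1,
  divide: (n) => ({ atom: n, remainder: n - 1 }),
  combine: ({ left, right }) => left * right
});

factorial(5)
  //=> 120
```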

But why stop there?


binrec

binrec is a higher-order function for implementing binary recursion. Remember our coöp student implementing a merge between sorted lists? One of the cool things you can do with a merge function is write a merge sort, and advanced students are often asked to at least sketch out how it would work.

binrec is actually simpler than linrec in at least one respect, because instead of having an element and a remainder, binrec divides a problem into two parts and applies the same algorithm to both halves:

function binrec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      let { left, right } = divide(input);

      left = myself(left);
      right = myself(right);

      return combine({ left, right });
    }
  }
}

const mergeSort = binrec({
  indivisible: (list) => list.length <= 1,
  value: (list) => list,
  divide: (list) => ({
    left: list.slice(0, list.length / 2),
    right: list.slice(list.length / 2)
  }),
  combine: ({ left: list1, right: list2 }) => merge({ list1, list2 })
});

mergeSort([1, 42, 4, 5])
  //=> [1, 4, 5, 42]

From binrec, we can derive multirec, which divides the problem into an arbitrary number of symmetrical parts:

function mapWith (fn) {
  return function * (iterable) {
    for (const element of iterable) {
      yield fn(element);
    }
  };
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

const mergeSort = multirec({
  indivisible: (list) => list.length <= 1,
  value: (list) => list,
  divide: (list) => [
    list.slice(0, list.length / 2),
    list.slice(list.length / 2)
  ],
  combine: ([list1, list2]) => merge({ list1, list2 })
});
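
To see the “arbitrary number of parts” in action, here is a sketch of a hypothetical sumNested that totals the numbers in arbitrarily nested arrays. The name and the example data are ours. Note that combine receives the lazy iterable produced by mapWith rather than an array, so it iterates over the solutions instead of indexing into them.

```javascript
function mapWith (fn) {
  return function * (iterable) {
    for (const element of iterable) {
      yield fn(element);
    }
  };
}

function multirec({ indivisible, value, divide, combine }) {
  return function myself (input) {
    if (indivisible(input)) {
      return value(input);
    } else {
      const parts = divide(input);
      const solutions = mapWith(myself)(parts);

      return combine(solutions);
    }
  }
}

// Hypothetical example: every array divides into as many parts as it has items.
const sumNested = multirec({
  indivisible: (item) => !Array.isArray(item),
  value: (item) => item,
  divide: (list) => list,
  combine: (solutions) => {
    let total = 0;

    for (const solution of solutions) {
      total += solution;
    }

    return total;
  }
});

sumNested([1, [2, 3], [[4], 5]])
  //=> 15
```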

There are an infinitude of higher-order functions we could explore, but these are enough for now. Let’s return to thinking about the relationship between expressiveness and perceived complexity.


the relationship between higher-order functions, expressiveness, and complexity

In typical programming, functions invoke each other, and by creating many-to-many relationships between functions, we increase expressiveness by making sure that one and only one function implements any one responsibility. If two functions implement the same responsibility, we are less DRY, and less expressive.

How do higher-order functions come into this? Well, as we saw, merge and sum have different responsibilities in the solution domain–merging lists and summing lists. But they share a common implementation structure, linear recursion. Therefore, they both are responsible for implementing a linearly recursive algorithm.

By extracting this algorithm into linrec, we once again make sure that one and only one entity, linrec, is responsible for implementing linear recursion. Thus, we find that a feature like first-class functions does give us the power of greater expressiveness, as it gives us at least one more way to create many-to-many relationships between functions.

And we also know that this can increase perceived complexity if we do not also temper this increased expressiveness with language features or architectural designs that allow us to define groups of functions that have rich relationships within themselves, but only limited relationships with other groups.


web

Photo © 2010 Denis Mihailov, some rights reserved


one-to-many and many-to-many

There’s more to it than that. Let’s compare binrec and multirec. Or rather, let’s compare how we write mergeSort using binrec and multirec:

const mergeSort1 = binrec({
  indivisible: (list) => list.length <= 1,
  value: (list) => list,
  divide: (list) => ({
    left: list.slice(0, list.length / 2),
    right: list.slice(list.length / 2)
  }),
  combine: ({ left: list1, right: list2 }) => merge({ list1, list2 })
});

const mergeSort2 = multirec({
  indivisible: (list) => list.length <= 1,
  value: (list) => list,
  divide: (list) => [
    list.slice(0, list.length / 2),
    list.slice(list.length / 2)
  ],
  combine: ([list1, list2]) => merge({ list1, list2 })
});

The interesting things for us are the functions we supply as arguments. Let’s name them:

const hasAtMostOne = (list) => list.length <= 1;
const Identity = (list) => list;
const bisectLeftAndRight = (list) => ({
    left: list.slice(0, list.length / 2),
    right: list.slice(list.length / 2)
  });
const bisect = (list) => [
    list.slice(0, list.length / 2),
    list.slice(list.length / 2)
  ];
const mergeLeftAndRight = ({ left: list1, right: list2 }) => merge({ list1, list2 });
const mergeBisected = ([list1, list2]) => merge({ list1, list2 });

Looking at the names and at what the functions do, it seems that some, namely hasAtMostOne, Identity, and bisect, feel like general-purpose functions that we might find ourselves using throughout one or many programs. And in fact, they can often be found in general-purpose function utility libraries. They express universal operations on lists.

bisectLeftAndRight and mergeLeftAndRight, on the other hand, seem more specialized. They are unlikely to be used anywhere else. mergeBisected is a toss-up: we might need it elsewhere, we might not.

We can also say that there is a many-to-many relationship between functions in our programs and the hasAtMostOne, Identity, and bisect functions. Functions like mergeSort2 call many other functions, and functions like bisect can be called by many other functions.

And as noted in the beginning, this “many-to-many-ness” contributes to expressiveness and to ensuring that we can write software where there is a one-to-one relationship between entities and responsibilities. For example, bisect is the authority on bisecting lists. We can arrange to write all of our code to invoke bisect, rather than duplicating its functionality.

Our heuristic is that the more general-purpose the interface and behavioural “contract” that a function provides, and the more focused and simple a responsibility it has, the greater its “many-to-many-ness.” Therefore, when we write higher-order functions like multirec, we should strive to design them to accept general-purpose parameters with simple responsibilities.

We can also write functions like bisectLeftAndRight and mergeLeftAndRight. But when we do, there will be a one-to-many relationship, because we have little use for them outside of our specific merge application. This does allow us to structure our code and extract commonality, like how to perform a binary recursion, but by limiting the many-to-many-ness of our program, we limit its expressiveness.

Unfortunately, this limitation of expressiveness does not directly translate to limiting the perceived complexity of our programs. We can tell from detailed inspection that a function like bisectLeftAndRight will not be useful elsewhere in the program, but if we do not employ a tool like module scoping to enforce this and make it obvious at a glance, we do not really limit its perceived complexity.
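
As a rough sketch of what “a tool like module scoping” could look like, the snippet below hides the specialized helpers inside a function scope so that only mergeSort escapes. The immediately-invoked wrapper and the loop-based merge are our stand-ins, chosen to keep the snippet self-contained; a real program might use an ES module and the essay’s linrec-based merge instead.

```javascript
const mergeSort = (() => {
  function binrec({ indivisible, value, divide, combine }) {
    return function myself (input) {
      if (indivisible(input)) {
        return value(input);
      } else {
        let { left, right } = divide(input);

        left = myself(left);
        right = myself(right);

        return combine({ left, right });
      }
    }
  }

  // Stand-in for the essay's merge: same contract, written as a loop.
  const merge = ({ list1, list2 }) => {
    const merged = [];
    let i = 0;
    let j = 0;

    while (i < list1.length && j < list2.length) {
      merged.push(list1[i] < list2[j] ? list1[i++] : list2[j++]);
    }

    return merged.concat(list1.slice(i), list2.slice(j));
  };

  // Specialized helpers: visible here, invisible to the rest of the program.
  const bisectLeftAndRight = (list) => ({
    left: list.slice(0, list.length / 2),
    right: list.slice(list.length / 2)
  });
  const mergeLeftAndRight = ({ left: list1, right: list2 }) => merge({ list1, list2 });

  return binrec({
    indivisible: (list) => list.length <= 1,
    value: (list) => list,
    divide: bisectLeftAndRight,
    combine: mergeLeftAndRight
  });
})();

mergeSort([1, 42, 4, 5])
  //=> [1, 4, 5, 42]

typeof bisectLeftAndRight
  //=> 'undefined'
```

With the helpers scoped this way, a reader can tell at a glance that nothing else depends on them, which is the reduction in perceived complexity the paragraph above is after.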

From this we can observe that many programming techniques, such as writing highly specialized interfaces for functions, or having complex responsibilities, can serve to limit a program’s expressiveness without providing the benefit of limiting its perceived complexity.


framework

Framework, © 2006 kaz k, some rights reserved


what higher-order functions tell us about frameworks and libraries

Roughly speaking, both frameworks and libraries are collections of classes, functions, and other code that we blend with our own code to write programs. But frameworks are designed to call our code, while libraries are designed to be called by our code.

Frameworks typically expect us to write functions or create entities with very specific, proprietary interfaces and behavioural contracts. For example, Ember requires us to extend its own base classes for things like component classes, instead of using ordinary JavaScript ES-6 classes. As we noted above, when we have specific interfaces, we limit the expressiveness of our programs, but not their perceived complexity.

The underlying assumption is that we are writing code for the framework, so the framework’s author is not concerned with setting-up a many-to-many relationship between the framework’s code and our code. For example, we cannot use JavaScript mixins, subclass factories, or method advice with the classes we write in Ember. We have to use the specific, proprietary meta-programming facilities that Ember provides, or are provided in specific plugins written for Ember.

Framework-oriented code tends to be more one-to-many than many-to-many, and thus tends to be less expressive.

Libraries, by contrast, are designed to be called by our code. And more importantly, by the code of many, many different teams, each of which has its own programming style and approach. This leads library authors in general to write functions with generic interfaces and simple responsibilities.

Library-oriented code tends to be more many-to-many than one-to-many, and thus can be more expressive.

Is framework-oriented code a bad thing? It’s a tradeoff. Frameworks provide standard ways to do things. Frameworks hold out the promise of doing more things for us, and especially doing more complex things for us.

Ideally, although our code may be less expressive with a framework, our goal should be that we write less code against a framework than we would using libraries, and that we use other mechanisms to limit the perceived complexity of our code.

But from our exploration of linrec, binrec, and multirec, we can see that higher-order functions teach us something about specific and general interfaces, and that teaches us something about the tradeoffs between using frameworks and libraries.


afterward

There is more to read about multirec in the follow-up essay, Why recursive data structures?, and the final chapter, Time, Space, and Life As We Know It.

An early draft of this post was reviewed by Jinny Kim and Peter Sobot. Thank you.

Have an observation? Spot an error? You can open an issue, discuss this on reddit, or even edit this post yourself.

And hey: If you like this essay, you’ll love JavaScript Allongé. It’s free to read online!


notes
  1. Although this essay is going to talk about functions, everything we look at is applicable to methods and by analogy, to classes. We’re just sticking to talking about functions for simplicity’s sake. 

  2. “Ils doivent envisager qu’une grande responsabilité est la suite inséparable d’un grand pouvoir.”—quoteinvestigator.com 

https://raganwald.com/2016/12/15/what-higher-order-functions-can-teach-us-about-libraries-and-frameworks
Anamorphisms in JavaScript

Unfolded

Unfolded, © 2011 Regulla, Some rights reserved


preamble: unfolds and folds

Anamorphisms are functions that map from some object to a more complex structure containing the type of the object. For example, mapping from an integer to a list of integers.

Here’s an anamorphism:

function downToOne(n) {
  const list = [];

  for (let i = n; i > 0; --i) {
    list.push(i);
  }

  return list;
}

downToOne(5)
  //=> [ 5, 4, 3, 2, 1 ]

An integer is our object, and the array containing integers is our “structure containing integers.” Maps from integers to arrays of integers are anamorphisms. “Anamorphism” is a very long word, and using it implies that we are going to be strict about following category theory. So let’s use a simpler word that has some poetic value: Unfold. Anamorphisms “unfold” values into more complex structures containing those values.

I like to think of the integer 5 as having all the whole numbers less than five folded up inside itself.

So we’ll use the word “unfold” from now on.

Unfolds (or anamorphisms) are the dual—a fancy word for complement—of Catamorphisms, functions that map from some complex structure containing values down to values.

Here’s a catamorphism:

function product(list) {
  let product = 1;

  for (const n of list) {
    product *= n;
  }

  return product;
}

product(downToOne(5))
  //=> 120

“Catamorphism” is another long word that implies that we are going to be strict about following category theory. So let’s call these things folds instead. We can think of product as folding a list of integers into an integer.


folding and unfolding with generators

Unfolding a number into a list is questionable practice. It takes up a lot of space, and we just want to iterate over it, as we did above with product. Well, we can unfold it into a finite iteration with a generator:

function * downToOne(n) {
  for (let i = n; i > 0; --i) {
    yield i;
  }
}

[...downToOne(5)]
  //=> [ 5, 4, 3, 2, 1 ]

Remember our product definition? Thanks to the way a for... of loop works in JavaScript, it folds any iterable of integers:

function product(iterable) {
  let product = 1;

  for (const n of iterable) {
    product *= n;
  }

  return product;
}

product(downToOne(5))
  //=> 120

Folding and unfolding work very well with iterators.


generalizing folds

We know that there is a very handy way to fold arrays, the .reduce method:

function product(list) {
  return list.reduce((acc, n) => acc * n, 1);
}

// .reduce is an array method, so spread downToOne's result into an array first
product([...downToOne(5)])
  //=> 120

const factorial = (n) => product([...downToOne(n)]);

The two problems we have with .reduce are that first, it takes multiple arguments, and second, it is a method on arrays and not on everything iterable. But it’s a useful pattern, and we can reproduce it by hand.

Here we create a foldWithFnAndSeed function that takes a folding function and a seed value, and gives us back a fold function. We use that to make our own product fold:

function foldWithFnAndSeed(fn, seed) {
  return function fold (iterable) {
    let acc = seed;

    for (const element of iterable) {
      acc = fn(acc, element);
    }

    return acc;
  }
}

const product = foldWithFnAndSeed(
    (acc, n) => acc * n, 1
  );

product(downToOne(5))

A variation uses the first element of the iterable as the seed, then iterates over the remainder. This is adequate for many purposes, including product:

function foldWith(fn) {
  return function fold (iterable) {
    const iterator = iterable[Symbol.iterator]();
    let { value: acc } = iterator.next();

    for (const element of iterator) {
      acc = fn(acc, element);
    }

    return acc;
  }
}

const product = foldWith(
    (acc, n) => acc * n
  );

product(downToOne(5))

For historical reasons, reduce in JavaScript takes two positional arguments, and we have to remember what they are and what order they’re in. Those reasons no longer exist: JavaScript, like nearly every other language in this century, has rediscovered what Smalltalk knew in 1980: named parameters are better. So we’ll use destructuring to emulate named parameters:

function foldWith(fn) {
  return function fold (iterable) {
    const iterator = iterable[Symbol.iterator]();
    let { value: acc } = iterator.next();

    for (const element of iterator) {
      acc = fn({ acc, element });
    }

    return acc;
  }
}

const product = foldWith(
    ({ acc, element: n }) => acc * n
  );

product(downToOne(5))
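
As a quick sketch of how far this one helper reaches, here is a hypothetical maximum fold (the name is ours) applied to an array, a Set, and an empty array. foldWith is repeated so the snippet runs on its own. One consequence of seeding from the first element is worth noticing: with nothing to seed from, the fold returns undefined.

```javascript
function foldWith(fn) {
  return function fold (iterable) {
    const iterator = iterable[Symbol.iterator]();
    let { value: acc } = iterator.next();

    for (const element of iterator) {
      acc = fn({ acc, element });
    }

    return acc;
  }
}

// Hypothetical example: keep whichever of the two values is larger.
const maximum = foldWith(
    ({ acc, element }) => element > acc ? element : acc
  );

maximum([3, 9, 2])
  //=> 9

maximum(new Set([4, 1]))
  //=> 4

maximum([])
  //=> undefined
```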

generalizing unfolds

Our foldWith function is handy, and we can use the same general idea for making unfolders. We want to end up with an unfoldWith function that takes, as its argument, a function for unfolding elements.

What will this function look like? It is the opposite of the function we used to fold: it will take a value as its argument, and return the next value, an element to yield, and whether it is done. Naturally, we’ll use destructuring to extract multiple return values:

function unfoldWith(fn) {
  return function * unfold (value) {
    let { next, element, done } = fn(value);

    while (!done) {
      yield element;
      ({ next, element, done } = fn(next));
    }
  }
}

const downToOne = unfoldWith(
    (n) => n > 0
    ? {
        element: n,
        next: n - 1
      }
    : { done: true }
  );

product(downToOne(5))
  //=> 120
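
The same helper unfolds other things, too. As a sketch, here is a hypothetical digitsLowToHigh (our name and our example) that unfolds a positive integer into its decimal digits, least significant first. unfoldWith is repeated so the snippet runs on its own.

```javascript
function unfoldWith(fn) {
  return function * unfold (value) {
    let { next, element, done } = fn(value);

    while (!done) {
      yield element;
      ({ next, element, done } = fn(next));
    }
  }
}

// Hypothetical example: peel off the last digit, carry on with the rest.
const digitsLowToHigh = unfoldWith(
    (n) => n > 0
      ? {
          element: n % 10,
          next: Math.floor(n / 10)
        }
      : { done: true }
  );

[...digitsLowToHigh(1234)]
  //=> [ 4, 3, 2, 1 ]
```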

For a moment, let’s close our eyes very tightly, plug our ears with our fingers, and murmur “la-la-la-la-la” at the thought of performance costs or implementation limits. Ready? Ok:

If we didn’t have a looping construct like while, we could write unfoldWith recursively:

function unfoldWith(fn) {
  return function * unfold (value) {
    let { next, element, done } = fn(value);

    if (!done) {
      yield element;
      yield * unfold(next);
    }
  }
}

const downToOne = unfoldWith(
    (n) => n > 0
      ? {
          element: n,
          next: n - 1
        }
      : { done: true }
  );

product(downToOne(5))
  //=> 120

It works just the same up until JavaScript’s stack overflows.

Reminder: yield * yields all the elements of an iterable, so yield * unfold(next) will yield the remaining elements.

Although it may be impractical for working at scale, what is interesting about the recursive unfold is that it encodes very directly how to do an unfold using linear recursion: Given a structure, turn it into part of the result and a structure representing the rest of the work to do.

In the case of unfolding, we take a structure, and return an element and a structure representing the part we haven’t unfolded yet. We then yield the element and the result of unfolding the rest of the structure.

Divide-and-conquer algorithms all have this structure: Break the problem into parts, apply the algorithm to each part, then glue the results back together. Recursive divide-and-conquer applies the exact same algorithm at finer and finer scale until a simple case is reached.

Linear recursion is a special case of divide-and-conquer, where we break the simple case off, solve it, and recursively apply our algorithm to the remainder of the input.


traversing lists, trees and forests

A traversal, or path, is a function that takes a structure and returns its elements as an iterator. JavaScript gives us built-in traversals for returning the values in arrays: we just iterate over them. But sometimes we want to iterate in another order. For that, we need a traversal.

One handy use for unfolds is to use them to express traversals. We can do that with an array. Here’s one that uses our unfoldWith exactly as described: it divides a non-empty array into an element and the rest of the array to unfold:

const butLast = (array) => array.slice(0, array.length - 1);
const last = (array) => array[array.length - 1];

const inReverse = unfoldWith(
    (array) => array.length > 0
      ? {
          element: last(array),
          next: butLast(array)
        }
      : { done: true }
  );

[...inReverse(['a', 'b', 'c'])]
  //=> ['c', 'b', 'a']

This is simple, but it makes successive copies of the array, and in JavaScript, this is expensive. It would be cheaper if we used the original array, but managed a cursor to keep track of our position in the array.

function * inReverse(array, cursor = array.length - 1) {
  if (cursor >= 0) {
    yield array[cursor];
    yield * inReverse(array, cursor - 1);
  }
}

[...inReverse(['a', 'b', 'c'])]
  //=> ['c', 'b', 'a']
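
The cursor idea does not require giving up unfoldWith. As a sketch, the value being unfolded can be a pair holding the untouched array and a cursor, so no copies are made. The name inReverseFrom and the shape of the pair are our assumptions, and unfoldWith is repeated so the snippet runs on its own.

```javascript
function unfoldWith(fn) {
  return function * unfold (value) {
    let { next, element, done } = fn(value);

    while (!done) {
      yield element;
      ({ next, element, done } = fn(next));
    }
  }
}

// The state we unfold is { array, cursor }; the array itself is never sliced.
const inReverseFrom = unfoldWith(
    ({ array, cursor }) => cursor >= 0
      ? {
          element: array[cursor],
          next: { array, cursor: cursor - 1 }
        }
      : { done: true }
  );

const inReverse = (array) => inReverseFrom({ array, cursor: array.length - 1 });

[...inReverse(['a', 'b', 'c'])]
  //=> ['c', 'b', 'a']
```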

We can easily write this with a while loop, but there is more interesting business afoot.

Consider this binary tree:


Binary Tree

Image © Sam Gavis-Hughson, Coding Interview Question: Balanced Binary Tree


This tree can be represented as a nested POJO:

const tree = {
    label: 1,
    children: [
      {
        label: 2,
        children: [
          {
            label: 4,
            children: []
          },
          {
            label: 5,
            children: []
          }
        ]
      },
      {
        label: 3,
        children: [
          {
            label: 6,
            children: []
          }
        ]
      }
    ]
  };

Let’s write a traversal for it:

function * elements (tree) {
  yield tree.label;
  for (const child of tree.children) {
    yield * elements(child);
  }
}

[...elements(tree)]
  //=> [ 1, 2, 4, 5, 3, 6]

Note that the form of this traversal is almost identical to the form of our recursive linear unfold. The big difference is that we start with one element, but the “remainder” of our work is multiple elements.

We can unify them. A tree is a divergent graph with a single root node. If we have a collection of root nodes, it’s called a forest. The second-simplest possible forest is a collection with just one root, and that’s just like a tree.

So if we write a traversal for forests, we also get one for trees. We’ll give this a more descriptive name:

function * depthFirst (forest) {
  if (forest.length > 0) {
    const [first, ...butFirst] = forest;

    yield first.label;
    yield * depthFirst(first.children.concat(butFirst))
  }
}

const simpleForest = [tree];

[...depthFirst(simpleForest)]
  //=> [ 1, 2, 4, 5, 3, 6]

This looks more familiar! We can express this as a linear recursion unfold:

const first = (array) => array[0];
const butFirst = (array) => array.slice(1);

const depthFirst = unfoldWith(
    (forest) => forest.length > 0
      ? {
          element: first(forest).label,
          next: first(forest).children.concat(butFirst(forest))
        }
      : { done: true }
  );

[...depthFirst(simpleForest)]
  //=> [ 1, 2, 4, 5, 3, 6]

All this work brings us to this: Just as we can express forward and backwards traverses through a list, we can express different kinds of traversals through more complex structures, like trees or forests.

Here is a breadth-first traversal of a forest:

const breadthFirst = unfoldWith(
    (forest) => forest.length > 0
      ? {
          element: first(forest).label,
          next: butFirst(forest).concat(first(forest).children)
        }
      : { done: true }
  );

[...breadthFirst(simpleForest)]
  //=> [ 1, 2, 3, 4, 5, 6 ]

Notice how it looks almost exactly identical to the depth-first expression. This tells us that we have exploited the symmetry between the various paths we can take through a forest. A right-to-left breadth-first search is likewise easy to code:

const rightToLeftBreadthFirst = unfoldWith(
    (forest) => forest.length > 0
      ? {
          element: last(forest).label,
          next: last(forest).children.concat(butLast(forest))
        }
      : { done: true }
  );

[...rightToLeftBreadthFirst(simpleForest)]
  //=> [ 1, 3, 2, 6, 5, 4 ]

Writing traversals illustrates that not only can we separate the way to iterate over a particular data structure from the things we do with its elements, but we can decompose the algorithm of traversing or unfolding such that we hide away questions like whether we are looping or recursing and focus on the traverse or unfold’s pertinent logic.


unfolds, wrapped up

When working with data structures, we often want to provide certain common operations on their elements. For example, questions like “What is the set of unique elements of this collection?” or “What is the sum of the numbers in this collection?”

If we write a traversal for the collection, we turn it into an iterator, and we can write a single generic fold over the resulting iteration. We can write:

const sum = foldWith(
    ({ acc, element }) => acc + element
  );

And it works with forests as easily as it works with lists:

sum([1, 2, 3, 4, 5, 6])
  //=> 21

sum(depthFirst(simpleForest))
  //=> 21

Writing traversals separates the concern of how to iterate over a data structure from the concern of what to do with the elements of the data structure.
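
The other question raised above, “What is the set of unique elements of this collection?”, can be sketched the same way. Here a hypothetical unique fold is built with the seeded foldWithFnAndSeed and run over a depth-first traversal. The sample forest is ours. The folding function deliberately builds a new Set on each step, because the seed object is shared between calls and mutating it would leak elements from one fold into the next.

```javascript
function unfoldWith(fn) {
  return function * unfold (value) {
    let { next, element, done } = fn(value);

    while (!done) {
      yield element;
      ({ next, element, done } = fn(next));
    }
  }
}

function foldWithFnAndSeed(fn, seed) {
  return function fold (iterable) {
    let acc = seed;

    for (const element of iterable) {
      acc = fn(acc, element);
    }

    return acc;
  }
}

const first = (array) => array[0];
const butFirst = (array) => array.slice(1);

const depthFirst = unfoldWith(
    (forest) => forest.length > 0
      ? {
          element: first(forest).label,
          next: first(forest).children.concat(butFirst(forest))
        }
      : { done: true }
  );

// Hypothetical fold: never mutates the shared seed.
const unique = foldWithFnAndSeed(
    (acc, element) => new Set([...acc, element]),
    new Set()
  );

// Made-up forest with repeated labels.
const forest = [
  {
    label: 'a',
    children: [
      { label: 'b', children: [] },
      { label: 'a', children: [] }
    ]
  },
  { label: 'b', children: [] }
];

[...unique(depthFirst(forest))]
  //=> ['a', 'b']
```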

You often hear the advice to write code in small functions, to avoid putting too many lines in one method. Obviously, we can break big methods up by extracting “helper” methods, and that will satisfy JSLint, but that is treating the symptom and not the problem.

Building blocks like unfoldWith serve to illustrate the idea that many algorithms do share common concerns. All we need to do is train ourselves to see the underlying symmetry between them, and we can decompose our functions and recompose them with abandon.


have your say

Have an observation? Spot an error? You can open an issue, discuss this on reddit, or even edit this post yourself!

https://raganwald.com/2016/11/30/anamorphisms-in-javascript
From Mixins to Object Composition

In Why Are Mixins Considered Harmful, we saw that concatenative sharing–as exemplified by mixins–leads to snowballing complexity because of three effects:

  1. Lack of Encapsulation
  2. Implicit Dependencies
  3. Name Clashes

We looked at some variations on creating encapsulation to help reduce the “surface area” for dependencies to grow and names to clash, but noted that this merely slows down the growth of the problem, it does not fix it.

That being said, that doesn’t mean that mixins are terrible. Mixins are simple to understand and to implement. As we’ll see in this essay, it is straightforward to refactor away from mixins, and then we can work on reducing our code’s complexity.

For many projects, mixins are the right choice right now. The important thing is to understand their limitations and the problems that may arise, so that we can know when to incorporate a heavier, but more complexity-taming architecture.

It is more important to know how to refactor to a particular architecture, than to know in advance which architecture can serve all of our needs now, and in the future.

In this post, we are going to look at object composition, a technique that has a few more moving parts than mixins, but opens up the opportunity to make dependencies explicit, enforce a stronger level of encapsulation, and can be built upon for richer forms of method decoration.


mixing deck

encapsulation for mixins

Mixins can encapsulate methods and properties. We saw in the previous post how we can use symbols (or pseudo-random strings) to separate methods intended to be a part of the interface from those intended to be part of the implementation (a/k/a “private”). Take this mixin where we have just used a comment to indicate our preferences:

const Coloured = {
  // __Public Methods__
  setColourRGB ({r, g, b}) {
    return this.colourCode = {r, g, b};
  },
  getColourRGB () {
    return this.colourCode;
  },
  getColourHex () {
    return this.rgbToHex(this.colourCode);
  },

  // __Private Methods__
  componentToHex(c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  rgbToHex({r, g, b}) {
    return "#" + this.componentToHex(r) + this.componentToHex(g) + this.componentToHex(b);
  }
};

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

Object.assign(Todo.prototype, Coloured);

Using symbols for the private names, we write Coloured as:

const colourCode = Symbol('colourCode');
const componentToHex = Symbol('componentToHex');
const rgbToHex = Symbol('rgbToHex');

const Coloured = {
  setColourRGB ({r, g, b}) {
    return this[colourCode] = {r, g, b};
  },
  getColourRGB () {
    return this[colourCode];
  },
  getColourHex () {
    return this[rgbToHex](this.getColourRGB());
  },
  [componentToHex](c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  [rgbToHex]({r, g, b}) {
    return "#" + this[componentToHex](r) + this[componentToHex](g) + this[componentToHex](b);
  }
};

But let’s move along a bit and we’ll see how to fix the implicit/explicit problem. First, let’s look at another way to create encapsulation, using composition.


stone wall

encapsulating behaviour with object composition

“Composition” is a general term for any mixing of behaviour from two entities. Mixins as described above are a form of composition. Functional composition is another. Object composition is when we mix two objects, not an object and a prototype or two functions.

Let’s do it by hand: We’ll start with our Todo class as usual:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

We’ll give it a colouredObject property, using the Symbol pattern from above:

const colourCode = Symbol('colourCode');
const componentToHex = Symbol('componentToHex');
const rgbToHex = Symbol('rgbToHex');

const Coloured = {
  setColourRGB ({r, g, b}) {
    return this[colourCode] = {r, g, b};
  },
  getColourRGB () {
    return this[colourCode];
  },
  getColourHex () {
    return this[rgbToHex](this.getColourRGB());
  },
  [componentToHex](c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  [rgbToHex]({r, g, b}) {
    return "#" + this[componentToHex](r) + this[componentToHex](g) + this[componentToHex](b);
  }
};

const colouredObject = Symbol('colouredObject');

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
    this[colouredObject] = Object.assign({}, Coloured);
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

Now we have a copy of Coloured in every Todo instance. But we haven’t actually added any behaviour to Todo. To do that, we’ll delegate some methods from Todo to Coloured:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
    this[colouredObject] = Object.assign({}, Coloured);
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
  setColourRGB ({r, g, b}) {
    return this[colouredObject].setColourRGB({r, g, b});
  }
  getColourRGB () {
    return this[colouredObject].getColourRGB();
  }
  getColourHex () {
    return this[colouredObject].getColourHex();
  }
}

Presto, we have the setColourRGB, getColourRGB, and getColourHex methods added to Todo, but we delegate them to a separate object, this[colouredObject], to implement. All of this[colouredObject]’s properties and other methods are somewhat encapsulated away.

As a bonus, we have what we might call “weak explicit dependencies:” Looking at Todo, it’s quite easy to see that we have delegated the setColourRGB, getColourRGB, and getColourHex methods. If we had a much bigger and more complex class with lots of object compositions, we could easily see which methods were delegated to which objects.
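To check what “somewhat encapsulated” means in practice, here is a minimal sketch of the hand-rolled delegation. It assumes a trimmed-down Coloured with only two methods (the hex helpers are left out for brevity, which is a simplification made for this example):

```javascript
// A trimmed-down behaviour object: just enough to show the delegation.
const colourCode = Symbol('colourCode');

const Coloured = {
  setColourRGB ({r, g, b}) {
    return this[colourCode] = {r, g, b};
  },
  getColourRGB () {
    return this[colourCode];
  }
};

const colouredObject = Symbol('colouredObject');

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
    // every instance gets its own copy of the behaviour
    this[colouredObject] = Object.assign({}, Coloured);
  }
  setColourRGB ({r, g, b}) {
    return this[colouredObject].setColourRGB({r, g, b});
  }
  getColourRGB () {
    return this[colouredObject].getColourRGB();
  }
}

const first = new Todo('first');
const second = new Todo('second');

first.setColourRGB({r: 1, g: 2, b: 3});

console.log(first.getColourRGB().g);  // 2
console.log(second.getColourRGB());   // undefined, the state is per-instance
console.log(Object.keys(first));      // only 'name' and 'done' are listed
```

The colour lives on the composed object, not on the Todo itself, so the Todo’s own string-keyed properties stay limited to name and done.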

All the same, this has an awful lot of moving parts compared to Object.assign. Do we have to type all of this? Or is there an easier way?


automating object composition

Let’s automate the process. To begin with, let’s recognize that although Object.assign can be all you need for naïve mixins, a better way to write them is to turn the mixin definition into a function that transforms a class, like this:

const FunctionalMixin = (behaviour) =>
  target => {
    Object.assign(target.prototype, behaviour);
    return target;
  };

const colourCode = Symbol('colourCode');
const componentToHex = Symbol('componentToHex');
const rgbToHex = Symbol('rgbToHex');


const Coloured = FunctionalMixin({
  setColourRGB ({r, g, b}) {
    return this[colourCode] = {r, g, b};
  },
  getColourRGB () {
    return this[colourCode];
  },
  getColourHex () {
    return this[rgbToHex](this.getColourRGB());
  },
  [componentToHex](c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  [rgbToHex]({r, g, b}) {
    return "#" + this[componentToHex](r) + this[componentToHex](g) + this[componentToHex](b);
  }
});

const Todo = Coloured(class {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
});

Angus Croll calls this a functional mixin, and they work very well in ES6.

If we are adventurous, we might use a compiler that supports proposed–but unratified–JavaScript features like class decorators. If we do, functional mixins are very elegant:

@Coloured
class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
};

Now, if we write our mixins like this, what about refactoring mixins into object composition?

const ObjectComposer = (behaviour) =>
  target => {
    const composedObject = Symbol('composedObject');
    const exportedMethodNames = Object.keys(behaviour);

    for (const methodName of exportedMethodNames) {
      Object.defineProperty(target.prototype, methodName, {
        value: function (...args) {
          if (this[composedObject] == null) {
            this[composedObject] = Object.assign({}, behaviour);
          }
          return this[composedObject][methodName](...args);
        },
        writable: true
      });
    }
    return target;
  };

const colourCode = Symbol('colourCode');
const componentToHex = Symbol('componentToHex');
const rgbToHex = Symbol('rgbToHex');

const Coloured = ObjectComposer({
  setColourRGB ({r, g, b}) {
    return this[colourCode] = {r, g, b};
  },
  getColourRGB () {
    return this[colourCode];
  },
  getColourHex () {
    return this[rgbToHex](this.getColourRGB());
  },
  [componentToHex](c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  [rgbToHex]({r, g, b}) {
    return "#" + this[componentToHex](r) + this[componentToHex](g) + this[componentToHex](b);
  }
});

const Todo = Coloured(class {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
});

If we look carefully at the ObjectComposer function, we see that for each method of the behaviour we want to compose, it makes a method in the target’s prototype that delegates the method to the composed object. In our hand-rolled example, we initialized the object in the target’s constructor. For simplicity here, we lazily initialize it.

And wonder of wonders, because Object.keys does not enumerate symbols, every method we bind to a symbol in the behaviour is kept “private.”
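The two halves of that trick can be seen in isolation. This is a small sketch with a made-up behaviour object (greet and secret are names invented for the example): Object.keys skips the symbol-keyed method, while Object.assign still copies it.

```javascript
const secret = Symbol('secret');

const behaviour = {
  greet () {
    return this[secret]();
  },
  [secret] () {
    return 'hello';
  }
};

// Object.keys only reports string-keyed properties...
const exported = Object.keys(behaviour);

// ...but Object.assign copies symbol-keyed properties too
const copy = Object.assign({}, behaviour);

console.log(exported.length);      // 1, only 'greet'
console.log(typeof copy[secret]);  // 'function'
console.log(copy.greet());         // 'hello'
```

So the copy made for each instance is complete, but only the string-named methods are delegated from the target’s prototype.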

This is a bit more complex, but it gives us almost everything mixins already gave us. But we wanted more, specifically explicit dependencies. Can we do that?


list

making delegation explicit

Sure thing! Here’s a new version of ObjectComposer:

const ObjectComposer = (behaviour) =>
  (...exportedMethodNames) =>
    target => {
      const composedObject = Symbol('composedObject');

      for (const methodName of exportedMethodNames) {
        Object.defineProperty(target.prototype, methodName, {
          value: function (...args) {
            if (this[composedObject] == null) {
              this[composedObject] = Object.assign({}, behaviour);
            }
            return this[composedObject][methodName](...args);
          },
          writable: true
        });
      }
      return target;
    };

const Coloured = ObjectComposer({
  // __Public Methods__
  setColourRGB ({r, g, b}) {
    return this.colourCode = {r, g, b};
  },
  getColourRGB () {
    return this.colourCode;
  },
  getColourHex () {
    return this.rgbToHex(this.colourCode);
  },

  // __Private Methods__
  componentToHex(c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  rgbToHex({r, g, b}) {
    return "#" + this.componentToHex(r) + this.componentToHex(g) + this.componentToHex(b);
  }
});

const Todo = Coloured('setColourRGB', 'getColourRGB')(class {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
});

let t = new Todo('test');
t.setColourRGB({r: 1, g: 2, b: 3});
t.getColourHex();
  //=> t.getColourHex is not a function

That is correct! We didn’t explicitly say that we wanted to “import” the getColourHex method, so we didn’t get it. This is fun! What about resolving name conflicts? Let’s make a general-purpose renaming option:

const ObjectComposer = (behaviour) =>
  (...exportedMethodNames) => {
    const methodNameMap = exportedMethodNames.reduce((acc, name) => {
      const splits = name.split(' as ');

      if (splits.length === 1) {
        acc[name] = name;
      } else if (splits.length === 2) {
        acc[splits[0]] = splits[1];
      }
      return acc;
    }, {});
    return target => {
      const composedObject = Symbol('composedObject');

      for (const methodName of Object.keys(methodNameMap)) {
        const targetName = methodNameMap[methodName];

        Object.defineProperty(target.prototype, targetName, {
          value: function (...args) {
            if (this[composedObject] == null) {
              this[composedObject] = Object.assign({}, behaviour);
            }
            return this[composedObject][methodName](...args);
          },
          writable: true
        });
      }
      return target;
    };
  }

const Coloured = ObjectComposer({
  // __Public Methods__
  setColourRGB ({r, g, b}) {
    return this.colourCode = {r, g, b};
  },
  getColourRGB () {
    return this.colourCode;
  },
  getColourHex () {
    return this.rgbToHex(this.colourCode);
  },

  // __Private Methods__
  componentToHex(c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  rgbToHex({r, g, b}) {
    return "#" + this.componentToHex(r) + this.componentToHex(g) + this.componentToHex(b);
  }
});

const Todo = Coloured('setColourRGB as setColorRGB', 'getColourRGB as getColorRGB', 'getColourHex as getColorHex')(class {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
});

let t = new Todo('test');
t.setColorRGB({r: 1, g: 2, b: 3});
t.getColorHex()
  //=> #010203

I do not condone these Americanized misspellings, of course, but they demonstrate a facility that can be used to resolve unavoidable naming conflicts.[1]


recursive surface

going deeper

Our naïve mixins can do a few things that our object composition cannot. For one thing, we can write a method that returns this. Our composed objects should not return this, because their this is not the same thing as the target instance’s this.

A related issue is that our composed objects cannot call any of the class’s methods. We can write completely independent standalone functionality like Coloured above, but we can’t write functionality that “decorates” existing functionality.

For example, what if we want to compose this behaviour with Todo?

const Coloured = {
  // __Public Methods__
  setColourRGB ({r, g, b}) {
    return this.colourCode = {r, g, b};
  },
  getColourRGB () {
    return this.colourCode;
  },
  getColourHex () {
    return this.rgbToHex(this.colourCode);
  },
  colouredTitle () {
    return `<span font-color=${this.getColourHex()}>${this.title()}</span>`;
  },

  // __Private Methods__
  componentToHex(c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  rgbToHex({r, g, b}) {
    return "#" + this.componentToHex(r) + this.componentToHex(g) + this.componentToHex(b);
  }
};

The colouredTitle method isn’t going to work, because we want to access the instance’s title method, not the composed object (which doesn’t have one). What to do?

Well, if we can use delegation to “export” a bunch of methods, perhaps we can use delegation to import them as well:

const ObjectComposer =
  (...importedMethodNames) =>
    (behaviour) =>
      (...exportedMethodNames) =>
        target => {
          const composedObject = Symbol('composedObject');
          const instance = Symbol('instance');

          for (const exportedMethodName of exportedMethodNames) {
            Object.defineProperty(target.prototype, exportedMethodName, {
              value: function (...args) {
                if (this[composedObject] == null) {
                  this[composedObject] = Object.assign({}, behaviour);
                  // give the composed object a private reference back to the instance
                  this[composedObject][instance] = this;
                  for (const importedMethodName of importedMethodNames) {
                    // inside this function, `this` is the composed object
                    this[composedObject][importedMethodName] = function (...args) {
                      return this[instance][importedMethodName](...args);
                    }
                  }
                }
                return this[composedObject][exportedMethodName](...args);
              },
              writable: true
            });
          }
          return target;
        };

Now we can write:

const Coloured = ObjectComposer('title')({
  // __Public Methods__
  setColourRGB ({r, g, b}) {
    return this.colourCode = {r, g, b};
  },
  getColourRGB () {
    return this.colourCode;
  },
  getColourHex () {
    return this.rgbToHex(this.colourCode);
  },
  colouredTitle () {
    return `<span font-color=${this.getColourHex()}>${this.title()}</span>`;
  },

  // __Private Methods__
  componentToHex(c) {
    const hex = c.toString(16);

    return hex.length == 1 ? "0" + hex : hex;
  },
  rgbToHex({r, g, b}) {
    return "#" + this.componentToHex(r) + this.componentToHex(g) + this.componentToHex(b);
  }
});

const Todo = Coloured('setColourRGB', 'colouredTitle')(class {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
});

let t = new Todo('test');

t.setColourRGB({r: 1, g: 2, b: 3});
t.colouredTitle();
  //=> <span font-color=#010203>test</span>

Note that we are explicit about our dependencies in both directions.[2]
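Being explicit also makes the declarations checkable. The following is only a sketch of one way such a check might look: checkImports is a hypothetical helper invented for this example, not part of ObjectComposer above, and the Titled and Untitled classes exist only to exercise it.

```javascript
// Hypothetical helper: throws if `target` lacks any of the named methods.
const checkImports = (target, importedMethodNames) => {
  const missing = importedMethodNames.filter(
    name => typeof target.prototype[name] !== 'function'
  );

  if (missing.length > 0) {
    throw new TypeError(`${target.name || 'class'} is missing: ${missing.join(', ')}`);
  }
  return target;
};

class Titled {
  title () { return 'test'; }
}

class Untitled {}

checkImports(Titled, ['title']);  // passes and returns Titled

let message = null;

try {
  checkImports(Untitled, ['title']);
} catch (error) {
  message = error.message;
}
console.log(message);  // 'Untitled is missing: title'
```

A composer could call something like this on the target before defining the delegating methods, so that a missing dependency fails when the class is composed rather than when a method is first invoked.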


can we go even deeper?

Sure. We could use the subclass factory pattern. This would allow us to override methods and call super. It also has some performance advantages in a modern JIT. We usually don’t need to prematurely optimize for performance, but sometimes we care deeply about that.
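For readers who have not met the pattern, here is a minimal sketch of a subclass factory. The title override and the ColouredTodo name are assumptions made for the illustration, not code from this essay:

```javascript
// A subclass factory: a function from a superclass to a new subclass.
const Coloured = (superclass) => class extends superclass {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }
  getColourRGB () {
    return this.colourCode;
  }
  // override a method of the superclass and call super
  title () {
    const {r, g, b} = this.getColourRGB() || {r: 0, g: 0, b: 0};

    return `${super.title()} (rgb ${r}, ${g}, ${b})`;
  }
};

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  title () {
    return this.name;
  }
}

class ColouredTodo extends Coloured(Todo) {}

const t = new ColouredTodo('test').setColourRGB({r: 1, g: 2, b: 3});

console.log(t.title());  // 'test (rgb 1, 2, 3)'
```

Because the behaviour sits in its own link of the prototype chain, it can override title and still reach the original through super, which neither naïve mixins nor the composed objects above can do.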

Now that we have the beginnings of a protocol for declaring our dependencies in both directions, we can start thinking about other kinds of behaviour we’d like to mix in, like decorating individual methods with before or after advice, e.g. updateLastModified after setColourRGB.
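As a rough sketch of what “after” advice could look like, here is a hypothetical after function (the name, its signature, and the lastModified property are all assumptions for this example). It wraps one named method of a class so that the advice runs once the method has finished:

```javascript
// Hypothetical decorator: runs `advice` after the named method.
const after = (methodName, advice) =>
  target => {
    const original = target.prototype[methodName];

    target.prototype[methodName] = function (...args) {
      const result = original.apply(this, args);

      advice.apply(this, args);
      return result;
    };
    return target;
  };

const Todo = after('setColourRGB', function updateLastModified () {
  this.lastModified = new Date();
})(class {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }
});

const t = new Todo();

console.log(t.lastModified);                  // undefined
t.setColourRGB({r: 1, g: 2, b: 3});
console.log(t.lastModified instanceof Date);  // true
```

It has the same shape as our other class-transforming functions, so it could be stacked with them.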

If we whole-heartedly embrace object composition, we can even go from composing objects with classes to composing classes with each other: This would allow us to write constructors for our composed objects.


the finish line

so where do we finish?

Let’s step back and look at what we have: We have a way to make something that looks a lot like a functional mixin, but behind the scenes it implements object composition. Unlike a mixin, we get explicit dependencies. This adds some declarations to our code, but we win in the long run by having code that is easier to trace when we need to work out what is going on or how to refactor something that has grown.

// create an object composition function.
// this object depends upon the class defining
// a `title` method:
const Coloured = ObjectComposer('title')({
  // ...
});

// compose `Coloured` with `Todo`, including the
// `setColourRGB` method and the `colouredTitle`
// method, renamed `htmlTitle`:
const Todo = Coloured('setColourRGB', 'colouredTitle as htmlTitle')(class {
  // ...
});

Our example implementation is dense but small, showing us that JavaScript can be powerful when we choose to put it to work. And now we have the tools to tame growing dependencies, implicit dependencies, and name clashes.

And that’s enough for us to make sensible decisions about whether to use mixins now and refactor in the future, stick with mixins, or go for composition right off the bat.


the composition

afterword: “prefer composition to inheritance”

The problems outlined with mixins are the same as the problems we have discovered with inheritance over the last 30+ years. Subclasses have implicit dependencies on their superclasses. This makes superclasses extremely fragile: One little change could break code in a subclass that is in an entirely different file, what we call “action at a distance,” or its more pejorative term, “coupling.” Likewise, naming conflicts can easily occur between subclasses and superclasses.

The root cause is the lack of encapsulation in the relationship between subclasses and superclasses. This is the exact same problem between classes and mixins: The lack of encapsulation.

In OOP, the unit of encapsulation is the object. An object has a defined interface of public methods, and behind this public interface, it has methods and properties that implement its interface. Other objects are supposed to interact only with the public methods.

For this reason, the mature OOP community has migrated away from “inheritance” as the primary way to share behaviour, towards object composition and delegation. The revelations we are having about mixins are a sign that as the JavaScript community matures, it will inevitably rediscover what OOP languages already know.




notes
  1. We can also extend this function to report if we are accidentally overwriting an existing method. Whether we wish to do so is an interesting discussion we will not have here. Some feel that permitting overriding is excellent OOP practice, others dissent. A very good practice is to only permit it when signalling that you intend to do so, e.g. to write 'override setColourRGB'. We will leave those details for another day.

  2. As already noted, we can also add lots of error checking, like noting when a dependency doesn’t exist. The Coloured function should raise an error if it depends on title but is being mixed into a class that doesn’t have a title method. 

https://raganwald.com/2016/07/20/prefer-composition-to-inheritance
Why Are Mixins Considered Harmful?

update: Part II, From Mixins to Object Composition is now available.


In Mixins Considered Harmful, Dan Abramov wrote something that sounds familiar to everyone[1] who works with legacy applications:

Some of our code using React gradually became incomprehensible. Occasionally, the React team would see groups of components in different projects that people were afraid to touch. These components were too easy to break accidentally, were confusing to new developers, and eventually became just as confusing to the people who wrote them in the first place.

This is not specific to React. All legacy applications exhibit this behaviour: They accumulate chunks of code that are easy to break and confusing to everyone, even the original authors. Worse, such chunks of code tend to grow over time. They are infectious: People write code to work around the incomprehensible code instead of refactoring it, and the workarounds become easy to break accidentally and confusing in their own right.

The problems grow over time.


bikes on queen street west in toronto

How do mixins figure into this? Dan articulated three issues with mixins:

  1. Mixins introduce implicit dependencies
  2. Mixins cause name clashes
  3. Mixins cause snowballing complexity

He’s 100% right!

dependencies

Mixins absolutely introduce dependencies. Let’s look at how this happens. The simplest form of mixin uses Object.assign to mix a template object into a class’s prototype.[2] For example, here’s a class of todo items:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

And a “mixin” that is responsible for colour-coding:

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

Mixing colour coding into our Todo prototype is straightforward:

Object.assign(Todo.prototype, Coloured);

new Todo('test')
  .setColourRGB({r: 1, g: 2, b: 3})
  //=> {"name":"test","done":false,"colourCode":{"r":1,"g":2,"b":3}}

Note that our mixin simply grows our class’ prototype via copying, a process sometimes called “concatenative sharing.” Because the mixin’s methods wind up being the prototype’s methods, this is really no different than simply adding the mixin’s methods directly to the class by hand.

The consequence is that every mixin and class method can access every other mixin and class method. Furthermore, every mixin method can read and write the properties written by class methods, and every class method can read and write the properties written by mixin methods.

In short, the concatenative sharing mechanism permits the maximum possible set of dependencies between the class and its mixins. This is a problem, because these dependencies exemplify the complete opposite of the principles of encapsulation: The point of encapsulation is to define an interface through which entities interact with each other. Each entity then implements its behaviour using private methods and properties that are hidden from other entities.

Mixins do not permit any encapsulation whatsoever, and over time dependencies gradually creep into the code.

implicit dependencies

So we see that mixins permit dependencies. But worse, they permit implicit dependencies. Consider our Coloured mixin from above. It defines two methods, setColourRGB and getColourRGB. But when we mix it into Todo, how do we know what methods we are mixing in? We don’t:

Object.assign(Todo.prototype, Coloured);

We have to examine the code carefully to determine that we have added setColourRGB and getColourRGB to the Todo class. And if we use multiple mixins, the source for each method or property must be divined through careful analysis of the source code and behaviour.

As we saw above, mixins also introduce the possibility of dependencies between a mixin’s methods and a class’s methods. Just as we must carefully examine the source to understand what dependencies the Todo class has on Coloured, we must likewise carefully examine Coloured to determine whether it has any dependencies on Todo. In this case, it doesn’t, but that is not obvious.

As code grows, as Coloured gains in complexity, dependencies can be introduced, but they will not be obvious.

This problem is another that has been well-understood for decades. JavaScript has tried to address it in another context: When we use modules in ES6, each module explicitly names the entities it exports, for example this module exports two functions:

export function getWith (key) {
  return (map) => map[key];
}

export function dict (map) {
  return (key) => map[key];
}

/// ...

All other entities are private to the module. This is encapsulation, and we saw that mixins do not provide encapsulation. But modules do something else as well. When we import a module, we explicitly name the entities we wish to import from it:

import { getWith } from 'foo/bar/lists';

/// ...

This is an explicit dependency. We can now use the getWith function at will. If we later try to use the dict function, it will not be available because we haven’t imported it. We have to manually import it as well, like this:

import { getWith, dict } from 'foo/bar/lists';

/// ...

The dependencies are explicit, not implicit. We can see the dependencies declared in the source, and we can even write tools for statically checking that the dependencies are fulfilled.[3] If mixin dependencies were explicit, we would know which methods were being mixed into a class because they would be declared. And likewise, there would be some mechanism for declaring which methods and/or properties a mixin depends upon when it is mixed into a class.

But the various patterns for implementing “naïve” mixins have no such mechanisms for making dependencies explicit. As a result, dependencies can creep as we see above, and there is no obvious way to notice that the dependencies are creeping, or to disentangle the dependencies.

name clashes

Since class methods and mixin methods all wind up being properties of the class prototype, you cannot give any method or property any name you like. In one big class file, you have the same problem: The various methods and properties must all have different names.

What makes mixins different is that in a single class you can easily inspect the code and determine which property and method names are already in use. But when modifying a mixin, you cannot easily determine which class or classes may already depend on this mixin. The name clashes reach out between files. Mixins create “action-at-a-distance,” and the name clashes happen at a distance as well.

For example, what happens if we decide that we ought to be able to name colours instead of using their RGB values?

const Coloured = {
  setColourName (name) {
    this.name = name;
  },
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourName () {
    return this.name;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

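Oops. We just broke Todo: setColourName writes to this.name, the same property that Todo uses to store an item’s name, so naming a colour silently overwrites the name of the item. The name clash problem is a second-order consequence of concatenative sharing. JavaScript solved this problem for modules: When you import a module, you explicitly name your dependencies as we saw above. You can also rename them to avoid conflicts: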

import { getWith as squareBracketAccessWith } from 'foo/bar/lists';

function getWith (key) {
  return (gettable) => gettable.get(key);
}

/// ...

This file imports getWith as squareBracketAccessWith so that it does not conflict with the getWith function it defines for itself.

Mixins provide no mechanism for resolving name clashes, and because they have implicit dependencies, we have no easy way of even noticing that we have a name clash to begin with. So as we grow our classes and mixins, we bump into them more and more. Worse, if we try to expand a class by adding another mixin, we may discover that we have irresolvable name clashes.
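The clash between Todo and the colour-naming version of Coloured is easy to reproduce. This sketch trims both down to the members that collide; the 'buy milk' and 'red' values are just sample data:

```javascript
class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
}

const Coloured = {
  setColourName (name) {
    this.name = name;  // the same property Todo uses for its own name
  },
  getColourName () {
    return this.name;
  }
};

Object.assign(Todo.prototype, Coloured);

const t = new Todo('buy milk');

t.setColourName('red');

console.log(t.name);  // 'red', the item's own name has been overwritten
```

Nothing warns us when the two are mixed together. The damage only shows up when the todo’s name is read later on.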

snowballing complexity

Dan wrote:

Every new requirement makes the mixins harder to understand. Components using the same mixin become increasingly coupled with time. Any new capability gets added to all of the components using that mixin. There is no way to split a “simpler” part of the mixin without either duplicating the code or introducing more dependencies and indirection between mixins. Gradually, the encapsulation boundaries erode, and since it’s hard to change or remove the existing mixins, they keep getting more abstract until nobody understands how they work.

This makes sense, and it’s a direct consequence of the dependencies between mixins, the fact that these dependencies are implicit, and the fact that names can clash between mixins.

this is all true, and very familiar

If this seems very familiar, congratulations. Like me, you wrote Java in the 1990s and 2000s. Or Ruby in the 2000s.[4] When you have a hierarchy of classes, you have the exact same set of problems:

When you have classes depending upon superclasses, you have implicit dependencies and name clashes caused by the lack of encapsulation. A subclass has access by default to all of the private properties and methods of its superclass, just as a class has access by default to all of the private properties and methods of its mixins.

Languages like Java and C++ provide mechanisms for minimizing these dependencies in the form of access controls. A superclass has a way of making certain properties and methods private, and such properties and methods are not only walled off from access by the outside world, they are not accessible by subclass code either.

Such access mechanisms help control dependencies and eliminate some of the name clashes by reducing the “surface area” of implicit dependencies. But such languages still have the implicit dependencies problem, and experience has shown that over time, class hierarchies snowball in complexity just as Dan describes mixin architectures as snowballing in complexity.

In classes, this is known as the fragile base class problem, and it is exactly the same as the mixin problem.

It turns out that with class hierarchies, we have a fragile base class problem and a many-to-many dependencies problem. Mixins solve the many-to-many dependencies problem, but spread out the fragile base class problem and introduce new vectors for dependencies between mixins.

We can reduce the surface area with encapsulation techniques, but if we want to eliminate the implicit dependencies problem, we need a whole new mechanism for mixing in behaviour.

Concatenative sharing doesn’t scale over time, space, and teams.

So what can we do about mixins?

The first and simplest thing to do about mixins doesn’t solve the problems of implicit dependencies and name clashes, but it will reduce the rate at which they increase complexity. Thus, your architecture will fail to scale, but fail at a much slower rate.

Sometimes, that’s enough! Sometimes, software development is about being lean, about tightening the conjecture-experiment-feedback cycle. Doing “the minimum” to get to the next cycle is sometimes a big win.

The first and simplest thing to do is to impose some encapsulation for classes and mixins. Do not have all your “private” properties and methods intermingled. This will reduce the number of accidental (or deliberate) dependencies and eliminate a number of accidental name clashes.

There are a few techniques. The first is to use helper functions instead of private methods. Let’s say we have:

export default class Widget {

  // __public methods__
  foo (baz) {
    this._bar(baz);
  }

  // __private methods__
  _bar(baz) {
    return this.snaf = baz;
  }

  // ...
}

_bar is obviously a private method, and we have signalled this with a naming convention. However, someone can still write code that depends on it, and we can still accidentally define another _bar in a mixin.

We can refactor _bar into a helper function by extracting its body from the class, and then changing our invocations from this._bar(baz) to bar.call(this, baz):

export default class Widget {

  // __public methods__
  foo (baz) {
    bar.call(this, baz);
  }

  // ...
}

function bar(baz) {
  return this.snaf = baz;
}

By invoking helper functions like bar with .call(this, baz), we give them access to the instance’s private state just like a method. However, because helper functions are explicitly not exported, they are private to our class.

If we use this technique with classes and with mixins, we limit the dependencies and potential name clashes to those we explicitly have decided ought to be public methods. Helper functions can never name clash because they exist in separate scopes.
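Here is a minimal sketch of why a helper function cannot clash with a mixin's method. The names Counter, clamp, and Clamping are invented for illustration; in a real module the class would be exported and the helper would not.

```javascript
// Hypothetical example: `Counter`, `clamp`, and `Clamping` are illustrative names.

// A helper function, private to this module because it is never exported.
function clamp(limit) {
  if (this.count > limit) this.count = limit;
  return this.count;
}

class Counter {
  constructor() {
    this.count = 0;
  }

  // __public methods__
  increment(by = 1) {
    this.count += by;
    return clamp.call(this, 10); // invoked with the instance as `this`
  }
}

// A mixin that happens to define a method with the same name as the helper.
const Clamping = {
  clamp() {
    return 'clamp from the mixin';
  }
};

Object.assign(Counter.prototype, Clamping);

const counter = new Counter();

console.log(counter.increment(25)); // 10: the helper still runs
console.log(counter.clamp());       // 'clamp from the mixin'
```

The mixin's clamp lands on the prototype, while the helper lives in the module's scope, so neither can overwrite the other.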

The syntax looks a little unusual, but it is better to get all your work done in 40 hours a week using something that looks odd than to work 70 hours a week dealing with the terrible consequences of code that merely looks simple.

The disadvantage of this approach is that while it solves the problem of dependencies and name clashes between methods, it does nothing for properties.5 Someone can write a mixin that depends on the snaf property or accidentally collides with it.

To fix that problem, we either wait until JavaScript introduces private properties, or we use symbols for method and property names.

using symbols for method and property names

With an extra level of indirection, we can use symbols for method and property names instead of strings. Here’s how to refactor our class above to use symbols as method and property names.

We start with a completely abstract mixin:

export default {

  // __public methods__
  foo (baz) {
    this._bar(baz);
  },

  // __private methods__
  _bar(baz) {
    return this.snaf = baz;
  }

  // ...
};

The first step is to replace the names of our private methods and properties with string constants:

const bar = 'bar';
const snaf = 'snaf';

export default {

  // __public methods__
  foo (baz) {
    this[bar](baz);
  },

  // __private methods__
  [bar](baz) {
    return this[snaf] = baz;
  }

  // ...
};

Next, we replace the strings with symbols:

const bar = Symbol('bar');
const snaf = Symbol('snaf');

export default {

  // __public methods__
  foo (baz) {
    this[bar](baz);
  },

  // __private methods__
  [bar](baz) {
    return this[snaf] = baz;
  }

  // ...
};

Now our bar private method and snaf property are still properties of our mixin object, but their actual names are not shared with any class we mix it into or other mixins, and cannot cause a name clash.6
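To see the lack of clashes in action, here is a small sketch with two mixins that both use a symbol described as 'bar'. The factory makeMixin and the labels left and right are hypothetical; each call to the factory stands in for a separate module with its own symbols.

```javascript
// Hypothetical factory: each call plays the role of a separate mixin module.
function makeMixin(label) {
  const bar = Symbol('bar');
  const snaf = Symbol('snaf');

  return {
    // __public method__, named leftFoo or rightFoo
    [`${label}Foo`](baz) {
      return this[bar](baz);
    },

    // __private method__, keyed by a symbol unique to this mixin
    [bar](baz) {
      return (this[snaf] = `${label}:${baz}`);
    }
  };
}

// Object.assign copies symbol-keyed properties as well as string-keyed ones.
const target = Object.assign({}, makeMixin('left'), makeMixin('right'));

console.log(target.leftFoo(1));   // 'left:1'
console.log(target.rightFoo(2));  // 'right:2'
console.log(Object.keys(target)); // ['leftFoo', 'rightFoo']
console.log(Object.getOwnPropertySymbols(target).length); // 4
```

Both mixins keep their own bar and snaf even though the descriptions match. Note that symbols avoid accidental clashes rather than enforce privacy: Object.getOwnPropertySymbols can still find them.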

Using either helper functions or symbols for private methods and properties will cut down on dependencies, but if we want to do something about implicit dependencies, we need to rethink mixins altogether.

We’ll do that in the next post.



notes
  1. Yes, I said everyone, I didn’t cover my ass with a phrase like “many people.” Everyone. 

  2. At this time, the most common JavaScript engines have a slower implementation for prototypes that have been modified with Object.assign than those that are initialized and thereafter do not change. Thus, in practice, people often use other approaches like subclass factories. But that is tangential to the explanation for why mixins introduce dependencies, as implementations like subclass factories have the same software engineering problems. 

  3. being able to statically check dependencies is marvellously useful, but it solves a problem that is entirely orthogonal to the software engineering problem we are discussing here. 

  4. Or C++. Or Smalltalk. Or Python. Or any other OOP language, really. Let’s not get hung up on whether it was actually Java. 

  5. At this time, the most common JavaScript engines also implement helper functions more slowly than helper methods. However, when you actually measure your actual code in production, you may discover that the benefit of using a different approach is negligible. Or it may be that in one place, it matters, and you refactor that one thing, and leave the others as-is. 

  6. Redditor mlamers mentioned that instead of symbols, we could use custom prefixes for names, e.g. this.mixin_1_bar(baz). This technique has the benefit of working with older versions of JavaScript, and if we don’t want to use an ES6 -> ES5 compiler & shim in our build pipeline, that is a reasonable choice. Likewise, we might be using some marvellous framework with its own MOP that doesn’t support symbols. Countered against that are some technical benefits of symbols, mostly with respect to meta-programming we might write or encounter in a library. There is also the argument that it is technically easier to break encapsulation (and thus drive up coupling) with a prefixed method name. That leads to a conversation about the purpose of code review and of developing team practices. 

https://raganwald.com/2016/07/16/why-are-mixins-considered-harmful
The Hubris of Impatient Sieves of Eratosthenes

Espresso


the hubris of blog post authors

Althea and Ben were sipping feature espressos at their local indie coffee shop. “Althea,” Ben began, “I’m prepping for interviews with you-know-who, so I’ve been trying to read as many algorithm blog posts as I can, just to catch up on all the stuff I’ve forgotten now that I’ve been working for a few years…” Althea died a little inside. That conversation. Programmers were notorious for taking interview questions extremely personally.

“Oh?” Althea tried to discourage Ben with overt disinterest. It was futile, of course: Ben carried on.

“Well, I was just reading a blog post about lazily generating prime numbers, and I remember being asked to write a program to generate primes back when I first entered the industry.”

Althea laughed. “If you’re thinking of the same post that I read, the algorithm is wrong! Or at least, terrible.”

Ben nodded. “I saw something to that effect on Hacker News, but since the article wasn’t precisely about prime numbers, I guess the OP1 thought it was ok.”

Althea frowned. “It’s never ok to post terrible code. It’s an enormous act of arrogance and hubris to think that just because you can write something and publish it to the whole world, that you therefore should just publish any old thing on your mind without taking care and consideration to make sure it’s right!”

“Somebody can and will ship it to production. Or foist it on impressionable interns as the Gospel Truth. Stuff like this is why the industry ignores forty years of CS research, and…”

Ben tuned out the rest of Althea’s rant, then resumed his anecdote when the storm subsided:

the unfaithful sieve

Ben pulled up the blog post on a laptop. “The code in the blog post was the most naïve possible mapping from the written description of the Sieve of Eratosthenes to code:”

function * nullEveryNth (skipFirst, n, iterable) {
  const iterator = iterable[Symbol.iterator]();

  yield * take(skipFirst, iterator);

  while (true) {
    yield * take(n - 1, iterator);
    iterator.next();
    yield null;
  }
}

function * sieve (iterable) {
  const iterator = iterable[Symbol.iterator]();
  let n;

  do {
    const { value } = iterator.next();

    n = value;
    yield n;
  } while (n == null);

  yield * sieve(nullEveryNth(n * (n - 2), n, iterator));
}

const Primes = compact(sieve(range(2)));

// General-Purpose Lazy Operations

function * range (from = 0, to = null) {
  let number = from;

  if (to == null) {
    while (true) {
      yield number++;
    }
  }
  else {
    while (number <= to) {
      yield number++;
    }
  }
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

function * compact (list) {
  for (const element of list) {
    if (element != null) {
      yield element;
    }
  }
}

Althea chimed in: “Naïve is right! This mimics what a child does when the sieve is explained to them for the first time. Given a big table of numbers, they start crossing them out using what we know to be modulo arithmetic: They scan forward number by number, counting as they go:”

One TWO (cross out), one TWO (cross out), one TWO (cross out), …

One two THREE (cross out), one two THREE (cross out), one two THREE (cross out), …

One two three four FIVE (cross out), one two three four FIVE (cross out), one two three four FIVE (cross out), …


Prime Numbers


ben’s sieve

Ben continued. “Yes, it’s naïve, but it’s terrible for other reasons: I dislike how everything is jumbled together. And it looks to me like the author was focused on showing how carelessly using an eager version of compact would break everything, rather than writing a good lazy sieve.”

“I figured I’d rewrite it from scratch. The main decision I made was to extract the sieve into its own object. In this day and age, there’s no need to be all fussy about pure functional programming if you aren’t actually using a pure functional language.”

“The important thing is to avoid terrible stateful anti-patterns and action-at-a-distance. So I created a Sieve, an object with a constructor and two methods of note:

  1. addAll(iterable) adds all the elements of iterable to our sieve. It is required that the elements of iterable be ordered, and that the first element of iterable be larger than the lowest number of any iterable already added.

  2. has(number) tests whether number is present in our sieve. It is required that successive calls to has must provide numbers that increase. In other words, calls to has are also ordered. Since calls to has are ordered by definition, the sieve is free to internally discard number if it returns true.

“Given a sieve object, my generator for primes is much simpler:”

function * Primes () {
  let prime = 2;
  const composites = new Sieve();

  while (true) {
    yield prime;
    composites.addAll(multiplesOf(prime * prime, prime));

    while (composites.has(++prime)) {
      // do nothing
    }
  }
}

Althea nodded and Ben cracked his knuckles metaphorically.

“So now to write the Sieve class. Instead of “crossing out” numbers in a list, I decided to merge lists of composite numbers together. Here is a generator that takes two ordered lists and merges them naïvely:”

function * merge (aIterable, bIterable) {
  const aIterator = aIterable[Symbol.iterator]();
  const bIterator = bIterable[Symbol.iterator]();
  let { done: aDone, value: aValue } = aIterator.next();
  let { done: bDone, value: bValue } = bIterator.next();

  while (true) {
    if (aDone && bDone) {
      return;
    } else if (aDone) {
      yield bValue;
      yield * bIterator;
      return;
    } else if (bDone) {
      yield aValue;
      yield * aIterator;
      return;
    } else if (aValue <= bValue) {
      yield aValue;
      ({ done: aDone, value: aValue } = aIterator.next());
    } else {
      yield bValue;
      ({ done: bDone, value: bValue } = bIterator.next());
    }
  }
}

“We need to work with ordered lists that are also unique, so this generator lazily eliminates duplicates in a stream:”

function * unique (iterable) {
  const iterator = iterable[Symbol.iterator]();
  let lastYielded = {};
  let { done, value } = iterator.next();

  while (!done) {
    if (value !== lastYielded) {
      yield value;
      lastYielded = value;
    }
    ({ done, value } = iterator.next());
  }
}

“As I worked, I needed to get the first element of a list and the rest of a list on a regular basis. Here’s a helper that works very nicely with JavaScript’s destructuring assignment:”

function destructure (iterable) {
  const iterator = iterable[Symbol.iterator]();
  const { done, value } = iterator.next();

  if (!done) {
    return { first: value, rest: iterator }
  }
}

Althea chafed at Ben’s style of going through all the preliminaries before getting to the main business. It was very academic, but not the most effective way to communicate how code is written and what it does.

Ben continued, “With these in hand, I could write the sieve in the new way. As we collect primes, we create a list of composite numbers by collecting the multiples of each prime, starting with the prime squared. So for two, our composites are 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, .... For three, they’re 9, 12, 15, 18, 21, 24, ..., and for five they’re 25, 30, 35, 40, 45, ....”

function * multiplesOf (startingWith, n) {
  let number = startingWith;

  while (true) {
    yield number;
    number = number + n;
  }
}

“By successively merging them together, we get a list of numbers that aren’t prime. The merge of the composites above is 4, 6, 8, 9, 10, 12, 12, 14, 15, 16, 18, 18, 20, 21, 22, 24, 24, 25, ..., which we can pass to unique to get 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25, ....”
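The merged and de-duplicated sequences Ben describes can be reproduced with a condensed, standalone sketch. The helpers here are shortened re-statements of multiplesOf, merge, and unique so that the snippet runs on its own; firstN is a hypothetical convenience that collects values into an array.

```javascript
function* multiplesOf(startingWith, n) {
  for (let number = startingWith; ; number += n) yield number;
}

// Condensed two-way merge of ordered iterables; ties are yielded from both sides.
function* merge(aIterable, bIterable) {
  const aIterator = aIterable[Symbol.iterator]();
  const bIterator = bIterable[Symbol.iterator]();
  let a = aIterator.next();
  let b = bIterator.next();

  while (!a.done || !b.done) {
    if (b.done || (!a.done && a.value <= b.value)) {
      yield a.value;
      a = aIterator.next();
    } else {
      yield b.value;
      b = bIterator.next();
    }
  }
}

// Drops consecutive duplicates from an ordered iterable.
function* unique(iterable) {
  let lastYielded = {};
  for (const value of iterable) {
    if (value !== lastYielded) {
      yield value;
      lastYielded = value;
    }
  }
}

// Hypothetical helper: eagerly collect the first n values of an iterable.
function firstN(n, iterable) {
  const values = [];
  if (n <= 0) return values;
  for (const value of iterable) {
    values.push(value);
    if (values.length === n) break;
  }
  return values;
}

const composites = () =>
  merge(merge(multiplesOf(4, 2), multiplesOf(9, 3)), multiplesOf(25, 5));

console.log(firstN(18, composites()));
// [4, 6, 8, 9, 10, 12, 12, 14, 15, 16, 18, 18, 20, 21, 22, 24, 24, 25]
console.log(firstN(15, unique(composites())));
// [4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25]
```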

With a flourish, Ben finally revealed his work. “Here is my MergeSieve class. It implements addAll by merging the new iterator with its existing iterator of composite numbers, and it implements has by checking whether the number provided is equal to the first number in its list. If it is, it removes the first.”

class MergeSieve {
  constructor () {
    this._first = undefined;
    this._rest = [];
  }

  addAll (iterable) {
    this._rest = unique(merge(this._rest, iterable));
    if (this._first == null) {
      ({
        first: this._first,
        rest: this._rest
      } = destructure(this._rest));
    }
  }

  has (number) {
    while (this._first < number) {
      this._removeFirst();
    }
    if (number === this._first) {
      this._removeFirst();
      return true;
    }
    else return false;
  }

  _removeFirst () {
    ({
      first: this._first,
      rest: this._rest
    } = destructure(this._rest));
  }
}

function * Primes () {
  let prime = 2;
  const composites = new MergeSieve();

  while (true) {
    yield prime;
    composites.addAll(multiplesOf(prime * prime, prime));

    while (composites.has(++prime)) {
      // do nothing
    }
  }
}

“And it works flawlessly!”

[...take(100, Primes())]
  //=>
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
     53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,
     109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167,
     173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
     233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283,
     293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359,
     367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431,
     433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
     499, 503, 509, 521, 523, 541]

The exposition ran out of steam like a clock winding down. Ben looked at Althea, anxiously. “What do you think?”


althea’s feedback

“Ben,” Althea began, “This is much cleaner than the code from the blog.”

Ben nodded. “But,” Althea continued, “If you were to show this to me in an interview, I would ask you about performance. Does this improve on the original? Or is it the same thing, dressed up in nicer code?”

“Let’s look at the OP’s original code. For each prime, the algorithm stepped through the numbers, counting one-TWO, one-TWO, or one-two-THREE, one-two-THREE, and so on. So each prime stepped through all the numbers larger than it. Actually, there’s a prime-squared optimization, but roughly speaking, we can see that in the OP’s code, every number n must be touched by all the primes smaller than n, whether they are factors of n or not.”

“Melissa O’Neill calls this an ‘Unfaithful Sieve’ in her paper The Genuine Sieve of Eratosthenes.”

Ben thought about this, then agreed that for every number n, the OP’s code required an operation for each prime smaller than n. In the OP’s naïve sieve, checking a number like 26 required a comparison for the multiples of 2, 3, 5, 7, 11 and so on up to 23 even though 26 is only divisible by 2 and 13.

Althea switched to Ben’s code.

“Now let’s look at this implementation of merge. The way it works is that as we take things from a collection of lists merged together, we’re invoking a series of comparisons, one for each list. So every time we come across a composite number, we’re invoking one comparison for each prime less than the composite number.”

“Again, there’s a prime-squared optimization, but the larger factor for computing the time complexity of this algorithm is how many operations are required for each composite number. And in this respect, your algorithm is almost identical to the OP’s algorithm.”

“The ideal performance of the Sieve of Eratosthenes is that every composite number gets crossed out once for each of its prime factors no greater than its square root. Therefore, a number like 26 would get crossed out for 2, but not 3, 5, or any other prime including 13, its other factor.”

“So what we want is an algorithm where we only have to check a composite’s prime factors, and even then only those no greater than its square root.”
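As a rough illustration of Althea's target, this sketch lists the primes that should cross out a given number: its prime factors whose square does not exceed it, since crossing out for a prime p starts at p squared. The function name crossOutPrimes is hypothetical.

```javascript
// Hypothetical helper: the primes a genuine sieve uses to cross out `n`.
function crossOutPrimes(n) {
  const primes = [];
  let remaining = n;

  // Only candidates whose square does not exceed n can cross n out.
  for (let p = 2; p * p <= n; ++p) {
    if (remaining % p === 0) {
      // p divides what is left after removing smaller factors, so p is prime.
      primes.push(p);
      while (remaining % p === 0) remaining /= p;
    }
  }

  return primes;
}

console.log(crossOutPrimes(26)); // [2]
console.log(crossOutPrimes(12)); // [2, 3]
console.log(crossOutPrimes(25)); // [5]
console.log(crossOutPrimes(23)); // []  (23 is prime, nothing crosses it out)
```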

Ben looked a little glum. “Well, at least I’m hearing this from you and not from an interviewer trying to impress themselves by tearing me down!”

They both laughed wryly. It’s almost impossible to interview for tech jobs without encountering the phenomenon of an interviewer who thinks the purpose of an interview is to make other people feel stupid.2


althea and ben pair

Espressos finished, Althea and Ben ordered another round and started pairing in the coffee shop.

Althea pointed out that the merge algorithm is useful if you always need the lowest composite number. But in truth, the sieve does not need the lowest composite number, it merely needs to know if the number it is testing is any of the lowest multiples of the primes seen so far.

So when testing 26, we need to know if it is any of the smallest of each of our multiplesOf iterators: 26 (2x13), 27 (3x9), 30 (5x6), 49 (7x7), 121 (11x11), 169 (13x13), 289 (17x17), 361 (19x19), or 529 (23x23). It’s true that if we know that 26 is the smallest of the nine iterators seen so far, it is very cheap to test whether 26 === 26.

But as we’ve seen, the naïve merge means we need eight tests to determine that 26 is the smallest. What if it was cheaper to check whether 26 is anywhere in the set 26, 27, 30, 49, 121, 169, 289, 361, 529?

Ben thought about Althea’s revelation. “We could use a set! Checking for membership in a set is more expensive than ===, but once we have a lot of primes, it’ll be way cheaper than doing comparisons.”

Althea nodded and suggested Ben try coding that.

“But wait!” A thought struck Ben. “A set is great, but after finding that 26 is in the set, we need to remove 26 from the set and insert 28, the next multiple of two. We’d need to associate iterators with each of the values… So we need a dictionary.”
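A small sketch of the idea Ben has just described, assuming the nine primes found before testing 26. It uses a Map as the dictionary, keyed by each prime's next composite; the names primesSoFar, nextMultiple, and frontierAt are hypothetical, and the sketch tracks primes rather than iterators to stay short.

```javascript
// Assumption: the nine primes discovered before we test 26.
const primesSoFar = [2, 3, 5, 7, 11, 13, 17, 19, 23];

// Smallest multiple of p that is at least n, never starting below p squared.
function nextMultiple(p, n) {
  return Math.ceil(Math.max(p * p, n) / p) * p;
}

// Dictionary from "next composite" to the primes currently waiting there.
function frontierAt(n) {
  const frontier = new Map();

  for (const p of primesSoFar) {
    const composite = nextMultiple(p, n);
    frontier.set(composite, [...(frontier.get(composite) || []), p]);
  }

  return frontier;
}

const frontier = frontierAt(26);

console.log([...frontier.keys()]);
// [26, 27, 30, 49, 121, 169, 289, 361, 529]
console.log(frontier.has(26)); // true: one lookup instead of eight comparisons

// Having found 26, relocate each prime waiting there to its next multiple.
const waiting = frontier.get(26);
frontier.delete(26);
for (const p of waiting) {
  const next = 26 + p;
  frontier.set(next, [...(frontier.get(next) || []), p]);
}

console.log(frontier.has(26)); // false
console.log(frontier.get(28)); // [2]
```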

Ben started coding, with Althea providing feedback. Fuelled by an excellent Blue Mountain pour-over, the code flowed from keyboard to screen.


Espresso extraction


the hash merge

Ben placed the prime iterators into a hash table, indexed by the next value for the iterator. Thus, the keys of the table were composites, and the values of the table were lists of iterators (a single composite might have two or more iterators, for example 12).

He spoke aloud as he walked through his new implementation:

“When we start, our HashSieve will have one iterable, at index 4. Its remaining numbers will be 6, 8, and so on. We then add another at 9, with numbers 12, 15, and so on. We try removing 4, and when we do so, we re-merge the iterator for multiples of two, but now it will be at number 6, with remaining numbers 8, 10, and so on.”

“Thus, _remove is always relocating iterables to their next higher number. When two or more iterators end up at the same index (like 12), all get relocated.”

class HashSieve {
  constructor () {
    this._hash = Object.create(null);
  }

  addAll (iterable) {
    const { first, rest } = destructure(iterable);

    if (this._hash[first]) {
      this._hash[first].push(rest);
    }
    else this._hash[first] = [rest];

    return this;
  }

  has (number) {
    if (this._hash[number]) {
      this._remove(number);
      return true;
    }
    else return false;
  }

  _remove (number) {
    const iterables = this._hash[number];

    if (iterables == null) return false;

    delete this._hash[number];
    iterables.forEach((iterable) => this.addAll(iterable));

    return number;
  }
}

function * Primes () {
  let prime = 2;
  const composites = new HashSieve();

  while (true) {
    yield prime;
    composites.addAll(multiplesOf(prime * prime, prime));

    while (composites.has(++prime)) {
      // do nothing
    }
  }
}

[...take(100, Primes())]
  //=>
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
     53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,
     109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167,
     173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
     233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283,
     293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359,
     367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431,
     433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
     499, 503, 509, 521, 523, 541]

“This is much better than my original MergeSieve. The check for has involves the constant overhead of performing a hash lookup, of course, and that is more expensive than the === test of MergeSieve. But what happens when each sieve removes the no-longer-needed number?

  • When MergeSieve tests a composite number, it iterates. Because of the way merge is written, iterating requires one if statement for each iterable seen so far, less one. In other words, every time it hits a composite number, if there are p primes less than the composite number, MergeSieve needs to perform p-1 operations, regardless of how many prime factors the composite number actually has.

  • When HashSieve tests a composite number, it iterates and relocates each of the iterators at that number. There will be one iterator for each prime factor, and only the prime factors no greater than the square root of the composite number will be included. However, for each one, there is a large constant factor as we have to perform an insert in the hash table. Finally, there is a single remove from the hash table. So HashSieve does fewer operations, but each is more expensive.

“As the numbers grow, primes become scarcer, but the total number of primes grows. Therefore, MergeSieve gets slower and slower as it is performing operations for each prime discovered so far. HashSieve catches up and then gets relatively faster and faster.”
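To put rough numbers on that comparison, here is a sketch that tallies operations per composite using the simple model above: p-1 comparisons for MergeSieve, and one relocation per qualifying prime factor for HashSieve. These are illustrative counts under that model, not measured timings, and the function names are hypothetical.

```javascript
function isPrime(n) {
  if (n < 2) return false;
  for (let d = 2; d * d <= n; ++d) {
    if (n % d === 0) return false;
  }
  return true;
}

// MergeSieve model: one comparison per prime below the composite, less one.
function mergeSieveOperations(composite) {
  let primesBelow = 0;
  for (let n = 2; n < composite; ++n) {
    if (isPrime(n)) primesBelow += 1;
  }
  return primesBelow - 1;
}

// HashSieve model: one relocation per prime factor whose square
// does not exceed the composite.
function hashSieveRelocations(composite) {
  let relocations = 0;
  for (let p = 2; p * p <= composite; ++p) {
    if (isPrime(p) && composite % p === 0) relocations += 1;
  }
  return relocations;
}

for (const composite of [26, 1000, 10000]) {
  console.log(
    composite,
    mergeSieveOperations(composite),
    hashSieveRelocations(composite)
  );
}
// 26 8 1
// 1000 167 2
// 10000 1228 2
```

The first column of counts keeps growing with the number of primes found, while the second stays small, which is the gap Ben is describing.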

Althea congratulated him. “You’ve got it!”

“So,” Ben smiled, “I guess the problem was that the OP had the hubris to write about a lazy algorithm, but not one that was impatient enough to run quickly!”

“Absolutely!” Althea was reassuring. “This has improved on the OP’s code style and performance. And now you’re ready to discuss the Sieve of Eratosthenes with greater rigour.”

“What’s there to discuss?” For someone who had only hours before written their own Unfaithful Sieve, Ben was exhibiting some hubris of his own: “This is way faster. Given the number of broken sieves on the internet, I’ll bet this is better than anything the interviewer sees from anyone else.”


the final point

Althea tried her best Han Solo impersonation: “Don’t get cocky, kid! After all, if I could read The Genuine Sieve of Eratosthenes, so could anybody else looking for a job. And besides, that’s not the point.”

“The point,” Althea said patiently–Ben was, after all, a friend–“The point is that even when setting out to implement an algorithm with the best of intentions, a small error in the selection of a data structure can have a major effect on its behaviour.”

“Software is built in layers of abstractions. In the OP’s case, using a counter to null out the composite numbers was the right abstraction but the wrong implementation. And in your case, Ben, using a naïve merge was also the right abstraction: You were able to write a prime sieve that used === for comparisons, so it ought to have been wicked fast. But the implementation of the merge let you down: it was as slow as the OP’s counting.”

“So the lesson is, studying algorithms is not about studying abstractions. It’s about the implementations, at every level of detail.”

Ben considered. “Ok, fair enough. In that case, how do I know whether the hash table implementation is fast enough?”

Althea grinned: “If you do some more research, you will discover that this is not the fastest implementation. But for production code, with all of the requirements and trade-offs that come into play, it may be fast enough.”

“After all, we can’t keep tweaking the same thing over and over again for diminishing returns. We need to move on and find big gains somewhere else. That’s why impatience can be a virtue: We programmers should always be hungry for important work to do.”


notes
  1. OP: Short for Original Poster. Used on online message boards and forums. 

  2. Belittling interviewees on the basis of the interviewer’s superior understanding of a contrived problem is a ridiculous practice. First and foremost, interviews exist to find and filter people, not to bolster the egos of interviewers. Second, an interview question is carefully selected beforehand, and the interviewer has the luxury of knowing and studying the question in advance. It is not a level playing field for comparing the experience and knowledge of interviewer and interviewee. 

https://raganwald.com/2016/04/25/hubris-impatient-sieves-of-eratosthenes
“We will encourage you to develop the three great virtues of a programmer: laziness, impatience, and hubris”
larry wall

Larry Wall and Camelia, the Perl 6 Mascot


laziness and eagerness

In computing, “laziness” is a broad term, generally referring to not doing any work unless you need it. Its opposite is “eagerness”: doing as much work as possible in case you need it later.

Consider this JavaScript:

function ifThen (a, b) {
  if (a) return b;
}

ifThen(1 === 0, 2 + 3)
  //=> undefined

Now, here’s the question: Does JavaScript evaluate 2+3? You probably know the answer: Yes it does. When it comes to passing arguments to a function invocation, JavaScript is eager, it evaluates all of the expressions, and it does so whether the value of the expression is used or not.1

If JavaScript was lazy, it would not evaluate 2+3 in the expression ifThen(1 === 0, 2 + 3). So is JavaScript an “eager” language? Mostly. But not always! If we write: 1 === 0 ? 2 + 3 : undefined, JavaScript does not evaluate 2+3. Operators like ?: and && and ||, along with program control structures like if, are lazy. You just have to know in your head what is eager and what is lazy.
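One way to check this for yourself is to give the expression a visible side effect. In this sketch, expensive is a hypothetical stand-in for 2 + 3 that counts how many times it has been evaluated.

```javascript
let evaluations = 0;

// Hypothetical stand-in for `2 + 3` that records each evaluation.
function expensive() {
  evaluations += 1;
  return 2 + 3;
}

function ifThen(a, b) {
  if (a) return b;
}

// Function arguments are eager: expensive() runs even though b is never used.
ifThen(1 === 0, expensive());
const afterCall = evaluations;

// Operators and control structures are lazy: none of these run expensive().
const viaTernary = 1 === 0 ? expensive() : undefined;
const viaAnd = false && expensive();
const viaOr = true || expensive();
if (1 === 0) expensive();
const afterLazyForms = evaluations;

console.log(afterCall);      // 1
console.log(afterLazyForms); // 1
```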

And if you want something to be lazy that isn’t naturally lazy, you have to work around JavaScript’s eagerness. For example:

function ifThenEvaluate (a, b) {
  if (a) return b();
}

ifThenEvaluate(1 === 0, () => 2 + 3)
  //=> undefined

JavaScript eagerly evaluates () => 2 + 3, which is a function. But it doesn’t evaluate the expression in the body of the function until it is invoked. And it is not invoked, so 2+3 is not evaluated.

Wrapping expressions in functions to delay evaluation is a longstanding technique in programming. They are colloquially called thunks, and there are lots of interesting applications for them.
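One such application, sketched here as an illustration rather than as anything this post depends on, is a thunk that remembers its result so the wrapped expression is evaluated at most once. The helper name lazy is hypothetical.

```javascript
// Hypothetical helper: wraps a thunk so its body runs at most once.
function lazy(thunk) {
  let evaluated = false;
  let value;

  return () => {
    if (!evaluated) {
      value = thunk();
      evaluated = true;
    }
    return value;
  };
}

let runs = 0;
const sum = lazy(() => {
  runs += 1;
  return 2 + 3;
});

console.log(runs);  // 0: nothing has been evaluated yet
console.log(sum()); // 5
console.log(sum()); // 5
console.log(runs);  // 1: the body ran only on the first call
```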

generating laziness

The bodies of functions are a kind of lazy thing: They aren’t evaluated until you invoke the function. This is related to if statements, and every other kind of control flow construct: JavaScript does not evaluate statements unless the code actually encounters the statement.

Consider this code:

function containing(value, list) {
  let listContainsValue = false;

  for (const element of list) {
    if (element === value) {
      listContainsValue = true;
    }
  }

  return listContainsValue;
}

You are doubtless chuckling at its naïveté. Imagine this list was the numbers from one to a billion–e.g. [1, 2, 3, ..., 999999998, 999999999, 1000000000]–and we invoke:

const billion = [1, 2, 3, ..., 999999998, 999999999, 1000000000];

containing(1, billion)
  //=> true

We get the correct result, but we iterate over every one of our billion numbers first. Awful! Small children and the otherwise febrile know that you can return from anywhere in a JavaScript function, and the rest of its evaluation is abandoned. So we can write this:

function containing(value, list) {
  for (const element of list) {
    if (element === value) {
      return true;
    }
  }

  return false;
}

This version of the function is lazier than the first: It only does the minimum needed to determine whether a particular list contains a particular value.
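To make the difference measurable, this sketch feeds both versions an iterable that counts how many values it has handed out. The helper countingNumbers and the name containingEagerly are hypothetical, and a limit of 1000 stands in for the billion.

```javascript
// Hypothetical helper: an iterable over 1..limit that records how many
// values have been requested from it.
function countingNumbers(limit) {
  const stats = { produced: 0 };
  const numbers = {
    *[Symbol.iterator]() {
      for (let n = 1; n <= limit; ++n) {
        stats.produced += 1;
        yield n;
      }
    }
  };

  return { numbers, stats };
}

// The first version: keeps iterating after the value is found.
function containingEagerly(value, list) {
  let listContainsValue = false;

  for (const element of list) {
    if (element === value) {
      listContainsValue = true;
    }
  }

  return listContainsValue;
}

// The second version: returns as soon as the value is found.
function containing(value, list) {
  for (const element of list) {
    if (element === value) {
      return true;
    }
  }

  return false;
}

const eager = countingNumbers(1000);
containingEagerly(1, eager.numbers);
console.log(eager.stats.produced); // 1000

const early = countingNumbers(1000);
containing(1, early.numbers);
console.log(early.stats.produced); // 1
```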

From containing, we can make a similar function, findWith:

function findWith(predicate, list) {
  for (const element of list) {
    if (predicate(element)) {
      return element;
    }
  }
}

findWith applies a predicate function to lazily find the first value that evaluates truthily. Unfortunately, while findWith is lazy, its argument is evaluated eagerly, as we mentioned above. So let’s say we want to find the first number in a list that is greater than 99 and is a palindrome:

function isPalindromic(number) {
  const forwards = number.toString();
  const backwards = forwards.split('').reverse().join('');

  return forwards === backwards;
}

function gt(minimum) {
  return (number) => number > minimum;
}

function every(...predicates) {
  return function (value) {
    for (const predicate of predicates) {
      if (!predicate(value)) return false;
    }
    return true;
  };
}

const billion = [1, 2, 3, ..., 999999998, 999999999, 1000000000];

findWith(every(isPalindromic, gt(99)), billion)
  //=> 101

It’s the same principle as before, of course, we iterate through our billion numbers and stop as soon as we get to 101, which is greater than 99 and palindromic.

But JavaScript eagerly evaluates the arguments to findWith. So it evaluates every(isPalindromic, gt(99)) and binds it to predicate, then it eagerly evaluates billion and binds it to list.

Binding one value to another is cheap. But what if we had to generate a billion numbers?

function NumbersUpTo(limit) {
  const numbers = [];
  for (let number = 1; number <= limit; ++number) {
    numbers.push(number);
  }
  return numbers;
}

findWith(every(isPalindromic, gt(99)), NumbersUpTo(1000000000))
  //=> 101

NumbersUpTo(1000000000) is eager, so it makes a list of all billion numbers, even though we only need the first 101. This is the problem with laziness: We need to be lazy all the way through a computation.

Luckily, we just finished working with generators2 and we know exactly how to make a lazy list of numbers:

function * Numbers () {
  let number = 0;
  while (true) {
    yield ++number;
  }
}

findWith(every(isPalindromic, gt(99)), Numbers())
  //=> 101

Generators yield values lazily. And findWith searches lazily, so we can find 101 without first generating an infinite array of numbers. JavaScript still evaluates Numbers() eagerly and binds it to list, but now it’s binding an iterator, not an array. And the for (const element of list) { ... } statement lazily takes values from the iterator just as it did from the billion array.
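A quick sketch confirms how little work is done. It adds a hypothetical generated counter to Numbers and uses a single inline predicate in place of every(isPalindromic, gt(99)) to stay short.

```javascript
let generated = 0; // hypothetical counter, added only for this demonstration

function* Numbers() {
  let number = 0;
  while (true) {
    generated += 1;
    yield ++number;
  }
}

function findWith(predicate, list) {
  for (const element of list) {
    if (predicate(element)) {
      return element;
    }
  }
}

function isPalindromic(number) {
  const forwards = number.toString();
  const backwards = forwards.split('').reverse().join('');

  return forwards === backwards;
}

const found = findWith(
  (number) => number > 99 && isPalindromic(number),
  Numbers()
);

console.log(found);     // 101
console.log(generated); // 101: only the numbers actually inspected were produced
```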

the sieve of eratosthenes

We start with a table of numbers (e.g., 2, 3, 4, 5, . . . ) and progressively cross off numbers in the table until the only numbers left are primes. Specifically, we begin with the first number, p, in the table, and:

  1. Declare p to be prime, and cross off all the multiples of that number in the table, starting from p squared, then;

  2. Find the next number in the table after p that is not yet crossed off and set p to that number; and then repeat from step 1.

Here is the Sieve of Eratosthenes, written in eager style:

function compact (list) {
  const compacted = [];

  for (const element of list) {
    if (element != null) {
      compacted.push(element);
    }
  }

  return compacted;
}

function PrimesUpTo (limit) {
  const numbers = NumbersUpTo(limit);

  numbers[0] = undefined; // `1` is not a prime
  for (let i = 1; i <= Math.ceil(Math.sqrt(limit)); ++i) {
    if (numbers[i]) {
      const prime = numbers[i];

      for (let ii = i + prime; ii < limit; ii += prime) {
        numbers[ii] = undefined;
      }
    }
  }

  return compact(numbers);

}

PrimesUpTo(100)
  //=> [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97]

Let’s take a pass at writing the Sieve of Eratosthenes in lazy style. First off, a few handy things we’ve already seen in this blog, and in JavaScript Allongé:

function * range (from = 0, to = null) {
  let number = from;

  if (to == null) {
    while (true) {
      yield number++;
    }
  }
  else {
    while (number <= to) {
      yield number++;
    }
  }
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

With those in hand, we can write a generator that maps an iterable to a sequence of values with every nth element changed to null:3

function * nullEveryNth (skipFirst, n, iterable) {
  const iterator = iterable[Symbol.iterator]();

  yield * take(skipFirst, iterator);

  while (true) {
    yield * take(n - 1, iterator);
    iterator.next();
    yield null;
  }
}

That’s the core of the “sieving” behaviour: take the front element of the list of numbers, call it n, and sieve every nth element afterwards.
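To make the sieving step concrete, here is a small sketch of what nullEveryNth produces for the first prime. It repeats range, take, and nullEveryNth so that it runs on its own (with range's bounded branch comparing number, not from, against to). The sample arguments are our own choice: they mimic what happens right after 2 has been taken from the front of the list.

```javascript
function * range (from = 0, to = null) {
  let number = from;

  if (to == null) {
    while (true) {
      yield number++;
    }
  }
  else {
    while (number <= to) {
      yield number++;
    }
  }
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

function * nullEveryNth (skipFirst, n, iterable) {
  const iterator = iterable[Symbol.iterator]();

  // Pass the first `skipFirst` elements through untouched.
  yield * take(skipFirst, iterator);

  while (true) {
    // Pass n - 1 elements through, then replace the nth with null.
    yield * take(n - 1, iterator);
    iterator.next();
    yield null;
  }
}

// With 2 already taken, the remaining numbers start at 3.
// Skip nothing, then null every 2nd element: 4, 6, 8, ...
const afterTwo = [...take(10, nullEveryNth(0, 2, range(3)))];

console.log(afterTwo);
  //=> [3, null, 5, null, 7, null, 9, null, 11, null]
```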

Now we can apply nullEveryNth recursively: Take the first unsieved number from the front of the list, sieve its multiples out, and yield the results of sieving what remains:

function * sieve (iterable) {
  const iterator = iterable[Symbol.iterator]();
  let n;

  do {
    const { value } = iterator.next();

    n = value;
    yield n;
  } while (n == null);

  yield * sieve(nullEveryNth(n * (n - 2), n, iterator));
}

With sieve in hand, we can use range to get a list of numbers from 2, sieve those recursively, then we compact the result to filter out all the nulls, and what is left are the primes:

const Primes = compact(sieve(range(2)));

Besides performance, did you spot the full-on bug? Try running it yourself, it won’t work! The problem is that at the last step, we called compact, and compact is an eager function, not a lazy one. So we end up trying to build an infinite list of primes before filtering out the nulls.

We need to write a lazy version of compact:

function * compact (list) {
  for (const element of list) {
    if (element != null) {
      yield element;
    }
  }
}

And now it works!

take(100, Primes)
  //=>
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47,
     53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107,
     109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167,
     173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
     233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283,
     293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359,
     367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431,
     433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
     499, 503, 509, 521, 523, 541]

When we write things in lazy style, we need lazy versions of all of our usual operations. For example, here’s an eager implementation of unique:

function unique (list) {
  const orderedValues = [];
  const uniqueValues = new Set();

  for (const element of list) {
    if (!uniqueValues.has(element)) {
      uniqueValues.add(element);
      orderedValues.push(element);
    }
  }
  return orderedValues;
}

Naturally, we’d need a lazy implementation if we wanted to find the unique values of lazy iterators:

function * unique (iterable) {
  const uniqueValues = new Set();

  for (const element of iterable) {
    if (!uniqueValues.has(element)) {
      uniqueValues.add(element);
      yield element;
    }
  }
}

And so it goes with all of our existing operations that we use with lists: We need lazy versions we can use with iterables, and we have to use the lazy operations throughout: We can’t mix them.

it comes down to types

This brings us to an unexpected revelation.

Generators and laziness can be wonderful. Exciting things are happening with using generators to emulate synchronized code with asynchronous operations, for example. But as we’ve seen, if we want to write lazy code, we have to be careful to be consistently lazy. If we accidentally mix lazy and eager code, we have problems.

This is a symmetry problem. And at a deeper level, it exposes a problem with the “duck typing” mindset: There is a general idea that as long as objects handle the correct interface–as long as they respond to the right methods–they are interchangeable.

But this is not always the case. The eager and lazy versions of compact both quack like ducks that operate on lists, but one is lazy and the other is not. “Duck typing” does not and cannot capture the difference between a function that assures laziness and another that assures eagerness.

Many other things work this way, for example escaped and unescaped strings. Or obfuscated and native IDs. To distinguish between things that have the same interfaces, but also have semantic or other contractual differences, we need types.

We need to ensure that our programs work with each of the types, using the correct operations, even if the incorrect operations are also “duck compatible” and appear to work at first glance.
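As a minimal sketch of the escaped-versus-unescaped example, here is one way a type can carry a contract that duck typing cannot see. EscapedHtml, escapeHtml, and paragraph are hypothetical names, and the escaping rules are deliberately incomplete; this illustrates the idea and is not a recommendation for sanitizing real markup.

```javascript
// Hypothetical wrapper type marking a string that has already been escaped.
class EscapedHtml {
  constructor (text) { this.text = text; }
  toString () { return this.text; }
}

// Assumed, deliberately minimal escaping rules for illustration only.
function escapeHtml (raw) {
  return new EscapedHtml(
    String(raw)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
  );
}

// Accepts only values that carry the "escaped" type, even though a plain
// string would quack the same way.
function paragraph (content) {
  if (!(content instanceof EscapedHtml)) {
    throw new TypeError('paragraph expects an EscapedHtml value');
  }
  return `<p>${content}</p>`;
}

const rendered = paragraph(escapeHtml('1 < 2 && 3 > 2'));

console.log(rendered);
  //=> <p>1 &lt; 2 &amp;&amp; 3 &gt; 2</p>
```

Passing a raw string to paragraph throws a TypeError instead of appearing to work at first glance. The same approach could mark an iterable as lazy or eager.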


Follow-up: The Hubris of Impatient Sieves of Eratosthenes


the full source
notes
  1. A few people have pointed out that a sufficiently smart compiler can notice that 2+3 involves two constants and a fixed operator, and therefore it can be compiled to 5 in advance. JavaScript does not necessarily perform this optimization, but if it did, we could substitute something like x + y and get to the same place in the essay.

  2. “Programs must be written for people to read, and only incidentally for machines to execute” 

  3. This is the simplest and most naïve implementation that is recognizably identical to the written description. In The Genuine Sieve of Eratosthenes, Melissa E. O’Neill describes how to write a lazy functional sieve that is much faster than this implementation, although it abstracts away the notion of crossing off multiples from a list. 

https://raganwald.com/2016/04/15/laziness-is-a-virtue
“Programs must be written for people to read, and only incidentally for machines to execute”
hal abelson

Photo of Hal Abelson by Joi Ito


the fibonacci numbers

In mathematics, the Fibonacci numbers or Fibonacci sequence are numbers in the following integer sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, ….1

The rule for determining the sequence is quite simple to explain:

  1. The first number is 0.
  2. The second number is 1.
  3. Every subsequent number is the sum of the two previous numbers.

Thus, the third number is 1 (0 + 1), the fourth number is 2 (1 + 1), and that makes the fifth number 3 (1 + 2), the sixth number 5 (2 + 3) and so on ad infinitum.

There are many ways to write a program that will output the Fibonacci numbers. Each method optimizes for some particular purpose. We’ll start by optimizing for being as close as possible to the written description of the numbers:

function fibonacci () {
  console.log(0);
  console.log(1);

  let [previous, current] = [0, 1];

  while (true) {
    [previous, current] = [current, current + previous];
    console.log(current);
  }
}

This is a reasonable first crack, but we can do better.

The sample above prints the numbers out to infinity. Which is the letter of the definition, but not useful for most purposes. What if we only wanted, say, the first 10 or first 100, or any arbitrary number of Fibonacci numbers? We’d have to weave logic about when to stop into our code:

function fibonacci (numberToPrint) {
  console.log(0);

  if (numberToPrint === 1) return;

  console.log(1);

  if (numberToPrint === 2) return;

  let [previous, current] = [0, 1];

  for (let numberPrinted = 2; numberPrinted < numberToPrint; ++numberPrinted) {
    [previous, current] = [current, current + previous];
    console.log(current);
  }
}

The logic for the number of results we want is buried inside the middle of our code. Ideally, the definition of the sequence can be written completely independently of the mechanism for figuring out how many numbers we need.

And there’s another problem. How do we know what we want to do with the numbers? Maybe we want to print them out, but then again, maybe we want to do something else, like stuff them in an array, or count how many are even and how many are odd?


separating concerns

Our code at the moment entangles these concerns, and our first improvement is to separate the concerns by rewriting our algorithm as a generator. Generators are an excellent way of separating “what we do with the steps of a calculation” from “how we calculate the steps.”

function * fibonacci () {
  yield 0;
  yield 1;

  let [previous, current] = [0, 1];

  while (true) {
    [previous, current] = [current, current + previous];
    yield current;
  }
}

Our generator yields the values of the Fibonacci sequence instead of logging them to the console. So instead of calling a function and having things happen, we call fibonacci and get an iterator. We can use that iterator in a for loop, call its .next() method to extract the values in sequence, or better still, take advantage of standard operations on generators and iterators, like take.

take is a function that turns an iterator that yields many values (even an infinite number) into one that yields no more than a certain number of values. We use it when we want to (cough) take a certain number of values from an iterator.

We can find an implementation of take in an npm module, or just borrow some code from JavaScript Allongé:

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

And then log them to the console:

for (let n of take(10, fibonacci())) {
  console.log(n);
}

The code above calls fibonacci() to get an iterator over the Fibonacci numbers, then take(10, fibonacci()) turns that into an iterator over the first ten Fibonacci numbers, then we run a for loop over those.

To show that we are now able to be much more flexible, here we can splat the same values into an array:

[...take(10, fibonacci())]

We won’t get into counting evens and odds just yet. We’ve already made the point that we can make our fibonacci function more readable for people by ruthlessly paring it down to do just one thing and combining it with other functions and code externally, rather than stuffing the other code inside our function.
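Should we want that count later, here is a minimal sketch of how it might look once the sequence is a generator. countParity is a hypothetical helper introduced for illustration. Note that the fibonacci generator itself does not change at all.

```javascript
function * fibonacci () {
  yield 0;
  yield 1;

  let [previous, current] = [0, 1];

  while (true) {
    [previous, current] = [current, current + previous];
    yield current;
  }
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

// Hypothetical helper: tallies even and odd values from any finite iterable.
function countParity (iterable) {
  const counts = { even: 0, odd: 0 };

  for (const n of iterable) {
    if (n % 2 === 0) {
      counts.even++;
    } else {
      counts.odd++;
    }
  }

  return counts;
}

const parity = countParity(take(10, fibonacci()));

console.log(parity);
  //=> { even: 4, odd: 6 }
```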


simplicity

Turning fibonacci into a generator requires understanding what a generator is, and how the take operation converts a generator with a possibly infinite number of values into a generator that produces a fixed number of values.

It’s almost certainly not worth learning all this just for Fibonacci numbers, but if we do learn these things and then “internalize” them, it becomes a marvellous win, because we can write something like:

function * fibonacci () {
  yield 0;
  yield 1;

  let [previous, current] = [0, 1];

  while (true) {
    [previous, current] = [current, current + previous];
    yield current;
  }
}

And we are simply and very directly reproducing the definition as it was given to us, without cluttering it up with a lot of other concerns that dilute the basic thing we want to communicate.

But if we don’t know about generators, or we know about generators but aren’t familiar with operations like take, or we have never written a generator but vaguely know that clever people can use them to create some keywords for serializing asynchronous code but we don’t need to know how it works as long as we have an async keyword and a compiler…

Well, then, this code just looks like mathematical wankery, and we will write a blog post congratulating ourselves on doing the “simple” thing and just writing:

function fibonacci (numberToPrint) {
  console.log(0);

  if (numberToPrint === 1) return;

  console.log(1);

  if (numberToPrint === 2) return;

  let [previous, current] = [0, 1];

  for (let numberPrinted = 2; numberPrinted < numberToPrint; ++numberPrinted) {
    [previous, current] = [current, current + previous];
    console.log(current);
  }
}

And then when we build larger and larger programs, at each step of the way eschewing an abstraction or technique because not using the technique we don’t know is “simpler,” and we are 100% certain at every step that we have done the right thing and avoided writing “clever” code.

It seems obvious that understanding the capabilities of our tools and how to use them in direct and obvious ways to do the things they were designed to do is not “clever.” So what is “clever code?”


clever code

Here is the naïve way to extract a particular Fibonacci number from our generator:

const fibonacciAt = (index) =>
  [...take(index + 1, fibonacci())][index];

fibonacciAt(7)
  //=> 13

Take all the values up to the one we want, splat them into an array, and then take the one we want. This is very wasteful of space, and really, we’re trying to write:

const fibonacciAt = (index) => [...fibonacci()][index];

But the way JavaScript works, that would first try to create an infinitely long array, then it would run out of space. So sticking take in the expression is mixing what we want to write with some workaround for JavaScript being an eagerly evaluated language.

Mixing two things together is not what we want to do, so even though on the surface [...take(index + 1, fibonacci())][index] looks clever because it’s so terse, it’s the wrong kind of clever.

This gives us a hint about when some inscrutable code is an abstraction that maybe we ought to learn, and when it’s just “clever:” If it’s short because it only does one thing, that’s good. If it’s short but mixes concerns, maybe it’s just clever.

If taking a set of values from an iterator is a standard operation, maybe we can separate “how we take a particular number” from “how we calculate the numbers.” Our first crack looks like this:

const at = (index, iterable) => [...take(index+1, iterable)][index];

at(7, fibonacci())
  //=> 13

We are still using take as a workaround for JavaScript, but now we’ve tucked it inside the at function, and being able to write at(7, fibonacci()) is short and whatever we do with that expression won’t be cluttered up with implementation details.

For example, we could rewrite at so that it doesn’t create a long array just to ignore all but the last value:

function at (index, iterable) {
  const iterator = iterable[Symbol.iterator]();
  let value = undefined;

  for (let i = 0; i <= index; ++i) {
    value = iterator.next().value;
  }

  return value;
}

at(7, fibonacci())
  //=> 13

Separating concerns is more valuable than mixing them in terse code for precisely this reason: You can work on the separate pieces independently.


writing for an audience

Let’s look at fibonacci again:

function fibonacci () {
  console.log(0);
  console.log(1);

  let [previous, current] = [0, 1];

  while (true) {
    [previous, current] = [current, current + previous];
    console.log(current);
  }
}

This is procedural: It’s a recipe for calculating the values one by one, as you might give it to a school child to practise arithmetic. Which is fine, but it’s just arithmetic. Math is more than arithmetic.

What if the written instructions were: “The sequence of Fibonacci numbers is the numbers 0, 1, and the sum of composing the sequence with itself offset by one.” That’s a more geometric way to visualize the numbers, and it requires some mental facility with recursion and operations on sequences.

Working along those lines, the simplest implementation starts with zipWith, an operation that composes two iterators using a supplied “zipper” function:

function * zipWith (zipper, ...iterables) {
  const iterators = iterables.map(i => i[Symbol.iterator]());

  while (true) {
    const pairs = iterators.map(j => j.next()),
          dones = pairs.map(p => p.done),
          values = pairs.map(p => p.value);

    if (dones.indexOf(true) >= 0) break;
    yield zipper(...values);
  }
};

zipWith((x, y) => x + y, [1, 2, 3], [1000, 2000, 3000])
  //=> iterator over 1001, 2002, 3003

For offsetting a sequence by one, we can use tail, which iterates over all the values of an iterator except its “head:”

function * tail (iterable) {
  const iterator = iterable[Symbol.iterator]();

  iterator.next();
  yield * iterator;
}

Given these two, if we had a fibonacci generator, we could yield the values of composing it with itself offset by one like this:

function * fibonacci () {
  yield * zipWith((x, y) => x + y, fibonacci(), tail(fibonacci()));
}

What about the first two values?

function * fibonacci () {
  yield 0;
  yield 1;
  yield * zipWith((x, y) => x + y, fibonacci(), tail(fibonacci()));
}
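To check that the self-referential definition really does produce the sequence, here is a self-contained sketch that repeats zipWith, tail, and take from above and pulls the first ten values. The firstTen name is ours, introduced only for this check.

```javascript
function * zipWith (zipper, ...iterables) {
  const iterators = iterables.map(i => i[Symbol.iterator]());

  while (true) {
    const pairs = iterators.map(j => j.next()),
          dones = pairs.map(p => p.done),
          values = pairs.map(p => p.value);

    // Stop as soon as any one of the inputs is exhausted.
    if (dones.indexOf(true) >= 0) break;
    yield zipper(...values);
  }
}

function * tail (iterable) {
  const iterator = iterable[Symbol.iterator]();

  // Discard the head, then yield everything else.
  iterator.next();
  yield * iterator;
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

function * fibonacci () {
  yield 0;
  yield 1;
  yield * zipWith((x, y) => x + y, fibonacci(), tail(fibonacci()));
}

const firstTen = [...take(10, fibonacci())];

console.log(firstTen);
  //=> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because every generator here is lazy, the recursive calls to fibonacci() only do work when a value is requested. That is the only reason the definition terminates for a finite take.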

Now, there is a performance implication of this expression, but let’s set that aside for a moment to consider: Which is better? The expression that describes composing a sequence with itself? Or the expression that describes procedurally generating numbers?

In other words, do we think in arithmetic or geometry?

The answer seems easy: If we’re talking about Fibonacci, go with geometry. It is, after all, a mathematics function. If you ever did have to write it for a program, anybody looking at the code ought to have enough of a background in mathematics to appreciate composing sequences recursively. For the same reason, if you wanted to write this:

import { zero, one } from 'big-integer';

let times = (...matrices) =>
  matrices.reduce(
    ([a, b, c], [d, e, f]) => [
        a.times(d).plus(b.times(e)),
        a.times(e).plus(b.times(f)),
        b.times(e).plus(c.times(f))
      ]
  );

let power = (matrix, n) => {
  if (n === 1) return matrix;

  let halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
}

let fibonacciAt = (n) =>
  n < 2
  ? n
  : power([one, one, zero], n - 1)[0];

That would be fine as well. It is math, anybody looking at it ought to have the mathematics background or be prepared to look it up. As a car driver, I expect the steering wheel in the usual place and to find the other controls as a driver would expect them. But I appreciate that the engine will be designed for the mechanically inclined.
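For readers who do want to look under the hood, here is a sketch of the same idea using JavaScript’s built-in BigInt in place of the big-integer library. That is a substitution on our part, not the code above. Each triple [a, b, c] stands for the symmetric 2×2 matrix [[a, b], [b, c]]. Raising [[1, 1], [1, 0]] to the power n - 1 leaves the nth Fibonacci number in the top-left corner.

```javascript
// Multiply two symmetric 2x2 matrices, each stored as [a, b, c] for
// [[a, b], [b, c]]. Powers of the same symmetric matrix commute, so
// the product is symmetric too and fits the same three-element form.
const times = ([a, b, c], [d, e, f]) => [
  a * d + b * e,
  a * e + b * f,
  b * e + c * f
];

// Exponentiation by squaring: O(log n) multiplications. Assumes n >= 1.
const power = (matrix, n) => {
  if (n === 1) return matrix;

  const half = power(matrix, Math.floor(n / 2));
  const squared = times(half, half);

  return n % 2 === 0 ? squared : times(squared, matrix);
};

// Returns a BigInt for every n, including the base cases.
const fibonacciAt = (n) =>
  n < 2
    ? BigInt(n)
    : power([1n, 1n, 0n], n - 1)[0];

console.log(fibonacciAt(10));
  //=> 55n

console.log(fibonacciAt(100));
  //=> 354224848179261915075n
```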

This analogy of the driver and the automobile applies to our “geometric” expression:

function * fibonacci () {
  yield 0;
  yield 1;
  yield * zipWith((x, y) => x + y, fibonacci(), tail(fibonacci()));
}

The mathematician in the driver’s seat may be happy, but the programmer working with the engine realizes that this expression recursively generates generators. Nice car, but it’s a gas guzzler.

We can fix this, but once again, we do our utmost to separate how we fix it from the code itself:

function memoize (generator) {
  const memos = {},
        iterators = {};

  return function * (...args) {
    const key = JSON.stringify(args);
    let i = 0;

    if (memos[key] == null) {
      memos[key] = [];
      iterators[key] = generator(...args);
    }

    while (true) {
      if (i < memos[key].length) {
        yield memos[key][i++];
      }
      else {
        const { done, value } = iterators[key].next();

        if (done) {
          return;
        } else {
          yield memos[key][i++] = value;
        }
      }
    }
  }
}

const mfibs = memoize(function * () {
  yield 0;
  yield 1;
  yield * zipWith((x, y) => x + y, mfibs(), tail(mfibs()));
});

Some code has multiple audiences, and separating the code’s concerns enables each piece to speak to specialists in the appropriate domain without demanding that anybody reading it be familiar with both mathematics and the efficient reuse of previously computed values.
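To see what memoize buys us, here is a sketch that counts how many times each generator body starts running. The counters unmemoizedStarts and memoizedStarts, and the add helper, are hypothetical additions for illustration; memoize, zipWith, tail, and take are repeated from above so that the listing runs on its own.

```javascript
function memoize (generator) {
  const memos = {},
        iterators = {};

  return function * (...args) {
    const key = JSON.stringify(args);
    let i = 0;

    if (memos[key] == null) {
      memos[key] = [];
      iterators[key] = generator(...args);
    }

    while (true) {
      if (i < memos[key].length) {
        // Replay a value some earlier consumer already computed.
        yield memos[key][i++];
      }
      else {
        const { done, value } = iterators[key].next();

        if (done) {
          return;
        } else {
          yield memos[key][i++] = value;
        }
      }
    }
  }
}

function * zipWith (zipper, ...iterables) {
  const iterators = iterables.map(i => i[Symbol.iterator]());

  while (true) {
    const pairs = iterators.map(j => j.next());

    if (pairs.some(p => p.done)) break;
    yield zipper(...pairs.map(p => p.value));
  }
}

function * tail (iterable) {
  const iterator = iterable[Symbol.iterator]();

  iterator.next();
  yield * iterator;
}

function * take (numberToTake, iterable) {
  const iterator = iterable[Symbol.iterator]();

  for (let i = 0; i < numberToTake; ++i) {
    const { done, value } = iterator.next();
    if (!done) yield value;
  }
}

const add = (x, y) => x + y;

let unmemoizedStarts = 0;

function * fibonacci () {
  ++unmemoizedStarts;
  yield 0;
  yield 1;
  yield * zipWith(add, fibonacci(), tail(fibonacci()));
}

let memoizedStarts = 0;

const mfibs = memoize(function * () {
  ++memoizedStarts;
  yield 0;
  yield 1;
  yield * zipWith(add, mfibs(), tail(mfibs()));
});

const plain = [...take(15, fibonacci())];
const memoized = [...take(15, mfibs())];

// Both produce the same values, but the memoized body starts only once.
console.log(unmemoizedStarts > memoizedStarts, memoizedStarts);
  //=> true 1
```

The mathematical definition is untouched. All of the bookkeeping lives in memoize.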


“writing for people to read”

Code that is written in a particular domain can and should be written for programmers who are proficient with the tools of their trade. In ES6, that includes generators and common operations on sequences like take, tail, and zipWith.

Also, code that is written for a particular domain can and should be written for programmers who have domain-knowledge. A Fibonacci function should be written for the reader who has familiarity with mathematics. Code is written for humans to read, but there is a presumption that humans choosing to read it will have or be prepared to acquire the knowledge appropriate for that domain.2

When there are multiple concerns, each requiring attention to a different domain, we separate those concerns. This is why the engine of a car is hidden away from the driver and the passengers, and it is why the mechanics of computing a fibonacci number is separated from the programming issues of how to implement things like take, tail, zipWith, or memoize.



notes
  1. The numbers were originally given as 1, 1, 2, 3, 5, 8, 13, 21, …, but it is more convenient for modern purposes to begin with 0 and 1

  2. Code written for the business domain can and should have abstractions appropriate for business software. Like state machines, domain-specific languages, batch jobs, and so forth. 

https://raganwald.com/2016/03/17/programs-must-be-written-for-people-to-read
First-Class Commands (the annotated presentation)
foreword

This talk was given at NDC London on January 14, 2016. This is not a literal transcript: A selection of the original slides are shown here, along with some annotations explaining the ideas presented.


Part I: The Basics

“In object-oriented programming, the command pattern is a behavioural design pattern in which an object is used to encapsulate all information needed to perform an action or trigger an event at a later time.”

We will review the command pattern’s definition, then look at some interesting applications. We’ll see why what matters about the command pattern is the underlying idea that behaviour can be treated as a first-class entity in its own right.

The command pattern was popularized by the 1994 book Design Patterns: Elements of Reusable Object-Oriented Software. But it’s 2016. Why do we care? Why is it worth another look?

At that time, most software ran on the desktop or in a client-server environment. Distributed software was relatively exotic. So naturally, the examples given of the command pattern in use were often those applicable to single users. Like “undo,” writing macros, or perhaps displaying a progress bar.

Nevertheless, the underlying idea of the command pattern becomes particularly interesting when applied to parallel and distributed software, whether we are thinking of job queues, thread pools, or algorithms that provide eventual consistency across a distributed system.

In 2016, software is parallel and distributed by default. And the command pattern deserves another look, with fresh eyes.

The “canonical example” of the command pattern is working with mutable data. Here’s one such example, chosen because it fits on a couple of slides:

class Buffer {
  constructor (text = '') { this.text = text; }

  replaceWith (replacement, from = 0, to = this.text.length) {
    this.text = this.text.slice(0, from) +
                  replacement +
                  this.text.slice(to);
    return this;
  }

  toString () { return this.text; }
}

let buffer = new Buffer();

buffer.replaceWith(
  "The quick brown fox jumped over the lazy dog"
);
buffer.replaceWith("fast", 4, 9);
buffer.replaceWith("canine", 40, 43);
 //=> The fast brown fox jumped over the lazy canine

We have a buffer that contains some plain text, and it has a single behaviour, a replaceWith method that replaces a selection of the buffer with some new text. Insertions can be managed by replacing a zero-length selection, and deletions can be handled by replacing a selection with the empty string.

Ten years ago, Steve Yegge described OOP as a Kingdom of Nouns: Everything is an object and objects own their behaviours.

There is a very explicit idea that objects model entities in the real world, and methods model changes to those entities. Objects are “first-class:” They can be stored in variables, we can query them for their properties, and we can transform them into different states or different entities altogether.

Many languages also permit us to treat methods as first-class entities. In Python, we can easily extract a bound method from an object. In Ruby, we can manipulate both bound and unbound methods. In JavaScript, methods are just functions.

Typically, treating methods as first-class entities is rarer than treating “nouns” as first-class entities, but it is possible. This forms the basis of meta-programming techniques like writing method decorators.

But the command pattern concerns itself with invocations. An invocation is a specific method, invoked on a specific receiver, with specific parameters:

Classes are to instances as methods are to invocations.

If an invocation was a first-class entity, we could store it in a variable or data structure. Let’s try it:

class Edit {
  constructor (buffer, {replacement, from, to}) {
    this.buffer = buffer;
    Object.assign(this, {replacement, from, to});
  }

  doIt () {
    this.buffer.text =
      this.buffer.text.slice(0, this.from) +
      this.replacement +
      this.buffer.text.slice(this.to);
    return this.buffer;
  }
}

class Buffer {
  constructor (text = '') { this.text = text; }

  replaceWith (replacement, from = 0, to = this.text.length) {
    return new Edit(this, {replacement, from, to});
  }

  toString () { return this.text; }
}

let buffer = new Buffer(), jobQueue = [];

jobQueue.push(
  buffer.replaceWith(
    "The quick brown fox jumped over the lazy dog"
  )
);
jobQueue.push( buffer.replaceWith("fast", 4, 9) );
jobQueue.push( buffer.replaceWith("canine", 40, 43) );

while (jobQueue.length > 0) {
  jobQueue.shift().doIt();
}
 //=> The fast brown fox jumped over the lazy canine

Since we’re taking an OO approach, we’ve created an Edit class that represents invocations. Each instance is an invocation, and thus we can create new invocations with new Edit(...) and actually perform the invocation with .doIt().

In this example, we’ve created a job queue, deferring a number of invocations until we pop them off the queue and perform them. Note that “invoking” methods on a buffer no longer does anything: Instead, they return invocations we manipulate explicitly.1

This is the canonical way to “do commands” in OOP: Make them instances of a class and perform them with a method. There are other ways to implement the command pattern, and it can be implemented in FP as well, but for our purposes this is enough to explore its applications.

We can also query commands. Naturally, we do this by implementing methods that report on some critical characteristic, like a command’s scope. For simplicity, we won’t implement a .scope() method that reports the extent of an edit’s selection, since JavaScript encourages unencapsulated direct property access.

But we can report on the amount by which an edit lengthens or shortens a buffer:

class Edit {

  netChange () {
    return this.from - this.to + this.replacement.length;
  }

}

let buffer = new Buffer();

buffer.replaceWith(
    "The quick brown fox jumped over the lazy dog"
).netChange();
 //=> 44

buffer.replaceWith("fast", 4, 9).netChange();
 //=> -1

This can be useful.

First-class entities can also be transformed. And here we come to the most interesting application of commands. Here’s a .reversed() method that returns the inverse of any edit:

class Edit {

  reversed () {
    let replacement = this.buffer.text.slice(this.from, this.to),
        from = this.from,
        to = from + this.replacement.length;
    return new Edit(this.buffer, {replacement, from, to});
  }
}

let buffer = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

let doer = buffer.replaceWith("fast", 4, 9),
    undoer = doer.reversed();

doer.doIt();
  //=> The fast brown fox jumped over the lazy dog

undoer.doIt();
  //=> The quick brown fox jumped over the lazy dog

Let’s put our storing and transforming together. Instead of returning a command from the replaceWith method, we’ll create a doer command, and push its reverse onto a history stack. We’ll then invoke doer.doIt() to actually perform the replacement on the buffer:

class Buffer {

  constructor (text = '') {
    this.text = text;
    this.history = [];
    this.future = [];
  }

}

class Buffer {

  replaceWith (replacement, from = 0, to = this.text.length) {
    let doer = new Edit(this, {replacement, from, to}),
        undoer = doer.reversed();

    this.history.push(undoer);
    this.future = [];
    return doer.doIt();
  }

}

Implementing undo is straightforward: Pop an undoer from the stack, create a redoer for later, push the redoer onto a future stack, and invoke the undoer:

class Buffer {

  undo () {
    let undoer = this.history.pop(),
        redoer = undoer.reversed();

    this.future.unshift(redoer);
    return undoer.doIt();
  }

}

let buffer = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

buffer.replaceWith("fast", 4, 9)
  //=> The fast brown fox jumped over the lazy dog

buffer.replaceWith("canine", 40, 43)
  //=> The fast brown fox jumped over the lazy canine

buffer.undo()
  //=> The fast brown fox jumped over the lazy dog

buffer.undo()
  //=> The quick brown fox jumped over the lazy dog

Redoing something we’ve undone is now simple:

class Buffer {

  redo () {
    let redoer = this.future.shift(),
        undoer = redoer.reversed();

    this.history.push(undoer);
    return redoer.doIt();
  }

}

buffer.redo()
  //=> The fast brown fox jumped over the lazy dog

buffer.redo()
  //=> The fast brown fox jumped over the lazy canine

And again, its reverse goes onto the history so we can toggle back and forth between undoing and redoing.
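The class fragments above are shown slide by slide, so they don’t run as written. Here is a sketch that assembles them into one listing, assuming the fragments are meant to be merged into a single Edit class and a single Buffer class. It uses this.text.length for the default end of the selection and this.buffer inside reversed(); both are assumptions on our part.

```javascript
class Edit {
  constructor (buffer, {replacement, from, to}) {
    this.buffer = buffer;
    Object.assign(this, {replacement, from, to});
  }

  doIt () {
    this.buffer.text =
      this.buffer.text.slice(0, this.from) +
      this.replacement +
      this.buffer.text.slice(this.to);
    return this.buffer;
  }

  // Must be called before doIt(), while the text about to be
  // replaced is still in the buffer.
  reversed () {
    const replacement = this.buffer.text.slice(this.from, this.to),
          from = this.from,
          to = from + this.replacement.length;
    return new Edit(this.buffer, {replacement, from, to});
  }
}

// Note: in Node, this shadows the built-in Buffer within this file.
class Buffer {
  constructor (text = '') {
    this.text = text;
    this.history = [];
    this.future = [];
  }

  replaceWith (replacement, from = 0, to = this.text.length) {
    const doer = new Edit(this, {replacement, from, to}),
          undoer = doer.reversed();

    this.history.push(undoer);
    this.future = [];
    return doer.doIt();
  }

  // Assumes there is something to undo; an empty history would throw.
  undo () {
    const undoer = this.history.pop(),
          redoer = undoer.reversed();

    this.future.unshift(redoer);
    return undoer.doIt();
  }

  // Assumes there is something to redo; an empty future would throw.
  redo () {
    const redoer = this.future.shift(),
          undoer = redoer.reversed();

    this.history.push(undoer);
    return redoer.doIt();
  }

  toString () { return this.text; }
}

const demo = new Buffer("The quick brown fox jumped over the lazy dog");

demo.replaceWith("fast", 4, 9);
demo.replaceWith("canine", 40, 43);
demo.undo();
demo.undo();
demo.redo();

console.log(demo.toString());
  //=> The fast brown fox jumped over the lazy dog
```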

Like the slide says, this is the basic idea you’ll find in the GoF book as well as in 1980s tomes on OO programming. I recall an Object Pascal book using this pattern to implement undo within the MacApp framework in the late 1980s.

Our example hits all three of the characteristics of invocations as first-class entities. But that isn’t really enough to “provoke our intellectual curiosity.” So let’s consider a more interesting direction.

coupling through time

We begin by asking a question.

Recall this code for replacing text in a buffer:

class Buffer {

  replaceWith (replacement, from = 0, to = this.text.length) {
    let doer = new Edit(this, {replacement, from, to}),
        undoer = doer.reversed();

    this.history.push(undoer);
    this.future = [];
    return doer.doIt();
  }

}

Note that when we perform a replacement, we execute this.future = [], throwing away any “redoers” we may have accumulated by undoing edits.

Let’s try not throwing it away:

class Buffer {

  replaceWith (replacement, from = 0, to = this.text.length) {
    let doer = new Edit(this, {replacement, from, to}),
        undoer = doer.reversed();

    this.history.push(undoer);
    // this.future = [];
    return doer.doIt();
  }

}

let buffer = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

buffer.replaceWith("fast", 4, 9);
  //=> The fast brown fox jumped over the lazy dog

buffer.undo();
  //=> The quick brown fox jumped over the lazy dog

buffer.replaceWith("My", 0, 3);
  //=> My quick brown fox jumped over the lazy dog

We’ve performed a replacement, then we’ve undone the replacement, restoring the buffer to its original state. Then we performed a different replacement. But since our code no longer discards the future, a redoer is still in this.future.

Unfortunately, when we invoke that redoer, the result is not what we expect semantically.

What went wrong?

As the illustration shows, when we first performed .replaceWith('fast', 4, 9), it replaced the characters q, u, i, c, and k, because those were in the selection between 4 and 9 of the buffer.

Our redoer in the future performs this same replacement, but now that we’ve invoked .replaceWith('My', 0, 3), the characters in the selection between 4 and 9 are now u, i, c, k, and ` `, a blank space.

Invoking .replaceWith('My', 0, 3) has moved the part of the buffer we semantically want to replace.

If we step through the invocations, we can see that when we first invoke .replaceWith('fast', 4, 9), no other edits were invoked before it.

Then after undoing it and invoking .replaceWith('My', 0, 3), we have created a situation where .replaceWith('My', 0, 3) is now before .replaceWith('fast', 4, 9) in the future. If we invoke it, we see this clearly as it moves to the past, but it is now preceded by .replaceWith('My', 0, 3):

It turns out that commands are first-class entities, but there is a spooky relationship between them and the models they manipulate, thanks to cause-and-effect. They aren’t 100% independent entities that can be invoked in any order, any number of times.

Commands mutating a model have a semantic dependency on all of the commands that have mutated the model in the past. If you change the order of commands, they may no longer be semantically valid. In some cases, they could even become logically invalid.

Semantically, we can think that if we alter the history of edits before invoking a command, we are altering the meaning of the command. Replacing The with My altered the meaning of .replaceWith('fast', 4, 9).
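To see the dependency without any of the Buffer machinery, here is a minimal sketch using plain strings. The helper `replaceRange` is a hypothetical stand-in for what an edit does to the text; it is not one of the classes used in this talk.

```javascript
// Hypothetical helper: replace the characters in [from, to) with `replacement`.
const replaceRange = (text, replacement, from, to) =>
  text.slice(0, from) + replacement + text.slice(to);

const original = "The quick brown fox jumped over the lazy dog";

// Applied to the original text, positions 4..9 select "quick".
const direct = replaceRange(original, "fast", 4, 9);

// Applied after "The" has become "My", the same positions select "uick ".
const shifted = replaceRange(
  replaceRange(original, "My", 0, 3),
  "fast", 4, 9
);

console.log(direct);   // The fast brown fox jumped over the lazy dog
console.log(shifted);  // My qfastbrown fox jumped over the lazy dog
```

The numbers in the second call are identical to the first, but the history in front of them is different, so they no longer mean "replace the word quick."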

adjusting for changes in history

Let’s go about fixing this specific problem, that of commands altering the position of other commands.2 We begin with another query: we can ask whether a particular edit is before another edit, meaning that A is before B if A affects a selection of text that entirely precedes the selection affected by B.

let buffer = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

let fast = new Edit(
    buffer,
    { replacement: "fast", from: 4, to: 9 }
  );

let my = new Edit(
    buffer,
    { replacement: "My", from: 0, to: 3 }
  );

class Edit {

  isBefore (other) {
    return other.from >= this.to;
  }

}

fast.isBefore(my);
  //=> false

my.isBefore(fast);
  //=> true

Equipped with .isBefore and .netChange(), we can write a .prependedWith method that takes an edit and returns a new version of the edit that corrects for any change caused by prepending another edit into its history.

There are two cases we cover: If we write a.prependedWith(b), and a is before b, then we return a since b doesn’t change its semantic meaning. But if we write a.prependedWith(b), and b is before a, then we return a copy of a that has been adjusted by the amount of b’s net change:

class Edit {

  prependedWith (other) {
    if (this.isBefore(other)) {
      return this;
    }
    else if (other.isBefore(this)) {
      let change = other.netChange(),
          {replacement, from, to} = this;

      from = from + change;
      to = to + change;
      return new Edit(this.buffer, {replacement, from, to})
    }
  }

}

my.prependedWith(fast)
  //=> buffer.replaceWith("My", 0, 3)

fast.prependedWith(my)
  //=> buffer.replaceWith("fast", 3, 8)
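The .netChange() method is not shown in this section. A plausible reading, assumed here, is that it is the length of the replacement minus the length of the selection being replaced. Under that assumption, the arithmetic behind the adjusted positions can be sketched with plain objects:

```javascript
// Assumption: net change = characters added minus characters removed.
const netChange = ({ replacement, from, to }) =>
  replacement.length - (to - from);

const my   = { replacement: "My",   from: 0, to: 3 };
const fast = { replacement: "fast", from: 4, to: 9 };

// "My" is one character shorter than the three characters it replaces.
const change = netChange(my);

// Everything after the "My" edit slides left by one position.
const adjusted = {
  ...fast,
  from: fast.from + change,
  to: fast.to + change
};

console.log(change);                     // -1
console.log(adjusted.from, adjusted.to); // 3 8
```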

With this in hand, we see what to do with this.future: Whenever we invoke a fresh command, we must replace all of the edits in the future with versions prepended with the command we’re invoking, thus adjusting them to maintain the same semantic meaning:

class Buffer {

  replaceWith (replacement, from = 0, to = this.length()) {
    let doer = new Edit(this, {replacement, from, to}),
        undoer = doer.reversed();

    this.history.push(undoer);
    this.future = this.future.map(
      (edit) => edit.prependedWith(doer)
    );
    return doer.doIt();
  }

}

let buffer = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

buffer.replaceWith("fast", 4, 9);
  //=> The fast brown fox jumped over the lazy dog

buffer.undo();
  //=> The quick brown fox jumped over the lazy dog

buffer.replaceWith("My", 0, 3);
  //=> My quick brown fox jumped over the lazy dog

buffer.redo();
  //=> My fast brown fox jumped over the lazy dog

Now we get the correct result!
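The fragments above leave out several pieces, so here is one way they might be assembled into something that runs end to end. The bodies of doIt, reversed, netChange, undo, and redo are reconstructions written for this sketch, as is the choice to throw on overlapping edits; treat them as assumptions rather than the code from the slides.

```javascript
class Edit {
  constructor (buffer, { replacement, from, to }) {
    Object.assign(this, { buffer, replacement, from, to });
  }

  // Assumed: characters added minus characters removed.
  netChange () {
    return this.replacement.length - (this.to - this.from);
  }

  isBefore (other) {
    return other.from >= this.to;
  }

  // Assumed: the edit that puts back the text this edit is about to replace.
  // Must be called before doIt(), while the old text is still in the buffer.
  reversed () {
    const replacement = this.buffer.text.slice(this.from, this.to);
    const to = this.from + this.replacement.length;
    return new Edit(this.buffer, { replacement, from: this.from, to });
  }

  // Assumed: splice the replacement into the buffer and return the new text.
  doIt () {
    const { text } = this.buffer;
    this.buffer.text =
      text.slice(0, this.from) + this.replacement + text.slice(this.to);
    return this.buffer.text;
  }

  prependedWith (other) {
    if (this.isBefore(other)) return this;
    if (other.isBefore(this)) {
      const change = other.netChange();
      const { replacement, from, to } = this;
      return new Edit(this.buffer,
        { replacement, from: from + change, to: to + change });
    }
    // Overlapping edits are outside the scope of this sketch.
    throw new Error("overlapping edits are not handled");
  }
}

class Buffer {
  constructor (text = "") {
    Object.assign(this, { text, history: [], future: [] });
  }

  length () { return this.text.length; }

  replaceWith (replacement, from = 0, to = this.length()) {
    const doer = new Edit(this, { replacement, from, to });
    this.history.push(doer.reversed());
    // Keep the future, adjusted for the edit we are about to perform.
    this.future = this.future.map((edit) => edit.prependedWith(doer));
    return doer.doIt();
  }

  undo () {
    const undoer = this.history.pop();
    this.future.unshift(undoer.reversed());
    return undoer.doIt();
  }

  redo () {
    const redoer = this.future.shift();
    this.history.push(redoer.reversed());
    return redoer.doIt();
  }
}

const buffer = new Buffer("The quick brown fox jumped over the lazy dog");

buffer.replaceWith("fast", 4, 9);
buffer.undo();
buffer.replaceWith("My", 0, 3);

console.log(buffer.redo()); // My fast brown fox jumped over the lazy dog
```

Note that this sketch only adjusts the future. It does nothing about the undoers already in the history, which is fine for this example but is one of the many hand waves a real implementation would have to address.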

the bigger picture

Once upon a time, “undo” was a magical feature for single users. It transformed the software experience for users, because they could act without fear of making irreversible catastrophic mistakes. There was a natural progression to undo and redo stacks. But it was rare that applications went further.

Only the most esoteric would surface the undo and redo stacks, permit execution of arbitrary commands from the redo stack, or maintain the redo stack after performing new edits (as we’ve implemented here). This is a neat feature, but challenging to design into an application in the “real world.” It’s challenging to set user expectations about what the redo command will do.3

But not all implementations of commands have a direct representation in the user experience. And if we put aside the problem of user experience, we have a very strong takeaway from dealing with maintaining the future while inserting new edits into the history. While it’s just one limited example, it hints at being able to arbitrarily manipulate history, inserting, removing, or reordering edits as we desire.

This is a very powerful concept: Typically, we are slaves to mutable state. It moves forward inexorably. Taming it is a struggle. But commands suggest a way to take control.

Part II: Software in a Distributed World

Alice and Bob are writing a screenplay. Naturally, their editors use our buffers and edits:

let alice = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

let bob = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

To keep the code simple, we’ll omit some of the moving parts to support undoing edits from our command-oriented Buffer class:

class Buffer {

  constructor (text = '') {
    this.text = text;
    this.history = [];
  }

  replaceWith (replacement, from = 0, to = this.length()) {
    let edit = new Edit(this,
                   {replacement, from, to}
                 );

    this.history.push(edit);
    return edit.doIt();
  }

}


Now we want to synchronize the screenplay, so that Alice can see Bob’s change, and Bob can see Alice’s change. So, naturally, Alice sends Bob her change, and Bob sends Alice his change. We want to apply those changes so that we end up with both Alice and Bob looking at identical buffers.

What we want to do looks like this:

Alice and Bob each perform a different edit, causing their buffers to diverge. We want to apply each other’s edits in such a way that they converge back to a consistent view of the buffer.

We can try that:

class Buffer {

  append (theirEdit) {
    this.history.forEach( (myEdit) => {
      theirEdit = theirEdit.prependedWith(myEdit);
    });
    return new Edit(this, theirEdit).doIt();
  }

  appendAll(otherBuffer) {
    otherBuffer.history.forEach(
      (theirEdit) => this.append(theirEdit)
    );
    return this;
  }

}

Now we can write alice.appendAll(bob) to apply all of Bob’s edits to Alice’s copy of the buffer. And we can write bob.appendAll(alice) to apply all of Alice’s edits to Bob’s copy of the buffer. Problem solved?

alice.appendAll(bob);
  //=> My fast brown fox jumped over the lazy dog

bob.appendAll(alice);
  //=> My fast brown fox jumped over the lazy dog

This appears to work: By prepending the existing edits onto edits being appended to a buffer, we transform the new edits to produce the same result, synchronizing the buffers.

Unfortunately, there’s a bug.

A big bug!

What happens if we try to append again? Since neither Alice nor Bob has made any further edits, the buffers should remain unchanged. But they don’t:

alice.appendAll(bob);
  //=> My fastbrown fox jumped over the lazy dog

bob.appendAll(alice);
  //=> Myfast brown fox jumped over the lazy dog

Our append methods are applying each edit all over again. To fix that, we have to modify our algorithm to pay attention to whether edits already exist in a buffer or edit’s history. First, let’s upgrade our edits and give them a guid we can use to identify them, as well as a set of the guids of the edits that came before them:

let GUID = () => {
    let _p8 = (s) => {
        let p = (Math.random().toString(16)+"000000000").substr(2,8);
        return s ? "-" + p.substr(0,4) + "-" + p.substr(4,4) : p ;
    }
    return _p8() + _p8(true) + _p8(true) + _p8();
}

class Edit {

  constructor (buffer,
    { guid = GUID(), befores = new Set(),
      replacement, from, to }) {
    this.buffer = buffer;
    befores = new Set(befores);

    Object.assign(this,
                  {guid, replacement, from, to, befores});
  }

}

Our buffers will also track the guids of the edits in their history:

class Buffer {

  constructor (text = '', history = []) {
    let befores = new Set(history.map(e => e.guid));
    history = history.slice(0);
    Object.assign(this, {text, history, befores});
  }

  share () {
    return new Buffer(this.text, this.history);
  }

  has (edit) { return this.befores.has(edit.guid); }

}

We’ll refactor replaceWith to extract a .perform(edit) method, which will simplify a lot of what’s coming:

class Buffer {

  perform (edit) {
    if (!this.has(edit)) {
      this.history.push(edit);
      this.befores.add(edit.guid);
      return edit.doIt();
    }
  }

  replaceWith (replacement,
               from = 0, to = this.length()) {
    let befores = this.befores;
    let edit = new Edit(this,
                   {replacement, from, to, befores}
                 );
    return this.perform(edit);
  }

}

Now our append method can be fixed to prepend every edit with everything in its history, much as we did with fixing redo:

class Buffer {

  append (theirEdit) {
    this.history.forEach( (myEdit) => {
      theirEdit = theirEdit.prependedWith(myEdit);
    });
    return this.perform(new Edit(this, theirEdit));
  }

}

Here’s an updated appendAll that only appends edits that aren’t already in the history. What? We didn’t mention that was another bug in the code? Silly us.4

class Buffer {

  appendAll(otherBuffer) {
    otherBuffer.history.forEach(
      (theirEdit) =>
        this.has(theirEdit) || this.append(theirEdit)
    );
    return this;
  }

}

Now we’re finally ready to update the prependedWith method to check whether an edit is “before” another edit, is the same as another edit, or is already in the edit’s history:

class Edit {

  prependedWith (other) {
    if (this.isBefore(other) ||
        this.befores.has(other.guid) ||
        this.guid === other.guid) return this;

    let change = other.netChange(),
        {guid, replacement, from, to, befores} = this;

    from = from + change;
    to = to + change;
    befores = new Set(befores);
    befores.add(other.guid);

    return new Edit(this.buffer, {guid, replacement, from, to, befores});
  }

}

With all these changes in place, Alice and Bob can exchange edits at will.5 Let’s try it!
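Before trying it, here is one way the fragments above might be assembled into a single runnable file. The netChange and doIt bodies are assumptions made for this sketch, the GUID generator is replaced with a deterministic counter so runs are repeatable, and the short-circuit in appendAll is written as an explicit if.

```javascript
// Deterministic stand-in for the random GUID generator shown earlier.
let nextId = 0;
const GUID = () => `edit-${nextId++}`;

class Edit {
  constructor (buffer,
    { guid = GUID(), befores = new Set(), replacement, from, to }) {
    this.buffer = buffer;
    Object.assign(this,
      { guid, replacement, from, to, befores: new Set(befores) });
  }

  // Assumed: characters added minus characters removed.
  netChange () {
    return this.replacement.length - (this.to - this.from);
  }

  isBefore (other) {
    return other.from >= this.to;
  }

  // Assumed: splice the replacement into the owning buffer's text.
  doIt () {
    const { text } = this.buffer;
    this.buffer.text =
      text.slice(0, this.from) + this.replacement + text.slice(this.to);
    return this.buffer.text;
  }

  prependedWith (other) {
    if (this.isBefore(other) ||
        this.befores.has(other.guid) ||
        this.guid === other.guid) return this;

    const change = other.netChange();
    const { guid, replacement, from, to } = this;
    const befores = new Set(this.befores).add(other.guid);

    return new Edit(this.buffer,
      { guid, replacement, from: from + change, to: to + change, befores });
  }
}

class Buffer {
  constructor (text = "", history = []) {
    const befores = new Set(history.map((e) => e.guid));
    Object.assign(this, { text, history: history.slice(0), befores });
  }

  length () { return this.text.length; }
  share () { return new Buffer(this.text, this.history); }
  has (edit) { return this.befores.has(edit.guid); }

  perform (edit) {
    if (!this.has(edit)) {
      this.history.push(edit);
      this.befores.add(edit.guid);
      return edit.doIt();
    }
  }

  replaceWith (replacement, from = 0, to = this.length()) {
    return this.perform(
      new Edit(this, { replacement, from, to, befores: this.befores })
    );
  }

  append (theirEdit) {
    this.history.forEach((myEdit) => {
      theirEdit = theirEdit.prependedWith(myEdit);
    });
    return this.perform(new Edit(this, theirEdit));
  }

  appendAll (otherBuffer) {
    otherBuffer.history.forEach((theirEdit) => {
      if (!this.has(theirEdit)) this.append(theirEdit);
    });
    return this;
  }
}

const alice = new Buffer("The quick brown fox jumped over the lazy dog");
const bob = alice.share();

alice.replaceWith("My", 0, 3);
bob.replaceWith("fast", 4, 9);

alice.appendAll(bob);
bob.appendAll(alice);
alice.appendAll(bob); // repeating a sync should now change nothing

console.log(alice.text); // My fast brown fox jumped over the lazy dog
console.log(bob.text);   // My fast brown fox jumped over the lazy dog
```

As with the fragments it is built from, this sketch makes no attempt to handle overlapping edits.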

alice, bob, and carol

Alice, Bob and Carol are writing a screenplay.

let alice = new Buffer(
  "The quick brown fox jumped over the lazy dog"
);

let bob = alice.share();
  //=> The quick brown fox jumped over the lazy dog

alice.replaceWith("My", 0, 3);
  //=> My quick brown fox jumped over the lazy dog

let carol = alice.share();
  //=> My quick brown fox jumped over the lazy dog

bob.replaceWith("fast", 4, 9);
  //=> The fast brown fox jumped over the lazy dog

alice.appendAll(bob);
  //=> My fast brown fox jumped over the lazy dog

bob.appendAll(alice);
  //=> My fast brown fox jumped over the lazy dog

alice.replaceWith("spotted", 8, 13);
  //=> My fast spotted fox jumped over the lazy dog

bob.appendAll(alice);
  //=> My fast spotted fox jumped over the lazy dog

carol.appendAll(bob);
  //=> My fast spotted fox jumped over the lazy dog

It works!

Or rather, it works for some definition of “works.” The algorithm we just implemented is called Operational Transformation, and John Gentle’s quote above is pertinent.

We’ve completely omitted the problem of overlapping edits. We’re working with a remarkably simple data model, a string. Even so, what if Alice, Bob, and Carol each make edits that don’t conflict with each other when compared individually: Can we guarantee that we can apply them in any order and not end up with a conflict?

And if we imagine trying to use these techniques to maintain consistency while multiple users edit a complex data structure with internal references, things get complicated. For example, what if we have users, each of whom has multiple addresses, and one person deletes an address that another person is editing? What happens then?

Our algorithm skipped over undos. Are undo queues local? Or can you undo an edit another user makes?6

OT relies on making a very careful analysis of the different kinds of edits that can be made, and determining exactly how to transform them when prepended by any other edit. Even then, it is hairy.

Recognizing this, people have come up with other mechanisms for distributing edits. Mapping commands 1-1 with user actions is necessary for undo. But it is hard to infer user intentions from their actions: What if instead of selecting a word and replacing it with another, Alice backspaces five times and then types four letters. Is that nine edits? Two edits? Or one?

And it may not be necessary for us to infer actions to synchronize documents. We can, for example, regularly take a diff of the document and send that off to be synchronized. That’s the Differential Synchronization algorithm, and it’s how Google Docs originally worked when Google acquired Writely:
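To give a feel for working from diffs rather than from recorded user actions, here is a deliberately naive sketch. It is not the Differential Synchronization algorithm itself: the hypothetical `diff` function only trims the common prefix and suffix of two strings and reports whatever is left as a single replacement.

```javascript
// Hypothetical, naive diff: describe the change from `before` to `after`
// as one replacement covering everything between the shared prefix
// and the shared suffix.
const diff = (before, after) => {
  let from = 0;
  while (from < before.length && from < after.length &&
         before[from] === after[from]) from++;

  let to = before.length;
  let end = after.length;
  while (to > from && end > from && before[to - 1] === after[end - 1]) {
    to--;
    end--;
  }

  return { replacement: after.slice(from, end), from, to };
};

const before = "The quick brown fox jumped over the lazy dog";
const after  = "The fast brown fox jumped over the lazy dog";

const change = diff(before, after);
console.log(change); // { replacement: 'fast', from: 4, to: 9 }
```

However Alice got from "quick" to "fast", whether by selecting and typing or by backspacing and retyping, the snapshot comparison reports the same single change.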

At its heart, though, we’re still dealing with the idea that we don’t just treat physical entities–nouns–as our software entities. We also model changes as first-class entities that can be stored, queried, and edited.

Part III: Commands, More Useful Now Than Ever

Working with distributed changes is now a very, very big problem space. Software is no longer living on one device. We chat, we have distributed sessions, we demand eventual consistency from our data.

Everything we do in these areas requires treating changes as first-class entities.

“There are only two hard problems in Computer Science: Cache invalidation, and naming things.”–Phil Karlton

What if we take the names of our Buffer class:

And change them:

Does this look familiar? We’ve discussed reordering time for an individual user, and we’ve discussed synchronizing changes across distributed users. But we now write software that puts control of cause and effect in the hands of distributed users as well.

Being able to fork repositories, cherry-pick changes to apply, and merge (or rebase) changes is another aspect of the same concept: Changes as first-class entities. What new user models can we develop if we take that kind of thinking to other kinds of software?

Will there one day be a version of PowerPoint that allows someone to submit a pull request to a presentation?7 If there is, it will be because somebody modeled presentations as commands rather than as big binary data blobs.

Getting back to OT and DS, synchronizing data is far more than supporting simultaneous document editing. Database systems often model transactions as commands or collections of commands, and use various types of protocols to permit the commands to execute in parallel without blocking each other.

Replicated data stores use distributed algorithms built out of commands to propagate changes and guarantee consistency.

And synchronizing data is far more than distributed editing applications and databases. We are in a world where people expect their documents and applications to sync everything, all the time, over unreliable channels.

This is no longer a special feature of specialized applications. It’s the new normal.

So back to the Command Pattern. Sure, it’s twenty years old. Sure, undoing user edits is well-understood. But we should never look at a pattern and think that because we understand the example use case for the pattern, we understand everything about the pattern.

For the command pattern, undo is the example, but treating invocations as first-class entities that can be stored, queried, and transformed is the underlying idea. And the opportunity to use that idea has never been greater.


image credits

https://www.flickr.com/photos/fatedenied/7335413942 https://www.flickr.com/photos/fatedenied/7335413942 https://www.flickr.com/photos/mwichary/2406482529 https://www.flickr.com/photos/tompagenet/8580371564 https://www.flickr.com/photos/ooocha/2869485136 https://www.flickr.com/photos/oskay/2550938136 https://www.flickr.com/photos/baccharus/4474584940 https://www.flickr.com/photos/micurs/4906349993 https://www.flickr.com/photos/purdman1/2875431305 https://www.flickr.com/photos/daryl_mitchell/15427050433 https://www.flickr.com/photos/the00rig/3753005997 https://www.flickr.com/photos/robbie1/8656027235 https://www.flickr.com/photos/mwichary/2406489333 https://www.flickr.com/photos/pedrosimoes7/17386505158 https://www.flickr.com/photos/a-barth/2846621384 https://www.flickr.com/photos/mleung311/9468927282 https://www.flickr.com/photos/bludgeoner86/5590795033 https://www.flickr.com/photos/49024304@N00/ https://www.flickr.com/photos/29143375@N05/4575806708 https://www.flickr.com/photos/30239838@N04/4268147953 https://www.flickr.com/photos/benetd/4429314827 https://www.flickr.com/photos/shimgray/2811100997 https://www.flickr.com/photos/wordridden/4308645407 https://www.flickr.com/photos/sidelong/18620995913 https://www.flickr.com/photos/stawarz/3848824508 https://www.flickr.com/photos/mwichary/3338901313

notes
  1. This is vaguely related to working with promises in JavaScript, although we won’t explore that as this is decidedly not a talk about JavaScript, it’s a talk in JavaScript. 

  2. There are other problems, like overlapping commands, but this is enough to move along and illustrate the kind of thinking we need to do with first-class invocations. 

  3. Another problem is that we have made a massive number of hand waves. We only correctly handle edits that do not overlap. We’ll talk about this a little later. 

  4. Many programmers will take a strong exception to using this.has(theirEdit) || this.append(theirEdit) as a control-flow construct. Mind you, most people don’t have to make a method fit on a slide. Be more explicit in real code. 

  5. Furiously hand-waving over edits that overlap, of course. Not to mention pesky protocol issues like unreliable communication channels. 

  6. And the “overlapping edits” question applies to undos. Consider what happens if Bob inserts the word co-operation, and Alice edits it to the more literary coöperation. Now Bob hits undo, expecting the word he just typed to vanish. What happens? 

  7. In fact, this presentation was written in Markdown and presented using DeckSet, precisely because this affords using git to manipulate its history. 

https://raganwald.com/2016/01/19/command-pattern
This is not an essay about 'Traits in Javascript' (updated)

Scott Penkava, Untitled (Portrait of Felix in NY), 2009

A trait is a concept used in object-oriented programming: a trait represents a collection of methods that can be used to extend the functionality of a class. Essentially a trait is similar to a class made only of concrete methods that is used to extend another class with a mechanism similar to multiple inheritance, but paying attention to name conflicts, hence with some support from the language for a name-conflict resolution policy to use when merging.—Wikipedia

A trait is like a mixin, however with a trait, we can not just define new behaviour, but also define ways to extend or override existing behaviour. Traits are a first-class feature of languages like Scala. Traits are also available as a standard library in languages like Racket. Most interestingly, traits are a feature of the Self programming language, one of the inspirations for JavaScript.

Traits are not a JavaScript feature as this essay is being written, but we can easily make lightweight traits out of the features JavaScript already has.

Our problem is that we want to be able to override or extend functionality from shared behaviour, whether that shared behaviour is defined as a class or as functionality to be mixed in.

our toy problem

Here’s a toy problem we solved elsewhere with a subclass factory that in turn is made out of an extremely simple mixin.1

To recapitulate from the very beginning, we have a Todo class:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }

  do () {
    this.done = true;
    return this;
  }

  undo () {
    this.done = false;
    return this;
  }

  toHTML () {
    return this.name; // highly insecure
  }
}

And we have the idea of “things that are coloured:”

let toSixteen = (c) => '0123456789ABCDEF'.indexOf(c),
    toTwoFiftyFive = (cc) => toSixteen(cc[0]) * 16 + toSixteen(cc[1]);

class Coloured {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }

  luminosity () {
    let {r, g, b} = this.getColourRGB();

    return 0.21 * toTwoFiftyFive(r) +
           0.72 * toTwoFiftyFive(g) +
           0.07 * toTwoFiftyFive(b);
  }

  getColourRGB () {
    return this.colourCode;
  }
}
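The luminosity arithmetic is easy to check by hand. The sketch below lifts the two helpers out of the class and applies the same weights to yellow; the standalone `luminosity` function is a rearrangement made for illustration, not a replacement for the method above.

```javascript
const toSixteen = (c) => '0123456789ABCDEF'.indexOf(c),
      toTwoFiftyFive = (cc) => toSixteen(cc[0]) * 16 + toSixteen(cc[1]);

// Same weights as the luminosity method, applied to a plain {r, g, b} object.
const luminosity = ({r, g, b}) =>
  0.21 * toTwoFiftyFive(r) +
  0.72 * toTwoFiftyFive(g) +
  0.07 * toTwoFiftyFive(b);

console.log(toTwoFiftyFive('FF')); // 15 * 16 + 15 = 255
console.log(toTwoFiftyFive('80')); //  8 * 16 +  0 = 128

// Yellow: 0.21 * 255 + 0.72 * 255 + 0.07 * 0, roughly 237.15
console.log(luminosity({r: 'FF', g: 'FF', b: '00'}));
```

One thing to watch for: toSixteen looks characters up in an uppercase string, so lowercase hex digits such as 'ff' are not found and produce wrong values. The colour constants in this essay are all uppercase, which avoids the problem.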

And we want to create a time-sensitive to-do that has colour according to whether it is overdue, close to its deadline, or has plenty of time left. If we had multiple inheritance, we would write:

let yellow = {r: 'FF', g: 'FF', b: '00'},
    red    = {r: 'FF', g: '00', b: '00'},
    green  = {r: '00', g: 'FF', b: '00'},
    grey   = {r: '80', g: '80', b: '80'};

let oneDayInMilliseconds = 1000 * 60 * 60 * 24;

class TimeSensitiveTodo extends Todo, Coloured {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

But we don’t have multiple inheritance. In languages where mixing in functionality is difficult, we can fake a solution by having ColouredTodo inherit from Todo:

class ColouredTodo extends Todo {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }

  luminosity () {
    let {r, g, b} = this.getColourRGB();

    return 0.21 * toTwoFiftyFive(r) +
           0.72 * toTwoFiftyFive(g) +
           0.07 * toTwoFiftyFive(b);
  }

  getColourRGB () {
    return this.colourCode;
  }
}

class TimeSensitiveTodo extends ColouredTodo {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

The drawback of this approach is that we can no longer make other kinds of things “coloured” without making them also todos. For example, if we had coloured meetings in a time management application, we’d have to write:

class Meeting {
  // ...
}

class ColouredMeeting extends Meeting {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }

  luminosity () {
    let {r, g, b} = this.getColourRGB();

    return 0.21 * toTwoFiftyFive(r) +
           0.72 * toTwoFiftyFive(g) +
           0.07 * toTwoFiftyFive(b);
  }

  getColourRGB () {
    return this.colourCode;
  }
}

This forces us to duplicate “coloured” functionality throughout our code base. But thanks to mixins, we can have our cake and eat it too: We can make ColouredAsWellAs a kind of mixin that makes a new subclass and then mixes into the subclass. We call this a “subclass factory:”

function ClassMixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    return clazz;
  }
}

const SubclassFactory = (behaviour) =>
  (superclazz) => ClassMixin(behaviour)(class extends superclazz {});

const ColouredAsWellAs = SubclassFactory({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  luminosity () {
    let {r, g, b} = this.getColourRGB();

    return 0.21 * toTwoFiftyFive(r) +
           0.72 * toTwoFiftyFive(g) +
           0.07 * toTwoFiftyFive(b);
  },

  getColourRGB () {
    return this.colourCode;
  }
});

class TimeSensitiveTodo extends ColouredAsWellAs(Todo) {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

This allows us to override both our Todo methods and the ColouredAsWellAs methods. And elsewhere, we can write:

const ColouredMeeting = ColouredAsWellAs(Meeting);

Or perhaps:

class TimeSensitiveMeeting extends ColouredAsWellAs(Meeting) {
  // ...
}

To summarize, our problem is that we want to be able to override or extend functionality from shared behaviour, whether that shared behaviour is defined as a class or as functionality to be mixed in. Subclass factories are one way to solve that problem.
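A small runnable sketch of why the subclass factory permits overriding: the mixed-in methods live on an intermediate anonymous subclass, so our own class can override them and still reach them through super. The class GreyWhenDoneTodo is hypothetical, and Todo and the behaviour are trimmed down to keep the example short.

```javascript
function ClassMixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    return clazz;
  };
}

const SubclassFactory = (behaviour) =>
  (superclazz) => ClassMixin(behaviour)(class extends superclazz {});

const ColouredAsWellAs = SubclassFactory({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  getColourRGB () {
    return this.colourCode;
  }
});

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }

  do () {
    this.done = true;
    return this;
  }
}

const grey = {r: '80', g: '80', b: '80'};

// Hypothetical subclass: completed todos are grey, everything else
// defers to the mixed-in getColourRGB through super.
class GreyWhenDoneTodo extends ColouredAsWellAs(Todo) {
  getColourRGB () {
    return this.done ? grey : super.getColourRGB();
  }
}

const todo = new GreyWhenDoneTodo('write essay')
  .setColourRGB({r: 'FF', g: '00', b: '00'});

console.log(todo.getColourRGB());      // the red we set above
console.log(todo.do().getColourRGB()); // grey, because the todo is done
```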

Now we’ll solve the same problem with traits.

defining lightweight traits

Let’s start with our ClassMixin. We’ll modify it slightly to insist that it never attempt to define a method that already exists, and we’ll use that to create Coloured, a function that defines three methods:

function Define (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function define (clazz) {
    for (let property of instanceKeys)
      if (!clazz.prototype[property]) {
        Object.defineProperty(clazz.prototype, property, {
          value: behaviour[property],
          writable: true
        });
      }
      else throw `illegal attempt to override ${property}, which already exists.`;
    return clazz;
  }
}

const Coloured = Define({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  luminosity () {
    let {r, g, b} = this.getColourRGB();

    return 0.21 * toTwoFiftyFive(r) +
           0.72 * toTwoFiftyFive(g) +
           0.07 * toTwoFiftyFive(b);
  },

  getColourRGB () {
    return this.colourCode;
  }
});

Coloured is now a function that modifies a class, adding three methods provided that they don’t already exist in the class.

But we need a variation that “overrides” getColourRGB. We can write a variation of Define that always overrides the target’s methods, and passes in the original method as the first parameter. This is similar to “around” method advice:

function Override (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function overrides (clazz) {
    for (let property of instanceKeys)
      if (!!clazz.prototype[property]) {
        let overriddenMethodFunction = clazz.prototype[property];

        Object.defineProperty(clazz.prototype, property, {
          value: function (...args) {
            return behaviour[property].call(this, overriddenMethodFunction.bind(this), ...args);
          },
          writable: true
        });
      }
      else throw `attempt to override non-existent method ${property}`;
    return clazz;
  }
}

const DeadlineSensitive = Override({
  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  },

  toHTML (original) {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${original()}</span>`;
  }
});

Define and Override are protocols: They define whether methods may conflict, and if they do, how that conflict is resolved. Define prohibits conflicts, forcing us to pick another protocol. Override permits us to write a method that overrides an existing method and (optionally) call the original.
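Here is a compact, runnable sketch of both protocols in action. Define and Override are restated with braces and with Error objects instead of thrown strings, and Define returns the class so that the two can be nested. Greets, Shouts, and Person are hypothetical names invented for this example.

```javascript
function Define (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function define (clazz) {
    for (let property of instanceKeys) {
      if (clazz.prototype[property]) {
        throw new Error(
          `illegal attempt to override ${String(property)}, which already exists.`
        );
      }
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    }
    return clazz;
  };
}

function Override (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function overrides (clazz) {
    for (let property of instanceKeys) {
      const overridden = clazz.prototype[property];

      if (!overridden) {
        throw new Error(
          `attempt to override non-existent method ${String(property)}`
        );
      }
      Object.defineProperty(clazz.prototype, property, {
        // The original method, bound to the receiver, is the first argument.
        value: function (...args) {
          return behaviour[property].call(this, overridden.bind(this), ...args);
        },
        writable: true
      });
    }
    return clazz;
  };
}

// Hypothetical protocols, for illustration only.
const Greets = Define({
  greet () { return `hello, ${this.name}`; }
});

const Shouts = Override({
  greet (original) { return original().toUpperCase() + '!'; }
});

class Person {
  constructor (name) { this.name = name; }
}

const LoudPerson = Shouts(Greets(Person));

console.log(new LoudPerson('Alice').greet()); // HELLO, ALICE!

// Defining greet a second time is a conflict, so Define refuses.
let conflict = null;
try { Greets(LoudPerson); } catch (error) { conflict = error.message; }
console.log(conflict);
```

Both protocols modify the class they are given and hand the same class back, so LoudPerson and Person are the same class here. That is a different trade-off from the subclass factory, which leaves the original class alone.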

composing protocols

We could now write:

const TimeSensitiveTodo = DeadlineSensitive(
  Coloured(
    class TimeSensitiveTodo extends Todo {
      constructor (name, deadline) {
        super(name);
        this.deadline = deadline;
      }
    }
  )
);

Or:

@DeadlineSensitive
@Coloured
class TimeSensitiveTodo extends Todo {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }
}

But if we want to use DeadlineSensitive and Coloured together more than once, we can make a lightweight trait with simple function composition:

const pipeline =
  (...fns) =>
    (value) =>
      fns.reduce((acc, fn) => fn(acc), value);

const SensitizeTodos = pipeline(Coloured, DeadlineSensitive);

@SensitizeTodos
class TimeSensitiveTodo extends Todo {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }
}

Now SensitizeTodos combines defining methods with overriding existing methods: We’ve built a lightweight trait by composing protocols.

And that’s all a trait is: The composition of protocols. And we don’t need a bunch of new keywords or decorators (like @overrides) to do it, we just use the functional composition that is so easy and natural in JavaScript.
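Decorator syntax such as @SensitizeTodos may need a transpiler, depending on the JavaScript environment, but the composition itself needs nothing special: a composed protocol is a function from class to class and can simply be called. The sketch below uses two hypothetical stand-ins, WithColour and RedWhenLate, in place of Coloured and DeadlineSensitive so that it stays short and self-contained.

```javascript
const pipeline =
  (...fns) =>
    (value) =>
      fns.reduce((acc, fn) => fn(acc), value);

// Hypothetical stand-ins for protocols: each takes a class,
// modifies its prototype, and returns the class.
const WithColour = (clazz) => {
  clazz.prototype.getColour = function () { return 'green'; };
  return clazz;
};

const RedWhenLate = (clazz) => {
  const original = clazz.prototype.getColour;

  clazz.prototype.getColour = function () {
    return this.late ? 'red' : original.call(this);
  };
  return clazz;
};

// pipeline applies left to right, so the method is defined
// before it is overridden.
const Sensitize = pipeline(WithColour, RedWhenLate);

const Task = Sensitize(class Task {
  constructor (late) { this.late = late; }
});

console.log(new Task(false).getColour()); // green
console.log(new Task(true).getColour());  // red
```

Order matters: pipeline(WithColour, RedWhenLate) is the same as RedWhenLate(WithColour(clazz)), which is why Coloured comes before DeadlineSensitive in SensitizeTodos.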

other protocols

We can incorporate other protocols. Two of the most common are prepending behaviour to an existing method, or appending behaviour to an existing method:

function Prepend (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function prepend (clazz) {
    for (let property of instanceKeys)
      if (!!clazz.prototype[property]) {
        let overriddenMethodFunction = clazz.prototype[property];

        Object.defineProperty(clazz.prototype, property, {
          value: function (...args) {
            const prependValue = behaviour[property].apply(this, args);

            if (prependValue === undefined || !!prependValue) {
              return overriddenMethodFunction.apply(this, args);
            }
          },
          writable: true
        });
      }
      else throw `attempt to override non-existent method ${property}`;
    return clazz;
  }
}

function Append (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);

  return function append (clazz) {
    for (let property of instanceKeys)
      if (!!clazz.prototype[property]) {
        let overriddenMethodFunction = clazz.prototype[property];

        Object.defineProperty(clazz.prototype, property, {
          value: function (...args) {
            const returnValue = overriddenMethodFunction.apply(this, args);

            behaviour[property].apply(this, args);
            return returnValue;
          },
          writable: true
        });
      }
      else throw `attempt to override non-existent method ${property}`;
    return clazz;
  }
}

We can compose a lightweight trait using any combination of Define, Override, Prepend, and Append, and the composition is handled by pipeline, a plain old function composition tool.
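As a usage sketch, here are Prepend and Append guarding and auditing a method. The two protocols are restated compactly through a shared helper, `wrapExisting`, which is an invention of this sketch, as are the Lockable and Audited protocols and the `locked` property.

```javascript
// Hypothetical helper: replace each existing method named in `behaviour`
// with whatever `wrap(original, extra)` returns.
const wrapExisting = (behaviour, wrap) => (clazz) => {
  for (let property of Reflect.ownKeys(behaviour)) {
    const original = clazz.prototype[property];

    if (!original) {
      throw new Error(
        `attempt to override non-existent method ${String(property)}`
      );
    }
    Object.defineProperty(clazz.prototype, property, {
      value: wrap(original, behaviour[property]),
      writable: true
    });
  }
  return clazz;
};

// Runs the new behaviour first. The original only runs if the new
// behaviour returns undefined or something truthy.
const Prepend = (behaviour) => wrapExisting(behaviour, (original, extra) =>
  function (...args) {
    const prependValue = extra.apply(this, args);

    if (prependValue === undefined || !!prependValue) {
      return original.apply(this, args);
    }
  });

// Runs the original first, then the new behaviour, and returns
// the original's result.
const Append = (behaviour) => wrapExisting(behaviour, (original, extra) =>
  function (...args) {
    const returnValue = original.apply(this, args);

    extra.apply(this, args);
    return returnValue;
  });

const pipeline =
  (...fns) => (value) => fns.reduce((acc, fn) => fn(acc), value);

const log = [];

// Hypothetical protocols: record completions, and refuse to
// complete a locked todo.
const Audited  = Append({ do () { log.push(`did ${this.name}`); } });
const Lockable = Prepend({ do () { return !this.locked; } });

const GuardedTodo = pipeline(Audited, Lockable)(class Todo {
  constructor (name) { this.name = name; this.done = false; }
  do () { this.done = true; return this; }
});

const free = new GuardedTodo('write');
const locked = new GuardedTodo('ship');
locked.locked = true;

free.do();
locked.do();

console.log(free.done, locked.done); // true false
console.log(log);                    // [ 'did write' ]
```

Lockable is applied last so that its veto wraps the auditing too. With the order reversed, a locked todo would still be logged as done even though the original method never ran.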

Lightweight traits are nothing more than protocols, composed in a simple and easy-to-understand way. And then applied to simple classes, in a direct and obvious manner.

what do lightweight traits tell us?

Once again we have seen the strength of JavaScript: We don’t need a lot of special language features baked in, provided we are careful to make our existing features out of functions and simple objects. We can then compose them at will using simple tools to make the language features we need.

Over time, when features become popular, those features will get added to the language. But like so many other things either added to ES6 or proposed for future versions, features begin with people rolling their own tools. JavaScript makes this exceptionally easy.

We just have to start with simple things, and combine them in simple ways.

Simplicity is the peak of civilization

the heavyweight. and the light.

When employing a new approach, like traits, there are two ways to do it. The heavyweight way, and the lightweight way.

The lightweight way, as shown here, attempts to be as “JavaScript-y” as possible. For example, using functions for protocols and composing them. With the lightweight way, everything is still just a function, or just an object, or just a class with just a prototype. Lightweight code interoperates 100% with code from other libraries. Lightweight approaches can be incrementally added to an existing code base, refactoring a bit here and a bit there.

The heavyweight way would greenspun a special class hierarchy with support for traits baked in. The heavyweight way would produce “classes” that don’t easily interoperate with other libraries or code, so you can’t incrementally make changes: You have to “boil the ocean” and commit 100% to the new approach. Heavyweight approaches often demand new kinds of tooling in the build pipeline.

When we do things the lightweight way, we make very small bets on their benefits. It’s easy to change our mind and abandon the approach in favour of something else. And because we make small bets along the way, we collect on the small benefits continuously: We don’t have to kick off a massive rewrite of our code base to start using lightweight traits, for example. We just start using them as little or as much as we like, and immediately start benefitting from them.

“A language that doesn’t affect the way you think about programming isn’t worth learning.”—Alan J. Perlis

Every tool affects the way we think about programming. But heavyweight tools force us to think about the heavyweight tooling. That thinking isn’t always portable to another tool or another code base.

Whereas lightweight tools are simple things, composed together in simple ways. If we move to a different code base or tool, we can take our experience with the simple things along. With lightweight traits, for example, we are not teaching ourselves how to “program with traits,” we’re teaching ourselves how to “decompose behaviour,” how to “compose functions” and how to “write functions that decorate entities.”

These are all fundamental ideas that apply everywhere, even if we don’t end up applying them to build traits every time we write code. Lightweight thinking is portable and future-proof.

This essay is not, in the end, about how to write traits in JavaScript. Traits are just an example of how the lightweight approach is particularly easy in JavaScript, and an explanation of why that matters.

fin.


postscript from the author: “happy new year!”

As I write this on New Year’s Eve, 2015, I am struck by how much this essay is the same as almost every other essay I’ve written about JavaScript. That’s partly because my brain is shaped by “lightweight” thinking, and partly because JavaScript, for all of its faults, and despite attempts to write heavyweight frameworks for it, is a lightweight language.

People often say that JavaScript wants to be a functional programming language. I believe this is not the whole story: I believe JavaScript wants to be a lightweight programming language, and functions-as-first-class-entities is a deeply lightweight idea. The same is true of newer ideas like classes-as-expressions and decorators-as-functions.

But I repeat myself. Again.

Writing in this modern world is a conversation. With a language. With readers. With fellow language enthusiasts who also write. But conversations run their course. When you find yourself repeating repeating repeating yourself… Perhaps you have made your contribution and it’s time to sidle out and order another whiskey.

I will never say “never again,” but if you do not hear from me on the subject of JavaScript in the future, it is not because I have nothing to say, but rather because I think I have already tried to say it.

Thank you, and I am excited to see what you have to say, in words or in code, in 2016.

—Reginald Braithwaite, Toronto, 2015-12-31


notes:

  1. The implementations given here are extremely simple in order to illustrate a larger principle of how the pieces fit together. A production library based on these principles would handle needs we’ve seen elsewhere, like defining “class” or “static” properties, making instanceof work, and appeasing the V8 compiler’s optimizations. 

https://raganwald.com/2015/12/31/this-is-not-an-essay-about-traits-in-javascript
JavaScript Mixins, Subclass Factories, and Method Advice

Mixins solve a very common problem in class-centric OOP: For non-trivial applications, there is a messy many-to-many relationship between behaviour and classes, and it does not neatly decompose into a tree. In this essay, we touch only lightly on the benefits of using mixins with classes. Instead, we will focus on some of the limitations of mixins, and on ways to not just overcome them, but create designs that are superior to those created with classes alone.

(For more on why mixins matter in the first place, you may want to review Prototypes are Objects (and why that matters), Functional Mixins in ECMAScript 2015, and Using ES.later Decorators as Mixins.)

Crossed Wires

simple mixins

As noted above, for non-trivial applications, there is a messy many-to-many relationship between behaviour and classes. However, JavaScript’s single-inheritance model forces us to organize behaviour in trees, which can only represent one-to-many relationships.

The mixin solution to this problem is to leave classes in a single inheritance hierarchy, and to mix additional behaviour into individual classes as needed. Here’s a vastly simplified functional mixin for classes:1

function mixin (behaviour) {
  let instanceKeys = Reflect.ownKeys(behaviour);
  let typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

This is more than enough to do a lot of very good work in JavaScript, but it’s just the starting point. Here’s how we put it to work:

let BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

let Executive = BookCollector(
  class extends Person {
    constructor (title, first, last) {
      super(first, last);
      this.title = title;
    }

    fullName () {
      return `${this.title} ${super.fullName()}`;
    }
  }
);

let president = new Executive('President', 'Barack', 'Obama');

president
  .addToCollection("JavaScript Allongé")
  .addToCollection("Kestrels, Quirky Birds, and Hopeless Egocentricity");

president.collection()
  //=> ["JavaScript Allongé","Kestrels, Quirky Birds, and Hopeless Egocentricity"]
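
Because mixin also defines Symbol.hasInstance on the function it returns, instanceof can be asked of the mixin itself, not just of the class. A minimal, self-contained sketch, in which the Reader and Robot classes are hypothetical:

```javascript
function mixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (const property of instanceKeys) {
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    }
    // tag the prototype so that instances can be recognized later
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  // `x instanceof _mixin` consults this function instead of the prototype chain
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

const Reader = BookCollector(class {}); // has the behaviour mixed in
class Robot {}                          // does not

const readerIsCollector = new Reader() instanceof BookCollector;
const robotIsCollector = new Robot() instanceof BookCollector;
```

Here readerIsCollector is true and robotIsCollector is false, even though BookCollector is a plain function and appears nowhere in either prototype chain.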

multiple inheritance

If you want to mix behaviour into a class, mixins do the job very nicely. But sometimes, people want more. They want multiple inheritance. Meaning, what they really want is for class Executive to inherit from Person and from BookCollector.

What’s the difference between Executive mixing BookCollector in and Executive inheriting from BookCollector?

  1. If Executive mixes BookCollector in, the properties addToCollection and collection become own properties of Executive’s prototype. If Executive inherits from BookCollector, they don’t.

  2. If Executive mixes BookCollector in, Executive can’t override methods of BookCollector. If Executive inherits from BookCollector, it can.

  3. If Executive mixes BookCollector in, Executive can’t override methods of BookCollector, and therefore it can’t make a method that overrides a method of BookCollector and then uses super to call the original. If Executive inherits from BookCollector, it can.
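
These differences can be observed directly. The sketch below uses a compact stand-in for the simple mixin (without instanceof support), and all of the class names are hypothetical:

```javascript
// Compact stand-in for the simple mixin: copies behaviour onto the prototype.
const mixin = (behaviour) => (clazz) => {
  for (const property of Reflect.ownKeys(behaviour)) {
    Object.defineProperty(clazz.prototype, property, {
      value: behaviour[property],
      writable: true
    });
  }
  return clazz;
};

const collecting = {
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
};

// 1. Mixed in: collection becomes an own property of the class's prototype.
const MixedIn = mixin(collecting)(class {});

// 2. Mixed in: a method the class defines itself is overwritten by the mixin.
const Clobbered = mixin(collecting)(class {
  collection () { return ['mine']; }
});

// Inherited: collection lives one step up the chain and can be overridden with super.
class BookCollectorBase {
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
}
class Heir extends BookCollectorBase {}
class OverridingHeir extends BookCollectorBase {
  collection () { return super.collection().concat('(overridden)'); }
}

const owns = (clazz, key) => Object.prototype.hasOwnProperty.call(clazz.prototype, key);

const mixedInOwns = owns(MixedIn, 'collection');          // own property of the prototype
const heirOwns = owns(Heir, 'collection');                // inherited, not own
const clobbered = new Clobbered().collection();           // the mixin's version wins
const overridden = new OverridingHeir().collection();     // the subclass's version wins
```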

If JavaScript had multiple inheritance, we could extend a class with more than one superclass:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }

  do () {
    this.done = true;
    return this;
  }

  undo () {
    this.done = false;
    return this;
  }

  toHTML () {
    return this.name; // highly insecure
  }
}

class Coloured {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }

  getColourRGB () {
    return this.colourCode;
  }
}

let yellow = {r: 'FF', g: 'FF', b: '00'},
    red    = {r: 'FF', g: '00', b: '00'},
    green  = {r: '00', g: 'FF', b: '00'},
    grey   = {r: '80', g: '80', b: '80'};

let oneDayInMilliseconds = 1000 * 60 * 60 * 24;

class TimeSensitiveTodo extends Todo, Coloured {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

This hypothetical TimeSensitiveTodo extends both Todo and Coloured, and it overrides toHTML from Todo as well as overriding getColourRGB from Coloured.

Boeing Factory

subclass factories

However, JavaScript does not have “true” multiple inheritance, and therefore this code does not work. But we can simulate multiple inheritance for cases like this. The way it works is to step back and ask ourselves, “What would we do if we didn’t have mixins or multiple inheritance?”

The answer is, we’d force a square multiple inheritance peg into a round single inheritance hole, like this:

class Todo {
  // ...
}

class ColouredTodo extends Todo {
  // ...
}

class TimeSensitiveTodo extends ColouredTodo {
  // ...
}

By making ColouredTodo extend Todo, TimeSensitiveTodo can extend ColouredTodo and override methods from both. This is exactly what most programmers do, and we know that it is an anti-pattern, as it leads to duplicated class behaviour and deep class hierarchies.

But.

What if, instead of manually creating this hierarchy, we use our simple mixins to do the work for us? We can take advantage of the fact that classes are expressions, like this:

let Coloured = mixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  getColourRGB () {
    return this.colourCode;
  }
});

let ColouredTodo = Coloured(class extends Todo {});

Thus, we have a ColouredTodo that we can extend and override, but we also have our Coloured behaviour in a mixin we can use anywhere we like without duplicating its functionality in our code. The full solution looks like this:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }

  do () {
    this.done = true;
    return this;
  }

  undo () {
    this.done = false;
    return this;
  }

  toHTML () {
    return this.name; // highly insecure
  }
}

let Coloured = mixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  getColourRGB () {
    return this.colourCode;
  }
});

let ColouredTodo = Coloured(class extends Todo {});

let yellow = {r: 'FF', g: 'FF', b: '00'},
    red    = {r: 'FF', g: '00', b: '00'},
    green  = {r: '00', g: 'FF', b: '00'},
    grey   = {r: '80', g: '80', b: '80'};

let oneDayInMilliseconds = 1000 * 60 * 60 * 24;

class TimeSensitiveTodo extends ColouredTodo {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

let task = new TimeSensitiveTodo('Finish blog post', Date.now() + oneDayInMilliseconds);

task.toHTML()
  //=> <span style="color: #FFFF00;">Finish blog post</span>

The key snippet is let ColouredTodo = Coloured(class extends Todo {}); which turns behaviour into a subclass that can be extended and overridden. We can turn this pattern into a function:

let subclassFactory = (behaviour) => {
  let mixBehaviourInto = mixin(behaviour);

  return (superclazz) => mixBehaviourInto(class extends superclazz {});
}

Using subclassFactory, we wrap the class we want to extend, instead of the class we are declaring. Like this:

let subclassFactory = (behaviour) => {
  let mixBehaviourInto = mixin(behaviour);

  return (superclazz) => mixBehaviourInto(class extends superclazz {});
}

let ColouredAsWellAs = subclassFactory({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  getColourRGB () {
    return this.colourCode;
  }
});

class TimeSensitiveTodo extends ColouredAsWellAs(Todo) {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  toHTML () {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${super.toHTML()}</span>`;
  }
}

The syntax of class TimeSensitiveTodo extends ColouredAsWellAs(Todo) says exactly what we mean: We are extending our Coloured behaviour as well as extending Todo.2
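
To see what the factory actually builds, we can inspect the prototype chain. This sketch uses a compact stand-in for mixin, and the UrgentTodo class and its black default colour are hypothetical:

```javascript
// Compact stand-in for the simple mixin.
const mixin = (behaviour) => (clazz) => {
  for (const property of Reflect.ownKeys(behaviour)) {
    Object.defineProperty(clazz.prototype, property, {
      value: behaviour[property],
      writable: true
    });
  }
  return clazz;
};

const subclassFactory = (behaviour) => {
  const mixBehaviourInto = mixin(behaviour);

  // a fresh, anonymous subclass of superclazz receives the behaviour
  return (superclazz) => mixBehaviourInto(class extends superclazz {});
};

class Todo {
  constructor (name) { this.name = name || 'Untitled'; }
  toHTML () { return this.name; }
}

const ColouredAsWellAs = subclassFactory({
  getColourRGB () { return { r: '00', g: '00', b: '00' }; }
});

class UrgentTodo extends ColouredAsWellAs(Todo) {
  getColourRGB () {
    // super reaches the mixed-in behaviour on the intermediate class
    return Object.assign({}, super.getColourRGB(), { r: 'FF' });
  }
  toHTML () {
    const { r, g, b } = this.getColourRGB();

    // super skips the intermediate class and reaches Todo
    return `<span style="color: #${r}${g}${b};">${super.toHTML()}</span>`;
  }
}

const intermediate = Object.getPrototypeOf(UrgentTodo.prototype);
const intermediateExtendsTodo = Object.getPrototypeOf(intermediate) === Todo.prototype;
const html = new UrgentTodo('Ship it').toHTML();
```

The chain is UrgentTodo, then the anonymous intermediate class holding getColourRGB, then Todo. That is the hand-written ColouredTodo hierarchy from earlier, generated on demand.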

another way forward

The solution subclass factories offer is emulating inheritance from more than one superclass. That, in turn, makes it possible to override methods from our superclass as well as the behaviour we want to mix in. Which is fine, but we don’t actually want multiple inheritance!

It’s just that we’re looking at an overriding/extending methods problem, but we’re holding an inheritance-shaped hammer. So it looks like a multiple-inheritance nail. But what if we address the problem of overriding and extending methods directly, rather than indirectly via multiple inheritance?

Nail

simple overwriting with simple mixins

We start by noting that in the first pass of our mixin function, we blindly copied properties from the mixin into the class’s prototype, whether the class defined those properties or not. So if we write:

let RED        = { r: 'FF', g: '00', b: '00' },
    WHITE      = { r: 'FF', g: 'FF', b: 'FF' },
    ROYAL_BLUE = { r: '41', g: '69', b: 'E1' },
    LIGHT_BLUE = { r: 'AD', g: 'D8', b: 'E6' };

let BritishRoundel = mixin({
  shape () {
    return 'round';
  },

  roundels () {
    return [RED, WHITE, ROYAL_BLUE];
  }
})

let CanadianAirForceRoundel = BritishRoundel(class {
  roundels () {
    return [RED, WHITE, LIGHT_BLUE];
  }
});

new CanadianAirForceRoundel().roundels()
  //=> [
    {"r":"FF","g":"00","b":"00"},
    {"r":"FF","g":"FF","b":"FF"},
    {"r":"41","g":"69","b":"E1"}
  ]

Our CanadianAirForceRoundel’s third stripe winds up being regular blue instead of light blue, because the roundels method from the mixin BritishRoundel overwrites its own. (Yes, this is a ridiculous example, but it gets the point across.)

We can fix this by not overwriting a property if the class already defines it. That’s not so hard:

function mixin (behaviour) {
  let instanceKeys = Reflect.ownKeys(behaviour);
  let typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (let property of instanceKeys)
      if (!clazz.prototype.hasOwnProperty(property)) {
        Object.defineProperty(clazz.prototype, property, {
          value: behaviour[property],
          writable: true
        });
      }
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

Now we can override roundels in CanadianAirForceRoundel while mixing shape in just fine:

new CanadianAirForceRoundel().roundels()
  //=> [
    {"r":"FF","g":"00","b":"00"},
    {"r":"FF","g":"FF","b":"FF"},
    {"r":"AD","g":"D8","b":"E6"}
  ]

The method defined in the class is now the “definition of record,” just as we might expect. But it’s not enough in and of itself.

combining advice with simple mixins

The above adjustment to ‘mixin’ is fine for simple overwriting, but what about when we wish to modify or extend a method’s behaviour while still invoking the original? Recall that our TimeSensitiveTodo example performed a simple override of getColourRGB, but its implementation of toHTML used super to invoke the method it was overriding.

Our adjustment will not allow a method in the class to invoke the body of a method in a mixin. So we can’t use it to implement TimeSensitiveTodo. For that, we need a different tool, method advice.

Method advice is a powerful tool in its own right: It allows us to compose method functionality in a declarative way. Here’s a simple “override” function that decorates a class:

let override = (behaviour, ...overriddenMethodNames) =>
  (clazz) => {
    if (typeof behaviour === 'string') {
      behaviour = clazz.prototype[behaviour];
    }
    for (let overriddenMethodName of overriddenMethodNames) {
      let overriddenMethodFunction = clazz.prototype[overriddenMethodName];

      Object.defineProperty(clazz.prototype, overriddenMethodName, {
        value: function (...args) {
          return behaviour.call(this, overriddenMethodFunction.bind(this), ...args);
        },
        writable: true
      });
    }
    return clazz;
  };

It takes behaviour in the form of a name of a method or a function, and one or more names of methods to override. It overrides each of the methods with the behaviour, which is invoked with the overridden method’s function as the first argument.

This allows us to invoke the original without needing to use super. And although we don’t show all the other use cases here, it is handy for far more than overriding mixin methods: It can be used to decompose methods into separate responsibilities.
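
Before applying override to our todo example, here is a smaller sketch in which the behaviour is a function rather than the name of a method. The shout advice and the Greeter class are hypothetical, and this variant resolves a method name into a local constant instead of reassigning the behaviour parameter:

```javascript
const override = (behaviour, ...overriddenMethodNames) =>
  (clazz) => {
    // resolve a method name to the function it denotes on this class
    const advice = typeof behaviour === 'string' ? clazz.prototype[behaviour] : behaviour;

    for (const overriddenMethodName of overriddenMethodNames) {
      const overriddenMethodFunction = clazz.prototype[overriddenMethodName];

      Object.defineProperty(clazz.prototype, overriddenMethodName, {
        value: function (...args) {
          // the advice receives the original, already bound to the receiver
          return advice.call(this, overriddenMethodFunction.bind(this), ...args);
        },
        writable: true
      });
    }
    return clazz;
  };

// Hypothetical advice: post-processes whatever the original returns.
const shout = function (original, ...args) {
  return original(...args).toUpperCase() + '!';
};

const Greeter = override(shout, 'greet', 'farewell')(
  class {
    constructor (name) { this.name = name; }
    greet (salutation) { return `${salutation}, ${this.name}`; }
    farewell () { return `goodbye, ${this.name}`; }
  }
);

const greeter = new Greeter('Ada');
const greeting = greeter.greet('hello');
const parting = greeter.farewell();
```

One piece of advice decorates two methods, and neither method needs to know that it has been wrapped.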

Using override, we can decorate methods with any arbitrary functionality. We’d use it like this:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }

  do () {
    this.done = true;
    return this;
  }

  undo () {
    this.done = false;
    return this;
  }

  toHTML () {
    return this.name; // highly insecure
  }
}

let Coloured = mixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },

  getColourRGB () {
    return this.colourCode;
  }
});

let yellow = {r: 'FF', g: 'FF', b: '00'},
    red    = {r: 'FF', g: '00', b: '00'},
    green  = {r: '00', g: 'FF', b: '00'},
    grey   = {r: '80', g: '80', b: '80'};

let oneDayInMilliseconds = 1000 * 60 * 60 * 24;

let TimeSensitiveTodo = override('wrapWithColour', 'toHTML')(
  Coloured(
    class extends Todo {
      constructor (name, deadline) {
        super(name);
        this.deadline = deadline;
      }

      getColourRGB () {
        let slack = this.deadline - Date.now();

        if (this.done) {
          return grey;
        }
        else if (slack <= 0) {
          return red;
        }
        else if (slack <= oneDayInMilliseconds){
          return yellow;
        }
        else return green;
      }

      wrapWithColour (original) {
        let rgb = this.getColourRGB();

        return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${original()}</span>`;
      }
    }
  )
);

let task = new TimeSensitiveTodo('Finish blog post', Date.now() + oneDayInMilliseconds);

task.toHTML()
  //=> <span style="color: #FFFF00;">Finish blog post</span>

With this solution, we’ve used our revamped mixin function to support getColourRGB overriding the mixin’s definition, and we’ve used override to support wrapping functionality around the original toHTML method.

As a final bonus, if we are using a transpiler that supports ES.who-knows-when, we can use the proposed class decorator syntax:

@override('wrapWithColour', 'toHTML')
@Coloured
class TimeSensitiveTodo extends Todo {
  constructor (name, deadline) {
    super(name);
    this.deadline = deadline;
  }

  getColourRGB () {
    let slack = this.deadline - Date.now();

    if (this.done) {
      return grey;
    }
    else if (slack <= 0) {
      return red;
    }
    else if (slack <= oneDayInMilliseconds){
      return yellow;
    }
    else return green;
  }

  wrapWithColour (original) {
    let rgb = this.getColourRGB();

    return `<span style="color: #${rgb.r}${rgb.g}${rgb.b};">${original()}</span>`;
  }
}

This is extremely readable.

A Touch of Light

method advice beyond extending mixin methods

override in and of itself is not spectacular. But most functionality that extends the behaviour of a method doesn’t process the result of the original. Most extensions do some work before the method is invoked, or do some work after the method is invoked.

So in addition to override, our toolbox should include before and after method advice. before invokes the behaviour first, and if its return value is undefined or truthy, it invokes the decorated method:

let before = (behaviour, ...decoratedMethodNames) =>
  (clazz) => {
    if (typeof behaviour === 'string') {
      behaviour = clazz.prototype[behaviour];
    }
    for (let decoratedMethodName of decoratedMethodNames) {
      let decoratedMethodFunction = clazz.prototype[decoratedMethodName];

      Object.defineProperty(clazz.prototype, decoratedMethodName, {
        value: function (...args) {
          let behaviourValue = behaviour.apply(this, args);

          if (behaviourValue === undefined || !!behaviourValue)
             return decoratedMethodFunction.apply(this, args);
        },
        writable: true
      });
    }
    return clazz;
  };

before should be used to decorate methods with setup or validation behaviour. Its “partner” is after, a decorator that runs behaviour after the decorated method is invoked:

let after = (behaviour, ...decoratedMethodNames) =>
  (clazz) => {
    if (typeof behaviour === 'string') {
      behaviour = clazz.prototype[behaviour];
    }
    for (let decoratedMethodName of decoratedMethodNames) {
      let decoratedMethodFunction = clazz.prototype[decoratedMethodName];

      Object.defineProperty(clazz.prototype, decoratedMethodName, {
        value: function (...args) {
          let decoratedMethodValue = decoratedMethodFunction.apply(this, args);

          behaviour.apply(this, args);
          return decoratedMethodValue;
        },
        writable: true
      });
    }
    return clazz;
  };
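
Here is a sketch of the two working together, with before acting as a validation guard and after recording what happened. The Account class and its events log are hypothetical, and both decorators are restated compactly so that the example runs on its own:

```javascript
const before = (behaviour, ...decoratedMethodNames) => (clazz) => {
  const advice = typeof behaviour === 'string' ? clazz.prototype[behaviour] : behaviour;

  for (const name of decoratedMethodNames) {
    const decorated = clazz.prototype[name];

    Object.defineProperty(clazz.prototype, name, {
      value: function (...args) {
        const adviceValue = advice.apply(this, args);

        // undefined or truthy: carry on; any other falsy value skips the method
        if (adviceValue === undefined || !!adviceValue) {
          return decorated.apply(this, args);
        }
      },
      writable: true
    });
  }
  return clazz;
};

const after = (behaviour, ...decoratedMethodNames) => (clazz) => {
  const advice = typeof behaviour === 'string' ? clazz.prototype[behaviour] : behaviour;

  for (const name of decoratedMethodNames) {
    const decorated = clazz.prototype[name];

    Object.defineProperty(clazz.prototype, name, {
      value: function (...args) {
        const value = decorated.apply(this, args);

        advice.apply(this, args); // runs for its side effects only
        return value;
      },
      writable: true
    });
  }
  return clazz;
};

const events = [];

const Account = after('audit', 'deposit')(
  before('isPositive', 'deposit')(
    class {
      constructor () { this.balance = 0; }
      deposit (amount) { this.balance += amount; return this.balance; }
      isPositive (amount) { return amount > 0; }
      audit (amount) { events.push(`deposit(${amount}) -> ${this.balance}`); }
    }
  )
);

const account = new Account();
const accepted = account.deposit(50); // the guard passes
const rejected = account.deposit(-5); // the guard returns false, so deposit is skipped
```

Note that audit still runs for the rejected deposit: after wraps the guarded method, so it sees every call, whether or not the guard let it through.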

With before, after, and override in hand, we have several advantages over traditional method overriding. First, before and after do a better job of declaring our intent when decomposing behaviour. And second, method advice allows us to add behaviour to multiple methods at once, focusing responsibility for cross-cutting concerns, like this:

const mustBeLoggedIn = () => {
  if (currentUser() == null)
    throw new PermissionsException("Must be logged in!");
};

// a function expression rather than an arrow function, so that `this` is the receiver
const mustBeMe = function () {
  if (currentUser() == null || !currentUser().person().equals(this))
    throw new PermissionsException("Must be me!");
};

@before(mustBeMe, 'setName', 'setAge', 'age')
@before(mustBeLoggedIn, 'fullName')
@HasAge
class Person {

  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

Mixins allow us to have a many-to-many relationship between behaviour and classes. Method advice is similar: It makes a many-to-many relationship between behaviour and methods particularly easy to declare.
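
The same idea works today without decorator syntax, since a class decorator is just a function applied to a class. A minimal sketch, in which the loggedIn flag stands in for a real session and is purely hypothetical:

```javascript
const before = (behaviour, ...decoratedMethodNames) => (clazz) => {
  for (const name of decoratedMethodNames) {
    const decorated = clazz.prototype[name];

    Object.defineProperty(clazz.prototype, name, {
      value: function (...args) {
        const behaviourValue = behaviour.apply(this, args);

        if (behaviourValue === undefined || !!behaviourValue) {
          return decorated.apply(this, args);
        }
      },
      writable: true
    });
  }
  return clazz;
};

// Hypothetical session state and guard.
let loggedIn = false;

const mustBeLoggedIn = function () {
  if (!loggedIn) throw new Error('Must be logged in!');
};

// One guard, declared once, protecting two methods.
const Person = before(mustBeLoggedIn, 'setName', 'fullName')(
  class {
    setName (first, last) {
      this.firstName = first;
      this.lastName = last;
      return this;
    }
    fullName () {
      return this.firstName + ' ' + this.lastName;
    }
  }
);

const person = new Person();
let refused = false;

try {
  person.setName('Grace', 'Hopper');
} catch (error) {
  refused = true; // the guard threw before setName could run
}

loggedIn = true;
const name = person.setName('Grace', 'Hopper').fullName();
```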

Once we are using mixins and method advice on a regular basis, we reach for them, rather than for superclasses, whenever we need to share behaviour. Superclasses are then relegated to those cases where we need to build behaviour into the constructor.

wrapping up

A simple mixin can cover many cases, but when we wish to override or extend method behaviour, we need to either use the subclass factory pattern or incorporate method advice. Method advice offers benefits above and beyond overriding mixin methods, especially if we use before and after in addition to override.

That being said, subclass factories are most convenient if we are comfortable with hierarchies of superclasses and with using super to extend method behaviour. Subclass factories work best when we don’t have a lot of behaviour that needs to be shared between different methods.

Method advice permits us to use a simpler approach to mixins. It makes it easy to have a many-to-many relationship between methods and behaviour, and easy to factor the responsibility for cross-cutting concerns out of each method.

No matter which approach we use, we find ourselves needing shallower and shallower class hierarchies when we use mixins to their fullest. Which demonstrates the power of working with simple constructs (like mixins and decorators) in JavaScript: We do not need nearly as much of the heavyweight OOP apparatus borrowed from 30 year-old languages, we just need to use the language we already have, in ways that cut with its grain.


notes:

  1. A production-ready implementation would handle more than just methods. For example, it would allow you to mix getters and setters into a class, and it would allow us to attach properties or methods to the target class itself, and not just instances. But this simplified version handles methods, simple properties, “mixin properties,” and instanceof, and that is enough for the purposes of investigating OO design questions. 

  2. Justin Fagnani named this pattern “subclass factory” in his essay “Real” Mixins with JavaScript Classes. It’s well worth a read, and his implementation touches on other matters such as optimizing performance on modern JavaScript engines. 

https://raganwald.com/2015/12/28/mixins-subclass-factories-and-method-advice
super() considered hmmm-ful

Threat Display

I highly recommend reading Justin Fagnani’s “Real” Mixins with JavaScript Classes. To summarize my understanding, Justin likes using “mixins,” but takes issue with the way they are implemented as described in things like Using ES7 Decorators as Mixins.1

  • Justin wants to be able to have a fully open many-to-many relationship between meta-objects and objects.
  • Justin also wants to have mixins be much more equivalent to classes, especially with respect to being able to override a mixin’s method, and to be able to invoke the mixin’s original definition within an overridden method, just as you can invoke a superclass’s definition of a method from within a class’s method.
  • Finally, Justin wants to create code that existing engines can optimize easily, and avoid changing the “shape” of prototypes.

One of the things I like the most about Justin’s article is that it shines a light on two longstanding debates in OOP, both going back at least as far as Smalltalk. The first is about deep class hierarchies. My opinion can be expressed in three words: Don’t do that! Just about everyone agrees that flattened hierarchies are superior to deep hierarchies, especially when the deep hierarchies are an accidental complexity created by trying to fake a many-to-many relationship using a tree.

The second debate is more subtle, and it concerns overriding methods. It’s a massive oversimplification to suggest that there are only two sides to that debate, but for the purpose of this discussion, there are two different OOP tribes. One of them is called virtual-by-default, and the other is called final-by-default.

virtual-by-default

In languages like Smalltalk and almost every other “dynamically typed” OO descendant, including JavaScript, you can override any method at any level in the class hierarchy. In languages like JavaScript and Ruby, you can even override a method within a single object.

When the method is invoked on an object, the most-specific version of the method is invoked. The other versions are available by various means, from denoting them by absolute name (e.g. SomeSuperclassName.prototype.foo.call(this, 'bar', 'baz')) to using a magic keyword, super, e.g. super('bar', 'baz') in most languages, or super.foo('bar', 'baz') in JavaScript ES6.2
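
In JavaScript, the two styles look like this. The Base, ByName, and BySuper classes are hypothetical and exist only to put the two spellings side by side:

```javascript
class Base {
  foo (a, b) {
    return `Base.foo(${a}, ${b})`;
  }
}

class ByName extends Base {
  foo (a, b) {
    // denote the overridden version by its absolute name
    return 'ByName > ' + Base.prototype.foo.call(this, a, b);
  }
}

class BySuper extends Base {
  foo (a, b) {
    // let super find the overridden version for us
    return 'BySuper > ' + super.foo(a, b);
  }
}

const byName = new ByName().foo('bar', 'baz');
const bySuper = new BySuper().foo('bar', 'baz');
```

Both reach the same Base implementation. The absolute name hard-codes the superclass into the method body, while super follows whatever the class happens to extend.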

The canonical name for this is Dynamic Dispatch, because the method invocation is dynamically dispatched to the most appropriate method implementation. Such methods or functions are often called virtual functions, and thus a language where methods are automatically virtual is called “virtual-by-default.”

JavaScript out of the box is very definitely virtual-by-default. The technical opposite of a virtual-by-default language is a static-by-default language. In a static-by-default language, no matter whether the function is overridden or not, the implementation to be used is chosen at compile time based on the declared class of the receiver.

For example, making up our own little JavaScript flavour that has manifest typing:

class Foo {
  toString () {
    return "foo";
  }
}

class Bar extends Foo {
  toString () {
    return "bar";
  }
}

Foo f = new Bar();
console.log(f.toString());

In a virtual-by-default language, the console logs bar. In a static-by-default language, the console logs foo, because even though the object f is a Bar, it is declared as a Foo, and the compiler translates f.toString() into roughly Foo.prototype.toString.call(f).
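
Real JavaScript has no declared types, but we can imitate both outcomes by hand. In this sketch, the second call is what the hypothetical static compiler would have emitted on our behalf:

```javascript
class Foo {
  toString () {
    return 'foo';
  }
}

class Bar extends Foo {
  toString () {
    return 'bar';
  }
}

const f = new Bar();

// Dynamic dispatch: the implementation is found on the object's own class.
const dynamicResult = f.toString();

// Static dispatch, imitated: the implementation is fixed to Foo's, whatever f really is.
const staticResult = Foo.prototype.toString.call(f);
```

Here dynamicResult is 'bar' and staticResult is 'foo', for the very same object.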

C++ is a static-by-default language. If you want dynamic dispatching, you use a special virtual keyword to indicate which functions should be dynamically dispatched. If our hypothetical JavaScript flavour was static-by-default and we wanted toString() to be a virtual function, we would write:

class Foo {
  virtual toString () {
    return "foo";
  }
}

class Bar extends Foo {
  virtual toString () {
    return "bar";
  }
}

Foo f = new Bar();
console.log(f.toString());

After much experience with errors from forgetting to use the virtual keyword, most programming languages abandoned static-by-default and went with virtual-by-default.

final-by-default

If most languages are settled on virtual-by-default, how can there be another tribe? Well, the static-by-default people had two excellent reasons for liking static dispatch. The first was speed, and they loved speed. But as things got faster, that implementation consideration became less-and-less persuasive.

But there was another argument, a semantic argument. The argument was this. If we write:

class Foo {
  toString () {
    return "foo";
  }
}

We are defining Foo to be:

  1. A class
  2. That has a method, toString
  3. That returns "foo"

Everyone agrees on the first two points, but OO programmers are split on the third point. Some say that a Foo is defined to return "foo". Others say that it returns "foo" by default, but any subclass of Foo can override this, and it could return anything, raise an exception, or erase your hard drive and email 419 scam letters to everyone in your contacts. You can’t tell unless you examine an individual object that happens to be declared to be a Foo and see how it actually behaves.

When the Java language was released, it was virtual-by-default, but it didn’t ignore this question. Java introduced the final keyword. When a method was declared final, it was illegal to override it, and if you tried, you got a compiler error.

If our imaginary JavaScript dialect worked this way, this code would not compile at all:

class Foo {
  final toString () {
    return "foo";
  }
}

class Bar extends Foo {
  toString () {
    return "bar";
  }
}
// => Error: Method toString of superclass Foo is final and cannot be overridden.

In Java, final was the way you wrote: “This class has a method, and you can be sure that all subclasses implement this method in this way.” In Java, final was optional. So whether by intent or by sheer laziness, most Java methods in the wild are not final.
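JavaScript has no final keyword, and no compile step that could reject an override. As a minimal sketch, assuming we are willing to settle for a check at construction time instead, a class can refuse to build instances whose toString is not its own. This is our own illustration of the idea, not a language feature, and the names overrideError and letters are hypothetical.

```javascript
class Foo {
  constructor () {
    // runtime stand-in for a compile-time `final`: refuse to construct
    // any instance that would see a different toString than Foo's own
    if (this.toString !== Foo.prototype.toString) {
      throw new Error(
        "Method toString of superclass Foo is final and cannot be overridden."
      );
    }
  }

  toString () {
    return "foo";
  }
}

// overrides the "final" method, so constructing it fails
class Bar extends Foo {
  toString () {
    return "bar";
  }
}

// only adds a new method, so constructing it succeeds
class Baz extends Foo {
  toArray () {
    return this.toString().split('');
  }
}

let overrideError = null;
try {
  new Bar();
} catch (error) {
  overrideError = error.message;
}

const letters = new Baz().toArray();

console.log(overrideError);
console.log(letters); // => ['f', 'o', 'o']
```

Unlike Java’s final, the error only appears when someone actually constructs a Bar, which is later than we would like.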

But many people felt that like C++, the designers got it backwards. They felt that by default, all methods should be final. The special treatment should be for virtual methods, not for final methods.

If our dialect worked like that, all methods would be final by default, and if we wanted to allow a method to be overridden, we would use a special keyword, like default:

class Foo {
  toString () {
    return "foo";
  }
}

class Bar extends Foo {
  toString () {
    return "bar";
  }
}
// => Error: Method toString of superclass Foo is final and cannot be overridden.

class Foo {
  default toString () {
    return "foo";
  }
}

class Bar extends Foo {
  toString () {
    return "bar";
  }
}
//=> No errors, because Foo#toString is a default method.

In essence, the “final-by-default” tribe believe that methods can override each other, but that it should be rare, not common. So when a final-by-default programmer writes this code:

class Foo {
  toString () {
    return "foo";
  }
}

They are defining the behaviour of all instances of Foo. If they need to override toString later, they will come back and declare it to be default toString () { ... }. That makes it easier to reason about the behaviour of the code, because when you look at the code for Foo, you know what a Foo is and does, not what it might do but nobody-really-knows-without-reading-every-possible-subclass.

We can think of final-by-default as the paranoid fringe of the virtual-by-default tribe.

The “virtual-by-default” tribe are not impressed. They ask, “if you can’t override, what makes you think you have polymorphism?” Of course, you can have two different subclasses each implement the same method without one overriding the other. And with “dynamic” languages and duck typing, you can have completely different classes implement the same “interface” without any overriding whatsoever. Or you can do all kinds of monkeying about with private methods but always expose the same public behaviour.

In the end, the “final-by-default” people are just as OO as the “virtual-by-default” people, but they spend a lot more time trying to keep their inheritance hierarchies “clean.”

the academic basis for final-by-default

Overriding methods is often taught as a central plank of OOP. So why would there be a hardy band of dissenting final-by-default people?

The problem final-by-default tries to solve is called the Liskov Substitution Principle or “LSP.” It states that if a Bar is-a Foo, you ought to be able to take any piece of code that works for a Foo, and substitute a Bar, and it should just work.

Overriding public methods is the easiest way to break LSP. Not always, of course. If you have a HashMap, you might override the implementation of its [] and []= methods in such a way that it has the exact same external behaviour.

But in general, if you treat methods as “defaults, open to overriding in any arbitrary way,” you are abandoning LSP. Is that a bad thing? Well, many people feel that it makes object-oriented programs very difficult to reason about. Or in plain English, prone to defects.

Another principle you will hear discussed in this vein is called the Open-Closed Principle: “Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.”

In our examples above, overriding toString in Bar modifies the definition of Foo, because it changes the definition of the behaviour of objects that are instances of Foo. Whereas, if we write:

class Bar extends Foo {
  toArray () {
    return this.toString().split('');
  }
}

Now we are extending Foo for those objects that are both a Foo and a Bar, but not modifying the definition of Foo.3

The “final-by-default” tribe of OO programmers like their programs to conform to LSP and Open/Closed. This makes them nervous of language features that encourage overriding methods.

Reconfiguring the Station

mixins and final-by-default

If you’re a member of the final-by-default tribe, you don’t want a lot of overriding of methods. You don’t want mixins to blindly copy over an existing prototype’s methods, just as you don’t want a class’s methods to willy-nilly override a superclass’s methods or a mixin’s methods.

If you’re a member of the final-by-default tribe, every time you see the super keyword, you stare at it long and hard, and work out the tradeoff of convenience now versus potential for bugs down the road.

If you’re a member of the final-by-default tribe, your mixin implementation will throw an error if a mixin and a class’s method clash:

let HappyObjects = final_by_default_mixin({
  toString () {
    return "I'm a happy object!";
  }
});

@HappyObjects
class Foo {
  toString () {
    return "foo";
  }
};
// => Error: HappyObjects and Foo both define toString
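The example above uses final_by_default_mixin without showing how it might work. Here is one hypothetical implementation, offered as a sketch and not as the definitive version: it walks the class’s prototype chain (stopping before Object.prototype) and throws if any name in the mixin is already defined. The helper definesMethod and the second mixinName parameter are our own additions, and the mixin is applied as a plain function call so that the sketch runs without decorator support.

```javascript
// does clazz, or any superclass short of Object, already define name?
const definesMethod = (clazz, name) => {
  for (let proto = clazz.prototype;
       proto !== null && proto !== Object.prototype;
       proto = Object.getPrototypeOf(proto)) {
    if (Object.prototype.hasOwnProperty.call(proto, name)) return true;
  }
  return false;
};

// hypothetical implementation; mixinName exists only to label the error
const final_by_default_mixin = (behaviour, mixinName = 'mixin') =>
  (clazz) => {
    const names = Object.getOwnPropertyNames(behaviour);

    // check every name before changing anything
    for (const name of names) {
      if (definesMethod(clazz, name)) {
        throw new Error(`${mixinName} and ${clazz.name} both define ${name}`);
      }
    }
    for (const name of names) {
      Object.defineProperty(clazz.prototype, name, {
        value: behaviour[name],
        writable: true,
        configurable: true,
        enumerable: false
      });
    }
    return clazz;
  };

const HappyObjects = final_by_default_mixin({
  toString () {
    return "I'm a happy object!";
  }
}, 'HappyObjects');

class Foo {
  toString () {
    return "foo";
  }
}

let clash = null;
try {
  HappyObjects(Foo);
} catch (error) {
  clash = error.message;
}

// a class with no toString of its own mixes in without complaint
class Quiet {}
HappyObjects(Quiet);
const greeting = new Quiet().toString();

console.log(clash);    // => HappyObjects and Foo both define toString
console.log(greeting); // => I'm a happy object!
```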

Members of the final-by-default tribe want HappyObjects to describe all happy objects, and Foo to define all instances of Foo. Blindly copying methods won’t protect against naming clashes like this.

Of course, setting mixins up as subclass factories won’t do that either. With subclass factories, we would actually write something like:

let HappyObjects = subclass_factory_mixin({
  toString () {
    return 'happy';
  }
});

class HappyFoo extends HappyObjects(Object) {
  toString () {
    return `${super.toString()} foo`;
  }
};

let f = new HappyFoo();
f.toString()
  //=> happy foo

With a subclass factory, you have everything virtual-by-default and overridable-by-default. Which is fine if you aren’t a member of the final-by-default tribe.
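For the curious, here is a minimal sketch of what subclass_factory_mixin could look like. The original only shows how it is used, so this implementation is an assumption on our part: it creates an anonymous subclass of whatever superclass it is given and copies the mixin’s methods onto that subclass’s prototype.

```javascript
// hypothetical implementation of the subclass factory used above
const subclass_factory_mixin = (behaviour) =>
  (superclass) => {
    const mixedIn = class extends superclass {};

    for (const name of Object.getOwnPropertyNames(behaviour)) {
      Object.defineProperty(mixedIn.prototype, name, {
        value: behaviour[name],
        writable: true,
        configurable: true,
        enumerable: false
      });
    }
    return mixedIn;
  };

const HappyObjects = subclass_factory_mixin({
  toString () {
    return 'happy';
  }
});

class HappyFoo extends HappyObjects(Object) {
  toString () {
    // super refers to the anonymous class the factory created
    return `${super.toString()} foo`;
  }
}

const happyFoo = new HappyFoo();

console.log(happyFoo.toString()); // => happy foo
```

Because the mixin’s methods sit one step up the prototype chain, HappyFoo overrides them like any other inherited method, and nothing complains.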

there has to be a catch

So, if there are these fancy “Liskov Substitution Principles” and “Open/Closed Principles” arguing for not encouraging overriding methods, what is the catch? Why doesn’t everyone program this way?

Well, convenience. If you can’t override methods (because that modifies the meaning of the superclass or mixin), you need to do something else when you want to extend the behaviour of a superclass or mixin. For example, if you want the mixin for implementation convenience but aren’t trying to imply that a Foo is-a HappyObject, you would use delegation, like this:

class HappyObjects {
  toString () {
    return 'happy';
  }
};

class HappyFoo {
  constructor () {
    this.happiness = new HappyObjects();
  }

  toString () {
    return `${this.happiness.toString()} foo`;
  }
};

let f = new HappyFoo();
f.toString()
  //=> happy foo

A HappyFoo delegates part of its behaviour to an instance of HappyObjects that it owns. Some people find this kind of thing more trouble than it’s worth, no matter how many times they hear grizzled veterans intoning “Prefer Composition Over Inheritance.”

Another technique that final-by-default tribe members use is to focus on extending superclass methods rather than replacing them outright. Method Advice can help. In the Ruby on Rails framework, for example, you can add behaviour to existing methods that is run before, after, or around methods, without overriding the methods themselves.

In this example, decorators add behaviour to methods that could be inherited from a superclass or mixed in:

@before(validatePersonhood, 'setName', 'setAge', 'age')
@before(mustBeLoggedIn, 'fullName')
class User extends Person {
 // ...
};

Using method advice adds some semantic complexity in terms of learning what decorators like before or after might do, but encourages writing code where behaviour is extended rather than overridden. On larger and more complicated code bases, this can be a win.
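To make that concrete, here is a small sketch of how a before decorator might be written. It is a hypothetical implementation, not the one any particular library uses: it wraps each named method so that the advice runs first and the original method runs afterwards. We apply it as a plain function so the sketch runs without decorator support, and Person, log, and this version of mustBeLoggedIn are stand-ins invented for the demonstration.

```javascript
// hypothetical `before` advice: run `advice`, then the original method
const before = (advice, ...methodNames) =>
  (clazz) => {
    for (const name of methodNames) {
      const original = clazz.prototype[name]; // may be inherited

      Object.defineProperty(clazz.prototype, name, {
        value: function (...args) {
          advice.apply(this, args);
          return original.apply(this, args);
        },
        writable: true,
        configurable: true,
        enumerable: false
      });
    }
    return clazz;
  };

class Person {
  setName (name) {
    this.name = name;
    return this;
  }

  fullName () {
    return this.name;
  }
}

// stand-in advice that just records that it ran
const log = [];
function mustBeLoggedIn () {
  log.push('checked login');
}

const User = before(mustBeLoggedIn, 'fullName')(
  class User extends Person {}
);

const user = new User().setName('Ada');
const name = user.fullName();

console.log(name); // => Ada
console.log(log);  // => ['checked login']
```

Note that Person#fullName itself is never replaced: the wrapper lives on User’s prototype and calls through to the inherited method.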

People have also investigated other ways of composing metaobjects. One promising direction is traits: A trait is like a mixin, but when it is applied, there is a name resolution policy that determines whether conflicting names should override or act like method advice.

Traits are very much from the “final by default” school, but instead of simply preventing name overriding and leaving it up to the programmer to find another way forward, traits provide mechanisms for composing both metaobjects (like classes and mixins) as well as the methods they define.

Thinking, please wait

is super() considered hmmm-ful?

In JavaScript ES6, we can write super.baz() within a method baz, to denote that we wish to invoke the method it overrides. Overriding any arbitrary method and calling the overridden method when and how you like obviously provides maximum flexibility and convenience. It’s characteristic of the virtual-by-default mindset: Everything can be overridden, in any arbitrary way.
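As a quick illustration, here is super.baz() next to the explicit call through the superclass’s prototype that it roughly corresponds to. The method bazExplicit exists only for the comparison.

```javascript
class Foo {
  baz (x) {
    return `Foo#baz(${x})`;
  }
}

class Bar extends Foo {
  baz (x) {
    // invoke the method we are overriding
    return `Bar then ${super.baz(x)}`;
  }

  bazExplicit (x) {
    // roughly what super.baz(x) amounts to
    return `Bar then ${Foo.prototype.baz.call(this, x)}`;
  }
}

const bar = new Bar();

console.log(bar.baz(1));         // => Bar then Foo#baz(1)
console.log(bar.bazExplicit(1)); // => Bar then Foo#baz(1)
```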

Replacing super.baz() with method advice, for example, requires careful design, but offers an easier way to reason about the code: Looking at a Foo class, you know that subclasses of Foo might extend its methods, but you will have a higher degree of confidence in how they will behave.

The “Liskov Substitution” and “Open/Closed” principles are guidelines for writing software that is extensible and maintainable, just as “Prefer Composition over Inheritance” expresses a preference, not an ironclad rule to never inherit when you could compose.

So: Is super() considered harmful? No. Like anything else, it depends upon how you use it. Pragmatically, we shouldn’t reject all uses of super.baz() (or as noted, super() in other languages). But we can always stop for a moment and ask ourselves if it’s the best way to accomplish a particular objective. And we ought to understand the alternatives available to us.

(discuss on hacker news)


notes:

  1. I don’t treat these objections as personal criticism: They describe what Justin needs from a tool they intend to use in production, while I am giving examples of tools for the purpose of understanding how pieces of the language can fit together in extremely simple and elegant ways. 

  2. In most OO languages, if you have a class Bar that extends Foo, and both implement a method called baz, within Bar’s definition of baz, you can invoke Foo’s definition of baz with super(...). The language knows that super(...) refers to the superclass’s definition of that same method, baz. JavaScript is different. Within a method, JavaScript ES6’s super is a kind of reference to the prototype of a class’s superclass, rather than a reference to the superclass’s definition of that same method. Thus, if you have a class Bar that extends Foo, and both implement a method called baz, within Bar’s definition of baz, you can invoke Foo’s definition of baz with super.baz(...). This is roughly equivalent to writing Foo.prototype.baz.call(this, ...). There are a number of ramifications of this design choice that go beyond the scope of this post, but we will stick to discussing the aspects of this behaviour that are the same as when other languages call super(...) within a method. 

  3. As originally professed, the Open-Closed Principle had more to do with saying that a language or system should allow things to be modified by adding subclasses and so forth, while strongly discouraging changing original things. So in the late eighties and early nineties, overriding methods was in keeping with the Open/Closed Principle, because superclasses remain closed to modification. This was a good idea at the time, because it encouraged building systems that didn’t have brittle dependencies. It has since evolved to have much more in common with LSP. 

https://raganwald.com/2015/12/23/super-considered-hmmmful
An ES6 function to compute the nth Fibonacci number

Fibonacci Spiral

Once upon a time, programming interviews would include a fizz-buzz problem to weed out the folks who couldn’t string together a few lines of code. We can debate when and how such things are appropriate interview questions, but one thing that is always appropriate is to use them as inspiration for practising our own skills.1

There are various common problems offered in such a vein, including fizz-buzz itself, computing certain prime numbers, and computing Fibonacci numbers. A few years back, I had a go at writing my own Fibonacci function. When I started researching approaches, I discovered an intriguing bit of matrix math, so I learned something while practicing my skills.2

enter the matrix

One problem with calculating a Fibonacci number is that naïve algorithms require n additions. This is obviously expensive for large values of n. But of course, there are some interesting things we can do to improve on this.

In this solution, we observe that we can express the Fibonacci number F(n) using a 2x2 matrix that is raised to the power of n:

[ 1 1 ]^n     [ F(n+1) F(n)   ]
[ 1 0 ]    =  [ F(n)   F(n-1) ]

On the face of it, raising something to the power of n turns n additions into n multiplications. n multiplications sounds worse than n additions; however, there is a trick about raising something to a power that we can exploit. Let’s start by working out how to multiply matrices.

Multiplying two matrices is a little interesting if you have never seen it before:

[ a b ]       [ e f ]   [ ae + bg  af + bh ]
[ c d ] times [ g h ] = [ ce + dg  cf + dh ]
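Before simplifying anything, we can check the claim about Fibonacci numbers with a direct transcription of that rule. This is only a sanity check, written with nested arrays for the full 2x2 case; the names multiply2x2, Q, and powers are ours.

```javascript
// full 2x2 multiplication, matrices written as [[a, b], [c, d]]
const multiply2x2 = ([[a, b], [c, d]], [[e, f], [g, h]]) => [
  [a*e + b*g, a*f + b*h],
  [c*e + d*g, c*f + d*h]
];

const Q = [[1, 1], [1, 0]];

// Q raised to the powers 1 through 5 by repeated multiplication
const powers = [Q];
for (let n = 2; n <= 5; ++n) {
  powers.push(multiply2x2(powers[powers.length - 1], Q));
}

// Q to the 5th power holds F(6), F(5), and F(4)
console.log(powers[4]); // => [[8, 5], [5, 3]]
```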

Our matrices always have diagonal symmetry, so we can simplify the calculation because c is always equal to b:

[ a b ]       [ d e ]   [ ad + be  ae + bf ]
[ b c ] times [ e f ] = [ bd + ce  be + cf ]

Now we are given that we are multiplying two matrices with diagonal symmetry. Will the result have diagonal symmetry? In other words, will ae + bf always be equal to bd + ce? Remember that a = b + c at all times and d = e + f provided that each is a power of [[1,1], [1,0]]. Therefore:

ae + bf = (b + c)e + bf
        = be + ce + bf
bd + ce = b(e + f) + ce
        = be + bf + ce

That simplifies things for us, we can say:

[ a b ]       [ d e ]   [ ad + be  ae + bf ]
[ b c ] times [ e f ] = [ ae + bf  be + cf ]

And thus, we can always work with three elements instead of four. Let’s express this as operations on arrays:

[a, b, c] times [d, e, f] = [ad + be, ae + bf, be + cf]

Which we can code in JavaScript, using array destructuring:

let times = (...matrices) =>
  matrices.reduce(
    ([a,b,c], [d,e,f]) => [a*d + b*e, a*e + b*f, b*e + c*f]
  );

times([1, 1, 0]) // => [1, 1, 0]
times([1, 1, 0], [1, 1, 0]) // => [2, 1, 1]
times([1, 1, 0], [1, 1, 0], [1, 1, 0]) // => [3, 2, 1]
times([1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 0]) // => [5, 3, 2]
times([1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 0]) // => [8, 5, 3]

To get exponentiation from multiplication, we could write out a naive implementation that constructs a long array of copies of [1, 1, 0] and then calls times:

let naive_power = (matrix, n) =>
  times(...new Array(n).fill(matrix));

naive_power([1, 1, 0], 1) // => [1, 1, 0]
naive_power([1, 1, 0], 2) // => [2, 1, 1]
naive_power([1, 1, 0], 3) // => [3, 2, 1]
naive_power([1, 1, 0], 4) // => [5, 3, 2]
naive_power([1, 1, 0], 5) // => [8, 5, 3]

Very interesting, and less expensive than multiplying any two arbitrary matrices, but we are still performing n - 1 multiplications when we raise a matrix to the nth power. What can we do about that?

exponentiation with matrices

Now let’s make an observation: instead of accumulating a product by iterating over the list, let’s Divide and Conquer. Let’s take the easy case: Don’t you agree that times([1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 0]) is equal to times(times([1, 1, 0], [1, 1, 0]), times([1, 1, 0], [1, 1, 0]))?

This saves us an operation, since times([1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 0]) is implemented as:

times([1, 1, 0],
  times([1, 1, 0],
    times([1, 1, 0], [1, 1, 0])))

Whereas times(times([1, 1, 0], [1, 1, 0]), times([1, 1, 0], [1, 1, 0])) can be implemented as:

let double = times([1, 1, 0], [1, 1, 0]),
    quadruple = times(double, double);

This only requires two operations rather than three. Furthermore, this pattern is recursive. For example, naive_power([1, 1, 0], 8) requires seven operations:

times([1, 1, 0],
  times([1, 1, 0],
    times([1, 1, 0],
      times([1, 1, 0],
        times([1, 1, 0],
          times([1, 1, 0],
            times([1, 1, 0], [1, 1, 0])))))))

However, it can be formulated with just three operations:

let double = times([1, 1, 0], [1, 1, 0]),
    quadruple = times(double, double),
    octuple = times(quadruple, quadruple);

Of course, we left out how to deal with odd numbers. Fixing that also fixes how to deal with even numbers that aren’t neat powers of two:

let power = (matrix, n) => {
  if (n === 1) return matrix;

  let halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
}

power([1, 1, 0], 1) // => [1, 1, 0]
power([1, 1, 0], 2) // => [2, 1, 1]
power([1, 1, 0], 3) // => [3, 2, 1]
power([1, 1, 0], 4) // => [5, 3, 2]
power([1, 1, 0], 5) // => [8, 5, 3]

Now we can perform exponentiation of our matrices, and we take advantage of the symmetry to perform on the order of log2(n) multiplications.
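We can check that claim by counting. The sketch below is the same times and power as above, with one assumption-laden addition: a counter named multiplications that is bumped every time two matrices are multiplied. The helper countFor is also ours.

```javascript
let multiplications = 0;

const times = (...matrices) =>
  matrices.reduce(
    ([a,b,c], [d,e,f]) => {
      ++multiplications; // one matrix multiplication performed
      return [a*d + b*e, a*e + b*f, b*e + c*f];
    }
  );

const power = (matrix, n) => {
  if (n === 1) return matrix;

  const halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
};

// how many matrix multiplications does power need for a given n?
const countFor = (n) => {
  multiplications = 0;
  power([1, 1, 0], n);
  return multiplications;
};

console.log(countFor(8));    // => 3
console.log(countFor(1000)); // => 14
```

An odd n costs two multiplications at that step instead of one, so the count lands somewhere between log2(n) and twice that. For n = 1000, that is 14 multiplications where the naive approach needs 999.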

and thus to fibonacci

We can now write our complete fibonacci function:

let times = (...matrices) =>
  matrices.reduce(
    ([a,b,c], [d,e,f]) => [a*d + b*e, a*e + b*f, b*e + c*f]
  );

let power = (matrix, n) => {
  if (n === 1) return matrix;

  let halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
}

let fibonacci = (n) =>
  n < 2
  ? n
  : power([1, 1, 0], n - 1)[0];

fibonacci(62)
  // => 4052739537881

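If we’d like to work with very large numbers, JavaScript’s numbers are insufficient: they are double-precision floats, and integers above Number.MAX_SAFE_INTEGER (2^53 - 1) cannot all be represented exactly. Using a library like BigInteger.js, our solution becomes: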

import { zero, one } from 'big-integer';

let times = (...matrices) =>
  matrices.reduce(
    ([a, b, c], [d, e, f]) => [
        a.times(d).plus(b.times(e)),
        a.times(e).plus(b.times(f)),
        b.times(e).plus(c.times(f))
      ]
  );

let power = (matrix, n) => {
  if (n === 1) return matrix;

  let halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
}

let fibonacci = (n) =>
  n < 2
  ? n
  : power([one, one, zero], n - 1)[0];
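JavaScript engines have since gained a built-in BigInt type, which did not exist when this was written. Assuming a runtime that supports BigInt literals, a sketch of the same solution needs no library at all, because the ordinary * and + operators work on BigInts. The variable lossy is only there to show why plain numbers run out of road.

```javascript
// plain numbers stop being exact past 2^53
const lossy = 2 ** 53 + 1 === 2 ** 53; // true: the + 1 is lost

const times = (...matrices) =>
  matrices.reduce(
    ([a,b,c], [d,e,f]) => [a*d + b*e, a*e + b*f, b*e + c*f]
  );

const power = (matrix, n) => {
  if (n === 1) return matrix;

  const halves = power(matrix, Math.floor(n / 2));

  return n % 2 === 0
         ? times(halves, halves)
         : times(halves, halves, matrix);
};

// n stays an ordinary number; only the matrix elements are BigInts
const fibonacci = (n) =>
  n < 2
  ? BigInt(n)
  : power([1n, 1n, 0n], n - 1)[0];

console.log(lossy);                     // => true
console.log(fibonacci(100).toString()); // => 354224848179261915075
```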

Let’s stretch our wings and calculate the 19,620,614th Fibonacci number:3

fibonacci(19620614).toString()
  // =>
    29554981652302145421961363135286189884298419359021591207414
    94029404508891979849589048890639433083583865137532017734839
    03976989846431379560920380384049354648973349793444866743699
    77583527866834756211404857224578913159290369224744375346007
    69568427064684727742279727974589543524504574566687346018118
    // ...
    // ...69,490 lines elided...
    // ...
    44917243010555501044826365112091652635017254277919365055752
    92134790638460565443537453870610661665070132289987927432062
    35114452925784094009088430159367013806505734580798084002841
    21219542644844237050682927035326929321204843947060841278604
    7707601726389614978163177

(full result, as calculated in Safari)

We’re done. And this is a win over the typical recursive or even iterative solution for large numbers, because while each operation is more expensive, we only perform on the order of log2(n) operations.4

(This is a translation of a blog post written in 2008. It feels cleaner than the Ruby original, possibly because of the destructuring, and possibly because writing functions is idiomatic JavaScript, whereas refining core classes is idiomatic Ruby. What do you think? Discuss on Hacker News.)


notes:

  1. Actually, let me be candid: I just like programming, and I find it’s fun, even if I don’t magically transform myself into a 10x programming ninja through putting in 10,000 hours of practice. But practice certainly doesn’t hurt. 

  2. There is a closed-form solution to the function fib, but floating point math has some limitations you should be aware of before using it in an interview. Naturally, if you’re running into some of those limits, you would use a BigInt library such as BigInteger.js.

  3. 1962-06-14 is a number near and dear to me :-) 

  4. No, this isn’t the fastest implementation by far. But it beats the pants off of a naïve iterative implementation. 

https://raganwald.com/2015/12/20/an-es6-program-to-compute-fibonacci
Solving a Coding Problem with Iterators and Generators
a fizz-buzz problem

Job interviews sometimes contain simple programming tasks. Often called “fizz-buzz problems,” the usual purpose is to quickly weed out hopefuls who can’t actually program anything.

Here’s an example, something that might be used in a phone screen or an in-person interview with programmers early in their career: Write a merge function, that given two sorted lists, produces a sorted list containing the union of each list’s elements. For example:

merge([1, 7, 11, 17], [3, 5, 13])
  //=> [1, 3, 5, 7, 11, 13, 17]

merge([2, 3, 5, 7, 11], [2, 4, 6, 8, 10, 12, 14])
  //=> [2, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 14]

In a language with convenient array semantics, and with a reckless disregard for memory and performance, a solution is straightforward to compose:

function merge (originalA, originalB) {
  const merged = [],
        tempA = originalA.slice(0),
        tempB = originalB.slice(0);

  while (tempA.length > 0 && tempB.length > 0) {
    merged.push(
      tempA[0] < tempB[0] ? tempA.shift() : tempB.shift()
    );
  }
  return merged.concat(tempA).concat(tempB);
}

The usual hazards to navigate are cases like either array being empty or having a single element. In a follow-up discussion, an interviewer might explore why this implementation takes a beating from the ugly memory stick, and how to use indices to make it better.
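For reference, here is one way the index-based version might go. It is a sketch of the usual approach, not the only answer: instead of copying both arrays and repeatedly shifting elements off the front, we keep a cursor into each original array.

```javascript
function merge (a, b) {
  const merged = [];
  let i = 0,
      j = 0;

  // take the smaller head until one array is exhausted
  while (i < a.length && j < b.length) {
    merged.push(a[i] < b[j] ? a[i++] : b[j++]);
  }

  // at most one of these loops does any work
  while (i < a.length) merged.push(a[i++]);
  while (j < b.length) merged.push(b[j++]);

  return merged;
}

console.log(merge([1, 7, 11, 17], [3, 5, 13]));
  //=> [1, 3, 5, 7, 11, 13, 17]

console.log(merge([2, 3, 5, 7, 11], [2, 4, 6, 8, 10, 12, 14]));
  //=> [2, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 14]
```

The inputs are never copied or mutated, and each element is visited exactly once.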

taking it up a level

Sometimes, the interviewer will then move on to a follow-up that adds some complexity. Whereas the previous problem was given just to eliminate the (hopefully few) candidates who really should have been filtered out before getting an interview of any type, now we are looking for an opportunity to discuss approaches to problem solving.

Follow-up problems often incorporate a few extra elements to manage. They shouldn’t be “gotchas,” just things that require some careful consideration and the ability to juggle several problems at the same time.

For example: Write a function that given an arbitrary number of ordered streams of elements, produces an ordered stream containing the union of each stream’s elements.

let’s write it

In ECMAScript 2015, we can represent the streams we have to merge as Iterables. We’ll write a generator, a function that yields values. Our generator will take the iterables as arguments, and yield the values in the correct order to represent an ordered merge.

The skeleton will look like this:

function * merge (...iterables) {

  // setup

  while (our_iterables_are_not_done) {

    // find the iterator with the lowest value

    yield lowestIterator.next().value;
  }
}

Our first problem is handling more than two iterables. Our second is that to iterate over an iterable, we have to turn it into an iterator. That’s easy: every iterable has a method, stored under the well-known symbol Symbol.iterator, that returns a new iterator over that iterable.

function * merge (...iterables) {

  const iterators = iterables.map(
    i => i[Symbol.iterator]()
  );

  while (our_iterables_are_not_done) {

    // find the iterator with the lowest value

    yield lowestIterator.next().value;
  }
}

Our third problem is thorny: To test whether an iterator has one or more values to return, we call .next(). But doing so actually fetches the value and changes the state of the iterator. If we write:

while (iterators.some(i => !i.next().done))

We will fetch the first element of each iterator and discard it. That’s a problem. What we want is a magic iterator that lets us peek at the next element (and whether the iterator is done), while allowing us to grab the element later.
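A tiny demonstration of the problem, using an ordinary array iterator (the variable names are ours):

```javascript
const iterator = [1, 7, 11][Symbol.iterator]();

// asking "are you done?" consumes the first element
const isDone = iterator.next().done;

// array iterators are themselves iterable, so we can spread what is left
const remaining = [...iterator];

console.log(isDone);    // => false
console.log(remaining); // => [7, 11]
```

The 1 is gone for good: nothing in the iterator protocol lets us put it back.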

So let’s write an iterator adaptor class that does that:

const _iterator = Symbol('iterator');
const _peeked = Symbol('peeked');

class PeekableIterator {
  constructor (iterator) {
    this[_iterator] = iterator;
    this[_peeked] = iterator.next();
  }

  peek () {
    return this[_peeked];
  }

  next () {
    const returnValue = this[_peeked];
    this[_peeked] = this[_iterator].next();
    return returnValue;
  }
}

Our PeekableIterator class wraps around an existing iterator, but in addition to a next method that advances to the next value (if any), it also provides a peek method that doesn’t advance the iterator.
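Here is the class again, followed by a short session showing that peek can be called as often as we like without advancing, while next hands over the peeked value and moves on. The observations array is just our way of recording what we saw.

```javascript
const _iterator = Symbol('iterator');
const _peeked = Symbol('peeked');

class PeekableIterator {
  constructor (iterator) {
    this[_iterator] = iterator;
    this[_peeked] = iterator.next();
  }

  peek () {
    return this[_peeked];
  }

  next () {
    const returnValue = this[_peeked];
    this[_peeked] = this[_iterator].next();
    return returnValue;
  }
}

const peekable = new PeekableIterator([1, 7][Symbol.iterator]());

const observations = [
  peekable.peek().value, // 1
  peekable.peek().value, // still 1, peeking does not advance
  peekable.next().value, // 1, and now we advance
  peekable.peek().value, // 7
  peekable.next().value, // 7
  peekable.peek().done   // true, nothing left
];

console.log(observations); // => [1, 1, 1, 7, 7, true]
```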

Now we can back up and use PeekableIterators instead of plain iterators:

function * merge (...iterables) {

  const iterators = iterables.map(
    i => new PeekableIterator(i[Symbol.iterator]())
  );

  while (iterators.some(i => !i.peek().done)) {

    // find the iterator with the lowest value

    yield lowestIterator.next().value;
  }
}

We can also use our peek method to find the iterator with the lowest value. We’ll take our iterators, filter out any that are done, sort them according to the value we peek, and the first iterator has the lowest value:

function * merge (...iterables) {

  const iterators = iterables.map(
    i => new PeekableIterator(i[Symbol.iterator]())
  );

  while (iterators.some(i => !i.peek().done)) {

    const lowestIterator =
      iterators
        .filter(
          i => !i.peek().done
        ).sort(
          (a, b) => a.peek().value - b.peek().value
        )[0];

    yield lowestIterator.next().value;
  }
}

We’re done!

the complete solution

const _iterator = Symbol('iterator');
const _peeked = Symbol('peeked');

class PeekableIterator {
  constructor (iterator) {
    this[_iterator] = iterator;
    this[_peeked] = iterator.next();
  }

  peek () {
    return this[_peeked];
  }

  next () {
    const returnValue = this[_peeked];
    this[_peeked] = this[_iterator].next();
    return returnValue;
  }
}

function * merge (...iterables) {

  const iterators = iterables.map(
    i => new PeekableIterator(i[Symbol.iterator]())
  );

  while (iterators.some(i => !i.peek().done)) {

    const lowestIterator =
      iterators
        .filter(
          i => !i.peek().done
        ).sort(
          (a, b) => a.peek().value - b.peek().value
        )[0];

    yield lowestIterator.next().value;
  }
}

This is reasonably straightforward for programmers comfortable with iterators and generators.1 Since our merge function is a generator, we can easily iterate over its contents or spread them into an array. In fact, it’s almost interchangeable with the solution for arrays; we just need to remember to spread the result.

const primes = [2, 3, 5, 7, 11];
const evens = function * () {
  for (let n of [1, 2, 3, 4, 5, 6, 7]) {
    yield n * 2;
  }
}

for (let value of merge(primes, evens())) {
  console.log(value);
}
  //=>
    2
    2
    3
    4
    5
    6
    7
    8
    10
    11
    12
    14

[...merge(primes, evens())]
  //=> [2, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 14]

There is plenty to discuss about this solution. Here are a few questions to start things off:

  • What are the performance implications of having lots and lots of iterables, maybe a few hundred or a few thousand?
  • What happens if you have one iterable that produces thousands of values, along with a few hundred that only produce a few hundred each? Or more generally, what if there is an inverse power law relationship between the number of iterables and the number of values they produce?

As an exercise, we can ask ourselves what other questions we would ask a candidate who wrote this solution within the confines of a forty-minute time slot.
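As one possible direction for that discussion, and certainly not the last word on it, here is a variation that avoids filtering and sorting every iterator on every pass. It drops each iterator as soon as it is exhausted, and finds the lowest value with a single linear scan. The entries array and its head field are our own bookkeeping, standing in for PeekableIterator. With thousands of iterables, a heap would be the natural next step.

```javascript
function * merge (...iterables) {
  // pair each iterator with the result we have fetched but not yet yielded
  let entries = iterables
    .map(iterable => iterable[Symbol.iterator]())
    .map(iterator => ({ iterator, head: iterator.next() }))
    .filter(entry => !entry.head.done);

  while (entries.length > 0) {
    // one linear scan instead of a full sort
    let lowest = entries[0];
    for (const entry of entries) {
      if (entry.head.value < lowest.head.value) lowest = entry;
    }

    yield lowest.head.value;

    // advance the iterator we took from, dropping it once it is done
    lowest.head = lowest.iterator.next();
    if (lowest.head.done) {
      entries = entries.filter(entry => entry !== lowest);
    }
  }
}

const primes = [2, 3, 5, 7, 11];
const evens = function * () {
  for (let n of [1, 2, 3, 4, 5, 6, 7]) {
    yield n * 2;
  }
};

console.log([...merge(primes, evens())]);
  //=> [2, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 14]
```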


Servers

but what if i hate cs-style puzzles?

Given the first problem, the more experienced candidate might roll their eyes. But could it be a mistake to dismiss fizz-buzz problems out of hand? Consider what happens if the interview proceeds to merging an arbitrary number of streams as we’ve discussed here. It’s clearly related to the first problem. But is it “Impractical Computer Science?”

Let’s try wrapping it in a story:

You work for a company that manages alerting and event remediation. You have a large, distributed cluster of servers, each of which emits a huge number of events tagged with a customer id, type, timestamp, and so forth. You are looking for certain patterns of events. Write a function that creates an alert when it sees a certain pattern of events occurring within a certain time frame.

Naturally, the first thing to do is to get all the events for a customer into a single stream, ordered by timestamp. We can’t get them all and sort them, because they won’t fit into memory. So what do we do?

That’s right, we create a stream of events that merges the streams from each server. We can then write filters and pattern matchers that operate on the merged stream.
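A sketch of what that might look like, under some loud assumptions: events are plain objects, each stream is already ordered by a numeric timestamp field, and the function mergeBy, the key function, and the sample events are all invented for this illustration. The only real change from merging numbers is that we compare a key extracted from each value.

```javascript
// merge ordered iterables, comparing values by keyOf(value)
function * mergeBy (keyOf, ...iterables) {
  let entries = iterables
    .map(iterable => iterable[Symbol.iterator]())
    .map(iterator => ({ iterator, head: iterator.next() }))
    .filter(entry => !entry.head.done);

  while (entries.length > 0) {
    let lowest = entries[0];
    for (const entry of entries) {
      if (keyOf(entry.head.value) < keyOf(lowest.head.value)) lowest = entry;
    }

    yield lowest.head.value;

    lowest.head = lowest.iterator.next();
    if (lowest.head.done) {
      entries = entries.filter(entry => entry !== lowest);
    }
  }
}

// hypothetical event streams, one per server
const serverA = [
  { server: 'a', timestamp: 1, type: 'login' },
  { server: 'a', timestamp: 4, type: 'error' }
];

const serverB = function * () {
  yield { server: 'b', timestamp: 2, type: 'error' };
  yield { server: 'b', timestamp: 3, type: 'error' };
};

const mergedEvents = [...mergeBy(event => event.timestamp, serverA, serverB())];

console.log(mergedEvents.map(event => event.timestamp)); // => [1, 2, 3, 4]
console.log(mergedEvents.map(event => event.server));    // => ['a', 'b', 'b', 'a']
```

We spread the result into an array only to print it. With real streams we would consume the generator lazily, which is the whole point of not loading everything into memory.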

Now perhaps this won’t happen in JavaScript. And perhaps there will be some mechanism other than an ECMAScript Iterator for representing a real time stream. But somewhere, there will be some code that merges streams, and demonstrating an aptitude for understanding such algorithms is certainly demonstrating on-the-job skills.

conclusion

Coding in job interviews doesn’t seem to be going away any time soon. Until it does, it behooves engineers to be competent at writing code in real time, and it behooves employers to choose problems that have a reasonable relationship to the problems they solve at work.

And if we encounter a programming problem that seems “Way out there…” Maybe we should solve it brilliantly, then ask a question of our own: “Say, if this works out and I come to work for you, when would I be working with algorithms like this?”

We might be pleasantly surprised by the answer.

(discuss on hacker news)


PagerDuty

Note: At PagerDuty, we do indeed work with streams of data on highly-available and distributed clusters of systems. We have great problems to solve with the engines that keep everything going, and with developing user-facing tooling. If that piques your intellectual curiosity, we’re hiring engineers for our applications and real-time teams. There are positions in both San Francisco and Toronto.

Come work with us!


  1. And if iterators and generators are fairly new to you, you can read JavaScript Allongé for free! 

https://raganwald.com/2015/11/03/a-coding-problem
Getters, Setters, and Organizing Responsibility in JavaScript

Once upon a time, there was a language called C,1 and this language had something called a struct, and you could use it to make heterogeneously aggregated data structures that had members. The key thing to know about C is that when you have a struct called currentUser, and a member like id, and you write something like currentUser.id = 42, the C compiler turned this into extremely fast assembler instructions. Same for int id = currentUser.id.

Also of importance was that you could have pointers to functions in structs, so you could write things like currentUser->setId(42) if you preferred to make setting an id a function, and this was also translated into fast assembler.

And finally, C programming has a very strong culture of preferring “extremely fast” to just “fast,” and thus if you wanted a C programmer’s attention, you had to make sure that you never do something that is just fast when you could do something that is extremely fast. This is a generalization, of course. I’m sure that if we ask around, we’ll eventually meet both C programmers who prefer elegant abstractions to extremely fast code.

Flathead Dragster

java and javascript

Then there was a language called Java, and it was designed to run in browsers, and be portable across all sorts of hardware and operating systems, and one of its goals was to get C programmers to write Java code in the browser instead of writing C that lived in a plugin. Or rather, that was one of its strategies; the goal was for Sun Microsystems to stay relevant in a world that Microsoft was commoditizing, but that is another chapter of the history book.

So the nice people behind Java gave it C-like syntax with the braces and the statement/expression dichotomy and the dot notation. They have “objects” instead of structs, and objects have a lot more going on than structs, but Java’s designers made a distinction between currentUser.id = 42 and currentUser.setId(42), and made sure that one was extremely fast and the other was just fast. Or rather, that one was fast, and the other was just ok compared to C, but C programmers could feel like they were doing important thinking when deciding whether id ought to be directly accessed for performance or indirectly accessed for elegance and flexibility.

History has shown that this was the right way to sell a new language. History has also shown that the actual performance distinction was irrelevant to almost everybody. Performance is only for now, code flexibility is forever.

Well, it turned out that Sun was right about getting C programmers to use Java (it worked on me, I ditched CodeWarrior and Lightspeed C), but wrong about using Java in browsers. Instead, people started using another language called JavaScript to write code in browsers, and used Java to write code on servers.

Will it surprise you to learn that JavaScript was also designed to get C programmers to write code? And that it went with the C-like syntax with curly braces, the statement/expression dichotomy, and dot notation? And although JavaScript has a thing that is kinda-sorta like a Java object, and kinda-sorta like a Smalltalk dictionary, will it surprise you to learn that JavaScript also has a distinction between currentUser.id = 42 and currentUser.setId(42)? And that originally, one was slow, and the other dog-slow, but programmers could do important thinking about when to optimize for performance and when to give a hoot about programmer sanity?

No, it will not surprise you to learn that it works kinda-sorta like C in the same way that Java kinda-sorta works like C, and for exactly the same reason. And the reason really doesn’t matter any more.

Professor Frink on Java

the problem with direct access

Very soon after people began working with Java at scale, they learned that directly accessing instance variables was a terrible idea. JIT compilers narrowed the performance difference between currentUser.id = 42 and currentUser.setId(42) to almost nothing of relevance to anybody, and code using currentUser.id = 42 or int id = currentUser.id was remarkably inflexible.

There was no way to decorate such operations with cross-cutting concerns like logging or validation. You could not override the behaviour of setting or getting an id in a subclass. (Java programmers love subclasses!)
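To make the contrast concrete, here is a minimal sketch, written in JavaScript to match the rest of this post rather than in Java. The User and AuditedUser classes and the log array are hypothetical names invented for illustration. It shows a subclass overriding a setter method to add validation and logging, and direct access slipping past both.

```javascript
// Hypothetical classes for illustration; not taken from any library.
class User {
  setId (id) {
    this.id = id;
    return this;
  }
}

const log = [];

// Because setId is a method, a subclass can override it to add
// validation and logging without touching any of the calling code.
class AuditedUser extends User {
  setId (id) {
    if (!Number.isInteger(id)) throw new TypeError('id must be an integer');
    log.push(`setting id to ${id}`);
    return super.setId(id);
  }
}

const currentUser = new AuditedUser();

// Goes through the override: validated and logged.
currentUser.setId(42);

// Direct access bypasses the override: nothing is validated or logged.
currentUser.id = 'forty-two';

console.log(log.length, currentUser.id);
```

The direct assignment succeeds silently, which is exactly the inflexibility described above: the code doing the assignment would have to take on validation and logging itself.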

Meanwhile, JavaScript programmers were also writing currentUser.id = 42, and eventually they too discovered that this was a terrible idea. One of the catalysts for change was the arrival of frameworks for client-side JavaScript applications. Let’s say we have a ridiculously simple person class:

class Person {
  constructor (first, last) {
    this.first = first;
    this.last = last;
  }

  fullName () {
    return `${this.first} ${this.last}`;
  }
};

And an equally ridiculous view:

class PersonView {
  constructor (person) {
    this.model = person;
  }

  // ...

  redraw () {
    document
      .querySelector(`person-${this.model.id}`)
      .text(this.model.fullName())
  }
}

Every time we update a person, we have to remember to redraw the view:

const currentUser = new Person('Reginald', 'Braithwaite');
const currentUserView = new PersonView(currentUser);

currentUserView.redraw();

currentUser.first = 'Ragnvald';
currentUserView.redraw();

Why does this matter?

Well, if you can’t control where certain responsibilities are handled, you can’t really organize your program. Subclasses, methods, mixins and decorators are techniques: What they make possible is choosing which code is responsible for which functionality.

And that’s the whole thing about programming: Organizing the functionality. Direct access does not allow you to organize the functionality associated with getting and setting properties, it forces the code doing the getting and setting to also be responsible for anything else associated with getting and setting.

Magnetic Core Memory

get and set

It didn’t take long for JavaScript library authors to figure out how to make this go away by using a get and set method. Stripped down to the bare essentials for illustrative purposes, we could write this:

class Model {
  constructor () {
    this.listeners = new Set();
  }

  get (property) {
    this.notifyAll('get', property, this[property]);
    return this[property];
  }

  set (property, value) {
    this.notifyAll('set', property, value);
    return this[property] = value;
  }

  addListener (listener) {
    this.listeners.add(listener);
  }

  deleteListener (listener) {
    this.listeners.delete(listener);
  }

  notifyAll (message, ...args) {
    for (let listener of this.listeners) {
      listener.notify(this, message, ...args);
    }
  }
}

class Person extends Model {
  constructor (first, last) {
    super();
    this.set('first', first);
    this.set('last', last);
  }

  fullName () {
    return `${this.get('first')} ${this.get('last')}`;
  }
};

class View {
  constructor (model) {
    this.model = model;
    model.addListener(this);
  }
}

class PersonView extends View {
  // ...

  notify(notifier, method, ...args) {
    if (notifier === this.model && method === 'set') this.redraw();
  }

  redraw () {
    document
      .querySelector(`person-${this.model.id}`)
      .text(this.model.fullName())
  }
}

Our new Model superclass manually manages allowing objects to listen to the get and set methods on a model. If they are called, the “listeners” are notified via the .notifyAll method. We use that to have the PersonView listen to its Person and call its own .redraw method when a property is set via the .set method.

So we can write:

const currentUser = new Person('Reginald', 'Braithwaite');
const currentUserView = new PersonView(currentUser);

currentUser.set('first', 'Ragnvald');

And we don’t need to call currentUserView.redraw(), because the notification built into .set does it for us.
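The view code above needs a DOM, so here is a sketch of the same notification flow that runs anywhere. It uses a trimmed copy of Model and a hypothetical recorder object standing in for the view. The recorder and received names are assumptions made for this example.

```javascript
// A trimmed copy of the Model class above.
class Model {
  constructor () {
    this.listeners = new Set();
  }

  get (property) {
    this.notifyAll('get', property, this[property]);
    return this[property];
  }

  set (property, value) {
    this.notifyAll('set', property, value);
    return this[property] = value;
  }

  addListener (listener) {
    this.listeners.add(listener);
  }

  notifyAll (message, ...args) {
    for (let listener of this.listeners) {
      listener.notify(this, message, ...args);
    }
  }
}

// Hypothetical stand-in for a view: it reacts only to 'set' messages
// and records them instead of redrawing anything.
const received = [];
const recorder = {
  notify (notifier, message, property, value) {
    if (message === 'set') received.push(`${property} = ${value}`);
  }
};

const person = new Model();

person.addListener(recorder);
person.set('first', 'Ragnvald'); // recorded
person.get('first');             // notifies too, but the recorder ignores 'get'

console.log(received);
```

Only the .set call ends up in received, mirroring how PersonView redraws on 'set' and ignores everything else.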

We can do other things with .get and .set, of course. Now that they are methods, we can decorate them with logging or validation if we choose. Methods make our code flexible and open to extension. For example, we can use an ES.later decorator to add logging advice to .set:

const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        },
        writable: true
      });
    }
    return clazz;
  }

// Invoked by `after` with the model as `this` and the arguments given to .set
function LogSetter (property, value) {
  console.log(`Setting ${property} of ${this.fullName()} to ${value}`);
}

@after(LogSetter, 'set')
class Person extends Model {
  constructor (first, last) {
    super();
    this.set('first', first);
    this.set('last', last);
  }

  fullName () {
    return `${this.get('first')} ${this.get('last')}`;
  }
};

Whereas we can’t do anything like that with direct property access. Mediating property access with methods is more flexible than directly accessing properties, and this allows us to organize our program and distribute responsibility properly.

Note: All the ES.later class decorators can be used in vanilla ES 6 code as ordinary functions. Instead of @after(LogSetter, 'set') class Person extends Model {...}, simply write const Person = after(LogSetter, 'set')(class Person extends Model {...})
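A runnable sketch of that plain-function form, for readers without a transpiler. The minimal Model here and the recordSetter behaviour are assumptions made to keep the example self-contained. The behaviour receives the same arguments as .set, with the model as this.

```javascript
// The `after` combinator from above, applied as an ordinary function.
const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        },
        writable: true
      });
    }
    return clazz;
  };

// Hypothetical minimal model, just enough to have a .set method.
class Model {
  set (property, value) {
    return this[property] = value;
  }
}

const messages = [];

// Runs after .set, with the model as `this` and .set's own arguments.
function recordSetter (property, value) {
  messages.push(`Setting ${property} to ${value}`);
}

const Person = after(recordSetter, 'set')(
  class Person extends Model {
    constructor (first, last) {
      super();
      this.set('first', first);
      this.set('last', last);
    }
  }
);

new Person('Reginald', 'Braithwaite');

console.log(messages);
```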

Techniques

getters and setters in javascript

The problem with getters and setters was well-understood, and the stewards behind JavaScript’s evolution responded by introducing a special way to turn direct property access into a kind of method. Here’s how we’d write our Person class using “getters” and “setters:”

class Model {
  constructor () {
    this.listeners = new Set();
  }

  addListener (listener) {
    this.listeners.add(listener);
  }

  deleteListener (listener) {
    this.listeners.delete(listener);
  }

  notifyAll (message, ...args) {
    for (let listener of this.listeners) {
      listener.notify(this, message, ...args);
    }
  }
}

const _first = Symbol('first'),
      _last = Symbol('last');

class Person extends Model {
  constructor (first, last) {
    super();
    this.first = first;
    this.last = last;
  }

  get first () {
    this.notifyAll('get', 'first', this[_first]);
    return this[_first];
  }

  set first (value) {
    this.notifyAll('set', 'first', value);
    return this[_first] = value;
  }

  get last () {
    this.notifyAll('get', 'last', this[_last]);
    return this[_last];
  }

  set last (value) {
    this.notifyAll('set', 'last', value);
    return this[_last] = value;
  }

  fullName () {
    return `${this.first} ${this.last}`;
  }
};

When we preface a method with the keyword get, we are defining a getter, a method that will be called when code attempts to read from the property. And when we preface a method with set, we are defining a setter, a method that will be called when code attempts to write to the property.

Getters and setters needn’t actually read or write any properties, they can do anything. But in this essay, we’ll talk about using them to mediate property access. With getters and setters, we can write:

const currentUser = new Person('Reginald', 'Braithwaite');
const currentUserView = new PersonView(currentUser);

currentUser.first = 'Ragnvald';

And everything still works just as if we’d written currentUser.set('first', 'Ragnvald') with the .set-style code.

Getters and setters allow us to have the semantics of using methods, but the syntax of direct access.
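Here is the mechanism in isolation, as a small sketch. The Temperature class and the _celsius symbol are hypothetical, chosen only to show that a setter can validate and that a getter need not be backed by a stored property at all.

```javascript
const _celsius = Symbol('celsius');

class Temperature {
  constructor (celsius) {
    this.celsius = celsius; // goes through the setter below
  }

  get celsius () {
    return this[_celsius];
  }

  set celsius (value) {
    if (typeof value !== 'number') throw new TypeError('celsius must be a number');
    this[_celsius] = value;
  }

  // Computed on demand; nothing called "fahrenheit" is ever stored.
  get fahrenheit () {
    return this.celsius * 9 / 5 + 32;
  }
}

const t = new Temperature(100);

t.celsius = 20;            // reads like direct access, runs the setter
console.log(t.fahrenheit); // 68
```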

Keypunch

an after combinator that can handle getters and setters

Getters and setters seem at first glance to be a magic combination of familiar syntax and powerful ability to meta-program. However, a getter or setter isn’t a method in the usual sense. So we can’t decorate a setter using the exact same code we’d use to decorate an ordinary method.

With the .set method, we could directly access Model.prototype.set and wrap it in another function. That’s how our decorators work. But there is no Person.prototype.first method. Instead, there is a property descriptor we can only introspect using Object.getOwnPropertyDescriptor() and update using Object.defineProperty().
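The difference is easy to see by inspecting the descriptors. In this sketch (the tiny Person class is only for illustration), an ordinary method lives in the descriptor's value slot, while a getter lives in its get slot and has no value at all.

```javascript
class Person {
  get first () { return 'Reginald'; }
  fullName () { return 'Reginald Braithwaite'; }
}

const methodDescriptor = Object.getOwnPropertyDescriptor(Person.prototype, 'fullName');
const getterDescriptor = Object.getOwnPropertyDescriptor(Person.prototype, 'first');

// An ordinary method is a data property whose value is a function.
console.log(typeof methodDescriptor.value); // 'function'

// A getter is an accessor property: a `get` slot, and no `value` slot.
console.log(typeof getterDescriptor.get);   // 'function'
console.log('value' in getterDescriptor);   // false

// Reading the property invokes the getter, so we get its result,
// never the getter function itself.
console.log(Person.prototype.first);        // 'Reginald'
```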

For this reason, the naïve after decorator given above won’t work for getters and setters.2 We’d have to use one kind of decorator for methods, another for getters, and a third for setters. That doesn’t sound like fun, so let’s modify our after combinator so that you can use a single function with methods, getters, and setters:

function getPropertyDescriptor (obj, property) {
  if (obj == null) return null;

  const descriptor = Object.getOwnPropertyDescriptor(obj, property);

  if (descriptor !== undefined) return descriptor;
  else return getPropertyDescriptor(Object.getPrototypeOf(obj), property);
};

const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodNameExpr of methodNames) {
      const [_, accessor, methodName] = methodNameExpr.match(/^(?:(get|set) )?(.+)$/);
      const descriptor = getPropertyDescriptor(clazz.prototype, methodName);


      if (accessor == null) {
        const method = clazz.prototype[methodName];

        descriptor.value = function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        };
        descriptor.writable = true;
      }
      else if (accessor === "get") {
        const method = descriptor.get;

        descriptor.get = function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        };
        descriptor.configurable = true;
      }
      else if (accessor === "set") {
        const method = descriptor.set;

        descriptor.set = function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        };
        descriptor.configurable = true;
      }
      Object.defineProperty(clazz.prototype, methodName, descriptor);
    }
    return clazz;
  }
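To try this without a transpiler, here is a condensed sketch of the same idea. It is a variant for experimentation, not a replacement for the code above: the runAfter helper and the events array are names invented here, and the get/set prefix in the pattern is optional so that plain method names are accepted too.

```javascript
// Walk the prototype chain until a descriptor for the property is found.
function getPropertyDescriptor (obj, property) {
  if (obj == null) return null;
  return Object.getOwnPropertyDescriptor(obj, property) ||
    getPropertyDescriptor(Object.getPrototypeOf(obj), property);
}

// Wrap a function so that `behaviour` runs after it with the same arguments.
const runAfter = (fn, behaviour) =>
  function (...args) {
    const returnValue = fn.apply(this, args);

    behaviour.apply(this, args);
    return returnValue;
  };

// Accepts 'name', 'get name', or 'set name'.
const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodNameExpr of methodNames) {
      const [, accessor, methodName] = methodNameExpr.match(/^(?:(get|set) )?(.+)$/);
      const descriptor = getPropertyDescriptor(clazz.prototype, methodName);
      const slot = accessor || 'value'; // which descriptor slot holds the function

      descriptor[slot] = runAfter(descriptor[slot], behaviour);
      Object.defineProperty(clazz.prototype, methodName, descriptor);
    }
    return clazz;
  };

const _first = Symbol('first');
const events = [];

const Person = after(function (value) { events.push(`set first: ${value}`); }, 'set first')(
  after(function () { events.push('fullName called'); }, 'fullName')(
    class Person {
      constructor (first) { this.first = first; }
      get first () { return this[_first]; }
      set first (value) { this[_first] = value; }
      fullName () { return `${this.first}`; }
    }
  )
);

const reginald = new Person('Reginald');

reginald.fullName();
console.log(events);
```

The constructor's assignment goes through the decorated setter, so events records the set before the fullName call.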

Now we can write:

const notify = (name) =>
  function (...args) {
    this.notifyAll(name, ...args);
  };

@after(notify('set'), 'set first', 'set last')
@after(notify('get'), 'get first', 'get last')
class Person extends Model {
  constructor (first, last) {
    super();
    this.first = first;
    this.last = last;
  }

  get first () {
    return this[_first];
  }

  set first (value) {
    return this[_first] = value;
  }

  get last () {
    return this[_last];
  }

  set last (value) {
    return this[_last] = value;
  }

  fullName () {
    return `${this.first} ${this.last}`;
  }
};

We have now decoupled the code for notifying listeners from the code for getting and setting values. Which provokes a simple question: If the code that tracks listeners is already decoupled in Model, why shouldn’t the code for triggering notifications be in the same entity?

There are a few ways to do that. We’ll use a universal mixin instead of stuffing that logic into a superclass:

const Notifier = mixin({
  init () {
    this.listeners = new Set();
  },

  addListener (listener) {
    this.listeners.add(listener);
  },

  deleteListener (listener) {
    this.listeners.delete(listener);
  },

  notifyAll (message, ...args) {
    for (let listener of this.listeners) {
      listener.notify(this, message, ...args);
    }
  }
}, {
  notify (name) {
    return function (...args) {
      this.notifyAll(name, ...args);
    }
  }
});
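The mixin function itself comes from earlier posts and is not shown here. As a rough sketch of what such a helper might look like (an assumption for illustration, not the actual recipe), it takes instance behaviour and shared behaviour and returns a function that can decorate a class. The Greeter example is hypothetical.

```javascript
// Hypothetical sketch of a `mixin` helper; the real recipe may differ.
function mixin (behaviour, sharedBehaviour = {}) {
  // The returned function works as a class decorator: it copies the
  // instance behaviour onto the class's prototype.
  function decorate (clazz) {
    for (let property of Reflect.ownKeys(behaviour)) {
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true,
        configurable: true
      });
    }
    return clazz;
  }

  // Shared behaviour becomes properties of the decorator itself,
  // which is what would make a call like Notifier.notify('set') possible.
  for (let property of Reflect.ownKeys(sharedBehaviour)) {
    Object.defineProperty(decorate, property, {
      value: sharedBehaviour[property]
    });
  }

  return decorate;
}

const Greeter = mixin({
  greet () { return `Hello, ${this.name}`; }
}, {
  shout (text) { return text.toUpperCase(); }
});

const Person = Greeter(class Person {
  constructor (name) { this.name = name; }
});

console.log(new Person('Reginald').greet()); // 'Hello, Reginald'
console.log(Greeter.shout('hi'));            // 'HI'
```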

This permits us to write:

@Notifier
@after(Notifier.notify('set'), 'set first', 'set last')
@after(Notifier.notify('get'), 'get first', 'get last')
class Person {
  constructor (first, last) {
    this.init();
    this.first = first;
    this.last = last;
  }

  get first () {
    return this[_first];
  }

  set first (value) {
    return this[_first] = value;
  }

  get last () {
    return this[_last];
  }

  set last (value) {
    return this[_last] = value;
  }

  fullName () {
    return `${this.first} ${this.last}`;
  }
};

What have we done? We have incorporated getters and setters into our code, while maintaining the ability to decorate them with added functionality as if they were ordinary methods.

That’s a win for decomposing code. And it points to something to think about: When all you have are methods, you’re encouraged to make heavyweight superclasses. That’s why so many frameworks force you to extend their special-purpose base classes like Model or View.

But when you find a way to use mixins and decorate methods, you can decompose things into smaller pieces and apply them where they are needed. This leads in the direction of using collections of libraries instead of a heavyweight framework.

summary

Getters and setters allow us to maintain the legacy style of writing code that appears to directly access properties, while actually mediating that access with methods. With care, we can update our tooling to permit us to decorate our getters and setters, distributing responsibility as we see fit and freeing us from dependence upon heavyweight base classes.

(discuss on Hacker News)

Post Scriptum

This post uses listening to property setters as an excuse to discuss the getter and setter mechanisms, and ways to decorate them so that we can organize code around concerns.

Of course, propagating changes through explicit notification is not the only way to organize code that needs to manage dependencies on data changing. It’s beyond the scope of this post to discuss the many alternatives, but readers have suggested exploring Object.observe and working with immutable data.


Flying saucers for everyone

one more thing

Rubyists scoff at:

get first () {
  return this[_first];
}

set first (value) {
  return this[_first] = value;
}

get last () {
  return this[_last];
}

set last (value) {
  return this[_last] = value;
}

Rubyists would use the built-in class method attr_accessor to write them for us. So just for kicks, we’ll write a decorator that writes getters and setters. The raw values will be stored in an attributes map:

function attrAccessor (...propertyNames) {
  return function (clazzOrObject) {
    const target = clazzOrObject.prototype || clazzOrObject;

    for (let propertyName of propertyNames) {
      Object.defineProperty(target, propertyName, {
        get: function () {
          if (this.attributes)
            return this.attributes.get(propertyName);
        },
        set: function (value) {
          if (this.attributes == undefined)
            this.attributes = new Map();
          this.attributes.set(propertyName, value);
          return value;
        },
        configurable: true,
        enumerable: true
      })
    }

    return clazzOrObject;
  }
}

Now we can write:

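// Decorators are applied bottom-up, so attrAccessor sits closest to the class:
// the accessors must exist before `after` can wrap them.
@Notifier
@after(Notifier.notify('set'), 'set first', 'set last')
@after(Notifier.notify('get'), 'get first', 'get last')
@attrAccessor('first', 'last')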
class Person {
  constructor (first, last) {
    this.init();
    this.first = first;
    this.last = last;
  }

  fullName () {
    return `${this.first} ${this.last}`;
  }
};

attrAccessor takes a list of property names and returns a decorator for a class. It writes a plain getter and setter function for each property, and all the properties defined are stored in the .attributes hash. This is very convenient for serialization or other persistence mechanisms.

(It’s trivial to also make attrReader and attrWriter functions using this template. We just need to omit the set when writing attrReader and omit the get when writing attrWriter.)
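For instance, a sketch of attrReader might look like this. It assumes the backing store is a Map read through its .get method, and since there is no setter, the hypothetical Person class here fills the store directly in its constructor.

```javascript
// Sketch of attrReader: the attrAccessor template minus the setter.
function attrReader (...propertyNames) {
  return function (clazzOrObject) {
    const target = clazzOrObject.prototype || clazzOrObject;

    for (let propertyName of propertyNames) {
      Object.defineProperty(target, propertyName, {
        get: function () {
          if (this.attributes)
            return this.attributes.get(propertyName);
        },
        configurable: true,
        enumerable: true
      });
    }

    return clazzOrObject;
  };
}

const Person = attrReader('first', 'last')(
  class Person {
    constructor (first, last) {
      // With no setter, values are written to the backing store directly.
      this.attributes = new Map([['first', first], ['last', last]]);
    }
  }
);

const reginald = new Person('Reginald', 'Braithwaite');

console.log(reginald.first); // 'Reginald'
```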


  1. There was also a language called BCPL, and others before that, but our story has to start somewhere, and it starts with C. 

  2. Neither will the mixin recipe we’ve evolved in previous posts like Using ES.later Decorators as Mixins. It can be enhanced to add a special case for getters, setters, and other concerns like working with POJOs. See, for example, Andrea Giammarchi’s Universal Mixin.

https://raganwald.com/2015/08/24/ready-get-set-go
Extension Methods, Monkey-Patching, and the Bind Operator
extension methods

In some programming languages, there is a style of “monkey-patching” classes in order to add convenience methods. For example, let’s say that we’re using Ruby on Rails, and we have an array called chain_of_command. Although Ruby’s Array class does not define a special method for obtaining the second element of an array, we can still write:

second_in_command = chain_of_command.second()

This is possible because deep within ActiveSupport is this snippet of code:

class Array

  # ...

  def second
    self[1]
  end
end

This code adds a second method to every instance of Array, everywhere in our running instance of Ruby. (There are also word methods for getting the third, fourth, fifth, and forty_two of an array. The code does not document why the last one isn’t called forty_second).

We also get methods like days added to Numeric, and from_now on the durations they return, so we can write:

access.expires_at(14.days.from_now)

And it goes on and on with all kinds of methods added to all kinds of classes. These kinds of methods are generally thought to provide readability, convenience, or both, and it is one of the reasons why Rails became popular.

We can do the same kind of thing in JavaScript, of course. Here’s a .second extension method for Array:

Array.prototype.second = function () {
  return this[1];
};

['a', 'b', 'c'].second()
  //=> 'b'

A method added to a class from a place other than the class’s primary definition is called an extension method. Note that methods defined in modules/mixins are not extension methods; we are referring to a method added completely orthogonally from the code or library or built-in that defines the standard behaviour of the class.

People call this Monkey patching, but to be precise, the phrase “monkey patching” is a colloquial expression referring to one particular way of implementing extension methods. We’ll look at some others in a moment, but the thing to remember right now is that an extension method is a method extending the behaviour of a class that is defined later and/or elsewhere from the primary class definition, however that might happen to be implemented.

there is more than one way to do it

The obvious alternative to extension methods is to write utility functions, or methods in utility classes. So instead of writing chain_of_command.second() we write second(chain_of_command). Functions are easy in languages like JavaScript:

const second = indexed => indexed[1];

In Ruby, we can make a method that reads like a function call:

module Kernel
  def second (indexed)
    indexed[1]
  end
end

Close enough for government work! So, the key question is, Why should we write extension methods instead of functions?

the “oo” arguments against and for extension methods

There is a general principle in OO that objects should be responsible for implementing all of the operations where they are the primary participant. So by this reasoning, if you want a second operation on arrays, the Array class should be responsible for implementing it.

At first glance, an extension method accomplishes this. Array should be responsible for implementing .second, and look! We opened the Array class up and added second to it. But this reasoning does not apply to a language like Ruby that has a strong distinction between the static organization of the code in .rb files and the runtime organization of the code in classes and instances.

At runtime, an extension method makes Array responsible for implementing the .second method, but in the organization of the code, the programming entity responsible for defining .second is ActiveSupport, not Array. And if there was source code for Array, we would not find a second method in it, or a reference to include ActiveSupport or anything like that.

Thus, we can’t really say that Array is responsible for second in a conceptual sense. And thus, making second a method of all arrays isn’t about what the programmer thinks of as an array being responsible for its behaviour. It’s a syntactic consideration.

Is that wrong?

No, that is not wrong. The other OO perspective is that objects should be responsible for implementing the central and characteristic operations where they are the primary participant. Thus Array implements [], .push, .pop, and so forth. An operation like .second can be implemented in terms of Array’s primary operations, so it is a secondary concern.

Secondary concerns could be defined elsewhere, and thus there is an OO argument in favour of .second not being an Array method.

the syntactic argument for extension methods

If .second isn’t an object’s primary responsibility, then why implement something like .second as a method? Why not as a function?

There are two syntactic arguments for extension methods. First, there is the question of consistency. This Ruby reads in a consistent way:

chain_of_command
  .select { |officer| officer.belongs_to(club) }
  .map(&:salary)
  .reduce(&:+)

Likewise, this JavaScript reads in a consistent way:

sum(
  pluck(
    select(
      chain_of_command,
      officer => belongs_to(officer, club)
    ),
    'salary'
  )
);

But this JavaScript can’t make up its mind which way to go:

sum(
  chain_of_command
    .filter(officer => belongs_to(officer, club))
    .map(get('salary'))
);

Ugh.

Making everything in one expression into a method (or everything in one expression into a function) allows us to write more readable code by having chained expressions read in a consistent direction using a consistent invocation style. And if we choose to use methods, extension methods help us make everything read like a method, even things that aren’t an entity’s primary responsibility.

drawbacks of monkey-patching as an implementation

Modifying core classes has been considered and rejected many times by many other object-oriented communities. The essential problem is that classes are global in scope. A modification made to a class like Array affects the code used within Rails and your own application code as you’d expect. But it also affects the code within every other gem you use, so if one of them happens to define a .second method, that gem is incompatible with ActiveSupport.
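The same collision is easy to reproduce in JavaScript. In this sketch, List is a hypothetical stand-in for a shared core class (so the example does not touch the real Array.prototype), and "library A" and "library B" are imaginary packages that each patch in their own idea of .second.

```javascript
// Hypothetical stand-in for a shared core class, so that this sketch
// does not modify the real Array.prototype.
class List extends Array {}

// "Library A" adds .second, meaning the element at index 1.
List.prototype.second = function () {
  return this[1];
};

// "Library B", loaded later, adds its own .second with a different
// meaning: the second-to-last element.
List.prototype.second = function () {
  return this[this.length - 2];
};

const chainOfCommand = List.from(['general', 'colonel', 'major', 'captain']);

// Code written against library A's definition now silently gets B's.
console.log(chainOfCommand.second()); // 'major', not 'colonel'
```

Nothing warns us that the second patch replaced the first. Whichever library loads last wins, for every caller in the program.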

If every gem defined its own extensions to every core class, you’d have an unmanageable mess. Rails gets away with it, because it’s the 800 pound gorilla of Ruby libraries, so everyone else works around their choices. Most other rubyists avoid the practice entirely.

monkey-patching javascript

Some early JavaScript libraries tried to follow suit, but for technical reasons, this caused even more headaches for programmers than it did in languages like Ruby, so today you find that most JavaScript programmers view the practice with extreme suspicion.

Ember Monkey-Patching

But not all. At the moment, Ember.js “monkeys around” with Array.prototype and String.prototype.1 And a few other libraries implement things like Function.prototype.delay: Anybody who tries to use two such libraries in the same code base is in for a headache.

static extension methods as an implementation

Various mechanisms have been proposed to permit writing expressions syntactically as 14.days.from_now without actually modifying a global class like Number. One of the most widely used is C# 3.0’s Extension Methods.

In C#, we can write an extension method for a class like this:

public static class Something
{
  // ...

  public static string Reverse(this string input)
  {
    char[] chars = input.ToCharArray();
    Array.Reverse(chars);
    return new String(chars);
  }

  public static string BackAsswardAlphabet()
  {
    return "abcdefghijklmnopqrstuvwxyz".Reverse();
  }
}

The compiler knows that Reverse is to be implemented as an extension method on strings, by virtue of the this string portion of the signature. And since it’s defined as a static member of the Something class, it is not actually changing strings in any way.

The code within the BackAsswardAlphabet method includes the expression "abcdefghijklmnopqrstuvwxyz".Reverse(), and the C# compiler knows that "abcdefghijklmnopqrstuvwxyz" is a string, and therefore that it should treat "abcdefghijklmnopqrstuvwxyz".Reverse() as if we had actually written Something.Reverse("abcdefghijklmnopqrstuvwxyz").

This is only possible because C# includes static typing, and thus the compiler knows that "abcdefghijklmnopqrstuvwxyz" is a string, so it can resolve the extension method at compile time.

Languages like JavaScript ought to know the same thing for a string literal, and for any const variable bound to a string literal, but reasoning about types beyond some very simple cases is very difficult in “untyped” languages, so this technique is out of reach until some future version of JavaScript brings us gradual typing.

es.maybe’s bind operator as an implementation

One of the features proposed for possible inclusion in a future formal release of ECMAScript (a/k/a “ES.maybe”) is the bind operator, ::. In short, ::fn is equivalent to fn.bind(this), foo::bar is equivalent to bar.bind(foo), and foo::bar(baz) is equivalent to bar.call(foo, baz).2 We can experiment with it right now using transpilation tools like Babel.

Its uses for abbreviating code where we are already using .bind, .call, and .apply have been explored elsewhere. It’s nice, because something like foo::bar(baz) looks like what we’re trying to say: “Treat .bar as a method being sent to foo with the parameter baz.”

Whereas, when we write bar.call(foo, baz), we’re saying something different: “Send the .call method to the entity bar with the parameters foo and baz.” And that speaks directly to our exploration of extension methods. Consider:

Array.prototype.second = function () {
  return this[1];
};

['a', 'b', 'c'].second()
  //=> 'b'

With the bind operator, we can write:

function second () {
  return this[1];
};

const abc = ['a', 'b', 'c'];

abc::second()
  //=> 'b'

Now we’re writing abc::second() instead of abc.second(). But we aren’t modifying Array’s prototype in any way. And we still have syntax that puts the subject of the operation first, like a method.

If we’re using JavaScript and have a tolerance for ES.maybe features, the bind operator provides a very good alternative to monkey-patching extension methods.
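For those not using a transpiler, the equivalences described above can be written out by hand in today’s JavaScript. This sketch uses only .call and .bind, which is what the bind operator is described as abbreviating. The viaCall and boundSecond names are just for illustration.

```javascript
function second () {
  return this[1];
}

const abc = ['a', 'b', 'c'];

// abc::second() is described above as equivalent to:
const viaCall = second.call(abc);

// abc::second (without invoking it) is described as equivalent to:
const boundSecond = second.bind(abc);

console.log(viaCall);       // 'b'
console.log(boundSecond()); // 'b'

// Either way, Array.prototype has not gained a .second method.
console.log(Object.prototype.hasOwnProperty.call(Array.prototype, 'second'));
```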

summary

Extension methods are a reasonable design choice when we want to provide the syntactic appearance of methods, and also wish to provide secondary functionality that does not belong in the core class definition (or was not shipped in the standard implementation of a class we don’t control).

Monkey-patching is a popular choice in some languages, but has deep and difficult-to-resolve conflicting dependency problems. There are some language-specific alternatives, such as C#’s extension method syntax, and ES.maybe’s bind operator.


  1. This behaviour is on-by-default, but can be turned off. Future versions of Ember may discontinue the practice outright. 

  2. There is some talk that transpilers or implementations may turn foo::bar(baz) into bar.bind(foo)(baz) and not bar.call(foo, baz). This would temporarily create another function that needs to be garbage collected. Does that matter? If and when we are putting this code into production, we should measure the performance, and if foo::bar(baz) seems to be a problem, we will optimize it then. But for now, the important thing is the semantics. 

https://raganwald.com/2015/08/08/monkey-patching-extension-methods-bind-operator
Method Advice in Modern JavaScript

We’ve previously looked at using ES.later method decorators like this:1

const wrapWith = (decorator) =>
  function (target, name, descriptor) {
    descriptor.value = decorator(descriptor.value);
  }

const fluent = (method) =>
  function (...args) {
    method.apply(this, args);
    return this;
  }

class Person {
  @wrapWith(fluent)
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }
};

The wrapWith function takes an ordinary method decorator and turns it into an ES.later method decorator. This is not necessary in production, as you can write your decorators directly for ES.later if you are using a transpiler like Babel. But it does allow us to write decorators that work in ES6 and ES.later, so if you aren’t targeting ES.later, you can write your code like this:


const fluent = (method) =>
  function (...args) {
    method.apply(this, args);
    return this;
  }

class Person {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }
};

Person.prototype.setName = fluent(Person.prototype.setName);

what question do method decorators answer?

The ES.later method decorators put the decorations right next to the method body. This makes it easy to answer the question “What is the precise behaviour of this method?”

But sometimes, this is not what you want. Consider a responsibility like authentication. Let’s imagine that we validate permissions in our model classes. We might write something like this:

const wrapWith = (decorator) =>
  function (target, name, descriptor) {
    descriptor.value = decorator(descriptor.value);
  }

const mustBeMe = (method) =>
  function (...args) {
    if (currentUser() && currentUser().person().equals(this))
      return method.apply(this, args);
    else throw new PermissionsException("Must be me!");
  }

class Person {

  @wrapWith(mustBeMe)
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

  @wrapWith(mustBeMe)
  setAge (age) {
    // stored as _age so the value does not shadow the age() method
    this._age = age;
  }

  @wrapWith(mustBeMe)
  age () {
    return this._age;
  }

};

(Obviously real permissions systems involve roles and all sorts of other important things.)

Now we can look at setName and see that users can only set their own name, likewise if we look at setAge, we see that users can only set their own age.

In a tiny toy example the next question is easy to answer: What methods can only be invoked by the person themselves? We see at a glance that the answer is setName, setAge, and age.

But as classes grow, this becomes more difficult to answer. This especially becomes difficult if we decompose classes using mixins. For example, what if setAge and age come from a mixin:

@HasAge
class Person {

  @wrapWith(mustBeMe)
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

Are they wrapped with mustBeMe? Quite possibly not, because the mixin is responsible for defining the behaviour, it’s up to the model class to decide the permissions required. But how would you know?

Method decorators make it easy to answer the question “what is the behaviour of this method?” But they don’t make it easy to answer the question “what methods share this behaviour?”

That question matters, because when decomposing responsibilities, we often decide that a cross-cutting responsibility like permissions should be distinct from an implementation responsibility like storing a name.

cross-cutting method decorators

There is another way to decorate methods: We can decorate multiple methods in a single declaration. This is called providing method advice.

In JavaScript, we can implement method advice by decorating the entire class. A class decorator is nothing more than a function that takes a class as an argument and returns the same or a different class. We already have a combinator for making mixins (see Using ES.later Decorators as Mixins):

function mixin (behaviour, sharedBehaviour = {}) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  for (let property of sharedKeys)
    Object.defineProperty(_mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

const HasAge = mixin({
  setAge (age) {
    // stored as _age so the value does not shadow the age() method
    this._age = age;
  },

  age () {
    return this._age;
  }
});

@HasAge
class Person {

  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

We can use the same technique to write a class decorator that decorates one or more methods:

const around = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames)
      Object.defineProperty(clazz.prototype, methodName, {
        value: behaviour(clazz.prototype[methodName]),
        writable: true
      });
    return clazz;
  }

// Decorators apply bottom-up: HasAge runs first, so setAge and age exist before they are wrapped
@around(mustBeMe, 'setName', 'setAge', 'age')
@HasAge
class Person {

  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

Now when you look at setName, you don’t see what permissions apply. However, when we look at @around(mustBeMe, 'setName', 'setAge', 'age'), we see that we’re wrapping setName, setAge and age with mustBeMe.
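
To try this without decorator syntax, here is a minimal sketch that applies around as an ordinary function. The loggedIn variable, the currentUser stub, the identity comparison in place of equals, and the plain Error standing in for PermissionsException are all assumptions made for illustration:

```javascript
// The around combinator, keyed by methodName.
const around = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames)
      Object.defineProperty(clazz.prototype, methodName, {
        value: behaviour(clazz.prototype[methodName]),
        writable: true
      });
    return clazz;
  };

// Hypothetical stand-in for a real session: holds the person who is logged in.
let loggedIn = null;
const currentUser = () => loggedIn && { person: () => loggedIn };

const mustBeMe = (method) =>
  function (...args) {
    if (currentUser() && currentUser().person() === this)
      return method.apply(this, args);
    else throw new Error("Must be me!");
  };

const Person = around(mustBeMe, 'setName')(
  class {
    setName (first, last) {
      this.firstName = first;
      this.lastName = last;
    }
    fullName () {
      return this.firstName + " " + this.lastName;
    }
  }
);

const alice = new Person();
const bob = new Person();

loggedIn = alice;
alice.setName('Alice', 'Liddell');   // allowed: alice is the current user

let refused = false;
try {
  bob.setName('Bob', 'Cratchit');    // refused: bob is not the current user
} catch (e) {
  refused = true;
}
```

The class body knows nothing about permissions; the single around(...) call is the only place that mentions them.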

This focuses the responsibility for permissions in one place. Of course, we could make things simpler. For one thing, some actions are only performed before a method, and some only after a method:

const before = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          behaviour.apply(this, args);
          return method.apply(this, args);
        },
        writable: true
      });
    }
    return clazz;
  }

const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        },
        writable: true
      });
    }
    return clazz;
  }
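
As a quick check of the ordering, here is a small sketch that applies both combinators as plain functions. The announce and report advice functions and the log array are hypothetical, invented only to make the sequence of calls visible:

```javascript
const before = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          behaviour.apply(this, args);
          return method.apply(this, args);
        },
        writable: true
      });
    }
    return clazz;
  };

const after = (behaviour, ...methodNames) =>
  (clazz) => {
    for (let methodName of methodNames) {
      const method = clazz.prototype[methodName];

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          const returnValue = method.apply(this, args);

          behaviour.apply(this, args);
          return returnValue;
        },
        writable: true
      });
    }
    return clazz;
  };

// Hypothetical advice: record what happens, in order, in a shared log.
const log = [];
const announce = function (...args) { log.push('about to set ' + args.join(' ')); };
const report = function () { log.push('now ' + this.fullName()); };

const Person = after(report, 'setName')(
  before(announce, 'setName')(
    class {
      setName (first, last) {
        this.firstName = first;
        this.lastName = last;
        return this;
      }
      fullName () {
        return this.firstName + " " + this.lastName;
      }
    }
  )
);

const thinker = new Person().setName('Ada', 'Lovelace');
// log is now ['about to set Ada Lovelace', 'now Ada Lovelace']
```

Note that after returns the method's own return value, so the fluent setName still chains.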

Precondition checks like mustBeMe are good candidates for before. Here’s mustBeLoggedIn and mustBeMe set up to use before. They’re far simpler since before handles the wrapping:

const mustBeLoggedIn = () => {
    if (currentUser() == null)
      throw new PermissionsException("Must be logged in!");
  }

// An ordinary function rather than an arrow, so that before can supply this
const mustBeMe = function () {
    if (currentUser() == null || !currentUser().person().equals(this))
      throw new PermissionsException("Must be me!");
  }

// Decorators apply bottom-up: HasAge runs first, so setAge and age exist before they are advised
@before(mustBeMe, 'setName', 'setAge', 'age')
@before(mustBeLoggedIn, 'fullName')
@HasAge
class Person {

  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

This style of moving the responsibility for decorating methods to a single declaration will appear familiar to Ruby on Rails developers. As you can see, it does not require “deep magic” or complex libraries, it is a pattern that can be written out in just a few lines of code.

Mind you, there’s always room for polish and gold plate. We could enhance before, after, and around to include conveniences like regular expressions to match method names, or special declarations like except: or only: if we so desired.
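
For instance, matching method names with a regular expression might look something like the sketch below. The name beforeMatching is hypothetical, and the sketch makes two simplifying assumptions: it only sees methods defined directly on the class’s prototype at the moment it is applied, and the pattern is expected not to carry the g flag (which would make test stateful).

```javascript
// Hypothetical variation on before: advise every prototype method whose name matches a pattern.
const beforeMatching = (behaviour, pattern) =>
  (clazz) => {
    for (let methodName of Object.getOwnPropertyNames(clazz.prototype)) {
      const method = clazz.prototype[methodName];

      // Leave the constructor and any non-function properties alone.
      if (methodName === 'constructor' || typeof method !== 'function') continue;
      if (!pattern.test(methodName)) continue;

      Object.defineProperty(clazz.prototype, methodName, {
        value: function (...args) {
          behaviour.apply(this, args);
          return method.apply(this, args);
        },
        writable: true
      });
    }
    return clazz;
  };

const calls = [];
const noteWrite = function () { calls.push('write'); };

const Person = beforeMatching(noteWrite, /^set/)(
  class {
    setName (first, last) {
      this.firstName = first;
      this.lastName = last;
    }
    setAge (age) {
      this._age = age;
    }
    fullName () {
      return this.firstName + " " + this.lastName;
    }
  }
);

const person = new Person();
person.setName('Grace', 'Hopper');
person.setAge(85);
person.fullName();
// calls.length is 2: only the two setters were advised
```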

final thought

Although decorating methods in bulk has appeared in other languages and paradigms, it’s not something special and alien to JavaScript, it’s really the same pattern we see over and over again: Programming by composing small and single-responsibility entities, and using functions to transform and combine the entities into their final form.


a word about es6

Although ES.later has not been approved, there is extensive support for ES.later method decorators in transpilation tools. The examples in this post were evaluated with Babel. If we don’t want to use ES.later decorators, we can use the exact same decorators as ordinary functions, like this:

const mustBeLoggedIn = () => {
    if (currentUser() == null)
      throw new PermissionsException("Must be logged in!");
  }

// An ordinary function rather than an arrow, so that before can supply this
const mustBeMe = function () {
    if (currentUser() == null || !currentUser().person().equals(this))
      throw new PermissionsException("Must be me!");
  }

const Person =
  before(mustBeMe, 'setName', 'setAge', 'age')(
  before(mustBeLoggedIn, 'fullName')(
  // HasAge is innermost so that setAge and age exist before they are advised
  HasAge(
    class {
      setName (first, last) {
        this.firstName = first;
        this.lastName = last;
      }

      fullName () {
        return this.firstName + " " + this.lastName;
      }
    }
  )
  )
);

Composition could also help:

const mustBeLoggedIn = () => {
    if (currentUser() == null)
      throw new PermissionsException("Must be logged in!");
  }

// An ordinary function rather than an arrow, so that before can supply this
const mustBeMe = function () {
    if (currentUser() == null || !currentUser().person().equals(this))
      throw new PermissionsException("Must be me!");
  }

// compose applies right-to-left, so HasAge goes last in the list: it must run first
const Person = compose(
  before(mustBeMe, 'setName', 'setAge', 'age'),
  before(mustBeLoggedIn, 'fullName'),
  HasAge
)(class {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }
});
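
The snippet above passes three functions to compose, and a two-argument compose would silently ignore the third. So it presumes a variadic version. One minimal sketch (an assumption on our part, not necessarily the helper used when the post was written) is:

```javascript
// Variadic, right-to-left composition: compose(f, g, h)(x) is f(g(h(x))).
const compose = (...fns) =>
  (argument) => fns.reduceRight((value, fn) => fn(value), argument);

const increment = (n) => n + 1;
const double = (n) => n * 2;
const square = (n) => n * n;

compose(increment, double, square)(3);
  //=> 19
```

Because the rightmost function is applied first, the rightmost class decorator in the list is the first one to see the class.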


notes:

  1. By “ES.later,” we mean some future version of ECMAScript that is likely to be approved eventually, but for the moment exists only in transpilers like Babel. Obviously, using any ES.later feature in production is a complex decision requiring many more considerations than can be enumerated in a blog post. 

https://raganwald.com/2015/08/05/method-advice
Method Decorators in ECMAScript 2015 (and beyond)

Writing higher-order functions in JavaScript is a long-established practice:

A higher-order function is a function that takes one or more functions as arguments, returns a function, or both.

For example, compose is a higher-order function that takes two functions as arguments, and returns a function that represents the composition of the arguments:

const compose = (a, b) => (c) => a(b(c));

A particularly interesting subset of higher-order functions are higher-order functions that decorate a function. “Function Decorators” take a function as an argument, and return a new function that has semantically similar behaviour, but is “decorated” with some additional functionality.

For example, this very simple maybe function is a function decorator. It takes a function as an argument, and returns a version of that function that returns undefined or null (without any side-effects) if any of its arguments are undefined or null:

const maybe = (fn) =>
  (...args) => {
    for (let arg of args) {
      if (arg == null) return arg;
    }
    return fn(...args);
  }

[1, null, 3, 4, null, 6, 7].map(maybe(x => x * x))
  //=> [1,null,9,16,null,36,49]

A similar decorator, requireAll, raises an exception if a function is invoked without at least as many arguments as declared parameters:

const requireAll = (fn) =>
  function (...args) {
    if (args.length < fn.length)
      throw new Error('missing required arguments');
    else
      return fn(...args);
  }

Function decorators are fairly straightforward. You’ll find a variety of them in popular libraries, such as decorators that memoize a computation or debounce an action that might be performed repeatedly.
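
As one illustration, here is a minimal sketch of a memoizing decorator. It is not taken from any particular library: the name memoized is ours, and it assumes the arguments can be distinguished by serializing them with JSON.stringify.

```javascript
// A minimal memoizing decorator: results are cached by their serialized arguments.
const memoized = (fn) => {
  const cache = new Map();

  return function (...args) {
    const key = JSON.stringify(args);

    if (!cache.has(key)) cache.set(key, fn.apply(this, args));
    return cache.get(key);
  };
};

let calls = 0;
const slowSquare = (n) => {
  calls += 1;
  return n * n;
};
const fastSquare = memoized(slowSquare);

fastSquare(12);
  //=> 144
fastSquare(12);
  //=> 144, served from the cache: slowSquare only ran once
```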


Messerschmidt

simple method decoration

As noted in The Symmetry of JavaScript Functions, these simple decorators work and work well for ordinary functions. But in JavaScript, functions can be invoked in different ways, and some of those ways are slightly incompatible with each other.

Of great interest to us are methods in JavaScript, functions that are used to define the behaviour of instances. When a function is invoked as a method, the name this is bound to the instance, and most methods rely on that binding to work properly.

Consider, for example Person:

class Person {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
};

const thinker = new Person()
                  .setName('Albert', 'Einstein');

thinker.fullName()
  //=> 'Albert Einstein'

thinker.setName('Marie', 'Curie');

thinker.fullName()
  //=> 'Marie Curie'

The setName method is a function. Let’s see what happens if we try to decorate it with requireAll:

Object.defineProperty(Person.prototype, 'setName', { value: requireAll(Person.prototype.setName) });

const thinker = new Person()
                  .setName('Albert', 'Einstein');
  //=> Attempted to assign to readonly property.

WTF!?

After some inspection, we realize the problem: Before we decorated it, setName was invoked as a method, and thus this was bound to the thinker instance. But once wrapped in requireAll, our setName function is now invoked as an ordinary function with the line return fn(...args);, so this is set to the wrong thing.

If we want to use requireAll with methods, we have to write it in such a way that it preserves this when it invokes the underlying function:

const requireAll = (fn) =>
  function (...args) {
    if (args.length < fn.length)
      throw new Error('missing required arguments');
    else
      return fn.apply(this, args);
  }

const thinker = new Person()
                  .setName('Prince');
  //=> missing required arguments

It now works properly, including rejecting invocations that do not pass all the arguments. But you have to be very careful when writing higher-order functions to make sure they work as both function decorators and as method decorators.

the problem with stateful method decorators

Handling this properly is not the only way in which ordinary function decorators differ from method decorators. Some decorators are stateful, like once. Here’s a version that correctly sets this:

const once = (fn) => {
  let hasRun = false;

  return function (...args) {
    if (hasRun) return;
    hasRun = true;
    return fn.apply(this, args);
  }
}

Imagining for a moment that we wish to only allow a person to have their name set once, we might write:

const once = (fn) => {
  let hasRun = false;

  return function (...args) {
    if (hasRun) return;
    hasRun = true;
    return fn.apply(this, args);
  }
}

class Person {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
};

Object.defineProperty(Person.prototype, 'setName', { value: once(Person.prototype.setName) });

const logician = new Person()
                   .setName('Raymond', 'Smullyan')
                   .setName('Haskell', 'Curry');

logician.fullName()
  //=> Raymond Smullyan

As we expect, only the first call to .setName has any effect, and it works on a method. But there is a subtle bug that could easily evade naïve attempts to write unit tests:

const logician = new Person()
                   .setName('Raymond', 'Smullyan');

const musician = new Person()
                   .setName('Miles', 'Davis');

logician.fullName()
  //=> Raymond Smullyan

musician.fullName()
  //=> Raymond Smullyan

!?!?!?!

What has happened here is that when we write Object.defineProperty(Person.prototype, 'setName', { value: once(Person.prototype.setName) });, we wrapped a function bound to Person.prototype. That function is shared between all instances of Person. That’s deliberate, it’s the whole point of prototypical inheritance (and the “class-based inheritance” JavaScript builds with prototypes).

Since our once decorator returns a decorated function with private state (the hasRun variable), all the instances share the same private state, and thus the bug.

stateful method decorators

If we don’t need to use the same decorator for functions and for methods, we can rewrite our decorator to use a WeakSet to track whether a method has been invoked for an instance:

const once = (fn) => {
  let invocations = new WeakSet();

  return function (...args) {
    if (invocations.has(this)) return;
    invocations.add(this);
    return fn.apply(this, args);
  }
}

const logician = new Person()
                   .setName('Raymond', 'Smullyan');

logician.setName('Haskell', 'Curry');

const musician = new Person()
                   .setName('Miles', 'Davis');

logician.fullName()
  //=> Raymond Smullyan

musician.fullName()
  //=> Miles Davis

Now the decorator records, in a WeakSet, whether .setName has been invoked on each instance, so logician and musician can share the method without sharing its state.

incompatibility

To handle methods, we have introduced “accidental complexity” to handle this and to handle state. Worse, our implementation of once for methods won’t work properly with ordinary functions in “strict” mode:

"use strict"

const hello = once(() => 'hello!');

hello()
  //=> undefined is not an object!

If you haven’t invoked it as a method, this is bound to undefined in strict mode, and undefined cannot be added to a WeakSet.

Correcting our decorator to deal with undefined is straightforward:

const once = (fn) => {
  let invocations = new WeakSet(),
      // a plain object as the sentinel: older engines only accept objects as WeakSet members
      undefinedContext = {};

  return function (...args) {
    const context = this === undefined
                    ? undefinedContext
                    : this;
    if (invocations.has(context)) return;
    invocations.add(context);
    return fn.apply(this, args);
  }
}

However, we’re adding more accidental complexity to handle the fact that function invocation is blue, and method invocation is khaki.1

In the end, we can either write specialized decorators designed specifically for methods, or tolerate the additional complexity of trying to handle method invocation and function invocation in the same decorator.
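
If we choose to tolerate the complexity, it is worth checking that one decorator really does serve both kinds of invocation. The sketch below does that. It uses a plain object as the stand-in for an undefined context, which is an assumption made here because older engines only accept objects as WeakSet members:

```javascript
"use strict";

const once = (fn) => {
  const invocations = new WeakSet();
  const undefinedContext = {}; // stand-in for "no receiver"

  return function (...args) {
    const context = this === undefined ? undefinedContext : this;

    if (invocations.has(context)) return;
    invocations.add(context);
    return fn.apply(this, args);
  };
};

// As a function decorator:
const hello = once(() => 'hello!');
const firstGreeting = hello();   // 'hello!'
const secondGreeting = hello();  // undefined

// As a method decorator:
class Person {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
}

Object.defineProperty(Person.prototype, 'setName', { value: once(Person.prototype.setName) });

const logician = new Person().setName('Raymond', 'Smullyan');
logician.setName('Haskell', 'Curry');   // ignored: already set for this instance

const musician = new Person().setName('Miles', 'Davis');
```

Here logician keeps the name Raymond Smullyan while musician independently becomes Miles Davis.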

the bottom line

Function decorators can be used as method decorators, provided that we take care to handle this properly, and manage state carefully when required. The patterns for creating and using method decorators in JavaScript are straightforward, in large part because underneath the syntactic sugar for classes, we are still working with functions, objects, and delegation through prototypes.


Bonus: Method Decorators in ES.later

XFJ 022

Before ECMAScript 2015 (a/k/a “ES6”), we decorated a method in a simple and direct way. Here’s roughly how we used to write Person, using a pseudo-private property pattern:

const once = (fn) => {
  let hasRunValue = false,
      hasRunProperty = "hasRun-" + fn.name + "-" + new Date().getTime();

  return function (...args) {
    if (this == null) {
      if (hasRunValue) return;
      hasRunValue = true;
    }
    else {
      if (this[hasRunProperty]) return;
      Object.defineProperty(this, hasRunProperty, { value: true });
    }
    return fn.apply(this, args);
  }
}

var Person = function () {};

Person.prototype.setName = once(function setName (first, last) {
  this.firstName = first;
  this.lastName = last;
  return this;
});

Person.prototype.fullName = function fullName () {
  return this.firstName + " " + this.lastName;
};

Our decoration was simply a function call at the exact point where we were associating a function with the prototype. However, this code is inelegant: It separates the creation of the “class” from the definition of each method.

If we had Object.assign or an equivalent, we were able to define all of the methods, including decorators, in one step:

var Person = function () {};

_.extend(Person.prototype, {

  setName: once(function setName (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }),

  fullName: function fullName () {
    return this.firstName + " " + this.lastName;
  }

});

Easy, peasy, lemon-squeezy. But the ECMAScript 2015 syntax for classes makes this a tiny bit awkward. When we use a compact method definition, we get things like the method being non-enumerable by default. So to get a similar result in ECMAScript 2015, we have to write some clumsy code after the class has been defined:

Object.defineProperty(Person.prototype, 'setName', { value: once(Person.prototype.setName) });

This is weak for two reasons. First, it’s fugly and full of accidental complexity. Second, modifying the prototype after defining the class separates two things that conceptually ought to be together. The class keyword giveth, but it also taketh away.

To solve a problem created by ECMAScript 2015, method decorators have been proposed for ES.later.2 The syntax is similar to class decorators, but where a class decorator takes a class as an argument and returns the same (or a different) class, a method decorator actually intercedes when a property is defined on the prototype.

Thus, a fluent (a/k/a chain) decorator would look like this:

function fluent (target, name, descriptor) {
  const method = descriptor.value;

  descriptor.value = function (...args) {
    method.apply(this, args);
    return this;
  }
}

And we’d use it like this:

class Person {

  @fluent
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};
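
Engines don’t run the @ syntax natively, so a transpiler has to rewrite it. As a rough sketch of the idea (an approximation of the behaviour described above, not Babel’s actual output), applying fluent by hand looks something like this. The helper name applyMethodDecorator is hypothetical:

```javascript
function fluent (target, name, descriptor) {
  const method = descriptor.value;

  descriptor.value = function (...args) {
    method.apply(this, args);
    return this;
  };
}

// Hypothetical helper: fetch the property descriptor, let the decorator
// modify (or replace) it, then define the property again.
const applyMethodDecorator = (decorator, clazz, name) => {
  const target = clazz.prototype;
  const descriptor = Object.getOwnPropertyDescriptor(target, name);
  const decorated = decorator(target, name, descriptor) || descriptor;

  Object.defineProperty(target, name, decorated);
  return clazz;
};

class Person {
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }
}

applyMethodDecorator(fluent, Person, 'setName');

const result = new Person().setName('Emmy', 'Noether').fullName();
// result is 'Emmy Noether': setName now returns the instance, so the call chains
```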

Once again, we end up with two kinds of decorators: One for functions, and one for methods, with different structures. We need a new colour!

But since decorators are expressions, we can alleviate the pain with an adaptor:

const wrapWith = (decorator) =>
  function (target, name, descriptor) {
    descriptor.value = decorator(descriptor.value);
  }

function fluent (method) {
  return function (...args) {
    method.apply(this, args);
    return this;
  }
}

class Person {

  @wrapWith(once)
  @wrapWith(fluent)
  setName (first, last) {
    this.firstName = first;
    this.lastName = last;
  }

  fullName () {
    return this.firstName + " " + this.lastName;
  }

};

  1. See the aforelinked The Symmetry of JavaScript Functions 

  2. By “ES.later,” we mean some future version of ECMAScript that is likely to be approved eventually, but for the moment exists only in transpilers like Babel. Obviously, using any ES.later feature in production is a complex decision requiring many more considerations than can be enumerated in a blog post. 

https://raganwald.com/2015/06/28/method-decorators
Using ES.later Decorators as Mixins

In Functional Mixins, we discussed mixing functionality into JavaScript classes, changing the class. We observed that this has pitfalls when applied to a class that might already be in use elsewhere, but is perfectly cromulent when used as a technique to build a class from scratch. When used strictly to build a class, mixins help us decompose classes into smaller entities with focused responsibilities that can be shared between classes as necessary.

Let’s recall our helper for making a functional mixin. We’ll just call it mixin:

function mixin (behaviour, sharedBehaviour = {}) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol('isa');

  function _mixin (target) {
    for (let property of instanceKeys)
      Object.defineProperty(target, property, { value: behaviour[property] });
    Object.defineProperty(target, typeTag, { value: true });
    return target;
  }
  for (let property of sharedKeys)
    Object.defineProperty(_mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

This creates a function that mixes behaviour into any target, be it a class prototype or a standalone object. There is a convenience capability of making “static” or “shared” properties of the function, and it even adds some simple hasInstance handling so that the instanceof operator will work.
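
To make the shared properties and the instanceof handling concrete, here is a small sketch that mixes behaviour into a standalone object. The Coloured mixin, its RED constant, and the urgent object are illustrative names chosen for this example:

```javascript
function mixin (behaviour, sharedBehaviour = {}) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol('isa');

  function _mixin (target) {
    for (let property of instanceKeys)
      Object.defineProperty(target, property, { value: behaviour[property] });
    Object.defineProperty(target, typeTag, { value: true });
    return target;
  }
  for (let property of sharedKeys)
    Object.defineProperty(_mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

// Instance behaviour first, then one shared ("static") constant.
const Coloured = mixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
}, {
  RED: { r: 255, g: 0, b: 0 }
});

// Mix into a standalone object rather than a prototype.
const urgent = Coloured({ name: 'file taxes' }).setColourRGB(Coloured.RED);

const urgentIsColoured = urgent instanceof Coloured;  // true: the type tag is present
const plainIsColoured = ({}) instanceof Coloured;     // false: no type tag
```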

Here we are using it on a class’ prototype:

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

BookCollector(Person.prototype);

const president = new Person('Barack', 'Obama')

president
  .addToCollection("JavaScript Allongé")
  .addToCollection("Kestrels, Quirky Birds, and Hopeless Egocentricity");

president.collection()
  //=> ["JavaScript Allongé","Kestrels, Quirky Birds, and Hopeless Egocentricity"]

mixins just for classes

It’s very nice that our mixins support any kind of target, but let’s make them class-specific:

function mixin (behaviour, sharedBehaviour = {}) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  for (let property of sharedKeys)
    Object.defineProperty(_mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

This version’s _mixin function mixes instance behaviour into a class’s prototype, so we gain convenience at the expense of flexibility:

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

BookCollector(Person);

const president = new Person('Barack', 'Obama')

president
  .addToCollection("JavaScript Allongé")
  .addToCollection("Kestrels, Quirky Birds, and Hopeless Egocentricity");

president.collection()
  //=> ["JavaScript Allongé","Kestrels, Quirky Birds, and Hopeless Egocentricity"]

So far, nice, but it feels a bit bolted-on-after-the-fact. Let’s take advantage of the fact that Classes are Expressions:

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

const Person = BookCollector(class {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
});

This is structurally nicer, it binds the mixing in of behaviour with the class declaration in one expression, so we’re getting away from this idea of mixing things into classes after they’re created.

But (there’s always a but), our pattern has three different elements (the name being bound, the mixin, and the class being declared). And if we wanted to mix two or more behaviours in, we’d have to nest the functions like this:

const Author = mixin({
  writeBook (name) {
    this.books().push(name);
    return this;
  },
  books () {
    return this._books_written || (this._books_written = []);
  }
});

const Person = Author(BookCollector(class {
  // ...
}));

Some people find this “clear as day,” arguing that this is a simple expression taking advantage of JavaScript’s simplicity. The code behind mixin is simple and easy to read, and if you understand prototypes, you understand everything in this expression.

But others want a language to give them “magic,” an abstraction that they learn on the outside. At the moment, JavaScript has no “magic” for mixing functionality into classes. But what if there were?

class decorators

There is a well-regarded proposal to add Python-style class decorators to JavaScript in the next major revision after ECMAScript 2015.

A decorator is a function that operates on a class. Here’s a very simple example from the aforelinked implementation:

function annotation(target) {
   // Add a property on target
   target.annotated = true;
}

@annotation
class MyClass {
  // ...
}

MyClass.annotated
  //=> true

As you can see, annotation is a class decorator, and it takes a class as an argument. The function can do anything, including modifying the class or the class’s prototype. If the decorator function doesn’t return anything, the class’ name is bound to the modified class.1

A class is “decorated” with the function by preceding the definition with @ and an expression evaluating to the decorator. In the simple example, we use a variable name.
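
Without the syntax, the same thing can be written as an ordinary function call. This is a sketch of the rule just described, not a transpiler’s real output, and the helper names decorateClass and tag are hypothetical:

```javascript
function annotation (target) {
  // Add a property on target
  target.annotated = true;
}

// Hypothetical helper: if the decorator returns something, use it;
// otherwise keep the (possibly modified) class.
const decorateClass = (decorator, clazz) => decorator(clazz) || clazz;

const MyClass = decorateClass(annotation, class {
  // ...
});

MyClass.annotated
  //=> true

// Stacked decorators nest, and the one nearest the class is applied first.
// @tag('outer') @tag('inner') class Example {} corresponds roughly to:
const order = [];
const tag = (label) => (clazz) => { order.push(label); };

const Example = decorateClass(tag('outer'), decorateClass(tag('inner'), class {}));
// order is now ['inner', 'outer']
```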

Hmmm. A function that modifies a class, you say? Let’s try it:

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

@BookCollector
class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

const president = new Person('Barack', 'Obama')

president
  .addToCollection("JavaScript Allongé")
  .addToCollection("Kestrels, Quirky Birds, and Hopeless Egocentricity");

president.collection()
  //=> ["JavaScript Allongé","Kestrels, Quirky Birds, and Hopeless Egocentricity"]

You can also mix in multiple behaviours with decorators:

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

const Author = mixin({
  writeBook (name) {
    this.books().push(name);
    return this;
  },
  books () {
    return this._books_written || (this._books_written = []);
  }
});

@BookCollector @Author
class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

And if you want to use decorators to emulate Purely Functional Composition, it’s a fairly simple pattern:

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

@BookCollector @Author
class BookLover extends Person {};

Class decorators provide a compact, “magic” syntax that is closely tied to the construction of the class. They also require understanding one more kind of syntax. But some argue that having different syntax for different things aids understandability, and that having both @foo for decoration and bar(...) for function invocation is a win.
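
To confirm that decorating a subclass leaves the original class alone, here is the BookLover pattern written as a plain function call, using the class-specific mixin from earlier. The reader and stranger instances are hypothetical examples:

```javascript
function mixin (behaviour, sharedBehaviour = {}) {
  const instanceKeys = Reflect.ownKeys(behaviour);
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol('isa');

  function _mixin (clazz) {
    for (let property of instanceKeys)
      Object.defineProperty(clazz.prototype, property, {
        value: behaviour[property],
        writable: true
      });
    Object.defineProperty(clazz.prototype, typeTag, { value: true });
    return clazz;
  }
  for (let property of sharedKeys)
    Object.defineProperty(_mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(_mixin, Symbol.hasInstance, {
    value: (i) => !!i[typeTag]
  });
  return _mixin;
}

const BookCollector = mixin({
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
});

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

// The plain-function equivalent of: @BookCollector class BookLover extends Person {}
const BookLover = BookCollector(class extends Person {});

const reader = new BookLover('Jorge', 'Borges').addToCollection('Labyrinths');
const stranger = new Person('Nobody', 'Inparticular');

const readerCollects = reader instanceof BookCollector;      // true
const strangerCollects = stranger instanceof BookCollector;  // false: Person was never modified
```

The behaviour lands on the subclass’s prototype, so code elsewhere that relies on an unmodified Person is unaffected.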

using decorators

Decorators have not been formally approved, however there are various implementations available for transpiling decorator syntax to ES5 syntax. The examples in this post were evaluated with Babel.

If you prefer syntactic sugar that gives the appearance of a declarative construct, combining a mixin function with ES.later’s class decorators does the trick.2




notes:

  1. Although this example doesn’t show it, if it returns a constructor function, that is what will be assigned to the class’ name. This allows the creation of purely functional mixins and other interesting techniques that are beyond the scope of this post. 

  2. By “ES.later,” we mean some future version of ECMAScript that is likely to be approved eventually, but for the moment exists only in transpilers like Babel. Obviously, using any ES.later feature in production is a complex decision requiring many more considerations than can be enumerated in a blog post. 

https://raganwald.com/2015/06/26/decorators-in-es7
Purely Functional Composition

In Functional Mixins, we discussed mixing functionality into JavaScript classes. The act of mixing functionality in changes the class. This approach maps well to idioms from other languages, such as Ruby’s modules. It also helps us decompose classes into smaller entities with focused responsibilities that can be shared between classes as necessary.1

That being said, mutation has its drawbacks as well. People say, “it’s hard to reason about code that mutates data,” and when it comes to modifying classes, they are right.

Classes are often global to an entire program. Experience has shown that changing a class in one place can break the functionality of an entirely different part of the program that expects the class to remain unmodified.

Of course, if the modifications are only made as part of building the class in the first place, these concerns really do not apply. But what if we wish to modify a class that was made somewhere else? What if we wish to make modifications in just one place?

extension

Let’s revisit our ridiculously trivial Todo class:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

Now let us presume that this class is used throughout our application for various purposes. In one section of the code, we want Todo items that are also coloured. As we saw previously, we can accomplish that with a simple mixin like this:

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

Object.assign(Todo.prototype, Coloured);

While this works just fine for all of the Todos we create in this part of the program, we may accidentally break Todo instances used elsewhere. What we really want is a ColouredTodo in one part of the program, and Todo everywhere else.

The extends keyword solves that problem in the trivial case:

class ColouredTodo extends Todo {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  }
  getColourRGB () {
    return this.colourCode;
  }
}

A ColouredTodo is just like a Todo, but with added colour.

sharing is caring

One oft-repeated drawback of using extension is that it is difficult to share the “colour” functionality with other classes. Extension forms a strict tree. Another drawback is that the functionality can only be tested in concert with Todo, whereas it is trivial to independently test a well-crafted mixin.

Our problem is that with extension, our colour functionality is coupled to the Todo class. With a mixin, it isn’t. But with a mixin, our Todo class ended up coupled to Coloured. With extension, it wasn’t.

What we want is to decouple Todo from Coloured with extension, and to decouple Coloured from ColouredTodo with a mixin:

class ColouredTodo extends Todo {}

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

Object.assign(ColouredTodo.prototype, Coloured);

We can write a simple function to encapsulate this pattern:

function ComposeWithClass(clazz, ...mixins) {
  const subclazz = class extends clazz {};
  for (let mixin of mixins) {
    Object.assign(subclazz.prototype, mixin);
  }
  return subclazz;
}

const ColouredTodo = ComposeWithClass(Todo, Coloured);

The ComposeWithClass function returns a new class without modifying its arguments. In other words, it’s composing behaviour with a class, not mixing behaviour into a class.
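A quick way to convince ourselves of this is to compose a class and then check that the original is untouched. The following is a minimal sketch that repeats the definitions above so that it runs on its own; the instance names are just for illustration.

```javascript
class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

function ComposeWithClass (clazz, ...mixins) {
  // Mix into a fresh subclass, leaving `clazz` and the mixins alone.
  const subclazz = class extends clazz {};
  for (let mixin of mixins) {
    Object.assign(subclazz.prototype, mixin);
  }
  return subclazz;
}

const ColouredTodo = ComposeWithClass(Todo, Coloured);

const plain = new Todo('plain');
const fancy = new ColouredTodo('fancy').setColourRGB({r: 255, g: 0, b: 0});

console.log(typeof plain.setColourRGB); // 'undefined': Todo was not modified
console.log(fancy.getColourRGB());      // { r: 255, g: 0, b: 0 }
console.log(fancy instanceof Todo);     // true
console.log(fancy.do().done);           // true: behaviour inherited from Todo
```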

enhance

We can enhance ComposeWithClass to address some of the issues we noticed with mutating mixins, such as making methods non-enumerable:

const shared = Symbol("shared");

function ComposeWithClass(clazz, ...mixins) {
  const subclazz = class extends clazz {};

  for (let mixin of mixins) {
    const instanceKeys = Reflect
      .ownKeys(mixin)
      .filter(key => key !== shared && key !== Symbol.hasInstance);
    const sharedBehaviour = mixin[shared] || {};
    const sharedKeys = Reflect.ownKeys(sharedBehaviour);

    for (let property of instanceKeys)
      Object.defineProperty(subclazz.prototype, property, { value: mixin[property] });
    for (let property of sharedKeys)
      Object.defineProperty(subclazz, property, {
        value: sharedBehaviour[property],
        enumerable: sharedBehaviour.propertyIsEnumerable(property)
      });
  }
  return subclazz;
}

ComposeWithClass.shared = shared;

Written like this, it’s up to individual behaviours to sort out instanceof:

const isaColoured = Symbol();

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  },
  [isaColoured]: true,
  [Symbol.hasInstance] (instance) { return instance[isaColoured]; }
};

And that’s something we can encapsulate, if we wish:

function HasInstances (behaviour) {
  const typeTag = Symbol();
  return Object.assign({}, behaviour, {
    [typeTag]: true,
    [Symbol.hasInstance] (instance) { return instance[typeTag]; }
  });
}

the complete composition

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

const Coloured = HasInstances({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
});

const ColouredTodo = ComposeWithClass(Todo, Coloured);
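To see the pieces working together, here is a self-contained sketch that repeats the enhanced ComposeWithClass and HasInstances from above and then exercises the result. The RED constant passed through the shared key is an assumption added for this demonstration, and Todo is trimmed to its constructor.

```javascript
const shared = Symbol("shared");

function ComposeWithClass (clazz, ...mixins) {
  const subclazz = class extends clazz {};

  for (let mixin of mixins) {
    const instanceKeys = Reflect
      .ownKeys(mixin)
      .filter(key => key !== shared && key !== Symbol.hasInstance);
    const sharedBehaviour = mixin[shared] || {};
    const sharedKeys = Reflect.ownKeys(sharedBehaviour);

    // Instance behaviour goes on the subclass prototype, non-enumerable.
    for (let property of instanceKeys)
      Object.defineProperty(subclazz.prototype, property, { value: mixin[property] });
    // Shared behaviour goes on the subclass itself.
    for (let property of sharedKeys)
      Object.defineProperty(subclazz, property, {
        value: sharedBehaviour[property],
        enumerable: sharedBehaviour.propertyIsEnumerable(property)
      });
  }
  return subclazz;
}

function HasInstances (behaviour) {
  const typeTag = Symbol();
  return Object.assign({}, behaviour, {
    [typeTag]: true,
    [Symbol.hasInstance] (instance) { return instance[typeTag]; }
  });
}

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
}

const Coloured = HasInstances({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  },
  // Assumed addition for this demonstration: a constant shared via the class.
  [shared]: {
    RED: { r: 255, g: 0, b: 0 }
  }
});

const ColouredTodo = ComposeWithClass(Todo, Coloured);
const urgent = new ColouredTodo('finish blog post').setColourRGB(ColouredTodo.RED);

const enumerated = [];
for (let property in urgent) enumerated.push(property);

console.log(urgent instanceof Coloured);            // true
console.log(new Todo('plain') instanceof Coloured); // false
console.log(enumerated);                            // [ 'name', 'done', 'colourCode' ]
```

Note that in this version the shared constant lands on the composed class (ColouredTodo.RED), since Coloured is a plain object rather than a function.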

summary

A “purely functional” approach to composing functionality is appropriate when we wish to compose behaviour with classes, but do not wish to mutate a class that is used elsewhere. One approach is to extend the class into a subclass, and mix behaviour into the newly created subclass.

(discuss on hacker news)


notes:

  1. Another, speculative benefit is that it maps well to features like class decorators or the with keyword, either of which may land in a future version of JavaScript or may be adopted by transpiling tools like Babel. 

https://raganwald.com/2015/06/20/purely-functional-composition
Functional Mixins in ECMAScript 2015

In Prototypes are Objects, we saw that you can emulate “mixins” using Object.assign on the prototypes that underlie JavaScript “classes.” We’ll revisit this subject now and spend more time looking at mixing functionality into classes.

First, a quick recap: In JavaScript, a “class” is implemented as a constructor function and its prototype, whether you write it directly, or use the class keyword. Instances of the class are created by calling the constructor with new. They “inherit” shared behaviour from the constructor’s prototype property.1

the object mixin pattern

One way to share behaviour scattered across multiple classes, or to untangle behaviour by factoring it out of an overweight prototype, is to extend a prototype with a mixin.

Here’s a class of todo items:

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

And a “mixin” that is responsible for colour-coding:

const Coloured = {
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
};

Mixing colour coding into our Todo prototype is straightforward:

Object.assign(Todo.prototype, Coloured);

new Todo('test')
  .setColourRGB({r: 1, g: 2, b: 3})
  //=> {"name":"test","done":false,"colourCode":{"r":1,"g":2,"b":3}}

We can “upgrade” it to have a private property if we wish:

const colourCode = Symbol("colourCode");

const Coloured = {
  setColourRGB ({r, g, b}) {
    this[colourCode] = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this[colourCode];
  }
};

So far, very easy and very simple. This is a pattern, a recipe for solving a certain problem using a particular organization of code.
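As a small check of what “private” buys us here, the sketch below mixes the symbol-based Coloured into a trimmed-down Todo and looks at the instance from the outside. It is only an illustration; the trimmed Todo and the variable names are assumptions.

```javascript
class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
}

const colourCode = Symbol("colourCode");

const Coloured = {
  setColourRGB ({r, g, b}) {
    this[colourCode] = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this[colourCode];
  }
};

Object.assign(Todo.prototype, Coloured);

const todo = new Todo('test').setColourRGB({r: 1, g: 2, b: 3});

// The symbol-keyed property is skipped by JSON.stringify and is not
// reachable under the string name `colourCode`.
console.log(JSON.stringify(todo)); // {"name":"test","done":false}
console.log(todo.colourCode);      // undefined
console.log(todo.getColourRGB());  // { r: 1, g: 2, b: 3 }
```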

functional mixins

The object mixin we have above works properly, but our little recipe had two distinct steps: Define the mixin and then extend the class prototype. Angus Croll pointed out that it’s more elegant to define a mixin as a function rather than an object. He calls this a functional mixin. Here’s Coloured again, recast in functional form:

const Coloured = (target) =>
  Object.assign(target, {
    setColourRGB ({r, g, b}) {
      this.colourCode = {r, g, b};
      return this;
    },
    getColourRGB () {
      return this.colourCode;
    }
  });

Coloured(Todo.prototype);

We can make ourselves a factory function that also names the pattern:

const FunctionalMixin = (behaviour) =>
  target => Object.assign(target, behaviour);

This allows us to define functional mixins neatly:

const Coloured = FunctionalMixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
});

enumerability

If we look at the way class defines prototypes, we find that the methods defined are not enumerable by default. This works around a common error where programmers iterate over the keys of an instance and fail to test for .hasOwnProperty.

Our object mixin pattern does not work this way: the methods defined in a mixin are enumerable by default. And if we carefully defined them to be non-enumerable, Object.assign wouldn’t mix them into the target prototype, because Object.assign only copies enumerable properties.

And thus:

Coloured(Todo.prototype)

const urgent = new Todo("finish blog post");
urgent.setColourRGB({r: 255, g: 0, b: 0});

for (let property in urgent) console.log(property);
  // =>
  //   name
  //   done
  //   colourCode
  //   setColourRGB
  //   getColourRGB

As we can see, the setColourRGB and getColourRGB methods are enumerated, although the do and undo methods are not. This can be a problem with naïve code: we can’t always rewrite all the other code to carefully use .hasOwnProperty.

One benefit of functional mixins is that we can solve this problem and transparently make mixins behave like class:

const FunctionalMixin = (behaviour) =>
  function (target) {
    for (let property of Reflect.ownKeys(behaviour))
      Object.defineProperty(target, property, { value: behaviour[property] })
    return target;
  }

Writing this out as a pattern would be tedious and error-prone. Encapsulating the behaviour into a function is a small win.
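Here is a minimal, self-contained sketch of the non-enumerable FunctionalMixin in use, repeating the definitions above so that we can inspect the result directly.

```javascript
const FunctionalMixin = (behaviour) =>
  function (target) {
    for (let property of Reflect.ownKeys(behaviour))
      Object.defineProperty(target, property, { value: behaviour[property] });
    return target;
  };

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

const Coloured = FunctionalMixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  }
});

Coloured(Todo.prototype);

const urgent = new Todo("finish blog post");
urgent.setColourRGB({r: 255, g: 0, b: 0});

const enumerated = [];
for (let property in urgent) enumerated.push(property);
console.log(enumerated); // [ 'name', 'done', 'colourCode' ]

// Object.defineProperty defaults every omitted attribute to false.
const descriptor = Object.getOwnPropertyDescriptor(Todo.prototype, 'setColourRGB');
console.log(descriptor.enumerable, descriptor.writable, descriptor.configurable);
// false false false
```

One design note: methods defined with class are non-enumerable but remain writable and configurable, so this version is stricter than class. Passing writable: true and configurable: true in the descriptor would match class more closely, if that matters for the code base.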

mixin responsibilities

Like classes, mixins are metaobjects: They define behaviour for instances. In addition to defining behaviour in the form of methods, classes are also responsible for initializing instances. But sometimes, classes and metaobjects handle additional responsibilities.

For example, sometimes a particular concept is associated with some well-known constants. When using a class, it can be handy to namespace such values in the class itself:

class Todo {
  constructor (name) {
    this.name = name || Todo.DEFAULT_NAME;
    this.done = false;
  }
  do () {
    this.done = true;
    return this;
  }
  undo () {
    this.done = false;
    return this;
  }
}

Todo.DEFAULT_NAME = 'Untitled';

// If we are sticklers for read-only constants, we could write:
// Object.defineProperty(Todo, 'DEFAULT_NAME', {value: 'Untitled'});

We can’t really do the same thing with simple mixins, because all of the properties in a simple mixin end up being mixed into the prototype of instances we create by default. For example, let’s say we want to define Coloured.RED, Coloured.GREEN, and Coloured.BLUE. But we don’t want any specific coloured instance to define RED, GREEN, or BLUE.

Again, we can solve this problem by building a functional mixin. Our FunctionalMixin factory function will accept an optional dictionary of read-only mixin properties, provided they are associated with a special key:

const shared = Symbol("shared");

function FunctionalMixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour)
    .filter(key => key !== shared);
  const sharedBehaviour = behaviour[shared] || {};
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);

  function mixin (target) {
    for (let property of instanceKeys)
      Object.defineProperty(target, property, { value: behaviour[property] });
    return target;
  }
  for (let property of sharedKeys)
    Object.defineProperty(mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  return mixin;
}

FunctionalMixin.shared = shared;

And now we can write:

const Coloured = FunctionalMixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  },
  [FunctionalMixin.shared]: {
    RED:   { r: 255, g: 0,   b: 0   },
    GREEN: { r: 0,   g: 255, b: 0   },
    BLUE:  { r: 0,   g: 0,   b: 255 },
  }
});

Coloured(Todo.prototype)

const urgent = new Todo("finish blog post");
urgent.setColourRGB(Coloured.RED);

urgent.getColourRGB()
  //=> {"r":255,"g":0,"b":0}

mixin methods

Such properties need not be values. Sometimes, classes have methods. And likewise, sometimes it makes sense for a mixin to have its own methods. One example concerns instanceof.

In earlier versions of ECMAScript, instanceof was an operator that checked whether the prototype of a constructor function appears in the prototype chain of an instance. It works just fine with “classes,” but it does not work “out of the box” with mixins:

urgent instanceof Todo
  //=> true

urgent instanceof Coloured
  //=> false

To handle this and some other issues where programmers are creating their own notion of dynamic types, or managing prototypes directly with Object.create and Object.setPrototypeOf, ECMAScript 2015 provides a way to override the built-in instanceof behaviour: An object can define a method associated with a well-known symbol, Symbol.hasInstance.

We can test this quickly:2

Object.defineProperty(Coloured, Symbol.hasInstance, {value: (instance) => true});
urgent instanceof Coloured
  //=> true
({}) instanceof Coloured
  //=> true

Of course, that is not semantically correct. But using this technique, we can write:

const shared = Symbol("shared");

function FunctionalMixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour)
    .filter(key => key !== shared);
  const sharedBehaviour = behaviour[shared] || {};
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol("isA");

  function mixin (target) {
    for (let property of instanceKeys)
      Object.defineProperty(target, property, { value: behaviour[property] });
    target[typeTag] = true;
    return target;
  }
  for (let property of sharedKeys)
    Object.defineProperty(mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(mixin, Symbol.hasInstance, {value: (instance) => !!instance[typeTag]});
  return mixin;
}

FunctionalMixin.shared = shared;

urgent instanceof Coloured
  //=> true
({}) instanceof Coloured
  //=> false

Do you need to implement instanceof? Quite possibly not. “Rolling your own polymorphism” is usually a last resort. But it can be handy for writing test cases, and a few daring framework developers might be working on multiple dispatch and pattern-matching for functions.
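For readers who want to try this out, the following sketch repeats the final FunctionalMixin and mixes Coloured into two unrelated classes. The Sticker class is hypothetical and exists only to show that the type tag travels with the mixin rather than with any one class.

```javascript
const shared = Symbol("shared");

function FunctionalMixin (behaviour) {
  const instanceKeys = Reflect.ownKeys(behaviour)
    .filter(key => key !== shared);
  const sharedBehaviour = behaviour[shared] || {};
  const sharedKeys = Reflect.ownKeys(sharedBehaviour);
  const typeTag = Symbol("isA");

  function mixin (target) {
    for (let property of instanceKeys)
      Object.defineProperty(target, property, { value: behaviour[property] });
    target[typeTag] = true;
    return target;
  }
  for (let property of sharedKeys)
    Object.defineProperty(mixin, property, {
      value: sharedBehaviour[property],
      enumerable: sharedBehaviour.propertyIsEnumerable(property)
    });
  Object.defineProperty(mixin, Symbol.hasInstance, {value: (instance) => !!instance[typeTag]});
  return mixin;
}

FunctionalMixin.shared = shared;

const Coloured = FunctionalMixin({
  setColourRGB ({r, g, b}) {
    this.colourCode = {r, g, b};
    return this;
  },
  getColourRGB () {
    return this.colourCode;
  },
  [FunctionalMixin.shared]: {
    RED: { r: 255, g: 0, b: 0 }
  }
});

class Todo {
  constructor (name) {
    this.name = name || 'Untitled';
    this.done = false;
  }
}

// Hypothetical second class, unrelated to Todo.
class Sticker {
  constructor (label) {
    this.label = label;
  }
}

Coloured(Todo.prototype);
Coloured(Sticker.prototype);

const urgent = new Todo("finish blog post").setColourRGB(Coloured.RED);
const star = new Sticker("gold star");

console.log(urgent instanceof Coloured); // true
console.log(star instanceof Coloured);   // true
console.log(({}) instanceof Coloured);   // false
console.log('RED' in urgent);            // false: constants live on the mixin
```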

summary

The charm of the object mixin pattern is its simplicity: It really does not need an abstraction wrapped around an object literal and Object.assign.

However, behaviour defined with the mixin pattern is slightly different than behaviour defined with the class keyword. Two examples of these differences are enumerability and mixin properties (such as constants and mixin methods like [Symbol.hasInstance]).

Functional mixins provide an opportunity to implement such functionality, at the cost of some complexity in the FunctionalMixin function that creates functional mixins.

As a general rule, it’s best to have things behave as similarly as possible in the domain code, and this sometimes does involve some extra complexity in the infrastructure code. But that is more of a guideline than a hard-and-fast rule, and for this reason there is a place for both the object mixin pattern and functional mixins in JavaScript.

(discuss on hacker news and /r/javascript)

follow-up: Purely Functional Composition


notes:

  1. A much better way to put it is that objects with a prototype delegate behaviour to their prototype (and that may in turn delegate behaviour to its prototype if it has one, and so on). 

  2. This may not work with various transpilers and other incomplete ECMAScript 2015 implementations. Check the documentation. For example, you must enable the “high compliancy” mode in BabelJS. This is off by default to provide the highest possible performance for code bases that do not need to use features like this. 

https://raganwald.com/2015/06/17/functional-mixins
Prototypes are Objects (and why that matters)

Prerequisite: This post presumes that readers are familiar with JavaScript’s objects, know how a prototype defines behaviour for an object, know what a constructor function is, and how a constructor’s .prototype property is related to the objects it constructs. Passing familiarity with ECMAScript 2015 syntax will be helpful.

We have always been able to create a JavaScript class like this:

function Person (first, last) {
  this.rename(first, last);
}

Person.prototype.fullName = function fullName () {
  return this.firstName + " " + this.lastName;
};


Person.prototype.rename = function rename (first, last) {
  this.firstName = first;
  this.lastName = last;
  return this;
}

Person is a constructor function, and it’s also a class, in the JavaScript sense of the word “class.”

ECMAScript 2015 provides the class keyword and “compact method notation” as syntactic sugar for writing a function and assigning methods to its prototype (there is a little more involved, but that isn’t relevant here). So we can now write our Person class like this:

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

Nice. But behind the scenes, you still wind up with a constructor function bound to the name Person, and with Person.prototype being an object that looks like this:

{
  fullName: function fullName () {
    return this.firstName + " " + this.lastName;
  },
  rename: function rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

prototypes are objects

If we want to change the behaviour of a JavaScript object, we can add, remove, or modify methods of the object by adding, removing, or modifying the functions bound to properties of the object. This differs from most classical languages, which have a special form (e.g. Ruby’s def) for defining methods.

Prototypes in JavaScript are “just objects,” and since they are “just objects,” we can add, remove, or modify methods of the prototype by adding, removing, or modifying the functions bound to properties of the prototype.

That’s exactly what the ECMAScript 5 code above does, and the ECMAScript 2015 class syntax “desugars” to equivalent code.
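We can poke at a class to see that its prototype really is an ordinary object. The sketch below is only an illustration: the initials method and the sample names are made up for the purpose.

```javascript
class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

console.log(typeof Person);           // 'function'
console.log(typeof Person.prototype); // 'object'
console.log(Object.getOwnPropertyNames(Person.prototype));
// [ 'constructor', 'fullName', 'rename' ]

const someone = new Person("Jane", "Doe");

// Adding a method to the prototype affects instances that already exist.
Person.prototype.initials = function initials () {
  return this.firstName[0] + this.lastName[0];
};
console.log(someone.initials());      // 'JD'

// Removing a method works the same way as removing any other property.
delete Person.prototype.fullName;
console.log(typeof someone.fullName); // 'undefined'
```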

Prototypes being “just objects” means we can use any technique that works on objects on prototypes. For example, instead of binding functions to a prototype one-at-a-time, we can bind them en masse using Object.assign:

function Person (first, last) {
  this.rename(first, last);
}

Object.assign(Person.prototype, {
  fullName: function fullName () {
    return this.firstName + " " + this.lastName;
  },
  rename: function rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
})

And of course, we could use compact method syntax1 if we like:

function Person (first, last) {
  this.rename(first, last);
}

Object.assign(Person.prototype, {
  fullName () {
    return this.firstName + " " + this.lastName;
  },
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
})

mixins

Since class desugars to constructor functions and prototypes, we can mix and match techniques:

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

Object.assign(Person.prototype, {
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
})

We have just “mixed” methods concerned with collecting books into our Person class. It’s great that we can write code in a very “point-free” style, but naming things is also great:

const BookCollector = {
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
};

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

Object.assign(Person.prototype, BookCollector);

We can do this as much as we like:

const BookCollector = {
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
};

const Author = {
  writeBook (name) {
    this.books().push(name);
    return this;
  },
  books () {
    return this._books_written || (this._books_written = []);
  }
};

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

Object.assign(Person.prototype, BookCollector, Author);

why we might want to use mixins

Composing classes out of base functionality (Person) and mixins (BookCollector and Author) provides several benefits.

First, sometimes functionality does not neatly decompose in a tree-like form. Book authors are sometimes corporations, not persons. And antiquarian book stores collect books just like bibliophiles.

A “mixin” like BookCollector or Author can be mixed into more than one class. Trying to compose functionality using “inheritance” doesn’t always work cleanly.
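To make that concrete, here is a small sketch in which the same BookCollector is mixed into Person and into a hypothetical Bookstore class that has nothing else in common with Person. The Bookstore class, the names, and the book titles are assumptions for illustration.

```javascript
const BookCollector = {
  addToCollection (name) {
    this.collection().push(name);
    return this;
  },
  collection () {
    return this._collected_books || (this._collected_books = []);
  }
};

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

// Hypothetical class that is not a kind of Person.
class Bookstore {
  constructor (name) {
    this.name = name;
  }
}

Object.assign(Person.prototype, BookCollector);
Object.assign(Bookstore.prototype, BookCollector);

const bibliophile = new Person("Jane", "Doe").addToCollection("Book A");
const shop = new Bookstore("Antiquarian Books").addToCollection("Book B");

console.log(bibliophile.collection()); // [ 'Book A' ]
console.log(shop.collection());        // [ 'Book B' ]
console.log(shop instanceof Person);   // false: shared behaviour, no shared ancestry
```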

Another benefit is not obvious from a toy example, but in production systems classes can grow to be very large. Even if a mixin is not used in more than one class, decomposing a large class into mixins helps fulfil the Single Responsibility Principle. Each mixin can handle exactly one responsibility. That makes things easier to understand, and much easier to test.

why this matters

There are other ways to decompose responsibilities for classes (such as delegation and composition), but the point here is that if we wish to use mixins, it is very simple and easy to do, because JavaScript does not have a large and complicated OOP mechanism that imposes a rigid model on programs.

In Ruby, for example, mixins are easy because a special feature, modules, was baked into Ruby from the start. In other OO languages, mixins are difficult, because the class system does not support them and they are not particularly friendly to metaprogramming.

JavaScript’s choice to build OOP out of simple parts–objects, functions, and properties–makes the development of new ideas possible.

(discuss on hacker news)


notes:

  1. There are subtleties involving the super keyword to consider, but that is not the point of this article. 

https://raganwald.com/2015/06/10/mixins
Classes are Expressions (and why that matters)

Prerequisite: This post presumes that readers are familiar with JavaScript’s objects, know how a prototype defines behaviour for an object, know what a constructor function is, and how a constructor’s .prototype property is related to the objects it constructs. Passing familiarity with ECMAScript 2015 syntax like let and gathering parameters will be extremely helpful.

We have always been able to create a JavaScript class like this:

function Person (first, last) {
  this.rename(first, last);
}

Person.prototype.fullName = function fullName () {
  return this.firstName + " " + this.lastName;
};


Person.prototype.rename = function rename (first, last) {
  this.firstName = first;
  this.lastName = last;
  return this;
}

Person is a constructor function, and it’s also a class, in the JavaScript sense of the word “class.” As we’ve written it here, it’s a function declaration. But let’s rewrite it as a function expression. We’ll use let just to get into the ECMAScript 2015 swing of things (many people would use const, but that doesn’t matter here):

let Person = function (first, last) {
  this.rename(first, last);
}

Person.prototype.fullName = function fullName () {
  return this.firstName + " " + this.lastName;
};


Person.prototype.rename = function rename (first, last) {
  this.firstName = first;
  this.lastName = last;
  return this;
}

classes with class

ECMAScript 2015 provides the class keyword and “compact method notation” as syntactic sugar for writing a function and assigning methods to its prototype (there is a little more involved, but that isn’t relevant here). So we can now write our Person class like this:

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

Just like a function declaration, we can also write a class expression:

let Person = class {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

This is interesting, because it shows that creating a class in JavaScript (whether we write constructor functions or use the class keyword) is evaluating an expression. In this case, our class is created anonymously, we just happen to bind it to Person.1 We can create classes, assign them to variables, pass them to functions, or return them from functions, just like any other value in JavaScript.
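As a quick illustration of classes being values, the sketch below stores classes in an array, passes one to a function, and gets a new class back. The withGreeting helper is hypothetical, invented only for this example.

```javascript
let Person = class {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this.firstName + " " + this.lastName;
  }
  rename (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

// Hypothetical helper: takes a class as an argument and returns a new class.
const withGreeting = (clazz) =>
  class extends clazz {
    greet () {
      return "Hello, " + this.fullName();
    }
  };

// Classes can sit in data structures like any other value.
const classes = [Person, withGreeting(Person)];
const [plain, friendly] = classes.map(Clazz => new Clazz("Jane", "Doe"));

console.log(typeof plain.greet);         // 'undefined'
console.log(friendly.greet());           // 'Hello, Jane Doe'
console.log(friendly instanceof Person); // true
```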

That’s a very powerful thing. Not all OOP languages do things that way, some have classes, but they aren’t values. Some have classes with names, but the names live in a special space that is separate from the variables we bind. But having classes be “just a function” and having prototypes be “just an object” means they are “just values.” And that lets us do anything with a class or a prototype we could do with any other value.

Like what? I’m glad you asked. First, let’s review the ECMAScript 2015 Symbol:

symbols

In its simplest form, Symbol is a function that returns a unique entity. No two symbols are alike, ever:

Symbol() !== Symbol()
  //=> true

Symbols have string representations, although they may appear cryptic:2

Symbol().toString()
  //=> Symbol(undefined)_u.mwf0blvw5
Symbol().toString()
  //=> Symbol(undefined)_s.niklxrko8m
Symbol().toString()
  //=> Symbol(undefined)_s.mbsi4nduh

You can add your own text to help make it intelligible:

Symbol("Allongé").toString()
  //=> Symbol(Allongé)_s.52x692eab
Symbol("Allongé").toString()
  //=> Symbol(Allongé)_s.q6hq5lx01p
Symbol("Allongé").toString()
  //=> Symbol(Allongé)_s.jii7eyiyza

There are some ways that JavaScript makes symbols especially handy. Using symbols as property names, for example.

a problem with encapsulation

One of the huge problems with OOP in JavaScript is that it is very easy for code to become highly coupled. By default, all methods and properties are “public”: any piece of code can read and write any property. In our Person, it looks very much to the eye like firstName and lastName are intended to be private, while other objects interact with a person using the .rename and .fullName methods.

The usual argument against other code reading or writing .firstName and .lastName directly is that it makes it difficult to modify the Person class. Imagine that we wish to accommodate an optional middle name:

class Person {
  constructor (first, last, middle) {
    this.rename(first, last, middle);
  }
  fullName () {
    return this.middleName
           ? (this.firstName + " " + this.middleName + " " + this.lastName)
           : (this.firstName + " " + this.lastName);
  }
  rename (first, last, middle) {
    this.firstName = first;
    this.lastName = last;
    this.middleName = middle;
    return this;
  }
};

How awkward, but so far nothing breaks, not even the code that directly accesses .firstName and .lastName. Now we refactor:

class Person {
  constructor (...names) {
    this.rename(...names);
  }
  fullName () {
    return this.names.join(" ");
  }
  rename (...names) {
    this.names = names;
    return this;
  }
};

Presto, we just broke everything that depends directly upon .firstName and .lastName.
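Here is what that breakage looks like in practice. The two client functions are hypothetical: greeting uses only the interface, while formalName reaches into the implementation.

```javascript
class Person {
  constructor (...names) {
    this.rename(...names);
  }
  fullName () {
    return this.names.join(" ");
  }
  rename (...names) {
    this.names = names;
    return this;
  }
}

// Hypothetical client that depends only on the interface.
const greeting = (person) => "Hello, " + person.fullName();

// Hypothetical client that depends on the old implementation.
const formalName = (person) => person.lastName + ", " + person.firstName;

const someone = new Person("Jane", "Doe");

console.log(greeting(someone));   // 'Hello, Jane Doe'
console.log(formalName(someone)); // 'undefined, undefined'
```

Nothing throws; the second client silently produces garbage, which is arguably worse than an error.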

The problem here is that all code has dependencies. The code using Person depends upon its behaviour. The trouble is, code that manipulates .firstName and .lastName depends upon both the “interface” and the “implementation” of Person, which makes it difficult to change Person in the future. And it’s not just Person that can’t change. We won’t write them out here, but every piece of code that uses Person depends upon every other piece using it correctly. If one writes the wrong thing in .firstName or .lastName, all the other pieces of code using Person could break.

This may seem very theoretical. But as applications and teams grow, and deadlines loom, and SEV-1 incidents occur, the best of intentions get watered down, and over time, the code gradually becomes fragile. This has been known since the 1960s, and gave rise to Modular Programming, where a hard separation was made between the implementation inside a module, and the interface it exposed to the rest of the code. “OOP” embraced this with the premise that every object encapsulates its own implementation.3

But our Person class does not encapsulate its implementation. Let’s use symbols to do so:

using symbols to encapsulate private properties

In ECMAScript 2015, a symbol can be a property name. So if we arrange things such that a class’s methods have a symbol in scope, but no other code has that symbol in scope, we can create relatively private properties.4

As you probably know, writing foo.bar is synonymous with foo['bar']. Same thing semantically. So let’s begin by rewriting Person to use strings for property keys:

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this['firstName'] + " " + this['lastName'];
  }
  rename (first, last) {
    this['firstName'] = first;
    this['lastName'] = last;
    return this;
  }
};

So far, the behaviour is exactly the same: any code that wants to can access a person’s .firstName or .lastName. Next, we’ll extract some variables:

let firstNameProperty = 'firstName',
    lastNameProperty  = 'lastName';

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this[firstNameProperty] + " " + this[lastNameProperty];
  }
  rename (first, last) {
    this[firstNameProperty] = first;
    this[lastNameProperty] = last;
    return this;
  }
};

Same thing, but we aren’t done yet. Let’s use symbols instead of strings:

let firstNameProperty = Symbol('firstName'),
    lastNameProperty  = Symbol('lastName');

class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this[firstNameProperty] + " " + this[lastNameProperty];
  }
  rename (first, last) {
    this[firstNameProperty] = first;
    this[lastNameProperty] = last;
    return this;
  }
};

This is different. Instances of Person won’t have properties like .lastName; they will have properties like ['Symbol(lastName)_v.cn3u8ad08']. Furthermore, JavaScript leaves symbol-keyed properties out of for...in loops, Object.keys, and JSON.stringify, so they won’t show up when other code enumerates an instance.
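To make that concrete, here is a small sketch of which everyday operations skip a symbol-keyed property. The nameKey symbol and the person object are made up for illustration; they are not part of the Person class above.

```javascript
// Hypothetical names, for illustration only.
const nameKey = Symbol('name');
const person = { age: 42 };
person[nameKey] = 'Reginald';

// for...in visits string keys only.
const visited = [];
for (const key in person) visited.push(key);

const stringKeys = Object.keys(person);     // string keys only
const serialized = JSON.stringify(person);  // symbol-keyed properties are omitted
const byString   = person['name'];          // the string 'name' is a different key
const bySymbol   = person[nameKey];         // works, but only with the symbol in hand
```

Code that holds the symbol reads the property normally; code that only knows the string 'name' finds nothing.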

So it will be difficult for other code to directly manipulate the properties we use for a person’s first and last name. But that being said, we’re “exposing” the firstNameProperty and lastNameProperty variables to the world. We’ve encapsulated instances of Person, but not Person itself.

encapsulating our class implementation

Recall that we said a class is a value that can be assigned to a variable or returned from a function. Functions are excellent mechanisms for encapsulating code. Let’s start by changing our class declaration into a class expression. We’ll make this one a named class expression to help with debugging and what-not:

let firstNameProperty = Symbol('firstName'),
    lastNameProperty  = Symbol('lastName');

let Person = class Person {
  constructor (first, last) {
    this.rename(first, last);
  }
  fullName () {
    return this[firstNameProperty] + " " + this[lastNameProperty];
  }
  rename (first, last) {
    this[firstNameProperty] = first;
    this[lastNameProperty] = last;
    return this;
  }
};

Now we wrap the class in an IIFE5:

let firstNameProperty = Symbol('firstName'),
    lastNameProperty  = Symbol('lastName');

let Person = (() => {
  return class Person {
    constructor (first, last) {
      this.rename(first, last);
    }
    fullName () {
      return this[firstNameProperty] + " " + this[lastNameProperty];
    }
    rename (first, last) {
      this[firstNameProperty] = first;
      this[lastNameProperty] = last;
      return this;
    }
  };
})();

And finally, we move the property name variables inside the IIFE:

let Person = (() => {
  let firstNameProperty = Symbol('firstName'),
      lastNameProperty  = Symbol('lastName');

  return class Person {
    constructor (first, last) {
      this.rename(first, last);
    }
    fullName () {
      return this[firstNameProperty] + " " + this[lastNameProperty];
    }
    rename (first, last) {
      this[firstNameProperty] = first;
      this[lastNameProperty] = last;
      return this;
    }
  };
})();

Now this is different. Code outside the IIFE cannot see the property names. We construct a class and return it from the IIFE. We then assign it to the Person variable. Its mechanism has been completely encapsulated.
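As a rough check of what this buys us, the sketch below repeats the encapsulated class so that it runs on its own, then pokes at an instance from the outside. The instance name ada is an assumption for illustration. The last line shows one of the awkward workarounds: reflection can still list an object's symbols.

```javascript
const Person = (() => {
  const firstNameProperty = Symbol('firstName'),
        lastNameProperty  = Symbol('lastName');

  return class Person {
    constructor (first, last) {
      this.rename(first, last);
    }
    fullName () {
      return this[firstNameProperty] + " " + this[lastNameProperty];
    }
    rename (first, last) {
      this[firstNameProperty] = first;
      this[lastNameProperty] = last;
      return this;
    }
  };
})();

const ada = new Person('Ada', 'Lovelace');

const viaInterface = ada.fullName();  // the public interface works as before
const viaString    = ada.firstName;   // no string-keyed property exists
ada.firstName = 'Grace';              // creates an unrelated property
const stillAda     = ada.fullName();  // the private state is untouched

// The awkward way around: reflection can still find the symbols.
const leaked = Object.getOwnPropertySymbols(ada).map(symbol => ada[symbol]);
```

Writing to ada.firstName no longer breaks anything, and reaching the real state takes a call that stands out in a code review.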

commentary

Other languages have features like private instance variables, of course. But what makes JavaScript different from languages like Java or C++ is that JavaScript’s flexibility gave us the tools to construct our own way to encapsulate properties inside an instance, and to encapsulate the construction of a class inside an IIFE.

This pattern of creating a class that has private variables emerged from combining a few things: The fact that instance variables are properties, the fact that we can use symbols as hard-to-guess property keys that ordinary enumeration skips, and the fact that class can be used as an expression.

There’s no need to have special keywords or magic namespaces. That keeps the “surface area” of the language small, and provides a surprising amount of flexibility. If we want to, we can build mixins, traits, eigenclasses, and all sorts of other constructs that have to be baked into other languages.


notes:

  1. JavaScript engines will “infer” that the otherwise anonymous function expression should be named Person because it is immediately assigned to a variable of that name. There are ways to create a truly anonymous constructor function or “class” and bind it to a name, but that isn’t relevant here. 

  2. The exact representation depends upon the implementation. 

  3. Some teams take a coöperative approach to separating interfaces from implementation. They might, for example, name the properties _firstName and _lastName, with the understanding that anything prefixed with an underscore was off-limits. Usually, this works for a while, but the moment someone breaks the “rule,” the team needs to crack down on the “violation” immediately. If there is one exception in the code, another one will spring up via copy-and-paste, and then another, and then it entirely breaks down. So some teams look around for linting tools to identify people breaking the rule as early as possible. And if we’re to accept that tooling is a good idea after the code is written, why not write the code in such a way that it discourages violations in the first place? 

  4. There are still ways to get around this form of privacy, but they are sufficiently awkward that they will discourage excessive coupling and stand out like a sore thumb at code reviews. 

  5. “Immediately Invoked Function Expressions” 

https://raganwald.com/2015/06/04/classes-are-expressions
De Stijl: How necessary are var, let, and const?

Disclaimer: JavaScript the language has some complicated edge cases, and as such, the following essay has some hand-wavey bits and some bits that are usually correct but wrong for certain edge cases. If it helps any, pretend that nearly every statement has a footnote reading, “for most cases in practice, however ______.”


ECMAScript-2015 gives us three different variable declaration statements: var, let, and const. Language features are interesting, but they aren’t free: Every feature we use in a program increases its surface area, and the additional complexity of the tool should be justified by the simplification it brings to the program.

We already had var. What value do let and const confer? And is that value enough to justify their use?

One way to answer that question is to perform a thought experiment:

Take a function using one of these features, and convert it to an equivalent function that doesn’t use the feature. We can compare the two versions and see how much code and accidental complexity is added to replace the feature with code that replicates the feature’s semantics.

Gerrit Rietveld's Roodblauwe stoel

is var necessary?

Let’s try this with var. Is var really necessary in function scope? Can we write JavaScript without it? And let’s make it interesting: Can we get rid of var without using let?

Variables declared with var have exactly the same scope as function arguments. So, one strategy for removing var from functions is to replace declared variables with function arguments.

So:

function callFirst (fn, larg) {
  return function () {
    var args = Array.prototype.slice.call(arguments, 0);
    
    return fn.apply(this, [larg].concat(args))
  }
}

Would become:

function callFirstWithoutVar (fn, larg) {
  return function (args) {
    args = Array.prototype.slice.call(arguments, 0);
    
    return fn.apply(this, [larg].concat(args))
  }
}

We can manually hoist any var that doesn’t appear at the top of the function, so:

function repeat (num, fn) {
  var i;
  
  for (i = 1; i <= num; ++i)
    var value = fn(i);
  
  return value;
}

Would become:

function repeat (num, fn, i, value) {
  i = value = undefined;
  
  for (i = 1; i <= num; ++i)
    value = fn(i);
  
  return value;
}

There are a few flaws with this approach, most significantly that the code we write is misleading to human readers: It clutters the function’s signature with its local variables.1 Fortunately, there’s a fix: We can wrap function bodies in IIFEs2 and give the IIFEs the extra parameters. Like this:

function repeat (num, fn) {
  return ((i, value) => {
    for (i = 1; i <= num; ++i)
      value = fn(i);
  
    return value;
  })();
}

Now our function has its original signature, and we have the expected behaviour. The flaw with this approach, of course, is that our function is more complicated both in code and behaviour: There’s this confusing return ((i, value) => { and })(); stuff going on, and even though we all love the techniques espoused in JavaScript Allongé, this appears a bit gratuitous.

And at runtime, we are creating an extra closure with every invocation. This has performance implications, memory implications, and it certainly isn’t doing our stack traces any favours.
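The point about signatures can be checked directly, because a function's length property reports how many parameters it declares. In this sketch the two variants are renamed repeatWithParams and repeatWithIIFE (names assumed for illustration) so that they can sit side by side:

```javascript
// Local variables disguised as parameters: the signature grows.
function repeatWithParams (num, fn, i, value) {
  i = value = undefined;

  for (i = 1; i <= num; ++i)
    value = fn(i);

  return value;
}

// Local variables moved into an IIFE: the signature is unchanged.
function repeatWithIIFE (num, fn) {
  return ((i, value) => {
    for (i = 1; i <= num; ++i)
      value = fn(i);

    return value;
  })();
}

const square = (n) => n * n;

// Same behaviour...
const results = [repeatWithParams(3, square), repeatWithIIFE(3, square)];
// ...but a different declared arity.
const arities = [repeatWithParams.length, repeatWithIIFE.length];
```

Both calls produce the same value, while the arities differ, which is the sort of thing that can trip up meta-programming code that inspects a function's length.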

But we get the general idea: If we were willing to live with this code, we could get rid of a lot or even all uses of var from our programs. Now, what about let?

Detail of the Rietveld-Schröderhuis

is let necessary?

What if we wanted to remove let and just program with var? Or perhaps remove let altogether? Can it be done?

let has a more complicated behaviour, but if we are careful, we can translate let declarations into IIFEs that use var. And of course, if we want to remove let altogether, if we can translate let into var, and we can remove var altogether, we can remove let altogether as well.

The simplest case is when a let is at the top-level of a function. In that case, we can replace it with a var.3 And from there, if we are removing both let and var, we can excise it completely.

So:

function arraySum (array) {
  let done,
      sum = 0,
      i = 0;
  
  while ((done = i == array.length, !done)) {
    sum += array[i++];
  }
  return sum
}

Would become:

function arraySum (array) {
  var done,
      sum = 0,
      i = 0;
  
  while ((done = i == array.length, !done)) {
    sum += array[i++];
  }
  return sum
}

And then:

function arraySum (array) {
  return ((done, sum, i) => {
    sum = i = 0;
    
    while ((done = i == array.length, !done)) {
      sum += array[i++];
    }
    return sum
  })();
}

That works.4

Now what about let inside a block? This is, after all, its claim to fame. The least complicated case is when the body of the block does not contain a return. In that case, we use the same IIFE technique, but don’t return anything. So this variation:

function arraySum (array) {
  let done,
      sum = 0,
      i = 0;
  
  while ((done = i == array.length, !done)) {
    let value = array[i++]
    sum += value;
  }
  return sum
}

Would become:

function arraySum (array) {
  var done,
      sum = 0,
      i = 0;
    
  while ((done = i == array.length, !done)) {
    (() => {
      var value = array[i++];
      sum += value;
    })();
  }
  return sum
}
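Why bother with the IIFE instead of simply writing var inside the block? A minimal sketch, with function names made up for illustration, that reports whether value is visible after the loop has finished:

```javascript
// A bare var inside the block is hoisted to the whole function.
function withBareVar (array) {
  var sum = 0,
      i = 0;

  while (i < array.length) {
    var value = array[i++];
    sum += value;
  }
  return { sum, leaked: typeof value };
}

// Wrapping the block in an IIFE keeps value local to each trip through the loop.
function withIIFE (array) {
  var sum = 0,
      i = 0;

  while (i < array.length) {
    (() => {
      var value = array[i++];
      sum += value;
    })();
  }
  return { sum, leaked: typeof value };
}

const bare   = withBareVar([1, 2, 3]);
const scoped = withIIFE([1, 2, 3]);
```

Both compute the same sum, but only the bare var version leaves value behind in the enclosing function's scope.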

By the way, the performance is worse than rubbish, because we’re creating and discarding our IIFE on every trip through the loop. In cases like this, we can avoid a lot of that by cleverly “hoisting” the IIFE out of the loop:

function arraySum (array) {
  var done,
      sum = 0,
      i = 0,
      __closure = () => {
        var value = array[i++];
        sum += value;
      };
    
  while ((done = i == array.length, !done)) __closure();
  return sum
}

Rietveld's Hanging Lamp

loops and blocks

let has special rules for loops. So if we simplify our arraySum with a for...in loop, we’ll need an IIFE around the for loop to prevent any let within the loop from leaking into the surrounding scope, and one inside the for loop to preserve its value within the block. Let’s write a completely contrived function:

function sumFrom (original, i) {
  let sum = 0,
      array = original.slice(i);
  
  for (let i in array) {
    sum += array[i];
  }
  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

This can be rewritten as:

function sumFrom (original, i) {
  var sum = 0,
      array = original.slice(i),
      __closure = (i) => sum += array[i];
  (() => {
    var i;
    
    for (i in array) __closure(i);
  })();

  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}
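To see why the loop needs its own scope, compare a naive translation with the IIFE translation. This is a sketch; the names sumFromNaive and sumFromScoped are assumptions for illustration.

```javascript
// Naive translation: the var i in the loop head is the same variable as the
// parameter i, so the loop overwrites it.
function sumFromNaive (original, i) {
  var sum = 0,
      array = original.slice(i);

  for (var i in array) {
    sum += array[i];
  }
  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

// Translation with the loop variable confined to an IIFE.
function sumFromScoped (original, i) {
  var sum = 0,
      array = original.slice(i),
      __closure = (i) => sum += array[i];
  (() => {
    var i;

    for (i in array) __closure(i);
  })();

  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

const naive  = sumFromNaive([1, 2, 3, 4, 5], 1);
const scoped = sumFromScoped([1, 2, 3, 4, 5], 1);
```

The sums agree, but the naive version reports the last loop key in place of the starting index it was given.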

Some blocks contain a return, and that returns from the nearest enclosing function. But if we replace the block with an IIFE, the return will only return from the IIFE. When the IIFE surrounds the entire body of the function, we can just return whatever the IIFE returns, as we do above. But when the IIFE represents a block within the body of the function, we should return the IIFE’s value only if the block actually returned something.

So something like this:

function maybe (fn) {
  return function (...args) {
    for (let arg of args) {
      if (arg == null) return null;
    }
    return fn.apply(this, args)
  }
}

Becomes this:

function maybe (fn) {
  return function (...args) {
    var __iife_returns,
        __closure = (arg) => {
      if (arg == null) return null;
    };
    
    __iife_returns = (() => {
      var arg, __closure_returns;
      
      for (arg of args) {
        __closure_returns = __closure(arg);
        
        if (__closure_returns !== undefined) return __closure_returns;
      }
    })();
    if (__iife_returns !== undefined) return __iife_returns;
    
    return fn.apply(this, args)
  }
}

We’ll leave it as “an exercise for the reader” to sort out how to handle a return that doesn’t return anything:

function maybe (fn) {
  return function (...args) {
    for (let arg of args) {
      if (arg == null) return;
    }
    return fn.apply(this, args)
  }
}

Or a return when we don’t know what we are returning:

function maybe (fn) {
  return function (...args) {
    for (let arg of args) {
      if (arg == null) return arg;
    }
    return fn.apply(this, args)
  }
}
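One possible approach to both exercises, offered as a sketch and not as the only answer: stop using undefined to mean “the block did not return,” and use a unique marker for that instead. The NO_RETURN symbol is a name invented for this illustration.

```javascript
// Hypothetical marker meaning "the block ran to completion without returning".
var NO_RETURN = Symbol('no return');

function maybe (fn) {
  return function (...args) {
    var __iife_returns,
        __closure = (arg) => {
          if (arg == null) return arg;  // could be null or undefined
          return NO_RETURN;
        };

    __iife_returns = (() => {
      var arg, __closure_returns;

      for (arg of args) {
        __closure_returns = __closure(arg);

        if (__closure_returns !== NO_RETURN) return __closure_returns;
      }
      return NO_RETURN;
    })();
    if (__iife_returns !== NO_RETURN) return __iife_returns;

    return fn.apply(this, args)
  }
}

var add = maybe((a, b) => a + b);
var outcomes = [add(1, 2), add(1, null), add(undefined, 2)];
```

Because the marker is a fresh symbol, no value the block could legitimately return will be mistaken for it, at the price of yet more machinery in the translated code.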

Gerrit Rietveld Academie

what have we learnt from removing var and let?

The first thing we’ve learnt is that for most purposes, var and let aren’t strictly necessary in JavaScript. Roughly speaking, scoping constructs with lexical scope can be mechanically transformed into functional arguments.

This is not news, it’s how let was originally written in the Scheme flavour of Lisp, and it’s how do works in CoffeeScript to provide let-like behaviour.
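In JavaScript terms, the correspondence looks roughly like this. It is a sketch with made-up function names: the bindings of a let become the parameters of a function, and the initial values become the arguments it is immediately called with.

```javascript
// With declarations:
function hypotenuseWithLet (a, b) {
  let a2 = a * a,
      b2 = b * b;

  return Math.sqrt(a2 + b2);
}

// With the same bindings expressed as arguments to an immediately invoked function:
function hypotenuseWithIIFE (a, b) {
  return ((a2, b2) => Math.sqrt(a2 + b2))(a * a, b * b);
}

const pair = [hypotenuseWithLet(3, 4), hypotenuseWithIIFE(3, 4)];
```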

So one argument is, we could strip these out of the language to provide a more minimal set of features. Or we could just use var, and translate all lets to var.

However, looking at the code we would have to write if we didn’t have var, or if we had to write let without var, it’s clear that while a language without let would be smaller, the programs we write in it would be larger.

This is a case where taking something away does not create elegance. If we take let away and only use var, we have to add IIFEs to get block scope. If we take var away too, we get even more IIFEs. Removing let makes our programs less elegant.

The Rietveld Schröderhuis

wait, what about const?

As you know, const behaves exactly like let, except that its binding cannot be reassigned. Any line of code that attempts to assign to a const variable is an error. Tools that analyze a program when it is first parsed, such as transpilers and linters, report that error before the program is executed. An engine that follows the ECMAScript 2015 specification throws a TypeError when the offending assignment runs.

Presuming that it compiles correctly and you haven’t attempted to rebind a const name, const is exactly the same as let at runtime. Therefore, removing const from a working program is as simple as replacing it with let. So the following:

function sumFrom (original, i) {
  let sum = 0;
  const array = original.slice(i);
  
  for (let i in array) {
    sum += array[i];
  }
  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

Can be translated to:

function sumFrom (original, i) {
  let sum = 0,
      array = original.slice(i);
  
  for (let i in array) {
    sum += array[i];
  }
  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

one of these things is not like the others

As we can see, const is not like var or let. Removing var by changing it into parameters involves the creation of additional IIFEs, cluttering the code and changing the runtime behaviour. Removing let adds much more complexity again. But removing const by changing it into let is benign. It doesn’t add any complexity to the code or the runtime behaviour.

This is not surprising: const isn’t a scoping construct, it’s a typing construct. It exists to make assertions about the form of the program, not about its runtime behaviour. That’s why languages like C++ implement const as an annotation on top of an existing declaration. If JavaScript followed the same philosophy, const would be an annotation on top of an existing declaration. It might look something like this:

@const function sumFrom (original, i) {
  let sum = 0;
  let @const array = original.slice(i);
  
  for (let i in array) {
    sum += array[i];
  }
  return `The sum of the numbers ${original.join(', ')} from ${i} is ${sum}`
}

The secret to understanding const is to understand that it’s a shorthand for let with an annotation, as hypothetically shown above. But it’s really just a let.

what is the value proposition of const?

The value proposition of const is that we have an annotation that is enforced by static analysis. It’s like a comment that can never mislead the reader, because the compiler forces you to either not rebind a const or to switch from const to let.

How valuable is this comment to the reader of the code?

There’s some argument that restricting variables to being constant “makes a function easier to reason about.” Of course that’s true in the literal English sense, but if you don’t rebind references, a function is just as easy to reason about if you use const as if you use let. It’s just that with let, you have to read the whole function to see which variables are rebound and which aren’t.5

The value of const is that you don’t have to examine everywhere the variable is used to know that the variable is not rebound. This point cannot be repeated enough, but I’ll settle for repeating it just once: The value of const is that you don’t have to examine everywhere the variable is used to know that the variable is not rebound.

How valuable is that, exactly?

Variables in JavaScript have a fixed scope: You can see every single rebinding of a variable within the lexical scope of the function. And there are only two ways to rebind a variable: with a simple assignment, or with a destructuring assignment.

There are no other ways to rebind it. JavaScript does not have indirect variable access like SNOBOL. It does not have pointers to variables like C. It does not have call-by-reference like C++. It does not treat the environment as a mutable dictionary.6

So with a variable, we always know exactly what we have to review. Reasoning about variable rebinding is easy.

Steltman chair

const vs. immutability

Consider a related, but mostly orthogonal idea, immutability of data. With immutable data, you have a data structure, like an array, and you never change it. Nothing is added or removed. No elements are changed.

The value of an immutable data structure is that you don’t have to examine everywhere the data structure is accessed to know that the data structure is not mutated. This point also cannot be repeated enough, and again I’ll settle for repeating it just once: The value of an immutable data structure is that you don’t have to examine everywhere the data structure is accessed to know that the data structure is not mutated.

Guaranteeing that an array is immutable means examining everywhere the array is accessed and verifying that none of those accesses mutate the array, much as guaranteeing that a variable is const means examining everywhere the variable is used and verifying that none of those uses change its binding.

These two things sound the same, but they are not. As we saw above, variables have a fixed scope, we always know exactly what we have to review, and thus reasoning about variables is easy.

Data, on the other hand, is not narrowly scoped. Objects are passed by reference to functions. Objects are returned by reference from functions. Object properties can be dynamically accessed with []. For this reason, any code within a program could modify data. To truly understand whether an object is mutated, you need to examine the whole program—including libraries and standard classes—and even then there are lots of common cases for which you can make no guarantee.

So with data, we do not always know what we have to review. Reasoning about data is hard.

And that’s exactly why having guarantees about immutability is so valuable in the languages that provide them. But reasoning about variable rebinding is quite a bit easier. And thus, providing a guarantee about variable rebinding may sound like guarantees about data immutability, but it is considerably less valuable.
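A short sketch of the difference, using made-up names: a const binding says nothing about the data it refers to, while Object.freeze offers a shallow guarantee about the data itself.

```javascript
const primes = [2, 3, 5];

// A hypothetical helper somewhere else in the program. It never rebinds
// anything, yet it changes the array that primes refers to.
function appendNext (list, next) {
  list.push(next);
  return list;
}

appendNext(primes, 7);
const lengthAfterHelper = primes.length;  // const did not prevent the mutation

// Object.freeze is about the data, not the binding.
const frozen = Object.freeze([2, 3, 5]);
let rejected = false;

try {
  appendNext(frozen, 7);
} catch (error) {
  rejected = error instanceof TypeError;
}
```

To know that primes is never mutated, we would have to read appendNext and everything else the array is passed to. To know that it is never rebound, we only have to see the word const.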

so… should we use var, let, and const?

One can see immediately that var and let may be theoretically unnecessary, but in practice make the functions we write simpler, and therefore easier to read and write.

Whereas, const does not make functions simpler than let, but does provide a kind of annotation that saves us some effort when examining a function. It is not nearly as useful as immutable data, because the problem it solves is easy, not hard.




  1. It also changes the arity of our functions. That can matter for certain meta-programming implementations. 

  2. “Immediately Invoked Function Expressions” 

  3. There are some edge cases with respect to the behaviour of let and variables used before they are declared, but the basic principle here is straightforward. 

  4. From now on, we’ll just translate let into var and leave removing let altogether as an exercise for the reader. 

  5. You’ll often hear functional programmers talk about immutability making programs easier to reason about. They don’t mean easier in the sense of, “Immutability saves me some effort.” They mean, “It would be impossible to reason about this data without immutability.” They’re using the same words, but in FP, the words “easier to reason about” have a specific technical meaning that does not apply to const. We’ll read more about this below. 

  6. Then again, there’s always eval.

https://raganwald.com/2015/05/30/de-stijl
OOP, JavaScript, and so-called Classes

Programming with objects and classes began in Norway in the late 1960s with the Simula programming language. Its creators, Ole-Johan Dahl and Kristen Nygaard, did not use those words to describe what would eventually become the dominant paradigm in computing.

A decade later, Dr. Alan Kay coined the phrase “Object-Oriented Programming” along with co-creating the Smalltalk programming language. He has famously said that to him, “OOP” was objects communicating with each other using messages, and that other languages copied the things that didn’t matter from Smalltalk, and ignored the things he thought did matter.

Since that time, languages have either bolted object-ish ideas on top of their existing paradigms (like Object Pascal and OCaml), baked them in alongside other paradigms (like JavaScript), or embraced objects wholeheartedly.

That being said, there really is no one definition of “object-oriented.” For one thing, there is no one definition of “object.”

objects

Some languages, like Smalltalk and Ruby, treat an object as a fully encapsulated entity. There is no access to an object’s private state, all you can do is invoke one of its methods. Other languages, like Java, permit objects to access each other’s state.

Some languages (again, like Java) have very rigid objects and classes, it is impossible or awkward to add new methods or properties to objects at run time. Some are flexible about adding methods and properties at run time. And yet other languages treat objects as dictionaries, where properties and even methods can be added, modified, or removed with abandon.

So we can see that the concept of “object” is flexible across languages.

classes

The concept of “class” is also flexible across languages. Object-oriented languages do not uniformly agree on whether classes are necessary, much less how they work. For example, The Common Lisp Object System defines behaviour with classes, and it also defines behaviour with generic functions. The Self and NewtonScript languages have prototypes instead of classes.

So some “OO” languages have objects, but not classes.

C++ has classes, but they are not “first-class entities.” You can’t assign a class to a variable or pass it to a function. You can, however, manipulate the constructors for classes, the functions that make new objects. But you can’t manipulate those constructors to change the behaviour of objects that have already been constructed; instance behaviour is early-bound by default.

Ruby has classes, and they’re first-class entities. You can ask an object for its class, you can put a class in a variable, pass it to a method, or return it from a method, just like every other entity in the language. Classes in Ruby and Smalltalk even have their own class, they are instances of Class!1 Instance behaviour is late-bound and open for extension.2

constructors

Some languages allow programs to construct objects independently, others (notably those that are heavily class-centric) require that objects always be constructed by their classes. Some languages allow any function or method to be used as a constructor, others require a special syntax or declaration for constructors.

prototypes are not classes

Prototypical languages like Self and NewtonScript eschew classes altogether, using prototypes to define common behaviour for a set of objects. The difference between a prototype and a class is similar to the difference between a model home and a blueprint for a home.

You can say to a builder, “make me a home just like that model home,” and the builder makes you a home that has a lot in common with the model home. You then decorate your home with additional personalization. But the model home is, itself, a home. Although you may choose to keep it empty, you could in principle move a family into it. This is different than asking a builder to make you a home based on a blueprint. The blueprint may specify the features of the home, but it isn’t a home. It could never be used as a home.

Prototypes are like model homes, and classes are like blueprints. Classes are not like the objects they describe.3

“object-oriented programming” can mean almost anything

From this whirlwind tour of “object-oriented programming,” we can see that the ideas behind “object-oriented programming” have some common roots in the history of programming languages, but each language implements its own particular flavour in its own particular way.

Thus, when we talk about “objects” and “prototypes” and “classes” in JavaScript, we’re talking about objects, prototypes, and classes as implemented in JavaScript. And we must keep in mind that other languages can have a radically different take on these ideas.

the javascript approach

JavaScript has objects, and by default, those objects are dictionaries. By default, objects directly manipulate each other’s state. Methods can be added to, or removed from objects at run time.

JavaScript has optional prototypes. Prototypes are objects in the same sense that model homes are homes.

In JavaScript, object and array literals construct objects that delegate behaviour to the standard library’s object prototype and array prototype, respectively. JavaScript also supports using Object.create to construct objects with or without a prototype, and new to construct objects using a constructor function.
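These ways of constructing objects can be inspected with Object.getPrototypeOf. A minimal sketch, in which Greeter and Point are names invented for illustration:

```javascript
const literal = { name: 'literal' };
const list = [1, 2, 3];

// An ordinary object serving as a prototype.
const Greeter = {
  greet () { return `Hello from ${this.name}`; }
};
const delegating = Object.create(Greeter);
delegating.name = 'delegate';

// An object with no prototype at all.
const bare = Object.create(null);

// An object made with new and a constructor function.
function Point (x, y) {
  this.x = x;
  this.y = y;
}
const origin = new Point(0, 0);

const checks = {
  literal:    Object.getPrototypeOf(literal) === Object.prototype,
  list:       Object.getPrototypeOf(list) === Array.prototype,
  delegating: Object.getPrototypeOf(delegating) === Greeter,
  bare:       Object.getPrototypeOf(bare) === null,
  origin:     Object.getPrototypeOf(origin) === Point.prototype
};
const greeting = delegating.greet();  // behaviour delegated to Greeter
```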

Using prototypes and constructor functions, JavaScript programs can emulate many of the features of classes in other languages. JavaScript also has a class keyword that provides syntactic sugar for writing constructor functions and prototypes in a declarative fashion.

By default, a JavaScript class is a constructor composed with an object as its associated prototype. This can be denoted with the class keyword, by working with a function’s default .prototype property, or by composing functions and objects independently.
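A sketch of two of those denotations side by side, with Circle and Square as made-up examples. Either way, what we end up with is a function paired with a prototype object:

```javascript
// Denoted with the class keyword.
class Circle {
  constructor (radius) { this.radius = radius; }
  area () { return Math.PI * this.radius * this.radius; }
}

// Denoted with a function and its default .prototype property.
function Square (side) { this.side = side; }
Square.prototype.area = function () { return this.side * this.side; };

const facts = {
  classIsFunction:        typeof Circle === 'function',
  methodLivesOnPrototype: Object.getOwnPropertyNames(Circle.prototype).includes('area'),
  instanceDelegates:      Object.getPrototypeOf(new Square(2)) === Square.prototype,
  squareArea:             new Square(2).area()
};
```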

JavaScript classes are constructors, but they are more than C++ constructors, in that manipulation of their prototype extends or modifies the behaviour of the instances they create. JavaScript classes take a minimalist approach to OO in the same sense that JavaScript objects take a minimal approach to OO. For example, behaviour can be mixed into an object, a prototype, or a class using the exact same mechanism, because objects, prototypes, and a constructor’s prototype are all objects.
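As an illustration of “the exact same mechanism,” the sketch below uses Object.assign for all three cases. Named and Robot are names invented for this example, and Object.assign is just one of several ways to mix behaviour in.

```javascript
// A hypothetical bundle of behaviour to mix in.
const Named = {
  describe () { return `I am ${this.name}`; }
};

// 1. Mixed directly into a single object.
const solo = Object.assign({ name: 'solo' }, Named);

// 2. Mixed into a prototype that other objects delegate to.
const proto = Object.assign({}, Named);
const viaProto = Object.assign(Object.create(proto), { name: 'delegate' });

// 3. Mixed into a class, which means into its constructor's prototype.
class Robot {
  constructor (name) { this.name = name; }
}
Object.assign(Robot.prototype, Named);
const viaClass = new Robot('robot');

const descriptions = [solo, viaProto, viaClass].map(each => each.describe());
```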

In sum, JavaScript is not exactly like any other object-oriented programming language, and its classes aren’t like any other language that features classes, but then again, neither is any other object-oriented programming language, and neither are any other classes.



This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition.


  1. If the class of a class is Class, what class is the class of Class? In Ruby, Class.class == Class. In Smalltalk, it is MetaClass, which opens up the possibility for changing the way classes behave in a deep way. 

  2. Abuse of this feature by extending the behaviour of built-in classes is a controversial topic. 

  3. Well, actually, the difference between prototypes and classes is like the difference between model homes and blueprints. But prototypes are not like model homes. In actual fact, the relationship between an object and its prototype is one of delegation. So if a model home had a kitchen, and you asked the builder to make you a home using the model as a prototype, you could customize your own kitchen. But if you didn’t want to have your own custom kitchen, you would just use the model home’s kitchen to do all your own cooking. The relationship between a model home and a house is sometimes described as concatenative inheritance, and JavaScript lets you do that too. 

https://raganwald.com/2015/05/11/javascript-classes
Carnac the Magnificent

Programmers love to discuss interviewing programmers. And hate to discuss it. Interviewing touches the very heart of human social interaction: It’s a process for picking “winners” and “losers,” for determining who’s “in” and who’s “out.”

Today I’d like to discuss an anti-pattern, Carnac the Magnificent.

Carnac the Magnificent was a recurring comedic role played by Johnny Carson on The Tonight Show Starring Johnny Carson. One of Carson’s most well known characters, Carnac was a “mystic from the East” who could psychically “divine” unknown answers to unseen questions. (Wikipedia)

The “Carnac the Magnificent” anti-pattern is setting up a situation where the only way to pass is to guess what the interviewer is looking for. Here’s an example from a blog post that is currently causing tongues to wag:

Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, …, 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100.

According to TFA, this question is a “filter,” designed to separate those who have no hope of being a programmer from those who have the basic qualifications to write software for a living.

the problem with the problem

So what is the problem? Well, the problem is that there are too many ways to solve the problem.

For starters, you can generate all of the possible strings (e.g. 123456789, 12345678-9, 12345678+9, 1234567-89, 1234567-8-9, 1234567-8+9, 1234567+89, 1234567+8-9, 1234567+8+9, …), then use eval to compute the answer, and select those that evaluate to 100.

Or you could do the same thing, but avoid eval and bake in a little of your own computation. Because eval is “bad.”
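
As a rough sketch of what "baking in your own computation" might look like, here is one way to total such a string without eval. The helper name `evaluate` is made up for this illustration, and it assumes the string contains nothing but digits, `+`, and `-`:

```javascript
// Hypothetical helper, not from the original post: totals a string such as
// "1+2+34-5+67-8+9" by splitting it into signed terms instead of calling eval.
// Assumes the input contains only digits, "+" and "-".
const evaluate = (expr) =>
  expr
    .match(/[+-]?\d+/g)                               // ["1", "+2", "+34", "-5", ...]
    .reduce((total, term) => total + Number(term), 0);

console.log(evaluate("1+2+34-5+67-8+9"));
  //=> 100
```

Because every term carries its own sign, the terms can simply be summed, which sidesteps any question of operator precedence or associativity.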

And of course, this brute force executes fewer than 10,000 iterations, and runs faster than you can blink on contemporary hardware. But you’re applying for a job where you’re supposed to know about “scale” and “speed,” so you could optimize things and not do obviously wasted computations. Nothing that starts with 12345 can ever add up to 100, for example. Aren’t programmers supposed to know this?
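
To make the arithmetic concrete, here is a small sketch of both claims. The name `canStillReach` is an assumption invented for this example. It relies on the observation that the digits still to come can move the total, up or down, by at most the number you get by catenating all of them, and it is only a necessary condition, not a full solver:

```javascript
// Size of the brute-force search: three choices in each of the eight gaps.
const candidates = 3 ** 8;
console.log(candidates);
  //=> 6561

// Hypothetical pruning test: once a term `prefix` is fixed, the remaining
// digits can change the total by at most the number formed by catenating
// all of them, so anything further from the target than that is hopeless.
const canStillReach = (target, prefix, remainingDigits) =>
  Math.abs(target - prefix) <= Number(remainingDigits.join(""));

console.log(canStillReach(100, 12345, [6, 7, 8, 9]));
  //=> false, because 12345 - 6789 is still 5556
console.log(canStillReach(100, 1, [2, 3, 4, 5, 6, 7, 8, 9]));
  //=> true
```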

And should you solve this recursively or iteratively? One is fast, the other reveals the underlying mathematical symmetry of the problem.

no hire!

There are a bunch of ways forward (many more than these four considerations, in fact).

And you can easily imagine a sadistic interviewer failing a candidate for getting the correct answer the wrong way. If you use eval, you’re a bozo. And if you write your way around eval, you’re a “theorist” who doesn’t know when to use the right tool for the job. If you don’t optimize, you don’t value scale. And if you do optimize, you’re wasting time that could be better used for another part of the interview.

And if you solve it without recursion, you don’t grasp elegance. And if you do solve it with recursion, sorry, but we use JavaScript here, Lisp jobs are down the hall.

Here’s the most naïve code I can think of:

for (let o1 of ["", "+", "-"]) {
  for (let o2 of ["", "+", "-"]) {
    for (let o3 of ["", "+", "-"]) {
      for (let o4 of ["", "+", "-"]) {
        for (let o5 of ["", "+", "-"]) {
          for (let o6 of ["", "+", "-"]) {
            for (let o7 of ["", "+", "-"]) {
              for (let o8 of ["", "+", "-"]) {
                const expr = `1${o1}2${o2}3${o3}4${o4}5${o5}6${o6}7${o7}8${o8}9`;
                const value = eval(expr);
                if (value === 100) {
                  console.log(expr)
                }
              }
            }
          }
        }
      }
    }
  }
}

(es6fiddle)

Every single thing you can say negatively about this solution represents an unstated requirement.

Likewise, here’s a recursive solution:

function solutions (accumulatedOutput, runningTotal, ...numbers) {
  if (numbers.length === 0) {
    if (runningTotal == 100) console.log(accumulatedOutput);
  }
  else {
    const [first, ...butFirst] = numbers;

    if (accumulatedOutput !== "") {

      // case one, addition
      solutions(`${accumulatedOutput}+${first}`, runningTotal + first, ...butFirst);

      // case two, subtraction
      solutions(`${accumulatedOutput}-${first}`, runningTotal - first, ...butFirst);
  
    }
    else solutions(`${first}`, first, ...butFirst);

    // case three, catenation
    if (butFirst.length > 0) {
      const [second, ...butSecond] = butFirst;
  
      solutions(accumulatedOutput, runningTotal, first * 10 + second, ...butSecond);
    }
  }
}

solutions("", 0, 1, 2, 3, 4, 5, 6, 7, 8, 9);

(es6fiddle)

It’s faster, and more beautiful mathematically, but it’s actually harder to understand how it works than the iterative solution. And it took me a lot longer to write. As did this one, based on Generators and Iterators:

// Utility function from https://leanpub.com/javascriptallongesix

const filterIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      for (let element of iterable) {
        if (!!fn(element)) yield element;
      }
    }
  });
  
// problem-specific functions
  
const catenate = (left, right) =>
  left * Math.pow(10, Math.ceil(Math.log(right) / Math.LN10)) + right;

function * expressions ([first, ...rest]) {
  if (rest.length === 0) {
    yield [first];
  }
  else {
    for (let restBlanks of expressions(rest)) {
      const [firstOfRest, ...restOfRest] = restBlanks;
      
      yield [first, '+', ...restBlanks];
      yield [first, '-', ...restBlanks];
      yield [catenate(first, firstOfRest), ...restOfRest];
    }
  }
}

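// Evaluate left to right: subtraction is not associative, so recursing
// on the rest would compute 1 - (2 + 3) when given 1 - 2 + 3.
function calculate([first, ...rest]) {
  let total = first;

  for (let i = 0; i < rest.length; i += 2) {
    const operation = rest[i],
          operand = rest[i + 1];

    total = operation === '+' ? total + operand : total - operand;
  }
  return total;
}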

const is100 = (expr) =>
  calculate(expr) === 100

const solutions = filterIterableWith(is100,
    expressions([1, 2, 3, 4, 5, 6, 7, 8, 9]));

(es6fiddle)

Again, this highlights a certain way of thinking about the problem, that of viewing it as a pipeline of operations on iterable collections: Recursively generate all the expressions, and filter for those that sum to 100.

Beyond proving that a candidate knows how to write things recursively, or with iterators, or both… Why are either of these better? When are they better? For which interviewers are these better?

We don’t know from the problem as stated.

So maybe what you should do is ask the interviewer about the hidden requirements. Optimize for speed above all else? Write tests or not? Is shorter code better? Should the code be factored neatly and all repetition DRY’d out?

That seems reasonable: divining requirements is part of a developer’s job. And some interviewers will rate you highly for that. But others will consider it wasting time when all they wanted was a working answer, any answer, and conclude that you are obviously tedious and slow and can’t GetShitDone™.

The bottom line is, there is no right thing to do given a problem where the interviewer does not make it very, very clear what they want. The only person who can get this right is Carnac the Magnificent, a mystic from the east who can read minds and the contents of sealed envelopes.

stress

Now let’s be honest: If the interviewer and the interviewee are on the same page, this doesn’t seem bad. But the fact is, the interviewee will be stressed 100% of the time. And that is not good for the interviewee or for the interviewing process.

There is good stress and bad stress, and uncertainty about what the interviewer wants is bad stress. You aren’t testing whether the candidate can solve hard problems, you’re testing whether the candidate can write code just before an all-hands where the CEO will announce layoffs.

the right way forward

If all you want is working code, say so, preferably in writing so that all candidates get the same information:

Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, …, 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100. Use any technique you want, the only thing that matters is getting the correct answer.

Or be up front that you want production-ish code:

Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, …, 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100. Although this is a toy problem, solve it using the kind of code you’d use in a production code base.

Or encourage the candidate to ask questions:

Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, …, 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100. Feel free to ask questions if you need clarification on what is required.

simple, right?

That’s simple, right? So why do people do this? My theory is that they don’t know they are asking a Carnac the Magnificent problem. Interviewers often have a huge blind spot about how it feels to be on the other side of the table.

Perhaps they were hired with this exact same question, they wrote out an answer without asking any questions, they were hired, what’s the problem? Or they asked a few questions, they weren’t failed for wasting time, why do some candidates charge ahead without asking questions?

You need to have experience with interviews and experience working with a variety of programmers to appreciate how a “simple” technical question might actually have hidden pitfalls. Sometimes that’s a pitfall in itself! Imagine a 22 year-old extremely smart programmer interviewing a 52 year-old veteran. Why is the veteran thinking so hard about this problem? Their brain must have fossilized. NO HIRE.


Let me be brutally frank:

Part of the job of being a software developer is to understand the ways in which things that appear simple—like code, user experiences, security protocols, and almost everything else we touch—are not actually simple. And it is our job to either make them simple, or make it very clear that they aren’t that simple and document the way in which they aren’t simple.

If you’re interviewing candidates, it’s your responsibility to figure out that the interviewing process, including the questions you ask, isn’t as simple as it appears, and then to make it simple. Ask hard questions, fine, but don’t make the process hard.

tl;dr

Hidden possible requirements add stress to the interviewing process, and it’s bad stress, not good stress. And it causes the most stress for the candidates with the most experience. So you get bad results.

It’s so easy to get good results from questions: Be clear about what you want from the candidate.

So if you are interviewing programmers, here’s your homework: Go through all of your interview questions, whether technical or otherwise, and ask yourself if there are any hidden assumptions about what you expect. Then ask yourself if your process would be even better if you made those implicit requirements explicit.

I think you’ll find that eliminating Carnac the Magnificent from your interviewing process will make it better.



post scriptum:

Note: I’m not saying that the author of TFA poses the question exactly as worded to real candidates without any further discussion. TFA is a blog post explaining the problem to fellow interviewers, not going into all the details of how to set up the problem, whether to encourage conversation or give hints, and so forth.

However, I am saying that I have seen similar problems posed as filters, without any further exposition about what is wanted, and I have definitely encountered interviewers who have hidden expectations, interviewers who are impatient if asked too many questions, interviewers who are judging candidates by the speed of writing a solution and consider questions to take up valuable time, and so forth.

TFA was simply an excuse to discuss something I have observed many, many times over the past couple of decades.

https://raganwald.com/2015/05/08/carnac-the-magnificent
Hilbert's Grand JavaScript School (2015 Edition)

(This material originally appeared, using ECMAScript-5 semantics, in 2013.)


Dr. Hilbert “Bertie” David grows tired of blogging about JavaScript, and decides to cash in on the seemingly inexhaustible supply of impressionable young minds seeking to “Learn JavaScript in Five Days.”

He opens his Grand JavaScript School on the shores of the Andaman Sea in Thailand, and with some clever engineering, he is able to install a countably infinite[1] number of seats in his lecture hall.

Island panorama 3

day one

Well, an infinite number of students show up on the first day. “Line up please!” he calls out to them with a bullhorn of his own invention. “Line up! Good. Each of you has a number. The first person in line is zero, the next person is one, and so on. The machine will call out a number. When you hear your number, step forward, pay your fee in bitcoins, take your receipt, then you may enter the lecture hall and find the seat with your number on it. Thank you, the lecture will begin when everyone has been seated.”

Bertie quickly whips out a JavaScript IDE he has devised, and he writes himself a generator. Instead of iterating over a data structure in memory, it generates seat numbers on demand:

function* Numbers (from = 0) {
  let number = from;
  while (true)
    yield number++;
};

const seats = Numbers();

for (let seat of seats) {
  console.log(seat);
}

//=>
  0
  1
  2
  3
  ...

He simply calls out the numbers as they are printed, and the students file into the auditorium in an orderly fashion, filling it completely. Well, the first day is very long indeed. But Bertie has an infinite supply of bitcoins and things go well.
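
Bertie has countably infinite time, but anyone trying this at home will find that the loop above never finishes. One way to sample just the front of the line is a small helper that stops after a fixed count. This is a sketch; the name `take` is an assumption for illustration and does not appear in Bertie's program:

```javascript
function* Numbers(from = 0) {
  let number = from;
  while (true)
    yield number++;
}

// Hypothetical helper: yields at most `count` elements, then stops,
// so that an infinite iterable can be sampled safely.
function* take(count, iterable) {
  if (count <= 0) return;

  let remaining = count;
  for (const element of iterable) {
    yield element;
    if (--remaining === 0) return;
  }
}

console.log([...take(5, Numbers())]);
  //=> [0, 1, 2, 3, 4]
```

Because `take` checks the count right after yielding, it never asks the underlying generator for more elements than it hands out.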

Avoiding the well-travelled road of explaining “this,” “closures,” or “monads,” he decides to explain the difference between functional iterators and iterables.[2] People are scratching their heads, but on the second day, all of the students from the first day return. So it must have been a decent lecture.

day two

In fact, a few people liked the lecture so much that they recommended it to their friends, and one million additional students are lined up for seats in his class on the morning of the second day. He has an infinite number of seats in the auditorium, but they are all full. What can he do?

Out comes the IDE and the bullhorn. This time, he digs into his copy of JavaScript Allongé, The “Six” Edition and writes the following:

const zipIterables = (...iterables) =>
  ({
    [Symbol.iterator]: function * () {
      const iterators = iterables.map(i => i[Symbol.iterator]());
      
      while (true) {
        const pairs = iterators.map(j => j.next()),
              dones = pairs.map(p => p.done),
              values = pairs.map(p => p.value);
        
        if (dones.indexOf(true) >= 0) break;
        yield values;
      }
    }
  });

const oldSeats = Numbers(0),
      newSeats = Numbers(1000000),
      correspondence = zipIterables(oldSeats, newSeats);

for (let pair of correspondence) {
  const [from, to] = pair;
  console.log(`${from} -> ${to}`);
}

//=>
  0 -> 1000000
  1 -> 1000001
  2 -> 1000002
  3 -> 1000003
  ...

He’s constructed an iterable with instructions for moving seats. Bertie tells the first person to move from seat zero to seat one million, the second from one to one million and one, and so forth. This means that seats 0 through 999,999 become vacant, so the 1,000,000 new students have a place to sit. Day Two goes well, and he is very pleased with his venture.

day three

His fame spreads, and Jeff Atwood starts a discussion about Bertie’s JavaScript School on his new Discourse discussion platform. There’s so much interest, Jeff charters a bus with an infinite number of seats and brings his infinite number of fans to Bertie’s school for Day Three. The bus’s seats have numbers from zero just like the auditorium.

All of the students from Day Two have returned, so the auditorium is already full. Bertie is perplexed, but after scratching his head for a few moments, whips out his bullhorn and writes the following JavaScript:

const mapIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      for (let element of iterable) {
        yield fn(element);
      }
    }
  });

const oldSeats = Numbers(0),
      newSeats = mapIterableWith(n => n * 2, Numbers(0)),
      correspondence = zipIterables(oldSeats, newSeats);

for (let pair of correspondence) {
  const [from, to] = pair;
  console.log(`${from} -> ${to}`);
}

//=>
  0 -> 0
  1 -> 2
  2 -> 4
  3 -> 6
  4 -> 8
  5 -> 10
  ...

Now all the existing students are in the even numbered seats, so he’s ready to seat Jeff’s fans:


const oldSeats = Numbers(0),
      newSeats = mapIterableWith(n => n * 2 + 1, Numbers(0)),
      correspondence = zipIterables(oldSeats, newSeats);

for (let pair of correspondence) {
  const [from, to] = pair;
  console.log(`${from} -> ${to}`);
}

//=>
  0 -> 1
  1 -> 3
  2 -> 5
  3 -> 7
  4 -> 9
  5 -> 11
  ...

Bertie calls out the seat numbers on Jeff’s bus and the number of an odd-numbered (and therefore vacant) seat in the auditorium for them to occupy. Bertie has managed to add an infinite number of students to an infinitely large but full auditorium.
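
The scheme works because the two mappings never hand out the same seat and never skip one. That can't be tested for all infinitely many students, but a finite spot check is easy to sketch. The function names and the sample size below are assumptions chosen for this example:

```javascript
const seatForReturningStudent = (n) => n * 2;      // old seat n -> even seat
const seatForBusPassenger     = (n) => n * 2 + 1;  // bus seat n -> odd seat

// Seat the first `sampleSize` people from each group and record the seats.
const sampleSize = 1000;
const assigned = new Set();

for (let n = 0; n < sampleSize; ++n) {
  assigned.add(seatForReturningStudent(n));
  assigned.add(seatForBusPassenger(n));
}

// No collisions: every person received a different seat.
const noCollisions = assigned.size === 2 * sampleSize;

// No gaps: seats 0 through 1999 are all occupied.
const noGaps = [...Array(2 * sampleSize).keys()].every(seat => assigned.has(seat));

console.log(noCollisions, noGaps);
  //=> true true
```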

He’s so pleased, Bertie lets Jeff be the guest lecturer. The audience has loved Bertie’s abstract approach to programming so far, but they’re hungry for practical knowledge and Jeff enthrals them with a walkthrough of how the Discourse User Experience is implemented.

As a bonus, Jeff shares his insights into programming productivity.[3] By the end of the day, everyone is typing over 100wpm and has placed an order for multiple wall-sized monitors on eBay.

Nice selection of REAL buses!

day four

Day Three went well, so all the students return and the auditorium remains full. Everyone is very pleased and looking forward to Day Four.

But the excitement has a downside: Reddit hears about what’s going on and an infinite number of subreddits, each of which has an infinite number of redditors, all decide to show up on day four to disrupt his lecture with trolling about how lame JavaScript is as a programming language. Each sends an infinitely large bus, with every seat full. Like Jeff’s bus, each bus numbers its seats from zero and, as luck would have it, each bus has a number and the buses are numbered from zero.

Bertie has to seat an infinite number of infinite groups of people, in an infinite auditorium that is already full! Now what? Out comes the bullhorn and yesterday’s program, and he quickly moves all of his existing students into the even-numbered seats, leaving an infinite number of odd seats available for newcomers.

He starts with the obvious: If you have three buses with three seats each, you can put the students into a one-to-one correspondence with the odd numbers by nesting iterators, like this:

function * seatsOnBuses(buses, seats) {
  for (let bus of buses) {
    for (let seat of seats) {
      yield [bus, seat];
    }
  }
};

He writes a quick test:

const seatAndBus = seatsOnBuses([0, 1, 2], [0, 1, 2]),
      newSeats = mapIterableWith(n => n * 2 + 1, Numbers(0)),
      correspondence = zipIterables(seatAndBus, newSeats);

for (let pair of correspondence) {
  const [[bus, seat], to] = pair;
  console.log(`bus ${bus}, seat ${seat} -> seat ${to}`);
}

//=>
  bus 0, seat 0 -> seat 1
  bus 0, seat 1 -> seat 3
  bus 0, seat 2 -> seat 5
  bus 1, seat 0 -> seat 7
  bus 1, seat 1 -> seat 9
  bus 1, seat 2 -> seat 11
  bus 2, seat 0 -> seat 13
  bus 2, seat 1 -> seat 15
  bus 2, seat 2 -> seat 17

Looks good, he grabs the bullhorn and writes:

const seatAndBus = seatsOnBuses(Numbers(), Numbers()),
      newSeats = mapIterableWith(n => n * 2 + 1, Numbers(0)),
      correspondence = zipIterables(seatAndBus, newSeats);

for (let pair of correspondence) {
  const [[bus, seat], to] = pair;
  console.log(`bus ${bus}, seat ${seat} -> seat ${to}`);
}

//=>
  bus 0, seat 0 -> seat 1
  bus 0, seat 1 -> seat 3
  bus 0, seat 2 -> seat 5
  bus 0, seat 3 -> seat 7
  bus 0, seat 4 -> seat 9
  bus 0, seat 5 -> seat 11
  bus 0, seat 6 -> seat 13
  bus 0, seat 7 -> seat 15
  bus 0, seat 8 -> seat 17
  bus 0, seat 9 -> seat 19
  ...

After he has been seating people from bus 0 for a good long while, people from the other buses get restless. When will they be seated? What seat will they have? Bertie realizes that although there are infinite numbers of people involved, up to this point, he could point to any one student and tell them exactly where they would end up being seated.

But with this scheme, he can’t really put anyone from any of the other buses into a particular seat. He calls for order, and tries again:

function * Diagonals () {
  for (let n of Numbers()) {
    for (let i = 0; i <= n; ++i) {
      yield [i, n-i];
    }
  }
};

const seatAndBus = Diagonals(),
      newSeats = mapIterableWith(n => n * 2 + 1, Numbers(0)),
      correspondence = zipIterables(seatAndBus, newSeats);

for (let pair of correspondence) {
  const [[bus, seat], to] = pair;
  console.log(`bus ${bus}, seat ${seat} -> seat ${to}`);
}

//=>
  bus 0, seat 0 -> seat 1
  bus 0, seat 1 -> seat 3
  bus 1, seat 0 -> seat 5
  bus 0, seat 2 -> seat 7
  bus 1, seat 1 -> seat 9
  bus 2, seat 0 -> seat 11
  bus 0, seat 3 -> seat 13
  bus 1, seat 2 -> seat 15
  bus 2, seat 1 -> seat 17
  bus 3, seat 0 -> seat 19
  ...

If you think of the buses and seats forming a square, the diagonals iterator makes a path from one corner and works its way out, enumerating over every possible combination of bus and seat. Thus, given countably infinite time, it will list every one of the countably infinite number of Redditors on each of the countably infinite number of buses.
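
This ordering also restores what the nested loops lost: Bertie can once again point to any one Redditor and say exactly where they will sit, without iterating at all. Here is a sketch of the arithmetic, with function names invented for this example. It depends on the pairs being yielded in exactly the order `Diagonals` produces them:

```javascript
// Position of [bus, seat] in the order Diagonals() yields pairs. The earlier
// diagonals hold 1 + 2 + ... + n pairs in total, where n = bus + seat, and
// within diagonal n the pairs are ordered by bus number.
const positionInLine = (bus, seat) => {
  const n = bus + seat;
  return n * (n + 1) / 2 + bus;
};

// Newcomers take the odd-numbered seats, as in the correspondence above.
const auditoriumSeat = (bus, seat) => positionInLine(bus, seat) * 2 + 1;

console.log(auditoriumSeat(0, 0));  //=> 1
console.log(auditoriumSeat(1, 1));  //=> 9
console.log(auditoriumSeat(3, 0));  //=> 19
```

These agree with the first ten lines of output listed above.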

Well, this seats an infinite number of Redditors on an infinite number of buses in an infinite auditorium that was already full. He does a code walkthrough with the students, then segues on to talk about other interesting aspects of Georg Cantor’s[4] work and a digression into Hotel Management.[5] By the time he finishes with a discussion of the Hypergame[6] proof of the infinite number of infinities, everyone has forgotten that they came to scoff.

He finishes with a summary of what he learned seating students:

  1. You can put a countably infinite number of seats into a one-to-one correspondence with the numbers, therefore they have the same cardinality.
  2. You can add a finite number to a countably infinite number and put your new number into a one-to-one correspondence with the numbers, therefore infinity plus a finite number has the same cardinality as the numbers.
  3. You can add infinity to infinity and put your new number into a one-to-one correspondence with the numbers, therefore infinity plus infinity has the same cardinality as the numbers. By induction, you can add a finite number of infinities together and have the same cardinality as the numbers.
  4. You can add an infinite number of infinities to infinity and put your new number into a one-to-one correspondence with the numbers, therefore an infinity times infinity has the same cardinality as the numbers.

day five

On Day Five, everyone is back and he announces that there will be a test: “Outside our doors,” he announces, “are an infinite number of aircraft carriers, each of which has an infinitely large flight deck. Parked on each flight deck are an infinite number of buses, each of which contains–you guessed it–an infinite number of sailors and air crew eager to join our school for the next semester.”

“Write a JavaScript program to seat them all in our lecture hall. If your program works, you may come up to the front and receive your signed diploma. If you can prove that no program works, you will also receive your diploma. Good luck!”

Before long, the students have figured out that yes, the school can accommodate a countably infinite number of aircraft carriers, each carrying a countably infinite number of buses, each carrying a countably infinite number of students.

The simplest way to infer that this is true is to observe that each aircraft carrier contains a countably infinite number of buses, each carrying a countably infinite number of students. We already know how to put that into a one-to-one correspondence with a countably infinite set of numbers using the diagonalization above. So step one is to take each aircraft carrier’s students and put them into a one-to-one correspondence with the numbers counting from zero.

Now that we have done this, we have an infinite number of aircraft carriers, each containing a countably infinite number of students. We can put this into a one-to-one correspondence with the seat numbers using diagonalization, so now we have all the students in seats.
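
The two-step argument can be sketched directly in code. This is one possible reading of it, not the students' official answer: `pairAt` and `CarrierBusSeat` are names invented here, and `pairAt` simply computes which pair `Diagonals` would yield at a given position.

```javascript
function* Numbers(from = 0) {
  let number = from;
  while (true)
    yield number++;
}

function* Diagonals() {
  for (const n of Numbers()) {
    for (let i = 0; i <= n; ++i) {
      yield [i, n - i];
    }
  }
}

// Hypothetical helper: the pair that Diagonals() yields at a given position,
// computed directly rather than by iterating.
const pairAt = (position) => {
  let n = 0;
  while ((n + 1) * (n + 2) / 2 <= position) ++n;   // find the diagonal
  const i = position - n * (n + 1) / 2;            // offset within it
  return [i, n - i];
};

// Step one: number each carrier's students by position, where pairAt turns
// a position back into [bus, seat].
// Step two: diagonalize over carriers and those positions.
function* CarrierBusSeat() {
  for (const [carrier, position] of Diagonals()) {
    const [bus, seat] = pairAt(position);
    yield [carrier, bus, seat];
  }
}

const firstFour = [];
for (const triple of CarrierBusSeat()) {
  firstFour.push(triple);
  if (firstFour.length === 4) break;
}
console.log(firstFour);
  //=> [[0,0,0], [0,0,1], [1,0,0], [0,1,0]]
```

Zipping `CarrierBusSeat()` with the odd seat numbers, as on Day Four, would then give every sailor a definite seat.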

But as an exercise, it is valuable to try writing a single diagonalization that operates directly on three iterators (aircraft carriers, buses, and seats). You may use the Babel REPL online, or make an ES-6 Fiddle you can share.

day six

Bertie goes home, exhausted, and dreams that having graduated everyone at the end of Day Five, things are busier than ever. In his dreams he imagines an infinite number of galactic superclusters, each containing an infinite number of galaxies, each containing an infinite number of stars, each containing an infinite number of worlds, each containing an infinite number of oceans, each containing an infinite number of aircraft carriers, each containing an infinite number of buses, each containing an infinite number of students.

He awakens and reasons that what he is dealing with are powers of infinity. A simple infinity is infinity to the first power. Two infinities (buses and students) is infinity to the second power. Three infinities (aircraft carriers, buses, and students) is infinity to the third power. And so forth up to galactic superclusters, infinity to the eighth power.

He quickly observes that no matter what finite power he chooses, he can put the total number of students into a one-to-one correspondence with the countably infinite number of seats in his auditorium. But then he wonders… What if he has infinity raised to the power of infinity students to seat?

He imagines some kind of crazy infinite series of ever-greater cosmological containers, universes contained within the atoms of other universes, time and space folding over unto itself. In such a crazy circumstance, can all the students be accommodated?

read his answer here


  1. Meaning, he is able to put the seats in a one-to-one correspondence with the natural numbers. He does this by numbering the seats from zero. See Countable Sets

  2. Lazy Iterables in JavaScript 

  3. “As far as I’m concerned, you can never be too rich, too thin, or have too much screen space.”–Three Monitors For Every User 

  4. https://en.wikipedia.org/wiki/Georg_Cantor 

  5. https://en.wikipedia.org/wiki/Hilbert's_paradox_of_the_Grand_Hotel 

  6. http://www.kongregate.com/forums/9/topics/93615 

https://raganwald.com/2015/04/24/hilberts-school
Left-Variadic Functions in JavaScript

Football at Fenway

A variadic function is designed to accept a variable number of arguments.[1] In JavaScript, you can make a variadic function by gathering parameters. For example:

const abccc = (a, b, ...c) => {
  console.log(a);
  console.log(b);
  console.log(c);
};

abccc(1, 2, 3, 4, 5)
  1
  2
  [3,4,5]

This can be useful when writing certain kinds of destructuring algorithms. For example, we might want to have a function that builds some kind of team record. It accepts a coach, a captain, and an arbitrary number of players. Easy in ECMAScript 2015:

function team(coach, captain, ...players) {
  console.log(`${captain} (captain)`);
  for (let player of players) {
    console.log(player);
  }
  console.log(`squad coached by ${coach}`);
}

team('Luis Enrique', 'Xavi Hernández', 'Marc-André ter Stegen', 'Martín Montoya', 'Gerard Piqué')
  //=>
    Xavi Hernández (captain)
    Marc-André ter Stegen
    Martín Montoya
    Gerard Piqué
    squad coached by Luis Enrique

But we can’t go the other way around:

function team2(...players, captain, coach) {
  console.log(`${captain} (captain)`);
  for (let player of players) {
    console.log(player);
  }
  console.log(`squad coached by ${coach}`);
}
//=> Unexpected token

ECMAScript 2015 only permits gathering parameters from the end of the parameter list. Not the beginning. What to do?

a history lesson

In “Ye Olde Days,”[2] JavaScript could not gather parameters, and we had to either do backflips with arguments and .slice, or we wrote ourselves a variadic decorator that could gather arguments into the last declared parameter. Here it is in all of its ECMAScript-5 glory:

var __slice = Array.prototype.slice;

function rightVariadic (fn) {
  if (fn.length < 1) return fn;

  return function () {
    var ordinaryArgs = (1 <= arguments.length ? 
          __slice.call(arguments, 0, fn.length - 1) : []),
        restOfTheArgsList = __slice.call(arguments, fn.length - 1),
        args = (fn.length <= arguments.length ?
          ordinaryArgs.concat([restOfTheArgsList]) : []);
    
    return fn.apply(this, args);
  }
};

var firstAndButFirst = rightVariadic(function test (first, butFirst) {
  return [first, butFirst]
});

firstAndButFirst('why', 'hello', 'there', 'little', 'droid')
  //=> ["why",["hello","there","little","droid"]]

We don’t need rightVariadic any more, because instead of:

var firstAndButFirst = rightVariadic(
  function test (first, butFirst) {
    return [first, butFirst]
  });

We now simply write:

const firstAndButFirst = (first, ...butFirst) =>
  [first, butFirst];

This is a right-variadic function, meaning that it has one or more fixed arguments, and the rest are gathered into the rightmost argument.

overcoming limitations

It’s nice to have progress. But as noted above, we can’t write:

const butLastAndLast = (...butLast, last) =>
  [butLast, last];

That’s a left-variadic function. All left-variadic functions have one or more fixed arguments, and the rest are gathered into the leftmost argument. JavaScript doesn’t do this. But if we wanted to write left-variadic functions, could we make ourselves a leftVariadic decorator to turn a function with one or more arguments into a left-variadic function?

We sure can, by using the techniques from rightVariadic. Mind you, we can take advantage of modern JavaScript to simplify the code:

const leftVariadic = (fn) => {
  if (fn.length < 1) {
    return fn;
  }
  else {
    return function (...args) {
      const gathered = args.slice(0, args.length - fn.length + 1),
            spread   = args.slice(args.length - fn.length + 1);
            
      return fn.apply(
        this, [gathered].concat(spread)
      );
    }
  }
};

const butLastAndLast = leftVariadic((butLast, last) =>
  [butLast, last]);

butLastAndLast('why', 'hello', 'there', 'little', 'droid')
  //=> [["why","hello","there","little"],"droid"]

Our leftVariadic function is a decorator that turns any function into a function that gathers parameters from the left, instead of from the right.
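
With the decorator in hand, the team function that produced a syntax error earlier can be written after all. The sketch below repeats `leftVariadic` so that it runs on its own, and returns the lines as an array rather than logging them directly, which is a choice made for this example only:

```javascript
const leftVariadic = (fn) => {
  if (fn.length < 1) {
    return fn;
  }
  else {
    return function (...args) {
      const gathered = args.slice(0, args.length - fn.length + 1),
            spread   = args.slice(args.length - fn.length + 1);

      return fn.apply(
        this, [gathered].concat(spread)
      );
    }
  }
};

// The captain and coach are the last two arguments;
// everyone listed before them is gathered into players.
const team2 = leftVariadic((players, captain, coach) =>
  [`${captain} (captain)`, ...players, `squad coached by ${coach}`]);

const roster = team2(
  'Marc-André ter Stegen', 'Martín Montoya', 'Gerard Piqué',
  'Xavi Hernández', 'Luis Enrique'
);

console.log(roster.join('\n'));
  //=>
  //  Xavi Hernández (captain)
  //  Marc-André ter Stegen
  //  Martín Montoya
  //  Gerard Piqué
  //  squad coached by Luis Enrique
```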

left-variadic destructuring

Gathering arguments for functions is one of the ways JavaScript can destructure arrays. Another way is when assigning variables, like this:

const [first, ...butFirst] = ['why', 'hello', 'there', 'little', 'droid'];

first
  //=> 'why'
butFirst
  //=> ["hello","there","little","droid"]

As with parameters, we can’t gather values from the left when destructuring an array:

const [...butLast, last] = ['why', 'hello', 'there', 'little', 'droid'];
  //=> Unexpected token

We could use leftVariadic the hard way:

const [butLast, last] = leftVariadic((butLast, last) =>
  [butLast, last])(...['why', 'hello', 'there', 'little', 'droid']);

butLast
  //=> ['why', 'hello', 'there', 'little']

last
  //=> 'droid'

But we can write our own left-gathering function utility using the same principles without all the tedium:

const leftGather = (outputArrayLength) => {
  return function (inputArray) {
    const gathered = inputArray.slice(0, inputArray.length - outputArrayLength + 1),
          spread = inputArray.slice(inputArray.length - outputArrayLength + 1);
          
    return [gathered].concat(spread);
  }
};

const [butLast, last] = leftGather(2)(['why', 'hello', 'there', 'little', 'droid']);
  
butLast
  //=> ['why', 'hello', 'there', 'little']

last
  //=> 'droid'

With leftGather, we have to supply the length of the array we wish to use as the result, and it gathers excess arguments into it from the left, just like leftVariadic gathers excess parameters for a function.
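
The length need not be two. As a sketch, here is the same utility asked for three outputs, so that the last two elements get names of their own and everything before them is gathered. The variable names on the left are arbitrary choices for this example:

```javascript
const leftGather = (outputArrayLength) => {
  return function (inputArray) {
    const gathered = inputArray.slice(0, inputArray.length - outputArrayLength + 1),
          spread = inputArray.slice(inputArray.length - outputArrayLength + 1);

    return [gathered].concat(spread);
  }
};

const [others, secondLast, last] =
  leftGather(3)(['why', 'hello', 'there', 'little', 'droid']);

console.log(others);
  //=> ['why', 'hello', 'there']
console.log(secondLast);
  //=> 'little'
console.log(last);
  //=> 'droid'
```

Note that this sketch assumes the input has at least `outputArrayLength - 1` elements; shorter inputs are not handled specially.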

summary

ECMAScript 2015 makes it easy to gather parameters or array elements from the right. If we want to gather them from the left, we can roll our own left-variadic decorator for functions, or left-gatherer for destructuring arrays.



This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition.


  1. English is about as inconsistent as JavaScript: Functions with a fixed number of arguments can be unary, binary, ternary, and so forth. But can they be “variary?” No! They have to be “variadic.” 

  2. Another history lesson. “Ye” in “Ye Olde,” was not actually spelled with a “Y” in days of old, it was spelled with a thorn, and is pronounced “the.” Another word, “Ye” in “Ye of little programming faith,” is pronounced “ye,” but it’s a different word altogether. 

https://raganwald.com/2015/04/03/left-variadic
Partial Application in ECMAScript 2015

Some of this material originally appeared in What’s the difference between Currying and Partial Application? Here it is again, in ECMAScript 2015 syntax.


What is Partial Application? Good question!

arity

Before we jump in, let’s get some terminology straight. Functions have arity, meaning the number of arguments they accept. A “unary” function accepts one argument, and a “polyadic” function takes more than one argument. There are specialized terms we can use: A “binary” function accepts two, a “ternary” function accepts three, and you can rustle about with Greek or Latin words and invent names for functions that accept more than three arguments.

(Some functions accept a variable number of arguments; we call them variadic. Variadic functions and functions taking no arguments aren’t our primary focus in this essay.)
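
JavaScript lets us peek at arity: every function has a .length property that reports how many parameters it declares, not counting a rest parameter. A small sketch, with function names invented for illustration:

```javascript
const negate = (n) => -n;                                          // unary
const plus = (a, b) => a + b;                                      // binary
const total = (...numbers) => numbers.reduce((a, b) => a + b, 0);  // variadic

negate.length;
  //=> 1

plus.length;
  //=> 2

// a rest parameter is not counted
total.length;
  //=> 0
```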

partial application

Partial application is straightforward. We could start with addition or some such completely trivial example, but if you don’t mind we’ll have a look at something that is of actual use in daily programming.

As a preamble, let’s make ourselves a mapWith function that maps a function over any collection that has a .map method:

const mapWith = (unaryFn, collection) =>
  collection.map(unaryFn);

const square = (n) => n * n;

mapWith(square, [1, 2, 3])
  //=> [1, 4, 9]

mapWith is a binary function, square is a unary function. When we called mapWith with arguments square and [1, 2, 3], we applied the arguments to the function and got our result.

Since mapWith takes two arguments, and we supplied two arguments, we fully applied the arguments to the function. So what’s “partial” application? Supplying fewer arguments. Like supplying one argument to mapWith.

Now what happens if we supply one argument to mapWith? We can’t get a result without the other argument, so as we’ve written it, it breaks:

mapWith(square)
  //=> undefined is not an object (evaluating 'collection.map')

But let’s imagine that we could apply fewer arguments. We wouldn’t be fully applying mapWith, we’d be partially applying mapWith. What would we expect to get? Well, imagine we decide to buy a $2,000 bicycle. We go into the store, we give them $1,000. What do we get back? A piece of paper saying that we are doing a layaway program. The $1,000 is held in trust for us, and when we come back with the other $1,000, we get the bicycle.

Putting down $1,000 on a $2,000 bicycle is partially buying a bicycle. What it gets us is the right to finish buying the bicycle later. It’s the same with partial application. If we were able to partially apply the mapWith function, we’d get back the right to finish applying it later, with the other argument.

Something like this:

const mapWith =
  (unaryFn) =>
    (collection) =>
      collection.map(unaryFn);

const square = (n) => n * n;

const partiallyAppliedMapWith = mapWith(square);

partiallyAppliedMapWith([1, 2, 3])
  //=> [1, 4, 9]

The thing is, we don’t want to always write our functions in this way. So what we want is a function that takes this:

const mapWith = (unaryFn, collection) =>
  collection.map(unaryFn);

And turns it into this:

partiallyAppliedMapWith([1, 2, 3])
  //=> [1, 4, 9]

Working backwards:

const partiallyAppliedMapWith = (collection) =>
  mapWith(unaryFn, collection);

The expression (collection) => mapWith(unaryFn, collection) has two free variables, mapWith and unaryFn. If we were using a fancy refactoring editor, we could extract a function. Let’s do it by hand:

const ____ = (mapWith, unaryFn) =>
  (collection) =>
    mapWith(unaryFn, collection);

What is this ____ function? It takes the mapWith function and the unaryFn, and returns a function that takes a collection and returns the result of applying unaryFn and collection to mapWith. Let’s make the names very generic, since the function works no matter what we call the arguments:

const ____ =
  (fn, x) =>
    (y) =>
      fn(x, y);

This is a function that takes a function and an argument, and returns a function that takes another argument, and applies both arguments to the function. So, we can write this:

const mapWith = (unaryFn, collection) =>
  collection.map(unaryFn);
  
const square = (n) => n * n;

const ____ =
  (fn, x) =>
    (y) =>
      fn(x, y);
    
const partiallyAppliedMapWith = ____(mapWith, square);

partiallyAppliedMapWith([1, 2, 3])
  //=> [1, 4, 9]

So what is this ____ function? It partially applies one argument to any function that takes two arguments.

We can dress it up a bit. For one thing, it doesn’t work with methods, it’s strictly a blue higher-order function. Let’s make it khaki by passing this properly:

const ____ =
  (fn, x) =>
    function (y) {
      return fn.call(this, x, y);
    };

Another problem is that it only works for binary functions. Let’s make it so we can pass one argument and we get back a function that takes as many remaining arguments as we like:

const ____ =
  (fn, x) =>
    function (...remainingArgs) {
      return fn.call(this, x, ...remainingArgs);
    };
    
const add = (verb, a, b) =>
  `The ${verb} of ${a} and ${b} is ${a + b}`
  
const summer = ____(add, 'sum');

summer(2, 3)
  //=> The sum of 2 and 3 is 5

And what if we want to partially apply more than one argument?

const ____ =
  (fn, ...partiallyAppliedArgs) =>
    function (...remainingArgs) {
      return fn.apply(this, partiallyAppliedArgs.concat(remainingArgs));
    };
    
const add = (verb, a, b) =>
  `The ${verb} of ${a} and ${b} is ${a + b}`
  
const sum2 = ____(add, 'sum', 2);

sum2(3)
  //=> The sum of 2 and 3 is 5

What we have just written is a left partial application function: Given any function and some arguments, we partially apply those arguments and get back a function that applies those arguments and any more you supply to the original function.

Partial application is thus the application of part of the arguments of a function, and getting in return a function that takes the remaining arguments.

And here’s our finished function, properly named:

const leftPartialApply =
  (fn, ...partiallyAppliedArgs) =>
    function (...remainingArgs) {
      return fn.apply(this, partiallyAppliedArgs.concat(remainingArgs));
    };
    
const add = (verb, a, b) =>
  `The ${verb} of ${a} and ${b} is ${a + b}`
  
const sum2 = leftPartialApply(add, 'sum', 2);

sum2(3)
  //=> The sum of 2 and 3 is 5

right partial application

It is very convenient to have a mapWith function, because you are far more likely to want to write something like:

const squareAll = leftPartialApply(mapWith, x => x * x);

Than to write:

const map123 = leftPartialApply(map, [1, 2, 3]);

But sometimes you have map but not mapWith, or some other analogous situation where you want to apply the values from the right rather than the left. No problem:

const rightPartialApply =
  (fn, ...partiallyAppliedArgs) =>
    function (...remainingArgs) {
      return fn.apply(this, remainingArgs.concat(partiallyAppliedArgs));
    };
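
To see it at work, here is a small sketch. The collection-first map function is an assumption of ours: a hypothetical counterpart to mapWith that takes its arguments in the opposite order.

```javascript
const rightPartialApply =
  (fn, ...partiallyAppliedArgs) =>
    function (...remainingArgs) {
      return fn.apply(this, remainingArgs.concat(partiallyAppliedArgs));
    };

// Hypothetical collection-first map: like mapWith, with the arguments swapped
const map = (collection, unaryFn) =>
  collection.map(unaryFn);

// The function is supplied now, on the right; the collection arrives later, on the left
const squareAll = rightPartialApply(map, x => x * x);

squareAll([1, 2, 3]);
  //=> [1, 4, 9]
```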

arbitrary partial application

What if you want to apply some, but not all of the arguments, and they may not be neatly lined up at the beginning or end? This is also possible, provided we define a placeholder of some kind, and then write some code to “fill in the blanks”.

This implementation takes a “template” of values. You insert placeholder values (traditionally _, but anything will do) where you want values to be supplied later.

const div = (verbed, numerator, denominator) =>
  `${numerator} ${verbed} ${denominator} is ${numerator/denominator}`
  
div('divided by', 1, 3)
  //=> 1 divided by 3 is 0.3333333333333333
  
const arbitraryPartialApply = (() => {
  const placeholder = {},
        arbitraryPartialApply = (fn, ...template) => {
          let remainingArgIndex = 0;
          const mapper = template.map((templateArg) =>
                           templateArg === placeholder
                             ? ((i) => (args) => args[i])(remainingArgIndex++)
                             : (args) => templateArg);
          
          return function (...remainingArgs) {
            const composedArgs = mapper.map(f => f(remainingArgs));
            
            return fn.apply(this, composedArgs);
          }
          
        };
        
  arbitraryPartialApply._ = placeholder;
  return arbitraryPartialApply;
})();

const _ = arbitraryPartialApply._;

const dividedByThree =
  arbitraryPartialApply(div, 'divided by', _, 3);

dividedByThree(2)
  //=> 2 divided by 3 is 0.6666666666666666

Arbitrary partial application handles most of the cases for left- or right-partial application, but has more internal moving parts. It also doesn’t handle cases involving an arbitrary number of parameters.
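
Both points can be seen in a small sketch. To keep it short and self-contained, it uses fillTemplate, a condensed stand-in of ours for arbitraryPartialApply (not the implementation above), with a plain object as the placeholder:

```javascript
const _ = {};  // placeholder value; any unique object will do

// Condensed stand-in for arbitraryPartialApply (a hypothetical helper for this sketch):
// each placeholder in the template is filled, in order, from the later arguments
const fillTemplate = (fn, ...template) =>
  function (...remainingArgs) {
    let next = 0;
    const composedArgs = template.map((templateArg) =>
      templateArg === _ ? remainingArgs[next++] : templateArg);

    return fn.apply(this, composedArgs);
  };

const div = (verbed, numerator, denominator) =>
  `${numerator} ${verbed} ${denominator} is ${numerator / denominator}`;

// Fix the middle argument only: neither left- nor right-partial application
const tenOver = fillTemplate(div, _, 10, _);

tenOver('divided by', 4);
  //=> 10 divided by 4 is 2.5

// Arguments beyond the template's placeholders are silently dropped
tenOver('divided by', 4, 'ignored');
  //=> 10 divided by 4 is 2.5
```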

For example, the built-in function Math.max returns the largest of its arguments, or -Infinity if no arguments are supplied:

Math.max(1, 2, 3)
  //=> 3
  
Math.max(-1, -2, -3)
  //=> -1
  
Math.max()
  //=> -Infinity

What if we want to have a default argument? For example, what if we want it to return the largest number greater than or equal to 0, or 0 if there aren’t any? We can do that with leftPartialApply, but we can’t with arbitraryPartialApply, because we want to accept an arbitrary number of arguments:

const maxDefaultZero = leftPartialApply(Math.max, 0);

maxDefaultZero(1, 2, 3)
  //=> 3
  
maxDefaultZero(-1, -2, -3)
  //=> 0
  
maxDefaultZero()
  //=> 0

So there’s good reason to have left-, right-, and arbitrary partial application functions in our toolbox.

what’s partial application again?

“Partial application is the conversion of a polyadic function into a function taking fewer arguments by providing one or more arguments in advance.” JavaScript does not have partial application built into the language (yet), but it is possible to write our own higher-order functions that perform left-, right-, or arbitrary partial application.
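
For comparison, the built-in Function.prototype.bind can also supply leading arguments ahead of time. It is not quite the same tool: bind fixes this as well, so it behaves like a left partial application only when we don’t care about the context. A minimal sketch:

```javascript
const add = (verb, a, b) =>
  `The ${verb} of ${a} and ${b} is ${a + b}`;

// The first argument to bind is the `this` value; the rest are supplied in advance
const sum2 = add.bind(null, 'sum', 2);

sum2(3);
  //=> The sum of 2 and 3 is 5
```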



This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition. The extracts so far:

https://raganwald.com/2015/04/01/partial-application
The Symmetry of JavaScript Functions (revised)

In JavaScript, functions are first-class entities: You can store them in data structures, pass them to other functions, and return them from functions. An amazing number of very strong programming techniques arise as a consequence of functions-as-first-class-entities. One of the strongest is also one of the simplest: You can write functions that compose and transform other functions.

a very, very, very basic introduction to decorators

Let’s consider logical negation. We might have a function like this:

const isaFruit = (f) =>
  f === 'apple' || f === 'banana' || f === 'orange';

isaFruit('orange')
  //=> true

We can use it to pick fruit from a basket, using an array’s .filter method:

['pecan',
 'apple',
 'chocolate',
 'cracker',
 'orange'].filter(isaFruit)
  //=> ["apple","orange"]

What if we want the things-that-are-not-fruit? There are a few solutions. Languages like Smalltalk and Ruby have a style where collections provide a .reject method. Or we could write a notaFruit function:

const notaFruit = (f) =>
  f !== 'apple' && f !== 'banana' && f !== 'orange';

['pecan',
 'apple',
 'chocolate',
 'cracker',
 'orange'].filter(notaFruit)
  //=> ["pecan","chocolate","cracker"]

We could also take advantage of function expressions and inline the logical negation of isaFruit:

['pecan',
 'apple',
 'chocolate',
 'cracker',
 'orange'].filter(it => !isaFruit(it));
  //=> ["pecan","chocolate","cracker"]

That is interesting. It’s a pattern we can repeat to find things in the basket that don’t start with “c:”

const startsWithC = (f) =>
  f[0] === 'c' || f[0] === 'C';

['pecan',
 'apple',
 'chocolate',
 'cracker',
 'orange'].filter(it => !startsWithC(it))
  //=> ["pecan","apple","orange"]

We can take advantage of functions-as-first-class-entities to turn this pattern into a function that modifies another function. We can use that to name another function, or even use it inline as an expression:

const not = (fn) =>
  (...args) =>
    !(fn(...args))

const anotherNotaFruit = not(isaFruit);

anotherNotaFruit("pecan")
  //=> true

['pecan',
 'apple',
 'chocolate',
 'cracker',
 'orange'].filter(not(startsWithC))
  //=> ["pecan","apple","orange"]

not is a decorator, a function that takes another function and “decorates it” with new functionality that is semantically related to the original function’s behaviour. This allows us to use not(isaFruit) anywhere we could use isaFruit, or use not(startsWithC) anywhere we can use startsWithC.

not is so trivial that it doesn’t feel like it wins us much, but the exact same principle allows us to write decorators like maybe:

const maybe = (fn) =>
  (...args) => {
    for (let arg of args) {
      if (arg == null) return arg;
    }
    return fn(...args);
  }

[1, null, 3, 4, null, 6, 7].map(maybe(x => x * x))
  //=> [1,null,9,16,null,36,49]

And to make combinators like compose:

const compose = (fn, ...rest) =>
  rest.length === 0
  ? fn
  : (arg) => fn(compose(...rest)(arg));

compose(x => x + 1, y => y * y)(10)
  //=> 101

You’ll find lots of other decorators and combinators swanning about in books about using functions in JavaScript. And your favourite JavaScript library is probably loaded with decorators that memoize the result of an idempotent function, or debounce functions that you may use to call a server from a browser.
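
To give a flavour of what such a library decorator might look like, here is a minimal sketch of a memoizing decorator. It is our own simplification rather than any particular library’s code: it assumes a unary function and uses the argument itself as the cache key.

```javascript
// A minimal memoizing decorator for unary functions (a sketch, not library code)
const memoized = (fn) => {
  const cache = new Map();

  return (arg) => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg));
    }
    return cache.get(arg);
  };
};

let calls = 0;
const slowSquare = (n) => {
  calls += 1;  // count how often the underlying function actually runs
  return n * n;
};

const fastSquare = memoized(slowSquare);

fastSquare(12);
  //=> 144

fastSquare(12);
  //=> 144

calls;
  //=> 1
```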

what makes decorators and combinators easy

The power arising from functions-as-first-class-entities is that we have a very flexible way to make functions out of functions, using functions. We are not “multiplying our entities unnecessarily.” On the surface, decorators and combinators are made possible by the fact that we can pass functions to functions, and return functions that invoke our original functions.

But there’s something else: The fact that all functions are called in the exact same way. We write foo(bar) and know that we will evaluate bar, and pass the resulting value to the function we get by evaluating foo. This allows us to write decorators and combinators that work with any function.

Or does it?

what would make decorators and combinators difficult

Imagine, if you will, that functions came in two colours: blue, and khaki. Now imagine that when we invoke a function in a variable, we type the name of the function in the proper colour. So if we write const square = (x) => x * x with square in blue, we also have to write square(5) in blue, so that square is always blue.

If we write const square = (x) => x * x in blue, but elsewhere we write square(5) in khaki, it won’t work because square is a blue function and square(5) would be a khaki invocation.

If functions worked like that, decorators would be very messy. We’d have to make colour-coded decorators, like a blue maybe and a khaki maybe. We’d have to carefully track which functions have which colours, much as in gendered languages like French, you need to know the gender of all inanimate objects so that you can use the correct gendered grammar when talking about them.

This sounds bad, and for programming tools, it is.1 The general principle is: Have fewer kinds of similar things, but allow the things you do have to combine in flexible ways. You can’t just remove things, you have to also make it very easy to combine things. Functions as first-class-entities are a good example of this, because they allow you to combine functions in flexible ways.

Coloured functions would be an example of how not to do it, because you’d be making it harder to combine functions by balkanizing them.2

Functions don’t have colours in JavaScript. But there are things that have this kind of asymmetry that make things just as awkward. For example, methods in JavaScript are functions. But, when you invoke them, you have to get this set up correctly. You have to either:

  1. Invoke a method as a property of an object. e.g. foo.bar(baz) or foo['bar'](baz).
  2. Bind an object to a method before invoking it, e.g. bar.bind(foo).
  3. Invoke the method with .call or .apply, e.g. bar.call(foo, baz).

Thus, we can imagine that calling a function directly (e.g. bar(baz)) is blue, invoking a function and setting this (e.g. bar.call(foo, baz)) is khaki.

Or in other words, functions are blue, and methods are khaki.
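
Here are the khaki invocations side by side. The object foo and its bar method are made up for this sketch; every variation sets this to foo and so produces the same result:

```javascript
const foo = {
  name: 'foo',
  bar (baz) {
    return `${this.name} got ${baz}`;
  }
};

// Pulling the method out of the object loses its connection to foo...
const bar = foo.bar;

// ...so each khaki invocation has to put `this` back:
const results = [
  foo.bar('baz'),           // 1. invoke as a property
  foo['bar']('baz'),        //    (same thing, with brackets)
  bar.bind(foo)('baz'),     // 2. bind first, then invoke
  bar.call(foo, 'baz'),     // 3. .call
  bar.apply(foo, ['baz'])   //    or .apply
];

results[0];
  //=> 'foo got baz'

results.every(result => result === 'foo got baz');
  //=> true
```

A plain blue call, bar('baz'), is the one variation that does not set this to foo.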

the composability problem

We often write decorators in blue, a/k/a pure functional style. Here’s a decorator that makes a function throw an exception if its argument is not a finite number:

const requiresFinite = (fn) =>
  (n) => {
    if (Number.isFinite(n)){
      return fn(n);
    }
    else throw "Bad Wolf";
  }

const plusOne = x => x + 1;

plusOne(1)
  //=> 2

plusOne([])
  //=> '1' WTF!?

const safePlusOne = requiresFinite(plusOne);

safePlusOne(1)
  //=> 2

safePlusOne([])
  //=> throws "Bad Wolf"

But it won’t work on methods. Here’s a Circle class that has an unsafe .scaleBy method:

class Circle {
  constructor (radius) {
    this.radius = radius;
  }
  circumference () {
    return Math.PI * 2 * this.radius;
  }
  scaleBy (factor) {
    return new Circle(factor * this.radius);
  }
}

const two = new Circle(2);

two.scaleBy(3).circumference()
  //=> 37.69911184307752

two.scaleBy(null).circumference()
  //=> 0 WTF!?

Let’s decorate the scaleBy method to check its argument:

Circle.prototype.scaleBy = requiresFinite(Circle.prototype.scaleBy);

two.scaleBy(null).circumference()
  //=> throws "Bad Wolf"

Looks good, let’s put it into production:

Circle.prototype.scaleBy = requiresFinite(Circle.prototype.scaleBy);

two.scaleBy(3).circumference()
  //=> undefined is not an object (evaluating 'this.radius')

Whoops, we forgot that method invocation is khaki code, so our blue requiresFinite decorator will not work on methods. This is the problem of khaki and blue code colliding.

composing functions with green code

Fortunately, we can write higher-order functions like decorators and combinators in a style that works for both “pure” functions and for methods. We have to use the function keyword so that this is bound, and then invoke our decorated function using .call so that we can pass this along.

Here’s requiresFinite written in this style, which we will call green. It works for decorating both methods and functions:

const requiresFinite = (fn) =>
  function (n) {
    if (Number.isFinite(n)){
      return fn.call(this, n);
    }
    else throw "Bad Wolf";
  }

Circle.prototype.scaleBy = requiresFinite(Circle.prototype.scaleBy);

two.scaleBy(3).circumference()
  //=> 37.69911184307752

two.scaleBy("three").circumference()
  //=> throws "Bad Wolf"

const safePlusOne = requiresFinite(x => x + 1);

safePlusOne(1)
  //=> 2

safePlusOne([])
  //=> throws "Bad Wolf"

We can write all of our decorators and combinators in green style. For example, instead of writing maybe in functional (blue) style like this:

const maybe = (fn) =>
  (...args) => {
    for (let arg of args) {
      if (arg == null) return arg;
    }
    return fn(...args);
  }

We can write it in both functional and method (green) style like this:

const maybe = (method) =>
  function (...args) {
    for (let arg of args) {
      if (arg == null) return arg;
    }
    return method.apply(this, args);
  }

And instead of writing our simple compose in functional (blue) style like this:

const compose = (a, b) =>
  (x) => a(b(x));

We can write it in both functional and method (green) style like this:

const compose = (a, b) =>
  function (x) {
    return a.call(this, b.call(this, x));
  }

What makes JavaScript tolerable is that green handling works for both functional (blue) and method invocation (khaki) code. But when writing large code bases, we have to remain aware that some functions are blue and some are khaki, because if we write a mostly blue program, we could be lured into complacency with blue decorators and combinators for years. But everything would break if a khaki method was introduced that didn’t play nicely with our blue combinators.

The safe thing to do is to write all our higher-order functions in green style, so that they work for functions or methods. And that’s why we might talk about the simpler, blue form when introducing an idea, but we write out the more complete, green form when implementing it as a recipe.
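
As one more illustration, here is the not decorator from earlier rewritten in green style. The Basket class is hypothetical, invented only to give us a method to decorate:

```javascript
// not, in green style: `this` is passed along, so it works for functions and methods
const not = (fn) =>
  function (...args) {
    return !fn.apply(this, args);
  };

// Decorating a plain (blue) function
const isaFruit = (f) =>
  f === 'apple' || f === 'banana' || f === 'orange';

const nonFruit = ['pecan', 'apple', 'orange'].filter(not(isaFruit));

nonFruit;
  //=> ['pecan']

// Decorating a (khaki) method on a hypothetical class
class Basket {
  constructor (items) {
    this.items = items;
  }
  isEmpty () {
    return this.items.length === 0;
  }
}

Basket.prototype.hasItems = not(Basket.prototype.isEmpty);

new Basket(['apple']).hasItems();
  //=> true

new Basket([]).hasItems();
  //=> false
```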

red functions vs. object factories

JavaScript classes (and the equivalent prototype-based patterns) rely on creating objects with the new keyword. As we saw in the example above:

class Circle {
  constructor (radius) {
    this.radius = radius;
  }
  circumference () {
    return Math.PI * 2 * this.radius;
  }
  scaleBy (factor) {
    return new Circle(factor * this.radius);
  }
}

const round = new Circle(1);

round.circumference()
  //=> 6.2831853

That new keyword introduces yet another colour of function, constructors are red functions. We can’t make circles using blue function calls:

const round2 = Circle(2);
  //=> Cannot call a class as a function

[1, 2, 3, 4, 5].map(Circle)
  //=> Cannot call a class as a function

And we certainly can’t use a decorator on them:

const CircleRequiringFiniteRadius = requiresFinite(Circle);

const round3 = new CircleRequiringFiniteRadius(3);
  //=> Cannot call a class as a function

Some experienced developers dislike new because of this problem: It introduces one more kind of function that doesn’t compose neatly with other functions using our existing decorators and combinators.

We could eliminate red functions by using prototypes and Object.create instead of using the class and new keywords. A “factory function” is a function that makes new objects. So instead of writing a Circle class, we would write a CirclePrototype and a CircleFactory function:

const CirclePrototype = {
  circumference () {
    return Math.PI * 2 * this.radius;
  },
  scaleBy (factor) {
    return CircleFactory(factor * this.radius);
  }
};

const CircleFactory = (radius) =>
  Object.create(CirclePrototype, {
    radius: { value: radius, enumerable: true }
  })

CircleFactory(2).scaleBy(3).circumference()
  //=> 37.69911184307752

Now we have a blue CircleFactory function, and we have the benefits of objects and methods, along with the benefits of decorating and composing factories like any other function. For example:

const requiresFinite = (fn) =>
  function (n) {
    if (Number.isFinite(n)){
      return fn.call(this, n);
    }
    else throw "Bad Wolf";
  }

const FiniteCircleFactory = requiresFinite(CircleFactory);

FiniteCircleFactory(2).scaleBy(3).circumference()
  //=> 37.69911184307752

FiniteCircleFactory(null).scaleBy(3).circumference()
  //=> throws "Bad Wolf"

All that being said, programming with factory functions instead of with classes and new is not a cure-all. Besides losing some of the convenience and familiarity for other developers, we’d also have to use extreme discipline for fear that accidentally introducing some red classes would break our carefully crafted “blue in green” application.

In the end, there’s no avoiding the need to know which functions are functions, and which are actually classes. Tooling can help: Some linting applications can enforce a naming convention where classes start with an upper-case letter and functions start with a lower-case letter.

charmed functions

Consider:

const likesToDrink = (whom) => {
  switch (whom) {
    case 'Bob':
      return 'Ristretto';
    case 'Carol':
      return 'Cappuccino';
    case 'Ted':
      return 'Allongé';
    case 'Alice':
      return 'Cappuccino';
  }
}

likesToDrink('Alice')
  //=> 'Cappuccino'

likesToDrink('Peter')
  //=> undefined;

That’s a pretty straightforward function that implements a mapping from Bob, Carol, Ted, and Alice to the drinks ‘Ristretto’, ‘Cappuccino’, and ‘Allongé’. The mapping is encoded implicitly in the code’s switch statement.

We can use it in combination with other functions. For example, we can find out if the first letter of what someone likes is “c:”

const startsWithC = (something) => !!something.match(/^c/i)

startsWithC(likesToDrink('Alice'))
  //=> true

const likesSomethingStartingWithC =
  compose(startsWithC, likesToDrink);

likesSomethingStartingWithC('Ted')
  //=> false

So far, that’s good, clean blue function work. But there’s yet another kind of “function call.” If you are a mathematician, this is a mapping too:

const personToDrink = {
  Bob: 'Ristretto',
  Carol: 'Cappuccino',
  Ted: 'Allongé',
  Alice: 'Cappuccino'
}

personToDrink['Alice']
  //=> 'Cappuccino'

personToDrink['Ted']
  //=> 'Allongé'

personToDrink also maps the names ‘Bob’, ‘Carol’, ‘Ted’, and ‘Alice’ to the drinks ‘Ristretto’, ‘Cappuccino’, and ‘Allongé’, just like likesToDrink. But even though it does the same thing as a function, we can’t use it as a function:

const personMapsToSomethingStartingWithC =
  compose(startsWithC, personToDrink);

personMapsToSomethingStartingWithC('Ted')
  //=> undefined is not a function (evaluating 'b.call(this, x)')

As you can see, [ and ] are a little like ( and ), because we can pass Alice to personToDrink and get back Cappuccino. But they are just different enough that we can’t write personToDrink(...). Objects (as well as ES-6 maps and sets) are “charmed functions.”

And you need a different piece of code to go with them. We’d need to write things like this:

const composeblueWithCharm =
  (bluefunction, charmedfunction) =>
    (arg) =>
      bluefunction(charmedfunction[arg]);

const composeCharmWithblue =
  (charmedfunction, bluefunction) =>
    (arg) =>
      charmedfunction[bluefunction(arg)]

// ...

That would get really old, really fast.

adapting to handle red and charmed functions

We can work our way around some of these cross-colour and charm issues by writing adaptors, wrappers that turn red and charmed functions into blue functions. As we saw above, a “factory function” is a function that is called in the normal way, and returns a freshly created object.

If we wanted to create a CircleFactory, we could use Object.create as we saw above. We could also wrap new Circle(...) in a function:

class Circle {
  constructor (radius) {
    this.radius = radius;
  }
  circumference () {
    return Math.PI * 2 * this.radius;
  }
  scaleBy (factor) {
    return new Circle(factor * this.radius);
  }
}

const CircleFactory = (radius) =>
  new Circle(radius);

CircleFactory(2).scaleBy(3).circumference()
  //=> 37.69911184307752

With some argument jiggery-pokery, we could abstract Circle from CircleFactory and make a factory for making factories, a FactoryFactory:

const FactoryFactory = (clazz) =>
  (...args) =>
    new clazz(...args);

const CircleFactory = FactoryFactory(Circle);

CircleFactory(5).circumference()
  //=> 31.4159265

FactoryFactory turns any red class into a blue function. So we can use it anywhere we like:

[1, 2, 3, 4, 5].map(FactoryFactory(Circle))
  //=>
    [{"radius":1},{"radius":2},{"radius":3},{"radius":4},{"radius":5}]

Sadly, we still have to remember that Circle is a class and be sure to wrap it in FactoryFactory when we need to use it as a function, but that does work.
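
Once a class has been wrapped this way, our existing decorators apply to it like any other function. A sketch combining FactoryFactory with the green requiresFinite from earlier (FiniteCircleFactory and outcome are names chosen for this example):

```javascript
class Circle {
  constructor (radius) {
    this.radius = radius;
  }
  circumference () {
    return Math.PI * 2 * this.radius;
  }
}

const FactoryFactory = (clazz) =>
  (...args) =>
    new clazz(...args);

const requiresFinite = (fn) =>
  function (n) {
    if (Number.isFinite(n)){
      return fn.call(this, n);
    }
    else throw "Bad Wolf";
  }

// Decorate the blue factory, not the red class
const FiniteCircleFactory = requiresFinite(FactoryFactory(Circle));

FiniteCircleFactory(2).circumference();
  //=> 12.566370614359172

let outcome;
try {
  FiniteCircleFactory(null);
  outcome = 'no error';
}
catch (error) {
  outcome = error;
}

outcome;
  //=> 'Bad Wolf'
```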

We can do a similar thing with our “charmed” maps (and arrays, for that matter). Here’s Dictionary, a function that turns objects and arrays (our “charmed” functions) into ordinary (blue) functions:

const Dictionary = (data) => (key) => data[key];

const personToDrink = {
  Bob: 'Ristretto',
  Carol: 'Cappuccino',
  Ted: 'Allongé',
  Alice: 'Cappuccino'
}

['Bob', 'Ted', 'Carol', 'Alice'].map(Dictionary(personToDrink))
  //=> ["Ristretto","Allongé","Cappuccino","Cappuccino"]

Dictionary makes it easier for us to use all of the same tools for combining and manipulating functions on arrays and objects that we do with functions.
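
For instance, the composition that failed earlier with “undefined is not a function” works once the object is wrapped. A sketch reusing the green compose and startsWithC from above:

```javascript
const Dictionary = (data) => (key) => data[key];

const compose = (a, b) =>
  function (x) {
    return a.call(this, b.call(this, x));
  }

const startsWithC = (something) => !!something.match(/^c/i);

const personToDrink = {
  Bob: 'Ristretto',
  Carol: 'Cappuccino',
  Ted: 'Allongé',
  Alice: 'Cappuccino'
};

// Wrapping the charmed object turns it into a blue function that compose can call
const personMapsToSomethingStartingWithC =
  compose(startsWithC, Dictionary(personToDrink));

personMapsToSomethingStartingWithC('Ted');
  //=> false

personMapsToSomethingStartingWithC('Alice');
  //=> true
```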

dictionaries as proxies

As David Nolen has pointed out, languages like Clojure have maps that can be called as functions automatically. This is superior to wrapping a map in a plain function, because the underlying map is still available to be iterated over and otherwise treated as a map. Once we wrap a map in a function, it becomes opaque, useless for anything except calling as a function.

If we wish, we can create a dictionary function that is a partial proxy for the underlying collection object. For example, here is an IterableDictionary that turns a collection into a function that is also iterable if its underlying data object is iterable:

const IterableDictionary = (data) => {
  const proxy = (key) => data[key];
  proxy[Symbol.iterator] = function* (...args) {
    yield* data[Symbol.iterator](...args);
  }
  return proxy;
}

const people = IterableDictionary(['Bob', 'Ted', 'Carol', 'Alice']);
const drinks = IterableDictionary(personToDrink);

for (let name of people) {
  console.log(`${name} prefers to drink ${drinks(name)}`)
}
  //=>
    Bob prefers to drink Ristretto
    Ted prefers to drink Allongé
    Carol prefers to drink Cappuccino
    Alice prefers to drink Cappuccino

This technique has limitations. For example, objects in JavaScript are not iterable by default. So we can’t write:

for (let [name, drink] of personToDrink) {
  console.log(`${name} prefers to drink ${drink}`)
}
  //=> undefined is not a function (evaluating 'personToDrink[Symbol.iterator]()')

We could write:

for (let [name, drink] of Object.entries(personToDrink)) {
  console.log(`${name} prefers to drink ${drink}`)
}
  //=>
    Bob prefers to drink Ristretto
    Carol prefers to drink Cappuccino
    Ted prefers to drink Allongé
    Alice prefers to drink Cappuccino

It would be an enormous hack to make Object.entries(IterableDictionary(personToDrink)) work. While we’re at it, how would we make .length work? Functions implement .length as the number of arguments they accept. Arrays implement it as the number of entries they hold. If we wrap an array in a dictionary, what is its .length?

Proxying collections, meaning “creating an object that behaves like the collection,” works for specific and limited contexts, but it is enormously fragile to attempt to make a universal proxy that also acts as a function.

summary

JavaScript’s elegance comes from having a simple thing, functions, that can be combined in many flexible ways. Exceptions to the ways functions combine, like the new keyword, handling this, and [...], make combining awkward, but we can work around that by writing adaptors to convert these exceptions to regular function calls.

p.s. For bonus credit, write adaptors for ECMAScript’s Map and Set collections.
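
One possible starting point, offered as a sketch rather than the definitive answer: treat a Map as a function from keys to values, and a Set as a function from values to booleans. The names MapDictionary and SetMembership are invented here.

```javascript
// A Map becomes a function from key to value
const MapDictionary = (map) => (key) => map.get(key);

// A Set becomes a predicate: "is this value a member?"
const SetMembership = (set) => (value) => set.has(value);

const drinks = new Map([
  ['Bob', 'Ristretto'],
  ['Ted', 'Allongé']
]);

const regulars = new Set(['Bob', 'Carol']);

const orders = ['Bob', 'Ted'].map(MapDictionary(drinks));

orders;
  //=> ['Ristretto', 'Allongé']

const knownFaces = ['Bob', 'Ted', 'Carol'].filter(SetMembership(regulars));

knownFaces;
  //=> ['Bob', 'Carol']
```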

p.p.s. Some of this material was originally published in Reusable Abstractions in CoffeeScript (2012). If you’re interested in Ruby, Paul Mucur wrote a great post about Data Structures as Functions.



This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition. The extracts so far:

  1. Bad for programming languages, of course. French is a lovely human language. 

  2. Bob Nystrom introduced this excellent metaphor in What colour is your function? 

https://raganwald.com/2015/03/12/symmetry
(unlikely to be) The Last Word on Interviewing for a JavaScript Job

These are my comments on Interviewing for a JavaScript Job. The story concerns a job interview, where the interviewer (“Christine”) asks the candidate (known as “The Carpenter”) to whiteboard JavaScript code solving this problem:

Consider a finite checkerboard of unknown size. On each square, we randomly place an arrow pointing to one of its four sides. A chequer is placed randomly on the checkerboard. Each move consists of moving the chequer one square in the direction of the arrow in the square it occupies. If the arrow should cause the chequer to move off the edge of the board, the game halts.

The problem is this: The game board is hidden from us. A player moves the chequer, following the rules. As the player moves the chequer, they call out the direction of movement, e.g. “↑, →, ↑, ↓, ↑, →…” Write an algorithm that will determine whether the game halts, strictly from the called out directions, in finite time and space.

Meanwhile, the Carpenter had been coached by a headhunter (“Bob Plissken”) that the company likes to ask both this question and one about detecting cycles in a graph. The Carpenter tries to convert the problem into a graph problem, but Christine fails him out of the interview without even giving him a chance to test and polish his first draft.

Let’s start with the technical bits, because that’s what many commenters fixate upon.

flaws in the solution given

The Carpenter’s solution is not correct. This is partly deliberate, partly accidental. When I wrote the problem, I deliberately inserted a rather obvious flaw. Here’s his complete solution:

const Game = (size = 8) => {
  
  // initialize the board
  const board = [];
  for (let i = 0; i < size; ++i) {
    board[i] = [];
    for (let j = 0; j < size; ++j) {
      board[i][j] = '←→↓↑'[Math.floor(Math.random() * 4)];
    }
  }
  
  // initialize the position
  let initialPosition = [
    2 + Math.floor(Math.random() * (size - 4)), 
    2 + Math.floor(Math.random() * (size - 4))
  ];
  
  return ({
    [Symbol.iterator]: function* () {
      let [x, y] = initialPosition;
  
      const MOVE = {
        "←": ([x, y]) => [x - 1, y],
        "→": ([x, y]) => [x + 1, y],
        "↓": ([x, y]) => [x, y - 1],
        "↑": ([x, y]) => [x, y + 1] 
      };
      
      while (x >= 0 && y >=0 && x < size && y < size) {
        const arrow = board[x][y];
        
        yield arrow;
        [x, y] = MOVE[arrow]([x, y]);
      }
    }
  });
};

const statefulMapIterableWith = (fn, seed, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let value,
          state = seed;
      
      for (let element of iterable) {
        [state, value] = fn(state, element);
        yield value;
      }
    }
  });

const positionsOf = (game) =>
  statefulMapIterableWith(
    (position, direction) => {
      const MOVE = {
        "←": ([x, y]) => [x - 1, y],
        "→": ([x, y]) => [x + 1, y],
        "↓": ([x, y]) => [x, y - 1],
        "↑": ([x, y]) => [x, y + 1] 
      };
      const [x, y] = position = MOVE[direction](position);
      
      return [position, `x: ${x}, y: ${y}`];
    },
    [0, 0],
    game);

const tortoiseAndHare = (iterable) => {
  const hare = iterable[Symbol.iterator]();
  let hareResult = (hare.next(), hare.next());
  
  for (let tortoiseValue of iterable) {
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
  }
  return false;
};

const terminates = (game) =>
  tortoiseAndHare(positionsOf(game))

The obvious flaw is that tortoiseAndHare reports true when there is a cycle, while the function terminates implies that true would mean the game’s moves have no cycle. IMO, this is an error best solved with naming. The correct function would be:

// implements Tortoise and Hare cycle
// detection algorithm.
const hasCycle = (iterable) => {
  const hare = iterable[Symbol.iterator]();
  let hareResult = (hare.next(), hare.next());
  
  for (let tortoiseValue of iterable) {
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
  }
  return false;
};

const terminates = (game) =>
  !hasCycle(positionsOf(game))

I left that flaw in because I wanted to create the dynamic where Christine dislikes the code for other reasons, but fails him for getting one character wrong.
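
A couple of quick checks would have exposed the inverted result. The sketch below is an illustration under stated assumptions rather than anything from the story: instead of a real game, it feeds hasCycle two hand-made sequences of position strings (halting, cycling, and wrongTerminates are names invented here):

```javascript
// hasCycle as given above, repeated so the example stands alone.
const hasCycle = (iterable) => {
  const hare = iterable[Symbol.iterator]();
  let hareResult = (hare.next(), hare.next());

  for (let tortoiseValue of iterable) {
    hareResult = hare.next();
    if (hareResult.done) return false;
    if (tortoiseValue === hareResult.value) return true;

    hareResult = hare.next();
    if (hareResult.done) return false;
    if (tortoiseValue === hareResult.value) return true;
  }
  return false;
};

// Hypothetical stand-ins for positionsOf(game).
const halting = ['x: 1, y: 0', 'x: 2, y: 0'];
const cycling = {
  [Symbol.iterator]: function* () {
    while (true) {
      yield 'x: 0, y: 1';
      yield 'x: 0, y: 0';
    }
  }
};

const wrongTerminates = (positions) => hasCycle(positions);
const terminates = (positions) => !hasCycle(positions);

const wrongOnHalting = wrongTerminates(halting); //=> false, but the game halts
const rightOnHalting = terminates(halting);      //=> true
const rightOnCycling = terminates(cycling);      //=> false
```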

There is another, more subtle “flaw,” namely that the Carpenter treats the game as an ordered collection, but the verbal description of the problem presents the directions as a stream. Meaning, you should not be able to create two independent iterators over the elements. This is definitely the fictitious Carpenter’s fault.
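
The difference can be seen directly in a small sketch. The generator callOut below is an invented stand-in for the player calling out moves; it is not part of the Carpenter's code:

```javascript
const calls = ["↑", "→", "↑", "↓"];

// An array is an ordered collection: every request for an iterator
// starts again from the first element.
const first = calls[Symbol.iterator]();
const second = calls[Symbol.iterator]();

const fromFirst = first.next().value;   //=> "↑"
const fromSecond = second.next().value; //=> "↑"

// A generator object behaves like a stream: asking for its iterator
// returns the generator itself, so there is only one pass.
function* callOut () {
  yield* calls;
}

const stream = callOut();
const tortoise = stream[Symbol.iterator]();
const hare = stream[Symbol.iterator]();

const sameIterator = tortoise === hare;      //=> true
const tortoiseHears = tortoise.next().value; //=> "↑"
const hareHears = hare.next().value;         //=> "→"
```

With a stream, the tortoise and the hare end up sharing one iterator and stealing elements from each other, which is why the two-iterator approach depends on the game being an ordered collection.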

There is at least one more flaw in the code as presented in the post, but I can say outright that all other flaws are my fault as an imperfect author, not the fictitious Carpenter’s fault. FWIW, here is how I could clean up the Carpenter’s solution, with a little refactoring to make it easier to test:

const MOVE = {
  "←": ([x, y]) => [x - 1, y],
  "→": ([x, y]) => [x + 1, y],
  "↓": ([x, y]) => [x, y + 1],
  "↑": ([x, y]) => [x, y - 1] 
};

const Board = (size = 8) => {
  
  // initialize the board
  const board = [];
  for (let i = 0; i < size; ++i) {
    board[i] = [];
    for (let j = 0; j < size; ++j) {
      board[i][j] = '←→↓↑'[Math.floor(Math.random() * 4)];
    }
  }
  
  // initialize the position
  const position = [
    Math.floor(Math.random() * size), 
    Math.floor(Math.random() * size)
  ];
  
  return {board, position};
};

const Game = ({board, position}) => {
  
  const size = board[0].length;
  
  return ({
    [Symbol.iterator]: function* () {
      let [x, y] = position;
      
      while (x >= 0 && y >=0 && x < size && y < size) {
        const direction = board[y][x];
        
        yield direction;
        [x, y] = MOVE[direction]([x, y]);
      }
    }
  });
};

const statefulMapIterableWith = (fn, seed, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let value,
          state = seed;
      
      for (let element of iterable) {
        [state, value] = fn(state, element);
        yield value;
      }
    }
  });

const positionsOf = (game) =>
  statefulMapIterableWith(
    (position, direction) => {
      const [x, y] = MOVE[direction](position);

      return [[x, y], `x: ${x}, y: ${y}`];
    },
    [0, 0],
    game);

// implements Tortoise and Hare cycle
// detection algorithm.
const hasCycle = (orderedCollection) => {
  const hare = orderedCollection[Symbol.iterator]();
  let hareResult = (hare.next(), hare.next());
  
  for (let tortoiseValue of orderedCollection) {
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
    
    hareResult = hare.next();
    
    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
  }
  return false;
};

const terminates = (game) =>
  !hasCycle(positionsOf(game))
  
const test = [
  ["↓","←","↑","→"],
  ["↓","→","↓","↓"],
  ["↓","→","→","←"],
  ["↑","→","←","↑"]
];

terminates(Game({board: test, position: [0, 0]}))
  //=> false
terminates(Game({board: test, position: [3, 0]}))
  //=> true
terminates(Game({board: test, position: [0, 3]}))
  //=> false
terminates(Game({board: test, position: [3, 3]}))
  //=> false

Some people would say that there were errors precisely because it’s a longer bit of code, and that is correct. But I wouldn’t judge that in a vacuum. OOP code is often more convoluted than simple procedural code. Is it unnecessary AbstractFacadefactoryArchitectureAstronatics? Or is it separating concerns in a way that makes the code easier to understand and maintain? Sometimes you have to have a conversation to decide.

ordered collections and streams

As given in the description, the list of moves is a stream, not an ordered collection. Therefore, this solution sort-of works given the template code, but does not work given the requirements. Udik on Hacker News was the first person to point this out.

To reiterate:

The game board is hidden from us. A player moves the chequer, following the rules. As the player moves the chequer, they call out the direction of movement, e.g. “↑, →, ↑, ↓, ↑, →…” Write an algorithm that will determine whether the game halts, strictly from the called out directions, in finite time and space.

I was trying to give the impression that the Carpenter was unnecessarily force-fitting the constant-space cycle detection algorithm to the chequerboard game. In the story, he tried to joke about that:

There’s an old joke that a mathematician is someone who will take a five-minute problem, then spend an hour proving it is equivalent to another problem they have already solved. I approached this question in that spirit.

Many people commented that he was trying to be a show-off, but I’d like to point out that Plissken did tell him that the company liked to ask about cycle detection, so the Carpenter was simply trying to show Christine something he thought she wanted to see. And he was trying to imply that he could approach the problem in several different ways, but he chose to approach it in this way.

The simplest solution to the problem as given is to keep a set of positions that have already been visited. That takes finite space, and can be written either entirely within the original template, or can be bolted onto the iterable answer:

const repeatsItself = (orderedCollection) => {
  const visited = new Set();
  
  for (let element of orderedCollection) {
    if (visited.has(element)) {
      return true;
    }
    visited.add(element);
  }
  return false;
};

const terminates = (game) =>
  !repeatsItself(positionsOf(game))

This is the answer Christine was looking for. A brilliant answer that takes constant space was suggested by alisey on Hacker News: Track the rectangle representing the maximum distance travelled from the start. If the number of steps exceeds the height times width, you are cycling.
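
The suggestion is only described in prose, so the following is one possible reading of it, not alisey's code. The name terminatesInConstantSpace is invented here, and the sketch assumes positions are tracked relative to the starting square, as positionsOf does. The reasoning is a pigeonhole argument: the chequer cannot enter more distinct squares than fit in the bounding rectangle of everywhere it has been, so once the count of squares entered exceeds that area, some square has been revisited and the game cycles.

```javascript
const MOVE = {
  "←": ([x, y]) => [x - 1, y],
  "→": ([x, y]) => [x + 1, y],
  "↓": ([x, y]) => [x, y + 1],
  "↑": ([x, y]) => [x, y - 1]
};

// Makes a single pass over the directions, so it also works on a stream.
const terminatesInConstantSpace = (directions) => {
  let [x, y] = [0, 0];
  let [minX, maxX, minY, maxY] = [0, 0, 0, 0];
  let squaresEntered = 1; // counts the starting square

  for (const direction of directions) {
    [x, y] = MOVE[direction]([x, y]);
    squaresEntered += 1;

    minX = Math.min(minX, x);
    maxX = Math.max(maxX, x);
    minY = Math.min(minY, y);
    maxY = Math.max(maxY, y);

    const area = (maxX - minX + 1) * (maxY - minY + 1);

    // More squares entered than the rectangle holds: one was revisited.
    if (squaresEntered > area) return false;
  }
  return true; // the directions ran out, so the game halted
};

const halting = ["→", "→", "↓"];
function* cycling () {
  while (true) {
    yield "↓";
    yield "↑";
  }
}

const haltingResult = terminatesInConstantSpace(halting);   //=> true
const cyclingResult = terminatesInConstantSpace(cycling()); //=> false
```

Only a handful of numbers are kept regardless of how long the game runs, at the cost of possibly waiting many more moves than the Set version before a cycle is reported.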

thoughts on interviews

Now that we have gotten the technical aspects out of the way, here are my candid thoughts.

First, I intended for all three participants to be selfish actors, each trying to play the game by themselves.

  • Bob is trying to jam the Carpenter into the job, and coaches him what to expect even though the Carpenter is an experienced programmer.
  • Christine has a preconceived answer in mind and presents what I think is a very poor template for the solution to fill in. It even has information (like the size of the game board) that the solution is not supposed to take advantage of.
  • And the Carpenter is sincerely trying to tell them what he thinks they want to hear, without asking Christine if that is indeed what she wants him to do.

Leaving the recruiter out, an interview is actually supposed to be a coöperative game. Both Christine and the Carpenter lose if Thing would have been a better fit than FOG, and if the Carpenter would have been a better colleague than whomever Thing eventually hired.

As I wrote the story, neither Christine nor the Carpenter really talked to each other. Christine had her pet problem, and in her head, a kind of script for what she would ask the Carpenter once he wrote some obvious bits of code. She was nonplussed when he went “off-script,” and that’s really why she failed him. Meanwhile, the Carpenter had arrived with a solution in his head that would make him (in his mind) stand out from other applicants.

Christine could have and should have made it clear up front whether she just wanted him to FizzBuzz, i.e. to prove he can code anything in JavaScript. And once she realized he was going off into architectural abstractions (what some call “achieving escape velocity”), she could have and should have interrupted him and been more explicit about what she wanted him to demonstrate.

It is also clearly the Carpenter’s responsibility to ask Christine what she wants to see in a solution. He was trying to stand out from other applicants, and use what Plissken had told him to do better than expected. But he still could have and should have discussed this with Christine. It would have taken five seconds to say, “Well we could solve this with a Set, but it’s isomorphic to a problem of finding cycles in a graph, would you like me to solve it that way?”

the problem with programming problems

Quite frankly, whiteboard problems in interviews are minefields. When asked a programming question, an interviewer might want to see any of the following mutually exclusive things:

  1. Demonstrate that you can put together any old basic thing (“FizzBuzzing”).
  2. Demonstrate that you understand algorithm fundamentals like space and time requirements, mutability, state, and so forth.
  3. Demonstrate the kind of code you’d write in production for colleagues to understand and maintain.
  4. Demonstrate that you are current and familiar with the latest developments in your toolset, regardless of whether you are employing them in production.

You really can’t answer all of these in one code snippet. If the interviewer is just trying to quickly weed out the bullshitters, they don’t want you to factor the code and write tests for each piece. But if they want to see how you write code for production, they do. If they want to know that you’re keeping up to date, they might want to see you demonstrate your knowledge of some new language features.

And, if they want to see how you solve a day-to-day problem, they don’t want to see the solution use ES-6 transpilation or Mori persistent data structures. Unless they use those, in which case they do want to see them.

How do you know what to write?

The answer is, they have to tell you, or you have to ask. It is 100% grade-A bullshit to have a candidate write some code and then “fail” them because they tried to address one of those four objectives while the interviewer was trying to satisfy a different one.

You might as well come right out and say, “We’re looking for people who think the way we do, without being told what we’re thinking.” That is a so-called “cultural fit” test, but what it’s really testing is whether they have read the same blog posts and HN discussions about how to interpret the results.

snap judgments

It’s fairly easy to say, “No hire, because X,” or, “I wouldn’t want to work at Thing, because Y.” But really, we have almost no information to go on. In a detective story, we start with very little information, and we decide who the suspect is, and then we gather further information to confirm or refute our hypothesis. Did I say “in a detective story?” I meant to say, in science.

In an actual interview, we do not need to get up and walk out if asked to find out whether some hypothetical “game” terminates, nor do we need to sit and say nothing while a candidate writes out a long “solution.” We can ask questions. We can form a hypothesis, sure, but then we can confirm or refute it by asking questions.

Likewise, if our hypothesis is that the Carpenter is unable to write clear code, we don’t need to say “Fail,” we can simply ask: “How would you write this if you knew that it would be maintained by interns we pick up from the local university’s CS program?” Does he rewrite it? Or does he argue with us about how it’s the intern’s fault if they don’t know the semantics of an ECMAScript 2015 iterable?

Likewise, we can be charitable about Christine. Unless she is a full-time professional technical interviewer, she probably spends twenty hours programming for every one hour interviewing. If for whatever reason she seems to be clumsy or ask poor questions, why should we make assumptions about what she would be like as a fellow programming colleague?

And she is only one person in a company. Why should we make assumptions about the entire company based on one interaction with one person? We can and should answer her question, and make sure to ask some questions of our own about the kind of programming Thing does, the culture it has, and so forth. It is not necessary to make sweeping generalizations based on almost no information.

what are we testing?

As I noted, Christine is unlikely to be very good at the job of interviewing, because she doesn’t interview people for a living. And for all his experience, the Carpenter doesn’t interview for a living. This dynamic is the rule, not the exception. Most technical interviews are conducted by people who are inexperienced with technical interviews.

Therefore, I counsel being charitable: Assume that most people are good at their primary job, and happen to be less than amazing at this part-time necessity of interviewing or being interviewed.

The alternative is to hire people who have a lot of experience going on interviews. And that could be a sign they spend a lot of time being unemployed. Likewise, companies that are amazing at interviewing people might have to hire a lot of people because good people leave.

This last bit is obvious if you have spent any amount of time dating people you meet through a dating service or in nightclubs. The people who are “good at dating” are good at dating because they spend a lot of time dating instead of being in a relationship. Being good at dating doesn’t say anything about whether someone is good at being in a relationship.

the anti-patterns

The first anti-pattern in interviews is to go in with the objective of weeding out losers (whether we mean loser candidates or loser companies). As we all know, if you show me a metric, I’ll show you a game. If the objective is to take ten interviews and “weed out” nine losers, the most efficient way to play that game is to look for false negatives.

For example, the Carpenter got the terminates function wrong. I deliberately put that in. Everyone I know gets things like this backwards from time to time. But if we fail him outright for that, rather than asking, “Do you want to write some test cases?” we are playing a game on our own. If the metric was really, “Make sure no good person escapes without an offer,” then we would want to make sure that this wasn’t a simple transcription error, or the result of interview pressure.

That goes for rejecting a company because you don’t like one question they ask as well.

The second anti-pattern is more insidious. This is when we form a subjective opinion about someone, then look for an objective way to validate our hunch. So, we decide that the Carpenter’s solution is unnecessarily complicated. I happen to think that is the case for what Christine wanted, and I wrote the story. So you know we’re right about this.

But, if we then get a hunch that the Carpenter will always write complicated code, we’re now guessing. And if we think he’ll be a terrible programmer because of this, well, that is unfounded. But whatever, we have a hunch, nothing wrong with trying to confirm it. But what we sometimes do is go in with a swinging axe. We look over this code we don’t like… Hmmm… What about this… AHA!

  • uses arrow in one place and direction in another: Sloppy naming.
  • repeats MOVE: Doesn’t understand DRY.
  • excessive use of language features that interns may not know.
  • solution is not perfectly in accordance with stated requirement: FAIL. FAIL. FAIL.
  • gets terminates wrong: FAIL. FAIL. FAIL.

All of these observations are valid, but if we’re just gathering evidence to support our hypothesis, we’re not reasoning correctly. I am not making this kind of stuff up: industry spends a lot of money (and a lot of time in courtrooms) going over these things, and people can be insanely biased.

If we get a good immediate impression of someone, we hand-wave their faults. We say, “Well, it’s a live whiteboard test, it demonstrates the basic competency, we can sit down and debug together, that will be an excellent exercise to understand how they think.”

Whereas if we dislike them, we take one look and then look at the clock, thank them, and end the interview. If we like them, and they seem to be going down the wrong track, we coach them. If we dislike them, we check our email on our phone while they write a solution to the wrong problem.

This is very, very bad behaviour. It’s exactly what leads to certain cultural problems with respect to gender and race disparities. And it’s easy to do subconsciously. I can’t tell you what your company should do about it, but I know that on a personal level, it begins with noticing that we have this tendency to form a bias, and trying to be hyper-vigilant about it.

summary

It’s just a story, no big deal. But FWIW, I tried to make this about three reasonable humans, none of them awesome or particularly flawed. Each was trying, more-or-less, to do the right thing by themselves, but not trying very hard to really work with the other participants to ensure that the very best interview took place.


https://raganwald.com/2015/02/23/the-last-word-on-interviewing
Interviewing for a JavaScript Job

“The Carpenter” was a JavaScript programmer, well-known for a meticulous attention to detail and love for hand-crafted, exquisitely joined code. The Carpenter normally worked through personal referrals, but from time to time a recruiter would slip through his screen. One such recruiter was Bob Plissken. Bob was well-known in the Python community, but his clients often needed experience with other languages.

Plissken lined up a technical interview with a well-funded startup in San Francisco. The Carpenter arrived early for his meeting with “Thing Software,” and was shown to conference room 13. A few minutes later, he was joined by one of the company’s developers, Christine.

the problem

After some small talk, Christine explained that they liked to ask candidates to whiteboard some code. Despite his experience and industry longevity, the Carpenter did not mind being asked to demonstrate that he was, in fact, the person described on the resumé.

Many companies use white-boarding code as an excuse to have a technical conversation with a candidate, and The Carpenter felt that being asked to whiteboard code was an excuse to have a technical conversation with a future colleague. “Win, win” he thought to himself.

Chessboard

Christine intoned the question, as if by rote:

Consider a finite checkerboard of unknown size. On each square, we randomly place an arrow pointing to one of its four sides. A chequer is placed randomly on the checkerboard. Each move consists of moving the chequer one square in the direction of the arrow in the square it occupies. If the arrow should cause the chequer to move off the edge of the board, the game halts.

The problem is this: The game board is hidden from us. A player moves the chequer, following the rules. As the player moves the chequer, they call out the direction of movement, e.g. “↑, →, ↑, ↓, ↑, →…” Write an algorithm that will determine whether the game halts, strictly from the called out directions, in finite time and space.

“So,” The Carpenter asked, “I am to write an algorithm that takes a possibly infinite stream of…”

Christine interrupted. “To save time, we have written a template of the solution for you in ECMAScript 2015 notation. Fill in the blanks. Your code should not presume anything about the gameboard’s size or contents, only that it is given an arrow every time through the while loop. You may use babeljs.io, or ES6Fiddle to check your work.”

Christine quickly scribbled on the whiteboard:

const Game = (size = 8) => {

  // initialize the board
  const board = [];
  for (let i = 0; i < size; ++i) {
    board[i] = [];
    for (let j = 0; j < size; ++j) {
      board[i][j] = '←→↓↑'[Math.floor(Math.random() * 4)];
    }
  }

  // initialize the position
  let initialPosition = [
    2 + Math.floor(Math.random() * (size - 4)),
    2 + Math.floor(Math.random() * (size - 4))
  ];

  // ???
  let [x, y] = initialPosition;

  const MOVE = {
    "←": ([x, y]) => [x - 1, y],
    "→": ([x, y]) => [x + 1, y],
    "↓": ([x, y]) => [x, y - 1],
    "↑": ([x, y]) => [x, y + 1]
  };
  while (x >= 0 && y >=0 && x < size && y < size) {
    const arrow = board[x][y];

    // ???

    [x, y] = MOVE[arrow]([x, y]);
  }
  // ???
};

“What,” Christine asked, “Do you write in place of the three // ??? placeholders to determine whether the game halts?”

the carpenter’s solution

The Carpenter was not surprised at the problem. Bob Plissken was a crafty, almost reptilian recruiter that traded in information and secrets. Whenever Bob sent a candidate to a job interview, he debriefed them afterwards and got them to disclose what questions were asked in the interview. He then coached subsequent candidates to give polished answers to the company’s pet technical questions.

And just as companies often pick a problem that gives them broad latitude for discussing alternate approaches and determining the depth of a candidate’s experience, The Carpenter liked to sketch out solutions that provided an opportunity to judge the interviewer’s experience and provide an easy excuse to discuss the company’s approach to software design.

Bob had, in fact, warned The Carpenter that “Thing” liked to ask either or both of two questions: Determine how to detect a loop in a linked list, and determine whether the chequerboard game would halt. To save time, The Carpenter had prepared the same answer for both questions.

The Carpenter coughed softly, then began. “To begin with, I’ll transform a game into an iterable that generates arrows, using the ‘Starman’ notation for generators.”

“I will add just five lines of code to the Game function, and two of those are closing braces:”

  return ({
    [Symbol.iterator]: function* () {

And:

        yield arrow;

And:

    }
  });

“The finished function reads:”

const Game = (size = 8) => {

  // initialize the board
  const board = [];
  for (let i = 0; i < size; ++i) {
    board[i] = [];
    for (let j = 0; j < size; ++j) {
      board[i][j] = '←→↓↑'[Math.floor(Math.random() * 4)];
    }
  }

  // initialize the position
  let initialPosition = [
    2 + Math.floor(Math.random() * (size - 4)),
    2 + Math.floor(Math.random() * (size - 4))
  ];

  return ({
    [Symbol.iterator]: function* () {
      let [x, y] = initialPosition;

      const MOVE = {
        "←": ([x, y]) => [x - 1, y],
        "→": ([x, y]) => [x + 1, y],
        "↓": ([x, y]) => [x, y - 1],
        "↑": ([x, y]) => [x, y + 1]
      };

      while (x >= 0 && y >=0 && x < size && y < size) {
        const arrow = board[x][y];

        yield arrow;
        [x, y] = MOVE[arrow]([x, y]);
      }
    }
  });
};

“Now that we have an iterable, we can transform the iterable of arrows into an iterable of positions.” The Carpenter sketched quickly. “We’ll need some common utilities. You’ll find equivalents in a number of JavaScript libraries, but I’ll quote those given in JavaScript Allongé:”

“For starters, takeIterable transforms an iterable into one that yields at most a fixed number of elements. It’s handy for debugging. We’ll use it to check that our Game is working as an iterable:”

const takeIterable = (numberToTake, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let remainingElements = numberToTake;

      for (let element of iterable) {
        if (remainingElements-- <= 0) break;
        yield element;
      }
    }
  });

“This doesn’t actually end up in our solution, it’s just to check our work as we go along. And you can find it in libraries, it’s not something we need to reinvent whenever we work with iterables.”
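
The story does not show that check being run. As a rough sketch of what it might look like, here takeIterable is applied to an endless iterable named naturals, which is an invented stand-in for a Game that never halts:

```javascript
// takeIterable as given above, repeated so the example stands alone.
const takeIterable = (numberToTake, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let remainingElements = numberToTake;

      for (let element of iterable) {
        if (remainingElements-- <= 0) break;
        yield element;
      }
    }
  });

// Hypothetical endless iterable standing in for a game that cycles.
const naturals = {
  [Symbol.iterator]: function* () {
    let n = 0;
    while (true) yield n++;
  }
};

const firstThree = [...takeIterable(3, naturals)];
  //=> [0, 1, 2]
```

Spreading naturals directly would never finish; wrapping it first makes it safe to inspect the first few elements.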

“But now to the business. We want to take the arrows and convert them to positions. For that, we’ll map the Game iterable to positions. A statefulMap is a lazy map that preserves state from iteration to iteration. That’s what we need, because we need to know the current position to map each move to the next position.”

“Again, this is a standard idiom we can obtain from libraries, we don’t reinvent the wheel. I’ll show it here for clarity:”

const statefulMapIterableWith = (fn, seed, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let value,
          state = seed;

      for (let element of iterable) {
        [state, value] = fn(state, element);
        yield value;
      }
    }
  });
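
To see the state being threaded from one element to the next, here is a small sketch that is not part of the Carpenter's solution. It uses statefulMapIterableWith to compute running totals; runningTotals is a name invented for the example:

```javascript
// statefulMapIterableWith as given above, repeated so the example stands alone.
const statefulMapIterableWith = (fn, seed, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let value,
          state = seed;

      for (let element of iterable) {
        [state, value] = fn(state, element);
        yield value;
      }
    }
  });

// fn returns a pair: the next state, and the value to yield.
const runningTotals = statefulMapIterableWith(
  (total, n) => [total + n, total + n],
  0,
  [1, 2, 3, 4]
);

const totals = [...runningTotals];
  //=> [1, 3, 6, 10]
```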

“Armed with this, it’s straightforward to map an iterable of directions to an iterable of strings representing positions:”

const positionsOf = (game) =>
  statefulMapIterableWith(
    (position, direction) => {
      const MOVE = {
        "←": ([x, y]) => [x - 1, y],
        "→": ([x, y]) => [x + 1, y],
        "↓": ([x, y]) => [x, y - 1],
        "↑": ([x, y]) => [x, y + 1]
      };
      const [x, y] = position = MOVE[direction](position);

      return [position, `x: ${x}, y: ${y}`];
    },
    [0, 0],
    game);
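
As a sketch of what this produces, positionsOf can be fed a fixed array of directions in place of a real game (an assumption made for illustration; any iterable of arrows will do). Note that the positions are relative to the [0, 0] seed rather than to the chequer's real square, which is all that is needed to notice a repeat:

```javascript
// Both functions as given above, repeated so the example stands alone.
const statefulMapIterableWith = (fn, seed, iterable) =>
  ({
    [Symbol.iterator]: function* () {
      let value,
          state = seed;

      for (let element of iterable) {
        [state, value] = fn(state, element);
        yield value;
      }
    }
  });

const positionsOf = (game) =>
  statefulMapIterableWith(
    (position, direction) => {
      const MOVE = {
        "←": ([x, y]) => [x - 1, y],
        "→": ([x, y]) => [x + 1, y],
        "↓": ([x, y]) => [x, y - 1],
        "↑": ([x, y]) => [x, y + 1]
      };
      const [x, y] = position = MOVE[direction](position);

      return [position, `x: ${x}, y: ${y}`];
    },
    [0, 0],
    game);

const positions = [...positionsOf(["→", "→", "↑"])];
  //=> ["x: 1, y: 0", "x: 2, y: 0", "x: 2, y: 1"]
```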

The Carpenter reflected. “Having turned our game loop into an iterable, we can now see that our problem of whether the game terminates is isomorphic to the problem of detecting whether the positions given ever repeat themselves: If the chequer ever returns to a position it has previously visited, it will cycle endlessly.”

“We could draw positions as nodes in a graph, connected by arcs representing the arrows. Detecting whether the game terminates is equivalent to detecting whether the graph contains a cycle.”

Cycle Detection

“There’s an old joke that a mathematician is someone who will take a five-minute problem, then spend an hour proving it is equivalent to another problem they have already solved. I approached this question in that spirit. Now that we have created an iterable of values that can be compared with ===, I can show you this function:”

// implements Tortoise and Hare cycle
// detection algorithm.
const hasCycle = (orderedCollection) => {
  const hare = orderedCollection[Symbol.iterator]();
  let hareResult = (hare.next(), hare.next());

  for (let tortoiseValue of orderedCollection) {

    hareResult = hare.next();

    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }

    hareResult = hare.next();

    if (hareResult.done) {
      return false;
    }
    if (tortoiseValue === hareResult.value) {
      return true;
    }
  }
  return false;
};

“A long time ago,” The Carpenter explained, “Someone asked me a question in an interview. I have never forgotten the question, or the general form of the solution. The question was, Given a linked list, detect whether it contains a cycle. Use constant space.

“This is, of course, the most common solution, it is Floyd’s cycle-finding algorithm, although there is some academic dispute as to whether Robert Floyd actually discovered it or was misattributed by Knuth.”
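
For comparison, here is a minimal sketch of the same tortoise-and-hare idea applied to the linked-list form of the question. The node shape ({ value, next }, with null marking the end) and the name listHasCycle are assumptions made for this example and do not appear in the story:

```javascript
// Nodes are plain objects of the form { value, next }; the last node's
// `next` is null. This shape is an assumption made for the example.
const listHasCycle = (head) => {
  let tortoise = head;
  let hare = head;

  while (hare !== null && hare.next !== null) {
    tortoise = tortoise.next;   // one step
    hare = hare.next.next;      // two steps

    if (tortoise === hare) return true;
  }
  return false; // the hare fell off the end of the list
};

const c = { value: "c", next: null };
const b = { value: "b", next: c };
const a = { value: "a", next: b };

const straight = listHasCycle(a); //=> false

c.next = a; // tie the tail back to the head
const looped = listHasCycle(a);   //=> true
```

Only two references are held no matter how long the list is, which is what satisfies the constant-space requirement.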

“Thus, the solution to the game problem is:”

const terminates = (game) =>
  !hasCycle(positionsOf(game))

“This solution makes use of iterables and a single utility function, statefulMapIterableWith. It also cleanly separates the mechanics of the game from the algorithm for detecting cycles in a graph.”

the aftermath

The Carpenter sat down and waited. This type of solution provided an excellent opportunity to explore lazy versus eager evaluation, the performance of iterators versus native iteration, single responsibility design, and many other rich topics.

The Carpenter was confident that although nobody would write this exact code in production, prospective employers would also recognize that nobody would try to detect whether a chequer game terminates in production, either. It’s all just a pretext for kicking off an interesting conversation, right?

Time

Christine looked at the solution on the board, frowned, and glanced at the clock on the wall. “Well, where has the time gone?

“We at the Thing Software company are very grateful you made some time to visit with us, but alas, that is all the time we have today. If we wish to talk to you further, we’ll be in touch.”

The Carpenter never did hear back from them, but the next day there was an email containing a generous contract from Friends of Ghosts (“FOG”), a codename for a stealth startup doing interesting work, and the Thing interview was forgotten.

Some time later, The Carpenter ran into Bob Plissken at a local technology meet-up. “John! What happened at Thing?” Bob wanted to know, “I asked them what they thought of you, and all they would say was, Writes unreadable code. I thought it was a lock! I thought you’d finally make your escape from New York.”

The Carpenter smiled. “I forgot about them, it’s been a while. So, do They Live?”



p.s. (unlikely to be) The Last Word on Interviewing for a JavaScript Job

p.p.s. The Carpenter probably cribbed the solution from The “Drunken Walk” Programming Problem, and Solving the “Drunken Walk” problem with iterators.

https://raganwald.com/2015/02/21/interviewing-for-a-front-end-job
Lazy Iterables in JavaScript

Coffee Labels at the Saltspring Coffee Processing Facility

Nota Bene: All of the examples in this essay require ECMAScript-6, and were tested with the Babel transpiler. You can integrate Babel into your toolchain, or copy and paste the examples into its REPL in your browser.


Many objects in JavaScript can model collections of things. A collection is like a box containing stuff. Sometimes you just want to move the box around. But sometimes you want to open it up and do things with its contents.

Things like “put a label on every bag of coffee in this box,” or “open the box, take out the bags of decaf, and make a new box with just the decaf,” or “go through the bags in this box, and take out the first one marked ‘Espresso’ that contains at least 454 grams of beans.”

All of these actions involve going through the contents one by one. Acting on the elements of a collection one at a time is called iterating over the contents, and JavaScript has a standard way to iterate over the contents of collections.

a look back at functional iterators

When discussing functions, we looked at the benefits of writing Functional Iterators. We can do the same thing for objects. Here’s a stack that has its own functional iterator method:

const Stack1 = () =>
  ({
    array:[],
    index: -1,
    push: function (value) {
      return this.array[this.index += 1] = value;
    },
    pop: function () {
      const value = this.array[this.index];
    
      this.array[this.index] = undefined;
      if (this.index >= 0) { 
        this.index -= 1 
      }
      return value
    },
    isEmpty: function () {
      return this.index < 0
    },
    iterator: function () {
      let iterationIndex = this.index;
      
      return () => {
        if (iterationIndex > this.index) {
          iterationIndex = this.index;
        }
        if (iterationIndex < 0) {
          return {done: true};
        }
        else {
          return {done: false, value: this.array[iterationIndex--]}
        }
      }
    }
  });

const stack = Stack1();

stack.push("Greetings");
stack.push("to");
stack.push("you!")

const iter = stack.iterator();
iter().value
  //=> "you!"
iter().value
  //=> "to"

The way we’ve written .iterator as a method, each object knows how to return an iterator for itself.

Note that the .iterator() method is implemented with the function keyword, so when we invoke it with stack.iterator(), JavaScript sets this to the value of stack. But what about the function .iterator() returns? It is defined with a fat arrow () => { ... }. What is the value of this within that function?

Since JavaScript doesn’t bind this within a fat arrow function, we follow the same rules of variable scoping as any other variable name: We check in the environment enclosing the function. Although the .iterator() method has returned, its environment is the one that encloses our () => { ... } function, and that’s where this is bound to the value of stack.

Therefore, the iterator function returned by the .iterator() method has this bound to the stack object, even though we call it with iter().
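To see that rule in isolation, here is a small sketch. The names counter, makeIncrementer, and makeThisChecker are made up for illustration and appear nowhere else in this essay:

```javascript
const counter = {
  count: 0,
  // A method written with `function`: when invoked as counter.makeIncrementer(),
  // `this` is bound to `counter`.
  makeIncrementer: function () {
    // The fat arrow has no `this` of its own, so `this` is looked up in the
    // enclosing environment, where it is bound to `counter`.
    return () => {
      this.count += 1;
      return this.count;
    };
  },
  makeThisChecker: function () {
    // A `function` gets its own `this`, determined by how it is called.
    return function () {
      return this === counter;
    };
  }
};

const increment = counter.makeIncrementer();

increment();
  //=> 1
increment();
  //=> 2
counter.count;
  //=> 2

const isThisCounter = counter.makeThisChecker();

// Called as a plain function, so `this` is not `counter`.
isThisCounter();
  //=> false
```

The arrow function keeps working on counter no matter how it is called, which is exactly what we want from the function our .iterator() method returns.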

And here’s a sum function implemented as a fold over a functional iterator:

const iteratorSum = (iterator) => {
  let eachIteration,
      sum = 0;
  
  while ((eachIteration = iterator(), !eachIteration.done)) {
    sum += eachIteration.value;
  }
  return sum
}

We can use it with our stack:

const stack = Stack1();

stack.push(1);
stack.push(2);
stack.push(3);

iteratorSum(stack.iterator())
  //=> 6

We could save a step and write collectionSum, a function that folds over any object, provided that the object implements an .iterator method:

const collectionSum = (collection) => {
  const iterator = collection.iterator();
  
  let eachIteration,
      sum = 0;
  
  while ((eachIteration = iterator(), !eachIteration.done)) {
    sum += eachIteration.value;
  }
  return sum
}

collectionSum(stack)
  //=> 6

If we write a program with the presumption that “everything is an object,” we can write maps, folds, and filters that work on objects. We just ask the object for an iterator, and work on the iterator. Our functions don’t need to know anything about how an object implements iteration, and we get the benefit of lazily traversing our objects.

This is a good thing.
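For example, here is a sketch of a lazy map over a functional iterator. The helpers arrayIterator and mapIteratorWith are hypothetical names introduced just for this illustration:

```javascript
// A functional iterator over the elements of an array.
const arrayIterator = (array) => {
  let index = 0;

  return () =>
    index < array.length
      ? {done: false, value: array[index++]}
      : {done: true};
};

// Returns a new functional iterator. `fn` is only applied when
// somebody asks for the next value, never in advance.
const mapIteratorWith = (fn, iterator) =>
  () => {
    const {done, value} = iterator();

    return done ? {done: true} : {done: false, value: fn(value)};
  };

const squares = mapIteratorWith((x) => x * x, arrayIterator([1, 2, 3]));

squares().value;
  //=> 1
squares().value;
  //=> 4
```

Note that mapIteratorWith knows nothing about arrays. It would work equally well with the iterator returned by a stack's .iterator() method.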

iterator objects

Iteration for functions and objects has been around for many, many decades. For simple linear collections like arrays, linked lists, stacks, and queues, functional iterators are the simplest and easiest way to implement iterators.

In programs involving large collections of objects, it can be handy to implement iterators as objects, rather than functions. The mechanics of iterating can then be factored using the same tools that are used to factor the mechanics of all other objects in the system.

Fortunately, an iterator object is almost as simple as an iterator function. Instead of having a function that you call to get the next element, you have an object with a .next() method.

Like this:

const Stack2 = () =>
  ({
    array: [],
    index: -1,
    push: function (value) {
      return this.array[this.index += 1] = value;
    },
    pop: function () {
      const value = this.array[this.index];
    
      this.array[this.index] = undefined;
      if (this.index >= 0) { 
        this.index -= 1 
      }
      return value
    },
    isEmpty: function () {
      return this.index < 0
    },
    iterator: function () {
      let iterationIndex = this.index;
      
      return {
        next: () => {
          if (iterationIndex > this.index) {
            iterationIndex = this.index;
          }
          if (iterationIndex < 0) {
            return {done: true};
          }
          else {
            return {done: false, value: this.array[iterationIndex--]}
          }
        }
      }
    }
  });

const stack = Stack2();

stack.push(2000);
stack.push(10);
stack.push(5)

const collectionSum = (collection) => {
  const iterator = collection.iterator();
  
  let eachIteration,
      sum = 0;
  
  while ((eachIteration = iterator.next(), !eachIteration.done)) {
    sum += eachIteration.value;
  }
  return sum
}

collectionSum(stack)
  //=> 2015

Now our .iterator() method is returning an iterator object. When working with objects, we do things the object way. But having started by building functional iterators, we understand what is happening underneath the object’s scaffolding.

iterables

People have been writing iterators since JavaScript was first released in the late 1990s. Since there was no particular standard way to do it, people used all sorts of methods, and their methods returned all sorts of things: Objects with various interfaces, functional iterators, you name it.

So, when a standard way to write iterators was added to the JavaScript language, it didn’t make sense to use a method like .iterator() for it: That would conflict with existing code. Instead, the language encourages new code to be written with a different name for the method that a collection object uses to return its iterator.

To ensure that the method would not conflict with any existing code, JavaScript provides a symbol. Symbols are unique constants that are guaranteed not to conflict with existing strings. Symbols are a longstanding technique in programming going back to Lisp, where the GENSYM function generated… You guessed it… Symbols. [1]
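A quick sketch of what "unique" and "not conflicting with strings" mean in practice. The object legacy and the descriptions passed to Symbol are invented for the example:

```javascript
// Every call to Symbol() produces a brand new value,
// even when the descriptions are identical.
const a = Symbol('label');
const b = Symbol('label');

const sameSymbol = a === b;
  //=> false

// An object that already uses the string key "iterator" for something.
const legacy = { iterator: 'a pre-existing property' };

// A symbol key lives alongside the string key without disturbing it.
const key = Symbol('iterator');
legacy[key] = 'a new property';

legacy.iterator;
  //=> 'a pre-existing property'
legacy[key];
  //=> 'a new property'

// Symbol keys are also skipped by Object.keys and for...in.
const stringKeys = Object.keys(legacy);
  //=> ['iterator']
```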

The expression Symbol.iterator evaluates to a special symbol representing the name of the method that objects should use if they return an iterator object.

Our stack does, so instead of binding the existing iterator method to the name iterator, we bind it to the Symbol.iterator. We’ll do that using the [ ] syntax for using an expression as an object literal key:

const Stack3 = () =>
  ({
    array: [],
    index: -1,
    push: function (value) {
      return this.array[this.index += 1] = value;
    },
    pop: function () {
      const value = this.array[this.index];
    
      this.array[this.index] = undefined;
      if (this.index >= 0) { 
        this.index -= 1 
      }
      return value
    },
    isEmpty: function () {
      return this.index < 0
    },
    [Symbol.iterator]: function () {
      let iterationIndex = this.index;
      
      return {
        next: () => {
          if (iterationIndex > this.index) {
            iterationIndex = this.index;
          }
          if (iterationIndex < 0) {
            return {done: true};
          }
          else {
            return {done: false, value: this.array[iterationIndex--]}
          }
        }
      }
    }
  });

const stack = Stack3();

stack.push(2000);
stack.push(10);
stack.push(5)

const collectionSum = (collection) => {
  const iterator = collection[Symbol.iterator]();
  
  let eachIteration,
      sum = 0;
  
  while ((eachIteration = iterator.next(), !eachIteration.done)) {
    sum += eachIteration.value;
  }
  return sum
}

collectionSum(stack)
  //=> 2015

Using [Symbol.iterator] instead of .iterator seems like adding an extra moving part for nothing. Do we get anything in return?

Indeed we do. Behold the for...of loop:

const iterableSum = (iterable) => {
  let sum = 0;
  
  for (let num of iterable) {
    sum += num;
  }
  return sum
}

iterableSum(stack)
  //=> 2015

The for...of loop works directly with any object that is iterable, meaning it works with any object that has a Symbol.iterator method that returns an iterator object. Here’s another linked list, and this one is iterable:

const EMPTY = {
  isEmpty: () => true
};

const isEmpty = (node) => node === EMPTY;

const Pair1 = (first, rest = EMPTY) =>
  ({
    first,
    rest,
    isEmpty: () => false,
    [Symbol.iterator]: function () {
      let currentPair = this;
      
      return {
        next: () => {
          if (currentPair.isEmpty()) {
            return {done: true}
          }
          else {
            const value = currentPair.first;
            
            currentPair = currentPair.rest;
            return {done: false, value}
          }
        }
      }
    }
  });

const list = (...elements) => {
  const [first, ...rest] = elements;
  
  return elements.length === 0
    ? EMPTY
    : Pair1(first, list(...rest))
}

const someSquares = list(1, 4, 9, 16, 25);
    
iterableSum(someSquares)
  //=> 55

As we can see, we can use for...of with linked lists just as easily as with stacks. And there’s one more thing: You recall that the spread operator (...) can spread the elements of an array in an array literal or as parameters in a function invocation.

Now is the time to note that we can spread any iterable. So we can spread the elements of an iterable into an array literal:

['some squares', ...someSquares]
  //=> ["some squares", 1, 4, 9, 16, 25]

And we can also spread the elements of an iterable into parameters:

const firstAndSecondElement = (first, second) =>
  ({first, second})
  
firstAndSecondElement(...stack)
  //=> {"first":5,"second":10}

This can be extremely useful.

One caveat of spreading iterables: JavaScript creates an array out of the elements of the iterable. That might be very wasteful for extremely large collections. For example, if we spread a large collection just to find an element in the collection, it might have been wiser to iterate over the collection using its iterator directly.

And if we have an infinite collection, spreading is going to fail outright.
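Here is a sketch of the difference. The helper findIn and the iterable OneToAMillion are hypothetical, written only to count how many elements get requested:

```javascript
// Returns the first element satisfying `predicate`, or undefined if there
// is none. It stops iterating as soon as it has an answer.
const findIn = (predicate, iterable) => {
  for (const element of iterable) {
    if (predicate(element)) return element;
  }
};

// An iterable over 1..1,000,000 that counts how many elements
// have been requested from it.
let requested = 0;

const OneToAMillion = {
  [Symbol.iterator]: () => {
    let n = 0;

    return {
      next: () => {
        if (n >= 1000000) return {done: true};
        requested += 1;
        return {done: false, value: ++n};
      }
    };
  }
};

findIn((x) => x > 3, OneToAMillion);
  //=> 4
requested;
  //=> 4
```

Had we written [...OneToAMillion].find((x) => x > 3) instead, we would get the same answer, but only after a million elements had been requested and copied into a temporary array.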

iterables out to infinity

Iterables needn’t represent finite collections:

const Numbers = {
  [Symbol.iterator]: () => {
    let n = 0;
    
    return {
      next: () =>
        ({done: false, value: n++})
    }
  }
}

There are useful things we can do with iterables representing an infinite number of elements. Before we point out something we can do with them, let’s point out what we can’t do with them:

['all the numbers', ...Numbers]
  //=> infinite loop!
  
firstAndSecondElement(...Numbers)
  //=> infinite loop!

Attempting to spread an infinite iterable into an array is always going to fail.

We can look at useful things to do with both infinite and finite iterables. But first, let’s define some operations on iterables. Here’s mapIterableWith, it takes any iterable and returns an iterable representing a mapping over the original iterable:

const mapIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      return {
        next: () => {
          const {done, value} = iterator.next();
    
          return ({done, value: done ? undefined : fn(value)});
        }
      }
    }
  });

This illustrates the general pattern of working with iterables: An iterable is an object, representing a collection, with a [Symbol.iterator] method, that returns an iteration over the elements of a collection. The iteration over elements is an iterator. An iterator is also an object, but with a .next() method that is invoked repeatedly to obtain the elements in order.
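We can watch the two roles at work by driving an iteration by hand. This sketch uses a built-in array as the iterable. The variable names are arbitrary:

```javascript
const letters = ['a', 'b'];   // arrays are built-in iterables

// Ask the iterable for an iterator...
const iterator = letters[Symbol.iterator]();

// ...then invoke .next() repeatedly to obtain the elements in order.
const first = iterator.next();
  //=> {value: 'a', done: false}
const second = iterator.next();
  //=> {value: 'b', done: false}
const third = iterator.next();
  //=> {value: undefined, done: true}

// Each call to [Symbol.iterator]() begins a fresh iteration.
const fresh = letters[Symbol.iterator]().next();
  //=> {value: 'a', done: false}
```

This is the same protocol that for...of and the spread operator use on our behalf.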

Many operations on iterables return iterables. Our mapIterableWith returns an iterable. But the iterable it returns is not the same kind of collection as the iterable it consumes. If we give it a Stack3, we don’t get a stack back. We just get an iterable. (If we want a specific kind of collection, we have to gather the iterable into a collection. We’ll see how to do that below.)

Here are two more operations on iterables, filterIterableWith and untilIterable:

const filterIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      return {
        next: () => {
          // declared outside the loop body so that the loop condition
          // and the return statement can both see them
          let done, value;
          
          do {
            ({done, value} = iterator.next());
          } while (!done && !fn(value));
          return {done, value};
        }
      }
    }
  });
  
const untilIterable = (fn, iterable) =>
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      return {
        next: () => {
          let {done, value} = iterator.next();
          
          done = done || fn(value);
    
          return ({done, value: done ? undefined : value});
        }
      }
    }
  });

And here’s a computation performed using operations on iterables: We’ll print the odd squares that are less than or equal to one hundred:

const compose = (fn, ...rest) =>
  (...args) =>
    (rest.length === 0)
      ? fn(...args)
      : fn(compose(...rest)(...args))

const callLeft = (fn, ...args) =>
    (...remainingArgs) =>
      fn(...args, ...remainingArgs);

const squaresOf = callLeft(mapIterableWith, (x) => x * x);
const oddsOf = callLeft(filterIterableWith, (x) => x % 2 === 1);
const untilTooBig = callLeft(untilIterable, (x) => x > 100);

for (let s of compose(untilTooBig, oddsOf, squaresOf)(Numbers)) {
  console.log(s)
}
  //=>
    1
    9
    25
    49
    81

For completeness, here are two more handy iterable functions. firstIterable returns the first element of an iterable (if it has one), and restIterable returns an iterable that iterates over all but the first element of an iterable. They are equivalent to destructuring arrays with [first, ...rest]:

const firstIterable = (iterable) =>
  iterable[Symbol.iterator]().next().value;

const restIterable = (iterable) => 
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      iterator.next();
      return iterator;
    }
  });

from

Having iterated over a collection, are we limited to for...of and/or gathering the elements in an array literal and/or gathering the elements into the parameters of a function? No, of course not, we can do anything we like with them.

One useful thing is to write a .from function that gathers an iterable into a particular collection type. JavaScript’s built-in Array class already has one:

Array.from(compose(untilTooBig, oddsOf, squaresOf)(Numbers))
  //=> [1, 9, 25, 49, 81]

We can do the same with our own collections. As you recall, functions are mutable objects. And we can assign properties to functions with a . or even [ and ]. And if we assign a function to a property, we’ve created a method.

So let’s do that:

Stack3.from = function (iterable) {
  const stack = this();
  
  for (let element of iterable) {
    stack.push(element);
  }
  return stack;
}

Pair1.from = (iterable) =>
  (function iterationToList (iteration) {
    const {done, value} = iteration.next();
    
    return done ? EMPTY : Pair1(value, iterationToList(iteration));
  })(iterable[Symbol.iterator]())

Now we can go “end to end.” If we want to map a linked list of numbers to a linked list of the squares of those numbers, we can do that:

const numberList = Pair1.from(untilIterable((x) => x > 10, Numbers));

Pair1.from(squaresOf(numberList))
  //=> {"first":0,
        "rest":{"first":1,
                "rest":{"first":4,
                        "rest":{ ...

why operations on iterables?

The operations on iterables are interesting, but let’s reiterate why we care: In JavaScript, we build single-responsibility objects, and single-responsibility functions, and we compose these together to build more full-featured objects and algorithms.

Composing an iterable with a mapIterable method cleaves the responsibility for knowing how to map from the fiddly bits of how a linked list differs from a stack.

In the older style of object-oriented programming, we built “fat” objects. Each collection knew how to map itself (.map), how to fold itself (.reduce), how to filter itself (.filter) and how to find one element within itself (.find). If we wanted to flatten collections to arrays, we wrote a .toArray method for each type of collection.

Over time, this informal “interface” for collections grows by accretion. Some methods are only added to a few collections, some are added to all. But our objects grow fatter and fatter. We tell ourselves that, well, a collection ought to know how to map itself.

But we end up recreating the same bits of code in each .map method we create, in each .reduce method we create, in each .filter method we create, and in each .find method. Each one has its own variation, but the overall form is identical. That’s a sign that we should work at a higher level of abstraction, and working with iterables is that higher level of abstraction.

This “fat object” style springs from a misunderstanding: When we say a collection should know how to perform a map over itself, we don’t need for the collection to handle every single detail. That would be like saying that when we ask a bank teller for some cash, they personally print every bank note.

Object-oriented collections should definitely have methods for mapping, reducing, filtering, and finding. And they should know how to accomplish the desired result, but they should do so by delegating as much of the work as possible to operations like mapIterableWith.

Composing an iterable with a mapIterable method cleaves the responsibility for knowing how to map from the fiddly bits of how a linked list differs from a stack. And if we want to create convenience methods, we can reuse common pieces:

const extend = function (consumer, ...providers) {
  for (let i = 0; i < providers.length; ++i) {
    const provider = providers[i];
    for (let key in provider) {
      if (provider.hasOwnProperty(key)) {
        consumer[key] = provider[key]
      }
    }
  }
  return consumer
};
  
const mapIterableWith = (fn, iterable) =>
  extend({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      return {
        next: () => {
          const {done, value} = iterator.next();
    
          return ({done, value: done ? undefined : fn(value)});
        }
      }
    }
  }, LazyIterable);
  
const reduceIterableWith = (fn, seed, iterable) => {
  const iterator = iterable[Symbol.iterator]();
  let iterationResult,
      accumulator = seed;
  
  while ((iterationResult = iterator.next(), !iterationResult.done)) {
    accumulator = fn(accumulator, iterationResult.value);
  }
  return accumulator;
};
  
const filterIterableWith = (fn, iterable) =>
  extend({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      return {
        next: () => {
          // declared outside the loop body so that the loop condition
          // and the return statement can both see them
          let done, value;
          
          do {
            ({done, value} = iterator.next());
          } while (!done && !fn(value));
          return {done, value};
        }
      }
    }
  }, LazyIterable);

const untilIterable = (fn, iterable) =>
  extend({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
    
      return {
        next: () => {
          let {done, value} = iterator.next();
        
          done = done || fn(value);
  
          return ({done, value: done ? undefined : value});
        }
      }
    }
  }, LazyIterable);
  
const firstIterable = (iterable) =>
  iterable[Symbol.iterator]().next().value;

const restIterable = (iterable) => 
  extend({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      
      iterator.next();
      return iterator;
    }
  }, LazyIterable);
  
const takeIterable = (numberToTake, iterable) =>
  extend({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();
      let remainingElements = numberToTake;
    
      return {
        next: () => {
          let {done, value} = iterator.next();
        
          done = done || remainingElements-- <= 0;
  
          return ({done, value: done ? undefined : value});
        }
      }
    }
  }, LazyIterable);
    
const LazyIterable = {
   map: function (fn) {
     return mapIterableWith(fn, this);
   },
   reduce: function (fn, seed) {
     return reduceIterableWith(fn, seed, this);
   },
   filter: function (fn) {
     return filterIterableWith(fn, this);
   },
   find: function (fn) {
     return filterIterableWith(fn, this).first();
   },
   first: function () {
     return firstIterable(this);
   },
   rest: function () {
     return restIterable(this);
   },
   until: function (fn) {
     return untilIterable(fn, this);
   },
   take: function (numberToTake) {
     return takeIterable(numberToTake, this);
   }
}

// Pair, a/k/a linked lists

const EMPTY = {
  isEmpty: () => true
};

const isEmpty = (node) => node === EMPTY;

const Pair = (car, cdr = EMPTY) =>
  extend({
    car,
    cdr,
    isEmpty: () => false,
    [Symbol.iterator]: function () {
      let currentPair = this;
      
      return {
        next: () => {
          if (currentPair.isEmpty()) {
            return {done: true}
          }
          else {
            const value = currentPair.car;
            
            currentPair = currentPair.cdr;
            return {done: false, value}
          }
        }
      }
    }
  }, LazyIterable);

Pair.from = (iterable) =>
  (function iterationToList (iteration) {
    const {done, value} = iteration.next();
    
    return done ? EMPTY : Pair(value, iterationToList(iteration));
  })(iterable[Symbol.iterator]());
  
// Stack

const Stack = () =>
  extend({
    array: [],
    index: -1,
    push: function (value) {
      return this.array[this.index += 1] = value;
    },
    pop: function () {
      const value = this.array[this.index];
    
      this.array[this.index] = undefined;
      if (this.index >= 0) { 
        this.index -= 1 
      }
      return value
    },
    isEmpty: function () {
      return this.index < 0
    },
    [Symbol.iterator]: function () {
      let iterationIndex = this.index;
      
      return {
        next: () => {
          if (iterationIndex > this.index) {
            iterationIndex = this.index;
          }
          if (iterationIndex < 0) {
            return {done: true};
          }
          else {
            return {done: false, value: this.array[iterationIndex--]}
          }
        }
      }
    }
  }, LazyIterable);
  
Stack.from = function (iterable) {
  const stack = this();
  
  for (let element of iterable) {
    stack.push(element);
  }
  return stack;
}

// Pair and Stack in action
  
Stack.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  .map((x) => x * x)
  .filter((x) => x % 2 == 0)
  .first()

//=> 100
  
Pair.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  .map((x) => x * x)
  .filter((x) => x % 2 == 0)
  .reduce((seed, element) => seed + element, 0)
  
//=> 220

lazy iterables

“Laziness” is a very pejorative word when applied to people. But it can be an excellent strategy for efficiency in algorithms. Let’s be precise: Laziness is the characteristic of not doing any work until you know you need the result of the work.

Here’s an example. Compare these two:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  .map((x) => x * x)
  .filter((x) => x % 2 == 0)
  .reduce((seed, element) => seed + element, 0)
  
Pair.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  .map((x) => x * x)
  .filter((x) => x % 2 == 0)
  .reduce((seed, element) => seed + element, 0)

Both expressions evaluate to 220. And the array is faster in practice, because it is a built-in data type that performs its work in the engine, while the linked list does its work in JavaScript.

But it’s still illustrative to dissect something important: Array’s .map and .filter methods gather their results into new arrays. Thus, calling .map.filter.reduce produces two temporary arrays that are discarded when .reduce performs its final computation.

Whereas the .map and .filter methods on Pair work with iterators. They produce small iterable objects that refer back to the original iteration. This reduces the memory footprint. When working with very large collections and many operations, this can be important.

The effect is even more pronounced when we use methods like first, until, or take:

Stack.from([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
            10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
            20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
  .map((x) => x * x)
  .filter((x) => x % 2 == 0)
  .first()

This expression begins with a stack containing 30 elements. The top two are 29 and 28. It maps to the squares of all 30 numbers, but our code for mapping an iteration returns an iterable that can iterate over the squares of our numbers, not an array or stack of the squares. Same with .filter, we get an iterable that can iterate over the even squares, but not an actual stack or array.

Finally, we take the first element of that filtered, squared iterable and now JavaScript actually iterates over the stack’s elements, and it only needs to square two of those elements, 29 and 28, to return the answer.

We can confirm this:

Stack.from([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
            10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
            20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
  .map((x) => {
    console.log(`squaring ${x}`);
    return x * x
  })
  .filter((x) => {
    console.log(`filtering ${x}`);
    return x % 2 == 0
  })
  .first()

//=>
  squaring 29
  filtering 841
  squaring 28
  filtering 784
  784

If we write the almost identical thing with an array, we get a different behaviour:

[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
  .reverse()
  .map((x) => {
    console.log(`squaring ${x}`);
    return x * x
  })
  .filter((x) => {
    console.log(`filtering ${x}`);
    return x % 2 == 0
  })[0]

//=>
  squaring 29
  squaring 28
  squaring 27
  squaring 26
  ...
  squaring 1
  squaring 0
  filtering 841
  filtering 784
  filtering 729
  ...
  filtering 1
  filtering 0
  784

Array methods are eager: every time we perform a map or filter, we get a whole new array, and the computation is performed for every element whether or not we go on to use the result. This might be expensive.

You recall we briefly touched on the idea of infinite collections? Let’s make iterable numbers. They have to be lazy, otherwise we couldn’t write things like:

const Numbers = extend({
  [Symbol.iterator]: () => {
    let n = 0;
    
    return {
      next: () =>
        ({done: false, value: n++})
    }
  }
}, LazyIterable);

const firstCubeOver1234 =
  Numbers
    .map((x) => x * x * x)
    .filter((x) => x > 1234)
    .first()

//=> 1331

Balanced against their flexibility, our “lazy iterables” use structure sharing. If we mutate a collection after taking an iterable, we might get an unexpected result. This is why “pure” functional languages like Haskell combine lazy semantics with immutable collections, and why even “impure” languages like Clojure emphasize the use of immutable collections.
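Here is a sketch of that kind of surprise. It repeats a compact copy of mapIterableWith so that it stands alone, and the variables source and doubled are invented for the example:

```javascript
const mapIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();

      return {
        next: () => {
          const {done, value} = iterator.next();

          return {done, value: done ? undefined : fn(value)};
        }
      };
    }
  });

const source = [1, 2, 3];

// No doubling happens here. `doubled` merely refers back to `source`.
const doubled = mapIterableWith((x) => x * 2, source);

// Mutating the source afterwards...
source.push(4);
source[0] = 100;

// ...changes what the iterable we took earlier produces.
const result = [...doubled];
  //=> [200, 4, 6, 8]
```

An eager map would have captured [2, 4, 6] at the moment .map was called, and later mutations of the source would not have affected it.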

eager iterables

Arrays have eager semantics for .map and .filter, and for .slice, which stands in for .rest and .take. They return another array, not a lazy iterable. Whereas, the Stack and Pair collections we wrote have lazy semantics: They return a lazy iterable and when we want a true collection, we have to gather the elements into an array or another collection using .from:

const evenSquares = Pair.from(
  Pair.from([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    .map((x) => x * x)
    .filter((x) => x % 2 == 0)
  );

[...evenSquares]
  //=> [4,16,36,64,100]

Or if we want to design a collection with eager semantics for .map, .filter, .rest and .take, we can do that:

const EagerIterable = (gatherable) =>
  ({
     map: function (fn) {
       return gatherable.from(mapIterableWith(fn, this));
     },
     reduce: function (fn, seed) {
       return reduceIterableWith(fn, seed, this);
     },
     filter: function (fn) {
       return gatherable.from(filterIterableWith(fn, this));
     },
     find: function (fn) {
       return filterIterableWith(fn, this).first();
     },
     first: function () {
       return firstIterable(this);
     },
     rest: function () {
       return gatherable.from(restIterable(this));
     },
     take: function (numberToTake) {
       return gatherable.from(takeIterable(numberToTake, this));
     }
  })
  
const EagerStack = () =>
  extend({
    array: [],
    index: -1,
    push: function (value) {
      return this.array[this.index += 1] = value;
    },
    pop: function () {
      const value = this.array[this.index];
    
      this.array[this.index] = undefined;
      if (this.index >= 0) { 
        this.index -= 1 
      }
      return value
    },
    isEmpty: function () {
      return this.index < 0
    },
    [Symbol.iterator]: function () {
      let iterationIndex = this.index;
      
      return {
        next: () => {
          if (iterationIndex > this.index) {
            iterationIndex = this.index;
          }
          if (iterationIndex < 0) {
            return {done: true};
          }
          else {
            return {done: false, value: this.array[iterationIndex--]}
          }
        }
      }
    }
  }, EagerIterable(EagerStack));
  
EagerStack.from = function (iterable) {
  const stack = this();
  
  for (let element of iterable) {
    stack.push(element);
  }
  return stack;
}

EagerStack
  .from([1, 2, 3, 4, 5])
  .map((x) => x * 2)
  
//=> {"array":[10,8,6,4,2],"index":4}

And we can go back and forth between them. For example, if we want a lazy map of an array, we can use the mapIterableWith function to return a lazy iterable. And as we just noted, we can use .from to eagerly gather any iterable into a collection.
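A sketch of that round trip, again with a compact copy of mapIterableWith so the example stands alone. The counter calls exists only to show when the mapping function actually runs:

```javascript
const mapIterableWith = (fn, iterable) =>
  ({
    [Symbol.iterator]: () => {
      const iterator = iterable[Symbol.iterator]();

      return {
        next: () => {
          const {done, value} = iterator.next();

          return {done, value: done ? undefined : fn(value)};
        }
      };
    }
  });

let calls = 0;

// From eager to lazy: a lazy map over a plain array.
const lazySquares = mapIterableWith((x) => {
  calls += 1;
  return x * x;
}, [1, 2, 3]);

const callsBeforeGathering = calls;
  //=> 0

// From lazy back to eager: gather the iterable into an array.
const gathered = Array.from(lazySquares);
  //=> [1, 4, 9]

const callsAfterGathering = calls;
  //=> 3
```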

summary

Iterators are a JavaScript feature that allow us to separate the concerns of how to iterate over a collection from what we want to do with the elements of a collection. Iterable collections can be iterated over or gathered into another collection, either lazily or eagerly.

Separating concerns with iterators speaks to JavaScript’s fundamental nature: It’s a language that wants to compose functionality out of small, single-responsibility pieces, whether those pieces are functions or objects built out of functions.



This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition.

  1. You can read more about JavaScript symbols in Axel Rauschmayer’s Symbols in ECMAScript 6

https://raganwald.com/2015/02/17/lazy-iteratables-in-javascript
The Quantum Electrodynamics of Functional JavaScript

Coffee served at the CERN particle accelerator

In our code so far (Destructuring and Recursion in ES-6 and Tail Calls, Default Arguments, and Excessive Recycling in ES-6), we have used arrays and objects to represent the structure of data, and we have extensively used the ternary operator to write algorithms that terminate when we reach a base case.

For example, this length function uses functions to bind values to names, POJOs to structure nodes, and the ternary operator to detect the base case, the empty list.

const EMPTY = {};
const OneTwoThree = { first: 1, rest: { first: 2, rest: { first: 3, rest: EMPTY } } };

OneTwoThree.first
  //=> 1

OneTwoThree.rest.first
  //=> 2

OneTwoThree.rest.rest.first
  //=> 3

const length = (node, delayed = 0) =>
  node === EMPTY
    ? delayed
    : length(node.rest, delayed + 1);

length(OneTwoThree)
  //=> 3

A very long time ago, mathematicians like Alonzo Church, Moses Schönfinkel, Alan Turing, and Haskell Curry asked themselves if we really needed all these features to perform computations. They searched for a radically simpler set of tools that could accomplish all of the same things.

They established that arbitrary computations could be represented by a small set of axiomatic components. For example, we don’t need arrays to represent lists, or even POJOs to represent nodes in a linked list. We can model lists just using functions.

To Mock a Mockingbird established the metaphor of songbirds for the combinators, and ever since then logicians have called the K combinator a “kestrel,” the B combinator a “bluebird,” and so forth.

The oscin.es library contains code for all of the standard combinators and for experimenting using the standard notation.

Let’s start with some of the building blocks of combinatory logic, the K, I, and V combinators, nicknamed the “Kestrel,” the “Idiot Bird,” and the “Vireo:”

const K = (x) => (y) => x;
const I = (x) => (x);
const V = (x) => (y) => (z) => z(x)(y);

the kestrel and the idiot

A constant function is a function that always returns the same thing, no matter what you give it. For example, (x) => 42 is a constant function that always evaluates to 42. The kestrel, or K, is a function that makes constant functions. You give it a value, and it returns a constant function that gives that value.

For example:

const K = (x) => (y) => x;

const fortyTwo = K(42);

fortyTwo(6)
  //=> 42

fortyTwo("Hello")
  //=> 42

The identity function is a function that evaluates to whatever parameter you pass it. So I(42) => 42. Very simple, but useful. Now we’ll take it one more step forward: Passing a value to K gets a function back, and passing a value to that function gets us a value.

Like so:

K(6)(7)
  //=> 6

K(12)(24)
  //=> 12

This is very interesting. Given two values, we can say that K always returns the first value: K(x)(y) => x (that’s not valid JavaScript, but it’s essentially how it works).

Now, an interesting thing happens when we pass functions to each other. Consider K(I). From what we just wrote, K(x)(y) => x, so K(I)(x) => I. Makes sense. Now let’s tack one more invocation on: What is K(I)(x)(y)? If K(I)(x) => I, then K(I)(x)(y) === I(y), which is y.

Therefore, K(I)(x)(y) => y:

K(I)(6)(7)
  //=> 7

K(I)(12)(24)
  //=> 24

Aha! Given two values, K(I) always returns the second value.

K("primus")("secundus")
  //=> "primus"

K(I)("primus")("secundus")
  //=> "secundus"

If we are not feeling particularly academic, we can name our functions:

const first = K,
      second = K(I);

first("primus")("secundus")
  //=> "primus"

second("primus")("secundus")
  //=> "secundus"

This is very interesting. Given two values, we can say that K always returns the first value, and given two values, K(I) always returns the second value.

backwardness

Our first and second functions are a little different than what most people are used to when we talk about functions that access data. If we represented a pair of values as an array, we’d write them like this:

const first = ([first, second]) => first,
      second = ([first, second]) => second;

const latin = ["primus", "secundus"];

first(latin)
  //=> "primus"

second(latin)
  //=> "secundus"

Or if we were using a POJO, we’d write them like this:

const first = ({first, second}) => first,
      second = ({first, second}) => second;

const latin = {first: "primus", second: "secundus"};

first(latin)
  //=> "primus"

second(latin)
  //=> "secundus"

In both cases, the functions first and second know how the data is represented, whether it be an array or an object. You pass the data to these functions, and they extract it.

But the first and second we built out of K and I don’t work that way. You call them and pass them the bits, and they choose what to return. So if we wanted to use them with a pair of values, we’d need a piece of code that holds the values and calls our selector with them.

Here’s the first cut:

const first = K,
      second = K(I);

const latin = (selector) => selector("primus")("secundus");

latin(first)
  //=> "primus"

latin(second)
  //=> "secundus"

Our latin data structure is no longer a dumb data structure, it’s a function. And instead of passing latin to first or second, we pass first or second to latin. It’s exactly backwards of the way we write functions that operate on data.

the vireo

Given that our latin data is represented as the function (selector) => selector("primus")("secundus"), our obvious next step is to make a function that makes data. For arrays, we’d write cons = (first, second) => [first, second]. For objects we’d write: cons = (first, second) => ({first, second}). In both cases, we take two parameters, and return the form of the data.

For “data” we access with K and K(I), our “structure” is the function (selector) => selector("primus")("secundus"). Let’s extract those into parameters:

(first, second) => (selector) => selector(first)(second)

For consistency with the way combinators are written as functions taking just one parameter, we’ll curry the function:

(first) => (second) => (selector) => selector(first)(second)

Let’s try it, we’ll use the word pair for the function that makes data (When we need to refer to a specific pair, we’ll use the name aPair by default):

const first = K,
      second = K(I),
      pair = (first) => (second) => (selector) => selector(first)(second);

const latin = pair("primus")("secundus");

latin(first)
  //=> "primus"

latin(second)
  //=> "secundus"

It works! Now what is this pair function? If we change the names to x, y, and z, we get: (x) => (y) => (z) => z(x)(y). That’s the V combinator, the Vireo! So we can write:

const first = K,
      second = K(I),
      pair = V;

const latin = pair("primus")("secundus");

latin(first)
  //=> "primus"

latin(second)
  //=> "secundus"

As an aside, the Vireo is a little like JavaScript’s .apply function. It says, “take these two values and apply them to this function.” There are other, similar combinators that apply values to functions. One notable example is the “thrush” or T combinator: It takes one value and applies it to a function. It is known to most programmers as .tap.
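To make the comparison concrete, here is a small sketch of the two combinators side by side. The definition of T below is the usual one for the thrush; the helper functions add and double are ours, purely for illustration:

```javascript
const V = (x) => (y) => (z) => z(x)(y);   // vireo: holds two values, then takes a function
const T = (x) => (z) => z(x);             // thrush: holds one value, then takes a function

// Hypothetical helpers, curried so they fit the one-parameter style
const add = (a) => (b) => a + b;
const double = (n) => n * 2;

const seven = V(3)(4)(add);
  //=> 7

const ten = T(5)(double);
  //=> 10
```

In both cases the value arrives first and the function arrives last, which is the same backwardness we saw with latin(first).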

Armed with nothing more than K, I, and V, we can make a little data structure that holds two values, the cons cell of Lisp and the node of a linked list. Without arrays, and without objects, just with functions. We’d better try it out to check.

lists with functions as data

Here’s another look at linked lists using POJOs. We use the term rest instead of second, but it’s otherwise identical to what we have above:

const first = ({first, rest}) => first,
      rest  = ({first, rest}) => rest,
      pair = (first, rest) => ({first, rest}),
      EMPTY = ({});

const l123 = pair(1, pair(2, pair(3, EMPTY)));

first(l123)
  //=> 1

first(rest(l123))
  //=> 2

first(rest(rest(l123)))
  //=> 3

We can write length and mapWith functions over it:

const length = (aPair) =>
  aPair === EMPTY
    ? 0
    : 1 + length(rest(aPair));

length(l123)
  //=> 3

const reverse = (aPair, delayed = EMPTY) =>
  aPair === EMPTY
    ? delayed
    : reverse(rest(aPair), pair(first(aPair), delayed));

const mapWith = (fn, aPair, delayed = EMPTY) =>
  aPair === EMPTY
    ? reverse(delayed)
    : mapWith(fn, rest(aPair), pair(fn(first(aPair)), delayed));

const doubled = mapWith((x) => x * 2, l123);

first(doubled)
  //=> 2

first(rest(doubled))
  //=> 4

first(rest(rest(doubled)))
  //=> 6

Can we do the same with the linked lists we build out of functions? Yes:

const first = K,
      rest  = K(I),
      pair = V,
      EMPTY = (() => {});

const l123 = pair(1)(pair(2)(pair(3)(EMPTY)));

l123(first)
  //=> 1

l123(rest)(first)
  //=> 2

l123(rest)(rest)(first)
  //=> 3

We write them in a backwards way, but they seem to work. How about length?

const length = (aPair) =>
  aPair === EMPTY
    ? 0
    : 1 + length(aPair(rest));

length(l123)
  //=> 3

And mapWith?

const reverse = (aPair, delayed = EMPTY) =>
  aPair === EMPTY
    ? delayed
    : reverse(aPair(rest), pair(aPair(first))(delayed));

const mapWith = (fn, aPair, delayed = EMPTY) =>
  aPair === EMPTY
    ? reverse(delayed)
    : mapWith(fn, aPair(rest), pair(fn(aPair(first)))(delayed));

const doubled = mapWith((x) => x * 2, l123)

doubled(first)
  //=> 2

doubled(rest)(first)
  //=> 4

doubled(rest)(rest)(first)
  //=> 6

Presto, we can use pure functions to represent a linked list. And with care, we can do amazing things like use functions to represent numbers, build more complex data structures like trees, and in fact, anything that can be computed can be computed using just functions and nothing else.
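As a taste of representing numbers with functions, here is a sketch of Church numerals, where the number n is "apply a function n times." The names ZERO, SUCC, PLUS, and toNumber are our own choices, and toNumber cheats by using a native number so we can read the answer:

```javascript
const ZERO = (f) => (x) => x;                             // apply f zero times
const SUCC = (n) => (f) => (x) => f(n(f)(x));             // one more application than n
const PLUS = (m) => (n) => (f) => (x) => m(f)(n(f)(x));   // m applications after n

// Convert back to a native number, only so we can inspect the result
const toNumber = (n) => n((count) => count + 1)(0);

const ONE = SUCC(ZERO),
      TWO = SUCC(ONE),
      THREE = PLUS(ONE)(TWO);

toNumber(ZERO)
  //=> 0

toNumber(THREE)
  //=> 3
```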

But without building our way up to something insane like writing a JavaScript interpreter using JavaScript functions and no other data structures, let’s take things another step in a slightly different direction.

We used functions to replace arrays and POJOs, but we still use JavaScript’s built-in operators to test for equality (===) and to branch (?:).

say “please”

We keep using the same pattern in our functions: aPair === EMPTY ? doSomething : doSomethingElse. This follows the philosophy we used with data structures: The function doing the work inspects the data structure.

We can reverse this: Instead of asking a pair if it is empty and then deciding what to do, we can ask the pair to do it for us. Here’s length again:

const length = (aPair) =>
  aPair === EMPTY
    ? 0
    : 1 + length(aPair(rest));

Let’s presume we are working with a slightly higher abstraction, we’ll call it a list. Instead of writing length(list) and examining a list, we’ll write something like:

const length = (list) => list(
  () => 0,
  (aPair) => 1 + length(aPair(rest))
);

Now we’ll need to write first and rest functions for a list, and those names will collide with the first and rest we wrote for pairs. So let’s disambiguate our names:

const pairFirst = K,
      pairRest  = K(I),
      pair = V;

const first = (list) => list(
    () => "ERROR: Can't take first of an empty list",
    (aPair) => aPair(pairFirst)
  );

const rest = (list) => list(
    () => "ERROR: Can't take rest of an empty list",
    (aPair) => aPair(pairRest)
  );

const length = (list) => list(
    () => 0,
    (aPair) => 1 + length(aPair(pairRest))
  );

We’ll also write a handy list printer:

const print = (list) => list(
    () => "",
    (aPair) => `${aPair(pairFirst)} ${print(aPair(pairRest))}`
  );

How would all this work? Let’s start with the obvious. What is an empty list?

const EMPTYLIST = (whenEmpty, unlessEmpty) => whenEmpty()

And what is a node of a list?

const node = (x) => (y) =>
  (whenEmpty, unlessEmpty) => unlessEmpty(pair(x)(y));

Let’s try it:

const l123 = node(1)(node(2)(node(3)(EMPTYLIST)));

print(l123)
  //=> 1 2 3

We can write reverse and mapWith as well. We aren’t being super-strict about emulating combinatory logic, we’ll use default parameters:

const reverse = (list, delayed = EMPTYLIST) => list(
  () => delayed,
  (aPair) => reverse(aPair(pairRest), node(aPair(pairFirst))(delayed))
);

print(reverse(l123));
  //=> 3 2 1

const mapWith = (fn, list, delayed = EMPTYLIST) =>
  list(
    () => reverse(delayed),
    (aPair) => mapWith(fn, aPair(pairRest), node(fn(aPair(pairFirst)))(delayed))
  );

print(mapWith(x => x * x, reverse(l123)))
  //=> 9 4 1

We have managed to provide the exact same functionality that === and ?: provided, but using functions and nothing else.

functions are not the real point

There are lots of similar texts explaining how to construct complex semantics out of functions. You can establish that K and K(I) can represent true and false, model magnitudes with Church Numerals or Surreal Numbers, and build your way up to printing FizzBuzz.
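For instance, here is a minimal sketch of K and K(I) standing in for true and false. The names TRUE, FALSE, IF, NOT, and AND are ours. Note that JavaScript evaluates both branches before IF chooses one, so a stricter version would pass functions and call only the chosen one:

```javascript
const K = (x) => (y) => x;
const I = (x) => (x);

const TRUE = K,        // given two values, pick the first
      FALSE = K(I);    // given two values, pick the second

// A boolean *is* its own selector, so IF just hands it the two branches
const IF = (condition) => (whenTrue) => (whenFalse) => condition(whenTrue)(whenFalse);
const NOT = (condition) => condition(FALSE)(TRUE);
const AND = (a) => (b) => a(b)(FALSE);

IF(TRUE)("yes")("no")
  //=> "yes"

IF(NOT(TRUE))("yes")("no")
  //=> "no"

IF(AND(TRUE)(FALSE))("yes")("no")
  //=> "no"
```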

The superficial conclusion reads something like this:

Functions are a fundamental building block of computation. They are “axioms” of combinatory logic, and can be used to compute anything that JavaScript can compute.

However, that is not the interesting thing to note here. Practically speaking, languages like JavaScript already provide arrays with mapping and folding methods, choice operations, and other rich constructs. Knowing how to make a linked list out of functions is not really necessary for the working programmer. (Knowing that it can be done, on the other hand, is very important to understanding computer science.)

Knowing how to make a list out of just functions is a little like knowing that photons are the Gauge Bosons of the electromagnetic force. It’s the QED of physics that underpins the Maxwell’s Equations of programming. Deeply important, but not practical when you’re building a bridge.

So what is interesting about this? What nags at our brain as we’re falling asleep after working our way through this?

a return to backward thinking

To make pairs work, we did things backwards, we passed the first and rest functions to the pair, and the pair called our function. As it happened, the pair was composed by the vireo (or V combinator): (x) => (y) => (z) => z(x)(y).

But we could have done something completely different. We could have written a pair that stored its elements in an array, or a pair that stored its elements in a POJO. All we know is that we can pass the pair a function of our own, and it will be called with the elements of the pair.

The exact implementation of a pair is hidden from the code that uses a pair. Here, we’ll prove it:

const first = K,
      second = K(I),
      pair = (first) => (second) => {
        const pojo = {first, second};

        return (selector) => selector(pojo.first)(pojo.second);
      };

const latin = pair("primus")("secundus");

latin(first)
  //=> "primus"

latin(second)
  //=> "secundus"

This is a little gratuitous, but it makes the point: The code that uses the data doesn’t reach in and touch it: The code that uses the data provides some code and asks the data to do something with it.

The same thing happens with our lists. Here’s length for lists:

const length = (list) => list(
    () => 0,
    (aPair) => 1 + length(aPair(pairRest))
  );

We’re passing list what we want done with an empty list, and what we want done with a list that has at least one element. We then ask list to do it, and provide a way for list to call the code we pass in.

We won’t bother here, but it’s easy to see how to swap our functions out and replace them with an array. Or a column in a database. This is fundamentally not the same thing as this code for the length of a linked list:

const length = (node, delayed = 0) =>
  node === EMPTY
    ? delayed
    : length(node.rest, delayed + 1);

The line node === EMPTY presumes a lot of things. It presumes there is one canonical empty list value. It presumes you can compare these things with the === operator. We can fix this with an isEmpty function, but now we’re pushing even more knowledge about the structure of lists into the code that uses them.

Having a list know itself whether it is empty hides implementation information from the code that uses lists. This is a fundamental principle of good design. It is a tenet of Object-Oriented Programming, but it is not exclusive to OOP: We can and should design data structures to hide implementation information from the code that uses them, whether we are working with functions, objects, or both.

There are many tools for hiding implementation information, and we have now seen two particularly powerful patterns:

  • Instead of directly manipulating part of an entity, pass it a function and have it call our function with the part we want.
  • And instead of testing some property of an entity and making a choice of our own with ?: (or if), pass the entity the work we want done for each case and let it test itself.
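Both patterns can be seen in a sketch of a list that is secretly backed by an array. The name listFromArray is hypothetical, and this is just one way to do it. The point is that length and first below are written exactly as they were for lists built out of node and EMPTYLIST, and they neither know nor care that an array is involved:

```javascript
const K = (x) => (y) => x;
const I = (x) => (x);
const V = (x) => (y) => (z) => z(x)(y);

const pairFirst = K,
      pairRest  = K(I),
      pair = V;

// Hypothetical array-backed list. It honours the same protocol as
// EMPTYLIST and node: call whenEmpty(), or call unlessEmpty with a pair.
const listFromArray = (array, start = 0) =>
  (whenEmpty, unlessEmpty) =>
    start >= array.length
      ? whenEmpty()
      : unlessEmpty(pair(array[start])(listFromArray(array, start + 1)));

// Unchanged client code
const first = (list) => list(
    () => "ERROR: Can't take first of an empty list",
    (aPair) => aPair(pairFirst)
  );

const length = (list) => list(
    () => 0,
    (aPair) => 1 + length(aPair(pairRest))
  );

const abc = listFromArray(["a", "b", "c"]);

first(abc)
  //=> "a"

length(abc)
  //=> 3
```

The array-backed list does use ?: internally, but that choice is now a private matter for the list rather than something every caller has to repeat.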


If you speak Ruby, Tom Stuart’s Programming with Nothing is a must-watch and a must-read.

postscript:

This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition.

https://raganwald.com/2015/02/13/functional-quantum-electrodynamics
Tail Calls, Default Arguments, and Excessive Recycling in ES-6

The mapWith and foldWith functions we wrote in Destructuring and Recursion in ES6 are useful for illustrating the basic principles behind using recursion to work with self-similar data structures, but they are not “production-ready” implementations. One of the reasons they are not production-ready is that they consume memory proportional to the size of the array being folded.

Let’s look at how. Here’s our extremely simple mapWith function again:

const mapWith = (fn, [first, ...rest]) =>
  first === undefined
    ? []
    : [fn(first), ...mapWith(fn, rest)];

mapWith((x) => x * x, [1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

Let’s step through its execution. First, mapWith((x) => x * x, [1, 2, 3, 4, 5]) is invoked. first is not undefined, so it evaluates [fn(first), ...mapWith(fn, rest)]. To do that, it has to evaluate fn(first) and mapWith(fn, rest), then evaluate [fn(first), ...mapWith(fn, rest)].

This is roughly equivalent to writing:

const mapWith = function (fn, [first, ...rest]) {
  if (first === undefined) {
    return [];
  }
  else {
    const _temp1 = fn(first),
          _temp2 = mapWith(fn, rest),
          _temp3 = [_temp1, ..._temp2];

    return _temp3;
  }
}

Note that while evaluating mapWith(fn, rest), JavaScript must retain the value first or fn(first), plus some housekeeping information so it remembers what to do with mapWith(fn, rest) when it has a result. JavaScript cannot throw first away. So we know that JavaScript is going to hang on to 1.

Next, JavaScript invokes mapWith(fn, rest), which is semantically equivalent to mapWith((x) => x * x, [2, 3, 4, 5]). And the same thing happens: JavaScript has to hang on to 2 (or 4, or both, depending on the implementation), plus some housekeeping information so it remembers what to do with that value, while it calls the equivalent of mapWith((x) => x * x, [3, 4, 5]).

This keeps on happening, so that JavaScript collects the values 1, 2, 3, 4, and 5 plus housekeeping information by the time it calls mapWith((x) => x * x, []). It can start assembling the resulting array and start discarding the information it is saving.

That information is saved on a call stack, and it is quite expensive. Furthermore, doubling the length of an array will double the amount of space we need on the stack, plus double all the work required to set up and tear down the housekeeping data for each call (these are called call frames, and they include the place where the function was called, an environment, and so on).

In practice, using a method like this with more than about 50 items in an array may cause some implementations to run very slowly, run out of memory and freeze, or cause an error.

mapWith((x) => x * x, [
   0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
  10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
  20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
  30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
  40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
  50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
  60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
  70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  80, 81, 82, 83, 84, 85, 86, 87, 88, 89,
  90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
   0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
  10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
  20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
  30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
  40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
  50, 51, 52, 53, 54, 55, 56, 57, 58, 59,
  60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
  70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
  80, 81, 82, 83, 84, 85, 86, 87, 88, 89,
  90, 91, 92, 93, 94, 95, 96, 97, 98, 99
])
  //=> ???

Is there a better way? Several, in fact: fast algorithms are a very highly studied field of computer science. The one we’re going to look at here is called tail-call optimization, or “TCO.”

tail-call optimization

A “tail-call” occurs when a function’s last act is to invoke another function, and then return whatever the other function returns. For example, consider the maybe function decorator:

const maybe = (fn) =>
  function (...args) {
    if (args.length === 0) {
      return;
    }
    else {
      for (const arg of args) {
        if (arg == null) return;
      }
      return fn.apply(this, args);
    }
  }

There are three places it returns. The first two don’t return anything, they don’t matter. But the third is fn.apply(this, args). This is a tail-call, because it invokes another function and returns its result. This is interesting, because after sorting out what to supply as arguments (this, args), JavaScript can throw away everything in its current stack frame. It isn’t going to do any more work, so it can throw its existing stack frame away.
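Here is a short usage sketch of the decorator, so we can see all three returns in action. It assumes maybe walks the argument values with for...of, and the add function is ours, purely for illustration:

```javascript
const maybe = (fn) =>
  function (...args) {
    if (args.length === 0) {
      return;                        // first return: nothing to work with
    }
    for (const arg of args) {
      if (arg == null) return;       // second return: a null or undefined argument
    }
    return fn.apply(this, args);     // third return: the tail call
  };

// Hypothetical function to decorate
const add = (a, b) => a + b;
const safeAdd = maybe(add);

safeAdd(1, 2)
  //=> 3

safeAdd(1, null)
  //=> undefined

safeAdd()
  //=> undefined
```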

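And in fact, the ECMAScript 2015 specification calls for exactly that in strict mode: The engine throws the stack frame away, and does not consume extra memory when making a maybe-wrapped call. This is a very important characteristic of the language as specified: If a function makes a call in tail position, a conforming engine optimizes away the function call overhead and stack space. Engine support is uneven, though: JavaScriptCore (Safari) implements proper tail calls, while V8 has not shipped them, so it is worth checking the engines we target.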

That is excellent, but one wrapping is not a big deal. When would we really care? Consider this implementation of length:

const length = ([first, ...rest]) =>
  first === undefined
    ? 0
    : 1 + length(rest);

The length function calls itself, but it is not a tail-call, because it returns 1 + length(rest), not length(rest).

The problem can be stated in such a way that the answer is obvious: length does not call itself in tail position, because it has to do two pieces of work, and while one of them is in the recursive call to length, the other happens after the recursive call.

The obvious solution?

converting non-tail-calls to tail-calls

The obvious solution is to push the 1 + work into the call to length. Here’s our first cut:

const lengthDelaysWork = ([first, ...rest], numberToBeAdded) =>
  first === undefined
    ? 0 + numberToBeAdded
    : lengthDelaysWork(rest, 1 + numberToBeAdded)

lengthDelaysWork(["foo", "bar", "baz"], 0)
  //=> 3

This lengthDelaysWork function calls itself in tail position. The 1 + work is done before calling itself, and by the time it reaches the terminal position, it has the answer. Now that we’ve seen how it works, we can clean up the 0 + numberToBeAdded business. But while we’re doing that, it’s annoying to remember to call it with a zero. Let’s fix that:

const lengthDelaysWork = ([first, ...rest], numberToBeAdded) =>
  first === undefined
    ? numberToBeAdded
    : lengthDelaysWork(rest, 1 + numberToBeAdded)

const length = (n) =>
  lengthDelaysWork(n, 0);

Or we could use partial application:

const callLast = (fn, ...args) =>
    (...remainingArgs) =>
      fn(...remainingArgs, ...args);

const length = callLast(lengthDelaysWork, 0);

length(["foo", "bar", "baz"])
  //=> 3

This version of length uses lengthDelaysWork, and JavaScript optimizes that so it does not take up stack space proportional to the length of the array. We can use this technique with mapWith:

const mapWithDelaysWork = (fn, [first, ...rest], prepend) =>
  first === undefined
    ? prepend
    : mapWithDelaysWork(fn, rest, [...prepend, fn(first)]);

const mapWith = callLast(mapWithDelaysWork, []);

mapWith((x) => x * x, [1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

We can use it with ridiculously large arrays:

mapWith((x) => x * x, [
     0,    1,    2,    3,    4,    5,    6,    7,    8,    9,
    10,   11,   12,   13,   14,   15,   16,   17,   18,   19,
    20,   21,   22,   23,   24,   25,   26,   27,   28,   29,
    30,   31,   32,   33,   34,   35,   36,   37,   38,   39,
    40,   41,   42,   43,   44,   45,   46,   47,   48,   49,
    50,   51,   52,   53,   54,   55,   56,   57,   58,   59,
    60,   61,   62,   63,   64,   65,   66,   67,   68,   69,
    70,   71,   72,   73,   74,   75,   76,   77,   78,   79,
    80,   81,   82,   83,   84,   85,   86,   87,   88,   89,
    90,   91,   92,   93,   94,   95,   96,   97,   98,   99,

  // ...

  2980, 2981, 2982, 2983, 2984, 2985, 2986, 2987, 2988, 2989,
  2990, 2991, 2992, 2993, 2994, 2995, 2996, 2997, 2998, 2999 ])

  //=> [0,1,4,9,16,25,36,49,64,81,100,121,144,169,196, ...

Brilliant! We can map over large arrays without incurring all the memory and performance overhead of non-tail-calls. And this basic transformation from a recursive function that does not make a tail call, into a recursive function that calls itself in tail position, is a bread-and-butter pattern for programmers using a language that incorporates tail-call optimization.
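To practise the transformation once more, here is a sketch that applies it to summing an array. The names sumNaive, sumDelaysWork, and sum are ours. Like the other examples, it treats an undefined first element as the end of the array:

```javascript
const callLast = (fn, ...args) =>
    (...remainingArgs) =>
      fn(...remainingArgs, ...args);

// Not a tail call: the addition happens after the recursive call returns
const sumNaive = ([first, ...rest]) =>
  first === undefined
    ? 0
    : first + sumNaive(rest);

// A tail call: the addition is done first and the running total is passed along
const sumDelaysWork = ([first, ...rest], runningTotal) =>
  first === undefined
    ? runningTotal
    : sumDelaysWork(rest, first + runningTotal);

const sum = callLast(sumDelaysWork, 0);

sumNaive([1, 2, 3, 4, 5])
  //=> 15

sum([1, 2, 3, 4, 5])
  //=> 15
```

The recipe is the same each time: find the work that happens after the recursive call, and hand it to the next call as an extra argument instead.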

factorials

Introductions to recursion often mention calculating factorials:

In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n. For example:

5! = 5 × 4 × 3 × 2 × 1 = 120.

The naïve function for calculating the factorial of a positive integer follows directly from the definition:

const factorial = (n) =>
  n === 1
  ? n
  : n * factorial(n - 1);

factorial(1)
  //=> 1

factorial(5)
  //=> 120

While this is mathematically elegant, it is computational filigree.

Once again, it is not tail-recursive, it needs to save the stack with each invocation so that it can take the result returned and compute n * factorial(n - 1). We can do the same conversion, pass in the work to be done:

const factorialWithDelayedWork = (n, work) =>
  n === 1
  ? work
  : factorialWithDelayedWork(n - 1, n * work);

const factorial = (n) =>
  factorialWithDelayedWork(n, 1);

Or we could use partial application:

const callLast = (fn, ...args) =>
    (...remainingArgs) =>
      fn(...remainingArgs, ...args);

const factorial = callLast(factorialWithDelayedWork, 1);

factorial(1)
  //=> 1

factorial(5)
  //=> 120

As before, we wrote a factorialWithDelayedWork function, then used partial application (callLast) to make a factorial function that took just the one argument and supplied the initial work value.

default arguments

Our problem is that we can directly write:

const factorial = (n, work) =>
  n === 1
  ? work
  : factorial(n - 1, n * work);

factorial(1, 1)
  //=> 1

factorial(5, 1)
  //=> 120

But it is hideous to have to always add a 1 parameter, we’d be demanding that everyone using the factorial function know that we are using a tail-recursive implementation.

What we really want is this: We want to write something like factorial(6), and have JavaScript automatically know that we really mean factorial(6, 1). But when it calls itself, it will call factorial(5, 6) and that will not mean factorial(5, 1).

JavaScript provides this exact syntax, it’s called a default argument, and it looks like this:

const factorial = (n, work = 1) =>
  n === 1
  ? work
  : factorial(n - 1, n * work);

factorial(1)
  //=> 1

factorial(6)
  //=> 720

By writing our parameter list as (n, work = 1) =>, we’re stating that if a second parameter is not provided, work is to be bound to 1. We can do similar things with our other tail-recursive functions:

const length = ([first, ...rest], numberToBeAdded = 0) =>
  first === undefined
    ? numberToBeAdded
    : length(rest, 1 + numberToBeAdded)

length(["foo", "bar", "baz"])
  //=> 3

const mapWith = (fn, [first, ...rest], prepend = []) =>
  first === undefined
    ? prepend
    : mapWith(fn, rest, [...prepend, fn(first)]);

mapWith((x) => x * x, [1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

Now we don’t need to use two functions. A default argument is concise and readable.
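One thing to watch for, sketched below: a default only kicks in when the argument is missing or undefined, so the extra parameter is still part of the function's public signature. Anything that passes a second argument, such as the array .map method (which supplies the index), will quietly override it:

```javascript
const factorial = (n, work = 1) =>
  n === 1
  ? work
  : factorial(n - 1, n * work);

factorial(5)
  //=> 120

// Passing undefined explicitly also selects the default
factorial(5, undefined)
  //=> 120

// .map calls factorial(element, index, array), so the index becomes work
const surprising = [1, 2, 3].map(factorial);
  //=> [0,2,12]

// Wrapping the call ensures only one argument is passed
const expected = [1, 2, 3].map((n) => factorial(n));
  //=> [1,2,6]
```

When that matters, the two-function versions above (or a wrapper like the one in the last line) keep the extra parameter private.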

Garbage, Garbage Everywhere

Garbage Day

We have now seen how to use Tail Calls to execute mapWith in constant space:

const mapWith = (fn, [first, ...rest], prepend = []) =>
  first === undefined
    ? prepend
    : mapWith(fn, rest, [...prepend, fn(first)]);

mapWith((x) => x * x, [1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

But when we try it on very large arrays, we discover that it is still very slow. Much slower than the built-in .map method for arrays. The right tool to discover why it’s still slow is a memory profiler, but a simple inspection of the program will reveal the following:

Every time we call mapWith, we’re calling [...prepend, fn(first)]. To do that, we take the array in prepend and push fn(first) onto the end, creating a new array that will be passed to the next invocation of mapWith.

Worse, the JavaScript Engine actually copies the elements from prepend into the new array one at a time. That is very laborious.1

The array we had in prepend is no longer used. In GC environments, it is marked as no longer being used, and eventually the garbage collector recycles the memory it is using. Lather, rinse, repeat: Every time we call mapWith, we’re creating a new array, copying all the elements from prepend into the new array, and then we no longer use prepend.

We may not be creating 3,000 stack frames, but we are creating three thousand new arrays and copying elements into each and every one of them. Although the maximum amount of memory does not grow, the thrashing as we create short-lived arrays is very bad, and we do a lot of work copying elements from one array to another.

Key Point: Our [first, ...rest] approach to recursion is slow because it creates a lot of temporary arrays, and it spends an enormous amount of time copying elements into arrays that end up being discarded.
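To get a feel for the scale, here is a rough model that counts only the elements copied by [...prepend, value] as the result grows. The function name is ours, and it ignores the similar copying done by destructuring [first, ...rest], so if anything it understates the work:

```javascript
// Counts element copies made by building an n-element result
// one [...prepend, value] at a time.
const copiesMadeBySpreading = (n) => {
  let copies = 0;
  let prepend = [];

  for (let i = 0; i < n; i += 1) {
    copies += prepend.length;     // every existing element is copied into the new array
    prepend = [...prepend, i];    // the old array becomes garbage
  }
  return copies;
};

copiesMadeBySpreading(5)
  //=> 10

copiesMadeBySpreading(3000)
  //=> 4498500
```

The count is 0 + 1 + 2 + ... + (n - 1), or n(n - 1)/2, so doubling the length of the array roughly quadruples the copying.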

So here’s a question: If this is such a slow approach, why do some examples of “functional” algorithms work this exact way?

The IBM 704

some history

Once upon a time, there was a programming language called Lisp, an acronym for LISt Processing.2 Lisp was one of the very first high-level languages, the very first implementation was written for the IBM 704 computer. (The very first FORTRAN implementation was also written for the 704).

The 704 had a 36-bit word, meaning that it was very fast to store and retrieve 36-bit values. The CPU’s instruction set featured two important macros: CAR would fetch 15 bits representing the Contents of the Address part of the Register, while CDR would fetch the Contents of the Decrement part of the Register.

In broad terms, this means that a single 36-bit word could store two separate 15-bit values and it was very fast to save and retrieve pairs of values. If you had two 15-bit values and wished to write them to the register, the CONS macro would take the values and write them to a 36-bit word.

Thus, CONS put two values together, CAR extracted one, and CDR extracted the other. Lisp’s basic data type is often said to be the list, but in actuality it was the “cons cell,” the term used to describe two 15-bit values stored in one word. The 15-bit values were used as pointers that could refer to a location in memory, so in effect, a cons cell was a little data structure with two pointers to other cons cells.

Lists were represented as linked lists of cons cells, with each cell’s head pointing to an element and the tail pointing to another cons cell.

Having these instructions be very fast was important to those early designers: They were working on one of the first high-level languages (COBOL and FORTRAN being the others), and computers in the late 1950s were extremely small and slow by today’s standards. Although the 704 used core memory, it still used vacuum tubes for its logic. Thus, the design of programming languages and algorithms was driven by what could be accomplished with limited memory and performance.

Here’s the scheme in JavaScript, using two-element arrays to represent cons cells:

const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

We can make a list by calling cons repeatedly, and terminating it with null:

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

oneToFive
  //=> [1,[2,[3,[4,[5,null]]]]]

Notice that though JavaScript displays our list as if it is composed of arrays nested within each other like Russian Dolls, in reality the arrays refer to each other with references, so [1,[2,[3,[4,[5,null]]]]] is actually more like:

const node5 = [5,null],
      node4 = [4, node5],
      node3 = [3, node4],
      node2 = [2, node3],
      node1 = [1, node2];

const oneToFive = node1;

This is a linked list. Those early Lispers used the names car and cdr after the hardware instructions, whereas today we use words like data and reference. But it works the same way: If we want the head of a list, we call car on it:

car(oneToFive)
  //=> 1

car is very fast, it simply extracts the first element of the cons cell.

But what about the rest of the list? cdr does the trick:

cdr(oneToFive)
  //=> [2,[3,[4,[5,null]]]]

Again, it’s just extracting a reference from a cons cell, it’s very fast. In Lisp, it’s blazingly fast because it happens in hardware. There’s no making copies of arrays, the time to cdr a list with five elements is the same as the time to cdr a list with 5,000 elements, and no temporary arrays are needed. In JavaScript, it’s still much, much, much faster to get all the elements except the head from a linked list than from an array. Getting one reference to a structure that already exists is faster than copying a bunch of elements.
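A small sketch makes the difference visible. It reuses the cons, car, and cdr definitions from above; the names list, array, restA, and restB are made up for the example. Taking the cdr twice yields the very same cell, while gathering twice yields two distinct copies.

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

const list = cons(1, cons(2, cons(3, null)));

// cdr hands back the existing cell, so asking twice gives the same reference
cdr(list) === cdr(list)
  //=> true

const array = [1, 2, 3];
const [, ...restA] = array;
const [, ...restB] = array;

// gathering builds a fresh array every time
restA === restB
  //=> false
```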

So now we understand that in Lisp, a lot of things use linked lists, and they do that in part because it was what the hardware made possible.

Getting back to JavaScript now, when we write [first, ...rest] to gather or spread arrays, we’re emulating the semantics of car and cdr, but not the implementation. We’re doing something laborious and memory-inefficient compared to using a linked list as Lisp did and as we can still do if we choose.

That being said, it is easy to understand and helps us grasp how literals and destructuring work, and how recursive algorithms ought to mirror the self-similarity of the data structures they manipulate. And so it is today that languages like JavaScript have arrays that are slow to split into the equivalent of a car/cdr pair, but instructional examples of recursive programs still have echoes of their Lisp origins.

so why arrays

If [first, ...rest] is so slow, why does JavaScript use arrays instead of making everything a linked list?

Well, linked lists are fast for a few things, like taking the front element off a list, and taking the remainder of a list. But not for iterating over a list: Pointer chasing through memory is quite a bit slower than incrementing an index. In addition to the extra fetches to dereference pointers, pointer chasing suffers from cache misses. And if you want an arbitrary item from a list, you have to iterate through the list element by element, whereas with the indexed array you just fetch it.
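Here is a sketch of what fetching an arbitrary item costs with our two-element-array cons cells. The helper nth is hypothetical, written just for this comparison: it has to step through the cells one cdr at a time, whereas the array looks the element up by index.

```javascript
const cons = (a, d) => [a, d],
      car  = ([a, d]) => a,
      cdr  = ([a, d]) => d;

// Hypothetical helper: walks n cells from the front, one cdr at a time
const nth = (list, n) => {
  let cell = list;

  for (let i = 0; i < n && cell !== null; ++i) {
    cell = cdr(cell);
  }

  return cell === null ? undefined : car(cell);
};

const oneToFive = cons(1, cons(2, cons(3, cons(4, cons(5, null)))));

nth(oneToFive, 3)
  //=> 4

const array = [1, 2, 3, 4, 5];

array[3]
  //=> 4
```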

We have avoided discussing rebinding and mutating values, but if we want to change elements of our lists, the naïve linked list implementation suffers as well: When we take the cdr of a linked list, we are sharing the elements. If we make any change other than cons-ing a new element to the front, we are changing both the new list and the old list.

Arrays avoid this problem by pessimistically copying all the references whenever we extract an element or sequence of elements from them.
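The difference can be sketched in a few lines. This example deliberately mutates values, which we have otherwise avoided, and the variable names are invented for the illustration.

```javascript
const cons = (a, d) => [a, d],
      cdr  = ([a, d]) => d;

const oneTwoThree = cons(1, cons(2, cons(3, null)));
const twoThree = cdr(oneTwoThree); // the same cells, not a copy

twoThree[0] = "two"; // mutate the shared cell

oneTwoThree
  //=> [1,["two",[3,null]]]

const array = [1, 2, 3];
const [, ...rest] = array; // a fresh copy of the references

rest[0] = "two";

array
  //=> [1,2,3]
```

Changing the shared cell changes the original list, while changing the gathered copy leaves the original array alone.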

For these and other reasons, almost all languages today make it possible to use a fast array or vector type that is optimized for iteration, and even Lisp now has a variety of data structures that are optimized for specific use cases.

summary

Although we showed how to use tail calls to map and fold over arrays with [first, ...rest], in reality this is not how it ought to be done. But it is an extremely simple illustration of how recursion works when you have a self-similar means of constructing a data structure.


This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition. The extracts so far:


  1. It needn’t always be so: Programmers have developed specialized data structures that make operations like this cheap, often by arranging for structures to share common elements by default, and only making copies when changes are made. But this is not how JavaScript’s built-in arrays work. 

  2. Lisp is still very much alive, and one of the most interesting and exciting programming languages in use today is Clojure, a Lisp dialect that runs on the JVM, along with its sibling ClojureScript, Clojure that transpiles to JavaScript. 

https://raganwald.com/2015/02/07/tail-calls-defult-arguments-recycling
Destructuring and Recursion in ES-6

Drink HAND-POURED coffee.

array literals

Arrays are JavaScript’s “native” representation of lists. Lists are important because they represent ordered collections of things, and ordered collections are a fundamental abstraction for making sense of reality.

JavaScript has a literal syntax for creating an array: The [ and ] characters. We can create an empty array:

[]
  //=> []

We can create an array with one or more elements by placing them between the brackets and separating the items with commas. Whitespace is optional:

[1]
  //=> [1]

[2, 3, 4]
  //=> [2,3,4]

Any expression will work:

[ 2,
  3,
  2 + 2
]
  //=> [2,3,4]

Including an expression denoting another array:

[[[[[]]]]]

This is an array with one element that is an array with one element that is an array with one element that is an array with one element that is an empty array. Although that seems like something nobody would ever construct, many students have worked with almost the exact same thing when they explored various means of constructing arithmetic from Set Theory.

Any expression will do, including names:

const wrap = (something) => [something];

wrap("lunch")
  //=> ["lunch"]

Array literals are expressions, and arrays are reference types. We can see that each time an array literal is evaluated, we get a new, distinct array, even if it contains the exact same elements:

[] === []
  //=> false

[2 + 2] === [2 + 2]
  //=> false

const array_of_one = () => [1];

array_of_one() === array_of_one()
  //=> false

destructuring arrays

Destructuring is a feature going back to Common Lisp, if not before. We saw how to construct an array literal using [, expressions, commas, and ]. Here’s an example of an array literal that uses a name:

const wrap = (something) => [something];

Let’s expand it to use a block and an extra name:

const wrap = (something) => {
  const wrapped = [something];

  return wrapped;
}

wrap("package")
  //=> ["package"]

The line const wrapped = [something]; is interesting. On the left hand is a name to be bound, and on the right hand is an array literal, a template for constructing an array, very much like a quasi-literal string.

In JavaScript, we can actually reverse the statement and place the template on the left and a value on the right:

const unwrap = (wrapped) => {
  const [something] = wrapped;

  return something;
}

unwrap(["present"])
  //=> "present"

The statement const [something] = wrapped; destructures the array represented by wrapped, binding the value of its single element to the name something. We can do the same thing with more than one element:

const surname = (name) => {
  const [first, last] = name;

  return last;
}

surname(["Reginald", "Braithwaite"])
  //=> "Braithwaite"

We could do the same thing with (name) => name[1], but destructuring is code that resembles the data it consumes, a valuable coding style.

Destructuring can nest:

const description = (nameAndOccupation) => {
  const [[first, last], occupation] = nameAndOccupation;

  return `${first} is a ${occupation}`;
}

description([["Reginald", "Braithwaite"], "programmer"])
  //=> "Reginald is a programmer"

gathering

Sometimes we need to extract arrays from arrays. Here is the most common pattern: Extracting the head and gathering everything but the head from an array:

const [car, ...cdr] = [1, 2, 3, 4, 5];

car
  //=> 1
cdr
  //=> [2, 3, 4, 5]

car and cdr are archaic terms that go back to an implementation of Lisp running on the IBM 704 computer. Some other languages call them first and butFirst, or head and tail. We will use a common convention and call variables we gather rest, but refer to the ... operation as a “gather,” following Kyle Simpson’s example.1

Alas, the ... notation does not provide a universal pattern-matching capability. For example, we cannot write

const [...butLast, last] = [1, 2, 3, 4, 5];
  //=> ERROR

const [first, ..., last] = [1, 2, 3, 4, 5];
  //=> ERROR
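One workaround, sketched here with the built-in slice method rather than destructuring, is to take the pieces by position. The names butLast and last simply mirror the failed example above.

```javascript
const array = [1, 2, 3, 4, 5];

const butLast = array.slice(0, -1),      // everything except the final element
      last    = array[array.length - 1]; // the final element

butLast
  //=> [1,2,3,4]
last
  //=> 5
```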

Also, it’s important to note that the ... can be at the beginning, for example when spreading arguments into a constructor:

const date = new Date(...[2015, 1, 1]);

Now, when we introduced destructuring, we saw that it is kind-of-sort-of the reverse of array literals. So if

const wrapped = [something];

Then:

const [unwrapped] = wrapped;

What is the reverse of gathering? We know that:

const [car, ...cdr] = [1, 2, 3, 4, 5];

What is the reverse? It would be:

const cons = [car, ...cdr];

Let’s try it:

const oneTwoThree = ["one", "two", "three"];

["zero", ...oneTwoThree]
  //=> ["zero","one","two","three"]

It works! We can use ... to place the elements of an array inside another array. We say that using ... to destructure is gathering, and using it in a literal to insert elements is called “spreading.”

destructuring parameters

Consider the way we pass arguments to parameters:

foo()
bar("smaug")
baz(1, 2, 3)

It is very much like an array literal. And consider how we bind values to parameter names:

const foo = () => ...
const bar = (name) => ...
const baz = (a, b, c) => ...

It looks like destructuring. It acts like destructuring. There is only one difference: We have not tried gathering. Let’s do that:

const numbers = (...nums) => nums;

numbers(1, 2, 3, 4, 5)
  //=> [1,2,3,4,5]

const headAndTail = (head, ...tail) => [head, tail];

headAndTail(1, 2, 3, 4, 5)
  //=> [1,[2,3,4,5]]

Gathering works with parameters! This is very useful indeed, and we’ll see more of it in a moment.2

Stacked Cups

Self-Similarity

Recursion is the root of computation since it trades description for time.—Alan Perlis, Epigrams in Programming

We saw the basic idea that putting an array together with a literal array expression is the reverse or opposite of taking it apart with a destructuring assignment.

Let’s be more specific. Some data structures, like lists, can obviously be seen as a collection of items. Some are empty, some have three items, some forty-two, some contain numbers, some contain strings, some a mixture of elements, there are all kinds of lists.

But we can also define a list by describing a rule for building lists. One of the simplest, and longest-standing in computer science, is to say that a list is:

  1. Empty, or;
  2. Consists of an element concatenated with a list.

Let’s convert our rules to array literals. The first rule is simple: [] is a list. How about the second rule? We can express that using a spread. Given an element e and a list list, [e, ...list] is a list. We can test this manually by building up a list:

[]
//=> []

["baz", ...[]]
//=> ["baz"]

["bar", ...["baz"]]
//=> ["bar","baz"]

["foo", ...["bar", "baz"]]
//=> ["foo","bar","baz"]

Thanks to the parallel between array literals + spreads with destructuring + rests, we can also use the same rules to decompose lists:

const [first, ...rest] = [];
first
  //=> undefined
rest
  //=> []

const [first, ...rest] = ["foo"];
first
  //=> "foo"
rest
  //=> []

const [first, ...rest] = ["foo", "bar"];
first
  //=> "foo"
rest
  //=> ["bar"]

const [first, ...rest] = ["foo", "bar", "baz"];
first
  //=> "foo"
rest
  //=> ["bar","baz"]

For the purpose of this exploration, we will presume the following:3

const isEmpty = ([first, ...rest]) => first === undefined;

isEmpty([])
  //=> true

isEmpty([0])
  //=> false

isEmpty([[]])
  //=> false

Armed with our definition of an empty list and with what we’ve already learned, we can build a great many functions that operate on arrays. We know that we can get the length of an array using its .length. But as an exercise, how would we write a length function using just what we have already?

First, we pick what we call a terminal case. What is the length of an empty array? 0. So let’s start our function with the observation that if an array is empty, the length is 0:

const length = ([first, ...rest]) =>
  first === undefined
    ? 0
    : // ???

We need something for when the array isn’t empty. If an array is not empty, and we break it into two pieces, first and rest, the length of our array is going to be length(first) + length(rest). Well, the length of first is 1, there’s just one element at the front. But we don’t know the length of rest. If only there was a function we could call… Like length!

const length = ([first, ...rest]) =>
  first === undefined
    ? 0
    : 1 + length(rest);

Let’s try it!

length([])
  //=> 0

length(["foo"])
  //=> 1

length(["foo", "bar", "baz"])
  //=> 3

Our length function is recursive, it calls itself. This makes sense because our definition of a list is recursive, and if a list is self-similar, it is natural to create an algorithm that is also self-similar.
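The terminal test first === undefined has a catch worth seeing in action: it also fires for an array that merely contains undefined. The sketch below contrasts the length function above with a variant that checks emptiness using array.length === 0. The name sturdyLength is hypothetical, and leaning on .length somewhat defeats the point of the exercise, so treat this as an illustration of the caveat and not as a recommended implementation.

```javascript
// The version from the text: stops at the first undefined element
const length = ([first, ...rest]) =>
  first === undefined
    ? 0
    : 1 + length(rest);

length(["foo", undefined, "bar"])
  //=> 1

// A sketch of the sturdier test: ask the array whether it is empty
const isEmpty = (array) => array.length === 0;

const sturdyLength = (array) => {
  if (isEmpty(array)) return 0;

  const [first, ...rest] = array;

  return 1 + sturdyLength(rest);
};

sturdyLength(["foo", undefined, "bar"])
  //=> 3
```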

linear recursion

“Recursion” sometimes seems like an elaborate party trick. There’s even a joke about this:

When promising students are trying to choose between pure mathematics and applied engineering, they are given a two-part aptitude test. In the first part, they are led to a laboratory bench and told to follow the instructions printed on the card. They find a bunsen burner, a sparker, a tap, an empty beaker, a stand, and a card with the instructions “boil water.”

Of course, all the students know what to do: They fill the beaker with water, place the stand on the burner and the beaker on the stand, then they turn the burner on and use the sparker to ignite the flame. After a bit the water boils, and they turn off the burner and are led to a second bench.

Once again, there is a card that reads, “boil water.” But this time, the beaker is on the stand over the burner, as left behind by the previous student. The engineers light the burner immediately. Whereas the mathematicians take the beaker off the stand and empty it, thus reducing the situation to a problem they have already solved.

There is more to recursive solutions than simply functions that invoke themselves. Recursive algorithms follow the “divide and conquer” strategy for solving a problem:

  1. Divide the problem into smaller problems
  2. If a smaller problem is solvable, solve the small problem
  3. If a smaller problem is not solvable, divide and conquer that problem
  4. When all small problems have been solved, compose the solutions into one big solution

The big elements of divide and conquer are a method for decomposing a problem into smaller problems, a test for the smallest possible problem, and a means of putting the pieces back together. Our solutions are a little simpler in that we don’t really break a problem down into multiple pieces, we break a piece off the problem that may or may not be solvable, and solve that before sticking it onto a solution for the rest of the problem.
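For contrast, here is a sketch of the fuller form of divide and conquer, where the problem really is split into multiple pieces. The function sumHalves is a hypothetical example, not something we use elsewhere: it divides an array into two halves, solves each half, and composes the two answers with +.

```javascript
// Hypothetical example: sum an array by splitting it into two halves
const sumHalves = (array) => {
  if (array.length === 0) return 0;        // the smallest problems
  if (array.length === 1) return array[0]; // are solved directly

  const middle = Math.floor(array.length / 2);

  // divide into two smaller problems, then compose the solutions with +
  return sumHalves(array.slice(0, middle)) + sumHalves(array.slice(middle));
};

sumHalves([1, 2, 3, 4, 5])
  //=> 15
```

Our [first, ...rest] functions are the special case where one of the two pieces is always a single element.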

A very good recursive algorithm is one that parallels the recursive nature of the data being manipulated. This simpler form of “divide and conquer” is called linear recursion. It’s very useful and simple to understand, and it parallels the linearly self-similar definition we made for lists. Let’s take another example. Sometimes we want to flatten an array, that is, an array of arrays needs to be turned into one array of elements that aren’t arrays.4

We already know how to divide arrays into smaller pieces. How do we decide whether a smaller problem is solvable? We need a test for the terminal case. Happily, there is something along these lines provided for us:

Array.isArray("foo")
  //=> false

Array.isArray(["foo"])
  //=> true

The usual “terminal case” will be that flattening an empty array will produce an empty array. The next terminal case is that if an element isn’t an array, we don’t flatten it, and can put it together with the rest of our solution directly. Whereas if an element is an array, we’ll flatten it and put it together with the rest of our solution.

So our first cut at a flatten function will look like this:

const flatten = ([first, ...rest]) => {
  if (first === undefined) {
    return [];
  }
  else if (!Array.isArray(first)) {
    return [first, ...flatten(rest)];
  }
  else {
    return [...flatten(first), ...flatten(rest)];
  }
}

flatten(["foo", [3, 4, []]])
  //=> ["foo",3,4]

Once again, the solution directly displays the important elements: Dividing a problem into subproblems, detecting terminal cases, solving the terminal cases, and composing a solution from the solved portions.
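For comparison only: JavaScript engines that support ES2019 or later provide a built-in Array.prototype.flat method, and passing Infinity as the depth flattens every level of nesting. This is a point of reference, not what our flatten is built on, and the variable nested is made up for the example.

```javascript
const nested = ["foo", [3, 4, []], [[5], 6]];

// Infinity asks flat to keep going until no nested arrays remain
nested.flat(Infinity)
  //=> ["foo",3,4,5,6]
```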

mapping

Another common problem is applying a function to every element of an array. JavaScript has a built-in function for this, but let’s write our own using linear recursion.
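For reference, the built-in is the map method on arrays. A quick look at it before we write our own, with squares and truths as throwaway names:

```javascript
const squares = [1, 2, 3, 4, 5].map((x) => x * x);

squares
  //=> [1,4,9,16,25]

const truths = [null, true, 25, false, "foo"].map((x) => !!x);

truths
  //=> [false,true,true,false,true]
```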

If we want to square each number in a list, we could write:

const squareAll = ([first, ...rest]) =>
  first === undefined
  ? []
  : [first * first, ...squareAll(rest)];

squareAll([1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

And if we wanted to “truthify” each element in a list, we could write:

const truthyAll = ([first, ...rest]) =>
  first === undefined
    ? []
    : [!!first, ...truthyAll(rest)];

truthyAll([null, true, 25, false, "foo"])
  //=> [false,true,true,false,true]

This specific case of linear recursion is called “mapping,” and it is not necessary to constantly write out the same pattern again and again. Functions can take functions as arguments, so let’s “extract” the thing to do to each element and separate it from the business of taking an array apart, doing the thing, and putting the array back together.

Given the signature:

const mapWith = (fn, array) => // ...

We can write it out using a ternary operator. Even in this small function, we can identify the terminal condition, the piece being broken off, and recomposing the solution.

const mapWith = (fn, [first, ...rest]) =>
  first === undefined
    ? []
    : [fn(first), ...mapWith(fn, rest)];

mapWith((x) => x * x, [1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

mapWith((x) => !!x, [null, true, 25, false, "foo"])
  //=> [false,true,true,false,true]

folding

With the exception of the length example at the beginning, our examples so far all involve rebuilding a solution using spreads. But they needn’t. A function to compute the sum of the squares of a list of numbers might look like this:

const sumSquares = ([first, ...rest]) =>
  first === undefined
    ? 0
    : first * first + sumSquares(rest);

sumSquares([1, 2, 3, 4, 5])
  //=> 55

There are two differences between sumSquares and our maps above:

  1. Given the terminal case of an empty list, we return a 0 instead of an empty list, and;
  2. We add the square of each element to the result of applying sumSquares to the rest of the elements, instead of catenating it onto a list.

Let’s rewrite mapWith so that we can use it to sum squares.

const foldWith = (fn, terminalValue, [first, ...rest]) =>
  first === undefined
    ? terminalValue
    : fn(first, foldWith(fn, terminalValue, rest));

And now we supply a function that does slightly more than our mapping functions:

foldWith((number, rest) =>
  number * number + rest, 0, [1, 2, 3, 4, 5])
  //=> 55
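Because foldWith combines the first element with the result of folding the rest, it works from the right-hand end of the array inward. The sketch below checks that reading against the built-in reduceRight method; addSquare and viaReduceRight are hypothetical names used only here.

```javascript
const foldWith = (fn, terminalValue, [first, ...rest]) =>
  first === undefined
    ? terminalValue
    : fn(first, foldWith(fn, terminalValue, rest));

const addSquare = (number, rest) => number * number + rest;

foldWith(addSquare, 0, [1, 2, 3, 4, 5])
  //=> 55

// reduceRight passes the accumulated result first and the element second,
// so we swap the arguments before calling our function
const viaReduceRight = [1, 2, 3, 4, 5].reduceRight(
  (rest, number) => addSquare(number, rest),
  0
);

viaReduceRight
  //=> 55
```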

Our foldWith function is a generalization of our mapWith function. We can represent a map as a fold, we just need to supply the array rebuilding code:

const squareAll = (array) =>
  foldWith((first, rest) => [first * first, ...rest], [], array);

squareAll([1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

And if we like, we can write mapWith using foldWith:

const mapWith = (fn, array) =>
  foldWith((first, rest) => [fn(first), ...rest], [], array);

const squareAll = (array) => mapWith((x) => x * x, array);

squareAll([1, 2, 3, 4, 5])
  //=> [1,4,9,16,25]

And to return to our first example, our version of length can be written as a fold:

const length = (array) =>
  foldWith((first, rest) => 1 + rest, 0, array);

length([1, 2, 3, 4, 5])
  //=> 5

what does it all mean?

Some data structures, like lists, can be defined as self-similar. When working with a self-similar data structure, a recursive algorithm parallels the data’s self-similarity.

Linear recursion is a basic building block of algorithms. Its basic form parallels the way linear data structures like lists are constructed: This helps make it understandable. Its specialized cases of mapping and folding are especially useful and can be used to build other functions. And finally, while folding is a special case of linear recursion, mapping is a special case of folding.
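As one more illustration of building other functions from a fold, here is a sketch of filtering. The name filterWith is hypothetical and follows the naming of mapWith and foldWith; it inherits the same limitation as foldWith, stopping at the first undefined element.

```javascript
const foldWith = (fn, terminalValue, [first, ...rest]) =>
  first === undefined
    ? terminalValue
    : fn(first, foldWith(fn, terminalValue, rest));

// Hypothetical helper: keep only the elements for which fn returns truthy
const filterWith = (fn, array) =>
  foldWith(
    (first, rest) => fn(first) ? [first, ...rest] : rest,
    [],
    array
  );

filterWith((x) => x % 2 === 1, [1, 2, 3, 4, 5])
  //=> [1,3,5]
```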

And last but certainly not least, destructuring, spreading, and gathering make this very natural to express in JavaScript ES-6.


This post was extracted from a draft of the book, JavaScript Allongé, The “Six” Edition. The extracts so far:


  1. Kyle Simpson is the author of You Don’t Know JS, available here 

  2. Gathering in parameters has a long history, and the usual terms are to call gathering “pattern matching” and to call a name that is bound to gathered values a “rest parameter.” The term “rest” is perfectly compatible with gather: “Rest” is the noun, and “gather” is the verb. We gather the rest of the parameters. 

  3. Well, actually, this does not work for arrays that contain undefined as a value, but we are not going to see that in our examples. A more robust implementation would be (array) => array.length === 0, but we are doing backflips to keep this within a very small and contrived playground. 

  4. flatten is a very simple unfold, a function that takes a seed value and turns it into an array. Unfolds can be thought of as a “path” through a data structure, and flattening a tree is equivalent to a depth-first traverse. 

https://raganwald.com/2015/02/02/destructuring
Why Why Functional Programming Matters Matters

(This was originally posted on Sunday, March 11, 2007)

I recently re-read the amazing paper Why Functional Programming Matters (“WhyFP”). Although I thought that I understood WhyFP when I first read it a few years ago, when I had another look last weekend I suddenly understood that I had missed an important message.1

Now obviously (can you guess from the title?) the paper is about the importance of one particular style of programming, functional programming. And when I first read the paper, I took it at face value: I thought, “Here are some reasons why functional programming languages matter.”

On re-reading it, I see that the paper contains insights that apply to programming in general. I don’t know why this surprises me. The fact is, programming language design revolves around program design. A language’s design reflects the opinions of its creators about the proper design of programs.

In a very real sense, the design of a programming language is a strong expression of the opinions of the designer about good programs. When I first read WhyFP, I thought the author was expressing an opinion about the design of good programming languages. Whereas on the second reading, I realized he was expressing an opinion about the design of good programs.

Can we add through subtraction?

It is a logical impossibility to make a language more powerful by omitting features, no matter how bad they may be. Is this obvious? So how do we explain that one reason Java is considered “better than C++” is because it omits manual memory management? And one reason many people consider Java “better than Ruby” is because you cannot open base classes like String in Java? So no, it is not obvious. Why not?

The key is the word better. It’s not the same as the phrase more powerful.2 The removal or deliberate omission of these features is an expression about the idea that programs which do not use these features are better than programs which do. Any feature (or removal of a feature) which makes the programs written in the language better makes the language better. Thus, it is possible to make a language “better” by removing features that are considered harmful,3 if by doing so it makes programs in the language better programs.

In the opinion of the designers of Java, programs that do not use malloc and free are safer than those that do. And the opinion of the designers of Java is that programs that do not modify base classes like String are safer than those that do. The Java language design emphasizes a certain kind of safety, and to a Java language designer, safer programs are better programs.

“More powerful” is a design goal just like “safer.” But yet, what does it mean? We understand what a safer language is. It’s a language where programs written in the language are safer. But what is a “more powerful” language? That programs written in the language are more powerful? What does that mean? Fewer symbols (the “golf” metric)?

WhyFP asserts that you cannot make a language more powerful through the removal of features. To paraphrase an argument from the paper, if removing harmful features was useful by itself, C and C++ programmers would simply have stopped using malloc and free twenty years ago. Improving on C/C++ was not just a matter of removing malloc and free, it was also a matter of adding automatic garbage collection.

This space, wherein the essay ought to argue that Java compensates for its closed base classes by providing a more powerful substitute feature, left intentionally blank.

At the same time, there is room for arguing that some languages are improved by the removal of harmful features. To understand why they may be improved but not more powerful, we need a more objective definition of what it means for a language to be “more powerful.” Specifically, what quality does a more powerful programming language permit or encourage in programs?

When we understand what makes a program “better” in the mind of a language designer, we can understand the choices behind the language.

Factoring

Factoring a program is the act of dividing it into units that are composed to produce the working software.4 Factoring happens as part of the design. (Re-factoring is the act of rearranging an existing program to be factored in a different way). If you want to compare this to factoring in number theory, a well designed program has lots of factors, like the number 3,628,800 (10!). A Big Ball of Mud is like the number 3,628,811, a prime.

Composition is the construction of programs from smaller programs. So factoring is to composition as division is to multiplication.

Factoring programs isn’t really like factoring simple divisors. The most important reason is that programs can be factored in orthogonal ways. When you break a program into subprograms (using methods, subroutines, functions, what-have-you), that’s one axis of factoring. When you break a program up into modules, that’s another, orthogonal axis of factoring.

Programs that are well-factored are more desirable than programs that are poorly factored.

In computer science, separation of concerns (SoC) is the process of breaking a program into distinct features that overlap in functionality as little as possible. A concern is any piece of interest or focus in a program.

SoC is a long standing idea that simply means a large problem is easier to manage if it can be broken down into pieces; particularly so if the solutions to the sub-problems can be combined to form a solution to the large problem.

The term separation of concerns was probably coined by Edsger W. Dijkstra in his paper On the role of scientific thought. —Excerpts from the Wikipedia entry on Separation of Concerns

Programs that separate their concerns are well-factored. There’s a principle of software development, responsibility-driven design. Each component should have one clear responsibility, and it should have everything it needs to carry out its responsibility.

This is the separation of concerns again. Each component of a program having one clearly defined responsibility means each concern is addressed in one clearly defined place.

Let’s ask a question about Monopoly (and Enterprise software). Where do the rules live? In a noun-oriented design, the rules are smooshed and smeared across the design, because every single object is responsible for knowing everything about everything that it can ‘do’. All the verbs are glued to the nouns as methods.—My favourite interview question

In a game design where you have important information about a rule smeared all over the object hierarchy, you have very poor separation of concerns. It looks at first like there’s a clear factoring “Baltic Avenue has a method called isUpgradableToHotel,” but when you look more closely you realize that every object representing a property is burdened with knowing almost all of the rules of the game.

The concerns are not clearly separated: there’s no one place to look and understand the behaviour of the game.

Programs that separate their concerns are better programs than those that do not. And languages that facilitate this kind of program design are better than those that hamper it.

Power through features that separate concerns

One thing that makes a programming language “more powerful” in my opinion is the provision of more ways to factor programs. Or if you prefer, more axes of composition. The more different ways you can compose programs out of subprograms, the more powerful a language is.

Do you remember Structured Programming? The gist is, you remove goto and you replace it with well-defined control flow mechanisms: some form of subroutine call and return, some form of selection mechanism like Algol-descendant if, and some form of repetition like Common Lisp’s loop macro.

Dijkstra’s view on structured programming was that it promoted the separation of concerns. The factoring of programs into blocks with well-defined control flow made it easy to understand blocks and rearrange programs in different ways. Programs with indiscriminate jumps did not factor well (if at all): they were difficult to understand and often could not be rearranged at all.

Structured 68k ASM programming is straightforward in theory. You just need a lot of boilerplate, design patterns, and the discipline to stick to your convictions. But of course, lots of 68k ASM programming in practice is only partially structured. Statistically speaking, 68k ASM is not a structured programming language even though structured programming is possible in 68k ASM.

Structured Pascal programming is straightforward both in theory and in practice. Pascal facilitates separation of concerns through structured programming. So we say that Pascal “is more powerful than 68k ASM” to mean that in practice, programs written in Pascal are more structured than programs written in 68k ASM because Pascal provides facilities for separating concerns that are missing in 68k ASM.

For example: working with lists

Consider this snippet of iterative code:

int numberOfOldTimers = 0;
for (Employee emp: employeeList) {
    for (Department dept: departmentsInCompany) {
        if (emp.getDepartmentId() == dept.getId() && emp.getYearsOfService() > dept.getAge()) {
            ++numberOfOldTimers;
        }
    }
}

This is an improvement on older practices.5 6 For one thing, the for loops hide the implementation details of iterating over employeeList and departmentsInCompany. Is this better because you have less to type? Yes. Is it better because you eliminate the fence-post errors associated with loop variables? Of course.

But most interestingly, you have the beginnings of a separation of concerns: how to iterate over a single list is separate from what you do in the iteration.

One problem with the for loop is that it can only handle one loop at a time. We have to nest loops to work with two lists at once. This is patently wrong: there’s nothing inherently nested about what we’re trying to do. We can demonstrate this easily: try calling a colleague on the telephone and explaining what we want as succinctly as possible. Do you say “We want a loop inside a loop and inside of that an if, and…”?

No, we say, “We want to count the number of employees that have been with the company longer than their departments have existed.” There’s no discussion of nesting.

In this case, a limitation of our tool has caused our concerns to intermingle again. The concern of “How to find the employees that have been with the company longer than their departments have existed” is intertwined with the concern of “count them.” Let’s try a different notation that separates the details of how to find from the detail of counting what we’ve found:

old_timers = (employees * departments).select do |emp, dept|
  emp.department_id == dept.id && emp.years_of_service > dept.age
end
number_of_old_timers = old_timers.size

Now we have separated the concern of finding from counting. And we have hidden the nesting by using the * operator to create a Cartesian product of the two lists. Now let’s look at what we used to filter the combined list, select. The difference is more than just semantics, or counting characters, or the alleged pleasure of fooling around with closures.

* and select facilitate separating the concern of how to filter things (like iterating over them and applying a test) from the concern of what we want to filter. So languages that make this easy are more powerful than languages that do not, in the sense that they facilitate additional axes of factoring.
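For readers more comfortable in JavaScript, here is a minimal sketch of the same factoring. The cartesianProduct helper and the sample records are assumptions made up for illustration; they stand in for the * operator and for the employees and departments lists used above.

```javascript
// Hypothetical helper: every [a, b] pairing of two arrays.
function cartesianProduct (as, bs) {
  return as.reduce(function (pairs, a) {
    return pairs.concat(bs.map(function (b) { return [a, b]; }));
  }, []);
}

// Made-up sample data, shaped like the records in the examples above.
var employees = [
  { name: "Ada",   departmentId: 1, yearsOfService: 12 },
  { name: "Brian", departmentId: 1, yearsOfService: 3 },
  { name: "Chloe", departmentId: 2, yearsOfService: 7 }
];
var departments = [
  { id: 1, age: 10 },
  { id: 2, age: 5 }
];

// The "what": employees who have outlasted their own department.
var oldTimers = cartesianProduct(employees, departments).filter(function (pair) {
  var emp  = pair[0],
      dept = pair[1];

  return emp.departmentId === dept.id && emp.yearsOfService > dept.age;
});

// Counting is a separate step from finding.
var numberOfOldTimers = oldTimers.length;
  //=> 2
```

The nesting has not gone away, it has moved inside cartesianProduct, which is the point: the how lives in one reusable place and the what reads like the telephone explanation.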

The Telephone Test

Let’s look back a few paragraphs. We have an example of the “Telephone Test:” when code very closely resembles how you would explain your solution over the telephone, we often say it is “very high level.” The usual case is that such code expresses a lot more what and a lot less how. The concern of what has been very clearly separated from the concern of how: you can’t even see the how if you don’t go looking for it.

In general, we think this is a good thing. But it isn’t free: somewhere else there is a mass of code that supports your brevity. When that extra mass of code is built into the programming language, or is baked into the standard libraries, it is nearly free and obviously a Very Good Thing. A language that doesn’t just separate the concern of how but does the work for you is very close to “something for nothing” in programming.

But sometimes you have to write the how as well as the what. It isn’t always handed to you. In that case, it is still valuable, because the resulting program still separates concerns. It still factors into separate components. The components can be changed.

I recently separated the concern of describing “how to generate sample curves for some data mining” from the concern of “managing memory when generating the curves.” I did so by writing my own lazy evaluation code (both the story and the code are online). Here’s the key “what” code that generates an infinite list of parameters for sample Bézier curves:

def magnitudes
  LazyList.binary_search(0.0, 1.0)
end

def control_points
  LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
    Dictionary.new( :x => x, :y => y )
  end
end

def order_one_flows args = {}
  height, width = (args[:height] || 100.0), (args[:width] || 100.0)
  LazyList.cartesian_product(
      magnitudes, control_points, control_points, magnitudes
  ) do |initial_y, p1, p2, final_y|
    FlowParams.new(
      height, width, initial_y * height,
      CubicBezierParams.new(
        :x => width,          :y => final_y * height,
        :x1 => p1.x * width,  :y1 => p1.y * height,
        :x2 => p2.x * width,  :y2 => p2.y * height
      )
    )
  end
end

That’s it. Just as I might tell you on the phone: “Magnitudes” is a list of numbers between zero and one created by repeatedly dividing the intervals in half, like a binary search. “Control Points” is a list of the Cartesian product of magnitudes with itself, with one magnitude assigned to x and the other to y. And so forth.

I will not say that the sum of this code and the code that actually implements infinite lists is shorter than imperative code that would intermingle loops and control structures, entangling what with how. I will say that it separates the concerns of what and how, and it separates them in a different way than select separated the concerns of what and how.
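The lazy list implementation itself isn’t shown here. As a rough sketch of the idea only, this is one way an infinite list of magnitudes could be produced on demand in JavaScript, using a generator. The names binarySearchMagnitudes and take are hypothetical, and the ordering (midpoints level by level, endpoints excluded) is an assumption; the real LazyList.binary_search may seed or order its values differently.

```javascript
// An infinite, lazily generated sequence of midpoints between low and high.
// Assumption: endpoints are excluded and values appear level by level.
function* binarySearchMagnitudes (low, high) {
  let intervals = [[low, high]];

  while (true) {
    const nextIntervals = [];

    for (const [a, b] of intervals) {
      const mid = (a + b) / 2;

      yield mid;
      nextIntervals.push([a, mid], [mid, b]);
    }
    intervals = nextIntervals;
  }
}

// Hypothetical helper: pull the first n values out of a lazy sequence.
function take (n, iterator) {
  const values = [];

  for (let i = 0; i < n; ++i) {
    const step = iterator.next();

    if (step.done) break;
    values.push(step.value);
  }
  return values;
}

take(7, binarySearchMagnitudes(0.0, 1.0));
  //=> [ 0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875 ]
```

Nothing is computed until take asks for it, so the description of the list never has to say how many values are wanted or how much memory they should occupy.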

So why does “Why Functional Programming Matters” matter again?

The great insight is that better programs separate concerns. They are factored more purely, and the factors are naturally along the lines of responsibility (rather than in Jenga piles of abstract virtual base mixin module class proto_ extends private implements). Languages that facilitate better separation of concerns are more powerful in practice than those that don’t.

WhyFP illustrates this point beautifully with the same examples I just gave: first-class functions and lazy evaluation, both prominent features of modern functional languages like Haskell.

WhyFP’s value is that it expresses an opinion about what makes programs better. It backs this opinion up with reasons why modern functional programming languages are more powerful than imperative programming languages. But even if you don’t plan to try functional programming tomorrow, the lessons about better programs are valuable for your work in any language today.

That’s why Why Functional Programming Matters matters.


  1. And now I’m worried: what am I still missing? 

  2. Please let’s not have a discussion about Turing Equivalence. Computer Science “Theory” tells us “there’s no such thing as more powerful.” Perhaps we share the belief that in theory, there’s no difference between theory and practice. But in practice, there is.

  3. I am not making the claim that I consider memory management or unsealed base classes harmful, but I argue that there exists at least one person who does. 

  4. The word “factor” has been a little out of vogue in recent times. But thanks to an excellent post on reddit, it could make a comeback. 

  5. So much so that we won’t even bother to show what loops looked like in the days of for (int i = 0; i < employeeList.size(); ++i)

  6. Another organization might merge employees and departments, or have each department “own” a collection of employees. This makes our example easier, but now the data doesn’t factor well. Everything we’ve learned from databases in the last forty years tells us that we often need to find new ways to compose our data. The relational model factors well. The network model factors poorly. 

https://raganwald.com/2014/12/20/why-why-functional-programming-matters-matters
Fun with Named Functions in JavaScript

In JavaScript, you make a named function like this:

function rank () { return "Captain"; }

A named function is a function declaration if it appears as a statement. For example:

function officer () {
  return rank() + " Reginald Thistleton";
  
  function rank () { return "Captain"; }
}

officer()
  //=> 'Captain Reginald Thistleton'

Captain Reginald Thistleton

The function rank is defined in the function declaration function rank () { return "Captain"; }. We use the function rank in the statement return rank() + " Reginald Thistleton";. We can deduce two things from this:

  1. Declaring a named function binds the function to the name in its surrounding environment. That’s why we can use the function rank within the function officer. Likewise, officer is declared in the global environment, and that’s why we can use it on the Node command line (or wherever we’re testing this code).

  2. We can declare a named function anywhere and its binding can be used everywhere. That’s why we can declare rank at the bottom of the function, but use it at the top.

That’s a function declaration. What about function expressions? As we know, we can declare a function in an expression, meaning we can use it anywhere, like this:

(function () { return "Captain Reginald Thistleton"; })()
  //=> 'Captain Reginald Thistleton'

Or this:

!function () { return "Captain Reginald Thistleton"; }()
  //=> false

Or this:

var reggie = function () { return "Captain Reginald Thistleton"; };

This last statement binds an anonymous function to a variable in its environment. The binding takes place when the statement is executed, not before everything is executed. Therefore, this won’t work:

function officer () {
  return rank() + " " + given() + " Thistleton";
  
  var given = function () { return "Reginald"; };
  
  function rank () { return "Captain"; }
}

officer()
  //=> TypeError: undefined is not a function

But this will:

function officer () {
  var given = function () { return "Reginald"; };
  
  return rank() + " " + given() + " Thistleton";
  
  function rank () { return "Captain"; }
}

officer()
  //=> 'Captain Reginald Thistleton'

So, this is a named function: function rank () { return "Captain"; }, and this is an anonymous function: function () { return "Captain"; }. Pop quiz:

  1. Is function () { return "Reginald"; } an expression or a declaration?
  2. Is function surname () { return "Thistleton"; } an expression or a declaration?

The answers are 1: function () { return "Reginald"; } is always an expression, but 2: function surname () { return "Thistleton"; } can be an expression or a declaration, depending on how you use it. For example:

function officer () {
  var given = function () { return "Reginald"; };
  
  return rank() + " " + given() + " " + surname();
  
  function rank () { return "Captain"; }
  
  function surname () { return "Thistleton"; }
}

officer()
  //=> 'Captain Reginald Thistleton'

And also:

function officer () {
  var given   = function () { return "Reginald"; },
      surname = function family () { return "Thistleton"; };
  
  return rank() + " " + given() + " " + surname();
  
  function rank () { return "Captain"; }
}

officer()
  //=> 'Captain Reginald Thistleton'

We’ve used function family () { return "Thistleton"; } as an expression here, and bound the value to the name surname just as we did with an anonymous function. It’s a named function expression, and it is very interesting.

trace elements

In most environments, there is some way of inspecting the call stack for debugging or documentation purposes. For example:

function officer () {
  return rank() + " " + given() + " Thistleton";
  
  var given = function () { return "Reginald"; };
  
  function rank () { return "Captain"; }
}

officer()
  //=> TypeError: undefined is not a function
         at officer (repl:5:39)

Note the second line: at officer (repl:5:39). We know that the TypeError occurred within the officer function. How does the environment know it’s the officer function? Because we named it in the declaration.

If we use an anonymous function bound to a name, Node can deduce the name of the function:

var officer = function () {
  return rank() + " " + given() + " Thistleton";
  
  var given = function () { return "Reginald"; };
  
  function rank () { return "Captain"; }
}

officer()
  //=> TypeError: undefined is not a function
         at officer (repl:5:39)

But not all functions are so easily deduced. Callbacks in Node, event handlers in the browser, and functions passed to higher-order functions and methods are all often anonymous:

[1962, 6, 14].filter(function (n) { return n <= 12; })
  //=> [ 6 ]

Sometimes, such functions go wrong:

function factorial (n) {
  system.out.println(n);
  
  return n < 2 ? n : n * factorial(n-1);
}

[1962, 6, 14].filter(function (n) { return factorial(n) % 2 == 1; })
  //=> ReferenceError: system is not defined
          at factorial (repl:2:1)
          at repl:1:45
          at Array.filter (native)

We seem to have confused “JavaScript” with “Java,” and looking at the stack trace, we can see it happens within factorial, but what calls it? repl:1:45 is not very helpful. This case is trivial enough to work it out, but lots of stack traces are much deeper and contain multiple anonymous functions.

But we know that a function expression doesn’t need to be anonymous:

function factorial (n) {
  system.out.println(n);
  
  return n < 2 ? n : n * factorial(n-1);
}

[1962, 6, 14].filter(function numbersWithOddFactorials (n) { return factorial(n) % 2 == 1; })
  //=> ReferenceError: system is not defined
          at factorial (repl:2:1)
          at numbersWithOddFactorials (repl:1:70)
          at Array.filter (native)

Naming functions is extremely useful for debugging purposes. There are very few reasons not to name functions. Where by “very few”, we mean “probably zero.”
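Here is a small sketch of the same effect without the REPL. The helper stackOf is made up for this example, and it leans on the stack property of Error objects, which is not part of the language standard but is available in Node and the major browsers.

```javascript
// Hypothetical helper: run fn and hand back the stack trace of whatever it throws.
function stackOf (fn) {
  try {
    fn();
  }
  catch (error) {
    return error.stack;
  }
}

var namedTrace = stackOf(function namedCulprit () { throw new Error("oops"); });

// The name given in the function expression shows up in the trace...
namedTrace.indexOf("namedCulprit") >= 0;
  //=> true

// ...and is also available as the function's name property.
var culpritName = (function namedCulprit () {}).name;
  //=> 'namedCulprit'
```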

Are there any other benefits? Yes.

scope

When we use a named function expression (not a declaration, but an expression), the name of the function is not bound in its enclosing environment:

function officer () {
  var given   = function () { return "Reginald"; },
      surname = function family () { return "Thistleton"; };
  
  return rank() + " " + given() + " " + family();
  
  function rank () { return "Captain"; }
}

officer()
  //=> ReferenceError: family is not defined

So, when we declare a function, its name is bound in the enclosing environment, but when we use the function as an expression, its name is not bound in the enclosing environment. So where is it bound?

Here’s a named function expression: function even (n) { return n == 0 ? true : !even(n-1) }. We’ll use it in an Immediately Invoked Function Expression (“IIFE”):

(function even (n) { return n == 0 ? true : !even(n-1) })(42)
  //=> true

even
  //=> ReferenceError: even is not defined

Aha! The name is bound inside the body of the function. This is very useful if you’re writing a lot of recursive functions, but where else?
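One place the inner binding pays off is when the outer variable gets rebound. In this sketch (the names fact and original are chosen purely for illustration), the recursion keeps working because the body refers to the function’s own name rather than to the variable it was first assigned to:

```javascript
var factorial = function fact (n) {
  return n < 2 ? 1 : n * fact(n - 1);
};

var original = factorial;

// Rebinding the outer variable does not disturb the recursion,
// because the body calls the inner name fact, not factorial.
factorial = function () { return 0; };

original(5);
  //=> 120

// The inner name is still invisible outside the function body.
typeof fact;
  //=> 'undefined'
```

Had the body called factorial instead of fact, original(5) would have multiplied 5 by the result of the replacement function and returned 0.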

class is in session

Well, how about “classes” (please excuse the scare-quotes):

function Board () {
  this.height = Board.defaultHeight;
  this.width  = Board.defaultWidth;
  // ...
}

Board.defaultHeight = Board.defaultWidth = 8;

We’re making a “constructor” function, old-school style, and we’re using properties of the constructor function as the rough equivalent of “class variables” in other languages.

So far there’s nothing special about this, because our constructor is a function declaration. But let’s write a function that generates classes:

function boardMaker (defaultSize) {
  var konstruktor = function Board () {
    this.height = Board.defaultHeight;
    this.width  = Board.defaultWidth;
    // ...
  };

  konstruktor.defaultHeight = konstruktor.defaultWidth = defaultSize;
  
  return konstruktor;
}

Now we can make different board constructors, and each constructor’s Board variable doesn’t conflict with any other constructor’s Board variable:

var Chess   = boardMaker(8),
    Go      = boardMaker(19),
    SmallGo = boardMaker(9);

var board = new Go();

board.height
  //=> 19

closures

Of course, we could accomplish a similar thing by taking advantage of JavaScript’s closures, like this:

function boardMaker (defaultSize) {
  var defaultHeight = defaultSize,
      defaultWidth  = defaultSize;
      
  var konstruktor = function Board () {
    this.height = defaultHeight;
    this.width  = defaultWidth;
    // ...
  };
  
  return konstruktor;
}

We won’t say this is worse or better, but it’s not the same. First, as elegant as a closure is (closures really are awesome), it does use more memory: the JavaScript runtime can’t throw away the invocation environment after boardMaker returns; it has to save it because konstruktor refers to its variables. That might matter in some implementations.

Second, the defaultHeight and defaultWidth variables are only visible to function Board. So given the code we’ve written, we can’t change our minds and write SmallGo.defaultHeight = SmallGo.defaultWidth = 11.

Closures and function properties serve different purposes, and both tools belong in our toolbox.
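To make the difference concrete, here is a sketch that puts the two approaches side by side. The names propertyBoardMaker, closureBoardMaker, and ClosedSmallGo are invented for this comparison; the bodies follow the two versions of boardMaker above.

```javascript
// Defaults stored as properties of the constructor.
function propertyBoardMaker (defaultSize) {
  var konstruktor = function Board () {
    this.height = Board.defaultHeight;
    this.width  = Board.defaultWidth;
  };

  konstruktor.defaultHeight = konstruktor.defaultWidth = defaultSize;

  return konstruktor;
}

// Defaults captured in a closure.
function closureBoardMaker (defaultSize) {
  return function Board () {
    this.height = defaultSize;
    this.width  = defaultSize;
  };
}

var SmallGo       = propertyBoardMaker(9),
    ClosedSmallGo = closureBoardMaker(9);

// Outside code can adjust the property-based defaults...
SmallGo.defaultHeight = SmallGo.defaultWidth = 11;
new SmallGo().height;
  //=> 11

// ...but the same assignment has no effect on the closure version.
ClosedSmallGo.defaultHeight = ClosedSmallGo.defaultWidth = 11;
new ClosedSmallGo().height;
  //=> 9
```

Whether that openness is a feature or a liability depends on whether we want the defaults to be part of the constructor’s public interface.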

in closing

Named functions can be either declared in a statement or used in an expression. Named function expressions create readable stack traces. The name of the function is bound inside its body, and that can be useful. And we can use the name to have a function invoke itself, or to access its properties like any other object.


Happy trails, and if you find functions interesting, you’ll love my book JavaScript Allongé. It’s free to read online and free as in speech!

(discuss on reddit)

https://raganwald.com/2014/10/24/fun-with-named-functions
600 Months

The tweet says it all:

“I have personally found that LISP is unbelievably productive if you’re willing to invest in the 600-month learning curve.”-Paul Ford

Now Mr. Ford is probably exaggerating by a factor of five: I’ll go with Peter Norvig1 and say that Lisp is unbelievably productive if you’re willing to invest in the 120-month (ten-year!) learning curve. But exaggeration or no, doesn’t this seem damning?

Aren’t people productive in Rails or JavaScript in a few years, maybe even a few months? Don’t people learn to write complete Rails or Node applications at “boot camps” in the space of weeks?

define “productive”

Sure, people learn to write complete Rails, Node, Ember, or whatever applications in weeks. If we define “productive” as “being able to make something useful,” you can be productive in Ruby or JavaScript in weeks.

But then again, you can learn to write complete Racket, Clojure, or PureScript programs in weeks as well. Are we using the same definition of “productive” for Lisp as we are for JavaScript?

Somehow, I think we’re “grading on a curve.” We can learn to be “unbelievably productive” in Lisp after a decade of study, but that “unbelievably productive” is a different kind of productive than “whip together a web app using Ember.js” productive.

In my anecdotal experience, many supposedly “advanced” languages (like members of the Lisp family, but also Haskell, Factor, PureScript, and just about anything not mainstream) suffer from the high expectations set for them by enthusiasts.

We have heard about the phenomenal things you can do with them, so we naturally assume that phrases like “productive” or “proficient” or “knows well” apply to doing these wonderful things. Whereas, the pedestrian and “pragmatic” languages like JavaScript or Python aren’t held out as being these magnificent mental force multipliers, so we grade them against our lowered expectations.

ten years

This makes me wonder: What happens if we invest ten years in learning Lisp? Wonderful things, we get a wizard’s hat. If we invest ten years in PureScript I expect we’ll be dreaming of fizz buzz with semigroups and apply.

But what happens if we put ten years of study into JavaScript? Will we be as “incredibly productive” as we would be with ten years of Lisp? Or do we end up with two years of JavaScript productivity and eight years of general-purpose broad principles we borrow from Java and Python?

Either way, we’ll be better after ten years studying something. But perhaps not all somethings are created equal?

What language has the property that after ten years’ study, you are the most incredibly productive? What language has the learning curve that stays steep, longer?

I laugh at the joke about spending 600 months becoming incredibly productive in Lisp. But then I wonder: If I keep studying the things I’m studying today, will I be on my way to becoming incredibly productive? Or will I flatten out long before a decade passes?

Hmmm.


  1. Teach Yourself Programming in Ten Years 

https://raganwald.com/2014/09/28/600-months
Chickens And Pigs

Paul Graham tweeted something interesting, and worth discussing. But first, a fabulous story, The Chicken and the Pig:

A Pig and a Chicken are walking down the road.

The Chicken says: “Hey Pig, I was thinking we should open a restaurant!”

Pig replies: “Hm, maybe, what would we call it?”

The Chicken responds: “How about ‘ham-n-eggs’?”

The Pig thinks for a moment and says: “No thanks. I’d be committed, but you’d only be involved!”

The story is usually told to illustrate the effects that differing levels of commitment have on stakeholders in a project. When writing departmental software, the manager of the department, who is accountable for its productivity, is a pig. The company’s IT manager, on the other hand, is usually a chicken.

This manifests itself in behaviour such as the IT manager demanding SharePoint compatibility, because they’re accountable for the ops personnel training budget.

Such a decision might make the software less usable for the department, but the IT Manager is not responsible for the department’s productivity, and this produces a natural tension between the two managers.

But what about the tweet?

Textbooks

Textbooks are not just expensive, they’re usually terrible. If they were great, the cost would be a minor concern. But when you have a mediocre product, it’s natural to resent being forced to pay through the nose to own it.

terrible textbooks

The mechanism behind terrible textbooks is the same mechanism behind terrible enterprise software. In my experience, enterprise software can be effective when selected by people who don’t have to use it. The key is whether it is selected by pigs. People who have to use it are one kind of pig. People accountable for the productivity it delivers are another kind of pig. As long as an enterprise software project is driven by pigs and the chickens are restricted to an advisory capacity, you can have effective software.

But even one chicken with direct management authority or veto power can ruin an enterprise project: They “coattail” requirements that suit their own objectives at the expense of the project’s success, and you end up with nonsense. (This happens everywhere. Microsoft is suffering now because its “cash cows,” Windows and Office, behaved like chickens on steroids for decades, making almost every other division and department subordinate to their goals.)

So about textbooks. It’s fairly obvious that the problem with textbook quality and price is that the students are pigs and everyone else–faculty, administration, publishers–are chickens with authority. Fixing this problem could be done by routing around the whole system. For example, make universities irrelevant.

why we care about chickens and pigs

Regardless of what we do or do not do with textbooks, we should always pay attention to the chickens and pigs dynamic. It’s an enormously insidious phenomenon, and wherever you find it, you find waste and human misery.

Which also means, wherever you find it, there are opportunities to make money and to make people’s lives better. If you figure out how to disrupt textbooks, you’re cracking a billion dollar market, and helping people get an affordable education. If you figure out how to help enterprise software developers work together, better, you’re cracking another billion dollar market and again making people happy doing something they love.

So in conclusion… Textbooks are an example of the chickens and pigs dynamic. It’s an important dynamic to understand, because it is very common, and if you can crack it, you can make money while making people’s lives better.

Which reminds me of another thing I once read:

“Where there’s muck, there’s brass.”—Paul Graham

https://raganwald.com/2014/08/22/chickens-and-pigs
I Can't Find Good Salespeople

I was meeting Sarah for our regular coffee. Sarah’s one of those born entrepreneurs: she’s built and sold a variety of businesses, and she has a great touch with investments. I pick her brain about business issues, and she picks my brain about tech.

As we took our seats, Sarah sighed with frustration. “I’m trying to get a mobile app put together for the new business, but we simply can’t find any good programmers, do you think—”

I was a little brusque: “Hang on, Sarah, last week we talked about your business the whole time. How about I start?” She laughed, and I took the imaginary talking-stick.

“I can’t find good salespeople! They simply can’t be found anywhere, and I’ve tried.” I went on for a few minutes, and then Sarah gently recited the facts of business life to me:

why i can’t find good salespeople

“Robin, you believe in your product, and I believe in you, but to a salesperson the whole setup is very high risk. You’re offering a generous commission, but with a start-up, the salespeople have to convince customers they have a need AND convince them you can satisfy the need AND close them on satisfying the need now. Or they starve.”

“Salespeople are willing to take risks, but only on the things they can control. Your development schedule, your PR, the receptivity of the market to your innovation… Salespeople have no control over that. Generous commissions can’t compensate for risks they can’t mitigate with their skill and experience.”

“Furthermore, the fact that you aren’t a sales manager with your own track record of success is a red flag to them: There’s no social proof that someone who knows sales thinks this is a great opportunity. And if things aren’t going well, they can’t count on your sales management experience to know how to fix the problem. You are a great guy, you talk at programming conferences, but you have zero credibility as a VP-Sales when you pitch a job selling your product.”

“So, you have to pony up a sensible guaranteed compensation package. And even then, if someone comes to your business for a year and the business flames out, their resumé has a big red flag on it, prospective employers may think they couldn’t sell. That doesn’t matter for inexperienced people, they’re happy just to get a paycheque and say they’ve worked in sales for x number of years.”

motivation

“Let’s think about the more experienced people, the ones with track records. They don’t need another year or five of experience. If they’re any good, they’re not chasing another gold Rolex, they’re chasing success, bragging rights, or a chance to say they made a difference. Given their perception that they may have to tell people the company failed, you’ll have to find another motivation to get them interested.”

“There’s no single answer to the motivation problem. It may be finding someone who’s a little evangelistic, who wants to be consultative and work more closely with customers. Or helping someone grow into marketing or product development. It may be finding someone who’s already fascinated by the problems your business solves. And it may be different for each person. But you certainly can’t expect good people to sign up just because you find some talent and are willing to pay.”

solving my problem

I finished scribbling notes. “So, Sarah, it comes down to this: Good salespeople don’t want to come on board because:”

  1. Their success is gated by risks they can’t control;
  2. My lack of management experience in their domain;
  3. Experienced people don’t need money or another couple of years on their resumé.

“So if I want good people, I have to:”

  1. Base their compensation on risks they can control;
  2. Hire a leader with credibility in their domain;
  3. Work with experienced people to tailor the job to their motivation.

“Got it, thanks.”

solving sarah’s problem

I finished my coffee. “Now, Sarah, what exactly were you asking about hiring good programmers to work in a startup for a non-technical founder?”

(discuss)

https://raganwald.com/2014/08/04/i-cant-find-good-salespeople
A JavaScript Constructor Problem, and Three Solutions
preamble

As you know, you can create new objects in JavaScript using a Constructor Function, like this:

function Fubar (foo, bar) {
  this._foo = foo;
  this._bar = bar;
}

var snafu = new Fubar("Situation Normal", "All Fsked Up");

When you “call” the constructor with the new keyword, you get a new object allocated, and the constructor is called with the new object as the current context. If you don’t explicitly return anything from the constructor, you get the new object as the result.

Thus, the body of the constructor function is used to initialize the newly created object. There’s another thing: The newly created object is initialized to have a prototype. What prototype? The contents of the constructor’s prototype property. So we can write:

Fubar.prototype.concatenated = function () {
  return this._foo + " " + this._bar;
}

snafu.concatenated()
  //=> 'Situation Normal All Fsked Up'
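The steps just described can be sketched as an ordinary function. simulateNew is a hypothetical helper written for this explanation, not something JavaScript provides, and it only approximates what new does for old-school constructor functions like Fubar.

```javascript
// Hypothetical helper that mimics `new Constructor(arg1, arg2, ...)`.
function simulateNew (Constructor) {
  var args = Array.prototype.slice.call(arguments, 1),
      // 1. Allocate an object whose prototype is Constructor.prototype.
      obj  = Object.create(Constructor.prototype),
      // 2. Call the constructor with the new object as its context.
      ret  = Constructor.apply(obj, args);

  // 3. Use the new object unless the constructor returned an object of its own.
  return (ret !== null && (typeof ret === "object" || typeof ret === "function"))
         ? ret
         : obj;
}

function Fubar (foo, bar) {
  this._foo = foo;
  this._bar = bar;
}

Fubar.prototype.concatenated = function () {
  return this._foo + " " + this._bar;
};

var snafu = simulateNew(Fubar, "Situation Normal", "All Fsked Up");

snafu.concatenated();
  //=> 'Situation Normal All Fsked Up'
```

Step 3 is the detail that the auto-instantiation pattern later in this post relies on: a constructor that explicitly returns an object overrides the freshly allocated one.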

Thanks to the internal mechanics of JavaScript’s instanceof operator, we can use it to test whether an object is likely to have been created with a particular constructor:

snafu instanceof Fubar
  //=> true

(It’s possible to “fool” instanceof when working with more advanced idioms, or if you’re the kind of malicious troglodyte who collects language corner cases and enjoys inflicting them on candidates in job interviews. But it works well enough for our purposes.)
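As one illustration of how instanceof can be fooled, here is a small sketch. It relies only on the fact that instanceof inspects the prototype chain rather than recording which constructor ran; the variable names impostor and genuine are made up for this example.

```javascript
function Fubar (foo, bar) {
  this._foo = foo;
  this._bar = bar;
}

// Never touched by the Fubar constructor, yet it passes the test,
// because instanceof only looks at the prototype chain.
var impostor = Object.create(Fubar.prototype);

impostor instanceof Fubar;
  //=> true

// Created by the constructor, but fails once the prototype property is swapped.
var genuine = new Fubar("Situation Normal", "All Fsked Up");
Fubar.prototype = {};

genuine instanceof Fubar;
  //=> false
```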

the problem

What happens if we call the constructor, but accidentally omit the new keyword?

var fubar = Fubar("Fsked Up", "Beyond All Recognition");

fubar
  //=> undefined

William-Thomas-Fredreich!? We’ve called an ordinary function that doesn’t return anything, so fubar is undefined. That’s not what we want. Actually, it’s worse than that:

_foo
  //=> 'Fsked Up'

JavaScript sets this to the global environment by default for calling an ordinary function, so we’ve just blundered about in the global environment. We can fix that somewhat:

function Fubar (foo, bar) {
  "use strict"

  this._foo = foo;
  this._bar = bar;
}

Fubar("Situation Normal", "All Fsked Up");
  //=> TypeError: Cannot set property '_foo' of undefined

Although "use strict" might be omitted from code in blog posts and books (mea culpa!), in production it is very nearly mandatory for reasons just like this. But nevertheless, constructors that do not take into account the possibility of being called without the new keyword are a potential problem.
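The TypeError above follows directly from how strict mode treats the context of an ordinary call. A minimal sketch, with a function name (strictContext) invented for the purpose:

```javascript
function strictContext () {
  "use strict";

  return this;
}

// Called as an ordinary function: strict mode supplies no context at all.
strictContext();
  //=> undefined

// Called with new: the context is the freshly allocated object.
new strictContext() instanceof strictContext;
  //=> true
```

So a strict constructor called without new fails loudly on its first assignment to this, instead of quietly writing to the global environment.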

So what can we do?

solution: auto-instantiation

In Effective JavaScript, David Herman describes auto-instantiation. When we call a constructor with new, the pseudo-variable this is set to a new instance of our “class,” so-to-speak. We can use this to detect whether our constructor has been called with new:

function Fubar (foo, bar) {
  "use strict"

  var obj,
      ret;

  if (this instanceof Fubar) {
    this._foo = foo;
    this._bar = bar;
  }
  else return new Fubar(foo, bar);
}

Fubar("Situation Normal", "All Fsked Up");
  //=>
    { _foo: 'Situation Normal',
      _bar: 'All Fsked Up' }

Why bother making it work without new? One problem this solves is that new Fubar(...) does not compose. Consider:

function logsArguments (fn) {
  return function () {
    console.log.apply(this, arguments);
    return fn.apply(this, arguments)
  }
}

function sum2 (a, b) {
  return a + b;
}

var logsSum = logsArguments(sum2);

logsSum(2, 2)
  //=>
    2 2
    4

logsArguments decorates a function by returning a version of the function that logs its arguments. Let’s try it on the original Fubar:

function Fubar (foo, bar) {
  this._foo = foo;
  this._bar = bar;
}
Fubar.prototype.concatenated = function () {
  return this._foo + " " + this._bar;
}

var LoggingFubar = logsArguments(Fubar);

var snafu = new LoggingFubar("Situation Normal", "All Fsked Up");
  //=> Situation Normal All Fsked Up

snafu.concatenated()
  //=> TypeError: Object [object Object] has no method 'concatenated'

This doesn’t work because snafu is actually an instance of LoggingFubar, not of Fubar. But if we use the auto-instantiating version of Fubar:

function Fubar (foo, bar) {
  "use strict"

  var obj,
      ret;

  if (this instanceof Fubar) {
    this._foo = foo;
    this._bar = bar;
  }
  else {
    obj = new Fubar();
    ret = Fubar.apply(obj, arguments);
    return ret === undefined
           ? obj
           : ret;
  }
}
Fubar.prototype.concatenated = function () {
  return this._foo + " " + this._bar;
}

var LoggingFubar = logsArguments(Fubar);

var snafu = new LoggingFubar("Situation Normal", "All Fsked Up");
  //=> Situation Normal All Fsked Up

snafu.concatenated()
  //=> 'Situation Normal All Fsked Up'

Now it works, but of course snafu is an instance of Fubar, not of LoggingFubar. Is that what you want? Who knows!? This isn’t a justification for the pattern so much as an explanation that it is a useful but leaky abstraction. It doesn’t “just work,” but it can make certain things possible (like decorating constructors) that are otherwise even more awkward to implement.
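To make the leak concrete, here is a small sketch that checks both instanceof relationships. It swaps logsArguments for a hypothetical quiet decorator, recordsArguments, that stores the arguments in an array instead of printing them; everything else follows the auto-instantiating Fubar from this section.

```javascript
var seen = [];

// Hypothetical stand-in for logsArguments: records instead of logging
function recordsArguments (fn) {
  return function () {
    seen.push([].slice.call(arguments));
    return fn.apply(this, arguments);
  };
}

function Fubar (foo, bar) {
  "use strict";

  var obj, ret;

  if (this instanceof Fubar) {
    this._foo = foo;
    this._bar = bar;
  }
  else {
    obj = new Fubar();
    ret = Fubar.apply(obj, arguments);
    return ret === undefined ? obj : ret;
  }
}

var RecordingFubar = recordsArguments(Fubar);
var snafu = new RecordingFubar("Situation Normal", "All Fsked Up");

var isFubar     = snafu instanceof Fubar;
  //=> true
var isRecording = snafu instanceof RecordingFubar;
  //=> false
```

The object that new RecordingFubar allocates is thrown away: the decorator returns the object Fubar built, and when a function called with new returns an object, that object becomes the result.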

solution: overload its meaning

It can be very handy to have a function that tests for an object being an instance of a particular class. If we can stomach the idea of one function doing two different things, we can make the constructor its own instanceof test:

function Fubar (foo, bar) {
  "use strict"

  if (this instanceof Fubar) {
    this._foo = foo;
    this._bar = bar;
  }
  else return arguments[0] instanceof Fubar;
}

var snafu = new Fubar("Situation Normal", "All Fsked Up");

snafu
  //=>
    { _foo: 'Situation Normal',
      _bar: 'All Fsked Up' }

Fubar({})
  //=> false
Fubar(snafu)
  //=> true

This allows us to use the constructor as an argument in predicate and multiple dispatch, or as a filter:

var arrayOfSevereProblems = problems.filter(Fubar);

solution: kill it with fire

If we don’t have some pressing need for auto-instantiation, and if we care not for overloaded functions, we may wish to avoid accidentally calling a constructor without using new. We saw that "use strict" can help, but it’s not a panacea. It won’t throw an error if we don’t actually try to assign a value to the global environment. And if we try to do something before assigning a value, it will do that thing no matter what.
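Here is a small sketch of that last point. The auditLog array is a hypothetical stand-in for any side effect a constructor might perform before it touches this.

```javascript
var auditLog = [];

function Fubar (foo, bar) {
  "use strict";

  // This side effect runs before any assignment to `this`,
  // so strict mode has no chance to stop it.
  auditLog.push("created a Fubar");

  this._foo = foo;
  this._bar = bar;
}

var failure;

try {
  Fubar("Situation Normal", "All Fsked Up");
}
catch (error) {
  failure = error;
}

failure instanceof TypeError;
  //=> true
auditLog;
  //=> [ 'created a Fubar' ]
```

The call fails, but only after the log has already been written.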

Perhaps it’s better to take matters into our own hands. Olivier Scherrer suggests the following pattern:

function Fubar (foo, bar) {
  "use strict"

  if (!(this instanceof Fubar)) {
      throw new Error("Fubar needs to be called with the new keyword");
  }

  this._foo = foo;
  this._bar = bar;
}

Fubar("Situation Normal", "All Fsked Up");
  //=> Error: Fubar needs to be called with the new keyword

Simple and safer than only relying on "use strict". If you like having a simple instanceof test, you can bake it into the constructor as a function method:

Fubar.is = function (obj) {
  return obj instanceof Fubar;
}

var arrayOfSevereProblems = problems.filter(Fubar.is);

There you have it: Constructors that fail when called without new are a potential problem, and three solutions we can use are, respectively, auto-instantiation, overloading the constructor, or killing such calls with fire.

https://raganwald.com/2014/07/09/javascript-constructor-problem
Greenspunning Predicate and Multiple Dispatch in JavaScript

Pattern matching is a feature found (with considerable rigor) in functional languages like Erlang and Haskell. In mathematics, algorithms and problems can often be solved by breaking them down into simple cases. Sometimes, those cases are reductions of the original problem.

A famous example is the naïve expression of the factorial function. The factorial of a non-negative integer n is the product of all positive integers less than or equal to n. For example, factorial(5) is equal to 5 * 4 * 3 * 2 * 1, or 120.

The algorithm to compute it can be expressed as two cases. Let’s pretend that JavaScript has pattern-matching baked in:

function factorial (1) {
  return 1;
}

function factorial (n > 1) {
  return n * factorial(n - 1);
}

This can be done with an if statement, of course, but the benefit of breaking problems down by cases is that we can combine small pieces of code in a way that does not tightly couple them.

We can fake a simple form of pattern matching in JavaScript, and we’ll see later that it will be very useful for implementing multiple dispatch.

prelude: return values

Let’s start with a convention: Methods and functions must return something if they successfully handle a method invocation, or raise an exception if they catastrophically fail. They cannot return undefined (which in JavaScript, also includes not explicitly returning something).

For example:

// returns a value, so it is successful
function sum (a, b) {
  return a + b;
}

// returns this, so it is successful
function fluent (x, y, z) {
  // do something
  return this;
}

// returns undefined, so it is not successful
function fail () {
  return undefined;
}

// decorates a function by making it into a fail
function dont (fn) {
  return fail;
}

// logs something and fails,
// because it doesn't explicitly return anything
function logToConsole () {
  console.log.apply(null, arguments);
}

guarded functions

We can write ourselves a simple method decorator that guards a function or method, and fails if the guard function fails on the arguments provided. It’s self-currying to facilitate writing utility guards:

function when (guardFn, optionalFn) {
  function guarded (fn) {
    return function () {
      if (guardFn.apply(this, arguments))
        return fn.apply(this, arguments);
    };
  }
  return optionalFn == null
         ? guarded
         : guarded(optionalFn);
}

when(function (x) {return x != null; })(function () { return "hello world"; })();
  //=> undefined

when(function (x) {return x != null; })(function () { return "hello world"; })(1);
  //=> 'hello world'

when is useful independently of our work here, and that’s a good thing: Whenever possible, we don’t just make complicated things out of simpler things, we make them out of reusable simpler things. Now we can compose our guarded functions. Match takes a list of methods and applies them in order, stopping when one of the methods returns a value that is not undefined.

function Match () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var i,
        value;

    for (i in fns) {
      value = fns[i].apply(this, arguments);

      if (value !== undefined) return value;
    }
  };
}

// Some predicates to make it easy to write patterns
function equals (x) {
  return function eq (y) { return (x === y); };
}

function greaterThan (x) {
  return function gt (y) { return (y > x); };
}

var factorial = Match(
  when(equals(1),      function (n) { return 1; }),
  when(greaterThan(1), function (n) { return n * factorial(n-1); })
);

factorial(5);
  //=> 120

This is called predicate dispatch: we’re dispatching a function call to another function based on a series of predicates we apply to the arguments. Predicate dispatch declutters individual cases and composes functions and methods from smaller, simpler components that are decoupled from each other.
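One consequence of that decoupling, sketched here under the assumption that when and Match behave as defined above (when is restated without its self-currying, and Match with an index loop, so the example runs on its own): a matched function is itself a valid case, so new cases can be added from the outside. The name factorialFromZero is invented for the illustration.

```javascript
function when (guardFn, fn) {
  return function () {
    if (guardFn.apply(this, arguments))
      return fn.apply(this, arguments);
  };
}

function Match () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var i, value;

    for (i = 0; i < fns.length; ++i) {
      value = fns[i].apply(this, arguments);
      if (value !== undefined) return value;
    }
  };
}

function equals (x) { return function (y) { return x === y; }; }
function greaterThan (x) { return function (y) { return y > x; }; }

var factorial = Match(
  when(equals(1),      function (n) { return 1; }),
  when(greaterThan(1), function (n) { return n * factorial(n - 1); })
);

factorial(0);
  //=> undefined, because no case matches

// Extend the existing function without editing either of its cases
var factorialFromZero = Match(
  when(equals(0), function (n) { return 1; }),
  factorial
);

factorialFromZero(0);
  //=> 1
factorialFromZero(5);
  //=> 120
```

Note that an unmatched call fails quietly by returning undefined, which is exactly the convention from the prelude.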

The Expression Problem

The expression problem originated as follows: Given a set of entities and a set of operations on those entities, how do we add new entities and new operations, without recompiling, without unsafe operations like casts, and while maintaining type safety?

The general form of the problem does not concern type safety, but does concern the elegance of the design.

The Expression Problem is a programming design challenge: Given two orthogonal concerns of equal importance, how do we express our programming solution in such a way that neither concern becomes secondary?

An example given in the c2 wiki concerns a set of shapes (circle, square) and a set of calculations on those shapes (circumference, area). We could write this using metaobjects:

var Square = encapsulate({
	constructor: function (length) {
		this.length = length;
	},
	circumference: function () {
		return this.length * 4;
	},
	area: function () {
		return this.length * this.length;
	}
});

var Circle = encapsulate({
	constructor: function (radius) {
		this.radius = radius;
	},
	circumference: function () {
		return Math.PI * 2.0 * this.radius;
	},
	area: function () {
		return Math.PI * this.radius * this.radius;
	}
});

Or functions on structs:

var Square = Struct('Square', 'length');
var Circle = Struct('Circle', 'radius');

function circumference (shape) {
	if (Square(shape)) {
		return shape.length * 4;
	}
	else if (Circle(shape)) {
		return Math.PI * 2.0 * shape.radius;
	}
}

function area (shape) {
	if (Square(shape)) {
		return shape.length * shape.length;
	}
	else if (Circle(shape)) {
		return Math.PI * shape.radius * shape.radius;
	}
}

Both of these solutions make one thing a first-class citizen and the other a second-class citizen. The object solution makes shapes first-class, and operations second-class. The function solution makes operations first-class, and shapes second-class. We can see this by adding new functionality:

  1. If we add a new shape (e.g. Triangle), it’s easy with the object solution: Everything you need to know about a triangle goes in one place. But it’s hard with the function solution: We have to carefully add a case to each function covering triangles.
  2. If we add a new operation (e.g. boundingBox, which returns the smallest square that encloses the shape), it’s easy with the function solution: We add a new function and make sure it has a case for each kind of shape. But it’s hard with the object solution: We have to make sure that we add a new method to each object.

In a simple (two objects and two methods) example, the expression problem does not seem like much of a stumbling block. But imagine we are operating at scale, with a hierarchy of classes that have methods at every level of the ontology. Adding new operations can be messy, especially in a language that does not have type checking to make sure we cover all of the appropriate cases.

And the functions-first approach is equally messy in contemporary software. It’s a very sensible technique when we program with a handful of canonical data structures and want to make many operations on those data structures. This is why, despite decades of attempts to write Object-Relational Mapping libraries, PL/SQL is not going away. Given a slowly-changing database schema, it’s far easier to write a new procedure that operates across tables, than to try to write methods on objects representing a single entity in a table.

dispatches from space

There’s a related problem. Consider some kind of game involving meteors that fall from the sky towards the Earth. You have fighters of some kind that fly around and try to shoot the meteors. We have an established way of handling a meteor hitting the Earth or a fighter flying into the ground and crashing: We write a .hitsGround() method for meteors and for fighters.

Whenever something hits the ground, we invoke its .hitsGround() method, and it handles the rest. A fighter hitting the ground will cost so many victory points and trigger a certain animation. A meteor hitting the ground will cost a different number of victory points and trigger a different animation.

And it’s easy to add new kinds of things that can hit the ground. As long as they implement .hitsGround(), we’re good. Each object knows what to do.

This resembles encapsulation, but it’s actually called ad hoc polymorphism. It’s not an object hiding its state from tampering, it’s an object hiding its semantic type from the code that uses it. Fighters and meteors both have the same structural type, but different semantic types and different behaviour.

“Standard” OO, as practiced by Smalltalk and its descendants on down to JavaScript, makes heavy use of polymorphism. The mechanism is known as single dispatch because given an expression like a.b(c,d), the choice of method to invoke given the method b is made based on a single receiver, a. The identities of c and d are irrelevant to choosing the code to handle the method invocation.

Single-dispatch handles crashing into the ground brilliantly. It also handles things like adjusting the balance of a bank account brilliantly. But not everything fits the single dispatch model.
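A minimal sketch of single dispatch at work follows. The point values and the groundCollision helper are hypothetical; only the .hitsGround() convention comes from the discussion above.

```javascript
var FighterPrototype = {
  hitsGround: function () {
    return "fighter crashes: lose 100 points";
  }
};

var MeteorPrototype = {
  hitsGround: function () {
    return "meteor impacts: lose 10 points";
  }
};

function groundCollision (thing, location) {
  // Only the receiver, `thing`, selects the method body.
  // `location` is passed along but plays no part in the choice.
  return thing.hitsGround(location);
}

groundCollision(Object.create(FighterPrototype), "desert");
  //=> 'fighter crashes: lose 100 points'
groundCollision(Object.create(MeteorPrototype), "desert");
  //=> 'meteor impacts: lose 10 points'
```

groundCollision never asks what kind of thing it was handed, so a new kind of falling object only has to bring its own hitsGround.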

Consider a fighter crashing into a meteor. Or another fighter. Or a meteor crashing into a fighter. Or a meteor crashing into another meteor. If we write a method like .crashInto(otherObject), then right away we have an antipattern: there are things that ought to be symmetrical, but we’re forcing an asymmetry on them. This is vaguely like forcing class A to extend B because we don’t have a convenient way to compose metaobjects.

In languages with no other option, we’re forced to do things like have one object’s method know an extraordinary amount of information about another object. For example, if a fighter’s .crashInto(otherObject) method can handle crashing into meteors, we’re imbuing fighters with knowledge about meteors.

double dispatch

Over time, various ways to handle this problem with single dispatch have arisen. One way is to have a polymorphic method invoke another object’s polymorphic methods. For example:

var FighterPrototype = {
	crashInto: function (otherObject) {
		this.collide();
		otherObject.collide();
		this.destroyYourself();
		otherObject.destroyYourself();
	},
	collide: function () {
		// ...
	},
	destroyYourself: function () {
		// ...
	}
}

In this scheme, each object knows how to collide and how to destroy itself. So a fighter doesn’t have to know about meteors, just to trust that they implement .collide() and .destroyYourself(). Of course, this presupposes that a collision between objects can be subdivided into independent behaviour.

What if, for example, we have special scoring for ramming a meteor, or perhaps a sarcastic message to display? What if meteors are unharmed if they hit a fighter but shatter into fragments if they hit each other?

A pattern for handling this is called double-dispatch. It is a little more elegant in manifestly typed languages than in dynamically typed languages, but such superficial elegance is simply masking some underlying issues. Here’s how we could implement collisions with special cases:

var FighterPrototype = {
	crashInto: function (objectThatCrashesIntoFighters) {
		return objectThatCrashesIntoFighters.isStruckByAFighter(this)
	},
	isStruckByAFighter: function (fighter) {
		// handle fighter-fighter collisions
	},
	isStruckByAMeteor: function (meteor) {
		// handle fighter-meteor collisions
	}
}

var MeteorPrototype = {
	crashInto: function (objectThatCrashesIntoMeteors) {
		return objectThatCrashesIntoMeteors.isStruckByAMeteor(this)
	},
	isStruckByAFighter: function (fighter) {
		// handle meteor-fighter collisions
	},
	isStruckByAMeteor: function (meteor) {
		// handle meteor-meteor collisions
	}
}

var someFighter = Object.create(FighterPrototype),
    someMeteor  = Object.create(MeteorPrototype);

someFighter.crashInto(someMeteor);

In this scheme, when we call someFighter.crashInto(someMeteor), FighterPrototype.crashInto invokes someMeteor.isStruckByAFighter(someFighter), and that handles the specific case of a meteor being struck by a fighter.
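To trace the two hops, here is the same structure with the placeholder comments swapped for hypothetical return strings, so that each call reports which handler finally ran:

```javascript
var FighterPrototype = {
  crashInto: function (other) {
    // first dispatch: on the receiver (a fighter)
    return other.isStruckByAFighter(this);
  },
  isStruckByAFighter: function (fighter) { return "fighter struck by fighter"; },
  isStruckByAMeteor:  function (meteor)  { return "fighter struck by meteor"; }
};

var MeteorPrototype = {
  crashInto: function (other) {
    // first dispatch: on the receiver (a meteor)
    return other.isStruckByAMeteor(this);
  },
  isStruckByAFighter: function (fighter) { return "meteor struck by fighter"; },
  isStruckByAMeteor:  function (meteor)  { return "meteor struck by meteor"; }
};

var someFighter = Object.create(FighterPrototype),
    someMeteor  = Object.create(MeteorPrototype);

// second dispatch: on the argument, via the method name chosen in the first
someFighter.crashInto(someMeteor);
  //=> 'meteor struck by fighter'
someMeteor.crashInto(someFighter);
  //=> 'fighter struck by meteor'
```

The type of the receiver is encoded in the name of the second method, which is why every prototype needs one isStruckByA... method per kind of object in the game.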

To make this work, both fighters and meteors need to know about each other. They are coupled. And as we add more types of objects (observation balloons? missiles? clouds? bolts of lightning?), our changes must be spread across our prototypes. It is obvious that this system is highly inflexible. The principle of messages and encapsulation is ignored: we are simply using JavaScript’s method dispatch system to achieve a result, rather than modeling entities.

Generally speaking, double dispatch is considered a red flag. Sometimes it’s the best technique to use, but often it’s a sign that we have chosen the wrong abstractions.

Multiple Dispatch

JavaScript’s single-dispatch system makes it difficult to model interactions that vary on two (or more) semantic types. Our example was modeling collisions between fighters and meteors, where we want to have different outcomes depending upon whether a fighter or a meteor collided with another fighter or a meteor.

Languages such as Common Lisp bake support for this problem right in, by supporting multiple dispatch. With multiple dispatch, generic functions can be specialized depending upon any of their arguments. In this example, we’re defining forms of collide to work with meteors and fighters:

(defmethod collide ((object-1 meteor) (object-2 fighter))
   (format t "meteor ~a collides with fighter ~a" object-1 object-2))

(defmethod collide ((object-1 meteor) (object-2 meteor))
   (format t "meteor ~a collides with another meteor ~a" object-1 object-2))

Common Lisp’s generic functions use dynamic dispatch on both object-1 and object-2 to determine which body of collide to evaluate. Meaning, both types are checked at run time, at the time when the function is invoked. Since more than one argument is checked dynamically, we say that Common Lisp has multiple dispatch.

Manifestly typed OO languages like Java appear to support multiple dispatch. You can create one method with several signatures, something like this:

interface Collidable {
  public void crashInto(Meteor meteor);
  public void crashInto(Fighter fighter);
}

class Meteor implements Collidable {
  public void crashInto(Meteor meteor) { /* meteor-meteor */ }
  public void crashInto(Fighter fighter) { /* meteor-fighter */ }
}

class Fighter implements Collidable {
  public void crashInto(Meteor meteor) { /* fighter-meteor */ }
  public void crashInto(Fighter fighter) { /* fighter-fighter */ }
}

Alas, this won’t work. Although we can specialize crashInto by the type of its argument, the Java compiler resolves this specialization at compile time, not run time. It’s early bound. Thus, if we write something like this pseudo-Java:

Collidable thingOne = Math.random() < 0.5 ? new Meteor() : new Fighter();
Collidable thingTwo = Math.random() < 0.5 ? new Meteor() : new Fighter();

thingOne.crashInto(thingTwo);

It won’t even compile! The compiler can figure out that thingOne is a Collidable and that it has two different signatures for the crashInto method, but all it knows about thingTwo is that it’s a Collidable. It doesn’t know whether it should be compiling an invocation of crashInto(Meteor meteor) or crashInto(Fighter fighter), so it refuses to compile this code.

Java’s system uses dynamic dispatch for the receiver of a method: The class of the receiver is determined at run time and the appropriate method is determined based on that class. But it uses static dispatch for the specialization based on arguments: The compiler sorts out which specialization to invoke based on the declared type of the argument at compile time. If it can’t sort that out, the code does not compile.

Java may have type signatures to specialize methods, but it is still single dispatch, just like JavaScript.

emulating multiple dispatch

JavaScript cannot do true multiple dispatch without some ridiculous greenspunning of method invocations. But we can fake it pretty reasonably using predicate dispatch.

We start with the same convention: Methods and functions must return something if they successfully handle a method invocation, or raise an exception if they catastrophically fail. They cannot return undefined (which in JavaScript, also includes not explicitly returning something).

Recall that this allowed us to write the Match function that took a series of guards, functions that checked to see if the value of arguments was correct for each case. Our general-purpose guard, when, took all of the arguments as parameters.

What we want is to write guards for each argument. So we’ll write whenArgsAre, a guard that takes predicates for each argument as well as the body of the function case:

function whenArgsAre () {
  var matchers = [].slice.call(arguments, 0, arguments.length - 1),
      body     = arguments[arguments.length - 1];

  return function () {
    var i,
        arg,
        value;

    if (arguments.length != matchers.length) return;
    for (i in arguments) {
      arg = arguments[i];
      if (!matchers[i].call(this, arg)) return;
    }
    value = body.apply(this, arguments);
    return value === undefined
           ? null
           : value;
  };
}

// handy predicates for testing the "type" of arguments
function instanceOf (clazz) {
  return function (arg) {
    return arg instanceof clazz;
  };
}

function isOfType (type) {
  return function (arg) {
    return typeof(arg) === type;
  };
}

function isPrototypeOf (proto) {
  return Object.prototype.isPrototypeOf.bind(proto);
}

function Fighter () {};
function Meteor () {};

var handlesManyCases = Match(
  whenArgsAre(
    instanceOf(Fighter), instanceOf(Meteor),
    function (fighter, meteor) {
      return "a fighter has hit a meteor";
    }
  ),
  whenArgsAre(
    instanceOf(Fighter), instanceOf(Fighter),
    function (fighter, otherFighter) {
      return "a fighter has hit another fighter";
    }
  ),
  whenArgsAre(
    instanceOf(Meteor), instanceOf(Fighter),
    function (meteor, fighter) {
      return "a meteor has hit a fighter";
    }
  ),
  whenArgsAre(
    instanceOf(Meteor), instanceOf(Meteor),
    function (meteor, otherMeteor) {
      return "a meteor has hit another meteor";
    }
  )
);

handlesManyCases(new Meteor(),  new Meteor());
  //=> 'a meteor has hit another meteor'
handlesManyCases(new Fighter(), new Meteor());
  //=> 'a fighter has hit a meteor'

Our Match function now allows us to build generic functions that dynamically dispatch on all of their arguments. They work just fine for creating multiply dispatched methods:

var FighterPrototype = {},
    MeteorPrototype  = {};

FighterPrototype.crashInto = Match(
  whenArgsAre(
    isPrototypeOf(FighterPrototype),
    function (fighter) {
      return "fighter(fighter)";
    }
  ),
  whenArgsAre(
    isPrototypeOf(MeteorPrototype),
    function (meteor) {
      return "fighter(meteor)";
    }
  )
);

MeteorPrototype.crashInto = Match(
  whenArgsAre(
    isPrototypeOf(FighterPrototype),
    function (fighter) {
      return "meteor(fighter)";
    }
  ),
  whenArgsAre(
    isPrototypeOf(MeteorPrototype),
    function (meteor) {
      return "meteor(meteor)";
    }
  )
);

var someFighter = Object.create(FighterPrototype),
    someMeteor  = Object.create(MeteorPrototype);

someFighter.crashInto(someMeteor);
  //=> 'fighter(meteor)'

We now have usable dynamic multiple dispatch for generic functions and for methods. It’s built on predicate dispatch, so it plays well with other kinds of predicates for each argument.
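As a sketch of what “plays well with other kinds of predicates” can look like, here is a generic function that mixes instanceOf with isOfType. The describeHit function and its messages are invented for the illustration; Match and whenArgsAre are restated (with index loops) so the example runs on its own.

```javascript
function Match () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var i, value;

    for (i = 0; i < fns.length; ++i) {
      value = fns[i].apply(this, arguments);
      if (value !== undefined) return value;
    }
  };
}

function whenArgsAre () {
  var matchers = [].slice.call(arguments, 0, arguments.length - 1),
      body     = arguments[arguments.length - 1];

  return function () {
    var i, value;

    if (arguments.length !== matchers.length) return;
    for (i = 0; i < arguments.length; ++i) {
      if (!matchers[i].call(this, arguments[i])) return;
    }
    value = body.apply(this, arguments);
    return value === undefined ? null : value;
  };
}

function instanceOf (clazz) {
  return function (arg) { return arg instanceof clazz; };
}

function isOfType (type) {
  return function (arg) { return typeof arg === type; };
}

function Fighter () {}
function Meteor () {}

// Dispatches on the class of the first argument and the primitive type of the second
var describeHit = Match(
  whenArgsAre(instanceOf(Fighter), isOfType('number'),
    function (fighter, damage) { return "fighter takes " + damage + " damage"; }),
  whenArgsAre(instanceOf(Fighter), isOfType('string'),
    function (fighter, message) { return "fighter receives: " + message; }),
  whenArgsAre(instanceOf(Meteor), isOfType('number'),
    function (meteor, damage) { return "meteor shrugs off " + damage + " damage"; })
);

describeHit(new Fighter(), 3);
  //=> 'fighter takes 3 damage'
describeHit(new Fighter(), "eject!");
  //=> 'fighter receives: eject!'
describeHit(new Meteor(), "eject!");
  //=> undefined, no case covers a meteor and a string
```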

caveat

Consider the following problem:

We wish to create a specialized entity, an ArmoredFighter that behaves just like a regular fighter, only when it strikes another fighter it has some special behaviour.

var ArmoredFighterPrototype = Object.create(FighterPrototype);

ArmoredFighterPrototype.crashInto = Match(
  whenArgsAre(
    isPrototypeOf(FighterPrototype),
    function (fighter) {
      return "armored-fighter(fighter)";
    }
  )
);

Our thought is that we are “overriding” the behaviour of crashInto when an armored fighter crashes into any other kind of fighter. But we wish to retain the behaviour we have already designed when an armored fighter crashes into a meteor.

This is not going to work. Although we have written our code such that the various cases and predicates are laid out separately, at run time they are composed opaquely into functions. As far as JavaScript is concerned, we’ve written:

var FighterPrototype = {};

FighterPrototype.crashInto = function (q) {
  // black box
};

var ArmoredFighterPrototype = Object.create(FighterPrototype);

ArmoredFighterPrototype.crashInto = function (q) {
  // black box
};

We’ve written code that composes, but it doesn’t decompose. We’ve made it easy to manually take the code for these functions apart, inspect their contents, and put them back together in new ways, but it’s impossible for us to write code that inspects and decomposes the code.

A better design might incorporate reflection and decomposition at run time.
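Short of that better design, here is a sketch that first demonstrates the failure and then works around it by hand. The workaround is an assumption of this example, not a feature of Match: it simply lists the overridden method as the final case, which is manual delegation rather than reflection. The names tank and rock are hypothetical, and Match and whenArgsAre are restated so the example runs on its own.

```javascript
function Match () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var i, value;

    for (i = 0; i < fns.length; ++i) {
      value = fns[i].apply(this, arguments);
      if (value !== undefined) return value;
    }
  };
}

function whenArgsAre () {
  var matchers = [].slice.call(arguments, 0, arguments.length - 1),
      body     = arguments[arguments.length - 1];

  return function () {
    var i, value;

    if (arguments.length !== matchers.length) return;
    for (i = 0; i < arguments.length; ++i) {
      if (!matchers[i].call(this, arguments[i])) return;
    }
    value = body.apply(this, arguments);
    return value === undefined ? null : value;
  };
}

function isPrototypeOf (proto) {
  return Object.prototype.isPrototypeOf.bind(proto);
}

var FighterPrototype = {},
    MeteorPrototype  = {};

FighterPrototype.crashInto = Match(
  whenArgsAre(isPrototypeOf(FighterPrototype), function (fighter) { return "fighter(fighter)"; }),
  whenArgsAre(isPrototypeOf(MeteorPrototype),  function (meteor)  { return "fighter(meteor)"; })
);

var ArmoredFighterPrototype = Object.create(FighterPrototype);

var armoredCase = whenArgsAre(
  isPrototypeOf(FighterPrototype),
  function (fighter) { return "armored-fighter(fighter)"; }
);

ArmoredFighterPrototype.crashInto = Match(armoredCase);

var tank = Object.create(ArmoredFighterPrototype),
    rock = Object.create(MeteorPrototype);

var before = tank.crashInto(rock);
  //=> undefined, the meteor case was shadowed rather than inherited

// Workaround: name the method we are overriding as the last case
ArmoredFighterPrototype.crashInto = Match(armoredCase, FighterPrototype.crashInto);

var after = tank.crashInto(rock);
  //=> 'fighter(meteor)'
```

This restores the inherited behaviour, but only because we wrote the name of the parent prototype into the child. That is precisely the kind of coupling that a design with run-time reflection would avoid.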

post-scriptum

Our Match function is fairly simple, but it has a drawback: The functions it creates have no name and length. This means that it will not compose nicely with other JavaScript functional techniques such as creating variadic functions or currying.
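The drawback is easy to observe. This sketch restates the simple when (without self-currying) and Match (with an index loop) so that it runs on its own, then inspects the function they produce:

```javascript
function when (guardFn, fn) {
  return function () {
    if (guardFn.apply(this, arguments))
      return fn.apply(this, arguments);
  };
}

function Match () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var i, value;

    for (i = 0; i < fns.length; ++i) {
      value = fns[i].apply(this, arguments);
      if (value !== undefined) return value;
    }
  };
}

function equals (x) { return function (y) { return x === y; }; }
function greaterThan (x) { return function (y) { return y > x; }; }

var factorial = Match(
  when(equals(1),      function factorial (n) { return 1; }),
  when(greaterThan(1), function (n) { return n * factorial(n - 1); })
);

factorial(5);
  //=> 120
factorial.name;
  //=> '', even though one of the cases is named
factorial.length;
  //=> 0, even though every case takes one parameter
```

Any utility that relies on fn.length to decide how many arguments to collect, as many currying helpers do, would treat this factorial as taking no arguments at all.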

To fix that, we can add some extra bits that extract the name and length from the cases we provide:

// "nameAndLength" and "imitate" are not strictly necessary to understand what we're
// doing, but they do help us write functions that preserve the name and arity
// of functions we work with. This is very helpful if we combine these techniques
// with other utilities that perform partial application and/or currying.

function nameAndLength(name, length, body) {
  var abcs = [ 'q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p',
               'a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l',
               'z', 'x', 'c', 'v', 'b', 'n', 'm' ],
      pars = abcs.slice(0, length),
      src  = "(function " + name + " (" + pars.join(',') + ") { return body.apply(this, arguments); })";

  return eval(src);
}

function imitate(exemplar, body) {
  return nameAndLength(exemplar.name, exemplar.length, body);
}

// "when" is our guard function
function when (guardFn, optionalFn) {
  function guarded (fn) {
    return imitate(fn, function () {
      if (guardFn.apply(this, arguments))
        return fn.apply(this, arguments);
    });
  }
  return optionalFn == null
         ? guarded
         : guarded(optionalFn);
}

// "getWith," "mapWith," and "pluckWith" can all be found in the allong.es
// library, https://github.com/raganwald/allong.es
function getWith (prop, obj) {
  function gets (obj) {
    return obj[prop];
  }

  return obj === undefined
         ? gets
         : gets(obj);
}

function mapWith (fn, mappable) {
  function maps (collection) {
    return collection.map(fn);
  }

  return mappable === undefined
         ? maps
         : maps(mappable);
}

function pluckWith (prop, collection) {
  var plucker = mapWith(getWith(prop));

  return collection === undefined
         ? plucker
         : plucker(collection);
}

// Our pattern-matching function
function Match () {
  var fns     = [].slice.call(arguments, 0),
      lengths = pluckWith('length', fns),
      length  = Math.min.apply(null, lengths),
      names   = pluckWith('name', fns).filter(function (name) { return name !== ''; }),
      name    = names.length === 0
                ? ''
                : names[0];

  return nameAndLength(name, length, function () {
    var i,
        value;

    for (i in fns) {
      value = fns[i].apply(this, arguments);

      if (value !== undefined) return value;
    }
  });
}

// Some predicates to make it easy to write patterns
function equals (x) {
  return function eq (y) { return (x === y); };
}

function greaterThan (x) {
  return function gt (y) { return (y > x); };
}

var factorial = Match(
  when(equals(1),      function factorial (n) { return 1; }),
  when(greaterThan(1), function           (n) { return n * factorial(n-1); })
);

factorial(5);
  //=> 120
factorial.name;
  //=> 'factorial'
factorial.length;
  //=> 1

function whenArgsAre () {
  var matchers = [].slice.call(arguments, 0, arguments.length - 1),
      body     = arguments[arguments.length - 1];

  function typeChecked () {
    var i,
        arg,
        value;

    if (arguments.length != matchers.length) return;
    for (i in arguments) {
      arg = arguments[i];
      if (!matchers[i].call(this, arg)) return;
    }
    value = body.apply(this, arguments);
    return value === undefined
           ? null
           : value;
  }

  return imitate(body, typeChecked);
}

https://raganwald.com/2014/06/23/multiple-dispatch
Structs and ImmutableStructs

Sometimes we want to share objects by reference for performance and space reasons, but we don’t want them to be mutable. One motivation is when we want many objects to be able to share a common entity without worrying that one of them may inadvertently change the common entity.

JavaScript provides a way to make properties immutable:

var rentAmount = {};

Object.defineProperty(rentAmount, 'dollars', {
  enumerable: true,
  writable: false,
  value: 420
});

Object.defineProperty(rentAmount, 'cents', {
  enumerable: true,
  writable: false,
  value: 0
});

rentAmount.dollars
  //=> 420


// Strict Mode:

!function () {
  "use strict"

  rentAmount.dollars = 600;
}();
  //=> TypeError: Cannot assign to read only property 'dollars' of #<Object>

// Beware: Non-Strict Mode

rentAmount.dollars = 600;
  //=> 600

rentAmount.dollars
  //=> 420

Object.defineProperty is a general-purpose method for providing fine-grained control over the properties of any object. When we make a property enumerable, it shows up whenever we list the object’s properties or iterate over them. When we make it writable, assignments to the property change its value. If the property isn’t writable, assignments are silently ignored in non-strict code and throw a TypeError in strict mode.
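Here is a short sketch of the enumerable flag at work, along with a way to inspect what we have defined. The auditId property is a hypothetical example of something we might want to keep out of listings.

```javascript
var rentAmount = {};

Object.defineProperty(rentAmount, 'dollars', {
  enumerable: true,
  writable: false,
  value: 420
});

// Hypothetical bookkeeping property, hidden from enumeration
Object.defineProperty(rentAmount, 'auditId', {
  enumerable: false,
  writable: false,
  value: 'A-1'
});

Object.keys(rentAmount);
  //=> [ 'dollars' ]
rentAmount.auditId;
  //=> 'A-1', hidden from listings but still readable

var descriptor = Object.getOwnPropertyDescriptor(rentAmount, 'dollars');
descriptor.writable;
  //=> false
descriptor.configurable;
  //=> false, the default when a flag is omitted from defineProperty
```

Flags left out of the descriptor default to false, so a property defined this way also cannot be redefined or deleted later.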

When we want to define multiple properties, we can also write:

var rentAmount = {};

Object.defineProperties(rentAmount, {
  dollars: {
    enumerable: true,
    writable: false,
    value: 420
  },
  cents: {
    enumerable: true,
    writable: false,
    value: 0
  }
});

rentAmount.dollars
  //=> 420

rentAmount.dollars = 600;
  //=> 600

rentAmount.dollars
  //=> 420

We can make properties immutable, but that doesn’t prevent us from adding properties to an object:

rentAmount.feedbackComments = []
rentAmount.feedbackComments.push("The rent is too damn high.")
rentAmount
  //=>
    { dollars: 420,
      cents: 0,
      feedbackComments: [ 'The rent is too damn high.' ] }

Immutable properties make an object closed for modification. This is a separate matter from making it closed for extension. But we can do that too:

Object.preventExtensions(rentAmount);

function addCurrency(amount, currency) {
  "use strict";

  amount.currency = currency;
  return currency;
}

addCurrency(rentAmount, "CAD")
  //=> TypeError: Can't add property currency, object is not extensible

structs

Many other languages have a formal data structure that has one or more named properties that are open for modification, but closed for extension. Here’s a function that makes a Struct:

function Struct (template) {
  if (Struct.prototype.isPrototypeOf(this)) {
    var struct = this;

    Object.keys(template).forEach(function (key) {
      Object.defineProperty(struct, key, {
        enumerable: true,
        writable: true,
        value: template[key]
      });
    });
    return Object.preventExtensions(struct);
  }
  else return new Struct(template);
}

var rentAmount2 = Struct({dollars: 420, cents: 0});

addCurrency(rentAmount2, "ISK");
  //=> TypeError: Can't add property currency, object is not extensible

And when you need an ImmutableStruct:

function ImmutableStruct (template) {

  if (ImmutableStruct.prototype.isPrototypeOf(this)) {
    var immutableObject = this;

    Object.keys(template).forEach(function (key) {
      Object.defineProperty(immutableObject, key, {
        enumerable: true,
        writable: false,
        value: template[key]
      });
    });
    return Object.preventExtensions(immutableObject);
  }
  else return new ImmutableStruct(template);
}

ImmutableStruct.prototype = new Struct({});

function copyAmount(to, from) {
  "use strict"

  to.dollars = from.dollars;
  to.cents   = from.cents;
  return to;
}

var immutableRent = ImmutableStruct({dollars: 1000, cents: 0});

copyAmount(immutableRent, rentAmount);
  //=> TypeError: Cannot assign to read only property 'dollars' of #<Struct>

Structs and Immutable Structs are a handy way to prevent inadvertent errors and to explicitly communicate that an object is intended to be used as a struct and not as a dictionary.1
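For comparison, here is a sketch of the built-in `Object.freeze`, which makes an existing object's own properties read-only and prevents extension in a single call. The `frozenRent` object and the `tryToChange` helper are invented for illustration. One caveat worth knowing: freezing is shallow, so objects nested inside a frozen object stay mutable unless they are frozen too.

```javascript
var frozenRent = Object.freeze({
  dollars: 420,
  cents: 0,
  comments: []
});

// Attempts one modification and one extension in strict mode,
// and reports which of them were refused.
function tryToChange (amount) {
  "use strict";

  var refused = [];

  try { amount.dollars = 600; }
  catch (error) { refused.push('modify'); }

  try { amount.currency = 'CAD'; }
  catch (error) { refused.push('extend'); }

  return refused;
}

var blocked = tryToChange(frozenRent);
  //=> [ 'modify', 'extend' ]

Object.isFrozen(frozenRent);
  //=> true

// The freeze is shallow: the nested array can still be changed.
frozenRent.comments.push("The rent is too damn high.");
frozenRent.comments.length;
  //=> 1
```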

structural vs. semantic typing

A long-cherished principle of dynamic languages is that programs employ “Duck” or “Structural” typing. So if we write:

function deposit (account, instrument) {
  account.dollars += instrument.dollars;
  account.cents   += instrument.cents;
  account.dollars += Math.floor(account.cents / 100);
  account.cents    = account.cents % 100;
  return account;
}

This works for things that look like cheques, and for things that look like money orders:2

cheque = {
  dollars: 100,
  cents: 0,
  number: 6
}

deposit(currentAccount, cheque);

moneyOrder = {
  dollars: 100,
  cents: 0,
  fee: 1.50
}

deposit(currentAccount, moneyOrder);

The general idea here is that as long as we pass deposit an instrument that has dollars and cents properties, the function will work. We can think about hasDollarsAndCents as a “type,” and we can say that programming in a dynamic language like JavaScript is programming in a world where there is a many-many relationship between types and entities.

Every single entity that has dollars and cents has the imaginary type hasDollarsAndCents, and every single function that takes a parameter and uses only its dollars and cents properties is a function that requires a parameter of type hasDollarsAndCents.

There is no checking of this in advance, like some other languages, but there also isn’t any explicit declaration of these types. They exist logically in the running system, but not manifestly in the code we write.

This maximizes flexibility, in that it encourages the creation of small, independent pieces that work seamlessly together. It also makes it easy to refactor to small, independent pieces. The code above could easily be changed to something like this:

cheque = {
  amount: {
    dollars: 100,
    cents: 0
  },
  number: 6
}

deposit(currentAccount, cheque.amount);

moneyOrder = {
  amount: {
    dollars: 100,
    cents: 0
  },
  fee: 1.50
}

deposit(currentAccount, moneyOrder.amount);

drawbacks

This flexibility has a cost. With our ridiculously simple example above, we can easily deposit new kinds of instruments. But we can also do things like this:

var backTaxesOwed = {
  dollars: 10874,
  cents: 6
}

var rentReceipt = {
  dollars: 420,
  cents: 0,
  unit: 504,
  month: 6,
  year: 1962
}

deposit(backTaxesOwed, rentReceipt);

Structurally, deposit is compatible with any two things that haveDollarsAndCents. But not all things that haveDollarsAndCents are semantically appropriate for deposits. This is why some OO language communities work very hard developing and using type systems that incorporate semantics.

This is not just a theoretical concern. Numbers and strings are the ultimate in semantic-free data types. Confusing metric with imperial measures is thought to have caused the loss of the Mars Climate Orbiter. To prevent mistakes like this in software, forcing values to have compatible semantics–and not just superficially compatible structure–is thought to help create self-documenting code and to surface bugs.

semantic structs

We’ve already seen structs, above. Struct is a structural type, not a semantic type. But it can be extended to incorporate the notion of semantic types by turning it from an object factory into a “factory-factory.” Here’s a completely new version of Struct: we give it a name and the keys we want, and it gives us a JavaScript constructor function:

function Struct () {
  var name = arguments[0],
      keys = [].slice.call(arguments, 1),
      constructor = eval("(function "+name+"(argument) { return initialize.call(this, argument); })");

  function initialize (argument) {
    if (constructor.prototype.isPrototypeOf(this)) {
      var struct = this;

      keys.forEach(function (key) {
        Object.defineProperty(struct, key, {
          enumerable: true,
          writable: true,
          value: argument[key]
        });
      });
      return Object.preventExtensions(struct);
    }
    else return new constructor(argument);
  };

  return constructor;
}

var Depositable = Struct('Depositable', 'dollars', 'cents'),
    RecordOfPayment = Struct('RecordOfPayment', 'dollars', 'cents');

var cheque = new Depositable({dollars: 420, cents: 0});

cheque.constructor;
  //=> [Function: Depositable]

cheque instanceof Depositable;
  //=> true
cheque instanceof RecordOfPayment;
  //=> false

Although Depositable and RecordOfPayment have the same structural type, they are different semantic types, and we can detect the difference with instanceof (and Object.prototype.isPrototypeOf).
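The `eval` call in `Struct` exists only so that the generated constructor carries the name we gave it. As a hedged alternative, and assuming an ES2015-or-later engine where a function's `name` property is configurable, the same effect can be sketched without `eval`. `NamedStruct` is a hypothetical name for this variation, not part of the original code:

```javascript
function NamedStruct () {
  var name = arguments[0],
      keys = [].slice.call(arguments, 1);

  function constructor (argument) {
    // Works with or without `new`, like the version above.
    if (!(this instanceof constructor)) return new constructor(argument);

    var struct = this;

    keys.forEach(function (key) {
      Object.defineProperty(struct, key, {
        enumerable: true,
        writable: true,
        value: argument[key]
      });
    });
    return Object.preventExtensions(struct);
  }

  // Assumption: `name` is configurable (true from ES2015 onward).
  Object.defineProperty(constructor, 'name', { value: name });

  return constructor;
}

var Depositable = NamedStruct('Depositable', 'dollars', 'cents'),
    RecordOfPayment = NamedStruct('RecordOfPayment', 'dollars', 'cents');

var cheque = Depositable({dollars: 420, cents: 0});

Depositable.name;
  //=> 'Depositable'
cheque instanceof Depositable;
  //=> true
RecordOfPayment.prototype.isPrototypeOf(cheque);
  //=> false
```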

We can also bake this test into our constructors. The code above uses a pattern borrowed from Effective JavaScript so that you can write either new Depositable(...) or Depositable(...) and always get a new instance of Depositable. This version abandons that convention in favour of making Depositable a prototype check, and adds an explicit assertion method:

function Struct () {
  var name = arguments[0],
      keys = [].slice.call(arguments, 1),
      constructor = eval("(function "+name+"(argument) { return initialize.call(this, argument); })");

  function initialize (argument) {
    if (constructor.prototype.isPrototypeOf(this)) {
      var struct = this;

      keys.forEach(function (key) {
        Object.defineProperty(struct, key, {
          enumerable: true,
          writable: true,
          value: argument[key]
        });
      });
      return Object.preventExtensions(struct);
    }
    else return constructor.prototype.isPrototypeOf(argument);
  };

  constructor.assertIsPrototypeOf = function (argument) {
    if (!constructor.prototype.isPrototypeOf(argument)) {
      var name = constructor.name === ''
                 ? "Struct(" + keys.join(", ") + ")"
                 : constructor.name;
      throw "Type Error: " + argument + " is not a " + name;
    }
    else return argument;
  }

  return constructor;
}

var Depositable = Struct('Depositable', 'dollars', 'cents'),
    RecordOfPayment = Struct('RecordOfPayment', 'dollars', 'cents');

var cheque = new Depositable({dollars: 420, cents: 0});

Depositable(cheque);
  //=> true

RecordOfPayment.assertIsPrototypeOf(cheque);
  //=> Type Error: [object Object] is not a RecordOfPayment

We can use these “semantic” structs by adding assertions to critical functions:

function deposit (account, instrument) {
  Depositable.assertIsPrototypeOf(instrument);

  account.dollars += instrument.dollars;
  account.cents   += instrument.cents;
  account.dollars += Math.floor(account.cents / 100);
  account.cents    = account.cents % 100;
  return account;
}

This prevents us from accidentally trying to deposit a rent receipt.

is semantic typing worthwhile?

The value of semantic typing is an open question. There are its proponents, and its detractors. One thing to consider is the proposition that with few exceptions, a programming system cannot be improved solely by removing features that can be subject to abuse.

Instead, a system is improved by removing harmful features in such a way that they enable the addition of other, more powerful features that were “blocked” by the existence of harmful features. For example, proper or “pure” functional programming languages do not allow the mutation of values. This does remove a number of harmful possibilities, but in and of itself removing mutation is not a win.

However, once you remove mutation you enable a number of optimization techniques such as lazy evaluation. This frees programmers to do things like write code for data structures that would not be possible in languages (like JavaScript) that have an eager evaluation strategy.

If we subscribe to this philosophy about only reducing flexibility when it enables us to make an even greater increase in flexibility, then semantic typing cannot be a win solely because we can “catch errors earlier” with things like Depositable.assertIsPrototypeOf(instrument). To be of benefit, it must enable some completely new kind of programming paradigm or feature that increases overall flexibility.

are immutable structs worthwhile?

Taking this same reasoning to immutable structures, some would argue that the extra infrastructure required to create immutable structs is not warranted if the sole objective is to prevent inadvertent errors.

Immutable data structures (like immutable structs) are only of benefit if we can “turn them up to eleven” and realize some new benefit, if we incorporate paradigms like copy-on-write and build high-performance and high-reliability code around immutable data.

Funny we should mention that. The Clojure and ClojureScript people have systematized the use of immutable data. And we can incorporate their libraries in our JavaScript code using the Mori JavaScript library.
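To make “copy-on-write” concrete, here is a deliberately naive sketch in plain JavaScript. It is not how Mori or ClojureScript implement their data structures (they rely on persistent data structures that share structure between versions rather than copying everything), and the `withProperty` helper is a hypothetical name invented for this illustration:

```javascript
// Returns a new frozen object with one property changed,
// leaving the original untouched.
function withProperty (original, key, value) {
  var copy = {};

  Object.keys(original).forEach(function (existingKey) {
    copy[existingKey] = original[existingKey];
  });
  copy[key] = value;

  return Object.freeze(copy);
}

var juneRent = Object.freeze({dollars: 420, cents: 0});
var julyRent = withProperty(juneRent, 'dollars', 450);

juneRent.dollars;
  //=> 420
julyRent.dollars;
  //=> 450
julyRent.cents;
  //=> 0
```

Because nothing can change `juneRent` after the fact, any code holding a reference to it can rely on its value without making a defensive copy.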

Have a look at Mori. Then consider whether immutable structs could be useful in your code.


  1. JavaScript also provides a single method that can close an object for modification, extension, and configuration at the same time: Object.freeze(...)

  2. There’re good things we can say about why we should consider making an amount property, and/or encapsulate these structs so they behave like objects, but this gives the general idea of structural typing. 

https://raganwald.com/2014/06/15/immutable-structs
Repost: Captain Obvious on JavaScript

In JavaScript, anywhere you find yourself writing:

function (x) { return foo(x); }

You can usually substitute just foo. For example, this code:

var floats = someArray.map(function (value) {
  return parseFloat(value);
});

Could be written:

var floats = someArray.map(parseFloat);
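The “usually” matters. `map` calls its function with three arguments (the element, its index, and the whole array), so the substitution is only safe when the function ignores the extras. `parseFloat` does; `parseInt` treats its second argument as a radix, which makes it the classic counterexample. A small sketch, using a made-up `strings` array:

```javascript
var strings = ['10', '10', '10'];

var wrappedFloats = strings.map(function (value) {
  return parseFloat(value);
});
var directFloats = strings.map(parseFloat);
  //=> [ 10, 10, 10 ] either way

// parseInt receives the index as its radix:
// parseInt('10', 0), parseInt('10', 1), parseInt('10', 2)
var directInts = strings.map(parseInt);
  //=> [ 10, NaN, 2 ]
```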

This understanding is vital. Without it, you can be led astray into thinking that this code:

array.forEach(function (element) {
  // do something
});

…Is just a funny way of writing a for loop. It’s not!

forEach isn’t a way of saying “Do this thing with every member of array.” No, forEach is an array method that takes a function as an argument, so this code is a way of saying “Apply every member of array to this function”. You can pass forEach a function literal (as above), a variable name that resolves to a function, even an expression like myObject.methodName that looks like a method but is really a function defined in an object’s prototype.
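A sketch of those cases, using hypothetical names (`tally`, `addToTally`, `counter`). One caution the text implies: because `myObject.methodName` evaluates to a plain function, it loses its connection to `myObject` when passed this way, so a method that relies on `this` needs `bind` or `forEach`'s optional second argument.

```javascript
var tally = 0;

function addToTally (n) { tally += n; }

// A variable name that resolves to a function:
[1, 2, 3].forEach(addToTally);
tally;
  //=> 6

var counter = {
  total: 0,
  add: function (n) { this.total += n; }
};

// An expression that looks like a method. Passing counter as
// forEach's second argument supplies the `this` the method expects.
[1, 2, 3].forEach(counter.add, counter);
counter.total;
  //=> 6

// bind achieves the same thing by producing a new function:
[4].forEach(counter.add.bind(counter));
counter.total;
  //=> 10
```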

Once you have internalized the fact that any function will do, you can refactor code to clear out the cruft. This example uses jQuery in the browser:

$('input').toArray().map(function (domElement) {
  return parseFloat(domElement.value);
})

Let’s turn that into:

$('input').toArray()
  .map(function (domElement) {
    return domElement.value;
  })
  .map(function (value) {
    return parseFloat(value);
  })

And thus:

$('input').toArray()
  .map(function (domElement) {
    return domElement.value;
  })
  .map(parseFloat)

Once you get started turning function literals into other expressions, you can’t stop. The next step on the road to addiction is using functions that return functions:

function get (attr) {
  return function (object) { return object[attr]; }
}

Which permits us to write:

$('input').toArray()
  .map(get('value'))
  .map(parseFloat)

Which really means, “Get the .value from every input DOM element, and map the parseFloat function over the result.”

Obviously!

Part II: Less obvious but still interesting

Although the example illustrated the obvious points about functions as first class entities, the final example involved creating a new function and iterating twice over the array. Avoiding the extra loop may be an important performance optimization. Then again, it may be premature optimization. But either way, once we have absorbed the obvious, we’re ready to look at the practical.

We might express our discomfort thus: “We wish to decompose an expression into functions. Our obvious example recomposed them into two functions and two maps, but for performance reasons we would like to compose two functions and only one map.”

As usual, finding the right question to ask is half the battle. Familiarity with good libraries is the other half. For our purposes, Functional.sequence from Oliver Steele’s Functional JavaScript library will be useful. Functional.sequence composes two or more functions in the argument order (Functional JavaScript also provides .compose to compose multiple functions in applicative order). For the purpose of this post, they could be defined something like this:

var naiveSequence = function (a, b) {
  return function (c) {
    return b(a(c));
  };
}

var naiveCompose = function (a, b) {
  return function (c) {
    return a(b(c));
  };
}

(Functional.sequence and Functional.compose are far more thorough than these naive examples, of course.)

Given this, you could expect that:

Functional.sequence(get('value'), parseFloat)({ value: '1.5' })
  // => 1.5

Thus, we can rewrite our map as:

$('input').toArray()
  .map(
    Functional.sequence(
      get('value'),
      parseFloat
    )
  )

Now the code iterates over the array just once, mapping it to the composition of the two functions while still preserving the new character of the code where the elements of an expression have been factored into separate functions. Is this better than the original? It is if you want to refactor the code to do interesting things like memoize one of the functions. But that is no longer obvious.
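As a sketch of that last point, here is one way memoizing a single step might look. The `memoize` helper, the `countingParseFloat` wrapper, and the `fakeInputs` array are assumptions made for this example (a plain array of objects stands in for the DOM elements), and `naiveSequence` is the two-function version from earlier rather than the library's `Functional.sequence`:

```javascript
function get (attr) {
  return function (object) { return object[attr]; }
}

var naiveSequence = function (a, b) {
  return function (c) {
    return b(a(c));
  };
}

// Caches results by argument; only suitable for functions
// of one string (or string-like) argument.
function memoize (fn) {
  var cache = {};

  return function (key) {
    if (!Object.prototype.hasOwnProperty.call(cache, key)) {
      cache[key] = fn(key);
    }
    return cache[key];
  };
}

var parseCalls = 0;

// Counts how often the underlying parse actually runs.
var countingParseFloat = function (string) {
  parseCalls += 1;
  return parseFloat(string);
};

var fakeInputs = [{ value: '1.5' }, { value: '1.5' }, { value: '42' }];

var numbers = fakeInputs.map(
  naiveSequence(get('value'), memoize(countingParseFloat))
);
  //=> [ 1.5, 1.5, 42 ]

parseCalls;
  //=> 2
```

Only the second function had to change; `get('value')` and the `map` call are untouched, which is the kind of flexibility the factoring buys.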

What is obvious is that JavaScript is a functional language, and the more ways you have of factoring expressions into functions, the more ways you have of organizing your code to suit your own style, performance, or assignment of responsibility purposes.

p.s. Captain Obvious would not write such excellently plain-as-the-nose-on-his-face posts without the help of people like @jcoglan, @CrypticSwarm, @notmatt, @cammerman, Skyhighatrist, and @BrendanEich.

p.p.s. This post first appeared in January of 2012, on my old blog. I may repost other technical material, if and when time permits.

https://raganwald.com/2014/05/30/repost-captain-obvious
How to Trick the Gullible into Learning a New Programming Language

Yesterday, I read a middling article, How to Trick the Guilty and Gullible into Revealing Themselves. I wasn’t entirely fascinated by the stories about plunging arms into boiling water, or of brown M&Ms and concert venues. But then the author revealed why Nigerian 419 scammers use such blatantly ridiculous letters to troll for victims:

If the Nigerian scam is so famous, why would a Nigerian scammer ever admit he is from Nigeria?

This is a fascinating question. Why do their letters have obvious spelling mistakes and impossibly florid prose? The answer is economic. While sending the letters is essentially free, talking to prospects is expensive, and the scammer’s time is limited.

So the scammer is not concerned with the response rate, but instead is concerned with the conversion rate for those prospects who respond. In other words, false positives suck.

The ridiculously obvious letters help: Only the most trusting, greedy, and gullible people will respond to such insanely obvious letters, and there is a good chance of extracting money from such people. If the scammer used a more plausible pitch, people like you and I might respond and then back out of the scam later, after the scammer had spent valuable time wooing us.

This reminds me of programming languages and frameworks. The most important factor for success in a language is that it be the scripting language for a new platform that disrupts the old. C for Unix, Objective-C for iOS, Java, JavaScript, and Ruby for HTML. This insight, unfortunately, is like saying that the most important criterion for personal wealth is to belong to the lucky zygote club.

But if all other things are equal, is there any similarity between a new programming language and a 419 scam? Sure there is.

The Central Bank of Nigeria

why a new language is like ten million dollars in a nigerian bank

When you’re promoting a new idea, early adopters are gold. But early doubters are poison. When you’re launching, you’re vulnerable. A single “I tried Raganwald-C, and Here’s Why It Sucks Sweaty Goat Testes” blog post on the front page of Hacker News can sink you. False positives suck for new programming ideas just as much as for 419 scams.

You want to avoid people actively hostile to your idea, and you also want to avoid shallow dilettantes. An apathetic user won’t write such vituperative blog posts, but they also won’t take the trouble to really understand your language’s “big idea,” and they will complain that your new language isn’t very good at doing the thing they’ve always done, the way they’ve always done it.

Think about people complaining that JavaScript promises are more “complicated” than callbacks, or that immutable data structures in ClojureScript are harder to understand than programming with mutable state. They aren’t wrong from their point of view, but the whole point of a new idea is to have a new point of view!

The very best thing is that the first few folks to try your idea have a very high willingness to adopt a new point of view. You’re looking for people who are a little like the Nigerian 419 victims. They should be trusting, be open-minded, and be greedy. You don’t want people trying to save themselves from extinction, you don’t want people trying to be 1% better, you want people greedy enough to hope your idea makes them an order of magnitude better.

You know… programmers too lazy to keep struggling with the old tech and with the hubris to believe they can adopt entirely new tech.

We will encourage you to develop the three great virtues of a programmer: laziness, impatience, and hubris.

–Larry Wall, Programming Perl (1st edition)

seeking hubris

So how do you find these lazy programmers with insanely optimistic hubris? Act like the 419 scammer. Be florid. Make impossibly bold claims! Use linkbait blog posts. And especially, design your new idea or language around something so insanely different that it weeds out the dilettantes at a glance.

For example, CoffeeScript gets this right. It’s “JavaScript, the Good Parts,” but it’s also significant whitespace. I don’t think CoffeeScript would have gotten anywhere without significant whitespace. Not that indentation matters a damn to anyone’s productivity… But it matters a damn to making sure that the only people who would bother to try it were people willing to throw themselves into something very different.

Those people would go ahead and write books about it, and explore wild things like using do to create block scoping. Had it been JavaScript plus-plus, it would have been swamped with ordinary folks who would have done ordinary things and decided it wasn’t worth the transpilation headaches.

Likewise, I love that Clojure and ClojureScript are still Lisp, s-exprs and all. Algolizing Clojure would have killed it. Never mind macros and homoiconicity, we can solve those problems for Algol. What we can’t do is solve the problem of people trying it and telling everyone they know how they’re more productive using JavaScript to write JavaScript than using ClojureScript to write JavaScript. Well, duh.

What the language needs is people using ClojureScript to write Clojure in the browser, and having a “funny” syntax selects for people willing to invest in a new idea.

in closing

My thesis is that in the early days, you need to select for people willing to invest in a new point of view, and having glaringly self-indulgent features–like funny syntax or a completely new model for managing asynchronicity–is helpful for keeping the ratio of early “wows” to early “meh’s” high.

Which leaves me offering some free advice: There’s plenty of room for new programming editors. If you’re going to do it, don’t be afraid to ignore “vi” and “emacs” compatibility modes.

Harpo Marx and three of his children

https://raganwald.com/2014/05/13/how-to-trick-the-gullible-into-learning-a-new-language
JavaScript Values Algebra

One of JavaScript’s defining characteristics is its treatment of functions as first-class values. Like numbers, strings, and other kinds of objects, references to functions can be passed as arguments to functions, returned as the result from functions, bound to variables, and generally treated like any other value.1

Here’s an example of passing a reference to a function around. This simple array-backed stack has an undo function. It works by creating a function representing the action of undoing the last update, and then pushing that onto a stack of actions to be undone:

var stack = {
  array: [],
  undoStack: [],
  push: function (value) {
    this.undoStack.push(function () {
      this.array.pop();
    });
    return this.array.push(value);
  },
  pop: function () {
    var popped = this.array.pop();
    this.undoStack.push(function () {
      this.array.push(popped);
    });
    return popped;
  },
  isEmpty: function () {
    return this.array.length === 0;
  },
  undo: function () {
    this.undoStack.pop().call(this);
  }
};

stack.push('hello');
stack.push('there');
stack.push('javascript');
stack.undo();
stack.undo();
stack.pop();
  //=> 'hello'

Functions-as-values is a powerful idea. And people often look at the idea of functions-as-values and think, “Oh, JavaScript is a functional programming language.” No.

In computer science, functional programming is a programming paradigm, a style of building the structure and elements of computer programs, that treats computation as the evaluation of mathematical functions and avoids state and mutable data.–Wikipedia

Functional programming might have meant “functions as first-class values” in the 1960s when Lisp was young. But time marches on, and we must march alongside it. JavaScript does not avoid state, and JavaScript embraces mutable data, so JavaScript does not value “functional programming.”

Handshake, Glider, Boat, Box, R-Pentomino, Loaf, Beehive, and Clock by Ben Sisko

objects

JavaScript’s other characteristic is its support for objects. Although JavaScript’s features seem paltry compared to rich OO languages like Scala, its extreme minimalism means that you can actually build almost any OO paradigm up from basic pieces.

Now, people often hear the word “objects” and think kingdom of nouns. But objects are not necessarily nouns, or at least, not models for obvious, tangible entities in the real world.

One example concerns state machines. We could implement a cell in Conway’s Game of Life using if statements and a boolean property to determine whether the cell was alive or dead:2

var __slice = [].slice;

function extend () {
  var consumer = arguments[0],
      providers = __slice.call(arguments, 1),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        consumer[key] = provider[key];
      };
    };
  };
  return consumer;
};

var Universe = {
  // ...
  numberOfNeighbours: function (location) {
    // ...
  }
};

var Alive = 'alive',
    Dead  = 'dead';

var Cell = {
  numberOfNeighbours: function () {
    return Universe.numberOfNeighbours(this.location);
  },
  stateInNextGeneration: function () {
    if (this.state === Alive) {
      // a live cell survives with two or three neighbours
      return (this.numberOfNeighbours() === 2 || this.numberOfNeighbours() === 3)
             ? Alive
             : Dead;
    }
    else {
      // a dead cell comes to life with exactly three neighbours
      return (this.numberOfNeighbours() === 3)
             ? Alive
             : Dead;
    }
  }
};

var someCell = extend({
  state: Alive,
  location: {x: -15, y: 12}
}, Cell);

You could say that the “state” of the cell is represented by the primitive value 'alive' for alive, or 'dead' for dead. But that isn’t modeling the state in any way, that’s just a name. The true state of the object is implicit in the object’s behaviour, not explicit in the value of the .state property.

Here’s a design where we make the state explicit instead of implicit:

function delegateToOwn (receiver, propertyName, methods) {
  var temporaryMetaobject;

  if (methods == null) {
    temporaryMetaobject = receiver[propertyName];
    methods = Object.keys(temporaryMetaobject).filter(function (methodName) {
      return typeof(temporaryMetaobject[methodName]) === 'function';
    });
  }
  methods.forEach(function (methodName) {
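    receiver[methodName] = function () {
      // Look the metaobject up on `this` rather than on `receiver`, so that
      // objects that borrow or inherit these methods delegate to their own state.
      var metaobject = this[propertyName];
      return metaobject[methodName].apply(this, arguments);
    };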
  });

  return receiver;
};

var Alive = {
  alive: function () {
    return true;
  },
  // a live cell survives with two or three neighbours
  stateInNextGeneration: function () {
    return (this.numberOfNeighbours() === 2 || this.numberOfNeighbours() === 3)
             ? Alive
             : Dead;
  }
};

var Dead = {
  alive: function () {
    return false;
  },
  // a dead cell comes to life with exactly three neighbours
  stateInNextGeneration: function () {
    return (this.numberOfNeighbours() === 3)
             ? Alive
             : Dead;
  }
};

var Cell = {
  numberOfNeighbours: function () {
    return Universe.numberOfNeighbours(this.location);
  }
}

delegateToOwn(Cell, 'state', ['alive', 'stateInNextGeneration']);

var someCell = extend({
  state: Alive,
  location: {x: -15, y: 12}
}, Cell);

In this design, delegateToOwn delegates the methods .alive and .stateInNextGeneration to whatever object is the value of a Cell’s state property.

So when we write someCell.state = Alive, then the Alive object will handle someCell.alive and someCell.stateInNextGeneration. And when we write someCell.state = Dead, then the Dead object will handle someCell.alive and someCell.stateInNextGeneration.
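To see the delegation switch at run time, here is a compact, self-contained sketch of the same idea. It makes two simplifying assumptions that are not in the code above: the neighbour count is a plain `neighbours` property rather than a lookup in `Universe`, and the forwarding methods are built by a hypothetical `forwardToState` helper instead of `delegateToOwn`. The transition rules are the standard ones for Conway's Game of Life.

```javascript
var Alive = {
  alive: function () { return true; },
  stateInNextGeneration: function () {
    // A live cell survives with two or three neighbours.
    return (this.neighbours === 2 || this.neighbours === 3)
           ? Alive
           : Dead;
  }
};

var Dead = {
  alive: function () { return false; },
  stateInNextGeneration: function () {
    // A dead cell comes to life with exactly three neighbours.
    return (this.neighbours === 3)
           ? Alive
           : Dead;
  }
};

// Builds a method that forwards to whatever object is
// currently held in the cell's own `state` property.
function forwardToState (methodName) {
  return function () {
    return this.state[methodName].apply(this, arguments);
  };
}

var someCell = {
  state: Alive,
  neighbours: 1,
  alive: forwardToState('alive'),
  stateInNextGeneration: forwardToState('stateInNextGeneration')
};

someCell.alive();
  //=> true

// Lonely cell: the Alive object decides that it dies.
someCell.state = someCell.stateInNextGeneration();
someCell.alive();
  //=> false

// Three neighbours arrive: now the Dead object decides.
someCell.neighbours = 3;
someCell.state = someCell.stateInNextGeneration();
someCell.alive();
  //=> true
```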

Now we’ve taken the implicit states of being alive or dead and transformed them into the first-class values Alive and Dead. Not a string that is used implicitly in some other code, but all of “The stuff that matters about aliveness and deadness.”

This is no different from the example of passing functions around: They’re both the same thing, taking something that would be implicit in another design and/or another language, and making it explicit, making it a value. And making the whole thing a value, not just a boolean or a string, the complete entity.

This example is the same thing as the example of a stack that handles undo with a stack of functions: Behaviour is treated as a first-class value, whether it be a single function or an object with multiple methods.

an algebra of functions

If we left it at that, we could come away with an idea that a function is a small, single-purposed object. We would be forgiven for thinking that there’s nothing special about functions, they’re values representing behaviour.

But functions have another, very important purpose. They form the basis for an algebra of values.

Consider these functions, begin1 and begin. They’re handy for writing function advice, for creating sequences of functions to be evaluated for side effects, and for resolving method conflicts when composing mixins:

var __slice = [].slice;

function begin1 () {
  var fns = __slice.call(arguments, 0);

  return function () {
    var args = arguments,
        values = fns.map(function (fn) {
          return fn.apply(this, args);
        }, this),
        concretes = values.filter(function (value) {
          return value !== void 0;
        });

    if (concretes.length > 0) {
      return concretes[0];
    }
  }
}

function begin () {
  var fns = __slice.call(arguments, 0);

  return function () {
    var args = arguments,
        values = fns.map(function (fn) {
          return fn.apply(this, args);
        }, this),
        concretes = values.filter(function (value) {
          return value !== void 0;
        });

    if (concretes.length > 0) {
      return concretes[concretes.length - 1];
    }
  }
}

Both begin1 and begin take one or more functions, and turn them into a new function. This is much the same as + taking two numbers, and turning them into a third number.3

When you have a bunch of functions that do things in your problem domain (like writeLedger and withdrawFunds), and you have a way to compose your domain functions, you have a little algebra for taking values and computing new values from them.

Just as we can write 1 + 1 = 2, we can also write writeLedger + withdrawFunds = transaction. We have created a very small algebra of functions. We can write functions that transform functions into other functions.
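As an illustration, here is how that “sum” might look using `begin`. The bodies of `writeLedger` and `withdrawFunds` are invented for this sketch (the post names them but does not define them), as is the `account` object:

```javascript
function begin () {
  var fns = [].slice.call(arguments, 0);

  return function () {
    var args = arguments,
        values = fns.map(function (fn) {
          return fn.apply(this, args);
        }, this),
        concretes = values.filter(function (value) {
          return value !== void 0;
        });

    if (concretes.length > 0) {
      return concretes[concretes.length - 1];
    }
  }
}

// Hypothetical domain functions, written as methods that use `this`.
function writeLedger (amount) {
  this.ledger.push('withdraw ' + amount);
}

function withdrawFunds (amount) {
  this.balance -= amount;
  return this.balance;
}

var account = {
  balance: 100,
  ledger: [],
  // writeLedger + withdrawFunds = transaction
  transaction: begin(writeLedger, withdrawFunds)
};

account.transaction(30);
  //=> 70

account.ledger;
  //=> [ 'withdraw 30' ]
```

`writeLedger` returns nothing, so the composed function reports the value from `withdrawFunds`, the last function that returned something other than undefined.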

an algebra of values

Functions that transform functions into other functions are very powerful, but it does not stop there.

It’s obvious that functions can take objects as arguments and return objects. Functions (or methods) that take an object representing a client and return an object representing an account balance are a necessary and important part of software.

But just as we created an algebra of functions, we can write an algebra of objects. Meaning, we can write functions that take objects and return other objects that represent a transformation of their arguments.

Here’s a function that transforms an object into a proxy for that object:

function proxy (baseObject, optionalPrototype) {
  var proxyObject = Object.create(optionalPrototype || null),
      methodName;
  for (methodName in baseObject) {
    if (typeof(baseObject[methodName]) ===  'function') {
      (function (methodName) {
        proxyObject[methodName] = function () {
          var result = baseObject[methodName].apply(baseObject, arguments);
          return (result === baseObject)
                 ? proxyObject
                 : result;
        }
      })(methodName);
    }
  }
  return proxyObject;
}

Have you ever wanted to make an object’s properties private while making its methods public? You wanted a proxy for the object:

var stackWithPrivateState = proxy(stack);

stack.array
  //=> []
stackWithPrivateState.array
  //=> undefined

stackWithPrivateState.push('hello');
stackWithPrivateState.push('there');
stackWithPrivateState.push('javascript');
stackWithPrivateState.undo();
stackWithPrivateState.undo();
stackWithPrivateState.pop();
  //=> 'hello'

The proxy function transforms an object into another object with a similar purpose. Functions can compose objects as well, here’s one of the simplest examples:

var __slice = [].slice;

function meld () {
  var melded = {},
      providers = __slice.call(arguments, 0),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        melded[key] = provider[key];
      };
    };
  };
  return melded;
};

var Person = {
  fullName: function () {
    return this.firstName + " " + this.lastName;
  },
  rename: function (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

var HasCareer = {
  career: function () {
    return this.chosenCareer;
  },
  setCareer: function (career) {
    this.chosenCareer = career;
    return this;
  },
  describe: function () {
    return this.fullName() + " is a " + this.chosenCareer;
  }
};

var PersonWithCareer = meld(Person, HasCareer);
  //=>
    { fullName: [Function],
      rename: [Function],
      career: [Function],
      setCareer: [Function],
      describe: [Function] }
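
As a minimal sketch of what we can do with the melded object, here it is used as a prototype. The meld, Person, and HasCareer definitions are repeated so the snippet runs on its own; the career 'Clerk' and the variable names are assumptions made up for illustration:

```javascript
// Repeats meld, Person, and HasCareer from above so this runs standalone.
function meld () {
  var melded = {},
      providers = Array.prototype.slice.call(arguments, 0),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        melded[key] = provider[key];
      }
    }
  }
  return melded;
}

var Person = {
  fullName: function () {
    return this.firstName + " " + this.lastName;
  },
  rename: function (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

var HasCareer = {
  setCareer: function (career) {
    this.chosenCareer = career;
    return this;
  },
  describe: function () {
    return this.fullName() + " is a " + this.chosenCareer;
  }
};

var PersonWithCareer = meld(Person, HasCareer);

// Hypothetical usage: the melded object serves as a prototype
var sam = Object.create(PersonWithCareer);
sam.rename('Sam', 'Lowry').setCareer('Clerk');

var samDescription = sam.describe();
  //=> 'Sam Lowry is a Clerk'

// meld copies into a fresh object, so its arguments are left alone
var personHasDescribe = ('describe' in Person);
  //=> false
```

Because meld returns a new object rather than modifying either argument, Person and HasCareer remain available for other compositions.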

Functions that transform objects or compose objects act at a higher level than functions that query objects or update objects. They form an algebra that allows us to build objects by transformation and composition, just as we can use functions like begin to build functions by composition.

javascript values algebra

JavaScript treats functions and objects as first-class values. And the power arising from this is the ability to write functions that transform and compose first-class values, creating an algebra of values.

This applies to transforming and composing functions, and it also applies to transforming and composing objects.

Harpo Marx and three of his children


appendix 1: a function for composing prototypes out of mixins

First, some helper functions:

var __slice = [].slice;

// extend

function extend () {
  var consumer = arguments[0],
      providers = __slice.call(arguments, 1),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        consumer[key] = provider[key];
      };
    };
  };
  return consumer;
};

// partialProxy is like "proxy," but it proxies a subset of an
// object's methods and it also has a fixed set of mutable properties

function partialProxy (baseObject, methods, mutableProperties) {
  var proxyObject = Object.create(null);

  if (mutableProperties) {
    mutableProperties.forEach(function (privatePropertyName) {
      proxyObject[privatePropertyName] = null;
    });
  }

  methods.forEach(function (methodName) {
    proxyObject[methodName] = function () {
      var result = baseObject[methodName].apply(baseObject, arguments);
      return (result === baseObject)
             ? proxyObject
             : result;
    }
  });

  Object.preventExtensions(proxyObject);

  return proxyObject;
}

// extendWithProxy extends an object with behaviour, but restricts
// the behaviour to interact with a proxy to the object. This
// encapsulates each set of behaviour from the object and from each
// other, reducing coupling.

var number = 0;

function methodsOfType (behaviour, type) {
  var methods = [],
      methodName;

  for (methodName in behaviour) {
    if (typeof(behaviour[methodName]) === type) {
      methods.push(methodName);
    }
  };
  return methods;
}

function propertyFlags (behaviour) {
  var properties = [],
      propertyName;

  for (propertyName in behaviour) {
    if (behaviour[propertyName] === null) {
      properties.push(propertyName);
    }
  }

  return properties;
}

function extendWithProxy (baseObject, behaviour) {
  var safekeepingName = "__" + ++number + "__",
      definedMethods = methodsOfType(behaviour, 'function'),
      dependencies = methodsOfType(behaviour, 'undefined'),
      properties = propertyFlags(behaviour),
      methodName;

  definedMethods.forEach(function (methodName) {
    baseObject[methodName] = function () {
      var context = this[safekeepingName],
          result;
      if (context == null) {
        context = partialProxy(this, definedMethods.concat(dependencies), properties);
        properties.forEach(function (propertyName) {
          context[propertyName] = behaviour[propertyName];
        });
        Object.defineProperty(this, safekeepingName, {
          enumerable: false,
          writable: false,
          value: context
        });
      }
      result = behaviour[methodName].apply(context, arguments);
      return (result === context) ? this : result;
    };
  });

  return baseObject;
}

The Prototype function builds prototypes out of an optional super-prototype (or null) and one or more behaviours, objects with functions to mix in.

function Prototype () {
  var superPrototype = arguments[0],
      baseObject = Object.create(superPrototype),
      behaviours = __slice.call(arguments, 1);

  return behaviours.reduce(function (prototype, behaviour) {
    return extendWithProxy(prototype, behaviour);
  }, baseObject);
}

Examples:

var HasName = {
  // private property, initialized to null
  _name: null,

  // methods
  name: function () {
    return this._name;
  },
  setName: function (name) {
    this._name = name;
    return this;
  }
};

var HasCareer = {
  // private property, initialized to null
  _career: null,

  // methods
  career: function () {
    return this._career;
  },
  setCareer: function (career) {
    this._career = career;
    return this;
  }
};

var IsSelfDescribing = {
  // dependencies, "undefined in this mixin"
  name: undefined,
  career: undefined,

  // method
  description: function () {
    return this.name() + ' is a ' + this.career();
  }
};

// the prototype
var Careerist = Prototype(null, HasName, HasCareer, IsSelfDescribing);

// create objects with it
var michael    = Object.create(Careerist),
    bewitched = Object.create(Careerist);

michael.setName('Michael Sam');
bewitched.setName('Samantha Stephens');

michael.setCareer('Athlete');
bewitched.setCareer('Thaumaturge');

michael.description()
  //=> 'Michael Sam is a Athlete'
bewitched.description()
  //=> 'Samantha Stephens is a Thaumaturge'

appendix 2: a function for safely composing behaviour

Helpers:

// policies for resolving methods

var policies = {
  overwrite: function overwrite (fn1, fn2) {
    return fn2;
  },
  discard: function discard (fn1, fn2) {
    return fn1;
  },
  before: function before (fn1, fn2) {
    return function () {
      fn2.apply(this, arguments);
      return fn1.apply(this, arguments);
    }
  },
  after: function after (fn1, fn2) {
    return function () {
      fn1.apply(this, arguments);
      return fn2.apply(this, arguments);
    }
  },
  around: function around (fn1, fn2) {
    return function () {
      var argArray = [fn1.bind(this)].concat(__slice.call(arguments, 0));
      return fn2.apply(this, argArray);
    }
  }
};

// helper for writing resolvable mixins

function resolve(mixin, policySpecification) {
  var result = extend(Object.create(null), mixin);

  Object.keys(policySpecification).forEach(function (policy) {
    var methodNames = policySpecification[policy];

    methodNames.forEach(function (methodName) {
      result[methodName] = {};
      result[methodName][policy] = mixin[methodName];
    });
  });

  return result;
}

The Prototype function above can mix more than one behaviour into a prototype, but sometimes you want to make a new behaviour out of two or more existing behaviours without turning them into a prototype. composeBehaviour does that.

It is involved because it must check for conflicts and resolve them at the time of composition. The Prototype method above is simpler because the individual behaviours each get their own proxy with private state. composeBehaviour wires behaviours up so they can share a proxy.

function composeBehaviour () {
  var mixins = __slice.call(arguments, 0),
      dummy  = function () {};

  return mixins.reduce(function (acc1, mixin) {
    return Object.keys(mixin).reduce(function (result, methodName) {
      var bDefinition = mixin[methodName],
          bType       = typeof(bDefinition),
          aDefinition,
          aType,
          bResolverKey;

      if (result.hasOwnProperty(methodName)) {
        aDefinition = result[methodName];
        aType = typeof(aDefinition);

        if (aDefinition === null && bDefinition === null) {
          throw "'" + methodName + "' cannot be private to multiple mixins."
        }
        else if (aDefinition === null || bDefinition === null) {
          throw "'" + methodName + "' cannot be a method and a property."
        }
        else if (aType ===  'undefined') {
          if (bType === 'function' || bType === 'undefined') {
            result[methodName] = bDefinition;
          }
          else if (bType === 'object') {
            bResolverKey = Object.keys(bDefinition)[0];
            bDefinition = bDefinition[bResolverKey];
            if (bResolverKey === 'around') {
              result[methodName] = function () {
                return bDefinition.apply(this, [dummy].concat(__slice.call(arguments, 0)));
              }
            }
            else result[methodName] = bDefinition;
          }
          else throw bType + " cannot be mixed in as '" + methodName + "'";
        }
        else if (bType === 'object') {
          bResolverKey = Object.keys(bDefinition)[0];
          bDefinition = bDefinition[bResolverKey];
          result[methodName] = policies[bResolverKey](aDefinition, bDefinition);
        }
        else if (bType === 'undefined') {
          // do nothing
        }
        else throw "unresolved method conflict for '" + methodName + "'";
      }
      else if (bDefinition === null) {
        result[methodName] = null;
      }
      else if (bType === 'function' || bType === 'undefined') {
        result[methodName] = bDefinition;
      }
      else if (bType === 'object') {
        bResolverKey = Object.keys(bDefinition)[0];
        bDefinition = bDefinition[bResolverKey];
        if (bResolverKey === 'around') {
          result[methodName] = function () {
            return bDefinition.apply(this, [dummy].concat(__slice.call(arguments, 0)));
          }
        }
        else result[methodName] = bDefinition;
      }
      else throw bType + " cannot be used for '" + methodName + "'";

      return result;
    }, acc1);
  }, {});
}

Examples:

// composing compatible mixins

composeBehaviour(
  HasName,
  HasCareer
);

// rejects incompatible mixins

var HasEmployer = {
  // private property, initialized to null
  _name: null,

  // methods
  employer: function () {
    return this._name;
  },
  setEmployer: function (name) {
    this._name = name;
    return this;
  }
};

composeBehaviour(
  HasName,
  HasEmployer,
  HasCareer
);
  //=> '_name' cannot be private to multiple mixins.

// stacking mixins

var IsSelfDescribing = {
  name: undefined,
  career: undefined,

  description: function () {
    return this.name() + ' is a ' + this.career();
  }
};

var NameAndCareer = composeBehaviour(
  HasName,
  HasCareer,
  IsSelfDescribing
);

var Careerist = Prototype(null, NameAndCareer);

var adolphe = Object.create(Careerist);
adolphe.setName('Adolphe Samuel');
adolphe.setCareer('Composer');
adolphe.description()
  //=> 'Adolphe Samuel is a Composer'

// resolving method conflict

var SingsSongs = {
  _songs: null,

  initialize: function () {
    this._songs = [];
    return this;
  },
  addSong: function (name) {
    this._songs.push(name);
    return this;
  },
  songs: function () {
    return this._songs;
  }
};

var HasAwards = {
  _awards: null,

  initialize: function () {
    this._awards = [];
    return this;
  },
  addAward: function (name) {
    this._awards.push(name);
    return this;
  },
  awards: function () {
    return this._awards;
  }
};

composeBehaviour(SingsSongs, HasAwards);
  //=> "unresolved method conflict for 'initialize'"

// plays well with prototypes

var AwardWinningMusician = composeBehaviour(
  SingsSongs,
  resolve(HasAwards, { after: ['initialize'] })
);

var Musician = Prototype(Careerist, AwardWinningMusician);

var henry = Object.create(Musician).initialize();
henry.setName('Seal Henry Samuel');
henry.setCareer('Singer');
henry.addSong('Kiss from a Rose');
henry.addAward('Best British Male');

notes
  1. JavaScript does not actually pass functions or any other kind of object around, it passes references to functions and other objects around. But it’s awkward to describe var I = function (a) { return a; } as binding a reference to a function to the variable I, so we often take an intellectually lazy shortcut, and say the function is bound to the variable I. This is much the same shorthand as saying that somebody added me to the schedule for an upcoming conference: They really added my name to the list, which is a kind of reference to me. 

  2. This exercise was snarfed from The Four Rules of Simple Design 

  3. Exercise for the reader: Given either begin1 or begin, why are the functions function () {} and function (fn) { return fn; } considered important? 

https://raganwald.com/2014/04/26/what-does-javascript-value
Mixins, Forwarding, and Delegation in JavaScript
preface: where did the prototypes go?

This essay discusses how to separate JavaScript domain properties from object behaviour, without prototypes. This is deliberate. By examining four basic ways to have one object define the behaviour of other objects, we gain insight into what we’re trying to accomplish at a very basic level.

We can then take this insight to working with prototypes and understand the conveniences that prototypes provide as well as the tradeoffs that they make. That does not mean, of course that just because prototypes (or classes, for that matter) are not mentioned here, that prototypes are considered inferior to any of these techniques.

This is an essay, not a style guide.

Why metaobjects?

It is technically possible to write software using objects alone. When we need behaviour for an object, we can give it methods by binding functions to keys in the object:

var sam = {
  firstName: 'Sam',
  lastName: 'Lowry',
  fullName: function () {
    return this.firstName + " " + this.lastName;
  },
  rename: function (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

We call this a “naïve” object. It has state and behaviour, but it lacks division of responsibility between its state and its behaviour.

This lack of separation has two drawbacks. First, it intermingles properties that are part of the model domain (such as firstName) with methods (and possibly other properties, although none are shown here) that are part of the implementation domain. Second, when we need to share common behaviour, we could have objects share common functions, but it does not scale: There’s no sense of organization, no clustering of objects and functions that share a common responsibility.

Metaobjects solve the lack-of-separation problem by separating the domain-specific properties of objects from their behaviour and implementation-specific properties.

The basic principle of the metaobject is that we separate the mechanics of behaviour from the domain properties of the base object. This has immediate engineering benefits, and it’s also the foundation for designing programs with higher-level constructs like formal classes, expectations, and delegation.


Mixins, Forwarding, and Delegation

The simplest possible metaobject in JavaScript is a mixin. Consider our naïve object:

var sam = {
  firstName: 'Sam',
  lastName: 'Lowry',
  fullName: function () {
    return this.firstName + " " + this.lastName;
  },
  rename: function (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
}

We can separate its domain properties from its behaviour:

var sam = {
  firstName: 'Sam',
  lastName: 'Lowry'
};

var person = {
  fullName: function () {
    return this.firstName + " " + this.lastName;
  },
  rename: function (first, last) {
    this.firstName = first;
    this.lastName = last;
    return this;
  }
};

And use extend to mix the behaviour in:

var __slice = [].slice;

function extend () {
  var consumer = arguments[0],
      providers = __slice.call(arguments, 1),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        consumer[key] = provider[key];
      };
    };
  };
  return consumer;
};

extend(sam, person);

sam.rename
  //=> [Function]

This allows us to separate the behaviour from the properties in our code. If we want to use the same behaviour with another object, we can do that:

var peck = {
  firstName: 'Sam',
  lastName: 'Peckinpah'
};

extend(peck, person);

Our person object is a template: it provides some functionality to be mixed into an object with a function like extend. Using templates does not require copying entire functions around; each object gets references to the functions in the template.
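
Here is a small sketch that checks the claim about references. It repeats extend and a trimmed-down person template so that it runs by itself; the variable names sharesFunction and fullNames are assumptions introduced for the example:

```javascript
var __slice = [].slice;

// Same extend as above, repeated so the snippet is self-contained.
function extend () {
  var consumer = arguments[0],
      providers = __slice.call(arguments, 1),
      key,
      i,
      provider;

  for (i = 0; i < providers.length; ++i) {
    provider = providers[i];
    for (key in provider) {
      if (provider.hasOwnProperty(key)) {
        consumer[key] = provider[key];
      }
    }
  }
  return consumer;
}

var person = {
  fullName: function () {
    return this.firstName + " " + this.lastName;
  }
};

var sam = extend({ firstName: 'Sam', lastName: 'Lowry' }, person);
var peck = extend({ firstName: 'Sam', lastName: 'Peckinpah' }, person);

// Both objects hold a reference to the very same function object
var sharesFunction = (sam.fullName === peck.fullName) &&
                     (sam.fullName === person.fullName);
  //=> true

// ...yet each invocation sees its own receiver's properties
var fullNames = [sam.fullName(), peck.fullName()];
  //=> ['Sam Lowry', 'Sam Peckinpah']
```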

Things get even better: You can use more than one template with the same object:

var hasCareer = {
  career: function () {
    return this.chosenCareer;
  },
  setCareer: function (career) {
    this.chosenCareer = career;
    return this;
  }
};

extend(peck, hasCareer);
peck.setCareer('Director');

We say that there is a many-to-many relationship between objects and templates.

scope and coupling

Consider a design that has four kinds of templates, we’ll call them A, B, C, and D. Objects in the system might mix in one, two, three, or all four templates. There are fifteen such “kinds” of objects, those that mix in A, B, AB, C, AC, BC, ABC, D, AD, BD, ABD, CD, ACD, BCD, and ABCD.

When you make a change to any one template, say A, you have to consider how that change will affect each of the eight kinds of objects that mix A in. In only one of those, A, do you just consider A’s behaviour by itself. In AB, ABC, ABD, and ABCD, you have to consider how changes to A may interact with B, because they both share access to each object’s private state. Same for A and C, and A and D, of course.

By itself this is not completely revelatory: When objects interact with each other in the code, there are going to be dependencies between them, and you have to manage those dependencies.

Encapsulation solves this problem by strictly limiting the scope of interaction between objects. If object a invokes a method x() on object b, we know that the scope of interaction between a and b is strictly limited to the method x(). We also know that any change in state it may create is strictly limited to the object b, because x() cannot reach back and touch a’s private state.

(There is some simplification going on here, as we are ignoring parameters and/or the possibility that a is part of b’s private state.)

However, two methods x() and y() on the same object are tightly coupled by default, because they both interact with all of the object’s private state. When we write an object like this:

var counter = {
  _value: 0,
  value: function () {
    return this._value;
  },
  increment: function () {
    ++this._value;
    return this;
  },
  decrement: function () {
    --this._value;
    return this;
  }
}

We fully understand that value(), increment(), and decrement() are coupled, and they are all together in our code next to each other.

Whereas, if we write:

function isanIncrementor (object) {
  object.increment = function () {
    ++this._value;
    return this;
  };
  return object;
}

// ...hundreds of lines of code...

function isaDecrementor (object) {
  object.decrement = function () {
    --this._value;
    return this;
  };
  return object;
}

Our two templates are tightly coupled to each other, but not obviously so. They just ‘happen’ to use the same property. And they might never be both mixed into the same object. Or perhaps they might. Who knows?
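
To make the hidden coupling concrete, here is a sketch that mixes both templates into one object. The tally object and its starting _value of 0 are assumptions made up for this example:

```javascript
function isanIncrementor (object) {
  object.increment = function () {
    ++this._value;
    return this;
  };
  return object;
}

function isaDecrementor (object) {
  object.decrement = function () {
    --this._value;
    return this;
  };
  return object;
}

// Hypothetical object that happens to mix in both templates
var tally = isaDecrementor(isanIncrementor({ _value: 0 }));

tally.increment().increment().decrement();

// Both templates have been reading and writing the same property
var tallyValue = tally._value;
  //=> 1
```

Nothing in either function's source mentions the other, yet renaming _value in one of them would silently break any object that mixes in both.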

The technical term for templates referring to an object’s private properties is open recursion. It is powerful and flexible, in exactly the same sense that having objects refer to each other’s internal properties is powerful and flexible.

And just as objects can encapsulate their own private state, so can templates.

templates with private properties

Let’s revisit our hasCareer template:

var hasCareer = {
  career: function () {
    return this.chosenCareer;
  },
  setCareer: function (career) {
    this.chosenCareer = career;
    return this;
  }
};

hasCareer stores its private state in the object’s chosenCareer property. As we’ve seen, that introduces coupling if any other method touches chosenCareer. What we’d like to do is make chosenCareer private. Specifically:

  1. We wish to store a copy of chosenCareer for each object that uses the hasCareer template. Mark Twain is a writer, Sam Peckinpah is a director.
  2. chosenCareer must not be a property of each person object, because we don’t want other methods accessing it and becoming coupled.

We have a few options. The very simplest, and most “native” to JavaScript, is to use a closure.

privacy through closures

We’ll write our own functional mixin:

function HasPrivateCareer (obj) {
  var chosenCareer;

  obj.career = function () {
    return chosenCareer;
  };
  obj.setCareer = function (career) {
    chosenCareer = career;
    return this;
  };
  return obj;
}

HasPrivateCareer(peck);

chosenCareer is a variable within the scope of the HasPrivateCareer function, so the career and setCareer methods can both access it through lexical scope, but no other method can or ever will.

This approach works well for simple cases, but it only works for named variables. We can’t, for example, write a function that iterates through all of the private properties of this kind of functional mixin, because they aren’t properties, they’re variables. In the end, we have privacy, but we achieve it by not using properties at all.
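
A quick sketch shows both halves of that trade-off: every object gets its own closure, and the private value never shows up as a property. The twain object and the names careers, visibleKeys, and leaked are assumptions added for illustration:

```javascript
function HasPrivateCareer (obj) {
  var chosenCareer;

  obj.career = function () {
    return chosenCareer;
  };
  obj.setCareer = function (career) {
    chosenCareer = career;
    return this;
  };
  return obj;
}

var peck = HasPrivateCareer({ firstName: 'Sam', lastName: 'Peckinpah' });
var twain = HasPrivateCareer({ firstName: 'Mark', lastName: 'Twain' });

peck.setCareer('Director');
twain.setCareer('Author');

// Each call to HasPrivateCareer created a separate chosenCareer variable
var careers = [peck.career(), twain.career()];
  //=> ['Director', 'Author']

// The private value is a variable, not a property, so it cannot be enumerated
var visibleKeys = Object.keys(peck);
  //=> ['firstName', 'lastName', 'career', 'setCareer']
var leaked = ('chosenCareer' in peck);
  //=> false
```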

privacy through objects

Another way to achieve privacy in templates is to write them as methods that operate on this, but sneakily make this refer to a different object. Let’s revisit our extend function:

function extendPrivately (receiver, template) {
  var methodName,
      privateProperty = Object.create(null);

  for (methodName in template) {
    if (template.hasOwnProperty(methodName)) {
      receiver[methodName] = template[methodName].bind(privateProperty);
    };
  };
  return receiver;
};

We don’t need to embed variables and methods in our function. It creates one private object (held in the variable privateProperty), and then uses .bind to ensure that each method is bound to that object instead of to the receiver object being extended with the template.

Now we can extend any object with any template, ‘privately:’

var twain = {
  firstName: 'Mark',
  lastName: 'Twain'
};

extendPrivately(twain, hasCareer);
twain.setCareer('Author');
twain.career()
  //=> 'Author'

Has it modified twain’s properties?

twain.chosenCareer
  //=> undefined

No. twain has .setCareer and .career methods, but .chosenCareer is a property of an object created when twain was privately extended, then bound to each method using .bind.

The advantage of this approach over closures is that the template and the mechanism for mixing it in are separate: You just write the template’s methods, you don’t have to carefully ensure that they access private state through variables in a closure.
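
As a sketch of how this behaves with more than one object, we can privately extend two people with the same template and confirm that each gets its own hidden state. The object literals and the names privateCareers, twainKeys, and chainable are assumptions for the example:

```javascript
function extendPrivately (receiver, template) {
  var methodName,
      privateProperty = Object.create(null);

  for (methodName in template) {
    if (template.hasOwnProperty(methodName)) {
      receiver[methodName] = template[methodName].bind(privateProperty);
    }
  }
  return receiver;
}

var hasCareer = {
  career: function () {
    return this.chosenCareer;
  },
  setCareer: function (career) {
    this.chosenCareer = career;
    return this;
  }
};

var twain = extendPrivately({ firstName: 'Mark', lastName: 'Twain' }, hasCareer);
var peck = extendPrivately({ firstName: 'Sam', lastName: 'Peckinpah' }, hasCareer);

twain.setCareer('Author');
peck.setCareer('Director');

// One private object is created per call to extendPrivately
var privateCareers = [twain.career(), peck.career()];
  //=> ['Author', 'Director']

// Only the methods are visible on the receiver
var twainKeys = Object.keys(twain);
  //=> ['firstName', 'lastName', 'career', 'setCareer']

// Inside a bound method, `this` is the private object, so `return this`
// hands back the private object rather than twain
var chainable = (twain.setCareer('Author') === twain);
  //=> false
```

The last line is worth noticing when a template relies on method chaining: with this particular implementation, the value returned by setCareer is the private object, not the receiver.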

another way to achieve privacy through objects

In our scheme above, we used .bind to create methods bound to a private object before mixing references to them into our object. There is another way to do it:

function forward (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(toProvider, arguments);
    };
  });

  return receiver;
};

This function forwards methods to another object. It can be any other object: a metaobject specifically designed to define behaviour, or a domain object that has other responsibilities.

Dispensing with a lot of mixins, here is a very simple example. We start with some kind of investment portfolio object that has a netWorth method:

var portfolio = {
  _investments: [],
  addInvestment: function (investment) {
    this._investments.push(investment);
    return this;
  },
  netWorth: function () {
    return this._investments.reduce(
      function (acc, investment) {
        return acc + investment.value;
      },
      0
    );
  }
};

And next we create an investor who has this portfolio of investments:

var investor = {
  //...
}

What if we want to make investments and to know an investor’s net worth?

forward(investor, ['addInvestment', 'netWorth'], portfolio);

We’re saying “Forward all requests for addInvestment and netWorth to the portfolio object.”
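
Putting those pieces together, a runnable sketch might look like the following. The investor's name property and the two investments are invented for illustration:

```javascript
function forward (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(toProvider, arguments);
    };
  });

  return receiver;
}

var portfolio = {
  _investments: [],
  addInvestment: function (investment) {
    this._investments.push(investment);
    return this;
  },
  netWorth: function () {
    return this._investments.reduce(
      function (acc, investment) {
        return acc + investment.value;
      },
      0
    );
  }
};

// Hypothetical investor with made-up data
var investor = { name: 'Sam Lowry' };

forward(investor, ['addInvestment', 'netWorth'], portfolio);

investor.addInvestment({ name: 'IRA fund', value: 872000 });
investor.addInvestment({ name: 'Savings', value: 28000 });

var worth = investor.netWorth();
  //=> 900000

// The state lives in the provider, not in the investor
var investorHasState = ('_investments' in investor);
  //=> false
var portfolioSize = portfolio._investments.length;
  //=> 2
```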

forwarding

Forwarding is a relationship between a receiver, the object that receives a method invocation, and a provider object. (We also call the receiver the consumer of the provider’s behaviour.) They may be peers. The provider may be contained by the consumer. Or perhaps the provider is a metaobject.

When forwarding, the provider object has its own state. There is no special binding of function contexts; instead, the consumer object has its own methods that forward to the provider and return the result. Our forward function above handles all of that, iterating over the method names we give it and making forwarding methods in the consumer.

The key idea is that when forwarding, the provider object handles each method in its own context. This is very similar to the effect of our solution with .bind above, but not identical.

Because there is a forwarding method in the consumer object and a handling method in the provider, the two can be varied independently. Here’s a snippet of our forward function from above:

receiver[methodName] = function () {
  return toProvider[methodName].apply(toProvider, arguments);
}

Each forwarding function invokes the method in the provider by name. So we can do this:

portfolio.netWorth = function () {
  return "I'm actually bankrupt!";
}

We’re overwriting the method in the portfolio object, but not the forwarding function. So now, our investor object will forward invocations of netWorth to the new function, not the original. This is not how our .bind system worked above.

That makes sense from a “metaphor” perspective. With our extendPrivately function above, we are creating an object as a way of making private state, but we don’t think of it as really being a first-class entity unto itself. We’re mixing those specific methods into a consumer.

Another way to say this is that mixing in is “early bound,” while forwarding is “late bound:” We’ll look up the method when it’s invoked.
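
Here is a side-by-side sketch of the two binding times. It repeats extendPrivately and forward from above; the status object and its report method are hypothetical names invented for the comparison:

```javascript
function extendPrivately (receiver, template) {
  var methodName,
      privateProperty = Object.create(null);

  for (methodName in template) {
    if (template.hasOwnProperty(methodName)) {
      receiver[methodName] = template[methodName].bind(privateProperty);
    }
  }
  return receiver;
}

function forward (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(toProvider, arguments);
    };
  });
  return receiver;
}

// Hypothetical provider with a single method
var status = {
  report: function () {
    return 'solvent';
  }
};

var mixedIn = extendPrivately({}, status);       // captures the function now
var forwarding = forward({}, ['report'], status); // looks it up on each call

// Replace the method after both objects have been set up
status.report = function () {
  return 'bankrupt';
};

var earlyBound = mixedIn.report();
  //=> 'solvent'
var lateBound = forwarding.report();
  //=> 'bankrupt'
```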

summarizing what we know so far

So now we have three things: Mixing in a template; mixing in a template with private state for its methods (“Private Mixin”); and forwarding to a first-class object. And we’ve talked all around two questions:

  1. Is the mixed-in method being early-bound? Or late-bound?
  2. When a method is invoked on a receiving object, is it evaluated in the receiver’s context? Or in the metaobject’s state’s context?

If we make a little table, each of those three things gets its own spot:

                          Early-bound      Late-bound
  Receiver’s context      Mixin
  Metaobject’s context    Private Mixin    Forwarding

So… What goes in the missing spot? What is late-bound, but evaluated in the receiver’s context?

delegation

Let’s build it. Here’s our forward function, modified to evaluate method invocation in the receiver’s context:

function delegate (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(receiver, arguments);
    };
  });

  return receiver;
};

This new delegate function does exactly the same thing as the forward function, but the function that does the delegation looks like this:

function () {
  return toProvider[methodName].apply(receiver, arguments);
}

It uses the receiver as the context instead of the provider. This has all the same coupling implications that our mixins have, of course. And it layers in additional indirection. The indirection gives us some late binding, allowing us to modify the metaobject’s methods after we have delegated behaviour from a receiver to it.

delegation vs. forwarding

Delegation and forwarding are very similar. One metaphor that might help distinguish them is to think of receiving an email asking you to donate some money to a worthy charity.

  • If you forward the email to a friend, and the friend donates money, the friend is donating their own money and getting their own tax receipt.
  • If you delegate responding to your accountant, the accountant donates your money to the charity and you receive the tax receipt.

In both cases, the other entity does the work when you receive the email.
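
In code, the difference comes down to whose state the method sees. The following sketch uses a hypothetical provider with a whoAmI method and a _label property, both invented for this example:

```javascript
function forward (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(toProvider, arguments);
    };
  });
  return receiver;
}

function delegate (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(receiver, arguments);
    };
  });
  return receiver;
}

// Hypothetical provider: its method reports the _label of its context
var provider = {
  _label: 'provider',
  whoAmI: function () {
    return this._label;
  }
};

var forwarder = forward({ _label: 'receiver' }, ['whoAmI'], provider);
var delegator = delegate({ _label: 'receiver' }, ['whoAmI'], provider);

// Forwarding: the friend uses their own money
var forwardedResult = forwarder.whoAmI();
  //=> 'provider'

// Delegation: the accountant uses your money
var delegatedResult = delegator.whoAmI();
  //=> 'receiver'
```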


Later Binding

When comparing Mixins to Delegation (and comparing Private Mixins to Forwarding), we noted that the primary difference is that Mixins are early bound and Delegation is late bound. Let’s be specific. Given:

var counter = {};

var Incrementor = {
  increment: function () {
    ++this._value;
    return this;
  },
  value: function (optionalValue) {
    if (optionalValue != null) {
      this._value = optionalValue;
    }
    return this._value;
  }
};

extend(counter, Incrementor);

We are mixing Incrementor into counter. At some point later, we encounter:

counter.value(42);

What function handles the invocation of .value? Because we mixed Incrementor into counter, it’s the same function as Incrementor.value. We don’t look that up when counter.value(42) is evaluated, because that was bound to counter.value when we extended counter. This is early binding.

However, given:

var counter = {};

delegate(counter, ['increment', 'value'], Incrementor);

// ...time passes...

counter.value(42);

We again are most likely invoking Incrementor.value, but now we are determining this at the time counter.value(42) is evaluated. We bound the target of the delegation, Incrementor, to counter, but we are going to look the actual property of Incrementor.value up when it is invoked. This is late binding, and it is useful in that we can make some changes to Incrementor after the delegation has been set up, perhaps to add some logging.

It is very nice not to have to do things like this in a very specific order: When things have to be done in a specific order, they are coupled in time. Late binding is a decoupling technique.
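
For instance, here is a sketch of adding logging to Incrementor after the delegation is already in place. The log array and the wrapper around increment are assumptions introduced for illustration, not part of the original design:

```javascript
function delegate (receiver, methods, toProvider) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      return toProvider[methodName].apply(receiver, arguments);
    };
  });
  return receiver;
}

var Incrementor = {
  increment: function () {
    ++this._value;
    return this;
  },
  value: function (optionalValue) {
    if (optionalValue != null) {
      this._value = optionalValue;
    }
    return this._value;
  }
};

var counter = {};

delegate(counter, ['increment', 'value'], Incrementor);

// ...time passes, and we decide we want logging...

var log = [];                                  // hypothetical log sink
var originalIncrement = Incrementor.increment;

Incrementor.increment = function () {
  log.push('increment from ' + this._value);
  return originalIncrement.apply(this, arguments);
};

counter.value(41);
counter.increment();

var current = counter.value();
  //=> 42
var logged = log.slice();
  //=> ['increment from 41']
```

The counter was never touched again after delegate was called, yet it picks up the logging behaviour, because the lookup of Incrementor.increment happens on every invocation.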

but wait, there’s more

But we can get even later than that. Although the specific function is late bound, the target of the delegation, Incrementor, is early bound. We can late bind that too! Here’s a variation on delegate:

function delegateToOwn (receiver, methods, propertyName) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      var toProvider = receiver[propertyName];
      return toProvider[methodName].apply(receiver, arguments);
    };
  });

  return receiver;
};

This function sets things up so that an object can delegate to one of its own properties. Let’s take another look at the investor example. First, we’ll set up our portfolio to separate behaviour from properties with a standard mixin:

var HasInvestments = {
  addInvestment: function (investment) {
    this._investments.push(investment);
    return this;
  },
  netWorth: function () {
    return this._investments.reduce(
      function (acc, investment) {
        return acc + investment.value;
      },
      0
    );
  }
};

var portfolio = extend({_investments: []}, HasInvestments);

Next we’ll make that a property of our investor, and delegate to the property, not the object itself:

var investor = {
  // ...
  nestEgg: portfolio
}

delegateToOwn(investor, ['addInvestment', 'netWorth'], 'nestEgg');

Our investor object delegates the addInvestment and netWorth methods to its own nestEgg property. So far, this is just like the delegate method above. But consider what happens if we decide to assign a new portfolio to our investor:

var retirementPortfolio = extend({
  _investments: [
    {name: 'IRA fund', value: 872000}
  ]
}, HasInvestments);

investor.nestEgg = retirementPortfolio;

The delegateToOwn delegation now delegates to the new portfolio, because it is bound to the property name, not to the original object. This seems questionable for portfolios–what happens to the old portfolio when you assign a new one?–but has tremendous application for modeling classes of behaviour that change dynamically.
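
Here is a small self-contained sketch of swapping the provider at run time. Because delegated methods are evaluated in the receiver’s context, the state they read is kept on the receiver itself. The greeter object and the Formal and Casual providers are hypothetical, made up for this example:

```javascript
function delegateToOwn (receiver, methods, propertyName) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      var toProvider = receiver[propertyName];
      return toProvider[methodName].apply(receiver, arguments);
    };
  });
  return receiver;
}

// Two hypothetical providers offering the same method
var Formal = {
  greet: function () {
    return 'Good day, ' + this._name + '.';
  }
};

var Casual = {
  greet: function () {
    return 'Hey ' + this._name + '!';
  }
};

// The receiver owns the state (_name) and names its current provider (_style)
var greeter = {
  _name: 'Sam',
  _style: Formal
};

delegateToOwn(greeter, ['greet'], '_style');

var firstGreeting = greeter.greet();
  //=> 'Good day, Sam.'

// Reassigning the property changes where the next invocation is delegated
greeter._style = Casual;

var secondGreeting = greeter.greet();
  //=> 'Hey Sam!'
```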

state machines

A very common use case for this delegation is when building finite state machines. As described in the book Understanding the Four Rules of Simple Design by Corey Haines, you could implement Conway’s Game of Life using if statements. Hand waving furiously over other parts of the system, you might get:

var Universe = {
  // ...
  numberOfNeighbours: function (location) {
    // ...
  }
};

var thisGame = extend({}, Universe);

var Cell = {
  alive: function () {
    return this._alive;
  },
  numberOfNeighbours: function () {
    return thisGame.numberOfNeighbours(this._location);
  },
  aliveInNextGeneration: function () {
    if (this.alive()) {
      return (this.numberOfNeighbours() === 2 || this.numberOfNeighbours() === 3);
    }
    else {
      return (this.numberOfNeighbours() === 3);
    }
  }
};

var someCell = extend({
  _alive: true,
  _location: {x: -15, y: 12}
}, Cell);

One of the many insights from Understanding the Four Rules of Simple Design is that this business of having an if (this.alive()) in the middle of a method is a hint that cells are stateful.

We can extract this into a state machine using delegation to a property:

var Alive = {
  alive: function () {
    return true;
  },
  aliveInNextGeneration: function () {
    return (this.numberOfNeighbours() === 2 || this.numberOfNeighbours() === 3);
  }
};

var Dead = {
  alive: function () {
    return false;
  },
  aliveInNextGeneration: function () {
    return (this.numberOfNeighbours() === 3);
  }
};

var FsmCell = {
  numberOfNeighbours: function () {
    return thisGame.numberOfNeighbours(this._location);
  }
}

var someFsmCell = extend({
  _state: Alive,
  _location: {x: -15, y: 12}
}, FsmCell);

// delegateToOwn closes over its receiver, so we apply it to the cell itself
delegateToOwn(someFsmCell, ['alive', 'aliveInNextGeneration'], '_state');

someFsmCell delegates alive and aliveInNextGeneration to its _state property, and you can change its state with assignment:

someFsmCell._state = Dead;

In practice, states would be assigned en masse, but this demonstrates one of the simplest possible state machines. In the wild, most business objects are state machines, sometimes with multiple, loosely coupled states. Employees can be:

  • In or out of the office;
  • On probation, on contract, or permanent;
  • Part time or full time.

Delegation to a property representing state takes advantage of late binding to break behaviour into smaller components that have cleanly defined responsibilities.
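As a rough sketch of a cell that changes its own state, we can add a transition method that assigns a new state object. The advance method and the _neighbours property are assumptions made up for this example (a real cell would ask the universe for its neighbours), and delegateToOwn is repeated so the sketch runs on its own. The rules are the standard ones: a live cell survives with two or three neighbours, a dead cell comes alive with exactly three.

```javascript
// Repeated from above so the sketch is self-contained.
function delegateToOwn (receiver, methods, propertyName) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      var toProvider = receiver[propertyName];
      return toProvider[methodName].apply(receiver, arguments);
    };
  });
  return receiver;
}

var Alive = {
  alive: function () { return true; },
  aliveInNextGeneration: function () {
    var n = this.numberOfNeighbours();
    return n === 2 || n === 3;
  }
};

var Dead = {
  alive: function () { return false; },
  aliveInNextGeneration: function () {
    return this.numberOfNeighbours() === 3;
  }
};

var cell = {
  _state: Alive,
  _neighbours: 1, // hypothetical stand-in for asking the universe
  numberOfNeighbours: function () { return this._neighbours; },
  // hypothetical transition: pick the next state object and assign it
  advance: function () {
    this._state = this.aliveInNextGeneration() ? Alive : Dead;
    return this;
  }
};

delegateToOwn(cell, ['alive', 'aliveInNextGeneration'], '_state');

cell.advance();
var afterLoneliness = cell.alive();
  //=> false

cell._neighbours = 3;
cell.advance();
var afterBirth = cell.alive();
  //=> true
```

The advance method contains the only conditional, and it chooses a state rather than a behaviour. Each state object answers for itself.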

late bound forwarding

The exact same technique can be used for forwarding to a property, and forwarding to a property can also be used for some kinds of state machines. Forwarding to a property has lower coupling than delegation, and is preferred where appropriate.
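A minimal sketch of what that might look like follows. The name forwardToOwn is an assumption chosen to mirror delegateToOwn, and the two portfolios are invented for illustration. The only difference from delegation is the context: each method is applied to the provider, not the receiver.

```javascript
// Hypothetical counterpart to delegateToOwn: same late lookup,
// but the method runs in the provider's context.
function forwardToOwn (receiver, methods, propertyName) {
  methods.forEach(function (methodName) {
    receiver[methodName] = function () {
      var toProvider = receiver[propertyName];
      return toProvider[methodName].apply(toProvider, arguments);
    };
  });
  return receiver;
}

function netWorth () {
  return this._investments.reduce(function (acc, investment) {
    return acc + investment.value;
  }, 0);
}

var savings = {
  _investments: [{name: 'GIC', value: 5000}],
  netWorth: netWorth
};

var retirement = {
  _investments: [{name: 'IRA fund', value: 872000}],
  netWorth: netWorth
};

var investor = { nestEgg: savings };
forwardToOwn(investor, ['netWorth'], 'nestEgg');

var first = investor.netWorth();
  //=> 5000

investor.nestEgg = retirement;

var second = investor.netWorth();
  //=> 872000
```

The investor never touches _investments. Each portfolio keeps its own state, which is where the lower coupling comes from.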


Summary

We’ve seen four techniques for separating object behaviour from object properties:

  1. Mixins
  2. Private Mixins
  3. Forwarding
  4. Delegation

We’ve also seen how to implement “later binding” delegation by delegating or forwarding to an object property, and how this can be used for building a state machine. We’ve seen how these four techniques can be understood to implement two orthogonal ideas: Early versus late binding, and whether methods are evaluated in the receiver’s context or the metaobject’s context.

We deliberately haven’t discussed prototypes or the things you can build with prototypes (like classes). Instead, we take our understanding gleaned from these prototype-less techniques to help us understand what prototypes offer and what tradeoffs they make.

(discuss on hacker news)

https://raganwald.com/2014/04/10/mixins-forwarding-delegation
Class Hierarchies? Don't Do That!

In theory, JavaScript does not have classes. In practice, the following snippet of code is widely considered to be an example of a “class” in JavaScript:

function Account () {
  this._currentBalance = 0;
}

Account.prototype.balance = function () {
  return this._currentBalance;
}

Account.prototype.deposit = function (howMuch) {
  this._currentBalance = this._currentBalance + howMuch;
  return this;
}

// ...

var account = new Account();

The pattern can be extended to provide the notion of subclassing:

function ChequingAccount () {
  Account.call(this);
}

ChequingAccount.prototype = Object.create(Account.prototype);

ChequingAccount.prototype.sufficientFunds = function (cheque) {
  return this._currentBalance >= cheque.amount();
}

ChequingAccount.prototype.process = function (cheque) {
  this._currentBalance = this._currentBalance - cheque.amount();
  return this;
}

These classes and subclasses provide most of the features of classes we find in languages like Smalltalk:

  • Classes are responsible for creating objects and initializing them with properties (like the current balance);
  • Classes are responsible for and “own” methods; objects delegate method handling to their classes (and superclasses);
  • Methods directly manipulate an object’s properties.

This pattern has become so ingrained in JavaScript culture that ECMAScript-6, the upcoming major revision of the language, provides some “syntactic sugar” so that we can write classes and subclasses without writing the whole pattern out by hand. There is no significant semantic change: behind the scenes, everything works exactly as we see it here.
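For comparison, here is a sketch of the same two classes using the class syntax as it was standardized in ECMAScript 2015. The checks at the end are ours, added to show that the sugar still produces a function paired with a prototype. There are small differences in detail (methods defined this way are non-enumerable, for instance), so treat this as an illustration, not a line-for-line equivalent:

```javascript
class Account {
  constructor () {
    this._currentBalance = 0;
  }
  balance () {
    return this._currentBalance;
  }
  deposit (howMuch) {
    this._currentBalance = this._currentBalance + howMuch;
    return this;
  }
}

class ChequingAccount extends Account {
  sufficientFunds (cheque) {
    return this._currentBalance >= cheque.amount();
  }
  process (cheque) {
    // The subclass still reaches directly into the superclass's property.
    this._currentBalance = this._currentBalance - cheque.amount();
    return this;
  }
}

var kind = typeof Account;
  //=> 'function'

var methodLivesOnPrototype = typeof Account.prototype.deposit;
  //=> 'function'

var prototypesAreChained =
  Object.getPrototypeOf(ChequingAccount.prototype) === Account.prototype;
  //=> true
```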

Smalltalk was, of course, invented forty years ago. In those forty years, we’ve learned a lot about what works and what doesn’t work in object-oriented programming. Unfortunately, this pattern celebrates the things that don’t work, and glosses over or omits the things that work.

Even more unfortunately, the upcoming syntactic sugar doesn’t solve any of the problems with classes, but instead solves the problem of “I wish to type fewer characters,” or perhaps for the new programmer, “I don’t understand how all these moving parts actually work, so I might type it wrong, isn’t there an easier way to type it?”

the semantic problem with hierarchies

At a semantic level, classes are the building blocks of an ontology. This is often formalized in a diagram:

An ontology of accounts

The idea behind class-based OO is to classify (note the word) our knowledge about objects into a tree. At the top is the most general knowledge about all objects, and as we travel down the tree, we get more and more specific knowledge about specific classes of objects, e.g. objects representing Visa Debit accounts.

Only, the real world doesn’t work that way. It really doesn’t work that way. In morphology, for example, we have penguins, birds that swim. And the bat, a mammal that flies. And monotremes like the platypus, an animal that lays eggs but nurses its young with milk.

It turns out that our knowledge of the behaviour of non-trivial domains (like morphology or banking) does not classify into a nice tree, it forms a directed acyclic graph. Or if we are to stay in the metaphor, it’s a thicket.
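To make the thicket concrete, here is a loose sketch in which behaviours are plain objects mixed into animals. Everything in it is invented for illustration, including the simplified extend helper and the behaviour names. No single tree could place Swims above both the penguin and the platypus while also placing LaysEggs above both and keeping NursesYoung with the mammals:

```javascript
// A simplified stand-in for a mixin helper: copies own properties
// from each source object onto the target.
function extend (target) {
  for (var i = 1; i < arguments.length; ++i) {
    var source = arguments[i];
    Object.keys(source).forEach(function (key) {
      target[key] = source[key];
    });
  }
  return target;
}

var LaysEggs    = { reproduce: function () { return 'lays eggs'; } };
var NursesYoung = { feedYoung: function () { return 'nurses with milk'; } };
var Swims       = { travel: function () { return 'swims'; } };
var Flies       = { travel: function () { return 'flies'; } };

var penguin  = extend({name: 'penguin'}, LaysEggs, Swims);
var bat      = extend({name: 'bat'}, NursesYoung, Flies);
var platypus = extend({name: 'platypus'}, LaysEggs, NursesYoung, Swims);

var platypusSummary = [platypus.reproduce(), platypus.feedYoung(), platypus.travel()].join(', ');
  //=> 'lays eggs, nurses with milk, swims'
```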

Furthermore, the idea of building software on top of a tree-shaped ontology would be broken even if our knowledge fit neatly into a tree. Ontologies are not used to build the real world, they are used to describe it from observation. As we learn more, we are constantly updating our ontology, sometimes moving everything around.

In software, this is incredibly destructive: Moving everything around breaks everything. In the real world, the humble Platypus does not care if we rearrange the ontology, because we didn’t use the ontology to build Australia, just to describe what we found there.

It’s sensible to build an ontology from observation of things like bank accounts. That kind of ontology is useful for writing requirements, use cases, tests, and so on. But that doesn’t mean that it’s useful for writing the code that implements bank accounts.

Class Hierarchies are the wrong semantic model, and the wisdom of forty years of experience with them is that there are better ways to compose programs.

encapsulation

Those are the semantic issues. Let’s talk about the engineering issues, and address classes as if we don’t care whether they represent some knowledge in the real world. Let’s presume that classes are just a tool for getting our programs to work. Are they still a problem?

Class hierarchies are a problem, even if all we want to do is use them to implement behaviour. Programs have three important use cases:

  1. Programs must be easy to write;
  2. Programs must be easy to understand;
  3. Programs must be easy to change.

Classes have tradeoffs for all three of these use cases, but class hierarchies are harmful with respect to understanding and changing programs, because the way they work creates an encapsulation problem.

Encapsulation is a core principle of object-oriented programming. (Other styles of programming, such as functional programming, also value encapsulation, although they implement it in different ways). In OOP, encapsulation is achieved by objects having private state and exposing a public interface in the form of methods.

JavaScript does not enforce private state, but it’s easy to write well-encapsulated programs: simply avoid having one object directly manipulate another object’s properties. Forty years after Smalltalk was invented, this is a well-understood principle.

Obviously, code will have dependencies. A will depend on B, and B will depend on C, and dependencies are transitive, so A depends on B and A also depends on C. Encapsulation doesn’t eliminate dependencies, but it does limit the scope of the dependencies: If we change B and/or C, we will not break A provided that we do not remove or change the externally observable behaviour of any of the methods A uses.

So far so good. Or at least, it is if A, B and C are objects and/or functions. For example:

function depositAndReturnBalance(account, amount) {
  return account.deposit(amount).balance();
}

var account = new Account();
depositAndReturnBalance(account, 100)
  //=> 100

depositAndReturnBalance obviously depends on passing in an object that implements both the .deposit and .balance methods. But it doesn’t depend on how those methods are implemented: we could write this for Account and get the same behaviour:

function Account () {
  this._transactionHistory = [];
}

Account.prototype.balance = function () {
  return this._transactionHistory.reduce(function (acc, transaction) {
    return acc + transaction;
  }, 0);
}

Account.prototype.deposit = function (howMuch) {
  this._transactionHistory.unshift(howMuch)
  return this;
}

function depositAndReturnBalance(account, amount) {
  return account.deposit(amount).balance();
}

var account = new Account();
depositAndReturnBalance(account, 100)
  //=> 100

Completely different implementation of .deposit and .balance, but depositAndReturnBalance does not depend upon the implementation.

So, our class provides us with a way to encapsulate the implementation of an account balance. Great! What is the problem?

superclasses are not encapsulated

We said that encapsulation in JavaScript works when the entities involved are objects and/or functions. But what about classes?

It turns out, the relationship between classes in a hierarchy is not encapsulated. This is because classes do not relate to each other through a well-defined interface of methods while “hiding” their internal state from each other.

Here’s the way our ChequingAccount subclass implements the .process method:

ChequingAccount.prototype.process = function (cheque) {
  this._currentBalance = this._currentBalance - cheque.amount();
  return this;
}

If we rewrite the Account class to use a transaction history instead of a current balance, it breaks the code in ChequingAccount. In JavaScript (and other languages in the same family), classes and subclasses share access to the object’s private properties. It is not possible to change an implementation detail for Account without carefully checking every single subclass and the code depending on those subclasses to see if our internal, “private” change will break them.
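Here is a small sketch of that breakage, combining the transaction-history version of Account with the unchanged ChequingAccount. The cheque object is a hypothetical stub that only knows its amount:

```javascript
// Account, rewritten to keep a transaction history.
function Account () {
  this._transactionHistory = [];
}

Account.prototype.balance = function () {
  return this._transactionHistory.reduce(function (acc, transaction) {
    return acc + transaction;
  }, 0);
};

Account.prototype.deposit = function (howMuch) {
  this._transactionHistory.unshift(howMuch);
  return this;
};

// ChequingAccount, unchanged: still written against _currentBalance.
function ChequingAccount () {
  Account.call(this);
}

ChequingAccount.prototype = Object.create(Account.prototype);

ChequingAccount.prototype.process = function (cheque) {
  this._currentBalance = this._currentBalance - cheque.amount();
  return this;
};

// Hypothetical stub standing in for a real cheque.
var cheque = { amount: function () { return 30; } };

var chequing = new ChequingAccount();
chequing.deposit(100).process(cheque);

var balanceAfterCheque = chequing.balance();
  //=> 100

var strayProperty = chequing._currentBalance;
  //=> NaN
```

Nothing throws. The cheque is silently dropped from the balance, and the subclass leaves behind a _currentBalance property that the superclass no longer knows about.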

Of course, we know that there are dependencies in code, so we are not surprised that subclasses depend on classes. But what is different is that this dependency is not limited in scope to a carefully curated interface of methods and behaviour. We have no encapsulation.

This problem is not a new discovery. It is well-understood, it even has a name: It’s called the Fragile Base Class Problem. Changes to classes near the root of the tree have far-reaching implications, and implications that are orders of magnitude more risky because there is no encapsulation.

Class hierarchies create brittle programs that are difficult to modify.

going forward

JavaScript first appeared in 1995, approximately 15 years after Smalltalk was first publicized. In the twenty years since then, we have learned a great many things about the good and the bad parts of JavaScript the language, and we have also learned a great many things about the good and the bad ideas in Object-Oriented Programming.

It seems obvious that we should look back and learn from what came before. Good ideas, like encapsulation, functions as first-class objects, delegation, traits, and composition should be embraced and improved upon. New ideas, like promises, should be developed.

People often say that “JavaScript isn’t Ruby,” that it’s prototype-based and not class-based. That’s true, but the opportunity is wasted when we reinvent, poorly, ideas that were invented forty years ago and have been deprecated ever since.

So if someone asks you to explain how to write a class hierarchy? Go ahead and tell them: “Don’t do that!”

(discuss on hacker news, /r/javascript, and /r/programming)

https://raganwald.com/2014/03/31/class-hierarchies-dont-do-that
Future JS

I am delighted to announce that I will be joining Jeremy Ashkenas, David Nolen, Joanne Cheng, and a host of other thoughtful (and exciting!) speakers at Barcelona Future JS, May 1 – 3, 2014. Future JS is organized by the same good people who’ve organized the amazing BaRuCo (“Barcelona Ruby Conference”), and I’m giddy with anticipation.

Barcelona Future JS

We’re at a very exciting place in JavaScript’s history. I predict that some years from now, we’ll look back at this as a turning point. JavaScript owes a great deal of its popularity to being in the right place–the browser–at the right time–the explosion of the Internet. Most people accept that it is an imperfect language with some good bones and some people who are amazingly passionate about building on those good bones.

And there is exciting building going on, in compile-to-JS languages like CoffeeScript, in compile-to-JS technologies, in sweet.js hygienic macros, and in compile-to-JS infrastructure like asm.js. But that’s far from all. Within JavaScript itself, people are doing great work on new paradigms like promises and react.js.

But people are also looking backwards. In mature OO language communities like Java, the notion of class hierarchies has been severely deprecated. Folks have realized that building deep ontologies of subclasses leads to the fragile base class anti-pattern.

And yet… ES6 introduces a class keyword that may actually encourage us to move backwards to the programming style of the 1980s and 1990s. Not a day goes by without finding another “how to write classes in JavaScript” blog post. I wrote one myself!

Cambrian Explosion

the future

JavaScript is unique amongst languages in that it is so minimal, so un-opinionated, that literally anyone can invent The Next Big Idea. It provides so little that in some ways, it makes anything possible. That’s part of why there are so many ideas going on in the JavaScript community, why there’s a “Cambrian Explosion” of frameworks and libraries in JavaScript. That’s what makes it a very interesting language and very interesting community if you’re interested in new ideas, if you’re interested in the future.

And being interested in the future, I think it’s wrong to slavishly copy the ideas of forty years ago: JavaScript may have first-class functions and delegation through prototypes, but it isn’t Clojure, it isn’t Haskell, and it certainly isn’t Smalltalk.

That doesn’t mean we should ignore the lessons those languages teach, and ignore the wisdom those communities have accumulated. It’s ridiculous to blatantly ignore sensible and proven practices like encapsulation and use “JavaScript isn’t such-and-such-a-language” as an excuse for writing fragile, coupled code.

We should learn from what came before, and then we should go ahead and invent what is to come next. We should not ask ourselves how to copy the ideas of 1984, we should ask ourselves how to invent ideas other languages will copy from JavaScript in 2044.

I’m looking forward to Future JS, and not just because it’ll be an opportunity to talk to people about these ideas: It’ll be an opportunity for me to hear first-hand some of the ideas that might be part of JavaScript’s legacy in the years and decades to come.

If that excites you the way it excites me, I’ll see you there!

Get your ticket now.

https://raganwald.com/2014/03/17/future-js
Writing OOP using OOP

As many people have pointed out, if you turn your head sideways and squint, the following JavaScript can be considered a “class:”

function QuadTree (nw, ne, se, sw) {
  this._nw = nw;
  this._ne = ne;
  this._se = se;
  this._sw = sw;
}
QuadTree.prototype.population = function () {
  return this._nw.population() +
         this._ne.population() +
         this._se.population() +
         this._sw.population();
}

This is very different than the kind of class you find in Smalltalk, but it’s “close enough for government work,” so what’s the big deal?

No big deal, really, there is plenty of excellent JavaScript software that uses this exact pattern for creating objects that delegate their behaviour to a common prototype. But we programmers have a voracious appetite for learning, so in the interests of understanding what we give up, here’s an explanation of how JavaScript’s simple out-of-the-box OO differs from Smalltalk-style OO, and why that might matter for some projects.

the basic oo-proposition

The basic proposition of OO is that objects encapsulate their private state. They provide methods, and you query and update the objects by invoking methods. Objects do not directly access or manipulate each other’s internal state. This system is held to lower coupling and increase flexibility, as the interactions between objects are understood to be limited entirely to the methods they expose.

In the QuadTree example above, although we don’t know what kinds of things they store, we know that if you want to know a QuadTree’s population, you don’t muck about with its internal state, you call .population(), and it does the rest.

Another part of the proposition is that objects delegate their behaviour to some kind of metaobject, typically called a “class,” although in JavaScript, metaobjects are actually called prototypes. This delegation is the most accessible way for two or more objects to share a common set of methods.

Most people who choose to program JavaScript in an OO style readily accept this proposition. Encapsulation is good: By hiding internal state and manipulation, you get delegation, you get polymorphism, you get code that is cohesive but not tightly coupled.

This is why they build “classes” representing the various entities in their problem domain. For a JavaScript implementation of HashLife, you might find Cell and QuadTree classes, for example.

using oop to write oop

And yet… When it comes to writing and manipulating their classes, does this code look like it encapsulates private state? Or does it look like code directly manipulates internal state?

function Cell (population) {
  this._population = population;
}
Cell.prototype.population = function () {
  return this._population;
}

Quite clearly, while this code supports OOP, it is itself written in a non-OOP manner: it is written with the expectation that other entities get to directly manipulate Cell.prototype. What would this code look like if we took the basic proposition of OOP and applied it to writing classes and not just using classes?

Quite obviously, classes would be objects that you manipulate with methods. Something like:

Cell.defineMethod('population', function () {
  return this._population;
});

Likewise, there is no new Cell(1) in a fully OO sense, we should not assume that Cell is some kind of function. So instead, we have:

var empty = Cell.create(0);
var occupied = Cell.create(1)

If Cell has methods like defineMethod and create, it obviously is an object itself. Now, Cell.defineMethod is presumed to exist, and so is QuadTree.defineMethod. How does OOP handle things when two or more objects share some method? Right! They are both instances of a class.

What is the class of Cell and of QuadTree? How about Class? Let us assume there is a Class class. How do we make Cell and QuadTree out of Class?

var Cell = Class.create();
var QuadTree = Class.create();

Naturally. Everything’s an object, everything follows the same rules, we don’t need to remember a bunch of special cases, because we aren’t peeking at the implementation and directly manipulating an object’s internal state.

OOP allows us to create a subclass for the purpose of extending or sometimes overriding behaviour. So let’s imagine that if we want, we can write something like:

var MinimalQuadTree = Class.create(QuadTree);

This establishes that MinimalQuadTree is a subclass of QuadTree, and somewhere in the implementation of .create is the logic that correctly wires the appropriate prototypes up so that every instance of MinimalQuadTree can delegate population() to QuadTree’s implementation.

“You aren’t serious about OOP until you subclass Class.”

going beyond

We haven’t looked at defineMethod’s implementation, but presumably it looks something like this:

Class.defineMethod('defineMethod', function (name, body) {
  this.prototype[name] = body;
  return this;
});

It hardly seems worth the trouble to abstract this simple line of code away. However, we have strong imaginations: let’s make up a problem, then use our new tools to solve it.

We start with:

var Counter = Class.create();
Counter
  .defineMethod('initialize', function () { this._count = 0; })
  .defineMethod('increment', function () { ++this._count; })
  .defineMethod('count', function () { return this._count; });

var c = Counter.create();

(Every essay should include a counter example)

And we add a function written in continuation-passing-style:

function log (message, callback) {
  console.log(message);
  return callback();
}

Alas, we can’t use our counter:

log("doesn't work", c.increment);

The trouble is that the expression c.increment returns the body of the method, but when it is invoked using callback(), the original context of c has been lost. The usual solution is to write:

log("works", c.increment.bind(c));

The .bind method binds the context permanently. Another solution is to write (or use a function to write):

c.increment = c.increment.bind(c);

Then we can write:

log("works without thinking about it", c.increment);

It seems like a lot of trouble to be writing this out everywhere, especially when the desired behaviour is nearly always that methods be bound. Is there a better way?

Recall from above that Class is a class. And that classes can be subclassed. Let’s try it:

var SelfBindingClass = Class.create(Class);

We can override methods in a subclass. Let’s override defineMethod to add some custom semantics:

SelfBindingClass.defineMethod( 'defineMethod', function (name, body) {
  Object.defineProperty(this.prototype, name, {
    get: function () {
      return body.bind(this);
    }
  });
  return this;
});

Let’s try our new subclass of Class:

Counter = SelfBindingClass.create();

c = Counter.create();

log("still works without thinking about it", c.increment);

Classes that are instances of SelfBindingClass are now self-binding. Every one of their methods acts like it’s bound to the instance without special handling.

let’s think about that again

This last example is small, but incredibly important. The proposition of OO is that by encapsulating internal state, you can decouple the what one object wants from the how another object gets it done. You can swap objects for each other using polymorphism. You can delegate.

The last example shows how using first-class objects for classes, objects that encapsulate their internal state and themselves are instances of classes, we can write code that implements new kinds of semantics–like binding methods to objects–without requiring all other code to be coupled to the exact representation employed.

From there, you can go to places like flavouring methods with before- and after- advice, adding singleton/eigenclasses to objects, pattern-matching methods, the entire world of computing paradigms is open to you.

All this is certainly not necessary for writing good JavaScript programs. But if we do buy the proposition that OO is a good idea for our domain, shouldn’t we ask ourselves why we aren’t using it for our classes?

(discuss on hacker news)

p.s. A few people have pointed out that if you want a subset of classes to share functionality, alternatives such as mixing in traits are superior to subclassing Class. This is an excellent observation, and it’s the kind of thinking this post is trying to provoke: If you start thinking of metaobjects (call them classes if you like) as first-class objects, you start thinking of programming them using the tools and techniques you find most appropriate for programming domain objects.


var MetaObjectPrototype = {
  create: function () {
    var instance = Object.create(this.prototype);
    Object.defineProperty(instance, 'constructor', {
      value: this
    });
    if (instance.initialize) {
      instance.initialize.apply(instance, arguments);
    }
    return instance;
  },
  defineMethod: function (name, body) {
    this.prototype[name] = body;
    return this;
  },
  initialize: function (superclass) {
    if (superclass != null && superclass.prototype != null) {
      this.prototype = Object.create(superclass.prototype);
    }
    else this.prototype = Object.create(null);
  }
};

var MetaClass = {
  create: function () {
    var klass = Object.create(this.prototype);
    Object.defineProperty(klass, 'constructor', {
      value: this
    });
    if (klass.initialize) {
      klass.initialize.apply(klass, arguments);
    }
    return klass;
  },
  prototype: MetaObjectPrototype
};

var Class = MetaClass.create(MetaClass);

var QuadTree = Class.create();
QuadTree
  .defineMethod('initialize', function (nw, ne, se, sw) {
    this._nw = nw;
    this._ne = ne;
    this._se = se;
    this._sw = sw;
  })
  .defineMethod('population', function () {
    return this._nw.population() +
           this._ne.population() +
           this._se.population() +
           this._sw.population();
  });

var Cell = Class.create();
Cell
  .defineMethod('initialize', function (population) {
    this._population = population;
  })
  .defineMethod('population', function () {
    return this._population;
  });

var SelfBindingClass = Class.create(Class);
SelfBindingClass
  .defineMethod( 'defineMethod', function (name, body) {
    Object.defineProperty(this.prototype, name, {
      get: function () {
        return body.bind(this);
      }
    });
    return this;
  });

var Counter = SelfBindingClass.create();
Counter
  .defineMethod('initialize', function () { this._count = 0; })
  .defineMethod('increment', function () { ++this._count; })
  .defineMethod('count', function () { return this._count; });
https://raganwald.com/2014/03/10/writing-oop-using-oop
A Programmer's Story

malcolm in the middle

Hal comes home after work, flips the light switch. Hmm, bulb’s burned out. Into the cupboard to grab a replacement. While grabbing the box of bulbs, he notices that the shelf is loose. It seems one of the supports has a loose wood screw. Off to the kitchen, where he pulls open a drawer to get the screwdriver. It closes with a creak.

He slides the drawer open and closed: squeak, creak. Hmmm. Into the garage where he keeps the WD-40. He gives it a shake. Almost empty. Well, how lucky his car keys are still in his pocket. Into the car to get more WD-40. Turn the key, and runk-a-thunk-a-rough-running. Lois comes home to find Hal wrenching under the car, with the engine up on the hoist.

“Did you remember to change that light bulb I asked you about?”

Malcolm in the Middle, Health Scare

https://raganwald.com/2014/02/28/a-programmers-story
At home with the Bumblethwaites

The word “inheritance” is widely used when talking about object-oriented programming. People will say things like “Objects inherit methods from classes,” or perhaps “Subclasses inherit behaviour from superclasses,” and sometimes they won’t even say what is being inherited: “Cow inherits from Ungulate.”

Although languages each provide their own unique combination of features and concepts, there are some ideas common to all object-oriented programming that we can grasp and use as the basis for writing our own programs. Since “inheritance” is a metaphor, we’ll explain these concepts using a metaphor. Specifically, we’ll talk about a family.

at home with the braithwaites

Consider a fictitious person, Amanda Bumblethwaite. Amanda was born “Amanda Braithwaite,” but changed surnames to “Bumblethwaite” in protest against a well-known author of programming books. Amanda has several children, one of whom is Alex Bumblethwaite.

Alex is underage, and there are many questions that Alex would defer to Amanda, such as “Can Alex go on a school trip?” Both Amanda and Alex write software programs, Amanda is a web developer, and Amanda taught Alex how to program Lego Mindstorms, just as Amanda is teaching programming to all of Alex’s siblings. All of the Bumblethwaites live in a house together.

What can we say about the Bumblethwaites?

constructors

First, we can say that Amanda is Alex’s constructor. Amanda provides 50% of the blueprint for making Alex, and Amanda actually carried out the work of bringing Alex into existence. (We’ll hand-wave furiously about David Bumblethwaite’s role.)

formal classes

Second, we can say that “Bumblethwaite” is a formal class. Amanda is a member of the Bumblethwaite class, and so is Alex. The formal class itself has no physical existence. Amanda has a physical existence, and there is an understanding that all of Amanda’s children are necessarily Bumblethwaites, but the concept of “Bumblethwaite-ness” is abstract.

expectations

Because Amanda teaches all of her children how to program, knowing that Alex is a Bumblethwaite, we expect Alex to know how to program. Knowing that the Bumblethwaites live at a certain address, and knowing that Alex is a Bumblethwaite, we expect Alex to live in the family house.

delegation

Alex delegates a lot of behaviour to Amanda. For example, if a school chum invites Alex for a sleep-over, Alex delegates the question to Amanda to answer.

ad hoc sets

While it’s true that all Bumblethwaites are programmers, the concept of “being a programmer” is different than the concept of “being a Bumblethwaite.” Membership in the “set of all programmers” is determined empirically: If a person programs, they are a programmer. It is possible for someone who doesn’t program to become a programmer.

Membership in “The Bumblethwaites” is a more formal affair. You must be born a Bumblethwaite, and issued a birth certificate with “Bumblethwaite” on it. Or you must marry into the Bumblethwaites, again getting a piece of paper attesting to your “Bumblethwaite-ness.”

Where “Bumblethwaite” is a formal class, “Programmer” is an ad hoc set.

five things

These five ideas–constructors, formal classes, expectations, delegation, and ad hoc sets–characterize most ideas in object-oriented programming. Each programming language provides tools for expressing these ideas, although the languages tend to use the same words in slightly different ways. For example:

JavaScript provides objects, functions and prototypes. The new keyword allows functions to be used as constructors. Prototypes are used for delegating behaviour. Just as Alex delegates behaviour to Amanda and Amanda constructs Alex, it is normal in JavaScript that a function is paired with a prototype to produce, through composition, an entity that handles construction and delegation of behaviour.

“Classic” JavaScript does not have the notion of a class, but JavaScript programmers refer to such compositions as classes. JavaScript provides the instanceof operator to test whether an object was created by such a composite function. instanceof is a leaky abstraction, but it works well enough for treating constructor functions as formal classes.

All that being said, JavaScript does not enforce any constraints around formal classes. There is no “type checking” to ensure that only Bumblethwaites are programmers, for example. Like many dynamic languages, JavaScript treats its “classes” as informal tools for structuring construction and delegation.
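Here is a minimal sketch of how those pieces fit together in “classic” JavaScript. The Bumblethwaite constructor, the program method, and the canProgram helper are names made up for this illustration, not part of any library:

```javascript
// Constructor: the function that "manufactures" Bumblethwaites when used with new.
function Bumblethwaite (firstName) {
  this.firstName = firstName;
}

// Delegation: behaviour shared by every Bumblethwaite lives on the prototype.
Bumblethwaite.prototype.program = function () {
  return this.firstName + ' writes some code';
};

var alex = new Bumblethwaite('Alex');

// Formal class (informally): instanceof asks whether Bumblethwaite.prototype
// is on alex's prototype chain.
var alexIsBumblethwaite = alex instanceof Bumblethwaite;       // true

// Ad hoc set: anything that happens to have a program method counts as a
// programmer, whether or not it is a Bumblethwaite. (Hypothetical helper.)
function canProgram (candidate) {
  return candidate != null && typeof candidate.program === 'function';
}

var visitor = {
  program: function () { return 'a visitor writes some code'; }
};

var visitorIsBumblethwaite = visitor instanceof Bumblethwaite; // false
var visitorCanProgram = canProgram(visitor);                   // true
```

Nothing stops us from passing visitor to a function written with Bumblethwaites in mind. That is the sense in which these “classes” are informal.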

Ruby, by contrast, provides objects and has a very strong notion of classes: Ruby classes are rich objects in their own right, with many methods for inspecting and modifying the behaviour of objects. Ruby classes are constructors, and objects are created using a new method on the class in conjunction with an initialize method delegated from the instance to the class.

In one sense, Ruby’s classes are very different from JavaScript’s classes. But stepping back, in many important ways they are exactly the same. Although you can explicitly test membership with the instance_of? method, the language as a whole works on the basis of ad hoc polymorphism: If you write a method that operates on programmers, any programmer will do. There is no way to enforce that a method operates only on Bumblethwaites.

There are other kinds of languages. Java is a manifestly typed language. It has classes that handle construction and delegation, just like JavaScript and Ruby. But its classes are formal classes as well: You can write a method that operates on Bumblethwaites, and the compiler checks that you are only working with Bumblethwaites.

There is no reasonable way to have a method that works on programmers that will work with anything that happens to program. You have to create a formal class or interface. Speaking of which, Java’s interfaces combine the formal class concept with formalizing expectations. While Ruby and JavaScript have informal expectations, Java codifies them.

humpty dumpty

terminology

Programming is a pop culture. It’s informed by practitioners that innovate faster than formal education can keep up. Thus, words like “inheritance,” “class,” and “interface” can have specific formal meanings to computer scientists that differ substantially from the informal meanings programmers ascribe to them.

“Class” is probably the most overloaded example. A class can be a kind of metaobject (as you find in Smalltalk or Ruby) responsible for construction and delegation. It can be a formal class (as you find in Java or C#). It can be a keyword that has no runtime existence (as you find in CoffeeScript or JavaScript ES6), or it can be a pattern for composing constructors with prototypes (as you find in ordinary JavaScript).

The notion of a “class” is usually loosely bound to the notion of expectations: If something is a “Bumblethwaite,” here’s what we can rely on it to be or do. Different languages have different ways of expressing or enforcing the relationship between class and expectation, whether it be a compiler checking variable types, a suite of tests, or even assertions and guards bound to function calls in design-by-contract programming style.

In the end, it is not the words that matter so much as the mastery of the underlying concepts: Constructors, formal classes, expectations, delegation, and ad hoc polymorphism.

https://raganwald.com/2014/02/20/at-home-with-the-bumblethwaites
Private Methods In Ruby

Ruby allows you to make private methods:

class Sample

  def foo
    :SNAFU
  end

  private

  def private_foo
    :PRIVATE_SNAFU
  end

end

Sample.new.foo
  #=> :SNAFU

Sample.new.private_foo
  #=> NoMethodError: private method `private_foo' called for #<Sample:0x007fa12192e130>

Ruby also allows you to make what other languages call “class methods.” Class methods are singleton methods of the class object, not instance methods of a Class’s object. Got it?

class Sample

  def self.bar
    :FUBAR
  end

end

Sample.bar
  #=> :FUBAR

Sample.new.bar
  #=> NoMethodError: undefined method `bar' for #<Sample:0x007fa12190d2a0>

Can we combine the two techniques to make private class methods?

class Sample

  private

  def self.private_bar
    :PRIVATE_FUBAR
  end

end

Sample.private_bar
  #=> :PRIVATE_FUBAR

Nay nay! You cannot combine these two techniques to make a private class method. The private keyword does some modal thing with respect to instance methods being defined in the block, but the syntax def self.method_name is a different kind of thing. That different kind of thing applies to any object:

three = BasicObject.new

def three.to_i
  3
end

three.to_i
  #=> 3

The def something.method_name semantics ignores any declaration about privacy. Here’s a question: Where is the method three.to_i defined? In something called a singleton class, also called an eigenclass. Methods like this are called singleton methods because they apply to three but not to anything else:

four = BasicObject.new

four.to_i
  #=> NoMethodError: undefined method `to_i' for #<BasicObject:0x007fa121856d20>

There’s another way to declare a singleton method. Behold:

class << four

  def to_i
    4
  end

end

four.to_i
  #=> 4

When you use the class << x ... end syntax, the code in the block is evaluated in the context of the singleton class of x for any object x. Note that it works just like defining instance methods in a typical class declaration. For example, we can include a module:[1]

module Mathy

  def * that
    self.to_i * that.to_i
  end

end

class << four

  include ::Mathy

end

four * 5
  #=> 20

What happens if we create a singleton method for a class object?

class << Sample

  def glitch
    'Gremlins Lurking In The Computer Hardware'
  end

end

This is not an instance method of Sample instances:

Sample.new.glitch
  #=> NoMethodError: undefined method `glitch' for #<Sample:0x007fa14b0c3340>

It’s a singleton method of the Sample object itself:

Sample.glitch
  #=> "Gremlins Lurking In The Computer Hardware"

Hey, what’s a class method anyways? It’s a method on the class, not a method on an instance of the class. In other words… Class methods are singleton methods of class objects, and thus you can define them with either def Sample.glitch or class << Sample.

Are there any reasons to use class << Sample? Consider this:

five = BasicObject.new

class << five

  private

  def puddy
    'high five!'
  end

end

five.puddy
  #=> NoMethodError: private method `puddy' called for #<BasicObject:0x007fa14a0279c8>

This is very interesting! You can create private singleton methods using the class << x syntax. You can’t do that using the def x.method_name syntax. So intuition suggests:

class << Sample

  private

  def bug
    :MOTH
  end

end

Sample.new.bug
  #=> NoMethodError: undefined method `bug' for #<Sample:0x007fa14b0976c8>

Sample.bug
  #=> NoMethodError: private method `bug' called for Sample:Class

Aha! That’s how to create private singleton methods for class objects.

homework

Explain this code:

class Sample

  class << self

    private

    def skunkworks
      :ADP
    end

  end

end

(discuss; also, I wrote a book, and it’s free to read online)

  1. You are thinking that you can .extend any object with a module too. Well, you can extend any instance of Object, but not an instance of BasicObject that isn’t also an instance of Object. Tricky raganwald! 

https://raganwald.com/2014/02/11/private-methods-in-ruby
Prototypes Are Not Classes

Many people can and do say that “JavaScript has classes.” As a very rough, hand-wavy way of saying that “JavaScript has things that define the characteristics of one or more objects,” this is true. And many people lead healthy, happy, and productive lives without caring whether this statement is actually true, or a wrong but convenient shorthand.

Duty Calls

But it is wrong. JavaScript does not have classes. JavaScript has prototypes, and prototypes are not classes. And understanding why JavaScript’s prototypes are not classes can be helpful for understanding how to “Think in JavaScript” and indeed how to “Think in Objects.”

So let’s go:

what is a class?

Let’s look at a language that everyone agrees has classes: Ruby (Smalltalk is a better example, but a little less familiar these days). Ruby’s classes are also Ruby objects: they have responsibilities and methods just like any other object. What are their responsibilities?

  1. Manufacture new objects
  2. Define the behaviour of the objects they manufacture

So far, so good. Now let’s look at JavaScript.

what is a constructor?

A constructor in JavaScript is a function. When used in conjunction with the new keyword, it makes a new object and initializes the new object with a prototype. Here’s an example in use:

function MovieCharacter (firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
};

MovieCharacter.prototype.fullName = function () {
  return this.firstName + " " + this.lastName;
};

var jm = new MovieCharacter('John', 'Murdoch')
  //=> { firstName: 'John', lastName: 'Murdoch' }

MovieCharacter.prototype.isPrototypeOf(jm)
  //=> true

When programmed in this way, JavaScript has two parts that interact: a constructor function and a prototype:

  1. The constructor manufactures new objects
  2. The prototype defines the behaviour of new objects manufactured by the constructor

These are the same two points we made when looking at Ruby classes. What’s the difference?

what methods do prototypes and classes have?

To see the difference, let’s focus on the responsibility for defining the behaviour of objects. Classes do it one way, prototypes do it another, and the difference is substantial.

Let’s revisit our example. To make things clearer, we’ll pull the prototype out:

function MovieCharacter (firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
};

var MovieCharacterPrototype = MovieCharacter.prototype;

MovieCharacterPrototype.fullName = function () {
  return this.firstName + " " + this.lastName;
};

What are the prototype’s methods?

Object.keys(MovieCharacterPrototype).filter(function (key) {
  return typeof(MovieCharacter.prototype[key]) === 'function'
});
  //=> [ 'fullName' ]

In JavaScript, fullName is a method of MovieCharacter’s prototype. The prototype’s methods are the behaviour we’re defining for MovieCharacter objects. Let’s say that again: In JavaScript, the methods of a prototype are the methods of the objects it defines.
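One way to see this for ourselves is to add a method to the prototype after an object has already been created. The initials method below is invented for this sketch:

```javascript
function MovieCharacter (firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
}

var jm = new MovieCharacter('John', 'Murdoch');

// jm was created before initials existed on the prototype...
MovieCharacter.prototype.initials = function () {
  return this.firstName.charAt(0) + this.lastName.charAt(0);
};

// ...but it responds to initials anyway, because the lookup is delegated
// to the prototype at the time of the call.
var jmInitials = jm.initials();                         // 'JM'

// The method is stored on the prototype, not copied onto the object.
var ownsInitials = jm.hasOwnProperty('initials');       // false
var protoOwnsInitials =
  MovieCharacter.prototype.hasOwnProperty('initials');  // true
```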

Now let’s compare this to a “class.” JavaScript doesn’t have classes right out of the box, so we’ll compare the prototype’s methods to the methods of an equivalent Ruby class as an example:

class MovieCharacter

  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def full_name
    "#{@first_name} #{@last_name}"
  end

end

MovieCharacter.methods - Object.instance_methods
#=> [ :allocate, :new, :superclass, :<, :<=, :>, :>=, :included_modules, :include?,
      :name, :ancestors, :instance_methods, :public_instance_methods,
      :protected_instance_methods, :private_instance_methods, :constants, :const_get,
      :const_set, :const_defined?, :const_missing, :class_variables,
      :remove_class_variable, :class_variable_get, :class_variable_set,
      :class_variable_defined?, :public_constant, :private_constant, :module_exec,
      :class_exec, :module_eval, :class_eval, :method_defined?, :public_method_defined?,
      :private_method_defined?, :protected_method_defined?, :public_class_method,
      :private_class_method, :autoload, :autoload?, :instance_method,
      :public_instance_method ]

In Ruby, full_name isn’t a method of the MovieCharacter class, and unlike JavaScript, the class has lots and lots of methods that are specific to the business of being a class that aren’t shared by “ordinary” objects.

JavaScript prototypes look just like “ordinary” objects, while Ruby classes don’t look anything like “ordinary” objects.

classes

When programming “ordinary” or “domain” objects, we typically attempt to hide internal state. Objects present an abstraction to other objects by providing a collection of methods that match the concerns the other objects manage. Objects then implement those methods by manipulating their internal, hidden state.

If we look at a Ruby class as an object, we see this in action. Obviously, there is a list of methods inside it somewhere. If you want it, you ask for it with .instance_methods (let’s ignore specializations for filtering by access level). You can query other things, like .ancestors. Defining methods is also accomplished with a method, define_method (which is private, but that’s another story).

These are not “class methods,” they are instance methods of every class object. They are the way in which Ruby programs interact with Ruby classes, and rightly or wrongly, they are the “OO” way to meta-program, to write programs that modify themselves.

And where do these “instance methods of a class” come from? How is it that the class Object and the class String both have methods like .const_defined? If you don’t already know, the answer is the same as if we asked why two different MovieCharacter instances both have a .fullName method: All classes in Ruby are instances of the Class (and by extension, Module) metaclasses. There is a class that defines the behaviour of every class, just like we can write a MovieCharacterPrototype to define the behaviour of every MovieCharacter.

Ruby classes thus have two interesting characteristics:

  1. They encapsulate their internal state, presenting an interface for querying and updating object behaviour through methods, and;
  2. Like other specialized objects in an OO program, their behaviour is defined by a class. Since they are classes, the convention is to call a class’s class a “metaclass.”

prototypes

JavaScript prototypes are also objects. But unlike Ruby’s classes, they provide nearly zero encapsulation of their internal state. Methods and other properties are exposed and accessed directly. This is by design: if you tried to add an .instance_methods method to a prototype, it would instantly become a method of its objects, and that’s unlikely to be what you want.
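To make that concrete, here is a sketch of what happens if we try anyway. The instanceMethods name is a hypothetical JavaScript spelling of Ruby’s .instance_methods, chosen only for this example:

```javascript
function MovieCharacter (firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
}

MovieCharacter.prototype.fullName = function () {
  return this.firstName + ' ' + this.lastName;
};

// A hypothetical attempt to give the prototype a class-like query method.
MovieCharacter.prototype.instanceMethods = function () {
  var proto = MovieCharacter.prototype;
  return Object.keys(proto).filter(function (key) {
    return typeof proto[key] === 'function';
  });
};

var jm = new MovieCharacter('John', 'Murdoch');

// The "meta" method is now part of every movie character's interface,
// and it even lists itself as one of their methods.
var leaked = typeof jm.instanceMethods === 'function';   // true
var listed = jm.instanceMethods();   // [ 'fullName', 'instanceMethods' ]
```

A movie character that can answer questions about “instance methods” is mixing domain behaviour with meta-behaviour, which is exactly the muddle a separate class object avoids.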

Furthermore, prototypes by default do not inherit behavior from a specialized prototype. There is no meta-prototype. Prototypes by default inherit from Object.prototype, just like an object you’d create with literal object syntax.
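We can check this directly. A small sketch, using nothing beyond Object.getPrototypeOf:

```javascript
function MovieCharacter () {}

// The prototype that comes with a function is a plain object...
var protoOfPrototype = Object.getPrototypeOf(MovieCharacter.prototype);

// ...so it inherits from the same place an object literal does.
var sameAsLiteral = protoOfPrototype === Object.getPrototypeOf({});   // true
var isObjectPrototype = protoOfPrototype === Object.prototype;        // true

// And the chain ends there: Object.prototype has no prototype of its own.
var endOfChain = Object.getPrototypeOf(Object.prototype);             // null
```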

A prototype is really a representation of private internal state that a class would manage. Only instead of wrapping that in a class and presenting an interface for manipulating behaviour, JavaScript puts the data structure, naked, on the table for JavaScript programs to manipulate directly.

JavaScript and Ruby use completely different approaches. This is not surprising when you learn that Ruby was inspired by Smalltalk, a language that emphasized classes, while JavaScript was inspired by Self, a successor to Smalltalk that used prototypes instead of classes.

metaobjects

Despite the fact that prototypes are not classes, both prototypes and classes accomplish the same thing. If they aren’t “classes,” what word describes what they have in common?

Metaobjects. Prototypes and classes are both metaobjects, objects that define objects.

In computer science, a metaobject is an object that manipulates, creates, describes, or implements other objects (including itself). The object that the metaobject is about is called the base object. Some information that a metaobject might store is the base object’s type, interface, class, methods, attributes, parse tree, etc.—Wikipedia

(To be pedantic, some languages have things they call classes that aren’t metaobjects. To be a metaobject, the entity must be an object in the language. Smalltalk, Java and Ruby classes are metaobjects. C++ classes are not.)

can we build classes in javascript?

Well of course! And the conventional approach of writing a constructor is kind-of sort-of a first step towards that. You can make things more explicit using Object.create:

var MovieCharacterClass = {
  create: function create (firstName, lastName) {
    var mc = Object.create(MovieCharacterClass.prototype);
    mc.firstName = firstName;
    mc.lastName = lastName;
    return mc;
  },
  prototype: {
    fullName: function fullName () {
      return this.firstName + " " + this.lastName;
    }
  }
}

This is extremely minimalist. Is it a class? Not yet. To be a class, you would need to build upwards from it by creating a metaobject that defined the common behaviour for all classes. Once you do that, you have enshrined in your program the idea that classes are a kind of thing that is distinct from the kinds of things you use in your program’s domain logic.
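What might “building upwards” look like? The following is one possible sketch, not a recommendation. ClassBehaviour, makeClass, defineMethod, and instanceMethods are all names invented here, and the prototype property is still exposed, so the encapsulation is a convention rather than something enforced:

```javascript
// Hypothetical metaobject: the behaviour shared by everything we call a "class".
var ClassBehaviour = {
  // Manufacture an instance that delegates to this class's prototype.
  create: function () {
    var instance = Object.create(this.prototype);
    if (typeof this.prototype.initialize === 'function') {
      this.prototype.initialize.apply(instance, arguments);
    }
    return instance;
  },
  // Update behaviour through a method rather than by touching the prototype.
  defineMethod: function (name, body) {
    this.prototype[name] = body;
    return this;
  },
  // Query behaviour through a method.
  instanceMethods: function () {
    var proto = this.prototype;
    return Object.keys(proto).filter(function (key) {
      return typeof proto[key] === 'function';
    });
  }
};

// Hypothetical helper: makes a new "class" that delegates to ClassBehaviour.
function makeClass () {
  var klass = Object.create(ClassBehaviour);
  klass.prototype = {};
  return klass;
}

var CharacterClass = makeClass()
  .defineMethod('initialize', function (firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  })
  .defineMethod('fullName', function () {
    return this.firstName + ' ' + this.lastName;
  });

var character = CharacterClass.create('John', 'Murdoch');
var characterName = character.fullName();               // 'John Murdoch'
var classMethods = CharacterClass.instanceMethods();    // [ 'initialize', 'fullName' ]
var classMethodLeaked = 'defineMethod' in character;    // false
```

The methods for manipulating the class live on ClassBehaviour, while the methods of the instances live on the class’s prototype, so the two kinds of behaviour no longer share one object.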

good or bad?

Prototypes are not classes, and neither are constructor functions bundled with prototypes. They don’t encapsulate their private data, and they don’t have a metaclass object defining their behaviour.

A prototype is to a class as a database record is to a model object. Is this a good thing or a bad thing? That is a question for each programmer to answer for themselves. If you embrace the notion that OO programming is about encapsulating internal state, direct manipulation of prototypes is a blatant violation of this core principle.

On the other hand, you may not be trying to write “OO” programs, or you may be more relaxed about picking and choosing your principles. Either way, sometimes someone will say that a JavaScript prototype (or a JavaScript constructor function plus its embedded prototype) is a class.

What they really mean is that JavaScript has metaobjects, not classes.

(discuss on hacker news)

https://raganwald.com/2014/01/19/prototypes-are-not-classes
The New JavaScript Problem

The new keyword in JavaScript is very straightforward to use:

function Rectangle (x, y) {
  this.x = x;
  this.y = y;
}

var rect = new Rectangle(2, 3);

rect.x
  //=> 2

When a function is called with the new keyword, a new object is created and that object becomes the context of the function call. That context is available within the function using the this keyword. If the function does not explicitly return anything in its body, the result of evaluating the function with the new keyword will be the new object.
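The last point has a wrinkle worth seeing in code. In this sketch, with constructor names made up for the purpose, only a returned object replaces the new object; a returned primitive is ignored:

```javascript
// Returns nothing: new hands back the freshly created object.
function Plain (x) {
  this.x = x;
}

// Returns an object: new hands back that object instead.
function ReturnsObject (x) {
  this.x = x;
  return { replaced: true };
}

// Returns a primitive: new ignores it and hands back the new object.
function ReturnsNumber (x) {
  this.x = x;
  return 42;
}

var plain = new Plain(1);               // plain.x is 1
var replaced = new ReturnsObject(1);    // replaced.x is undefined
var ignored = new ReturnsNumber(1);     // ignored.x is 1
```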

By default, all functions have a prototype property, and that prototype defaults to being a new, empty object:

Rectangle.prototype
  //=> {}

Each function gets its own distinct prototype object:

function A () {}
function B () {}
A.prototype === B.prototype
  //=> false

A.prototype.foo = 'SNAFU';
B.prototype.foo = 'FUBAR';
A.prototype
  //=> { foo: 'SNAFU' }
B.prototype
  //=> { foo: 'FUBAR' }

When you create a new object using the new keyword and a function, there is a special relationship established between the object and the function’s prototype:

var a = new A();
Object.getPrototypeOf(a)
  //=> { foo: 'SNAFU' }
A.prototype.isPrototypeOf(a)
  //=> true

That relationship is established at the time the object is created. If we replace the function’s prototype, it doesn’t affect the objects already created:

A.prototype = { FUBAR: 'foo' }
A.prototype.isPrototypeOf(a)
  //=> false

The special relationship goes further: Objects inherit the properties of their prototypes:

var b = new B();

b.foo
  //=> 'FUBAR'

JavaScript doesn’t always show inherited properties to us in the console:

b
  //=> {}

But they’re still there!

b.foo
  //=> 'FUBAR'

Prototypes are very useful for methods:

function Rectangle (x, y) {
  this.x = x;
  this.y = y;
}
Rectangle.prototype.area = function () {
  return this.x * this.y;
}

var twoByThree = new Rectangle(2, 3);
twoByThree.area()
  //=> 6

var threeByFive = new Rectangle(3, 5);
threeByFive.area()
  //=> 15

Reassigning prototypes allows us to share prototypes:

function Square (x) {
  this.x = this.y = x;
}
Square.prototype = Rectangle.prototype;

var fourByFour = new Square(4);
fourByFour.area()
  //=> 16

This might or might not be a bad idea. Another way to accomplish the same objective is to note that a prototype can be any object, including an object created with a function. So:

Square.prototype = new Rectangle();

Now Square has its own prototype that inherits from Rectangle’s prototype, but it isn’t the same object:

Square.prototype === Rectangle.prototype
  //=> false

But we get the same behaviour we wanted:

var fourByFour = new Square(4);
fourByFour.area()
  //=> 16
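Here is a sketch of what that arrangement looks like from the inside. It repeats the definitions so that it stands alone, and the stray-properties observation at the end is a side effect worth knowing about rather than something the approach depends on:

```javascript
function Rectangle (x, y) {
  this.x = x;
  this.y = y;
}
Rectangle.prototype.area = function () {
  return this.x * this.y;
};

function Square (x) {
  this.x = this.y = x;
}

// Square gets its own prototype object, which in turn delegates to Rectangle's.
Square.prototype = new Rectangle();

var chained =
  Object.getPrototypeOf(Square.prototype) === Rectangle.prototype;   // true

// area is not found on the square or on Square.prototype, so the lookup
// continues up the chain to Rectangle.prototype.
var fourByFour = new Square(4);
var area = fourByFour.area();                                         // 16

// A side effect: the constructor ran with no arguments, so the new
// prototype carries x and y properties whose values are undefined.
var strayKeys = Object.keys(Square.prototype);                        // [ 'x', 'y' ]
```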

Separating the two prototypes is superior if there is any difference between a square and a rectangle aside from how they are initialized. For example, if you ever want to write something like this:

Rectangle.prototype.toString = function () {
  return "I am a " + this.x + " by " + this.y + " rectangle";
}

Square.prototype.toString = function () {
  return "I am a " + this.x + " by " + this.y + " square";
}

Then you need to have separate prototypes. On the other hand, you might decide to write this:

function GoldenRectangle (x) {
  this.x = x;
  this.y = 1.6 * x;
}
GoldenRectangle.prototype = Rectangle.prototype;

Having multiple JavaScript functions share the same prototype serves the same purpose as one Java class having multiple constructor functions.

Object.create

Constructors are not the only way to create JavaScript objects. Object.create creates a new JavaScript object and permits you to specify the prototype:

var myPrototype = {
  name: "My Prototype"
}

var myObject = Object.create(myPrototype);
Object.getPrototypeOf(myObject)
  //=> { name: 'My Prototype' }

Now that we know this, we can see that the new keyword is a kind of shorthand for:

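function pseudoNew (constructor) {
  // everything after the constructor is passed along as its arguments
  var args = Array.prototype.slice.call(arguments, 1);
  var newObject = Object.create(constructor.prototype);
  var returnedObject = constructor.apply(newObject, args);
  // new only honours a returned object or function; anything else yields the new object
  if ((typeof returnedObject === 'object' && returnedObject !== null) ||
      typeof returnedObject === 'function') {
    return returnedObject;
  }
  else return newObject;
}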

Using Object.create, we can be explicit about what objects are created with what prototypes. Here we are using a .create method:

var Circle = {
  prototype: {
    area: function () {
      return Math.PI * this.radius * this.radius;
    }
  },
  create: function (radius) {
    var circle = Object.create(this.prototype);
    circle.radius = radius;
    return circle;
  }
}

var fiver = Circle.create(5);
fiver.area()
  //=> 78.53981633974483

So, you can use new or you can use Object.create to create new objects with prototypes. So far, so good.

But there are some clouds on the horizon.

why instanceof might be a problem

JavaScript provides an instanceof keyword. It appears at first to be useful:

fourByFour instanceof Rectangle
  //=> true

It is a kind of shorthand for:

function instanceOf(object, constructor) {
  return constructor.prototype.isPrototypeOf(object)
}

The trouble with instanceof is that it is unreliable whenever you use Object.create instead of the new keyword, or when you replace the prototype of a function used as a constructor. Semantically, the item of interest is the prototype, and therefore if you use Object.create instead of the new keyword, you must also use .isPrototypeOf instead of instanceof.
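Both failure modes can be shown in a few lines. This is a sketch that restates the Circle object so it can run on its own:

```javascript
function A () {}
var a = new A();
var wasInstance = a instanceof A;       // true

// Replace the constructor's prototype after the fact.
A.prototype = {};
var stillInstance = a instanceof A;     // false, though a has not changed

// An object that manufactures circles with Object.create. It is not a
// function, so there is nothing for instanceof to test against.
var Circle = {
  prototype: {
    area: function () {
      return Math.PI * this.radius * this.radius;
    }
  },
  create: function (radius) {
    var circle = Object.create(this.prototype);
    circle.radius = radius;
    return circle;
  }
};
var fiver = Circle.create(5);

// instanceof requires a function on its right-hand side, so this throws.
var threw = false;
try {
  fiver instanceof Circle;
} catch (error) {
  threw = error instanceof TypeError;
}

// Asking the prototype directly works in every case.
var isCircle = Circle.prototype.isPrototypeOf(fiver);   // true
```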

Most OO programmers prefer using polymorphism to explicitly testing instanceof. Wide use of explicit type testing is generally a design smell, but nevertheless it is a useful tool in some circumstances.

handling the case when we don’t use new

When we choose to write our code to use the new keyword, we may want to consider taking precautions. As noted in Effective JavaScript, traditional constructors will usually fail in ugly ways if we accidentally invoke them without using the new keyword:

function Circle (radius) {
  this.radius = radius;
}
Circle.prototype.area = function () {
  return Math.PI * this.radius * this.radius;
}

new Circle(2).area()
  //=> 12.566370614359172

Circle(2).area()
  //=> TypeError: Cannot call method 'area' of undefined

We can, of course, not do that. We can also code defensively:

function Circle (radius) {
  var newObject = Circle.prototype.isPrototypeOf(this)
                  ? this
                  : new Circle();
  newObject.radius = radius;
  return newObject;
}
Circle.prototype.area = function () {
  return Math.PI * this.radius * this.radius;
}

new Circle(2).area()
  //=> 12.566370614359172

Circle(2).area()
  //=> 12.566370614359172

Our rewritten constructor handles the case when this is not a new Circle instance by creating one before doing the initializing. It also explicitly returns the new object instead of relying on new to infer what we want when we don’t return anything.

We now have a function we expected would be used with the new keyword, but it also handles the case where the new keyword isn’t used.

handling the case when we do use new

It goes the other way as well. Sometimes we write a function that we don’t expect to be used with the new keyword, but we later discover that we’d like to invoke it with the new keyword. For example, we might choose to write a function decorator, as one might find in this essay or in libraries like Method Combinators:

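function before (fn, beforeAdvice) {
  // a plain wrapper: run the advice, then the original function, with the
  // same context and arguments (the wrapper's length is 0, not fn.length)
  return function () {
    beforeAdvice.apply(this, arguments);
    return fn.apply(this, arguments);
  };
}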

function add (x, y) {
  return x + y;
}

add(1, 1)
  //=> 2

function preparation () {
  console.log('Hang on while I get a piece of paper. Ok, I\'m ready!');
}

var newAdd = before(add, preparation);

newAdd(2, 2)
  //=>
    Hang on while I get a piece of paper. Ok, I'm ready!
    4

Alas, our decorator breaks down if we use it with a traditional constructor:

function Square (side) {
  this.side = side;
}
Square.prototype.area = function () {
  return this.side * this.side;
}

new Square(1).area()
  //=> 1

var NewSquare = before(Square, preparation);

new NewSquare(2).area()
  //=>
    Hang on while I get a piece of paper. Ok, I'm ready!
    TypeError: Object [object Object] has no method 'area'

Again, we can not do that: When we construct objects in our own functions or methods using Object.create, we don’t need to take special precautions when writing decorators:

var OurSquare = {
  create: function (side) {
    var newObject = Object.create(OurSquare.prototype);
    newObject.side = side;
    return newObject;
  },
  prototype: {
    area: function () {
      return this.side * this.side;
    }
  }
}

OurSquare.create(3).area()
  //=> 9

OurSquare.create = before(OurSquare.create, preparation);

OurSquare.create(4).area()
  //=>
    Hang on while I get a piece of paper. Ok, I'm ready!
    16

But if we wish to accommodate the new keyword when writing things like decorators, we can take precautions:

// shim for Object.setPrototypeOf
Object.setPrototypeOf = Object.setPrototypeOf || function (obj, proto) {
  obj.__proto__ = proto;
  return obj;
}

function defensiveBefore (fn, beforeAdvice) {
  return function wrapped () {
    if (Object.getPrototypeOf(this) === wrapped.prototype) {
      Object.setPrototypeOf(this, fn.prototype);
    }
    beforeAdvice.apply(this, arguments);
    return fn.apply(this, arguments);
  };
}

var NewestSquare = defensiveBefore(Square, preparation);

new NewestSquare(5).area()
  //=>
    Hang on while I get a piece of paper. Ok, I'm ready!
    25

so what’s the problem?

The trouble with approaches like this is that precautions can pile up on top of precautions, until our original intent has been obscured. One solution is don’t do that, as in, don’t try to decorate constructor functions, and be sure you know when a function is designed to create an object with new and when it is called normally.

The other possibility is don’t do that, as in, don’t use new: Write functions and methods that explicitly call Object.create, and use .isPrototypeOf instead of instanceof. This is an equally straightforward approach, and resistant to the edge cases surrounding the new keyword when writing modern JavaScript.

(discuss on reddit)

https://raganwald.com/2014/01/14/the-new-javascript-problem
Type-Fu Fighting

Everybody was type-fu fighting
Compiling fast as lightning
In fact it was a little bit frightening
But their code had early binding

They were funky ML men from funky CAML town
They were chopping classes up and they were chopping classes down
It’s the ancient Curry’s art, in Haskell but not Dart
It’s a reconstructed Lisp, without the unsafe bits

Everybody was type-fu fighting
Compiling fast as lightning
In fact it was a little bit frightening
But their code had early binding

There was funky R. Hindley and little R. Milner
He said “Here comes the big boss, Simon Peyton-Jones”
Signatures we used to write by hand, have vanished from our land
Type inference made me skip, a parametric polymorphic trip

Everybody was type-fu fighting
Compiling fast as lightning
In fact it was a little bit frightening
But their code had early binding

(repeat)…

Make sure you have early binding
Type-fu fighting, compiling fast as lightning

(inspiration courtesy of hypstr)

https://raganwald.com/2013/12/19/type-fu-fighting
Defactoring

The other day, a colleague and I were debating whether to defactor some code. No, not refactor the code, de-factor the code.

what is “factoring?”

Defactoring is the process of removing factoring from code. The word “refactoring” is commonplace, “factoring” is somewhat less common, and “defactoring” is the least commonly discussed as such. Let’s define it, starting with what the word “factoring” means:

Factoring starts with taking code that does something and organizing it into parts that work together or separately, without adding or removing functionality. The canonical example would be to take a program consisting of a single, monolithic “God Object” and break it out into various entities with individual responsibilities.

It’s not enough to simply “extract method” repeatedly, turning a single, monolithic object with a single public method into a single, monolithic object with a single method and a series of helper methods that are so hopelessly coupled that they can’t be called by anything else, and perhaps can’t even be called in a different order.

That’s reorganizing, perhaps, but it’s more like futzing with whitespace and indentation than it is refactoring. When you factor a number, you extract other numbers that can be recombined. 42 can be factored into three primes: [2, 3, 7]. Those factors can be recombined to make different numbers, for example 6 is 2 * 3 and 21 is 3 * 7.
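The arithmetic can be spelled out in a few lines, sketched here in JavaScript. The primeFactors helper is written for this illustration and uses plain trial division:

```javascript
// Hypothetical helper: prime factors of an integer greater than 1,
// found by trial division.
function primeFactors (n) {
  var factors = [];
  var divisor = 2;
  while (n > 1) {
    if (n % divisor === 0) {
      factors.push(divisor);
      n = n / divisor;
    } else {
      divisor = divisor + 1;
    }
  }
  return factors;
}

var factors = primeFactors(42);            // [ 2, 3, 7 ]

// The factors are independent pieces: they can be recombined
// to make numbers other than the one we started with.
var six = factors[0] * factors[1];         // 6
var twentyOne = factors[1] * factors[2];   // 21
```

The point of the analogy is the second half: pieces that can only ever be multiplied back into 42 would not be factors in any useful sense.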

Here is a method loosely snarfed from Longest Common Subsequence:

def find_common_ends(a, b)
  aa, bb = a.dup, b.dup
  prefix, suffix = Array.new, Array.new
  while (ca = aa.first) && (cb = bb.first) && ca == cb
    aa.shift
    prefix.push bb.shift
  end
  while (ca = aa.last) && (cb = bb.last) && ca == cb
    aa.pop
    suffix.unshift bb.pop
  end
  [prefix, aa, bb, suffix]
end

find_common_ends [1, '2', 3, 4, 5], [1, 2, 4, 3, 5]
  # => [[1], ["2", 3, 4], [2, 4, 3], [5]]

Here we extract the comparison without factoring it much if at all:

def find_common_ends(a, b)
  aa, bb = a.dup, b.dup
  prefix, suffix = Array.new, Array.new
  while (ca = aa.first) && (cb = bb.first) && similar(ca, cb)
    aa.shift
    prefix.push bb.shift
  end
  while (ca = aa.last) && (cb = bb.last) && similar(ca, cb)
    aa.pop
    suffix.unshift bb.pop
  end
  [prefix, aa, bb, suffix]
end

def similar(x,y)
  x == y
end

find_common_ends [1, '2', 3, 4, 5], [1, 2, 4, 3, 5]
  # => [[1], ["2", 3, 4], [2, 4, 3], [5]]

But here we factor the comparison, by allowing you to parameterize calls to find_common_ends:

def find_common_ends(a, b, &similar)
  similar ||= lambda { |a, b| a == b }
  aa, bb = a.dup, b.dup
  prefix, suffix = Array.new, Array.new
  while (ca = aa.first) && (cb = bb.first) && similar.call(ca, cb)
    aa.shift
    prefix.push bb.shift
  end
  while (ca = aa.last) && (cb = bb.last) && similar.call(ca, cb)
    aa.pop
    suffix.unshift bb.pop
  end
  [prefix, aa, bb, suffix]
end

find_common_ends [1, '2', 3, 4, 5], [1, 2, 4, 3, 5]
  # => [[1], ["2", 3, 4], [2, 4, 3], [5]]

find_common_ends([1, '2', 3, 4, 5], [1, 2, 4, 3, 5]) { |x, y| x.to_s == y.to_s }
  # => [[1, 2], [3, 4], [4, 3], [5]]

Now you can use find_common_ends in more ways, because you’ve truly factored the similarity check out of the array scanning method.

Summary: To truly factor something, you have to extract things that can be used independently of the original. Factoring implies introducing flexibility.

what is “defactoring?”

Defactoring is the opposite of factoring. Defactoring reduces the number of ways we can recombine the pieces of code we have. If we defactored find_common_ends, we might take it from having a signature of find_common_ends(a, b, &similar) to find_common_ends(a, b). We’d make it less flexible.

One way to defactor code is to introduce coupling. You have just as many pieces, but they can’t be recombined, they aren’t really independent. Another way is to recombine them so you have fewer pieces.

Why would you want to defactor code? More flexible is better, right?

why defactor

Design is a process of making choices. Bad design is when you punt on a choice. Microsoft PowerPoint makes money by being everything to everybody, but it is not good design for any one user or for any one kind of presentation. It does too much. It’s too flexible. It imposes cognitive overhead sorting out how to use all of its bits and pieces to accomplish a task.

Haiku Deck goes the other way. The authors have made design choices. It does less, much less. It is less flexible overall. But within its domain, it is a better product than PowerPoint. And so it is with software design. Sometimes, increased flexibility introduces unnecessary cognitive overhead. There are options that will never be exercised.

Well-written code isn’t harder to read on account of the flexibility. We can assume our colleagues know what metaprogramming is, how blocks work, what a method combinator does, and so on. But nevertheless, increased flexibility does mean that the code says far less about how it’s intended to work. Whenever you’re reading a piece of it, you are thinking, “it might do this, it might do that.”

This is why there are holy wars over operator overloading. What does a == b mean in Ruby? Nobody knows, because == is a flexible concept.

Software design is the act of making bets about the future. A well-designed program is a bet on what will change in the future, and what will not change. And a well-designed program communicates the nature of that bet by being relatively flexible about things that the designers think are most likely to change, and being relatively inflexible about the things the designers think are least likely to change.

When you discover you need more flexibility, you factor. When you discover you need less flexibility, you defactor.

what defactoring tells us about factoring

Good design communicates its intention. Consider the statement “One way to defactor code is to introduce coupling. You have just as many pieces, but they can’t be recombined, they aren’t really independent.” Code that has a lot of coupled pieces is generally held to be a bad idea.

You sometimes see maxims like limits on the number of lines in a method. Measuring such things is useful, but it’s a trap to game the number rather than using such things as hints about where to seek deeper understanding. If we break a big method into a lot of coupled small methods, we send the wrong signal: It looks like the code is flexible, but it isn’t.

We lied, telling our colleagues that we think all these pieces ought to be recombined, and that it was designed with changes in mind. But in reality, we were simply trying to break one coupled monolithic method into a bunch of small, coupled helper methods.

Things that are coupled ought to be clustered together in an obvious way, either by recombining them or by clustering them together using language features like classes, modules, scopes and so on. That separates the idea of adding mental whitespace from the idea of factoring.

summary

Factoring is the division of code into independent entities that can be recombined. Defactoring is the reassembly of formerly independent entities. We factor to introduce flexibility. We defactor to reduce flexibility. Flexibility has a cognitive cost, so we apply it where we think we need it, and remove it from where we think we don’t need it. We attempt to keep the appearance of code flexibility aligned with the reality of code flexibility: Things that appear to be independent should not be coupled.

https://raganwald.com/2013/10/08/defactoring
The Predicate Module Pattern

In Ruby, modules are often used to mix functionality into concrete classes. Another excellent pattern is to extend objects as a way of avoiding monkey-patching classes you do not “own.” There’s a third pattern that I find handy and expressive: Using modules as object predicates.

Let’s begin by defining the problem: Representing object predicates.

We have some objects that represent entities of some sort. They could be in the domain, they could be in the implementation. For our ridiculously simple example, we will choose bank accounts:

class BankAccount

  # ... 

end

Our bank account instances have lots of state. A really forward-looking way to deal with that is to implement a state machine, but let’s hand-wave over that and imagine that we’re trying to write Java programs with Ruby syntax, so we use a getter and setter for some attribute:

class BankAccount

  attr_accessor :frozen

end

chequing_acct = BankAccount.new(...)
chequing_acct.frozen = false

# ...

if chequing_acct.frozen
  # do something
end

If this attribute is always a boolean, we call it a predicate, and in the Ruby style borrowed from Lisp, we suffix its getter with a ?:

class BankAccount

  attr_writer :frozen
  
  def frozen?
    @frozen
  end

end

chequing_acct = BankAccount.new(...)
chequing_acct.frozen = false

# ...

if chequing_acct.frozen?
  # do something
end

That’s how most of my code is written, and it works just fine. But we should be clear about what this code is saying and what it isn’t saying.

what are we saying with predicate attributes?

Let’s compare this:

class BankAccount

  attr_writer :frozen
  
  def frozen?
    @frozen
  end

end

With the following:

class BankAccount

end

class Thawed < BankAccount

  def frozen?; false; end

end

class Frozen < BankAccount

  def frozen?; true; end

end

bank_account = Frozen.new(...)

In the first example, using an attribute implies that frozen can change during an object’s lifespan. In the second example, using classes implies that frozen cannot change during an object’s lifespan. That is very interesting! People talk about code that communicates its intent; having two ways to implement the frozen? method helps us communicate whether the frozen state is expected to change for an object.

cleaning up with predicate modules

If we do have a predicate that is not expected to change during the object’s lifespan, having a pattern to communicate that is a win, provided it’s a clean pattern. Subclassing is not clean for this case. And if we had four or ten such predicate attributes, subclassing would be insane.

Modules can help us out. Let’s try:

class BankAccount

end

module Thawed

  def frozen?; false; end

end

module Frozen

  def frozen?; true; end

end

bank_account = BankAccount.new(...).extend(Frozen)

bank_account.frozen?
  #=> true

Now we’re extending an object with a module (not including the module in a class), and we get the module’s functionality in that object. It works like a charm, although you do want to be aware there are now three states for frozen-ness: Frozen, Thawed, and I-Forgot-To-Extend-The-Object. And we can mix in as many such predicate modules as we like.

module responsibilities

With classes including modules, each class is responsible for including the modules it needs. Writing .extend(Foo) when creating a new object shifts the responsibility to the client creating an object. That’s nearly always a bad idea, so we bake it into the initialize method. I prefer hashes of options and initializers, but you can do this in other ways:

class BankAccount

  def initialize options = {}
    self.extend(
      if options[:frozen]
        Frozen
      else
        Thawed
      end
    )
  end

end

You can experiment with this pattern. If you find yourself writing a lot of this kind of code:

if object.frozen?
  raise "Cannot fuggle with a frozen object"
else
  fuggle(object)
end

You can write:

module Thawed

  def frozen?; false; end

  def guard_with_frozen_check desc
    yield self
  end

end

module Frozen

  def frozen?; true; end

  def guard_with_frozen_check desc = 'evaluate code block'
    raise "Cannot #{desc} with a frozen object"
  end

end

bank_account.guard_with_frozen_check('fuggle') do |acct|
  fuggle(acct)
end

This is much more ‘OO’ than having code test frozen?. Not that there’s anything wrong with that! But what if you like to test bank accounts for frozen-ness? Well, you don’t really need a frozen? method if you don’t want one:


module Thawed; end

module Frozen; end

bank_account = BankAccount.new(...).extend(Frozen)

bank_account.kind_of?(Frozen)
  #=> true

Checking whether an account is a kind of Frozen is a matter of taste, of course. But it’s no worse in my mind than a frozen? method if we do not expect an object to change such a state during its lifetime.

Well, there you have it: The Predicate Module Pattern. Cheers!


personal commentary

If you make a habit of programming as I do, you will inevitably run into contrary opinions. For example, one widely held opinion is that #kind_of? is a “code smell.” I agree with this, provided that the expression “code smell” retains its historical meaning, namely something that should be double-checked to make sure that it is what you want.

As a general rule, you should be absolutely certain that you are using .kind_of? for good reasons, and not because you are unfamiliar with the “Kingdom of Nouns” style of programming where entities are burdened with an ever-increasing number of responsibilities because they ought to know everything about how to use them.

In the code above, we’re actually presented with three ways to use a bank account’s frozen predicate attribute:

  1. A method called frozen?.
  2. Using kind_of?(Frozen).
  3. Baking flow control into the predicate modules using the guard_with_frozen_check method.

If a module is created strictly to communicate a predicate to fellow programmers, it’s true that you can define frozen? in a module to show that this is not expected to change. However, there is a problem: The interface of the method frozen? is abstract enough that the predicate could be a state that changes, or it could be a state that doesn’t change.

That’s widely seen as a benefit, but when everything is abstract and could-be-changed in the future, interfaces communicate very little. kind_of?(Frozen) pushes the implementation into the interface, true, but it also pushes a contractual promise about the behaviour of Frozen into the interface. That can be a benefit when you make a conscious choice that you are trying to make this behaviour obvious.

Generally, modules and classes are used for implementing interfaces, and they shouldn’t become the interface. But a predicate module is, IMO, a place where it is worth considering whether the smell is calling out an actual antipattern or whether this is one of those places where a general rule espoused by the mass of the herd doesn’t apply.

As for option 3, this speaks to a style of programming that eschews checking predicates or values at all times. The name guard_with_frozen_check is good for explaining the mechanism, but terrible in practice. I’d pick the name as the smell. Consider instead:

class BankAccount

  def initialize options = {}
    self.extend(
      if options[:security_score].andand < 42
        Frozen
      else
        Thawed
      end
    )
  end

end

module Thawed

  def perform_user_action desc
    yield self
  end

end

module Frozen

  def perform_user_action desc = 'perform user action'
    raise "Cannot #{desc} with an object frozen because of a poor security score"
  end

end

bank_account = BankAccount.new security_score: 74

bank_account.perform_user_action('fuggle') do |acct|
  fuggle(acct)
end

In this code, clients do not know anything about why an account might be frozen. They create accounts and provide security scores, and they ask the accounts to perform user actions. The account checks the frozen “state” via a module.

You could do the same thing by saving the score and checking it, or saving a frozen predicate attribute, but you wouldn’t be communicating that security scores don’t change in the context of an instantiated BankAccount object.

It’s up to you what to do with this pattern. Just be aware that if you read essays by people who switched from Java to Ruby at a time when Ruby was unpopular, they may act as if “popularity” isn’t their first consideration when choosing how to write programs.

That’s neither good, nor bad, it just is.

https://raganwald.com/2013/09/12/the-predicate-module-pattern
Leaky Greenspunned Abstractions

As programmers, it is our job to build software out of abstractions. Logic gates are connected to form a “Von Neumann Computer.” An assembler creates an interface that can be programmed with instructions like MOV 12345, 567890. A compiler lets us write x = y, and so on up to has_and_belongs_to_many :roles.

When programming in a particular language, we often want to borrow an abstraction from another language, much as English speakers will murmur “C’est la vie” when the build breaks. In JavaScript, the Underscore library includes a function called pluck. Compare _.pluck(users, 'lastName') in JavaScript to using Symbol#to_proc in Ruby: users.map(&:lastName). The mechanisms and syntaxes are different, but the underlying ideas are similar.

For small idioms, other languages can be a fertile source of abstractions. But we struggle when we become ambitious and attempt to Greenspun new semantics that are a poor fit with our primary tool.

In Ruby, for example, Benjamin Stein and I wrote a little thing called andand. It emulates the existential (or “elvis”) operator from Groovy and CoffeeScript. On the surface, it’s as simple as Symbol#to_proc. You can write something like user.andand.lastName, and if user is nil, the expression evaluates to nil without any exceptions being thrown.

a leaky abstraction

But let’s draw the curtain back, shall we? The andand method that’s mixed into all objects is not too tough to parse once you realize there’s a special case for passing a block:

def andand (p = nil)
  if self
    if block_given?
      yield(self)
    elsif p
      p.to_proc.call(self)
    else
      self
    end
  else
    if block_given? or p
      self
    else
      MockReturningMe.new(self)
    end
  end 
end

class MockReturningMe < BlankSlate
  def initialize(me)
    super()
    @me = me
  end
  def method_missing(*args)
    @me
  end
end

But what is this MockReturningMe thingummy? Well, if the receiver of .andand is falsey, the method .andand returns a special proxy object that returns the original object no matter what method you send it. This works just fine for the “normal case” of writing something like raganwald.andand.braythwayt, but introduces icky[1] edge cases.

In CoffeeScript, the compiler will complain if you write object?. instead of object?.method. But object.andand is perfectly acceptable Ruby code that returns either the receiver or one of these proxy objects. All sorts of unpleasant bugs can arise from a simple mistake, bugs that can’t be caught in a dynamically typed language like Ruby.

As Joel Spolsky would say, “andand is a leaky abstraction.”

the blockhead programmer

Implementing a programming language is an incredibly valuable exercise. Some time ago I wrote a toy Scheme, one where everything was built up from unhygienic macros and just five special forms. let isn’t one of those five, so I wrote a macro that rewrote

(let ((foo 1) (bar 2))
  (+ foo bar))

into:

((lambda (foo bar)
  (+ foo bar)) 
  1 2)

If you’re somewhat familiar with JavaScript and Lisp, you’ll recognize the second expression as an Immediately Invoked Function Expression. The macro provides the illusion that let defines and binds local variables in my toy Scheme the way var does in JavaScript. But that isn’t what happens: In reality, parameters to lambdas are the only mechanism for defining variables.
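For readers more at home in JavaScript than in Scheme, the same rewrite can be sketched by hand. This is only an illustration of what the macro does, not code from the toy Scheme; foo and bar come from the example above, and sum is a hypothetical name added for the demonstration.

```javascript
// What we would like to say, Scheme-style:
//   (let ((foo 1) (bar 2)) (+ foo bar))
// What the rewrite actually produces: foo and bar are really the
// parameters of a function that is invoked immediately.
var sum = (function (foo, bar) {
  return foo + bar;
})(1, 2);

console.log(sum);
  //=> 3

// foo and bar exist only inside the function body.
console.log(typeof foo);
  //=> "undefined"
```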

It’s an interesting mechanism, and it has been borrowed for the CoffeeScript language’s do keyword. JavaScript programmers are often tempted to use it to implement block scoping. In JavaScript, a new scope is only introduced by functions. Take this terrible code:

function whatDoesThisDo (n) {
  result = '';
  for (var i = 0; i < n; ++i ) {
    if (i % 2 === 0) {
      for (var i = 0; i < n; ++i ) {
        result = result + 'x';
      }
    }
  }
  return result;
}

whatDoesThisDo(6)
  //=> "xxxxxx"

It seems contrived for the purpose of hazing University graduates that interview for programming jobs. The key point for our purposes is that despite the var declaration and the fact that for (var i = 0; i < n; ++i ) is nested inside of if (i % 2 === 0) { ... }, the i indexing the inner loop is the exact same i as the one that indexes the outer loop, and that is going to produce problems.

Some languages have block scope: The introduction of a block like { ... } introduces a new scope, and therefore you can create a new i that shadows the original. This is possible in Scheme with the let form, and if you have a taste for having one variable mean different things in different places, you can appear to create the same effect in JavaScript with an IIFE:

function whatDoesThisDo (n) {
  result = '';
  for (var i = 0; i < n; ++i ) {
    if (i % 2 === 0) (function () {
      for (var i = 0; i < n; ++i ) {
        result = result + 'x';
      }
    })();
  }
  return result;
}

whatDoesThisDo(6)
  //=> "xxxxxxxxxxxxxxxxxx"

By using (function () { ... })(); instead of plain { ... }, we’re creating a new JavaScript scope. Passing rapidly over the performance implications of creating a new function only to execute it once and then throw it away, have we implemented block scope as you might find in a language like C#?

Almost.

more leaks

Once again, we have a leaky abstraction. The enthusiastic programmer, having rediscovered how to implement block scope in JavaScript with IIFEs, might decide that IIFEs are the new go-to idiom for block scoping:

function oddsAndEvens (n) {
  var result;
  if (n % 2 == 0) {
    (function () {
      for (var i = 0; i < n; ++i) {
        result = result + 'even';
      }
      return result;
    })()
  }
  else {
    (function () {
      for (var i = 0; i < n; ++i) {
        result = result + 'odd';
      }
      return result;
    })()
  }
}

oddsAndEvens(4)
  //=> undefined

The reasoning behind this code is beyond dubious, but the problem is apparent: The intention was that return result return the result from the oddsAndEvens function, but in reality it returns the result from the anonymous function enclosing it. In languages like Ruby and C++, blocks are lighter weight than lambdas precisely because the semantics of things like return are different from within a block than from within a function body.
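One possible repair, sketched here on the assumption that the intent was a string of n repetitions, is to treat each IIFE as an expression and return its value from the enclosing function. The name oddsAndEvensFixed and the empty-string starting value are additions for illustration, not part of the original example.

```javascript
function oddsAndEvensFixed (n) {
  // A return inside an IIFE only leaves the anonymous function,
  // so the outer function has to return the IIFE's value itself.
  if (n % 2 === 0) {
    return (function () {
      var result = '';
      for (var i = 0; i < n; ++i) {
        result = result + 'even';
      }
      return result;
    })();
  }
  else {
    return (function () {
      var result = '';
      for (var i = 0; i < n; ++i) {
        result = result + 'odd';
      }
      return result;
    })();
  }
}

console.log(oddsAndEvensFixed(4));
  //=> "eveneveneveneven"
console.log(oddsAndEvensFixed(3));
  //=> "oddoddodd"
```

The fix works, but notice how much ceremony it takes: every pretend-block needs its own return, and the caller has to remember to pass that value along.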

It doesn’t matter that let is implemented with a lambda in our toy Scheme because our toy Scheme doesn’t have a return form. But it matters greatly in JavaScript. Imagine, for example, that you use Esprima to write a preprocessor. You could lose all grip with reality and translate ES5 code like this:

let(x = 1, y = 2) do {
  x + y;
} while(true);

into:

(function (x, y) {
  return x + y;
})(1, 2);

Besides the vestigial while(true), this will break badly whenever someone tries to use a return inside our pretend-let, just as we saw above. And it gets worse: What is the meaning of this inside our so-called blocks?

Now, many of these problems can be fixed. You could write .call(this, 1, 2) instead of (1, 2) to preserve this. You could even use try and catch to establish a new scope for a single variable, as let-er does. But once again, we find that when we try to bolt the features from one language onto another, we create a leaky abstraction that falls down for anything but the obvious cases.
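As a rough sketch of the .call(this, ...) repair, consider a pretend-let inside a method. The counter object and its addAll method are hypothetical names invented for this illustration.

```javascript
var counter = {
  count: 10,
  addAll: function (x, y) {
    // The IIFE stands in for a block. Invoking it with .call(this, ...)
    // forwards the method's `this` into the pretend-block; a plain
    // (x, y) invocation would not bind `this` to counter.
    return (function (a, b) {
      return this.count + a + b;
    }).call(this, x, y);
  }
};

console.log(counter.addAll(1, 2));
  //=> 13
```

This patches one leak, but return, break, and arguments still behave as they do in a function rather than a block.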

are we doomed?

Now these examples are contrived, but they reproduce actual bugs I’ve encountered when writing code using similar mechanisms. And my experience so far is that the more complex the abstraction being borrowed from one language and Greenspunned onto another, the more annoying the abstraction leaks become. Don’t get me started on lazy collections in Ruby!

It’s natural to wonder if all such efforts should be dismissed as a bad idea. They can be unfamiliar to uniglot programmers, and if my contention is correct, they are fraught with edge cases and subtle bugs. Must we always try to “cut with the grain” and use a language’s “natural” idioms?

Personally, I embrace features from other languages. But recognizing that they have limitations, I avoid embracing them to the point that they become prevalent. Something like trampolining can be a very useful tool for surgically solving a particular problem in a language that lacks Tail Call Elimination, but trampolining all of a code base’s method calls would be an act of masochism.
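For readers who haven’t met the technique, here is a minimal sketch of trampolining in JavaScript. The names trampoline, sumTo, and safeSumTo are hypothetical, and this simple version assumes the trampolined function never needs to return a function as its final answer.

```javascript
// trampoline: call fn, then keep invoking the result for as long as
// it is a function (a "thunk"), so the stack never grows.
function trampoline (fn) {
  return function () {
    var result = fn.apply(this, arguments);
    while (typeof result === 'function') {
      result = result();
    }
    return result;
  };
}

// A sum in tail position that returns a thunk instead of recursing.
function sumTo (n, acc) {
  if (n === 0) return acc;
  return function () { return sumTo(n - 1, acc + n); };
}

var safeSumTo = trampoline(sumTo);

console.log(safeSumTo(100000, 0));
  //=> 5000050000
```

It solves the stack depth problem for this one function, at the price of rewriting the function to cooperate with the trampoline, which is exactly why it is better used surgically than everywhere.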

And perhaps, we can learn from this that languages are limited. While it might seem like we can implement any feature from another language we like, the reality is that we can write code that looks like another language’s code, but under the hood it’s still our original language, and trouble awaits those who blindly embrace the abstraction.

If another language’s abstractions are so convenient, maybe we should consider switching rather than writing an ad hoc, informally-specified, bug-ridden, slow implementation of half of a better idea.

  1. The quality of being as obscure as Ick, an obscure Ruby library. 

https://raganwald.com/2013/08/01/leaky-greenspunned-abstractions
I hated, hated, hated this CoffeeScript

In It’s a Mad, Mad, Mad, Mad World: Scoping in CoffeeScript and JavaScript, I translated a small snippet of JavaScript almost directly to CoffeeScript. The point was to compare the way JavaScript and CoffeeScript scope variables, so it was necessary to reproduce the code and variables almost directly.

I discussed the failure modes of each language. Then, in the conclusion, I offered that JavaScript programmers rarely encounter the JavaScript failure modes. This is true: JavaScript programmers often enable strict mode ("use strict") and/or employ various lint tools that identify possible failures early. I also offered that CoffeeScript programmers rarely encounter CoffeeScript’s failure modes. I believe this is also true, but for different reasons. One of those reasons is that idiomatic CoffeeScript rarely resembles idiomatic JavaScript.
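To make the strict-mode point concrete, here is a small sketch in which strict mode turns an accidental global into an immediate error. The function strictRow is a hypothetical variation on the row function from the previous post, with the var i declaration deliberately left out.

```javascript
function strictRow (numberOfCells) {
  'use strict';
  var str = '';
  // i is never declared. In strict mode, assigning to an undeclared
  // variable throws instead of quietly creating a global.
  for (i = 0; i < numberOfCells; ++i) {
    str = str + '<td></td>';
  }
  return '<tr>' + str + '</tr>';
}

var message;
try {
  strictRow(3);
} catch (e) {
  message = e.name;
}

console.log(message);
  //=> "ReferenceError"
```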

A direct translation of JavaScript to CoffeeScript will be ugly, so much so that people will hate, hate, hate it. And even worse, it is much more likely to contain errors than idiomatic CoffeeScript. Take this JavaScript from the previous post:

function table (numberOfRows, numberOfColumns) {
  var i,
      str = '';
  for (i = 0; i < numberOfRows; ++i) {
    str = str + row(numberOfColumns);
  }
  return '<table>' + str + '</table>';
  
  function row (numberOfCells) {
    var i,
        str = '';
    for (i = 0; i < numberOfCells; ++i) {
      str = str + '<td></td>';
    }
    return '<tr>' + str + '</tr>';
  }
}

The literal translation to CoffeeScript is horrible and definitely not idiomatic:

table = (numberOfRows, numberOfColumns) ->
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"
  
table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></table>"

One way to write this same thing in idiomatic CoffeeScript is to use a comprehension. Comprehensions are familiar to Python programmers (as is CoffeeScript’s significant whitespace). Here’s a comprehension-based implementation, along with the debug line that caused a failure in the previous code by “capturing” a local variable:

console.log('here') for i in [1..5]

table = (numberOfRows, numberOfColumns) ->
  row = (numberOfColumns) ->
    "<tr>#{ ('<td></td>' for i in [1..numberOfColumns]).join('') }</tr>"
  "<table>#{ (row(numberOfColumns) for i in [1..numberOfRows]).join('') }</table>"

table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></table>"

It works because CoffeeScript creates a safe for loop behind the scenes for us, and using string interpolation we avoid having to use extra variables for catenation. This code is possible in CoffeeScript because everything, including comprehensions, is an expression in CoffeeScript. JavaScript has lots of statements, such as its for loops, that do not produce values. So in JavaScript, we have to manually collect the values with extra variables.

It will always be possible to make a mistake with scopes in CoffeeScript. However, the point of this example is that when you naturally use the features CoffeeScript provides, you need fewer non-essential variables such as loop indices, and therefore have fewer opportunities to make these mistakes.

So the moral of the story is this: When you adopt a language, learn to program using the language’s features in a natural way. Do not write JavaScript code in CoffeeScript, just as you wouldn’t write CoffeeScript in JavaScript. This way, you’ll have fewer bugs. And nobody will hate, hate, hate your code.

https://raganwald.com/2013/07/29/I-hated-hated-hated-this-coffeescript
It's a Mad, Mad, Mad, Mad World: Scoping in CoffeeScript and JavaScript

“I’ve been mad for fucking years, absolutely years, been over the edge for yonks, been working me buns off for bands…”

“I’ve always been mad, I know I’ve been mad, like the most of us…very hard to explain why you’re mad, even if you’re not mad…”

–“Speak to Me,” Nick Mason

coffeescript

CoffeeScript, as many people know, is a transpile-to-JavaScript language.[1] For the most part, it does not introduce major changes in semantics. For example, this:

-> 'Hello, world'

Transpiles directly to:

function () { return 'Hello, world'; }

This is convenient syntactic sugar, and by removing what some folks call the “syntactic vinegar” of extraneous symbols, it encourages the use of constructs that would otherwise make the code noisy and obscure the important meaning. The vast majority of features introduced by CoffeeScript are of this nature: They introduce local changes that transpile directly to JavaScript.[2]

CoffeeScript also introduces features that don’t exist in JavaScript, such as destructuring assignment and comprehensions. In each case, the features compile directly to JavaScript without introducing changes elsewhere in the program. And since they don’t look like existing JavaScript features, little confusion is created.

equals doesn’t equal equals

One CoffeeScript feature does introduce confusion, and the more you know JavaScript the more confusion it introduces. This is the behaviour of the assignment operator, the lowly (and prevalent!) equals sign:

foo = 'bar'

Although it looks almost identical to assignment in JavaScript:

foo = 'bar';

It has different semantics. That’s confusing. Oh wait, it’s worse than that: Sometimes it has different semantics. Sometimes it doesn’t.

So what’s the deal with that?

Well, let’s review the wonderful world of JavaScript. We’ll pretend we’re in a browser application, and we write:

foo = 'bar';

What does this mean? Well, it depends: If this is in the top level of a file, and not inside of a function, then foo is a global variable. In JavaScript, global means global across all files, so you are now writing code that is coupled with every other file in your application or any vendored code you are loading.

But what if it’s inside a function?

function fiddleSticks (bar) {
  foo = bar;
  // ...
}

For another example, many people enclose file code in an Immediately Invoked Function Expression (“IIFE”) like this:

;(function () {
  foo = 'bar'
  // more code...
})();

What do foo = 'bar'; or foo = bar; mean in these cases? Well, it depends as we say. It depends on whether foo is declared somewhere else in the same scope. For example:

function fiddleSticks (bar) {
  var foo;
  foo = bar;
  // ...
}

Or:

function fiddleSticks (bar) {
  foo = bar;
  // ...
  var foo = batzIndaBelfrie;
  // ...
} 

Or even:

function fiddleSticks (bar) {
  foo = bar;
  // ...
  function foo () {
    // ...
  }
  // ...
}

Because of something called hoisting,[3] these all mean the same thing: foo is local to function fiddleSticks, and therefore it is NOT global and ISN’T magically coupled to every other file loaded whether written by yourself or someone else.
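A quick way to convince yourself is to run a variation where the declaration comes after the assignment and then check whether a global foo exists. This sketch reuses the name fiddleSticks from the examples above; the return statement and the checks are additions for illustration.

```javascript
function fiddleSticks (bar) {
  foo = bar;  // assigns the local foo, because...
  var foo;    // ...this declaration is hoisted to the top of the function
  return foo;
}

console.log(fiddleSticks('local'));
  //=> "local"

// No global foo was created by the assignment.
console.log(typeof foo);
  //=> "undefined"
```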

nested scope

JavaScript permits scope nesting. If you write this:

function foo () {
  var bar = 1;
  var bar = 2;
  return bar;
}

Then bar will be 2. Declaring bar twice makes no difference, since both declarations are in the same scope. However, if you nest functions, you can nest scopes:

function foo () {
  var bar = 1;
  function foofoo () {
    var bar = 2;
  }
  return bar;
}

Now function foo will return 1 because the second declaration of bar is inside a nested function, and therefore inside a nested scope, and therefore it’s a completely different variable that happens to share the same name. This is called shadowing: The variable bar inside foofoo shadows the variable bar inside foo.
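The var inside the nested function is what makes the difference. As a small sketch (withShadowing, withoutShadowing, and inner are hypothetical names), compare declaring bar in the inner function with merely assigning it:

```javascript
function withShadowing () {
  var bar = 1;
  function inner () {
    var bar = 2; // a new variable that shadows the outer bar
  }
  inner();
  return bar;
}

function withoutShadowing () {
  var bar = 1;
  function inner () {
    bar = 2; // no var here, so this assigns the outer bar
  }
  inner();
  return bar;
}

console.log(withShadowing());
  //=> 1
console.log(withoutShadowing());
  //=> 2
```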

javascript failure modes

Now over time people have discovered that global variables are generally a very bad idea, and accidental global variables doubly so. Here’s an example of why:

function row (numberOfCells) {
  var str = '';
  for (i = 0; i < numberOfCells; ++i) {
    str = str + '<td></td>';
  }
  return '<tr>' + str + '</tr>';
}

function table (numberOfRows, numberOfColumns) {
  var str = '';
  for (i = 0; i < numberOfRows; ++i) {
    str = str + row(numberOfColumns);
  }
  return '<table>' + str + '</table>';
}

Let’s try it:

table(3, 3)
  //=> "<table><tr><td></td><td></td><td></td></tr></table>"

We only get one row, because the variable i in the function row is global, and so is the variable i in the function table, so they’re the exact same global variable. Therefore, after counting out three columns, i is 3 and the for loop in table finishes. Oops!

And this is especially bad because the two functions could be anywhere in the code. If you accidentally use a global variable and call a function elsewhere that accidentally uses the same global variable, pfft, you have a bug. This is nasty because there’s this weird action-at-a-distance where a bug in one file reaches out and breaks some code in another file.

Now, this isn’t a bug in JavaScript the language, just a feature that permits the creation of very nasty bugs. So I call it a failure mode, not a language bug.

coffeescript to the rescue

CoffeeScript addresses this failure mode in two ways. First, all variables are local to functions. If you wish to do something in the global environment, you must do it explicitly. So in JavaScript:

UserModel = Backbone.Model.extend({ ... });
var user = new UserModel(...);

While in CoffeeScript:

window.UserModel = window.Backbone.Model.extend({ ... })
user = new window.UserModel(...)

Likewise, CoffeeScript bakes the IIFE enclosing every file in by default. So instead of:

;(function () {
  // ...
})();

You can just write your code.4

The net result is that it is almost impossible to replicate the JavaScript failure mode of creating or clobbering a global variable by accident. That is a benefit.

what would coffeescript do?

This sounds great, but CoffeeScript can be surprising to JavaScript programmers. Let’s revisit our table function. First, we’ll fix it:

function row (numberOfCells) {
  var i,
      str = '';
  for (i = 0; i < numberOfCells; ++i) {
    str = str + '<td></td>';
  }
  return '<tr>' + str + '</tr>';
}

function table (numberOfRows, numberOfColumns) {
  var i,
      str = '';
  for (i = 0; i < numberOfRows; ++i) {
    str = str + row(numberOfColumns);
  }
  return '<table>' + str + '</table>';
}

table(3, 3)
  //=> "<table><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></table>"

Good! Now suppose we notice that no function calls row other than table. Although there is a slightly more “performant” way to do this, we decide that the clearest and simplest way to indicate this relationship is to nest row inside table Pascal-style:

function table (numberOfRows, numberOfColumns) {
  var i,
      str = '';
  for (i = 0; i < numberOfRows; ++i) {
    str = str + row(numberOfColumns);
  }
  return '<table>' + str + '</table>';
  
  function row (numberOfCells) {
    var i,
        str = '';
    for (i = 0; i < numberOfCells; ++i) {
      str = str + '<td></td>';
    }
    return '<tr>' + str + '</tr>';
  }
}

It still works like a charm, because the i in row shadows the i in table, so there’s no conflict. Okay. Now how does it work in CoffeeScript?

Here’s one possible translation to CoffeeScript:

table = (numberOfRows, numberOfColumns) ->
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"
  
table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></table>"

It works just fine. Here’s another:

table = (numberOfRows, numberOfColumns) ->
  str = ""
  i = 0
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"
  
table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr></table>"

Broken! And a third:

str = ""
i = 0
table = (numberOfRows, numberOfColumns) ->
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"

table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr></table>"

Also broken! Although the three examples look similar, the first gives us what we expect but the second and third do not. What gives?

Well, CoffeeScript doesn’t allow us to “declare” that variables are local with var. They’re always local. But local to what? In CoffeeScript, they’re local to the function that either declares the variable as a parameter or that contains the first assignment to the variable.5 So in our first example, reading from the top, the first use of str and i is inside the row function, so CoffeeScript makes them local to row.

A little later on, the code makes an assignment to i and str within the table function. This scope happens to enclose row’s scope, but it is different so it can’t share the str and i variables. CoffeeScript thus makes the i and str in table variables local to table. As a result, the i and str in row end up shadowing the i and str in table.

The second example is different. The first i encountered by CoffeeScript is in table, so CoffeeScript makes it local to table as we’d expect. The second i is local to row. But since row is enclosed by table, it’s possible to make that i refer to the i already defined, and thus CoffeeScript does not shadow the variable. The i inside row is the same variable as the i inside table.

In the third example, i (and str) are declared outside of both table and row, and thus again they all end up being the same variable with no shadowing.
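One way to picture the rule is to write out, by hand, JavaScript that places each var where CoffeeScript’s rule would put it. The following is my own approximation for the first and second examples, not actual compiler output, and the names shadowedTable and capturedTable are invented for the comparison:

```javascript
// First example: row assigns i and str before table does,
// so each function gets its own declarations.
function shadowedTable (numberOfRows, numberOfColumns) {
  var i, row, str;
  row = function (numberOfCells) {
    var i, str;                 // local to row: these shadow table's
    str = '';
    i = 0;
    while (i < numberOfCells) {
      str = str + '<td></td>';
      ++i;
    }
    return '<tr>' + str + '</tr>';
  };
  str = '';
  i = 0;
  while (i < numberOfRows) {
    str = str + row(numberOfColumns);
    ++i;
  }
  return '<table>' + str + '</table>';
}

// Second example: table assigns i and str first,
// so row gets no declarations of its own.
function capturedTable (numberOfRows, numberOfColumns) {
  var i, row, str;
  str = '';
  i = 0;
  row = function (numberOfCells) {
    str = '';                   // table's str
    i = 0;                      // table's i
    while (i < numberOfCells) {
      str = str + '<td></td>';
      ++i;
    }
    return '<tr>' + str + '</tr>';
  };
  str = '';
  i = 0;
  while (i < numberOfRows) {
    str = str + row(numberOfColumns);
    ++i;
  }
  return '<table>' + str + '</table>';
}

capturedTable(3, 3)
  //=> "<table><tr><td></td><td></td><td></td></tr></table>"
```

The only difference between the two is the var line inside row, and that line is decided by what came earlier in the file.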

Now, CoffeeScript could scan an entire function before deciding what variables belong where, but it doesn’t. That simplifies things, because you don’t have to worry about a variable being declared later that affects your code. Everything you need to understand is in the same file and above your code.

In many cases, it also allows you to manipulate whether a variable is shadowed or not by carefully controlling the order of assignments. That’s good, right?

all those against the bill, say “nay nay!”

Detractors of this behaviour say this is not good. When JavaScript is written using var, the meaning of a function is not changed by what is written elsewhere in the file before the code in question. Although you can use this feature to control shadowing by deliberately ordering your code to get the desired result, a simple refactoring can break what you’ve already written.

For example, if you write:

table = (numberOfRows, numberOfColumns) ->
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"

table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></table>"

All will be well, until you are debugging late one night, and you add:

console.log('Hello!') for i in [1..5]

table = (numberOfRows, numberOfColumns) ->
  row = (numberOfCells) ->
    str = ""
    i = 0
    while i < numberOfCells
      str = str + "<td></td>"
      ++i
    "<tr>" + str + "</tr>"
  str = ""
  i = 0
  while i < numberOfRows
    str = str + row(numberOfColumns)
    ++i
  return "<table>" + str + "</table>"

table(3,3)
  #=> "<table><tr><td></td><td></td><td></td></tr></table>"

This breaks your code because the i you used at the top “captures” the other variables so they are now all the same thing. To someone used to JavaScript, this is a Very Bad Thing™. When you write this in JavaScript:

function row (numberOfCells) {
  var i,
      str = '';
  for (i = 0; i < numberOfCells; ++i) {
    str = str + '<td></td>';
  }
  return '<tr>' + str + '</tr>';
}

It will always mean the same thing no matter where it is in a file, and no matter what comes before it or after it. There is no spooky “action-at-a-distance” where code somewhere else changes what this code means. Whereas in CoffeeScript, you don’t know whether the i in row is local to row or not without scanning the code that comes before it in the same or enclosing scopes.

coffeescript’s failure mode

In this case, CoffeeScript has a failure mode: The meaning of a function seems to be changed by altering its position within a file or (in what amounts to the same thing) by altering code that appears before it in a file in the same or enclosing scopes. In contrast, JavaScript’s var declaration never exhibits this failure mode. JavaScript has a different action-at-a-distance failure mode, where neglecting var causes action at a much further distance: The meaning of code can be affected by code written in an entirely different file.

Mind you, the result of calling our row function is not affected by declaring an i in an enclosing scope. Our function always did what it was expected to do and always will. You and I know that the change breaks the table function because row now uses a variable from an enclosing scope, but imagine that we were writing unit tests. All of our tests for row would continue to pass; it’s the tests for table that break. So in an evidence-based programming sense, when we maintain the habit of always initializing variables we expect to use locally, changing code outside of those functions only changes the evidence that the enclosing code produces.

So one way to look at this is that row is fine, but moving i around changes the meaning of the code where you move i. And why wouldn’t you expect making changes to table to change its meaning?

so which way to the asylum?

If you ask around, you can find people who dislike JavaScript’s behaviour, and others who dislike CoffeeScript’s behaviour. Accidentally getting global variables when you neglect var is brutal, and action-at-a-distance affecting the meaning of a function (even if it is always within the same file) flies against everything we have learned about the importance of writing small chunks of code that completely encapsulate their behaviour.

Of course, programmers tend to internalize the languages they learn to use. If you write a lot of JavaScript, you habitually use var and may have tools that slap your wrist when you don’t. You’re bewildered by all this talk of action-at-a-distance. It will seem to you to be one of those rookie mistake problems that quickly goes away and is not a practical concern.

Likewise, if you write twenty thousand lines of CoffeeScript, you may never be bitten by its first-use-is-a-declaration behaviour. You may use variable names like iRow and iColumn out of habit. You may find that your files never get so large and your functions so deeply nested that a “capture” problem takes longer than three seconds to diagnose and fix.

It’s a bit of a cop-out, but I suggest that this issue resembles the debate over strong, manifest typing vs. dynamic typing. In theory, one is vastly preferable to the other. But in practice, large stable codebases are written with both kinds of languages, and programmers seem to adjust to overcome the failure modes of their tools unconsciously while harvesting the benefits that each language provides.

  1. Yes, “transpile” is a real word, or at least, a real piece of jargon. It’s a contraction of “transcompiler,” which is a compiler that translates one language to another language at a similar level of abstraction. There’s room for debate over what constitutes a “similar level of abstraction.” https://en.wikipedia.org/wiki/Source-to-source_compiler 

  2. There are other possibilities: You could write a Tail-Call Optimized language that transpiles to JavaScript, however its changes wouldn’t always be local: Some function calls would be rewritten substantially to use trampolining. Or adding continuations to a language might cause everything to be rewritten in continuation-passing style. 

  3. Scanning all of the code first is called “hoisting,” in part because some declarations nested in blocks are “hoisted” up to the level of the function, and all declarations are “hoisted” to the top of the function. This is a source of confusion for some programmers, but it isn’t germane to this essay. 

  4. If you don’t want the file enclosed in an IIFE, you can compile your CoffeeScript with the --bare command-line switch. 

  5. Lexical scope and order 

https://raganwald.com/2013/07/27/Ive-always-been-mad
Yes, JavaScript is a Lisp

Some people will tell you that JavaScript isn’t Scheme. Of course it isn’t: Scheme is Scheme, JavaScript is JavaScript. There are excellent reasons why JavaScript is nothing like Scheme. Pretty much everything that makes Scheme Scheme is nowhere to be found in JavaScript.

One of the fundamental ideas in Scheme is to be minimal and elegant to the point of pain. In early Schemes, there was no real block scoping. You had to create block scopes using let as you do today, but let was implemented as a macro that acted a lot like JavaScript’s Immediately Invoked Function Expression idiom.
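To make the comparison concrete, here is a minimal sketch of the IIFE idiom doing the job that let does in Scheme. The Scheme form in the comment is my own illustration, and the variable names are arbitrary:

```javascript
var x = 'outer';

// Roughly what (let ((x 41)) (+ x 1)) amounts to: a function that is
// called immediately, with the binding supplied as an argument.
var answer = (function (x) {
  return x + 1;
})(41);

answer
  //=> 42
x
  //=> "outer"
```

The inner x lives and dies with the function call, so the outer x is untouched.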

Likewise, the Scheme standard mandates Tail Call Optimization because the philosophy of the language is that if you have recursion, you don’t need loops. And there is no real exception system because if you have continuations, you can make pretty-much anything you want for yourself.

If Lisp is a “programmable programming language,” then Scheme is an assemble-it-at-home kit for making yourself a programmable programming language. JavaScript does not have this quality AT ALL.

JavaScript also isn’t Lisp as people who write Lisp use the word. Agree or disagree, the “Lisp Community” has coalesced around Common Lisp. Anything that doesn’t harken back to MacLisp is considered not-Lisp by experts. You know, Scheme looks a lot like a Lisp-1 to everyone else, but hard-core Lispers will tell you that Scheme isn’t Lisp and that the only thing it has in common with Lisp is CONS. Seriously.

names and clans

Macpherson Dress Tartan

Nevertheless, I often say that JavaScript is “a” Lisp, although not Lisp. I say it the same way that I might say that a Scottish Lee is “a” MacPherson even though obviously, the surname Lee is not spelled “MacPherson.”

The trick (for me) is context. Saying that JavaScript is “a” Lisp when talking about Lisp is not helpful. It does not add any bits of information to understanding Lisp. If someone is learning Lisp, I do not suggest they learn JavaScript first.

But saying that JavaScript is “a” Lisp when talking about JavaScript… That’s a different matter. It makes one pause. The obvious response is to ask “In what way?” This leads to conversations about recursion and trampolining and the use of IIFEs to create block scope.
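For instance, the trampolining conversation usually starts with something like this sketch. It assumes an engine without Tail Call Optimization, and the names trampoline, sumDownThunked, and sumTo are mine:

```javascript
// Keep calling as long as the result is a function (a "thunk").
// Each bounce returns to this loop, so the stack never grows.
function trampoline (fn) {
  return function () {
    var result = fn.apply(this, arguments);
    while (typeof result === 'function') {
      result = result();
    }
    return result;
  };
}

// A tail-recursive sum, rewritten to return a thunk
// instead of making the recursive call directly.
function sumDownThunked (n, accumulator) {
  return n === 0
    ? accumulator
    : function () { return sumDownThunked(n - 1, accumulator + n); };
}

var sumTo = trampoline(sumDownThunked);

sumTo(100000, 0)
  //=> 5000050000
```

The limitation of this particular sketch is that it cannot return a function as its final answer, because the loop would try to bounce it.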

It leads one to think that tools like Esprima might be a good fit with the language rather than this one weird trick for eliminating code wrinkles.

I think it’s mostly false, but just true enough to be interesting. And that’s good enough for me.

pax, exeunt

So in summary, I do not think that JavaScript is Scheme or Lisp. I do not think it adds value to thinking about Scheme or Lisp to think that there is a meaningful relationship between Scheme or Lisp and JavaScript. But I do think it is interesting and productive to think about what JavaScript draws from Scheme and/or Lisp, and to that end I often say that JavaScript is “a” Lisp when talking about JavaScript.


p.s. Dave Herman pointed me to a great essay. The opening line says it best:

Programming language “paradigms” are a moribund and tedious legacy of a bygone age.

https://raganwald.com/2013/07/19/javascript-is-a-lisp
Unfinished Work #1: Bind-by-Contract

This is an unfinished work. I am still trying to think through all of the implications of binding names to entities using test suites. It seems fairly obvious how it would work with Ruby Gems. But does it scale downwards? Can I specify a function to be called by contract? How about a class? A constructor? Does this integrate with pattern-matching for function signatures? I still don’t know!

abstract

In this essay, I argue that programming languages tend to solve less-and-less important problems as they mature, and that it’s important for Ruby to break with this behaviour if it is to survive.

I suggest “coupling” is a problem that needs to be solved, and offer patching zero-day exploits as a motivation. I finish by presenting “bind-by-contract” as one potential disruptive fix.

introduction

There’s an oft-quoted aphorism that goes like this: The definition of “Insanity” is doing the same thing over and over, but expecting different results. In the field of programming language design, we are extremely good at doing the same thing over and over again. Popular languages mix and remix elements of other popular languages, and we seem to have become trapped in the vicinity of local maxima, where small variations just move us to a different part of the same neighbourhood.

Meanwhile, our problems are multiplying, especially as our code bases grow. I just read someone’s claim that they had a Rails app in production with more than five thousand model classes. This is to be expected: Software grows over time, and statistically we can expect that the likelihood of finding a Juggernaut-sized application grows over time.

But does our tooling grow with our applications? Does the underlying language platform grow with our applications? Generally, no.

why languages cluster around local maxima

Languages survive in two ways: By attracting new developers, and by preventing the defection of existing developers. Languages emphasize attracting new developers early in their lifecycle, and emphasize preventing defection once they become mature. This model is very well aligned with the “Technology Adoption Lifecycle” espoused in the book Crossing the Chasm.

Programming languages have historically been subordinate to hardware and/or operating systems. People generally adopt a programming language because it’s the dominant way of working with a platform that is itself popular. So people adopted JavaScript because Netscape Navigator was popular and other browsers embedded JavaScript. People adopted Objective C because iPhones are popular. And so on.

But what about preventing defection to other programming languages when a language becomes mature? There’s very little a language can do if its underlying platform crashes in popularity. If iOS were to fail as an ecosystem, Objective C would almost certainly crumble, given that it hasn’t made any non-trivial inroads to another platform.

But languages can prevent defection within their “natural” platform. Within a single platform, languages struggle for mindshare. As I write this, CoffeeScript is struggling with JavaScript. Node and JavaScript are struggling with Rails and Ruby. As described in Reis and Trout’s classic Marketing Warfare, the leader in any market has an ideal strategy, playing defence.

There are three principles of defence:

  1. Defensive strategies should only be pursued by the actual leader.
  2. Attacking yourself is the best defensive strategy.
  3. Always cover strong offensive moves by competitors.

Programming languages tend to defend themselves using the third strategy, “covering strong offensive moves by competitors.” For example, introducing lambdas to Java. Or introducing fat arrows to JavaScript. Or skinny arrows to Ruby. In other words, copying the selling features of other languages that seem to be gaining traction.

If languages spend a lot of effort copying each other, it cannot be surprising that we see very little substantial change. For that to happen, languages would also have to embrace the second, more powerful strategy: Attacking themselves. In other words, having the courage to incorporate disruptive change.

And notice, please, that to attack itself, a language cannot simply shore up or correct a weakness. For example, adding WebWorkers to JavaScript corrects a weakness with respect to multi-threading. This is not attacking itself.

Now if JavaScript were to embrace multi-threading by changing its semantics and enforcing immutability of variables, that would be attacking itself.

Programming languages rarely change so drastically that they undermine their own strengths. People say this is because of the importance of maintaining legacy code compatibility, &c. &c. blah blah. Let’s see why languages can and should disrupt themselves.

why ruby ought to make the jump to hyperspace

It’s all nonsense. If Ruby on Rails can evolve from 2.3 to 4.0, breaking old applications and requiring migration, Ruby itself can evolve in such a way that old programs will break. This is already the case for some minor misfeatures generally considered to be bugs such as the rules for scoping of block parameters that changed between 1.8.x and 1.9.x.

This is true for all popular programming languages, but I will focus on Ruby because first, I am familiar with the language, its history, and its community. And because second, it is transitioning to the “mature” stage of its life-cycle, and presents a unique opportunity to consider how it might engage the second defensive tactic, “attacking itself.”

Great disruptive changes generally take one of two forms: Taking some hitherto minor characteristic and elevating it, or taking some previously essential characteristic and throwing it away.

We could play “what if” games along these lines, and doing so would be a useful exercise for a creativity retreat. Post-its could be made up with various features (“Blocks,” “Gems,” “Eigenclasses,” “Everything’s an Object” and so forth), and after much back and forth and moving them around on the walls, we could come up with many fine proposals. I’m sure that some or even most of them would be superior to anything one person might come up with.

But for the sake of giving an example of such a disruptive change, here is a process I have used for generating ideas like this.

one source of opportunities to attack yourself

One fertile source of opportunities for disruption in programming languages is the tooling system surrounding each language.

Good things happen when you consider each tool the manifestation of an opinion about a language’s faults. When a particular tool becomes pervasive, when it has critical mass within a community, you have a very powerful sign that there is an opportunity to make a productive change.

For example, Java IDEs have magnificent support for making certain kinds of automatic refactoring. Even if you’d never seen another programming language, you could look at Eclipse and ask yourself, “what could we do to Java such that this kind of refactoring either wouldn’t be necessary or could be done within the language rather than with a tool?”

Perhaps you would independently invent metaprogramming or macros or something else, who knows?

Likewise, people write an awful lot of Markdown with CoffeeScript embedded in it. Someone might look at this and ask whether CoffeeScript ought to directly embrace Markdown.

So what tools are pervasive in Ruby? Rails, obviously. But the one that interests me is the Gem ecosystem. People are constantly working on ways to distribute reusable chunks of code and handling the dependencies between the various chunks, but these efforts are always built on top of the language, rather than being intrinsic to Ruby itself.

Another that interests me is testing. The community has fallen hard for developing comprehensive automated test suites for applications as well as for libraries/gems. And again, these efforts exist outside of the language itself.

Finally, there is a metric fuckload1 of Ruby code in Github. Today code is considered a moving, malleable thing that changes over time, not a static thing that is built and thereafter is subject to only minor tinkering.

When I step back and look at the situation, it seems reasonable that a language designed in the 1990s wouldn’t have much to say about gems, tests, or version control. But it’s surprising that as we are moving briskly towards its 20th year, we seem more interested in coroutines and generators than in the things we deal with every day.

what do gems, tests, and github tell us?

Obviously, the pervasive use of gems and tests tells us that code often has a complex set of dependencies. Github and the versions we find in things like gemfiles tell us that these dependencies change as code changes asynchronously.

Anybody dealing with a non-trivial codebase knows that coupling is a serious problem. The Java people have invested greatly in their various XML-driven IOC and DI schemes.

Coupling is more than an annoyance. It’s a critical problem today, because when a codebase evolves such that it is difficult or time-consuming to upgrade, or it is fragile and breaks when changes are made to its dependencies, that codebase is very vulnerable to attack if it powers an application on an Intranet or the Internet itself.

Ruby applications need to be easily patched to close zero-day exploits. This is not always the case, and if we do not solve this problem, there will be a sudden exodus towards a language that does.

But the kinds of changes that Ruby–and other mature languages–are making do little to address this critical problem. Everyone’s excited about arrows connecting things on a single line of code. We debate promises vs. callbacks and ask whether they ought to be Monads.

This makes sense because these are the things that other languages do well, so mature programming languages are “covering” strong offensive moves.

But new languages don’t have large, mature code bases with a heinous mass of dependencies. So new languages don’t introduce strong mechanisms to address these problems. And thus languages like Ruby will not gain these features simply by adopting features from the new new shiny thing.

Ruby needs to address such issues itself, by disrupting its own model.

the prime disruptor: names

As described above, one mechanism for disruption is to drop an existing feature. So here’s one we can drop: The dependence on naming things.

Naming things has been described as a hard problem. But what if we de-emphasize names? Names are coupling. Names of classes usually live in the global context. Most of the problems with metaprogramming come because global names introduce global coupling.

Imagine for a moment that we have some kind of component in Ruby. What if it doesn’t name anything in the global scope? This sounds hard until you remember that in Ruby, a lot of things can be anonymous. Classes and modules can be anonymous if we use constructs like some_clazz = Class.new instead of class SomeClazz.

If a component “exports” an anonymous class or set of classes, the code making use of the component can manage its namespace. Two different pieces of code can “import” two different components and give them the same name. Or give the same component different names.

You could have a regular Array and another that incorporates all the ActiveSupport enhancements. If an upgrade to a component breaks some of your old code, you could use the old version with the code that breaks, and use the new version with the code that doesn’t break.
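The idea is about Ruby, but the shape of it can be sketched in JavaScript, where anonymous constructors are already commonplace. Everything here is hypothetical: makeStackComponent stands in for a component, and billing and reporting stand in for two pieces of code that import it:

```javascript
// A hypothetical component that exports an anonymous constructor
// instead of installing a name in the global scope.
function makeStackComponent () {
  return function () {
    var items = [];
    this.push = function (item) { items.push(item); return this; };
    this.size = function () { return items.length; };
  };
}

// Each consumer chooses its own local name for what it imports.
var billing = (function () {
  var Stack = makeStackComponent();
  return new Stack().push('invoice').size();
})();

var reporting = (function () {
  var Pile = makeStackComponent();
  return new Pile().push('q1').push('q2').size();
})();

billing
  //=> 1
reporting
  //=> 2
```

Neither Stack nor Pile exists outside the code that bound it, so there is no global name for the two consumers to fight over.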

Some of this can be accomplished right now with more tooling, but the hardest problems in this area require changes to Ruby’s semantics at a deeper level. At the very least, they involve parsing and transpiling Ruby source.

Discarding the dependence on names is one very possible way for Ruby to attack itself.

bind-by-contract

What about tests and github? We have the idea that different pieces of code could “bind” names to different versions of the same component. What do tests tell us about versions?

Well today, we have the idea of Semantic Versioning. There are revisions that make no changes to the public interface of a component. There are revisions that expand the public interface, but do not change any existing behaviour. And there are versions that break full backwards compatibility by changing existing behaviour.

This can all be expressed with tests as follows. Just as a component has a public API and a private implementation, we can build two test suites, one that tests the public API and another that tests the private implementation.

Now let’s describe revisions in terms of changes to these test suites:

  • Some revisions do not change any tests at all.
  • Some revisions may be accompanied by changes to the private test suite, but no changes to the public test suite.
  • Some revisions may be accompanied by additional public test suite tests, but no changes to any existing tests.
  • Some revisions may be accompanied by changes or even the removal of existing public tests.

These descriptions are very similar to the semantic versions, but we are now describing revisions objectively, in terms of the revision’s contracted behaviour.
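As a rough sketch, the four kinds of revision can be told apart mechanically from what happened to the two test suites. I’m using JavaScript here rather than Ruby, and the function and property names are invented for illustration:

```javascript
// changes is a hypothetical summary of how a revision touched the suites.
function classifyRevision (changes) {
  if (changes.publicTestsChangedOrRemoved) return 'breaks existing behaviour';
  if (changes.publicTestsAdded) return 'expands the public interface';
  if (changes.privateTestsChanged) return 'changes the implementation only';
  return 'changes no tests';
}

classifyRevision({ privateTestsChanged: true })
  //=> "changes the implementation only"
classifyRevision({ publicTestsAdded: true, privateTestsChanged: true })
  //=> "expands the public interface"
```

The checks run from most to least disruptive, so a revision that does several things is labelled by the most disruptive one.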

Hmmm

whither version numbers?

What if we throw away version numbers for components? Version numbers are like names, we don’t need no stinking names. So how does a piece of code express its requirements on a component? With a test suite, of course. In whatever replaces a gemfile, we can specify test suites.

We can specify the tests the component must pass to work for us. We can specify the component’s own tests, and of course we have tests for our own code.

So now when “linking” or “bundling” or whatever-ing our code, the language ought to be able to apply whatever version of a component matches all of the tests we specify for the component and simultaneously does not break our own tests.
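Here is a toy sketch of what that matching might look like, written in JavaScript rather than Ruby. None of this is a real tool: revisions, contract, and bindByContract are names I made up, and a real system would search a registry rather than an array:

```javascript
// Hypothetical registry: two published revisions of one component.
var revisions = [
  { label: 'older',
    sum: function (list) {
      return list.reduce(function (a, b) { return a + b; });
    } },
  { label: 'newer',
    sum: function (list) {
      return list.reduce(function (a, b) { return a + b; }, 0);
    } }
];

// The contract: tests a revision must pass to work for us.
var contract = [
  function (component) { return component.sum([1, 2, 3]) === 6; },
  function (component) { return component.sum([]) === 0; }
];

// A test that throws counts as a failure.
function passes (component, test) {
  try {
    return test(component) === true;
  } catch (error) {
    return false;
  }
}

// Bind to the first candidate that satisfies every test, or to nothing.
function bindByContract (candidates, tests) {
  for (var k = 0; k < candidates.length; ++k) {
    var candidate = candidates[k];
    var satisfied = tests.every(function (test) {
      return passes(candidate, test);
    });
    if (satisfied) return candidate;
  }
  return null;
}

bindByContract(revisions, contract).label
  //=> "newer"
```

The older revision throws on an empty list, so it fails the contract and is passed over without anybody mentioning a version number.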

Such matching is rote, tedious work. Our language and/or platform should do this for us automatically. Crowd-sourcing this matching in a database of components ought to be a given.

This is all attainable. This is all highly compatible with the way Rubyists are already writing code.

bind-by-name-and-version -> bind-by-contract

Every idea needs a catch-phrase, so how about “bind-by-contract?” This captures the idea that when we write a piece of code, we used to say that we need ActiveRecord 3.2, binding to code by name and version.

But now we will say that we have this “contract,” expressed as a set of tests and our tolerance around whether the code can satisfy more or less than these tests. And we wish to bind to something satisfying the contract.

This, in conjunction with the movement away from binding code to global names, will give us some of the tools to manage ever-larger codebases with ever-more-rapidly evolving gems and dependencies.

In an era where we build critical infrastructure that is under perpetual attack and we must handle critical zero-day exploit patches, I feel it’s essential to tackle this problem head-on.

And I offer this one idea as an existence proof that there are things we can do. We should think of more things to do, of course. We should think very hard. But we absolutely, positively must not be afraid to reconsider any of the things about our languages that we consider “fundamental.”

Thank you for taking the time to read my thoughts on this subject.

  1. Not to be confused with the “Imperial Fuckload,” defined as the cumulative mass of stormtroopers carried aboard a Star Destroyer. 

https://raganwald.com/2013/06/20/bind-by-contract
Happy Birthday To Me

Why yes, it is my birthday today. Wondering what to get me? Just a second, some people are singing

Okay. It’s simple, really. I’d like yet another programming language. It doesn’t have to have entirely new ideas, but if it is going to recycle some old ideas, I’d really like it if it recycled some old ideas that every other damn language isn’t recycling.

Now, it’s easy to design an esoteric language and say “Here, Reg, knock yourself out.” But I have a special caveat. I want a language that is designed according to the principle of maximum surprise.

the principle of maximum surprise

Weird languages are just weird. They aren’t surprising, in the sense that if you walk into an insane asylum, you aren’t going to be surprised to discover weird things happening inside.

What’s really surprising is when you walk into a University, you see students and teachers bustling about, observe lectures in lecture halls, and figure it’s like any other University. Then someone explains that there are no formal courses and everything is self-organizing according to a set of rules and what looks like a normal university “emerges” from the special rules.

Surprise!

So a maximally surprising language looks a lot like a language you’re already familiar with, and for most trivial cases acts just like it, but it can do wildly non-trivial things because the semantics underneath are from Mars, but you’re from Venus.

For example.

pattern matching

SNOBOL (and yes, Icon) introduced pattern matching as a first-class concept. To the max, you might say.

You had primitives like SUCCEED, FAIL, and FENCE you could stick inside patterns to control backtracking.

Patterns composed. So you could write 'Hello' | 'Hi' to get a pattern that matched either of two strings. Hey, when you put it like that you get something interesting. 'Hello' is a string, sure, but it’s also a pattern. That’s very damn interesting, the idea that everything in a language is a pattern.

For one thing, I absolutely despise writing if a is 'foo' or a is 'bar' in today’s languages. It’s this English-likeness monster concept of a language that kinda looks like English, but really isn’t. Because if you try to write if a is 'foo' or 'bar' it doesn’t do what you think.

But a language with SNOBOL semantics would do exactly what you think and match a against a pattern made of 'foo' or 'bar'. Solid.
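To make the idea concrete in present-day JavaScript, here is a minimal sketch in which every plain value doubles as a pattern and patterns compose with an "or". The names toPattern, either, and matches are invented for this illustration; they are not SNOBOL primitives or part of any library.

```javascript
// Hypothetical helpers: a "pattern" is a function from a value to true or false.
function toPattern (p) {
  // A plain value is also a pattern: it matches anything strictly equal to itself.
  return typeof p === 'function' ? p : function (value) { return value === p; };
}

// Composes any number of patterns into one that matches if any of them match.
function either () {
  var patterns = [].slice.call(arguments, 0).map(function (p) {
    return toPattern(p);
  });
  return function (value) {
    return patterns.some(function (pattern) { return pattern(value); });
  };
}

function matches (value, pattern) {
  return toPattern(pattern)(value);
}

var a = 'bar';

matches(a, either('foo', 'bar'));
  //=> true
matches(a, either('foo', either('baz', 'qux')));
  //=> false
matches(a, 'bar');
  //=> true
```

It is a library bolted on top rather than language semantics, so it only hints at what "a is 'foo' or 'bar'" could mean if patterns were first-class.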

Speaking of pattern matching, the FP people have taken over this term and now use it to mean generic functions or what-have-you, meaning you can write this fictional JavaScript:

function length ([]) { return 0; }

function length ([first, butFirst...]) { return 1 + length(butFirst); }

Through the magic of destructuring, the language matches calls to length against the various definitions and evaluates the body of the first one that “matches” the parameters. It’s a fabulous way to write clear code.

But why argue whether “pattern matching” means SNOBOL or Haskell? I’d applaud a language that “swings both ways.” Let’s start with something subtle. Let’s say that this pseudo-CoffeeScript:

length = ([]) -> 0
length = ([first, butFirst...]) -> 1 + length(butFirst)

Is syntactic sugar for:

length = ([]) -> 0 or ([first, butFirst...]) -> 1 + length(butFirst)

Now we can think of length([yabba, dabba, doo]) as pattern matching, and what we call functions are also patterns that apply their bodies to anything they match.
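A rough approximation in ordinary JavaScript: treat each clause as a function that either produces a result or reports that it did not match, and combine the clauses with an or combinator. The NO_MATCH sentinel and the or function are assumptions made for this sketch, and the destructuring is simulated by hand.

```javascript
// Hypothetical sketch: a clause returns NO_MATCH when its "pattern" does not apply.
var NO_MATCH = {};

// Combines clauses into one function that evaluates the first clause that matches.
function or () {
  var clauses = [].slice.call(arguments, 0);
  return function () {
    for (var i = 0; i < clauses.length; ++i) {
      var result = clauses[i].apply(this, arguments);
      if (result !== NO_MATCH) return result;
    }
    throw new Error('no clause matched');
  };
}

var length = or(
  // stands in for ([]) -> 0
  function (list) {
    return list.length === 0 ? 0 : NO_MATCH;
  },
  // stands in for ([first, butFirst...]) -> 1 + length(butFirst)
  function (list) {
    return list.length > 0 ? 1 + length(list.slice(1)) : NO_MATCH;
  }
);

length(['yabba', 'dabba', 'doo']);
  //=> 3
```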

Hang on while I sip some birthday Scotch. Mmmm.

So if that’s the case, what does this mean?

x = 1
# ...
# lotsa code
# ...
x = 2

Well, by the above, it means that x = 1 or 2, not that x = 1 for a while and later x = 2. That’s incredibly interesting as well. Why shouldn’t that work? Why shouldn’t x have multiple values?
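One way to picture a variable with multiple values in today's JavaScript is to represent it as a list of possibilities and evaluate expressions over every combination. This is only a sketch; the flatMap helper is written out by hand here, and the variable y is an invented example.

```javascript
// Hypothetical sketch: a "multi-valued variable" is an array of possible values.
// flatMap applies fn to every possibility and flattens the results one level.
function flatMap (values, fn) {
  return values.reduce(function (acc, value) {
    return acc.concat(fn(value));
  }, []);
}

var x = [1, 2];      // x = 1 or 2
var y = [10, 20];    // y = 10 or 20

// "x + y" evaluated for every combination of possibilities
var sums = flatMap(x, function (xValue) {
  return flatMap(y, function (yValue) {
    return [xValue + yValue];
  });
});

sums;
  //=> [ 11, 21, 12, 22 ]
```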

You can see how a language like this might look just like CoffeeScript, or Ruby, or whatever, but things that might be errors in another language would be wonderful expressions in this language.

I’m going to move onto the next topic. It’s easy to write provocative things like this, but designing a solid language around these ideas would involve a deep dive into the List Monad.

Speaking of which…

monads, monads everywhere but no more drops of ink

I’m tired of Greenspunning monads and even more tired of reading explanations that force us to parse the semantics and the implementation in the same essay. It’s sweet that Haskell gives us syntactic sugar for Monads (like >>= and do), but what if we went “all the way?”

So this is simple. I want a language with Algol-ish syntax, meaning some form for delimiting sequences of expressions, like { ... }, or begin ... end, or if you’re old school, (BEGIN ...). And by default, it should do exactly what you expect.

But I also want a way of decorating those blocks with my own semantics. If I want Maybe semantics, I can do that. Or if I want List semantics. Or Reader, Writer, or State semantics. Or anything I feel like rolling myself.
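As a library-level approximation only, a block can be modelled as a list of one-argument steps, with the rule for moving from one step to the next supplied separately. Everything named here (block, plain, maybe, and the user data) is hypothetical and exists just for the sketch.

```javascript
// Hypothetical sketch: a "block" is a sequence of steps plus the semantics for running them.
function block (semantics) {
  var steps = [].slice.call(arguments, 1);
  return function (initial) {
    return steps.reduce(function (value, step) {
      return semantics(value, step);
    }, initial);
  };
}

// Default semantics: feed each result to the next step.
function plain (value, step) {
  return step(value);
}

// Maybe semantics: stop evaluating once a step has produced null or undefined.
function maybe (value, step) {
  return value == null ? value : step(value);
}

var users = {
  1: { name: 'Reg', address: { city: 'Toronto' } },
  2: { name: 'Guest', address: null }
};

function findUser (id) { return users[id]; }
function getAddress (user) { return user.address; }
function getCity (address) { return address.city; }

block(plain, findUser, getAddress, getCity)(1);
  //=> 'Toronto'
block(maybe, findUser, getAddress, getCity)(2);
  //=> null

// With plain semantics, user 2 would throw a TypeError at getCity(null).
```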

And I don’t want that being something on top of the default, I want this baked into both the syntax and the semantics so that it’s an entirely first-class concept. And when I say a first-class concept, I’m implying–well now I’m stating, I was implying–that it’s reflectable and mutable.

So I can take a function with “normal” semantics and produce a version of the function with different semantics. This implies that all expressions are first-class something-or-others that you can take apart and put back together.

For example, I can now impose error handling on something without some kind of black box “If it throws an exception I want to catch it” system. What I like about this delusional world is that we have a powerful argument for homoiconicity without demanding macros.

Did I say macros? I guess that’s as good a third example as any…

tired of macros

I always say JavaScript is “a” Lisp and in return I get shelled with mortar rounds, each lovingly engraved “Without Macros?” Well, there’s something to that argument. But there’s something else, and I recall ranting about this five years ago.

The upshot is, if you have call by name or by need semantics, you can actually get a huge amount of what you want from macros. It seems like a little thing at first, but as you get comfortable with it, you find yourself composing evaluation abstractions that have no analogue in functions or methods.

For example, this is ridiculously obvious:

try(
  postToServer(foo),
  displayFailure(bar),
  logError(blitz),
  giveUp()
);

try uses call-by-name semantics. It evaluates postToServer(foo), and if that fails, it evaluates displayFailure(bar), and if that fails, it evaluates logError(blitz), and if that fails it gives up.

Of course you could write it like this in CoffeeScript:

try(
  -> postToServer(foo),
  -> displayFailure(bar),
  -> logError(blitz),
  -> giveUp()
);

But there’s a lot of win in making the syntax go away visually, and even more win in that you can optimize this business of thunks if you know that they aren’t intended to be first-class functions but rather lightweight delayed evaluations closer to Ruby blocks.

For one thing, return semantics are all pizzled if we are manually writing functions in most languages. A return ought to return from the enclosing function, not return from inside the thunk to outside the thunk.
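For comparison, here is roughly what the explicit-thunk version costs in plain JavaScript. The helper is called attempt because try is a reserved word, and "fails" is assumed to mean "throws an exception"; both choices are assumptions of this sketch rather than anything the fictional syntax specifies.

```javascript
// Hypothetical helper: each argument is a thunk, a zero-argument function
// that delays an evaluation until attempt decides it is needed.
function attempt () {
  var thunks = [].slice.call(arguments, 0);
  var lastError;
  for (var i = 0; i < thunks.length; ++i) {
    try {
      return thunks[i]();   // evaluated only if every earlier thunk failed
    }
    catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

var log = [];

var outcome = attempt(
  function () { throw new Error('server unreachable'); },
  function () { log.push('displayed failure'); return 'displayed'; },
  function () { log.push('logged error'); return 'logged'; }
);

outcome;
  //=> 'displayed'
log;
  //=> [ 'displayed failure' ]
```

Note that each return inside a thunk only leaves the thunk, which is exactly the wrinkle described above.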

I’d like to see someone bring call-by-name back. Haskell does something incredibly interesting with pervasive call-by-need, it’s amazing, and it really speaks to the importance of not just removing harmful semantics but also of adding new ones that work in harmony with what remains.

But call-by-name and/or call-by-need could be added to existing mutation-friendly languages, I’d like to see someone try.

time to go

I’m off for a birthday ride, so I’ll stop mid-rant. It’s easy to suggest things, much harder to ship them when you have to consider the costs of each feature and the way they interact. But nevertheless, this is the one day of the year when I ought to be able to say, “Here’s what I’d like to unwrap.”

Each of these ideas adheres to my “principle of maximum surprise.” Namely, it looks and behaves exactly like you expect. Until it doesn’t, and then you are pleasantly but maximally surprised!

If I can’t have these by Midnight June 14th, feel free to give them to me by the morning of December 25th. And if you really want to surprise me, don’t do any of these, but figure out another way to engineer maximal nasal-coffee-over-keyboard spray.

You have nearly six months!

https://raganwald.com/2013/06/14/happy-birthday-to-me
Functional's Greatest Accomplishment
functional

Michael Fogus has just published Functional JavaScript: Introducing Functional Programming with Underscore.js. He is also writing a series of essays introducing people to existing resources for writing JavaScript in a functional style (or with a dollop of functional on top).

I was pleased to see his Fun.js – Functional JavaScript post kick it off. This introduces Oliver Steele’s library Functional Javascript (a/k/a Functional). Functional was one of the very first libraries to approach JavaScript from a strongly functional perspective, and it goes well beyond the usual mapping and reducing to include currying and partial application, combinators, sequencing, and the controversial string lambdas.

Today, many people rightly point to things like Underscore or CoffeeScript and ask if we need Functional. To be precise, they don’t ask, they criticize, often expressing strong disapproval as is common in our community.

Perhaps if someone wrote this library today we might ask if it solves a new problem or introduces a new way of thinking. Mostly, the answer might be “no.” We might say that the point-free programming offered by string lambdas is a matter of taste, and that libraries like Underscore, Lemonad, or allong.es cover everything else and more besides.

But of course, Functional wasn’t written today. It was written well before any of these other libraries. In fact, it inspired all three of the libraries just mentioned in whole or in part. When it was introduced, it seemed exotic and weird to the JavaScript community, but its ideas won a few hardy adventurers over, and they won some more people over, and gradually its seeds grew and spread and intermingled with other ideas.

Today, people look at it and say “Pooh,” but this is because Oliver Steele did such a good job of opening our minds to functional programming in JavaScript, that we turned around and made his library obsolete!

Functional is full of hacks and clever implementations of things that really ought to be part of a language itself. Things like currying really ought to be baked into everything as it is done in languages like Haskell. When you use a library like Functional, you quickly become sensitive to how JavaScript can get in your way. You agitate for change.

And indeed, the JavaScript language is changing. Lambda expressions are coming. Other features are already here, added to the language in part because Functional taught us to ask for them.

Functional seems obsolete today precisely because it did such a good job of teaching us how to think functionally.

looking forward

This is a very nice tribute, you might be saying, but is there anything “actionable” in this essay? Yes. Having looked backwards, let’s look forwards.

Functional was a library that introduced a new idea, an idea that seemed at the time to be going “against the grain” of the JavaScript community. But it wasn’t entirely artificial, it introduced ideas that were proven useful in other environments. And although the library did hack around a few things, on the whole the ideas did fit somewhat well with the JavaScript language.

Besides thinking nice thoughts about Oliver Steele, besides picking up Michael’s book and doing more functional programming, we can also ask ourselves, What library today introduces a new idea that may one day become commonplace?

That library will seem hackish. It will seem to cut against the grain of the community, but will fit fairly well with JavaScript the language. The idea will be proven in another context, even if that context isn’t as rabidly popular as the JavaScript context.

Perhaps when we see that library, we might think of Functional. We might say to ourselves, “Let’s try this out, let’s see what it teaches us.”

Functional’s accomplishment of teaching us to want even better functional tooling for JavaScript was great. But if it teaches us to open our minds to new and even better ideas, that will be an even greater accomplishment.

https://raganwald.com/2013/05/30/functionals-greatest-accomplishment
Inelegance

There are many definitions of “elegance,” but with respect to programming, I like to define it as, “The degree to which a set of features scale.” The more things you can do with a few features, the better.

One of the easiest ways to make a set of features scale is to make them composeable in as many ways as possible. If you have features A, B, and C, and they don’t compose, you can do three things. If they also compose in some binary way, you could have as many as nine: A, B, C, AB, BA, AC, CA, BC, and CB.

To be elegant, you want things to be composeable, and you ideally want them to compose in a natural, simple way. When designing feature “A,” you shouldn’t have to write special case code for composing A with B and A with C. What happens when you add “D?” Are you supposed to go back and retroactively change A, B, and C to compose with D?

Audrey Hepburn

inelegance

Special case code is often a sign of inelegance, a smell that the model is not right. I’ve written a lot of inelegant code. For example, I was recently working with adding some functional idioms to JavaScript.

(Although these examples are in JavaScript, I don’t think the concept of inelegance is JavaScript-specific.)

At some point, I wrote a curry function. To refresh your memory, currying creates a chain of functions that take one parameter each. For example, here is a function that curries a binary function:

function curry2 (f) {
  return function (first) {
    return function (last) {
      return f(first, last);
    }
  }
}

I also wrote a flip function that takes any function and reverses its arguments. For example, here is a function that flips any binary function:

function flip2 (f) {
  return function (first, last) {
    return f(last, first);
  }
}

The two functions compose: You can write curry2(flip2(f)). So far, things are minimal and elegant. Of course, not all functions are binary. How do we write a generalized flip or curry function, one that handles arbitrarily polyadic functions?
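Before generalizing, here is the binary composition at work. The divide function is an invented example, and curry2 and flip2 are repeated from above only so the snippet runs on its own.

```javascript
// curry2 and flip2 repeated from above so that this snippet is self-contained
function curry2 (f) {
  return function (first) {
    return function (last) {
      return f(first, last);
    };
  };
}

function flip2 (f) {
  return function (first, last) {
    return f(last, first);
  };
}

// an illustrative binary function, not part of the original examples
function divide (numerator, denominator) {
  return numerator / denominator;
}

curry2(divide)(10)(2);
  //=> 5
curry2(flip2(divide))(10)(2);
  //=> 0.2

// composing the two gives a "divide by two" function for free
var halve = curry2(flip2(divide))(2);

halve(10);
  //=> 5
```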

JavaScript functions are also objects with properties. One of them, .length, is the number of arguments declared. So:

(function () {}).length
  //=> 0
(function (x) {}).length
  //=> 1
(function (x, y) {}).length
  //=> 2
(function (x, y, z) {}).length
  //=> 3

Thus, we can discover the declared arity of a function with .length. We can also access the number of actual arguments and their values, regardless of arity, with a special variable called arguments. For example:

(function () {
  return '' + arguments.length + '-' + arguments[1];
})('a', 'b', 'c');
  //=> "3-b"

arguments isn’t actually an array, so if you want to convert it to an array, you need to use some legerdemain:

function toArray(args) {
  return [].slice.call(args, 0);
}

Putting all this together, I came up with this implementation of flip:

function flip (f) {
  return function () {
    return f.apply(this, toArray(arguments).reverse())
  }
}

function echo (a, b, c, d) {
  return [a, b, c, d];
}

echo(1, 2, 3, 4);
  //=> [ 1, 2, 3, 4 ]
  
flip(echo)(1, 2, 3, 4)
  //=> [ 4, 3, 2, 1 ]

Emboldened, I wrote a polyadic version of curry with these tools:

function curry (f) {
  if (f.length < 2) {
    return f;
  }
  // Each partial application carries its own copy of the arguments collected
  // so far, so a partially applied function can safely be called more than once.
  else return (function getmoreargs (collectedArgs) {
    return function (arg) {
      var args = collectedArgs.concat([arg]);
      if (args.length === f.length) {
        return f.apply(this, args);
      }
      else return getmoreargs(args);
    };
  })([]);
}

curry(echo)
  //=> [Function]
curry(echo)(1)
  //=> [Function]
curry(echo)(1)(2)
  //=> [Function]
curry(echo)(1)(2)(3)
  //=> [Function]
curry(echo)(1)(2)(3)(4)
  //=> [ 1, 2, 3, 4 ]

Great! Now for the “elegance” test:

curry(flip(echo))(1)(2)(3)(4)
  //=> TypeError: object is not a function

The problem is that curry inspects its function to work out how many arguments are expected. But flip breaks the implied contract by returning a function that doesn’t declare any arguments:

echo.length
  //=> 4
  
flip(echo).length
  //=> 0

Bzzt! Naturally, I wrote a special wrapper to “do the right thing.” Leaving out some performance caching, it looks like this:

function arity (numberOfArgs, fun) {
  if (fun.length === numberOfArgs) return fun;
  var parameters = new Array(numberOfArgs);
  for (var i = 0; i < numberOfArgs; ++i) {
    parameters[i] = "__" + i;
  }
  var pstr = parameters.join();
  var code = "return function ("+pstr+") { return fun.apply(this, arguments); };";
  return (new Function(['fun'], code))(fun);
};

This grisly bit of code actually parses some code at runtime to “wrap” any function in the correct number of arguments. Here it is in action:

function flip (f) {
  return arity(f.length, function () {
    return f.apply(this, toArray(arguments).reverse())
  });
}

And now:

curry(flip(echo))(1)(2)(3)(4)
  //=> [4, 3, 2, 1]

Which is nice, and might be less inelegant than not being able to compose curry with flip.

I could also have used some other special case mechanism like storing the original length of a function in a special property when reversing a function, but it felt less inelegant to cut with JavaScript’s grain and respect the way it handles function arity “natively.”
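For what it is worth, later versions of the language offer a third route. Since ES2015 a function's length property is configurable, so a wrapper's declared arity can be redefined directly instead of generating source code at runtime. The sketch below assumes an ES2015 or later engine; it is an alternative to the arity function above, not the version used in this essay.

```javascript
// Assumes ES2015+, where the length property of a function is configurable.
function arity (numberOfArgs, fun) {
  if (fun.length === numberOfArgs) return fun;
  var wrapper = function () {
    return fun.apply(this, arguments);
  };
  // Redefine the declared arity on the wrapper rather than parsing generated code.
  return Object.defineProperty(wrapper, 'length', { value: numberOfArgs });
}

function toArray (args) {
  return [].slice.call(args, 0);
}

function flip (f) {
  return arity(f.length, function () {
    return f.apply(this, toArray(arguments).reverse());
  });
}

function echo (a, b, c, d) {
  return [a, b, c, d];
}

flip(echo).length;
  //=> 4
flip(echo)(1, 2, 3, 4);
  //=> [ 4, 3, 2, 1 ]
```

It still counts as special case code, since flip has to remember to call arity, but it no longer needs the Function constructor.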

This inelegance problem crops up even more when working with large frameworks: Features interact in ways that expose edge cases. It’s vital to make code as composeable as possible and thus as elegant as possible, but sooner or later you may find yourself holding your nose and trying to find the minimum level of inelegance.

And it may well be that the least inelegant thing to do is simply declare that certain elements don’t compose at all. Maybe that’s what should have happened here. Maybe currying flipped functions should simply be banned.

What do you think?

https://raganwald.com/2013/05/09/inelegance
config ||= config

You’re interviewing for a Ruby job, and the interviewer poses the following question:

You’re reviewing some code and find the line config ||= config. What do you infer?

Well, what DO you infer?

  1. The author of that code is insane.
  2. The author is sane, was trying to __________
  3. The author is sane, but didn’t know about Ruby feature __________
  4. The company is insane for posing questions like this in an interview.

(Question courtesy of Peter Cooper)

https://raganwald.com/2013/05/02/quiz
Did you ever take that test yourself?

Chris Sturgill wrote a perfectly sane blog post called Tests Are Overhyped. His thesis is that some people take testing to extremes where it is no longer beneficial, and slavishly follow test-centric practices for “cargo cult” reasons, without any introspection into what they are trying to accomplish.

Perfectly cogent and reasonable, and I agree with this thesis. Unfortunately, I posted an ungracious criticism of the article. I took Chris to task for taking testing down without suggesting anything concrete. Do you see the irony? I criticized his blog post without suggesting a concrete improvement on it.

Well, time to apologize and make up for my foolishness. Chris, I was wrong, I apologize. And here is a taste of my own medicine, some advice about testing.

a good testing practice

Here is a testing rule of thumb: If some piece of functionality in your code ought to be documented, write tests for that piece first.

Now you may not get around to writing all the developer documentation that you’d like. I appreciate that. Or maybe you will be documenting it. This heuristic still applies. Here’s my thinking:

First, some code is so wonderfully written or obvious that it doesn’t really need comments.

Great! Testing the “obvious” code is lower priority. It’s easier to read, probably has fewer bugs, and the next developer working with it ought to be able to bang out tests for it if you don’t get around to finishing those tests yourself.

But some code could use a few guide posts along the way. Perhaps you needed to optimize for performance over obviousness. You drop in a comment explaining that a particular method is memoized for performance.

Now write a test verifying that the memoization works properly. This code is higher risk. There’s a chance someone won’t understand what it is doing. Or make assumptions about the method returning different things when called more than once. Or break it by accident.
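Here is a minimal sketch of what such a test might look like in JavaScript. The memoize helper, the squaring function, and the hand-rolled assertions are all invented for illustration; substitute whatever test framework and memoization mechanism you actually use.

```javascript
// Hypothetical memoizer for one-argument functions, keyed by the argument.
function memoize (fn) {
  var cache = {};
  return function (key) {
    if (!Object.prototype.hasOwnProperty.call(cache, key)) {
      cache[key] = fn.call(this, key);
    }
    return cache[key];
  };
}

// The test documents the contract: same answer, computed once per argument.
function testMemoization () {
  var calls = 0;
  var square = memoize(function (n) {
    calls += 1;            // count how often the real computation runs
    return n * n;
  });

  if (square(4) !== 16) throw new Error('wrong result on first call');
  if (square(4) !== 16) throw new Error('wrong result on second call');
  if (calls !== 1) throw new Error('expected one computation, got ' + calls);

  square(5);
  if (calls !== 2) throw new Error('a new argument should be computed');

  return 'ok';
}

testMemoization();
  //=> 'ok'
```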

The test helps you document the memoization, albeit in a place removed from the code itself.[1] The test will also catch those breakages down the road. Any code can break, but some code is more likely to break than others. My anecdotal experience[2] is that code that needs a little help explaining is more likely to break and/or harder to fix for the person who didn’t write it.

Of course, there are various kinds of comments. A test explains what the code is expected to do. If you are artful, the test can give heavy hints as to how it does it. I don’t know how to write a test that explains why the code was written this way. So tests aren’t really a substitute for documentation. And I don’t try to make them do that job.

I just think of the need for documentation as a “risk smell,” and a hint that tests for that bit of code are going to be more valuable than tests for code that is “obvious.”

exeunt

And that’s it: If a piece of code “ought” to have some comment or developer documentation, tests for that code are higher priority than tests for self-explanatory code. Write them first.

I can’t tell you that you should always follow this pattern, but what I do say is that it is a kind of “default” for me. Meaning, if I don’t have a good reason to do otherwise, I try to follow this rule.



notes:

  1. If you can get a code browser that will show you the tests for a method next to the code for a method, start using it. 

  2. “The plural of anecdote is not data.” The singular, even less so. 

https://raganwald.com/2013/04/16/did-you-ever-take-that-test-yourself
When FP? And when OOP?

Very roughly speaking, functional programming (“FP”) and object-oriented programming (“OOP”) have similar levels of expressive power and similar abilities to encapsulate programs into smaller parts that can be combined and recombined.

The biggest difference between the two “schools of thought” concerns the relationship between data and operations on the data.

The central tenet of OOP is that data and the operations upon it are tightly coupled: An object owns its data and it owns the implementation of the operations on the data. It hides those from other objects via its interface, a collection of methods or messages it responds to. Thus, the central model for abstraction is the data itself, hidden as it is behind a small API in the form of its interface.

The central activity in OOP is composing new objects and extending existing objects by adding new methods to them.

The central tenet of FP is that data is only loosely coupled to functions. You can write different operations on the same data structure, and the central model for abstraction is the function, not the data structure. Functions hide their implementation, and the language’s abstractions speak to functions and the way they are combined or expressed, such as generic functions or combinators.

The central activity in FP is writing new functions.

In a fight between a bear and an alligator, the terrain determines the outcome.

So when is one more appropriate than the other? As this is a practical blog, I will hand-wave all theoretical considerations such as the ability to mechanically reason about code and think of the grubbiest of real-world pragmatic considerations, writing business code in an environment where there is too much to do with inadequate resources in not enough time.

Is one of these two models the overwhelming “winner” in the business environment? Think carefully, take your time. I’ll go and press myself an Espresso while you think it over…


Of course, the answer is, business programming is dominated by the functional model, not the OO model. Does this seem like a surprising answer? Only if you are thinking solely of Java, C++, C#, and Ruby.

When you think of it, all that “OO” is usually a thin skin over access to various databases that support SQL, a very functional language. While it is possible to manage a database such that all access to its tables is done through PL/SQL stored procedures, this usually creates a severe programming bottleneck for very little real gain.

The main benefit of a relational database is that it can handle future requirements. When you need new reports, you just write them. Many different applications can talk to the same database. Constraints can be applied programmatically to enforce consistency across all applications.

If you step back, you see that a database is a big data structure, and the applications are bundles of operations acting upon it. The heart of nearly every business application is a big functional database, a data structure with operations acting upon it.


And yet, we embrace objects in our applications. Is this just fashion? Or is there something fundamentally different about what we need to do when writing applications and what we need to do when writing databases?

The answer lies in what OO makes easy and what databases make easy.

A well-crafted OO architecture makes changing the way things are put together easy. All that hiding and decoupling allows you to change the relationships between things easily. OO does not make adding new operations particularly easy, you see this whenever you find yourself muttering about “double dispatch and visitors.”

But if you have a business process for placing an order that is being refactored to handle new business rules, this is where OO shines. Those things that don’t need to know about the change are insulated from those that do.

On the other hand, a well-crafted database makes adding new queries and operations easy. It handles the case where you need to look at the data in new ways or add new kinds of updates to the data. Client applications are decoupled from issues like indexing for performance.

It doesn’t make changing the relationships easy. If you change the management structure such that you go from one manager per report to a many-to-many matrix management structure, that change will break a lot of applications.

So if we had all the things that need to be in business software written on cards, those that represent long-term, relatively changeless relationships go in the database, while those that represent short-term operations that evolve and change over time go in the application.

The stack of application entities is usually four times higher than the stack of database entities, but that’s as it should be. Things do change, businesses are supposed to learn, grow and evolve.


So what about FP as we usually mean it, code written in a functional style within multi-paradigm languages? What about simply organizing OO programs as collections of operations acting on relatively static data structures?

This is always appropriate, although once again priority must be given to considering the relative longevity of the relationships. Those that are unlikely to change but are subject to being operated upon by a changing cast of entities should be written in a more functional style, while those things that change relatively often can be written in an OO style.

If every manager has one or more reports, and every report has exactly one manager, there is little to be gained by hiding this relationship behind an API where manager objects delegate operations invisibly. This relationship could be better constructed as data with operations acting on it.

But a rule about shipping costs is likely to change, and it should be encapsulated as much as possible to insulate the rest of the program against future alterations.
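A small sketch of the two styles side by side, in JavaScript. The names, people, and numbers are all made up for illustration: the reporting relationship is kept as plain data with functions acting on it, while the shipping rule sits behind a tiny interface so that its internals can change freely.

```javascript
// Hypothetical data: a stable relationship, every report has exactly one manager.
var managerOf = {
  alice: 'carol',
  bob: 'carol',
  carol: 'dana'
};

// New operations on that data are simply new functions.
function reportsOf (manager) {
  return Object.keys(managerOf).filter(function (person) {
    return managerOf[person] === manager;
  });
}

function chainOfCommand (person) {
  var chain = [];
  while (managerOf[person]) {
    person = managerOf[person];
    chain.push(person);
  }
  return chain;
}

// A volatile rule hidden behind a small interface: callers only ask for a cost.
function ShippingPolicy (flatRate, freeOver) {
  this.costFor = function (orderTotal) {
    return orderTotal >= freeOver ? 0 : flatRate;
  };
}

reportsOf('carol');
  //=> [ 'alice', 'bob' ]
chainOfCommand('alice');
  //=> [ 'carol', 'dana' ]
new ShippingPolicy(5, 50).costFor(20);
  //=> 5
```

If the shipping rule later depends on weight or destination, only ShippingPolicy changes; if someone wants a new report about the hierarchy, they write another function next to reportsOf.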

Good software is written in both styles, because good software has more than one need to satisfy.

https://raganwald.com/2013/04/08/functional-vs-OOP