On maintenance of code with a large possibility space, stateful code, caching, the downfall of retained mode APIs, and freeing up programmer resources.
Static Single Assignment is a program representation where each “variable” (though this term can be misleading) is assigned exactly once. Mostly the variables aren’t variables at all, but instead names for values—for expressions. SSA is used a lot in compilers. See this lesson from Cornell for converting Bril to SSA, which also has some good background. I’m making this page to catalog the papers I have found interesting and leave a couple of comments on them.
adventures with LLVM IR
In the previous post, we explored the IR, the compiler’s working format where devirtualization, inlining, and escape analysis happen. The IR optimizes your code at a high level, making smart decisions about which functions to inline and where values should live: on the heap or on the stack. But the IR still looks a lot like your source code. It has variables that can be assigned multiple times, complex control flow with loops and conditionals, and operations that map closely to Go syntax.
A Faster Python
I’m making progress on the Java decompiler I’ve mentioned in a previous post, and I want to share the next couple of tricks I’m using to speed it up. Java bytecode is a stack-based language, and so data flow is a bit cursed, especially when the control flow is complicated. I need to analyze data flow globally for expression inlining and some other stuff. Static single assignment produces basically everything I need as a byproduct… but it’s not very fast. For one thing, it typically mutates the IR instead of returning data separately, and the resulting IR has imperative code mixed with functional code, which is a little unpleasant to work with. SSA has multiple implementations with very different performance characteristics and conditions, and each of them forces me to make a tradeoff I’m not positive about.
A pipelined query language that compiles to SQL
Understanding Wasm, Part 1: In which we disambiguate, define, and delve into the terms "Virtual Machine" and "Instruction Set Architecture".
As a maintainer of the GCC register allocator (RA), I naturally have a keen interest in the register allocators used in various industrial compilers. For some compilers, like LLVM and Cranelift, there is sufficient documentation, including papers and presentations, to gain a deep understanding of their RAs.
In my previous post “Chrome Browser Exploitation, Part 1: Introduction to V8 and JavaScript Internals”, we took our first deep dive into the world of browser exploitation by covering a few complex topics that were necessary for fundamental knowledge. We mainly covered topics on how JavaScript and V8 worked under the hood by exploring what objects, maps and shapes were, how these objects were structured in memory, and we also covered some basic memory optimizations such as pointer tagging and pointer compression. We also touched on the compiler pipeline, bytecode interpreter, and code optimizations.
Every year I do the Advent of Code programming competition in a different language and (eventually) write about it. Last year I used Elixir, a functional programming language with immutable data types based on the Erlang runtime. This post describes Elixir, and how it offers JS/TS developers a glimpse of what a future with the pipeline operator might look like. (It's more of a mixed bag than I would have expected.) We'll also look at Elixir's ongoing attempts to add static types and what lessons TypeScript can provide them.
Note: This post was originally published on the Bytecode Alliance blog.
Learning MLIR with too many dialects.
A short survey of compiler targets like assembly, LLVM, C, JVM and Brainfuck.
This series is an introduction to MLIR and an onboarding tutorial for the HEIR project. Last time we saw how to run and test a basic lowering. This time we will write some simple passes to illustrate the various parts of the MLIR API and the pass infrastructure. As mentioned previously, the main work in MLIR is defining passes that either optimize part of a program, lower from parts of one dialect to others, or perform various normalization and canonicalization operations.
A tiny programming language with a JIT interpreter, self-contained - andreabergia/emjay
⌚ Nice watch!
Read about how we exploit JIT compilation in ClickHouse
The post describes how the React Compiler uses SSA form for fine grained reactivity
Emjay is a tiny JIT compiler that turns math expressions into executable machine code. It's a complete end-to-end implementation, from parsing to SSA-based IR optimization to native aarch64 code generation, built from scratch in Rust.
The most common porting bug in the TypeScript Go port