sunfishcode's blog
A blog by sunfishcode


Embrace the Kinda


So yeah, what...

What is Wasm?

Decision making

Well, I guess, one can find various one-sentence descriptions out there. The webassembly.org website leads with:

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications.

Wikipedia's WebAssembly article leads with:

WebAssembly (sometimes abbreviated Wasm) defines a portable binary-code format and a corresponding text format for executable programs as well as software interfaces for facilitating interactions between such programs and their host environment.

The WebAssembly reference manual I wrote a few years ago leads with:

WebAssembly, or “Wasm”, is a general-purpose virtual ISA designed to be a compilation target for a wide variety of programming languages. Much of its distinct personality derives from its security, code compression, and decoding optimization features.

And these are all, like, fine. But at the same time, these have all been out there for years, and here we are today, and people are still trying to figure out what this whole thing, just, like, is.

Hang around Wasm spaces, and you may hear someone drop the quip:

WebAssembly is neither Web nor Assembly!

It's a scintillating one. It pops. It packs that irresistible combination of classic meme ancestry, hot new subject, apparent absurdity, and a real kernel of truth all in one scrumptiously crunchy package.

Artist’s impression of a gamma-ray burst

“Wasm is not Web”, the quote goes, “because it's carefully factored so that nothing in the core spec has any dependency on browsers. And we're using it in servers and embedded devices and stuff!” And it's right!

At the same time, Wasm does come from a Web context, and many of the forces shaping it have relationships with the Web. Wasm's concept of program isolation, with a trusted callstack enabling it to call out into code that it doesn't trust and that doesn't trust it, and its system of imports and exports, come from the need to embed Wasm within very complex browser environments. Wasm has a standards body which is a W3C Community Group. And in many ways, “the Web” itself has also grown beyond its original scope, into an interconnected ecosystem of scopes, in which Wasm participates in multiple ways. Is ActivityPub Web? Well... yeah. Is Web of Things Web? Still yeah. And of course, Wasm also does run in actual browsers! So if we want to fully understand why Wasm works like it does, both on a technological and social level, we can't ignore the Web side of the story either.

So is Wasm “Web”?

Perhaps the shortest way to say it would be... well...

Kinda.

“...or Assembly”, the quote goes on, “because it has a lot of structure and a type system”. And this is also true; assembly languages as we know them today all basically look alike if you squint a little. Wasm looks distinctly different.

But here too, Wasm's instruction set is also very low-level, retaining many of the design characteristics of a typical assembly language, like having a single i32 type that's neither signed nor unsigned, leveraging the magic of two's complement, just like most machine ISAs do. And in some of the major compilers that have been ported to target it, Wasm is emitted by the part of the compiler that's built to emit assembly code. It's the “architecture”, in compiler parlance. And this is one of the major forces behind Wasm's overall design, so if we ignore it, we aren't seeing the whole picture.
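To make the "neither signed nor unsigned" bit concrete, here's a small Rust sketch. It models an i32 as a bag of 32 bits (stored in a `u32` purely as a container; the function names are made up for illustration and aren't any engine's API). Addition needs only one instruction because two's complement makes the signed and unsigned results bit-identical, while division needs two flavors, the way Wasm has `i32.div_s` and `i32.div_u`.

```rust
/// Models Wasm's `i32.add`: a single instruction with no signedness.
/// Wrapping addition yields the same bits whether the operands are
/// read as signed or unsigned.
pub fn i32_add(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

/// Models `i32.div_s`: the same bits, interpreted as signed.
/// Returns `None` where Wasm would trap (division by zero, or
/// `i32::MIN / -1`, which overflows).
pub fn i32_div_s(a: u32, b: u32) -> Option<u32> {
    (a as i32).checked_div(b as i32).map(|q| q as u32)
}

/// Models `i32.div_u`: the same bits, interpreted as unsigned.
/// Returns `None` where Wasm would trap (division by zero).
pub fn i32_div_u(a: u32, b: u32) -> Option<u32> {
    a.checked_div(b)
}
```

So the bits of `-6` divided by `2` come out as the bits of `-3` under `i32_div_s`, and as a big positive number under `i32_div_u`. The signedness lives in the instruction, not in the type.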

So is Wasm “Assembly”?

Well...

Thinking woman

Kinda.

Kinda???

Kinda! Our existing mental categories are important, because they're how we approach a new system, and they're how we port existing code to new systems. So we do talk about Wasm in terms of other systems, and we do build bridges to enable concepts and code to be ported over.

At the same time, attempting to understand Wasm as categorically one existing category or another will limit our potential to understand it and to take advantage of its strengths.

Cool story

Is a Wasm instance a process? Kinda. It can have an address space, like a process does. And it has some forms of isolation from other instances, somewhat like processes do from other processes. But it can also import its entire address space from another instance, or not have an address space at all! And it doesn't carry around all the state that Unix attaches to its processes. And the address space doesn't contain things like the call stack or the program's executable code.

So we do sometimes use the word “process”, in contexts where that word fits. We value compatibility with existing software, and existing software often expects to have a process, so we do things to present the illusion that an instance is such a process.

Sometimes we use the word “nanoprocess”, which highlights how Wasm instances perform isolation without relying on heavyweight host process boundaries, enabling them to scale to very many instances running on the same machine at the same time, like little nanogears.

Fullerene Nanogears - GPN-2000-001535

But instances also have some superpowers that processes don't have. Instances can call each other. On the same callstack. This is difficult to even contemplate if we limit ourselves to a Unix perspective, and it creates interesting new opportunities for what Unix would call “Inter-Process Communication”, but with very little ceremony. It's just a call. Compatibility is important. Taking full advantage of powerful new tools is also part of the big picture here.

Sciurus carolinensis performing superhero landing

Look I think I've got a pretty good picture

Is WASI an OS? Kinda. It's a framework for defining APIs, which could represent an OS, and plays the role of an OS from the perspective of compatibility with a lot of existing software.

But it doesn't have to be an OS. WASI APIs could be implemented as an adapter library on top of other APIs, WASI or otherwise, allowing a program written for one environment to run in another. So it's kinda?
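Here's a rough Rust sketch of what that adapter shape looks like. To be clear, these traits are made up for illustration and are not real WASI interfaces: `Stdout` stands in for an API a program was written against, and `LineLogger` stands in for what some other environment actually offers. The adapter implements the first in terms of the second.

```rust
/// Hypothetical API that an existing program expects (not a real WASI interface).
pub trait Stdout {
    /// Writes bytes, returning how many were accepted.
    fn write(&mut self, bytes: &[u8]) -> usize;
}

/// Hypothetical API that the target environment provides instead.
pub trait LineLogger {
    fn log_line(&mut self, line: &str);
}

/// The adapter: provides `Stdout` on top of any `LineLogger` by
/// buffering bytes until a newline arrives.
pub struct StdoutOnLogger<L: LineLogger> {
    pub logger: L,
    pub pending: Vec<u8>,
}

impl<L: LineLogger> StdoutOnLogger<L> {
    pub fn new(logger: L) -> Self {
        Self { logger, pending: Vec::new() }
    }
}

impl<L: LineLogger> Stdout for StdoutOnLogger<L> {
    fn write(&mut self, bytes: &[u8]) -> usize {
        for &b in bytes {
            if b == b'\n' {
                // A full line is ready: forward it to the underlying API.
                let line = String::from_utf8_lossy(&self.pending).into_owned();
                self.logger.log_line(&line);
                self.pending.clear();
            } else {
                self.pending.push(b);
            }
        }
        bytes.len()
    }
}

/// A toy logger that just collects lines, for trying the adapter out.
pub struct VecLogger(pub Vec<String>);

impl LineLogger for VecLogger {
    fn log_line(&mut self, line: &str) {
        self.0.push(line.to_string());
    }
}
```

The program calling `write` never finds out there's no "real" stdout underneath. From where it sits, the adapter kinda is the OS.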

Is Wasm a container? Kinda. It can perform isolation, protecting the inside from the outside and vice versa, and it can be a single file that contains everything needed to run a program. And as the tools mature I expect it'll be adding a lot of the features found in container systems. But at the same time, it's also kinda different. As in our discussion of processes above, Wasm programs can straight-up call functions in other programs.

Very interesting now let's...

Is Wasm a stack machine? Kinda. It has a stack-machine instruction encoding, where most operands don't need to be explicitly named, as they can be implicitly “pushed” and “popped”, which compresses the binary encoding. But it's a heavily constrained stack machine. Wasm's validation will check that at every point in the code, the exact size of the stack, and the types of everything on it, are known at compile time.
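As a sketch of what "the types of everything on the stack are known" means, here's a toy checker in Rust for a tiny made-up subset of straight-line instructions. It's an illustration of the idea only, not the real validation algorithm, which also has to handle control flow, locals, and a lot more. The names `ValType`, `Instr`, and `validate` are my own for this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ValType {
    I32,
    I64,
}

#[derive(Clone, Copy, Debug)]
pub enum Instr {
    I32Const(i32),
    I64Const(i64),
    I32Add,
    I64Add,
    Drop,
}

/// Walks the code once, tracking only the *types* on the stack.
/// Returns the stack's types after the last instruction, or an error
/// if any instruction would find the wrong thing (or nothing) there.
pub fn validate(code: &[Instr]) -> Result<Vec<ValType>, String> {
    let mut stack: Vec<ValType> = Vec::new();
    for (pc, instr) in code.iter().enumerate() {
        match instr {
            Instr::I32Const(_) => stack.push(ValType::I32),
            Instr::I64Const(_) => stack.push(ValType::I64),
            Instr::I32Add => binop(&mut stack, ValType::I32, pc)?,
            Instr::I64Add => binop(&mut stack, ValType::I64, pc)?,
            Instr::Drop => {
                stack
                    .pop()
                    .ok_or_else(|| format!("stack underflow at {pc}"))?;
            }
        }
    }
    Ok(stack)
}

/// Pops two operands of type `ty` and pushes one result of type `ty`.
fn binop(stack: &mut Vec<ValType>, ty: ValType, pc: usize) -> Result<(), String> {
    for _ in 0..2 {
        match stack.pop() {
            Some(t) if t == ty => {}
            Some(t) => {
                return Err(format!("type mismatch at {pc}: expected {ty:?}, found {t:?}"))
            }
            None => return Err(format!("stack underflow at {pc}")),
        }
    }
    stack.push(ty);
    Ok(())
}
```

Note that no values are ever computed here. The checker knows the stack's exact height and types at every instruction without running anything, which is the constraint that a free-form stack machine wouldn't give you.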

This enables optimizing Wasm engines to translate the stack machine into a register-machine IR, with virtual registers that then get register-allocated. And all the Wasm producers I've seen internally work in terms of either registers or an abstract syntax tree, and just translate into pushes and pops at the last moment. So almost nothing in the overall ecosystem actually operates at the pushes-and-pops abstraction level, except for the binary encoding and a few tools that operate on it. So the stack-machine personality is only a surface appearance.
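Here's a minimal Rust sketch of that stack-to-register translation, for a toy three-instruction subset. It's not how any particular engine does it, and the `StackOp`, `RegOp`, and `lower` names are invented for this example. The trick is that since the stack height is always known, each stack slot can just be named by the virtual register that produced it.

```rust
#[derive(Clone, Copy, Debug)]
pub enum StackOp {
    Const(i32),
    Add,
    Mul,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RegOp {
    Const { dst: u32, value: i32 },
    Add { dst: u32, lhs: u32, rhs: u32 },
    Mul { dst: u32, lhs: u32, rhs: u32 },
}

/// Translates stack code into register code with one fresh virtual
/// register per produced value. Returns `None` on stack underflow.
pub fn lower(code: &[StackOp]) -> Option<Vec<RegOp>> {
    let mut next = 0u32; // next unused virtual register number
    let mut stack: Vec<u32> = Vec::new(); // which register holds each stack slot
    let mut out = Vec::new();
    for op in code {
        match *op {
            StackOp::Const(value) => {
                let dst = next;
                next += 1;
                out.push(RegOp::Const { dst, value });
                stack.push(dst);
            }
            StackOp::Add | StackOp::Mul => {
                // The right-hand operand was pushed last, so it pops first.
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                let dst = next;
                next += 1;
                out.push(match *op {
                    StackOp::Add => RegOp::Add { dst, lhs, rhs },
                    _ => RegOp::Mul { dst, lhs, rhs },
                });
                stack.push(dst);
            }
        }
    }
    Some(out)
}
```

Feed it the stack code for `(2 + 3) * 4` and the pushes and pops vanish entirely, leaving plain three-address code that a register allocator can take from there.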

Is Wasm a Reduced Instruction Set Computer (RISC)? Kinda. It does have explicit load and store instructions, making it a load-store architecture, and most of its arithmetic operations are very simple RISC-like instructions. But it also does have fairly complex instructions like call and memory.grow.

Is a Wasm component an executable, or a library? Kinda... both? In the component model, those aren't fundamentally different things. There are components that can be used in the manner of commands, but these same components can also be linked to like a library.

... can we wrap this up?

Is Wasm a bytecode? Kinda. Or, well, yes. Yes it is. But the kinda here is because if we reduce it down to just “a bytecode”, it's easy to miss what the big deal is. We already have multiple popular bytecode-based systems out there. What makes Wasm different?

Wasm has a different approach to isolation, which makes it easier to embed within existing systems and compose with itself without compromising its security model. Wasm doesn't have a single favorite source language, and has aimed to be language-neutral from the beginning. And, Wasm isn't strongly associated with a single large corporation. It's defined by a standards body, and in practice it has active participation and a meaningful consensus process involving a wide variety of organizations.

Is Wasm a platform? Kinda? It's kinda more like a substrate on which platforms can be defined, along with a shared ecosystem that can be used on such platforms. WASI has a concept of worlds which are each like their own platforms, defining all the APIs available to programs within them. And embedders can define their own worlds, making their own platforms. So in a sense, we're building a meta-platform, plus a number of platforms inside.

But at the same time, with virtual platform layering, programs written for one world can be adapted to run on another. And many ecosystem tools and libraries will be able to be shared. So there is also a sense in which all worlds are part of a single overarching meta-“platform”.

Maybe stop now?

Is Wasm a programming language? Kinda. It does have a type system, and a syntax. And function calls with arguments and return values instead of a “put things in certain registers and memories” calling convention. Tail calls have to be built in rather than being just compiler cleverness. It even has an if/else construct and eschews goto. But it's also too low-level to be really written by hand on a regular basis.

Ok, ok. Here. Is Wasm a Lisp?

Well... ok, fine. You got me there. One could try to make the case for Wasm being a Lisp with the S-expression text syntax, and like, being Turing-complete? And it's gaining tail calls and GC, and it is getting increasingly interesting to compile Lisp-family languages to. But ultimately, it's not recognizably a Lisp. There are no builtin concepts of cons lists, eval, macros, and so on.

But here we are. A sea of kindas. And we can, and do, simplify them into the familiar concepts and categories that people know and that existing programming languages and existing code often assume. Giving people onramps and compiling existing code and running existing applications are all critically important for Wasm to succeed.

At the same time,

Let's have fun, and embrace the kinda!