Eigenstate : Web Assembly

Why Web Assembly is Not My Favorite Compilation Target

I'd like to point out that I haven't been involved in the web assembly effort, and I'm not maintaining any large or widely used compilers (just my own toy-ish language, minor contributions to the QBE compiler backend, and an internship on IBM's compiler team), but I ended up getting a bit ranty.

So here goes:

My Complaints:

When I'm writing a compiler, the first thing I do with the syntax tree structure -- loops, if statements, and so on -- is validate it semantically: type checking and so on. The second thing I do is throw it out and flatten it to basic blocks, and possibly to SSA form. In some other parts of the compiler world, a popular format is continuation passing style. I'm not an expert on compiling with continuation passing style, but it doesn't seem to be a good fit for the loops and scoped blocks that web assembly has embraced either.
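As a rough illustration of that flattening step, here is a minimal sketch that lowers a tiny syntax tree into basic blocks. The AST shape (`if`, `seq`, `expr`, `return` nodes), the block layout, and the name `flatten` are all made up for this example; they are not taken from QBE or any other real compiler, and conditions are kept as opaque strings.

```javascript
// Hypothetical AST node kinds: "if", "seq", "expr", "return".
// Each basic block is { label, insns, term }, where term is the final branch.
function flatten(ast) {
    const blocks = [];
    function newBlock() {
        const b = { label: "L" + blocks.length, insns: [], term: null };
        blocks.push(b);
        return b;
    }
    // Lowers `node` starting in block `cur`. Returns the block where control
    // continues afterwards, or null if control never falls through.
    function lower(node, cur) {
        switch (node.kind) {
        case "return":
            cur.term = { op: "ret", value: node.value };
            return null;
        case "expr":
            cur.insns.push(node.text);
            return cur;
        case "seq":
            for (const child of node.body) {
                if (cur === null) break; // code after a return is unreachable
                cur = lower(child, cur);
            }
            return cur;
        case "if": {
            const thenB = newBlock();
            const elseB = newBlock();
            cur.term = { op: "jmp_if", cond: node.cond,
                         iftrue: thenB.label, iffalse: elseB.label };
            const thenEnd = lower(node.then, thenB);
            const elseEnd = lower(node.else, elseB);
            if (thenEnd === null && elseEnd === null) return null;
            // At least one arm falls through, so both meet in a join block.
            const join = newBlock();
            if (thenEnd) thenEnd.term = { op: "jmp", target: join.label };
            if (elseEnd) elseEnd.term = { op: "jmp", target: join.label };
            return join;
        }
        default:
            throw new Error("unknown node kind: " + node.kind);
        }
    }
    const end = lower(ast, newBlock());
    if (end !== null) end.term = { op: "ret", value: null };
    return blocks;
}

// if (x == 42) return 123; else return 666;
const blocks = flatten({
    kind: "if", cond: "x == 42",
    then: { kind: "return", value: "123" },
    else: { kind: "return", value: "666" },
});
console.log(JSON.stringify(blocks, null, 2));
```

Once the tree is gone, all that is left is a list of blocks whose last instruction names its successors, which is the shape the rest of a compiler wants to work with.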

I'd like to argue that a flatter, goto based format would be far more useful as a target for compiler developers, and would not significantly hinder the writing of a usable polyfill.

Personally, I'm also not a big fan of nested complex expressions. They're a bit clunkier to consume, especially if inner nodes can have side effects, but I don't strongly object to them as a compiler implementer: the web assembly JIT can consume them, and I can ignore them and generate the instructions that map to my IR. They don't make me want to flip tables.

The bigger problem comes down to loops, blocks, and other syntactic elements. As an optimizing compiler writer, you try very hard to represent control flow as a graph of basic blocks, with branches as the edges; the explicit control flow constructs are a hindrance. Reconstructing them from the graph once you've actually done the optimizations you want is certainly possible, but it's quite a bit of complexity to work around a more complex format. And that annoys me: both the producer and the consumer are working around entirely invented problems, which would be avoided by simply dropping complex control flow constructs from web assembly.
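To give a feel for why the graph form is convenient, here is a small sketch of one of the simplest graph passes: dropping blocks that no branch can reach. The block shape (`label` plus a `succs` list of branch targets) and the function names are assumptions made for this example only.

```javascript
// Hypothetical block shape: { label, succs: [labels this block can branch to] }.
function reachable(blocks, entry) {
    const byLabel = new Map(blocks.map(b => [b.label, b]));
    const seen = new Set();
    const work = [entry];
    // Plain worklist walk over the branch edges.
    while (work.length > 0) {
        const label = work.pop();
        if (seen.has(label)) continue;
        seen.add(label);
        for (const s of byLabel.get(label).succs) work.push(s);
    }
    return seen;
}

// Keeps only the blocks reachable from `entry`, preserving their order.
function pruneUnreachable(blocks, entry) {
    const live = reachable(blocks, entry);
    return blocks.filter(b => live.has(b.label));
}

const cfg = [
    { label: "L0", succs: ["L1", "L2"] },
    { label: "L1", succs: ["LRet"] },
    { label: "L2", succs: ["LRet"] },
    { label: "Ldead", succs: ["LRet"] }, // nothing branches here
    { label: "LRet", succs: [] },
];
const pruned = pruneUnreachable(cfg, "L0").map(b => b.label);
console.log(pruned.join(" "));
```

Nothing in this pass knows or cares whether a given edge came from a loop, an if, or a switch, and that is the point: nesting is extra structure that has to be rebuilt afterwards if the output format demands it.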

In addition, the insistence on higher level constructs leads to some pathological cases. For example, Duff's Device ends up with horrible web assembly output, as seen by messing around in The Wasm Explorer. However, the inverse is not true: everything that can be expressed in web assembly can be trivially converted to an equivalent in some unstructured, goto based format.
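The structured-to-goto direction can be sketched in a few lines. This assumes a simplified model of the structured constructs, not the real web assembly encoding: a `block` whose branch target is its end, a `loop` whose branch target is its start, and `br` / `br_if` naming an enclosing construct by relative depth (0 is the innermost). The node shapes, the label names, and `lowerStructured` are all hypothetical.

```javascript
// Input: nested nodes. Plain strings stand in for ordinary instructions.
// Output: a flat list of instructions, labels, and gotos.
function lowerStructured(body) {
    const out = [];
    let counter = 0;
    // `stack` holds the branch target of each enclosing construct, innermost
    // last, so depth N resolves to stack[stack.length - 1 - N].
    function walk(nodes, stack) {
        for (const n of nodes) {
            if (typeof n === "string") {
                out.push(n);
            } else if (n.op === "block") {
                const end = "Lend" + counter++;
                walk(n.body, stack.concat(end));
                out.push(end + ":"); // branching to a block exits it
            } else if (n.op === "loop") {
                const top = "Ltop" + counter++;
                out.push(top + ":"); // branching to a loop restarts it
                walk(n.body, stack.concat(top));
            } else if (n.op === "br" || n.op === "br_if") {
                const target = stack[stack.length - 1 - n.depth];
                if (target === undefined) throw new Error("bad branch depth");
                out.push(n.op === "br"
                    ? "goto " + target
                    : "if " + n.cond + " goto " + target);
            } else {
                throw new Error("unknown op: " + n.op);
            }
        }
    }
    walk(body, []);
    return out;
}

// A while-style loop: leave the outer block when done, otherwise repeat.
const flat = lowerStructured([
    { op: "block", body: [
        { op: "loop", body: [
            { op: "br_if", depth: 1, cond: "done" },
            "step",
            { op: "br", depth: 0 },
        ] },
    ] },
]);
console.log(flat.join("\n"));
```

One pass with a stack of labels is all it takes. Going the other way, from arbitrary gotos back to nested blocks and loops, is where the real work is.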

So, at the very least, I'd like to suggest that the web assembly team add support for arbitrary labels and gotos. If they choose to keep the higher level constructs, it would be a bit of wasteful complexity, but at least compiler writers like me would be able to ignore them and generate output directly.

Polyfilling:

One of the concerns I have heard when discussing this is that the loop and block based structure allows for easier polyfilling of web assembly. While this isn't entirely false, I think that a simple polyfill solution for labels and gotos is possible. While it might not be quite as optimal, I think that it's worth a little bit of ugliness in the bytecode in order to avoid starting a new tool with built-in technical debt.

If we assume an LLVM (or QBE) like syntax for web assembly, then some code that looks like:

int f(int x) {
    if (x == 42)
        return 123;
    else
        return 666;
}

might compile to:

 func @f(%x : i32) {
 L0:
    %1 = test %x 42
    jmp %1 L1 L2
 L1:
    %r =i 123
    jmp LRet
 L2:
    %r =i 666
    jmp LRet
 LRet:
    ret %r
 }

This could be polyfilled to Javascript that looks like:

function f(x) {
    // Each label becomes a small integer; any distinct values would do.
    var L0 = 0, L1 = 1, L2 = 2, LRet = 3;
    var __label = L0;
    var __ret;

    while (__label != LRet) {
        switch (__label) {
        case L0:
            var _v1 = (x == 42);
            if (_v1) {__label = L1;} else {__label = L2;}
            break;
        case L1:
            __ret = 123;
            __label = LRet;
            break;
        case L2:
            __ret = 666;
            __label = LRet;
            break;
        default:
            throw new Error("unreachable label");
        }
    }
    return __ret;
}
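A polyfill would not write this by hand, of course; the loop-and-switch wrapper can be generated mechanically from a list of blocks. Here is a rough sketch of such a generator. The input format (a label, some straight-line JavaScript statements, and an expression that picks the next label) and the name `emitPolyfill` are assumptions made for this example, and `LRet` is reserved as the exit label.

```javascript
// Hypothetical input: blocks = [{ label, body: [JS statements], next: JS expr }].
// The first block is the entry point; reaching "LRet" ends the dispatch loop.
function emitPolyfill(name, params, blocks) {
    const labels = blocks.map(b => b.label).concat("LRet");
    const lines = [];
    lines.push("function " + name + "(" + params.join(", ") + ") {");
    // Number the labels so the switch dispatches on small integers.
    lines.push("    var " + labels.map((l, i) => l + " = " + i).join(", ") + ";");
    lines.push("    var __label = " + blocks[0].label + ", __ret;");
    lines.push("    while (__label != LRet) {");
    lines.push("        switch (__label) {");
    for (const b of blocks) {
        lines.push("        case " + b.label + ":");
        for (const stmt of b.body) lines.push("            " + stmt);
        lines.push("            __label = " + b.next + ";");
        lines.push("            break;");
    }
    lines.push("        default:");
    lines.push("            throw new Error('bad label');");
    lines.push("        }");
    lines.push("    }");
    lines.push("    return __ret;");
    lines.push("}");
    return lines.join("\n");
}

const src = emitPolyfill("f", ["x"], [
    { label: "L0", body: [], next: "(x == 42) ? L1 : L2" },
    { label: "L1", body: ["__ret = 123;"], next: "LRet" },
    { label: "L2", body: ["__ret = 666;"], next: "LRet" },
]);
console.log(src);

// Evaluate the generated source to get a callable function.
const f = new Function(src + "\nreturn f;")();
console.log(f(42), f(7));
```

The generator never needs to look inside a block or understand how the blocks relate to each other, so the cost of supporting gotos in a polyfill is one loop and one switch per function.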

Is it ugly? Yeah. Does it matter? Hopefully, if web assembly takes off, not for long.

And if not:

Well, if I ever got around to targeting web assembly, I guess I'd generate code using the approach I mentioned in the polyfill, and do my best to ignore all of the high level constructs, hoping that the compilers would be smart enough to catch on to this pattern.

But it would be nice if we didn't need to have both sides of the code generation work around the format specified.