Skip to content

Files

Latest commit

9fc9194 · Mar 12, 2017

History

History
612 lines (456 loc) · 37.3 KB

asmjs.org

File metadata and controls

612 lines (456 loc) · 37.3 KB

Overview

This is a port of various GNU utilities to work with or on an “architecture” based on the asm.js subset of JavaScript or the WebAssembly virtual machine as implemented in Web browsers; the compiler used is GCC, which means C and C++ programs (and, in theory, programs in other languages) can be compiled to JavaScript or WebAssembly code that can then be run in a web browser or JavaScript shell.

It’s quite slow, partly because of the design of the architecture.

It was originally based on Emscripten, but no Emscripten code remains at this point; instead, GCC and glibc are used.

Pipeline

The stages of building a JavaScript program are as follows:

GCC invocation

The GCC cross-compiler, asmjs-virtual-asmjs-gcc is invoked to produce ELF objects and ultimately to link those into an ELF executable.

GCC translates C/C++ code to “assembler” code

This uses the standard GCC frontend, including full optimization, and a custom GCC backend that simply translates GCC’s internal RTL expressions into asm.js code.

GNU as translates “assembler” code into an ELF object file

While, ultimately, JavaScript code is produced in a form that can be loaded by a web browser, we use the ELF format for intermediate files containing JavaScript code, binary data, and debugging information in the DWARF format.

GNU ld links the ELF object files into an ELF executable

During this stage, special “hexadecimal” relocations are used to patch hexadecimal addresses into the JavaScript source code stored in the ELF files.

bin/prepare invocation

Afterwards, prepare is run on the ELF executable to produce a JavaScript file.

GNU objcopy extracts the raw data file

As JavaScript has no easy way of including binary data in a JavaScript source code file, all data is encoded in hexadecimal format for inclusion in the JavaScript code.

“hexify” turns the data file into a hex-encoded JavaScript array

This simply replaces each byte by its hexadecimal expansion.

GNU objcopy adds this array to the ELF executable as a new section

GNU ld creates a temporary ELF object with JavaScript support code and the ELF executable

GNU ld turns this temporary ELF object file into JavaScript

bin/jsexport invocation

WIP. The jsexport program is run on the ELF executable to add code and data for exporting C (C++-representable, actually) code and data to JavaScript code behind nice wrappers.

Repositories

main repository

The main (meta) repository is at https://github.com/pipcet/asmjs. It includes sub-repositories for binutils/GDB, GCC, glibc, and some example applications.

The Makefile rules in this repository require manual intervention to rebuild things after changes; this is because otherwise, a tiny change in binutils or gcc would force a complete rebuild and slow down development too much.

Design

“Assembler” language

Assembler input consists of the following kinds of line:

Labels

Labels appear on lines of their own; these lines do not begin with whitespace. They end with a colon.

Pseudo-op lines

Lines beginning with whitespace followed by a period are treated as assembler pseudo-ops, or macro invocations (macros are thus restricted to have names that begin with a period).

JavaScript lines

Lines beginning with whitespace followed by a non-period are treated as JavaScript code. They are written verbatim to the object file as though the .ascii pseudo-op had been used. If the last character of a JavaScript line is a dollar sign $, no line terminator is added, not even a space character, so the line appears “pasted together” with the following line, which may be a pseudo-op or macro invocation; otherwise, a newline character is added.

FIXME: indentation of macros

asm.js code

An entire program is compiled into one asm.js module (see the asm.js spec), which consists of one asm.js function for each program function, in addition to a few functions of a more special nature.

overall module design

The asm.js module has roughly the following general form:

function (stdlib, foreign, heap) {
    "use asm";

    var HEAP8 = new stdlib.Int8Array(heap);
    var HEAP16 = new stdlib.Int16Array(heap);
    var HEAP32 = new stdlib.Int32Array(heap);
    var HEAPU8 = new stdlib.Uint8Array(heap);
    var HEAPU16 = new stdlib.Uint16Array(heap);
    var HEAPU32 = new stdlib.Uint32Array(heap);
    var HEAPF32 = new stdlib.Float32Array(heap);
    var HEAPF64 = new stdlib.Float64Array(heap);

    <further local definitions>

    var foreign_extcall = foreign.extcall;

    var rv = 0;
    var a0 = 0;
    var a1 = 0;
    var a2 = 0;
    var a3 = 0;

    <asm.js functions>

    return {
        get_arg: get_arg,
        <...>

        f_0x40001000: f_0x40001000,
        <...>
    };
}

asm.js functions

Each asm.js function has roughly the following general form:

function f_0xXXX(pc, sp, r0, r1, rpc, rfp)
{
mainloop:
    while (1) {
        switch (pc|0) {
            case 0xXX:
                <function prologue>
            case 0xXX + 1:
                <first basic block>
            default:
                if (pc|0) abort();
                <restore registers>
            }
        }
    }
}

This format actually changed to

function f_0xXXX(dpc, sp1, r0, r1, rpc, pc0)
{
mainloop:
    while (1) {
        switch (dpc|0) {
            case 0:
                <function prologue>
            case 1:
                <first basic block>
            default:
                if (pc|0) abort();
                <restore registers>
            }
        }
    }
}

parameters

All six arguments are 32-bit integers.

PC = pc0 + dpc

The program counter is the sum of the pc0 parameter, representing the first address in a function, and the dpc parameter, a delta value. This is so that code is nominally position-independent, which might be useful for simulating dynamic loading/linking, and so the switch statement can look up dpc in a table directly rather than having to subtract a value first.

SP = sp1 - 16

The stack pointer is at a fixed offset from the sp1 parameter, which is the stack pointer of the surrounding function prior to the call (and, thus, the canonical frame address).

integer arguments

The first two integer arguments to be passed to the function are stored in the r0 and r1 parameters.

Return PC

The PC to which execution will return is passed in the rpc parameter. There’s a good case to be made that we shouldn’t do this and save a parameter instead, but that requires changes to GCC; in any case, it makes debugging easier to have a fast way of accessing the return PC.

All six arguments are 32-bit integers. There are two ways of calling an asm.js function:

ordinary calls

In an ordinary call, dpc is set to zero, pc0 is set to the program counter value assigned to the function, sp is set to the stack pointer at the beginning of the function, r0 and r1 are set to the first two integer arguments to the function, and rpc is set to the return address for the function.

continuation calls

In a continuation call, dpc is set to the negative of pc0, which is set to the program counter value assigned to the function, sp is set to the current function’s frame pointer, r0 and r1 are ignored, and rpc is set as above. In such a call, the function will jump to the default label in the master switch statement and restore all registers from the register save area, a block of memory pointed to by the sp argument (which actually becomes the fp register; sp is restored to a different value by this code). The function then continues executing at the restored dpc value, which is usually different from 0.

Similarly, there are two ways of leaving an asm.js function. Both ways correspond to a return statement in the asm.js code; leaving functions through JavaScript exceptions is not supported.

ordinary return

In an ordinary return, the per-thread variable rv is set to the return value and the value that was passed as sp in the ordinary call that started this function is returned using a JavaScript return statement. Since sp is always aligned to a 32-bit boundary, its lower-order two bits are 0: src_javascript{sp & 3 == 0}.

special return

In a special return, a value is returned whose lower-order two bits are not 0. In fact, those two bits are a tag specifying what should happen with the rest of the return value, which is turned into a pointer by ignoring the lower-order two bits.

If the tag value is 1, one of two things happened: either the function is blocked waiting for an asynchronous event to wake up the thread again and resume execution, or the function needs to access the VM stack; for example, a __builtin_frame_address expression might be evaluated.

In both of these cases, the asm.js function(s) further up the stack save their registers to their respective register save areas on the VM stack and return the same value that was returned to them; the ultimate return value of the asm.js invocation is thus the frame pointer of the innermost asm.js function to be executed plus the constant 1.

Once control returns to JavaScript code, the two cases once again become different: if waiting for an asynchronous event, the JavaScript code returns so the JavaScript VM can execute other code, which will at some point wake up the asm.js thread and continue execution. If the function merely needs to access the VM stack, control returns to the asm.js code immediately.

If the tag value is 2, the function executed the equivalent of a longjmp: control is to resume at a frame pointer specified in the pointer field of the return value, but asm.js functions whose frame pointer is inner to that frame pointer are not to save their state to the VM stack as they have been aborted by the longjmp.

Tag value 3 is reserved. It will probably be used to implement the very special kind of cross-function jump that is used by GCC to implement computed gotos in a duplicated C++ constructor.

calling another asm.js function

The JavaScript code to call another asm.js function is basically:

rp = f_0xYYYY(0xYYY, sp-16, r0, r1, 0xXXXX, fp);
if (rp & 3)
    break mainloop;

changed to:

rp = f_0xYYYY(0, sp, r0, r1, 0xXXXX, 0xYYYY);
if (rp & 3)
    break mainloop;

Thus, for ordinary calls resulting in an ordinary return, only the lower-most two bits of the asm.js function’s return value are ever checked. For special returns, the return value is kept in the rp variable and handled by the code outside the main loop.

basic blocks and labels

The basic blocks that make up the main code for a function are generated by GCC; basic blocks are separated by labels, which represent points where control potentially enters another basic block. The basic form of a basic block is thus:

case 0xYYY:
    <JavaScript code>

The case value is assigned by the “assembler”.

Control continues to another basic block either by a fall-through to the next basic block or by a jump to another basic block. An unconditional jump corresponds to the JavaScript code:

dpc = 0xYYY;
continue mainloop;

Recall that the main loop is an infinite loop wrapped around a switch statement, so control will eventually (after an indirect jump, which we’re trying to eliminate. See bug https://bugzilla.mozilla.org/show_bug.cgi?id=1253952) continue at the corresponding label in the switch statement.

A conditional branch corresponds to the JavaScript code:

if (<condition>) {
    dpc = 0xYYY;
    continue mainloop;
}

Note that both conditional branches and unconditional jumps are limited to targets within the same function. (This restriction results in a GCC test-suite failure).

registers

asm.js functions use special local or per-thread variables called “registers”. These do not correspond directly to registers on the physical machine running the JavaScript VM. The idea is that this way an intermediate number of local variables is presented to the JavaScript VM’s register allocator: enough so most code doesn’t use stack locations to address values, but few enough that there should be relatively few conflicts between live virtual registers for most code. Ideally, it was hoped that the JavaScript VM’s register allocator could be tricked into assigning one physical register to each of the asm.js registers, but this has not worked out so far.

There are two kinds of registers:

local registers

Local registers are variables local to an asm.js function. Unlike physical registers of a physical machine, which are often reused across function calls, all these registers are “call-saved”: they retain their value across a function call, and stack space is assigned to saving and restoring them on the VM stack (but this stack space is not actually written to until a special return requires that it is).

The local registers are named r0 through r7 and i0 through i7 for 32-bit integer registers (there is no longer any appreciable difference between the rX and iX registers), and f0 through f7 for 64-bit floating-point registers. There are also local variables pc, sp, and fp which behave much like local registers.

r0 and r1 are also used for passing the first two arguments to asm.js functions.

per-thread registers

Per-thread registers are variables shared between all asm.js functions executing on the same thread; as multi-threading is not yet implemented, they are effectively global to an asm.js module. As per-thread registers must be written to a global memory location, it is expected that they cannot be assigned to physical registers and access to them is thus appreciably slower than access to local registers. However, as per-thread registers are not preserved across function calls, they can be used to return values from functions. The per-thread registers are: rv, which is used to return an integer value from a function, and also as the static chain link register for nested functions; a0 through a3, which are used to pass the third through sixth 32-bit integer arguments to asm.js functions (r0 and r1 are used for the first two registers); and tp, which is the thread pointer for thread-specific data (currently unimplemented).

system calls

In addition to ordinary function calls, in which an asm.js function calls another asm.js function, there is a mechanism for asm.js functions to call JavaScript code; by analogy to system calls used by operating systems, this is referred to as a system call or external call (syscall or extcall for short).

The JavaScript code generated for a system call is:

rp = foreign_extcall(module, name, pc, sp+16);

foreign_extcall is the identifier for the JavaScript function implementing system calls; module is a pointer to a string identifying the kind of system call to be performed (this used to be useful to distinguish Emscripten calls from native syscalls, but Emscripten calls are not currently supported); name is the name of the syscall to be executed.

The arguments for the system call are not placed in r0, r1 or a0 through a3; instead, they are placed on the VM stack directly.

The current implementation always uses “thinthin” as the value of module, and a string containing the name of a Linux system call as the value of name. The arguments are meant to represent the arguments that the x86-64 Linux system call of the same name would take, regardless of the actual architecture of the machine we are executing on. It is the responsibility of the JavaScript support code to interpret the data precisely as Linux on x86-64 would have and to translate it into structures with layouts comprehensible to the native operating system, or JavaScript objects.

Similarly to Linux system calls, the return value rv of a system call is a negated errno value if it is in the integer range -4095 through -1 (or 0xfffff000 - 0xffffffff). The errno codes are those of x86-64 Linux, not those of the architecture of the machine we are executing on.

The return value rp of foreign_extcall is interpreted very similarly to the rp value of an ordinary asm.js-function call. There is one substantial difference, which is that if rp has a tag value of 1, execution will resume by repeating the system call, not at the point after the system call returns.

This design allows system calls to be effectively asynchronous: in terms of the JavaScript code, the ThinThin layer accepts as return values of the system call functions special JavaScript objects known as Promises, making it relatively easy to implement system calls that do not return a value immediately.

ThinThin

The interface preliminarily called “ThinThin” implements a minimal system call interface based on the JavaScript functions made available to ordinary web pages by the current Firefox trunk build. While it’s relatively easy to make the resulting code work on Chromium/Google Chrome browsers, this requires setting some flags and might not always work.

ThinThin is deliberately kept quite minimal (that’s what one of the “thin”s is for), though it is meant to be extended significantly from its current state. One significant difference between the web browser environment and traditional Unix/GNU environments is that there is no easy way to list all “files” in a “directory” that’s really just an HTTP URL prefix. The approach taken by ThinThin is to pretend that only those files that ThinThin has been explicitly told about are presumed to exist in that case; directories are thinly-populated, which is what the second “thin” stands for.

os.sys.call

There is a patch to the Firefox/SpiderMonkey source code that enables JavaScript code to directly call system calls of the underlying operating system. This can be used to implement asm.js system calls, but requires a translation layer (which has not been written) for architectures other than x86-64 Linux.

Emscripten library calls

Development started out using the C library included with the Emscripten project to implement system calls. This is currently no longer supported; our code no longer depends on Emscripten in any way, and that won’t change, but it also dropped all facilities to use Emscripten as an optional extension to the environment, and that will likely change with Emscripten support reenabled as an option.

the VM stack

JavaScript code, and asm.js code as a special case, is interpreted or compiled to code that makes use of the CPU stack to store local data, return addresses, and function arguments beyond those that can be passed in CPU registers. This stack is meant to be entirely opaque to JavaScript code and we thus make no assumptions about it.

However, we implement a second stack, the VM stack, which is a region of a JavaScript ArrayBuffer reserved and potentially used to store values which cannot be stored in the asm.js function’s registers; this can be either because there are no more available asm.js registers, or because the function is about to return, in which case the contents of the local asm.js registers are necessarily lost.

The idea is thus that all relevant data can be saved to the VM stack based on the JavaScript stack, and execution can resume using only this data. This allows the JavaScript VM to return from all asm.js or JavaScript functions and wait for an event asynchronously to resume execution.

The price to pay for this is two-fold: in terms of performance, it requires all asm.js functions to be implemented using the src_javascript{while(1) switch (pc) { }} pattern described above. This results in a number of indirect jumps, most of which can in theory be prevented by optimizations of the JavaScript VM running the code. In terms of memory, the price is that memory is reserved for the VM stack even while the memory actually used is on the JavaScript stack: we thus reserve memory twice for our stack values.

In addition to allowing asynchronous operations, this stack design gives us the opportunity to inspect the VM stack of a running program (perhaps by first instructing the program to store its state on the VM stack rather than the JavaScript stack). This means a program can inspect its own VM stack, which is useful for printing backtraces, unwinding exception frames, and implementing the __builtin_return_address and __builtin_frame_address GCC macros, but it also means that we can use GDB on our asm.js code to interpret the contents of the VM stack.

stack layout

The stack grows downwards. 8-byte alignment is maintained. When a function is called, the initial sp1 value points to (just above) a 16-byte area of reserved VM stack space; when the JavaScript stack is saved to the VM stack, the fp of the previous function will be stored at offset 0 in this reserved VM stack area.

frame layout: register save area

There is a register save area in the stack frame to store local registers in case of a special return. Since it is unused for ordinary returns, GCC does not usually know about it.

The register save area is pointed to by fp. Local variables are accessed at negative offsets to fp, i.e. the frame grows downward, so there is no conflict. The register save area has the following layout:

bitmask at offset 0

32-bit bitmask specifying which registers are saved

pc at offset 4

32-bit pc.

sp at offset 8

32-bit sp.

total size of register save area at offset 12

32-bit integer value.

registers at offset 16+x, as specified in bitmask

32-bit integer values or aligned 64-bit floating-point values.

non-local returns: exception handling and longjmp

It is sometimes necessary for a C function to return control not to the function which called it directly, but to one which called a chain of intermediate functions which eventually passed control to our function. Similarly, it is sometimes necessary for a C++ function to pass control on to an exception handler, which is special code emitted by a function which called our function directly or indirectly.

In both cases, the asm.js function corresponding to the C or C++ function returns an rp value with a tag value of 2, and a pointer value corresponding to the code that is to be executed next.

dynamic linking

Dynamic linking/loading does not currently work, though some initial code has been committed. Even when it does work, it will be subject to many limitations:

  • wasm32 only
  • no PLT for now
  • that means all code, including the main executable, must be compiled with -fPIC
  • it also means all calls of global functions will be indirect
  • eager linking only
  • needs WebAssembly.Module.customSections (https://bugzilla.mozilla.org/show_bug.cgi?id=1321122)

The idea, for now, is not to have a dynamically-linked glibc, but to dlopen() some small-ish modules from a large static executable. Dynlinking glibc should work, of course, but it’ll be too slow and not really advantageous.

debugging

Explicitly calling a gdb stub from C code to allow GDB to inspect data on the VM stack and modify it used to work. Breakpoints do not yet work, and are expected to require special build options and incur significant performance penalties when they do. Function calls made by GDB do not yet work.

__builtin_return_address

The GCC builtin builtin_return_address is supported and should return the right values; if called with an argument of 0 as builtin_return_address(0), it returns the rpc register’s value and should be relatively fast to execute; if called with an argument greater than 0, it causes the JavaScript function to perform a special return and inspects the stack afterwards to determine the right result; this is relatively slow.

__builtin_frame_address

Similarly to builtin_return_address, builtin_frame_address(0) returns a saved value and should execute quickly, while calling builtin_frame_address with an argument greater than 0 is relatively slow.

XXX is this still the case?

ELF format

The asm.js target uses a variant of the ELF format for intermediate files, even though the files ultimately processed by the web browser or JavaScript shell are pure JavaScript.

endianness

The asm.js target currently requires a little-endian JavaScript VM, and the ELF format is little-endian.

machine identifier

The machine identifier used for the ELF files is 0x534a (“JS” in little-endian notation).

32-bit

Currently, asm.js allows only for 32-bit addresses, and the asm.js target uses the 32-bit ELF format.

entry point

The entry point of the program is not specified by the relevant field of the ELF header but by the global symbol __entry. This is because ld -Obinary provides no way of extracting the entry point address.

section contents

data sections

Data sections contain binary data in 32-bit little-endian format. They use standard ELF relocations for pointers to data or code.

JavaScript sections

JavaScript sections contain ASCII/UTF-8-encoded JavaScript source code, with some addresses left out and encoded as strings of 16 ASCII “0” characters. (Sometimes, only 15 or 13 characters are used). The (possibly unaligned) offsets of such strings then appear in special relocations which replace the strings by ASCII-encoded hexadecimal digits representing a symbol’s address.

While JavaScript sections are not copied to the ArrayBuffer visible to an executing asm.js program, they are assigned addresses in the same address space. This allows us to distinguish pointers to JavaScript source code from data pointers based on the high-order bits of the address value.

However, the address of a basic block’s JavaScript source code does not correspond to the case label, or the pc0 value, of the basic block. Instead, PC values live in a third part of the address space, which is also invisible to the running program and distinguishable from the other two parts by its high-order bits.

This is so that PC values of adjacent basic blocks (after a complication described below) are consecutive integers, which allows the switch statement that an asm.js function is based on to be executed at relatively high speed.

text sections

Text sections contain any number of 16-byte-aligned 16-byte structures each consisting of two 64-bit little-endian addresses, marking the beginning and end of JavaScript source code stored in a JavaScript section. Like JavaScript sections, text sections are not loaded into the ArrayBuffer visible to the asm.js program. Each 16-byte structure has a PC address which necessarily ends in the hexadecimal digit 0.

There is a slight complication as the asm.js spec requires case labels to be densely packed: the pc local variable actually stores the result of right-shifting a PC address by four bits (equivalently, omitting the last hexadecimal digit). The convention we’re trying to adhere to is that whenever a PC address is written to memory, it is left unshifted (or left-shifted if it has previously been right-shifted) and its last hexadecimal digit is necessarily 0.

As there is currently no 64-bit support, there are only 32-bit little-endian binary relocations in text sections.

relocations

As mentioned above, there are hexadecimal relocations specific to the asm.js “architecture” in addition to the binary relocations common to all ELF architectures:

HEX16

This relocation replaces 16 ASCII hex digits in the ELF section by the right number of hex digits to represent the value of the relocation, encoded as ASCII hex digits; the digits are followed by space characters to keep the length of the resulting string at 16 bytes.

HEX16R4

Like HEX16, but only 15 ASCII hex digits are replaced, and the value is right-shifted by 4 bits; in other words, the last digit is omitted.

HEX16R12

Like HEX16R4, but the right shift is 12 bits, and the last three digits are omitted.

signals

Signals are not currently supported.

Links

Emscripten

http://emscripten.org

Relooper algorithm

https://github.com/kripken/emscripten/raw/master/docs/paper.pdf

asm.js standard

http://asmjs.org

WebAssembly

http://webassembly.github.io/ https://github.com/sunfishcode/wasm-reference-manual/blob/master/WebAssembly.md

Future design

asm.js calling convention

I think the best calling convention to use is:

function f_0x80001000(dpc, sp1, r0, r1, rpc, pc0)

for ordinary calls:

pc = pc0
dpc = 0
sp = sp1 - 16

for special calls:

dpc = -pc0
rp = sp1 - 16

To call another asm.js function, use

rp = f_0xXXXX(0, sp, r0, r1, 0xYYYY, 0xXXXX);

or

rp = indcall(0, sp, r0, r1, 0xYYYY, 0xXXXX);

In theory, we could do without the last argument for non-PIE code, and without the second-to-last one for functions not using __builtin_return_address(0).

register save area

pointer to rfp = . + length of register save area

pc0

pc = pc0 + dpc

rpc

sp

bitmap if return fp not yet reached

r0-r7 as indicated in bitmap

i0-i7 as indicated in bitmap

f0-f7 as indicated in bitmap

rv, frv, fa0, fa1, a0, a1, a2, a3 as indicated in bitmap

second bitmap if return fp not yet reached

rfp = HEAP32[HEAP32[fp>>2]>>2]

rpc = HEAP32[rfp+4>>2] + HEAP32[rfp+8>>2]

pc = pc0 + dpc

Scratch space

Ignore this: it’s been written but probably superseded by the above.

Stack layout

The asm.js target port uses the VM stack, a stack in the asm.js “heap” array buffer in addition to the normal JavaScript stack. The JavaScript stack’s layout is specific to the JavaScript engine in use and not interesting to us.

During normal operation (function calls that exit normally), space on the VM stack is reserved but nothing is actually written there; when a non-local exit is about to be performed (or certain other conditions are met), each function whose state is recorded on the JavaScript stack writes its state to the VM stack and returns to its caller.

When execution is resumed, only the innermost function is called again at first, and control briefly returns to JavaScript when it exits. The functions being called restore the state in registers and on the JavaScript stack based on the contents of the VM stack before continuing to execute translated JavaScript code.

Memory layout

0x00000000–0x00001000: zero page

0x00001000–0x00002000: control page

0x00001000: interrupt flag

0x00001020: interrupt reason

0x00001040: breakpoint array

Relocations

The asm.js target port uses ELF. The ELF .javascript.text sections (and similar) contain JavaScript source code, not binary opcodes. Therefore, the relocations do not patch in binary numbers, but hexadecimal ASCII-encoded ones. The relocations are called HEX16, HEX16R4 (leave out the last hex digit), and HEX16R12 (leave out the last three hex digits).

Syscalls

Syscalls are performed by calling the extcall function with two arguments: a pointer to a string specifying the API to be used and another pointer to a string specifying the syscall to be executed. At present, the first pointer should always be to the string “thinthin” followed by a NUL character.

The convention is that syscall arguments are sign-extended to 64 bits, then treated the same as the arguments of the corresponding Linux syscall on the x86-64 architecture. This makes it particularly easy to forward syscalls on that architecture, while other architectures have to translate syscall arguments.

Similarly, structure layouts and constants passed to system calls are identical to those of Linux on the x86-64 architecture.

os.sys.call

One implementation of the syscall API forwards them directly to the syscall(3) function, which is called by some extra code added to the SpiderMonkey JavaScript shell.

The calling convention is that os.sys.call is called with the first argument, an integer, specifying the syscall number, followed by a list of arguments that are either integers, to be passed directly to the syscall, or array buffer arguments, followed by an integer offset, which are converted to a pointer to the relevant offset in the array buffer. As a special case, an offset of 0 is translated to a NULL pointer, and the array buffer is ignored.

how to special-case the first case label:

function blah(x)
{
    while (x == 0) {
        init stuff;

        x = 1;
        break;
    }
    switch (x) {
    }
}

Rationale

Our design is significantly simpler than Emscripten’s; in particular, we have no need of a relooper algorithm, and we limit the number of JS local variables to those of a simulated register file.

In essence, we admit that we are treating the asm.js target as a hostile environment which makes simulat

The proposed specification for WebAssembly lacks the equivalent of a switch statement; the nearest match is the br_table opcode, which requires

wasm support

wasm64 support is severely outdated (and simulates 64-bit operations as 32-bit ones anyway; the wasm MVP does not contain 64-bit support).

assembly language

Unlike the text-based asm.js backend, the wasm target uses a conventional assembler approach: the wasm opcodes are used as though they were assembly instructions.

Notation is in RPN order: child nodes of the AST are described first, then their parent node. This can also be read as instructions for a stack machine.

Immediate arguments follow the instruction opcode, with the exception of an immediate argument specifying the value type for a block, loop, or if block; that type is specified in brackets following the mnemonic, with [] for a void type, [i] for i32, [l] for i64, [f] for f32, and [d] for f64.

ELF format

machine identification

For wasm, we use an id of 0x4157, which corresponds to “WA” if formatted in little-endian mode.

relocations

Two extra relocations are provided, one for LEB128 constants and one for LEB128 constants right-shifted by 32 bits, in order to extract the function index from a 32-plus-32-bit PC.

todo list

unconditional branches

implement

uses .dpc

test

conditional branches

implement

uses .dpc

test

ordinary calls

indirect calls

trampolines

exceptions

__builtin_setjmp

appears to work

setjmp()

MI thunks

other nonlocal gotos

__builtin_return_address(0)

implement
test

__builtin_return_address(N)

implement
test

__builtin_frame_address(0)

__builtin_frame_address(N)

fix GDB

wasm

make %S0 expand to “set_local $r0” or “i32.store %0”

make .dpc produce an i32.const

new section namespacing.

.javascript.text: JavaScript text section .wasm-ast.text: WebAssembly AST, as defined by the standard .wasm-pwas.text: wasm-translate.scm input, as not defined by a standard .wasm.text: WebAssembly binary code

wasm32

fix offsets for 32-bitness

setjmp

longjmp

mi thunk

eh_return

indcall

officialize ld script

shorten uleb128s

sleb128 vs uleb128

use i32.store offset

use tee_local

peephole patterns

signals

SIGFPE/SIGSEGV “handlers”

debugging signal

user-defined signals

SIGALRM

fast switch for wasm

wasm, unlike asm.js, allows the default label of a br_table to be the first case. So make the default case dpc = -1, use dpc = 0 for a register restore, then start “real” dpcs at dpc = 1. That saves an indirect branch and should make things significantly faster.

remove $trace code

dynamic memory limit

test “name” section

drop pc0 argument for direct calls

It’s highly desirable to keep things limited to six integer arguments since those can be passed in registers on x86-64, but we need two arguments for the return pc0 and the return dpc. Alternatively, we could pass a pointer to a 64-bit PC in the data section, or on the stack for PIC code.

In fact it’s only setjmp that really needs to know our dpc.

Use call_indirect rather than indcall

unsorted

multifile support

The multifile feature supports relinking an ELF executable (which has already been linked) with additional object files to produce another ELF executable. This is similar to what LD_PRELOAD does on systems that support it, but manages to work around the lack of dynamic linking in the asmjs target.

In addition to including the new object file code, constructors and destructors in the library file are executed.

This is implemented by modifying the linker script files to produce a linked list of file descriptors, starting at the fixed offset 16384, with new descriptors appended to existing object files after the .bss section.

dummy sections

The wasm32 backend uses a number of dummy ELF sections whose only purpose it is to allocate positions in some index space.

The new intermediate macro assembler language

file_header

function_name index, name

local_name findex, lindex, name

numbered_section index, name

named_section name

opcodes

relocations

@plt

@got

@gotcode

Since WebAssembly has separate spaces for data addresses and function indices, relocations must distinguish between code and data references. Usually, GOT references are to data; @gotcode references, however, are to function indices.

index spaces

.space.function

.space.global

.space.table

.space.memory

.space.pc