Let’s Write a Web Assembly Interpreter (Part 2)

Richard Anaya
Mar 30, 2020

Welcome back! Last time, we took a look at all the details of uploading the bytes of a web assembly program into our interpreter. Today we’re going to write the code for handling the most basic program:

#[no_mangle]
pub fn main(_args: usize, _len: usize) -> usize {
    return 42;
}

Before we begin, I want to show you the final code we’ll be making and work backwards.

let program = Program::parse(&wasm_bytes)?;
let main_function = program.find_exported_function("main")?;
let main_code = program.find_code_block(main_function.index)?;
if let Instruction::I32Const(num) = main_code.code_expression[0] {
    Ok(num as f64)
} else {
    Err("Interpreter can't do anything else yet.".to_string())
}

Wait what? That’s it? Yes.

We have a lot to unpack, though, about what’s happening in each step. The very first step is to parse the binary format into a usable data structure. We use a very minimalist library I've written called watson. It does nothing but parse, byte by byte, all the components into structs, and it has no execution capabilities.

Let’s explore what these data structures are in a web assembly program. As we mentioned in the last article, a web assembly program is structured as a sequence of sections, each representing a different aspect of the program.

Here’s the complete list of section types if you are curious!

pub const SECTION_CUSTOM: u8 = 0;
pub const SECTION_TYPE: u8 = 1;
pub const SECTION_IMPORT: u8 = 2;
pub const SECTION_FUNCTION: u8 = 3;
pub const SECTION_TABLE: u8 = 4;
pub const SECTION_MEMORY: u8 = 5;
pub const SECTION_GLOBAL: u8 = 6;
pub const SECTION_EXPORT: u8 = 7;
pub const SECTION_START: u8 = 8;
pub const SECTION_ELEMENT: u8 = 9;
pub const SECTION_CODE: u8 = 10;
pub const SECTION_DATA: u8 = 11;
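
To make the section list a bit more concrete, here is a minimal sketch of walking the sections of a module by hand. It assumes the standard binary layout: an 8-byte header (the magic bytes "\0asm" plus a 4-byte version), followed by sections that each start with a one-byte id and a LEB128-encoded payload length. The names list_sections and read_u32_leb128 are hypothetical helpers made up for this illustration; they are not part of watson.

```rust
/// Reads an unsigned LEB128 integer starting at `pos`.
/// Returns the decoded value and the position just past it.
fn read_u32_leb128(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 32 {
            return None; // too many bytes for a 32-bit value
        }
        // The low 7 bits carry data; the high bit means "more bytes follow".
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Returns (section id, payload length) for each section in the module.
/// This only skips over payloads; it does not validate their contents.
fn list_sections(wasm: &[u8]) -> Option<Vec<(u8, u32)>> {
    // Header: magic "\0asm" followed by a 4-byte version.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut sections = Vec::new();
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        let (size, payload_start) = read_u32_leb128(wasm, pos + 1)?;
        sections.push((id, size));
        // Jump over the payload to the next section header.
        pos = payload_start.checked_add(size as usize)?;
    }
    Some(sections)
}
```

Given a module that contains only a type section with a 5-byte payload, list_sections would return a single entry with id 1 (SECTION_TYPE) and length 5.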

Consider the lone function of our simple web assembly program. Various parts of this function get stored in some of these sections:

  • types — represents the input and output signature of a function
  • exports — represents the names of functions that are exported (i.e. accessible from JavaScript)
  • code — low level instructions of a function
  • functions — a list of known functions and their type signatures

Let’s take a look at a roughly human-readable version of our simple.wasm:

[Type Section]
0: fn(inputs[]) -> outputs[I32]
[Function Section]
0: type[0]
[Memory Section]
0: min 2 max 10
[Export Section]
"main" function[0]
"memory" memory[0]
[Code Section]
0: locals[] code[I32Const(42)]

Notice how members of these sections reference each other via their index.

Knowing this helps us make more sense of these two lines in our simple interpreter.

let main_function = program.find_exported_function("main")?;
let main_code = program.find_code_block(main_function.index)?;

We’re looking for pieces of information from these sections, and then asking for more information from other sections. The first line looks for an exported function called main. The second line uses the function index we just found to look up that function’s code instructions.
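
As a rough illustration of this index-based lookup, here is a small self-contained sketch. The Module and Export types and the lookup_* methods are hypothetical stand-ins invented for this example; they are not watson's real data structures or method names. The sketch also assumes a module without imported functions, so a function index can be used directly as an index into the code section.

```rust
// Hypothetical, simplified stand-ins for parsed sections.
// These are NOT watson's real types or method names.
struct Export {
    name: &'static str,
    function_index: usize,
}

struct Module {
    // Function section: function index -> index into the type section.
    function_type_indices: Vec<usize>,
    // Export section (functions only, for brevity).
    exports: Vec<Export>,
    // Code section: one instruction list per function, in the same order.
    code: Vec<Vec<u8>>,
}

impl Module {
    /// Finds an export by name in the export section.
    fn lookup_export(&self, name: &str) -> Result<&Export, String> {
        self.exports
            .iter()
            .find(|export| export.name == name)
            .ok_or_else(|| format!("no exported function named {:?}", name))
    }

    /// Follows a function index into the function section.
    fn lookup_type_index(&self, function_index: usize) -> Result<usize, String> {
        self.function_type_indices
            .get(function_index)
            .copied()
            .ok_or_else(|| format!("no function at index {}", function_index))
    }

    /// Follows a function index into the code section.
    /// Assumption: no imported functions, so the indices line up one to one.
    fn lookup_code(&self, function_index: usize) -> Result<&[u8], String> {
        self.code
            .get(function_index)
            .map(|body| body.as_slice())
            .ok_or_else(|| format!("no code for function {}", function_index))
    }
}

/// Builds the equivalent of the simple.wasm listing shown earlier.
fn simple_module() -> Module {
    Module {
        function_type_indices: vec![0],
        exports: vec![Export { name: "main", function_index: 0 }],
        code: vec![vec![65, 42]],
    }
}
```

Looking up "main" yields function index 0, and that same index then selects type 0 from the function section and the body [65, 42] from the code section.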

You might enjoy looking at the details of these lookup helper functions in watson.

Now we can see the code instructions for our main function:

[ 65, 42 ] // represents [ I32_CONST, 42 ]

The I32_CONST instruction pushes a constant 32-bit integer onto the stack. In this case, since there are no other operations, this number is still on the stack when the function ends, so it is used as the return value!
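
Here is a minimal sketch of that idea as a tiny stack machine. It is not how watson or our interpreter is written; run_constants is a hypothetical function that only understands I32_CONST (opcode 65, or 0x41) and the END opcode (0x0b), and it assumes every constant fits in a single byte.

```rust
const I32_CONST: u8 = 0x41; // 65 in decimal
const END: u8 = 0x0b; // marks the end of a function body

/// Runs a function body made only of single-byte `i32.const` instructions
/// and returns whatever is left on top of the stack.
fn run_constants(code: &[u8]) -> Result<i32, String> {
    let mut stack: Vec<i32> = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        match code[pc] {
            I32_CONST => {
                let byte = *code.get(pc + 1).ok_or("missing immediate")?;
                // Assumption: the immediate is a one-byte value (-64..=63).
                if byte & 0x80 != 0 {
                    return Err("multi-byte immediates are not handled".to_string());
                }
                // Bit 6 is the sign bit of a one-byte signed value.
                let value = if byte & 0x40 != 0 {
                    byte as i32 - 128
                } else {
                    byte as i32
                };
                stack.push(value);
                pc += 2;
            }
            END => break,
            other => return Err(format!("unsupported opcode {}", other)),
        }
    }
    stack.pop().ok_or_else(|| "nothing left on the stack".to_string())
}
```

Running it on [65, 42] pushes 42 and then runs out of instructions, so 42 is the result.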

Integers in web assembly are encoded in a very specific variable-length format: LEB128.
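
To give a feel for the format, here is a sketch of a signed LEB128 decoder for 32-bit values. Each byte carries 7 bits of data, the high bit says whether more bytes follow, and bit 6 of the final byte is the sign. The function name decode_i32_leb128 is made up for this example and is not taken from watson.

```rust
/// Decodes a signed LEB128 integer (up to 32 bits) from the start of `bytes`.
/// Returns the value and the number of bytes consumed.
fn decode_i32_leb128(bytes: &[u8]) -> Option<(i32, usize)> {
    let mut result: i64 = 0;
    let mut shift: u32 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        if shift >= 35 {
            return None; // a 32-bit value never needs more than 5 bytes
        }
        // The low 7 bits of each byte are data, least significant group first.
        result |= ((byte & 0x7f) as i64) << shift;
        shift += 7;
        // A clear high bit marks the last byte.
        if byte & 0x80 == 0 {
            // Bit 6 of the last byte is the sign bit: extend it upwards.
            if byte & 0x40 != 0 {
                result |= -1i64 << shift;
            }
            if result < i32::MIN as i64 || result > i32::MAX as i64 {
                return None; // does not fit in 32 bits
            }
            return Some((result as i32, i + 1));
        }
    }
    None // ran out of bytes before the final byte
}
```

Small values such as our 42 fit in a single byte (0x2a), which is why the code for main is just [65, 42]. Larger or negative values spill into more bytes, for example 128 becomes [0x80, 0x01].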

Hopefully now our very very primitive interpreter makes more sense.

extern crate alloc;
use alloc::string::{String, ToString};
use alloc::vec::Vec;
use watson::*;
use webassembly::*;

// Allocates `size` bytes that the caller can fill with the wasm program.
#[no_mangle]
fn malloc(size: usize) -> *mut u8 {
    let mut buf = Vec::<u8>::with_capacity(size);
    let ptr = buf.as_mut_ptr();
    // Leak the buffer so it stays alive; `run` takes ownership back later.
    core::mem::forget(buf);
    ptr
}

fn load_and_run_main(wasm_bytes: &[u8]) -> Result<f64, String> {
    // Parse the raw bytes into sections.
    let program = Program::parse(&wasm_bytes)?;
    // Export section: find the function named "main".
    let main_function = program.find_exported_function("main")?;
    // Code section: find the instructions for that function index.
    let main_code = program.find_code_block(main_function.index)?;
    // The only program we understand so far: a single constant.
    if let Instruction::I32Const(num) = main_code.code_expression[0] {
        Ok(num as f64)
    } else {
        Err("Interpreter can't do anything else yet.".to_string())
    }
}

#[no_mangle]
fn run(ptr: *mut u8, len: usize) -> f64 {
    // Rebuild the Vec that `malloc` leaked so it is freed when we return.
    let wasm_bytes = unsafe { Vec::from_raw_parts(ptr, len, len) };
    match load_and_run_main(&wasm_bytes) {
        Ok(result) => result,
        Err(e) => panic!("fail: {}", e),
    }
}

In our next part we’ll be talking more about functions, instructions, and local variables!

And you can see this lesson running here.
