A JVM in Rust part 5 - Executing instructions

This post is part of the Writing a JVM in Rust series.

In this post, I will discuss how RJVM executes Java bytecode. We will look at various families of instructions, but leave method invocations and exception handling for the next part, because this post is already long enough. 😅

Loading classes

Let us start with actually loading classes. We have seen in part 3 the reader crate, that loads a class file and creates a Rust structure that models it. In the vm crate, however, I have created a different (although very similar) type:

/// A loaded java class
#[derive(Debug)]
pub struct Class<'a> {
    pub id: ClassId,
    pub name: String,
    /// Source file is stored as an attribute in the .class file, but might
    /// be missing for synthetic classes or if the compiler didn't write it.
    pub source_file: Option<String>,
    pub constants: ConstantPool,
    pub flags: ClassAccessFlags,
    pub superclass: Option<ClassRef<'a>>,
    pub interfaces: Vec<ClassRef<'a>>,
    pub fields: Vec<ClassFileField>,
    pub methods: Vec<ClassFileMethod>,
    // Base class fields have the same indexes they have in the base class,
    // and our own fields come after. This is the index of the first "owned"
    // field. Note that this will include the static fields, as required
    // by the bytecode specs.
    pub first_field_index: usize,
    // The total number of fields in this class, including those
    // in the base class.
    pub num_total_fields: usize,
}

pub type ClassRef<'a> = &'a Class<'a>;

/// In various data structures, we store the class id of the object,
/// i.e. a progressive number assigned when we load the class. Note that,
/// while we do not support it yet, multiple class loaders could load
/// the same class more than once, but they would be required to assign
/// different ids to them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct ClassId(u32);

The main difference from the ClassFile struct is that we replace the names of the parent class and implemented interfaces with references to the loaded classes. Notice also that I have added a ClassId to each class, assigned simply as a progressive integer.

Note that my implementation does not correctly follow the spec: in particular, it does not perform verification. Furthermore, it does not do any sort of optimization - it keeps the constant pool as it is in the .class file, rather than inlining all the constants. Thus, when executing an instruction such as ldc, the vm has to look up the correct constant. A smarter implementation would pre-process the bytecode to do this lookup just once, when loading the class file.
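To make the lookup concrete, here is a minimal sketch of the kind of constant-pool access that an instruction like ldc has to perform at runtime. The types here are simplified stand-ins, not RJVM's actual API:

```rust
// Simplified stand-in for a constant pool; not RJVM's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Constant {
    Integer(i32),
    String(String),
}

struct ConstantPool {
    entries: Vec<Constant>,
}

impl ConstantPool {
    /// Constant pool indexes in the class file format are 1-based
    fn get(&self, index: u16) -> Option<&Constant> {
        if index == 0 {
            return None;
        }
        self.entries.get(index as usize - 1)
    }
}

fn main() {
    let pool = ConstantPool {
        entries: vec![Constant::Integer(42), Constant::String("hello".to_string())],
    };
    // `ldc 1` would push Int(42) onto the operand stack
    println!("{:?}", pool.get(1));
}
```

A pre-processing pass could replace the index with the resolved value directly in the bytecode, paying this cost only once at class-loading time.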

Resolving classes

I have implemented a basic concept of classpath, simply modeled as a list of entries:

/// Models a class path, i.e. a list of [ClassPathEntry]
#[allow(dead_code)]
#[derive(Default, Debug)]
pub struct ClassPath {
    entries: Vec<Box<dyn ClassPathEntry>>,
}

/// Models an entry in the class path, i.e. a single Jar or directory
pub trait ClassPathEntry: fmt::Debug {
    fn resolve(&self, class_name: &str)
      -> Result<Option<Vec<u8>>, ClassLoadingError>;
}

impl ClassPath {
    /// Parses and adds class path entries.
    /// These should be separated by a colon (:), just like in a real JVM.
    pub fn push(&mut self, string: &str)
      -> Result<(), ClassPathParseError> { /* ... */ }

    /// Attempts to resolve a class from the various entries.
    /// Stops at the first entry that has a match or an error.
    pub fn resolve(&self, class_name: &str)
      -> Result<Option<Vec<u8>>, ClassLoadingError> { /* ... */ }
}

The implementation of ClassPath will simply iterate through the entries and find the first matching one.
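That iteration can be sketched as follows, using simplified stand-ins for the types shown above (the fake in-memory entry exists only for this example):

```rust
// Simplified stand-ins for the ClassPath types shown above.
type ClassBytes = Vec<u8>;

#[derive(Debug)]
struct ClassLoadingError;

trait ClassPathEntry {
    fn resolve(&self, class_name: &str) -> Result<Option<ClassBytes>, ClassLoadingError>;
}

struct ClassPath {
    entries: Vec<Box<dyn ClassPathEntry>>,
}

impl ClassPath {
    fn resolve(&self, class_name: &str) -> Result<Option<ClassBytes>, ClassLoadingError> {
        for entry in &self.entries {
            // The `?` propagates errors; a `Some` result stops the search
            if let Some(bytes) = entry.resolve(class_name)? {
                return Ok(Some(bytes));
            }
        }
        Ok(None) // not found in any entry
    }
}

/// A fake in-memory entry, just for this example
struct FakeEntry(&'static str, ClassBytes);

impl ClassPathEntry for FakeEntry {
    fn resolve(&self, class_name: &str) -> Result<Option<ClassBytes>, ClassLoadingError> {
        Ok(if class_name == self.0 { Some(self.1.clone()) } else { None })
    }
}

fn main() {
    let class_path = ClassPath {
        entries: vec![Box::new(FakeEntry("Foo", vec![0xCA, 0xFE]))],
    };
    println!("{:?}", class_path.resolve("Foo"));
}
```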

There are two implementations of the trait ClassPathEntry: one can load classes from a directory and one from a jar file:

/// Implementation of [ClassPathEntry] that searches for `.class` files,
/// using the given directory as the root package
#[derive(Debug)]
pub struct FileSystemClassPathEntry {
    base_directory: PathBuf,
}

/// Implementation of [ClassPathEntry] that searches for `.class` files
/// inside a `.jar` file
pub struct JarFileClassPathEntry {
    file_name: String,
    zip: RefCell<ZipArchive<BufReader<File>>>,
}

Since a jar file is simply a zip, I have used the zip crate to load them. The main jar file we load, in the integration tests, is the real rt.jar taken from OpenJDK 7. Notice that the implementation loads the whole zip file in memory, which is not ideal since it is almost 60 MB - and sadly, it makes using miri impractical. Ideally, I should use a library that reads just the directory index from the jar file and then loads each .class file lazily.

Storing loaded classes

To store the classes that have been loaded, so that they can be looked up when referred by another class, I have implemented a structure called ClassManager:

/// An object that will allocate and manage Class objects
pub(crate) struct ClassManager<'a> {
    class_path: ClassPath,
    classes_by_id: HashMap<ClassId, ClassRef<'a>>,
    classes_by_name: HashMap<String, ClassRef<'a>>,
    /// Used to allocate class instances that will be alive as long as the arena
    /// (and thus the `ClassManager`) is alive.
    arena: Arena<Class<'a>>,

    /// Used to generate ClassId
    next_id: u32,

    /// In a real implementation, we would have a current class loader
    /// for each thread, in a hierarchy. Currently, we only have exactly
    /// ONE global class loader.
    current_class_loader: ClassLoader<'a>,
}

I have used the crate typed_arena to allocate the classes and ensure that the allocations will be alive until the ClassManager is, hence the <'a> lifetime.

When the VM code requires a class to be resolved, that class might have already been loaded in memory or might require loading. Therefore, I have implemented the following enum:

impl<'a> ClassManager<'a> {
    pub fn get_or_resolve_class(&mut self, class_name: &str)
      -> Result<ResolvedClass<'a>, VmError> { /* ... */ }
}

/// When a class instance is requested, returns whether the class was
/// already loaded, or whether the request loaded a new class (which will
/// need to be initialized).
#[derive(Debug, Clone)]
pub(crate) enum ResolvedClass<'a> {
    AlreadyLoaded(ClassRef<'a>),
    NewClass(ClassesToInitialize<'a>),
}

/// In case a new class was loaded, contains the whole list of classes that
/// require initialization, ordered so that a base class is initialized
/// _before_ its derived classes. Includes the newly resolved class
/// in the list [to_initialize].
#[derive(Debug, Clone)]
pub(crate) struct ClassesToInitialize<'a> {
    resolved_class: ClassRef<'a>,
    pub(crate) to_initialize: Vec<ClassRef<'a>>,
}

This is necessary because loading a new class might, recursively, require the loading of a referred superclass or interfaces, which have to be initialized in order. For example, when loading the very first class in the VM, we will at the very least need to load java.lang.Object too.
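The base-first ordering can be sketched by walking up the superclass chain and then reversing it. This is a simplified stand-in, ignoring interfaces and using plain strings instead of class references:

```rust
// Simplified sketch of the base-first initialization order; real code
// would also have to handle interfaces and already-initialized classes.
use std::collections::HashMap;

/// Maps each class name to its superclass (None for java.lang.Object)
type Hierarchy = HashMap<&'static str, Option<&'static str>>;

fn classes_to_initialize(hierarchy: &Hierarchy, resolved: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut current = Some(resolved.to_string());
    while let Some(name) = current {
        current = hierarchy.get(name.as_str()).copied().flatten().map(String::from);
        chain.push(name);
    }
    chain.reverse(); // base classes must be initialized first
    chain
}

fn main() {
    let mut hierarchy = Hierarchy::new();
    hierarchy.insert("java/lang/Object", None);
    hierarchy.insert("Base", Some("java/lang/Object"));
    hierarchy.insert("Derived", Some("Base"));
    println!("{:?}", classes_to_initialize(&hierarchy, "Derived"));
}
```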

Executing methods

We are finally ready to execute some code! The main API of the vm crate is the Vm struct:

/// An instance of the virtual machine. Single-threaded,
/// can execute one method (generally `main`).
pub struct Vm<'a> {
    /// Responsible for allocating and storing classes
    class_manager: ClassManager<'a>,

    /// Responsible for allocating objects
    object_allocator: ObjectAllocator<'a>,

    /// Allocated call stacks
    call_stacks: Arena<CallStack<'a>>,

    // ...
}

impl<'a> Vm<'a> {
    pub fn new(max_memory: usize) -> Self { /* ... */ }

    /// Allocates a new call stack. We need to store it to be able to refer to it later,
    /// for extracting the gc roots.
    pub fn allocate_call_stack(&mut self) -> &'a mut CallStack<'a> {
      /* ... */
    }

    pub fn resolve_class_method(
        &mut self,
        call_stack: &mut CallStack<'a>,
        class_name: &str,
        method_name: &str,
        method_type_descriptor: &str,
    ) -> Result<ClassAndMethod<'a>, MethodCallFailed<'a>> {
        /* ... */
    }

    pub fn invoke(
        &mut self,
        call_stack: &mut CallStack<'a>,
        class_and_method: ClassAndMethod<'a>,
        object: Option<AbstractObject<'a>>,
        args: Vec<Value<'a>>,
    ) -> MethodCallResult<'a> {
        /* ... */
    }
}

/// A method call can return:
/// - for success: a value, or a None option in case of void methods
/// - for failures: a MethodCallFailed error
pub type MethodCallResult<'a> = Result<Option<Value<'a>>, MethodCallFailed<'a>>;

/// Models the fact that a method execution has failed
#[derive(Debug, PartialEq)]
pub enum MethodCallFailed<'a> {
    InternalError(VmError),
    ExceptionThrown(JavaException<'a>),
}
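One nice property of this error design is how it composes with the ? operator: given a From impl, low-level errors convert automatically into the method-call error. A sketch, with simplified stand-in types:

```rust
// Simplified stand-ins showing how `?` converts a low-level error
// into MethodCallFailed via a From impl.
#[derive(Debug, PartialEq)]
enum VmError {
    ValidationException,
}

#[derive(Debug, PartialEq)]
enum MethodCallFailed {
    InternalError(VmError),
}

impl From<VmError> for MethodCallFailed {
    fn from(err: VmError) -> Self {
        MethodCallFailed::InternalError(err)
    }
}

fn pop(stack: &mut Vec<i32>) -> Result<i32, VmError> {
    stack.pop().ok_or(VmError::ValidationException)
}

fn execute_one(stack: &mut Vec<i32>) -> Result<Option<i32>, MethodCallFailed> {
    let value = pop(stack)?; // the `?` converts VmError via From
    Ok(Some(value))
}

fn main() {
    let mut stack = vec![5];
    println!("{:?}", execute_one(&mut stack));
    println!("{:?}", execute_one(&mut stack));
}
```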

Thus, to execute some code, the steps are:

  • create an instance of Vm;
  • create a CallStack;
  • resolve the method to be invoked (generally main);
  • invoke it.
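Putting the steps together, a driver would look roughly like this. This is only a sketch using the APIs shown above - the class name, memory size, and arguments are illustrative, and error handling is elided:

```
// Rough sketch of a driver; the class name and arguments are
// illustrative, not taken from RJVM's actual tests.
let mut vm = Vm::new(ALLOCATED_MEMORY);
let call_stack = vm.allocate_call_stack();
let main_method = vm.resolve_class_method(
    call_stack, "SomeClass", "main", "([Ljava/lang/String;)V")?;
let result = vm.invoke(call_stack, main_method, None, args)?;
```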

The type Value was already discussed in the first part, so I will refer you to that discussion for the details.

The execution of a method will often, in turn, invoke other methods - this will be done again via the same invoke API, recursively. We will thus use both our CallStack and CallFrame abstractions, but also the native Rust call stack to execute methods.

As an aside, notice how nice MethodCallResult is, thanks to the power of Rust's sum types! 🤩

Call stack and frames

In RJVM there are two abstractions: a CallFrame represents the invocation of a single method and stores its local variables, program counter, and operand stack. A CallStack is simply a stack of call frames - each method invocation pushes one entry onto the stack, and when a method returns, its frame is popped. This is the code:

/// A call frame for a single method call inside a [CallStack].
#[derive(Debug)]
pub struct CallFrame<'a> {
    /// The class and method that is being executed
    class_and_method: ClassAndMethod<'a>,

    /// The current program counter
    pc: ProgramCounter,

    /// The local variables of the method
    locals: Vec<Value<'a>>,

    /// The current stack
    stack: ValueStack<'a>,

    /// The bytecode to execute
    code: &'a Vec<u8>,
}

/// Models the program counter, i.e. the address of an instruction
/// in the bytecode of a method
#[derive(Debug, PartialEq, Eq, Clone, Copy, PartialOrd, Ord)]
pub struct ProgramCounter(pub u16);

impl<'a> CallFrame<'a> {
    pub fn new(class_and_method: ClassAndMethod<'a>, locals: Vec<Value<'a>>)
    -> Self { /* ... */ }

    /// Executes the whole method
    pub fn execute(
        &mut self,
        vm: &mut Vm<'a>,
        call_stack: &mut CallStack<'a>,
    ) -> MethodCallResult<'a> {
      /* ... */
    }
}

A few interesting things:

  • the bytecode being executed is stored directly in the call frame, as a reference, to avoid having to go through the method reference at each instruction;
  • the local variables are simply modeled by a Vec;
  • …but the operand stack is implemented via a helper struct named ValueStack. This was done to simplify the implementation and tests of strange instructions such as dup2_x1 and similar.
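To see why those instructions deserve dedicated tests: for category-1 values, dup2_x1 turns a stack [.., v3, v2, v1] into [.., v2, v1, v3, v2, v1]. A self-contained sketch with stand-in types (the real instruction also has a form for category-2 values such as long and double, elided here):

```rust
// Sketch of dup2_x1 for category-1 values; stand-in types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
}

struct ValueStack {
    values: Vec<Value>,
}

impl ValueStack {
    /// Turns [.., v3, v2, v1] into [.., v2, v1, v3, v2, v1]
    fn dup2_x1(&mut self) {
        let len = self.values.len();
        assert!(len >= 3, "stack underflow");
        let v1 = self.values[len - 1].clone();
        let v2 = self.values[len - 2].clone();
        // Insert copies of v2 and v1 below v3
        self.values.insert(len - 3, v1);
        self.values.insert(len - 3, v2);
    }
}

fn main() {
    let mut stack = ValueStack {
        values: vec![Value::Int(3), Value::Int(2), Value::Int(1)],
    };
    stack.dup2_x1();
    println!("{:?}", stack.values);
}
```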

Executing a method means executing its instructions. The program counter starts at zero; the instruction at that address is executed, and the program counter is incremented. Jump instructions change the value of the program counter, and return instructions stop the execution of the method. I have given an overview of exception handling in part 1 and will discuss it further in the next part.

/// Executes the whole method
pub fn execute(
    &mut self,
    vm: &mut Vm<'a>,
    call_stack: &mut CallStack<'a>,
) -> MethodCallResult<'a> {
    self.debug_start_execution();

    loop {
        let executed_instruction_pc = self.pc;
        let (instruction, new_address) =
            Instruction::parse(
              self.code,
              executed_instruction_pc.0.into_usize_safe()
            ).map_err(|_|
              MethodCallFailed::InternalError(VmError::ValidationException))?;
        self.debug_print_status(&instruction);

        // Move pc to the next instruction, _before_ executing it, since we want
        // a "goto" to override this
        self.pc = ProgramCounter(new_address as u16);

        let instruction_result = self.execute_instruction(
          vm, call_stack, instruction);
        match instruction_result {
            Ok(ReturnFromMethod(return_value)) => return Ok(return_value),
            Ok(ContinueMethodExecution) => { /* continue the loop */ }

            Err(MethodCallFailed::InternalError(err)) => {
                return Err(MethodCallFailed::InternalError(err))
            }

            Err(MethodCallFailed::ExceptionThrown(exception)) => {
              /* To be discussed another time */
            }
        }
    }
}

/// Possible execution result of an instruction
enum InstructionCompleted<'a> {
    /// Indicates that the instruction executed was one of the return family.
    /// The caller should stop the method execution and return the value.
    ReturnFromMethod(Option<Value<'a>>),

    /// Indicates that the instruction was not a return, and thus the execution
    /// should resume from the instruction at the program counter.
    ContinueMethodExecution,
}

Finally, executing one instruction is simply a huge switch statement:

// Reference: https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html
fn execute_instruction(
    &mut self,
    vm: &mut Vm<'a>,
    call_stack: &mut CallStack<'a>,
    instruction: Instruction,
) -> Result<InstructionCompleted<'a>, MethodCallFailed<'a>> {
    match instruction {
        Instruction::Aconst_null => self.push(Null)?,
        Instruction::Aload(index) => self.execute_aload(index.into_usize_safe())?,
        Instruction::Aload_0 => self.execute_aload(0)?,
        Instruction::Aload_1 => self.execute_aload(1)?,
        // hundreds more instructions
    }
}

I have created helper methods for most instructions. For example, you can see the execute_aload one referred to in the snippet above.
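As an illustration, a helper like execute_aload could look roughly like this: read a reference from the locals and push it onto the stack. The types here are simplified stand-ins, not RJVM's actual code:

```rust
// Sketch of an aload helper, with simplified stand-in types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Object(u32), // stand-in for a real object reference
    Null,
}

#[derive(Debug, PartialEq)]
enum VmError {
    ValidationException,
}

struct CallFrame {
    locals: Vec<Value>,
    stack: Vec<Value>,
}

impl CallFrame {
    fn execute_aload(&mut self, index: usize) -> Result<(), VmError> {
        let local = self.locals.get(index).ok_or(VmError::ValidationException)?;
        match local {
            // aload requires a reference (or null) in the local variable
            Value::Object(_) | Value::Null => {
                let value = local.clone();
                self.stack.push(value);
                Ok(())
            }
            _ => Err(VmError::ValidationException),
        }
    }
}

fn main() {
    let mut frame = CallFrame { locals: vec![Value::Object(7)], stack: vec![] };
    frame.execute_aload(0).unwrap();
    println!("{:?}", frame.stack);
}
```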

Doing some arithmetic

Let us look in detail at a real instruction that does something, such as iadd, which replaces the two integers on top of the stack with their sum. This is its implementation (after expanding some macros):

Instruction::Iadd =>
  self.execute_int_math(|a, b| Ok(a.wrapping_add(b)))?,
// ...

fn execute_int_math<T>(&mut self, evaluator: T) -> Result<(), MethodCallFailed<'a>>
    where
        T: FnOnce(i32, i32) -> Result<i32, VmError>,
{
    let val2 = self.pop_int()?;
    let val1 = self.pop_int()?;
    let result = evaluator(val1, val2)?;
    self.push(Int(result))
}

fn pop_int(&mut self) -> Result<i32, MethodCallFailed<'a>> {
    let value = self.pop()?;
    match value {
        Value::Int(value) => Ok(value),
        _ => Err(MethodCallFailed::InternalError(
            VmError::ValidationException,
        )),
    }
}

Notice that pop_int does a validation check: it ensures that the stack actually contains an int when invoked. This is something that a real JVM will not need to do - all these checks are performed only once, during the verification phase, after loading a .class file, to avoid paying the cost at runtime.

Notice also that, given the large number of arithmetic instructions that follow the pattern “pop, pop, compute, push”, I have made execute_int_math receive the actual computation algorithm as a lambda.
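The Result return type of the closure is what lets other instructions report failures: for example, idiv must fail on division by zero (a case that a real JVM surfaces as a Java ArithmeticException). A sketch of the two closures, with a simplified stand-in error type:

```rust
// Sketch of two arithmetic closures; VmError is a stand-in type.
#[derive(Debug, PartialEq)]
enum VmError {
    ArithmeticException,
}

fn iadd(a: i32, b: i32) -> Result<i32, VmError> {
    Ok(a.wrapping_add(b))
}

fn idiv(a: i32, b: i32) -> Result<i32, VmError> {
    if b == 0 {
        Err(VmError::ArithmeticException)
    } else {
        // wrapping_div handles the i32::MIN / -1 overflow case
        Ok(a.wrapping_div(b))
    }
}

fn main() {
    println!("{:?} {:?}", iadd(1, 2), idiv(1, 0));
}
```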

Furthermore, since many instructions come in four or five families (int, long, float, double, and object versions), I have used a combination of generics and Rust’s macro mechanism to write the actual instruction once and “type” it properly. For example, I have this declarative macro:

generate_pop!(pop_int, Int, i32);
generate_pop!(pop_long, Long, i64);
generate_pop!(pop_float, Float, f32);
generate_pop!(pop_double, Double, f64);
generate_pop!(pop_object, Object, AbstractObject<'a>);

/// Pops a Value of the appropriate type from the stack
macro_rules! generate_pop {
    ($name:ident, $variant:ident, $type:ty) => {
        fn $name(&mut self) -> Result<$type, MethodCallFailed<'a>> {
            let value = self.pop()?;
            match value {
                Value::$variant(value) => Ok(value),
                _ => Err(MethodCallFailed::InternalError(
                    VmError::ValidationException,
                )),
            }
        }
    };
}

Declarative macros are relatively easy to use, quite powerful, and very useful. Having used the C and C++ preprocessor for years, they are really a breath of fresh air. I haven’t dabbled in Rust procedural macros though, which are a lot more complex.

Jumps

Jumps are easy: we just have to update the program counter! This is goto:

Instruction::Goto(jump_address) => self.goto(jump_address),
// ...

fn goto(&mut self, jump_address: u16) {
    self.pc = ProgramCounter(jump_address);
}

An instruction such as ifeq, in turn, is implemented by:

Instruction::Ifeq(jump_address) => self.execute_if(jump_address, |v| v == 0)?,
// ...

fn execute_if<T>(
    &mut self,
    jump_address: u16,
    comparator: T,
) -> Result<(), MethodCallFailed<'a>>
where
    T: FnOnce(i32) -> bool,
{
    let value = self.pop_int()?;
    if comparator(value) {
        self.goto(jump_address);
    }
    Ok(())
}
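The two-operand comparisons, like if_icmplt, follow the same pattern but pop two ints. A self-contained sketch with simplified stand-in types:

```rust
// Sketch of an if_icmp<cond> helper, with stand-in types.
struct Frame {
    pc: u16,
    stack: Vec<i32>,
}

impl Frame {
    fn execute_if_icmp<T>(&mut self, jump_address: u16, comparator: T)
    where
        T: FnOnce(i32, i32) -> bool,
    {
        // val2 is on top of the stack, val1 below it
        let val2 = self.stack.pop().expect("stack underflow");
        let val1 = self.stack.pop().expect("stack underflow");
        if comparator(val1, val2) {
            self.pc = jump_address;
        }
    }
}

fn main() {
    let mut frame = Frame { pc: 0, stack: vec![1, 2] };
    frame.execute_if_icmp(42, |a, b| a < b); // if_icmplt
    println!("{}", frame.pc);
}
```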

Conclusions

I hope I have managed to give you an overview of how RJVM executes Java bytecode. In the next part, we will complete the overview of the instruction executions by tackling the various invoke instructions and the handling of exceptions.

As usual, thank you for reading! 😊