Tuesday, 17 January 2012

Calling Java from assembler

Hi there,

You may think that this is completely insane. We have C, right? You know, that high-level language which might still be popular come the end of the year?

Yes, but then that's hardly the point. I want to know how to do the same thing in assembly. So, with that in mind, here goes. First, the Java class we're going to use to do this:

HelloWorld.java
public class HelloWorld {

 static {
  System.loadLibrary("hello");
 }

 public static void main(String[] args) throws Exception {
  HelloWorld hw = new HelloWorld();
  hw.requestGreeting();
 }

 native void requestGreeting();

 void sayHello() {
  System.out.println("Hello, World!");
 }
}


This version differs from the previous ones in that it includes the method sayHello(), which we'll get the library to call.

Makefile
-------------------------------8<-------------------------------

all: HelloWorld.class libhello.so opt3_libhello.so

HelloWorld.class: HelloWorld.java
        javac -cp . HelloWorld.java

libhello.so: HelloWorld.o
        ld -shared -o libhello.so HelloWorld.o

HelloWorld.o: HelloWorld.s
        as -gstabs -o HelloWorld.o HelloWorld.s

opt3_libhello.so: HelloWorld.c
        gcc -gstabs -O3 -shared -fPIC -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux HelloWorld.c -o opt3_libhello.so

clean:
        rm *.o *.class *.so

-------------------------------8<-------------------------------

And here's what we want the assembler to do:

HelloWorld.c
#include "HelloWorld.h"

JNIEXPORT void JNICALL Java_HelloWorld_requestGreeting(JNIEnv* env, jobject hw_obj) {
 jclass clazz = (*env)->FindClass(env, "HelloWorld");
 jmethodID mid = (*env)->GetMethodID(env, clazz, "sayHello", "()V");
 (*env)->CallVoidMethod(env, hw_obj, mid);
}


I've omitted the creation of the HelloWorld.h header file for now, since it's not the focus of the post.

So - here's the assembler which does the same thing (with quite a few comments along the way). I'm sure that there are more efficient ways of doing this, but I was reasonably heartened to see that the output of gcc -O3 ... HelloWorld.c appeared to do something similar.

HelloWorld.s
-------------------------------8<-------------------------------
.section .data

clazz_name:
        .asciz  "HelloWorld"
void_sig:
        .asciz  "()V"
method_name:
        .asciz  "sayHello"
        
.section .text

        .globl  Java_HelloWorld_requestGreeting
        .type   Java_HelloWorld_requestGreeting, @function

Java_HelloWorld_requestGreeting:

        #
        # Prologue
        #
        pushq   %rbp
        movq    %rsp, %rbp                # Establish the frame pointer for this function

        #
        # JNIEXPORT void JNICALL Java_HelloWorld_requestGreeting(JNIEnv*, jobject);
        # Parameters:
        #     RDI: JNIEnv*
        #     RSI: jobject
        #
        # The stack frame we inherit and subsequently intend to set up will look like this (offsets relative to RBP):
        #  +8      -> return address
        #   0      -> previous RBP
        #  -8      -> JNIEnv parameter
        # -16      -> jobject parameter
        # -24      -> The address of the JNI function-table (to be calculated)
        # -32      -> The result of the call to FindClass (to be retrieved)
        # -40      -> The result of the call to GetMethodID (to be retrieved)
        #
        subq    $48, %rsp                 # Reserve the stack frame: 40 bytes of locals, rounded up to 48 so RSP stays 16-byte aligned at each call

        movq    %rdi, -8(%rbp)            # Store the JNIEnv parameter on the stack
        movq    %rsi, -16(%rbp)           # Store the jobject parameter on the stack
        
        movq    (%rdi), %rax              # RAX now contains the starting address of the function-table.
        movq    %rax, -24(%rbp)           # Store function-table address on the stack
        
        #
        # Invoke the (*JNIEnv)->FindClass function to look up the class "HelloWorld"
        # Parameters:
        #     RDI: JNIEnv*
        #     RSI: address-of "HelloWorld"
        # Returns:
        #     jclass (a reference to the class object), in RAX
        #
                                          # RDI still contains pointer to JNIEnv
        leaq    clazz_name(%rip), %rsi    # Calculate and store the address of "HelloWorld" in RSI

        movq    48(%rax), %rax            # Load the function-table entry at zero-based index 6 (FindClass, offset 6 * 8 = 48) into RAX
        call    *%rax                     # Call resulting function-pointer; it returns the jclass in RAX
        
        movq    %rax, -32(%rbp)           # Store result on the stack
        
        
        # Invoke (*JNIEnv)->GetMethodID function
        # Parameters:
        #     RDI: JNIEnv*
        #     RSI: jclass
        #     RDX: address-of "sayHello"
        #     RCX: address-of "()V", the descriptor of a method that takes no arguments and returns void
        # Returns:
        #     jmethodID
        #
        movq    -8(%rbp), %rdi            # Retrieve pointer to JNIEnv from the stack and store in RDI
        movq    %rax, %rsi                # Store the jclass in RSI
        leaq    void_sig(%rip), %rcx      # Store address of "()V" in RCX
        leaq    method_name(%rip), %rdx   # Store address of "sayHello" in RDX
        
        movq    -24(%rbp), %rax           # Look-up the pointer to the function-table from the stack and store in RAX
        movq    264(%rax), %rax           # Load the entry at zero-based index 33 (GetMethodID, offset 33 * 8 = 264) into RAX
        call    *%rax                     # Call function-pointer; it returns the jmethodID in RAX
        
        movq    %rax, -40(%rbp)           # Store the jmethodID on the stack
        
        #
        # Invoke (*env)->CallVoidMethod
        # Parameters:
        #     RDI: JNIEnv*
        #     RSI: jobject
        #     RDX: jmethodID
        #      AL: upper bound on the number of vector registers used for the varargs (zero here)
        # Returns:
        #     void
        movq    -8(%rbp), %rdi            # Retrieve pointer to JNIEnv from the stack and store in RDI
        movq    -16(%rbp), %rsi           # Retrieve the jobject from the stack and store in RSI
        movq    %rax, %rdx                # Store the jmethodID in RDX
        
        movq    -24(%rbp), %rax           # Look-up the pointer to the function-table from the stack and store in RAX
        movq    488(%rax), %rcx           # Load the entry at zero-based index 61 (CallVoidMethod, offset 61 * 8 = 488) into _RCX_, since RAX is needed for AL
        xorq    %rax, %rax                # Set RAX (and hence AL) to zero. See page 20 of the SysV ABI and notes below
        call    *%rcx
        
        #
        # Epilogue
        #
        movq    %rbp, %rsp                # forget stack frame
        popq    %rbp                      # restore caller's base-pointer

        retq
        
-------------------------------8<-------------------------------

The last call is interesting because it takes a "hidden argument" in the %AL register. The following appears at page 20 of the x86_64 System V ABI:

For calls that may call functions that use varargs or stdargs (prototype-less calls or calls to functions containing ellipsis (...) in the declaration) %al is used as hidden argument to specify the number of vector registers used. The contents of %al do not need to match exactly the number of registers, but must be an upper bound on the number of vector registers used and is in the range 0-8 inclusive.

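On a 32-bit architecture, where every argument is passed on the stack, a varargs function gets no such hint. The 64-bit ABI passes floating-point arguments in vector registers, so the caller uses %AL (the lowest 8 bits of %RAX) to tell a varargs function how many of those registers it may need to save. Note that %AL is not a count of all the arguments: the call to CallVoidMethod above passes no floating-point arguments, so zero is the right value.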
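
To make the role of %AL a little more concrete, here is a minimal C sketch of a varargs function that receives floating-point arguments. The function sum_doubles is hypothetical and has nothing to do with JNI; it is only here to show the kind of callee the ABI passage is talking about.

```c
#include <stdarg.h>

/* Hypothetical variadic helper (not part of JNI): adds up `count` doubles. */
double sum_doubles(int count, ...) {
    va_list args;
    double total = 0.0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* Each variadic double is fetched through the va_list machinery. */
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}
```

For a call such as sum_doubles(2, 1.5, 2.5), a compiler targeting the System V x86-64 ABI would normally pass the two doubles in %XMM0 and %XMM1 and load 2 into %AL before the call. In C the compiler does this for you; in hand-written assembler it is your job.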

I found the following resources of great help while researching how this works:

The JNIEnv Interface pointer
Invoking assembly language programs from Java

JNIEnv is a pointer that, in turn, points to another pointer. This second pointer points to a function table that is an array of pointers. Each pointer in the function table points to a JNI interface function. The virtual machine is guaranteed to pass the same interface pointer to native method implementation functions called from the same thread. However, a native method can be called from different threads, and therefore may be passed different JNIEnv interface pointers. Although the interface pointer is thread-local, the doubly indirected JNI function table is shared among multiple threads.

In order to call an interface function, we have to determine the value of the corresponding entry in the function table.
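
The double indirection is easier to see in C. The following is only a sketch using a mock table: MockEnv, mock_table, call_slot and the two mock functions are made-up names, and the real JNI functions take the JNIEnv pointer plus further arguments rather than nothing at all.

```c
#include <stddef.h>

/* Hypothetical stand-in for a JNI interface function. */
typedef int (*mock_fn)(void);

/* Same shape as JNIEnv: a pointer to the first entry of a function table. */
typedef const mock_fn *MockEnv;

int mock_get_version(void) { return 100; }
int mock_find_class(void)  { return 200; }

/* The mock function table: an array of function pointers. */
const mock_fn mock_table[] = {
    NULL, NULL, NULL, NULL,   /* indices 0-3: unused in this mock */
    mock_get_version,         /* index 4 */
    NULL,                     /* index 5: unused in this mock */
    mock_find_class,          /* index 6 */
};

/* What a native method would be handed: the address of this variable. */
MockEnv mock_env_storage = mock_table;

/* Follow env to the table, pick an entry by index, and call it. */
int call_slot(MockEnv *env, size_t index) {
    const mock_fn *table = *env;   /* like: movq (%rdi), %rax        */
    mock_fn fn = table[index];     /* like: movq index*8(%rax), %rax */
    return fn();                   /* like: call *%rax               */
}
```

Calling call_slot(&mock_env_storage, 6) ends up in mock_find_class, which follows the same three steps that the assembler listing uses for FindClass.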

[...]

To retrieve the contents of the entry in the function table that corresponds to the function we want to call, we have to multiply the zero based index of the function (see Sheng Liang's book) by eight, since each pointer is eight bytes long, and add the result to the starting address of the function table which we have formed in RAX earlier.
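
The multiplication is trivial, but it is worth checking against the offsets used in HelloWorld.s. This small sketch assumes x86-64 (8-byte pointers); jni_slot_offset and JNI_SLOT_SIZE are names I've made up for the illustration.

```c
#include <stddef.h>

/* Assumption: x86-64, where each function-table entry is an 8-byte pointer. */
enum { JNI_SLOT_SIZE = 8 };

/* Byte offset of the entry at a zero-based index in the function table. */
size_t jni_slot_offset(size_t index) {
    return index * JNI_SLOT_SIZE;
}
```

With the indices used in the listing, jni_slot_offset(6) gives 48 (FindClass), jni_slot_offset(33) gives 264 (GetMethodID) and jni_slot_offset(61) gives 488 (CallVoidMethod).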
