Wednesday, July 2, 2008

Writing assembly in Java

Not really assembly language... but byte code instead.

I read an article on dynamic code generation with C# a few days ago and was wondering if similar thing can be done with Java. After some Google searches, the (expected) answer is yes!

The benefit of dynamic code generation is performance, esp. when the logic involved has to be executed many times. This article has a good example on using dynamic programming on a stack-based calculator.

For example, to a stack-based calculator, the expression "2 $0 * $1 /" means:

  • push the value 2 to the stack
  • push parameter 0 to the stack
  • execute the multiple calculation by popping two values from the stack and store the result back to the stack
  • push parameter 1 to the stack
  • execute the divide calculation by popping two values from the stack and store the result back to the stack

That is: (2 * $0) / $1.

Using the plain old Java, it is easy to implement the logic. All we need to do is tokenize the expression and push/pop values from the Stack class for calculation.

......
                char opchar = tok.charAt(0);
                int  op = "+-*/".indexOf(opchar);

                if (op == -1)
                {
                    // not an operator
                    stk.push(Double.valueOf(tok));
                }
                else
                {
                    double arg2 = (stk.pop()).doubleValue();
                    double arg1 = (stk.pop()).doubleValue();

                    switch(op)
                    {
                        case 0:
                            stk.push(arg1 + arg2);
                            break;

                        case 1:
                            stk.push(arg1 - arg2);
                            break;

                        case 2:
                            stk.push(arg1 * arg2);
                            break;

                        case 3:
                            stk.push(arg1 / arg2);
                            break;

                        default:
                            throw new RuntimeException(
                                "unknown parameter(" + i + "): " + tok);
                    }

                }
......


When tested for performance, this implementation can compute the expression "2 $0 * $1 / 1 + $2 -" a million times in less than 3 seconds. Not bad. But it could be improved.

Using dynamic code generation, we can actually create a method (or a Java class) on-the-fly for each expression. This eliminates all the unnecessary testing and branching logic in the loop. For comparison, the above mentioned test can finish in less than 20ms when using dynamic code generation. (yes... finishing the one million calculation in less than twenty millisecond!!).

Here are the details on the dynamic code generation. First, go get the Byte Code Engineering Library (BCEL). It simplifies the way to write byte code logic.

Create a new ClassGen class as the template for the new Calculator
        ClassGen gen = new ClassGen(
            dynClassName,  // fully qualified class name
            "java.lang.Object",  // fully qualified superclass name
            "",  // source file name
            Constants.ACC_PUBLIC | Constants.ACC_SUPER,  // access qualifiers
            new String[]
                {"net.clarenceho.calc.Calculator"}  // implemented interfaces
            );


Then, define methods for the class. We need to define a constant pool and an instruction list of each method. The logic is similar to writing assembly language. We will be pushing/popping values from the operand stack and call arithmetic/logic operations to do the work. e.g.

......
                int op = "+-*/".indexOf(tok.charAt(0));

                switch(op)
                {
                    case -1:
                        // not an operator
                        double val = Double.parseDouble(tok);
                        il.append(new PUSH(cp, val));
                        break;

                    case 0:
                        il.append(InstructionConstants.DADD);
                        break;

                    case 1:
                        il.append(InstructionConstants.DSUB);
                        break;

                    case 2:
                        il.append(InstructionConstants.DMUL);
                        break;

                    case 3:
                        il.append(InstructionConstants.DDIV);
                        break;

                    default:
                        throw new RuntimeException(
                            "unknown parameter: " + tok);
                }
......

il is the instruction list for the method. the DADD, DSUB etc are the arithmetic logic for double values. Refer to the BCEL apidoc for details.

Following the above mentioned article, I also implemented the logic for fun. The only different is that my implementation takes doubles instead of integers. Full source can be found below.


Interface for the Calculator:
package net.clarenceho.calc;

public interface Calculator
{
    double calc(double[] param);
}


Implementation using traditional Java logic:
package net.clarenceho.calc;

import java.util.StringTokenizer;
import java.util.Stack;
import java.util.ArrayList;

public class SimpleCalculator implements Calculator
{
    private ArrayList<String> lst = new ArrayList<String>();

    public SimpleCalculator(String exp)
    {
        StringTokenizer st;
        st = new StringTokenizer(exp, " ");
        while (st.hasMoreTokens())
        {
            lst.add(st.nextToken());
        }
    }

    public double calc(double param[])
    {
        Stack<Double> stk = new Stack<Double>();

        for (int i=0; i < lst.size(); i++)
        {
            String tok = lst.get(i);
            if (tok.startsWith("$"))
            {
                // a parameter
                int pos = Integer.parseInt(tok.substring(1));
                stk.push(param[pos]);
            }
            else
            {
                char opchar = tok.charAt(0);
                int  op = "+-*/".indexOf(opchar);

                if (op == -1)
                {
                    // not an operator
                    stk.push(Double.valueOf(tok));
                }
                else
                {
                    double arg2 = (stk.pop()).doubleValue();
                    double arg1 = (stk.pop()).doubleValue();

                    switch(op)
                    {
                        case 0:
                            stk.push(arg1 + arg2);
                            break;

                        case 1:
                            stk.push(arg1 - arg2);
                            break;

                        case 2:
                            stk.push(arg1 * arg2);
                            break;

                        case 3:
                            stk.push(arg1 / arg2);
                            break;

                        default:
                            throw new RuntimeException(
                                "unknown parameter(" + i + "): " + tok);
                    }

                }
            }
        }

        return (stk.pop()).doubleValue();
    }
}


Implementation using dynamic code generation:
package net.clarenceho.calc;

import java.lang.Thread;
import java.lang.Class;
import java.lang.ClassLoader;
import java.util.StringTokenizer;
import java.util.Stack;

import org.apache.bcel.Constants;
import org.apache.bcel.generic.ClassGen;
import org.apache.bcel.generic.ConstantPoolGen;
import org.apache.bcel.generic.InstructionList;
import org.apache.bcel.generic.InstructionConstants;
import org.apache.bcel.generic.PUSH;
import org.apache.bcel.generic.MethodGen;
import org.apache.bcel.generic.Type;

public class DynamicCalculator extends ClassLoader implements Calculator
{
    Calculator dynCalc = null;

    public DynamicCalculator(String exp)
    {
        String dynClassName = "net.clarenceho.calc.DynamicCalculator_" +
            Thread.currentThread().getId() + "_" +
            System.currentTimeMillis();

        ClassGen gen = new ClassGen(
            dynClassName,  // fully qualified class name
            "java.lang.Object",  // fully qualified superclass name
            "",  // source file name
            Constants.ACC_PUBLIC | Constants.ACC_SUPER,  // access qualifiers
            new String[]
                {"net.clarenceho.calc.Calculator"}  // implemented interfaces
            );

        // need to manually add an empty constructor
        gen.addEmptyConstructor(Constants.ACC_PUBLIC);
        // add member function to the class;
        this.addMethod(gen, exp);

        // create a new instance of the dynamic class
        byte[] data = gen.getJavaClass().getBytes();
        Class theClass = this.defineClass(dynClassName, data, 0, data.length);
        try
        {
            this.dynCalc = (Calculator)theClass.newInstance();
        }
        catch (Exception e)
        {
            throw new RuntimeException(e.getMessage());
        }
    }


    private void addMethod(ClassGen cgen, String exp)
    {
        // each method has a constant pool and an instruction list
        ConstantPoolGen cp = cgen.getConstantPool();
        InstructionList il = new InstructionList();

        StringTokenizer st = new StringTokenizer(exp, " ");

        while (st.hasMoreTokens())
        {
            String tok = st.nextToken();

            if (tok.startsWith("$"))
            {
                // a parameter
                int pos = Integer.parseInt(tok.substring(1));

                // load and store the array reference to the operand stack
                il.append(InstructionConstants.ALOAD_1);
                // store the array index to the operand stack
                il.append(new PUSH(cp, pos));
                // load and store array item
                il.append(InstructionConstants.DALOAD);
            }
            else
            {
                int op = "+-*/".indexOf(tok.charAt(0));

                switch(op)
                {
                    case -1:
                        // not an operator
                        double val = Double.parseDouble(tok);
                        il.append(new PUSH(cp, val));
                        break;

                    case 0:
                        il.append(InstructionConstants.DADD);
                        break;

                    case 1:
                        il.append(InstructionConstants.DSUB);
                        break;

                    case 2:
                        il.append(InstructionConstants.DMUL);
                        break;

                    case 3:
                        il.append(InstructionConstants.DDIV);
                        break;

                    default:
                        throw new RuntimeException(
                            "unknown parameter: " + tok);
                }
            }
        }

        // return a double
        il.append(InstructionConstants.DRETURN);

        MethodGen mgen = new MethodGen(
            Constants.ACC_PUBLIC,  // access qualifiers
            Type.DOUBLE,  // return type
            new Type[] {Type.getType("[D")},  // argument types
            new String[] {"param"},  // argument names
            "calc",  // name of method
            cgen.getClassName(),  // class name containing this method
            il,  // instruction list associated with this method
            cp  // constant pool
            );

        // compute the max stack size
        mgen.setMaxStack();
        // compute the max number of local variables
        mgen.setMaxLocals();

        cgen.addMethod(mgen.getMethod());
    }

    public double calc(double[] param)
    {
        if (this.dynCalc != null)
        {
            return this.dynCalc.calc(param);
        }
        else
        {
            throw new RuntimeException(
                "problem creating dynamic calculator");
        }
    }
}

No comments: