Wednesday, August 09, 2023

Jacobin at the 2-year Mark

Jacobin (a JVM written entirely in Go) just reached its 2-year anniversary. Since our 18-month update, a lot has happened. We have:

  • Added instantiation of non-static classes
  • Added support for superclasses
  • Implemented the JDK’s native math libraries in Go
  • Added support for multidimensional arrays
  • Added support for compact strings
  • Extended the interpreter to handle 190 bytecodes (out of 203)
  • Defaulted to using the classes and libraries bundled with the OpenJDK
  • Added significant instruction-level tracing capabilities (see below)

What we’re working on now and taking up shortly:

  • Making sure that our test suites generate the same results as the OpenJDK JVM
  • Adding the final bytecodes to the interpreter. (Some of these are very complicated, so they will likely take a while.)
  • Adding exception handling
  • Adding support for interfaces

Even before these goals are attained, we expect to start running benchmarks and third-party test suites on Jacobin.

Much of the good progress we’ve made since our 18-month update is due to the addition of Richard Elkins (@texadactyl) to the team. He implemented the JDK’s native math libraries and has created a test suite, Jacotest, which grinds on existing and upcoming features.

Tracing and Peering into the JVM

Our progress remains very much aligned with the original goals for Jacobin: a JVM capable of running Java 17 programs, written entirely in Go with no dependencies, delivered as a small executable from a cohesive, extensively commented codebase.

At present, Jacobin is a 3.9MB executable that is tested daily on Windows, Linux, and macOS. Because it’s a single codebase, we have the pleasure of loading it into our IDE (GoLand, kindly provided by JetBrains) and stepping through the execution of a class bytecode by bytecode, following the execution path across classes and libraries.

To give us a roadmap, we expanded our already detailed instruction tracing to show the values on the operand stack and other useful details. Here is a sample of the tracing log (available by specifying the -trace:inst option on the command line):

 

java/lang/StringLatin1 meth: inflate    PC:  30, GOTO       TOS:  -
java/lang/StringLatin1 meth: inflate    PC:   3, ILOAD      TOS:  -
java/lang/StringLatin1 meth: inflate    PC:   5, ILOAD      TOS:  0 int64 22
java/lang/StringLatin1 meth: inflate    PC:   7, IF_ICMPGE  TOS:  1 int64 22
java/lang/StringLatin1 meth: inflate    PC:  33, RETURN     TOS:  -
java/lang/StringLatin1 meth: toChars    PC:  14, ALOAD_1    TOS:  -
java/lang/StringLatin1 meth: toChars    PC:  15, ARETURN    TOS:  0 Object
java/lang/String       meth: toCharArray PC: 14, GOTO       TOS:  0 Object
java/lang/String       meth: toCharArray PC: 24, ARETURN    TOS:  0 Object
main                   meth: main       PC:  41, ASTORE     TOS:  0 Object: &{{68288800 0} <nil> [{[I 0xc000004450}]}
main                   meth: main       PC:  43, GETSTATIC  TOS:  -

 

(Some entries removed for simplicity.) Reading each line of this listing from left to right, you see the class name, the method name, the program counter (PC, the offset within the method's code of the bytecode being executed), the bytecode, and the value on the top of the stack (TOS). Here, TOS: 0 means there is one item on the stack (at position 0), and its type and value are shown immediately to the right (or on the next line in case of line wrapping).

Notice that in this excerpt, execution starts in java.lang.StringLatin1.inflate() and eventually returns to the calling method in java.lang.String, toCharArray(). When this completes, it returns to the main method in the class called main, which is handed a pointer to an object that consists of an array of integers (in this particular case, an array of chars that form a string).
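For readers who want to post-process a trace log, the field layout described above can be picked apart with a few lines of Go. This is only a minimal sketch based on the sample lines in this post: the TraceEntry type and parseTraceLine function are hypothetical names, not part of Jacobin, and the code assumes that "TOS: -" denotes an empty operand stack.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// TraceEntry holds the fields of one -trace:inst log line.
// The type and field names are illustrative, not Jacobin's own.
type TraceEntry struct {
	Class    string
	Method   string
	PC       int
	Opcode   string
	TOSIndex int    // -1 when the line shows "TOS: -" (assumed: empty stack)
	TOSValue string // type and value text as printed; "" when the stack is empty
}

// traceLine matches: <class> meth: <method> PC: <n>, <opcode> TOS: <rest>
var traceLine = regexp.MustCompile(
	`^(\S+)\s+meth:\s+(\S+)\s+PC:\s*(\d+),\s+(\S+)\s+TOS:\s*(.*)$`)

// parseTraceLine splits one trace line into its fields.
func parseTraceLine(line string) (TraceEntry, error) {
	m := traceLine.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return TraceEntry{}, fmt.Errorf("unrecognized trace line: %q", line)
	}
	pc, err := strconv.Atoi(m[3])
	if err != nil {
		return TraceEntry{}, err
	}
	e := TraceEntry{Class: m[1], Method: m[2], PC: pc, Opcode: m[4], TOSIndex: -1}

	tos := strings.TrimSpace(m[5])
	if tos == "-" || tos == "" {
		return e, nil // nothing on the stack
	}
	// The first token is the stack position; the rest is the type and value.
	idx, rest, _ := strings.Cut(tos, " ")
	n, err := strconv.Atoi(idx)
	if err != nil {
		return TraceEntry{}, fmt.Errorf("bad TOS index in %q", line)
	}
	e.TOSIndex = n
	e.TOSValue = strings.TrimSpace(rest)
	return e, nil
}

func main() {
	line := "java/lang/StringLatin1 meth: inflate    PC:   5, ILOAD      TOS:  0 int64 22 "
	e, err := parseTraceLine(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", e)
}
```

A regular expression is used here instead of fixed column offsets because the column alignment in the sample varies slightly from line to line (compare the toCharArray entries with the others).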

Testing

As stated in our previous posts, we’re deeply committed to testing. Currently, Jacobin uses a testbed of 618 tests: 525 unit tests and an additional 93 tests in the Jacotest suite. Even at this level, we’re not satisfied with the depth of coverage, and we expect to continue expanding the testing aggressively.

By the Numbers

Jacobin consists of 11,097 lines (this includes code, comments, and blank lines). The 525 unit tests represent 21,465 lines. The Jacotest suite consists of an additional 21,921 lines (mostly Java). This totals 43,386 lines of testing code, which means our test code is currently 3.91x the size of our production code. We aim to increase that ratio as we move forward.
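The arithmetic behind that ratio is easy to check. The short Go sketch below simply redoes the sum and the division with the figures quoted above; the function name testToCodeRatio is made up for this illustration.

```go
package main

import "fmt"

// testToCodeRatio sums the test line counts and divides the total
// by the number of production lines.
func testToCodeRatio(prodLines int, testLines ...int) (total int, ratio float64) {
	for _, n := range testLines {
		total += n
	}
	return total, float64(total) / float64(prodLines)
}

func main() {
	// 11,097 production lines; 21,465 unit-test lines; 21,921 Jacotest lines.
	total, ratio := testToCodeRatio(11097, 21465, 21921)
	fmt.Printf("test lines: %d, ratio: %.2fx\n", total, ratio)
	// prints "test lines: 43386, ratio: 3.91x"
}
```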

So, where do we stand?

We’re not quite ready for users to begin testing Jacobin. In this coming year, we aim to ship a release that you can try out and test with your own Java classes. At that point, we’ll pivot to improving performance. (If you want to jump the gun, though, you can always download the code and do a build. Instructions are on the release page.)

If you want to help the project, we’d love a star on GitHub (this helps keep our motivation high), and perhaps you could let others know about the project.