Wednesday, August 09, 2023

Jacobin at the 2-year Mark

Jacobin (a JVM written entirely in Go) just reached its 2-year anniversary. Since our 18-month update, a lot has happened. We have:

  • Added instantiation of non-static classes
  • Added support for superclasses
  • Implemented the JDK’s native math libraries in Go
  • Added support for multidimensional arrays
  • Added support for compact strings
  • Extended the interpreter to handle 190 bytecodes (out of 203)
  • Defaulted to using the classes and libraries bundled with the OpenJDK
  • Added significant instruction-level tracing capabilities (see below)

What we’re working on now and taking up shortly:

  • Making sure that our test suites generate the same results as the OpenJDK JVM
  • Adding the final bytecodes to the interpreter. (Some of these are very complicated, so they will likely take a while.)
  • Adding exception handling
  • Adding support for interfaces

Even before these goals are attained, we expect to start running benchmarks and third-party test suites on Jacobin.

Much of the good progress we’ve made since our 18-month update is due to the addition of Richard Elkins (@texadactyl) to the team. He implemented the JDK’s native math libraries and has created a test suite, Jacotest, which grinds on existing and upcoming features.

Tracing and Peering into the JVM

Our progress remains very much aligned with the original goals for Jacobin: a JVM capable of running Java 17 programs, written entirely in Go with no dependencies, delivered as a small executable from a cohesive, extensively commented codebase.

At present, Jacobin is a 3.9MB executable that is tested daily on Windows, Linux, and macOS. Because it’s a single codebase, we have the pleasure of loading it into our IDE (GoLand, kindly provided by JetBrains) and stepping through the execution of a class bytecode by bytecode, following the execution path across classes and libraries.

To give us a roadmap, we expanded our already detailed instruction tracing to show the values on the operand stack and other useful details. Here is a sample of the tracing log (available by specifying the -trace:inst option on the command line):

 

java/lang/StringLatin1 meth: inflate    PC:  30, GOTO       TOS:  - 
java/lang/StringLatin1 meth: inflate    PC:   3, ILOAD      TOS:  - 
java/lang/StringLatin1 meth: inflate    PC:   5, ILOAD      TOS:  0 int64 22 
java/lang/StringLatin1 meth: inflate    PC:   7, IF_ICMPGE  TOS:  1 int64 22 
java/lang/StringLatin1 meth: inflate    PC:  33, RETURN     TOS:  - 
java/lang/StringLatin1 meth: toChars    PC:  14, ALOAD_1    TOS:  - 
java/lang/StringLatin1 meth: toChars    PC:  15, ARETURN    TOS:  0 Object  
java/lang/String       meth: toCharArray PC: 14, GOTO       TOS:  0 Object  
java/lang/String       meth: toCharArray PC: 24, ARETURN    TOS:  0 Object  
main                   meth: main       PC:  41, ASTORE     TOS:  0 Object: &{{68288800 0} <nil> [{[I 0xc000004450}]}
main                   meth: main       PC:  43, GETSTATIC  TOS:  - 

 

(Some entries removed for simplicity.) Each line of this listing shows, from left to right, the class name, the method name, the program counter (PC, the offset within the method of the bytecode being executed), the bytecode, and the value on the top of the stack (TOS). Here, TOS: 0 means there is one item on the stack (at position 0); its type and value are shown immediately to the right (or on the next line in case of line wrapping).
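As a rough illustration of that layout, here is a minimal Go sketch that splits one trace line into its fields. It is not part of Jacobin: the `traceEntry` type, the `parseTraceLine` and `stackDepth` helpers, and the regular expression are all hypothetical, inferred only from the sample lines above. It also assumes that `TOS: -` denotes an empty operand stack, which the sample suggests but this post does not state.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// traceEntry holds the fields of one instruction-trace line.
// Hypothetical type for this sketch; not taken from the Jacobin codebase.
type traceEntry struct {
	Class  string // e.g. java/lang/StringLatin1
	Method string // e.g. inflate
	PC     int    // program counter shown in the trace
	Opcode string // e.g. ILOAD
	TOS    string // raw text after "TOS:", e.g. "0 int64 22" or "-"
}

// Pattern inferred from the sample output; an assumption, not a format spec.
var traceLine = regexp.MustCompile(
	`^(\S+)\s+meth:\s+(\S+)\s+PC:\s*(\d+),\s+(\S+)\s+TOS:\s*(.*)$`)

// parseTraceLine returns the parsed entry and true, or false if the
// line does not match the assumed layout.
func parseTraceLine(line string) (traceEntry, bool) {
	m := traceLine.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return traceEntry{}, false
	}
	pc, err := strconv.Atoi(m[3])
	if err != nil {
		return traceEntry{}, false
	}
	return traceEntry{
		Class:  m[1],
		Method: m[2],
		PC:     pc,
		Opcode: m[4],
		TOS:    strings.TrimSpace(m[5]),
	}, true
}

// stackDepth converts the TOS text into a count of stack items:
// "0 int64 22" means index 0 is the top, so one item is present.
// Anything that does not start with a number (such as "-") is
// treated as an empty stack, which is an assumption of this sketch.
func stackDepth(tos string) int {
	fields := strings.Fields(tos)
	if len(fields) == 0 {
		return 0
	}
	idx, err := strconv.Atoi(fields[0])
	if err != nil {
		return 0
	}
	return idx + 1
}

func main() {
	line := "java/lang/StringLatin1 meth: inflate    PC:   7, IF_ICMPGE  TOS:  1 int64 22 "
	if e, ok := parseTraceLine(line); ok {
		fmt.Printf("%s.%s pc=%d op=%s depth=%d\n",
			e.Class, e.Method, e.PC, e.Opcode, stackDepth(e.TOS))
	}
}
```

Keeping the TOS text as a raw string sidesteps the object dumps that appear on some lines, whose layout the sample does not pin down.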

Notice that in this excerpt, execution starts in java.lang.StringLatin1.inflate() and eventually returns to the calling method, toCharArray() in java.lang.String. When this completes, it returns to the main method in the class called main, which is loaded with a pointer to an object that consists of an array of integers (in this particular case, an array of chars that form a string).
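One way to read an execution path like this out of a longer trace is to collapse consecutive entries from the same method into a single step. The Go sketch below does that for the class and method pairs in the excerpt. The `frame` type and `methodPath` function are hypothetical names used only for illustration and are not part of Jacobin.

```go
package main

import "fmt"

// frame identifies the class and method a traced instruction belongs to.
// Hypothetical type introduced for this sketch.
type frame struct {
	Class, Method string
}

// methodPath collapses runs of consecutive entries from the same
// class and method into one step, yielding the sequence of methods
// that execution passed through.
func methodPath(entries []frame) []string {
	var path []string
	var prev frame
	for i, e := range entries {
		if i == 0 || e != prev {
			path = append(path, e.Class+"."+e.Method)
		}
		prev = e
	}
	return path
}

func main() {
	// Class and method of each line in the excerpt, in order.
	entries := []frame{
		{"java/lang/StringLatin1", "inflate"},
		{"java/lang/StringLatin1", "inflate"},
		{"java/lang/StringLatin1", "inflate"},
		{"java/lang/StringLatin1", "inflate"},
		{"java/lang/StringLatin1", "inflate"},
		{"java/lang/StringLatin1", "toChars"},
		{"java/lang/StringLatin1", "toChars"},
		{"java/lang/String", "toCharArray"},
		{"java/lang/String", "toCharArray"},
		{"main", "main"},
		{"main", "main"},
	}
	for _, step := range methodPath(entries) {
		fmt.Println(step)
	}
}
```

This only shows the order in which methods appear in the trace. It does not distinguish a call from a return, which would need the opcode as well.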

Testing

As stated in our previous posts, we’re deeply committed to testing. Currently, Jacobin uses a testbed of 618 tests: 525 unit tests and an additional 93 tests in the Jacotest suite. Even at this level, we’re not satisfied with the depth of coverage, and we expect to continue expanding the testing aggressively.

By the Numbers

Jacobin consists of 11,097 lines (this includes code, comments, and blank lines). The 525 unit tests represent 21,465 lines. The Jacotest suite consists of an additional 21,921 lines (mostly Java). This totals 43,386 lines of testing code, which means our test code is currently 3.91x the size of our production code. We aim to increase that ratio as we move forward.
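For readers who want to check the arithmetic, the short Go sketch below recomputes the total and the ratio from the line counts quoted above. The `testRatio` helper is a hypothetical name for this example.

```go
package main

import "fmt"

// testRatio returns test lines divided by production lines.
func testRatio(testLines, prodLines int) float64 {
	return float64(testLines) / float64(prodLines)
}

func main() {
	const (
		production = 11097 // Jacobin production code
		unitTests  = 21465 // the 525 unit tests
		jacotest   = 21921 // the Jacotest suite
	)
	total := unitTests + jacotest
	// Prints: 43386 test lines, ratio 3.91x
	fmt.Printf("%d test lines, ratio %.2fx\n", total, testRatio(total, production))
}
```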

So, where do we stand?

We’re not quite ready for users to begin testing Jacobin. In this coming year, we aim to ship a release that you can try out and test with your own Java classes. At that point, we’ll pivot to improving performance. (If you want to jump the gun, though, you can always download the code and do a build. Instructions are on the release page.)

If you want to help the project, we’d love a star on GitHub (this helps keep our motivation high), and perhaps you could let others know about the project.

Tuesday, February 14, 2023

Jacobin JVM at 18 months

Earlier this month, the Jacobin JVM project (a JVM written in Go) reached its 18-month milestone. Since our post at the 12-month mark, we have added support for numerous Java bytecodes to the interpreter, including all the bytecodes for longs, floats, doubles and their operations, all the bit manipulations, and all operations on single-dimensional arrays of primitives. We've implemented 176 bytecodes at present and expect to finish up the remaining ones we need during the coming six months.

At present, Jacobin can execute simple static classes, which is enough to allow us to test functionality and to begin running benchmarks. While performance has not in any way been a goal during our work, as we get closer to finishing the interpreter, it will assume greater importance. @suresk is already sketching out an observability client, similar to VisualVM and other tools, to guide our optimization work. 

Jacobin continues to meet our initial goals: it is written entirely in Go and has no dependencies. It runs fast and the executable is only 3.1MB (on Windows). It runs Java class files and JARs compiled by Java 7 through Java 17.

By the numbers

Jacobin's codebase consists of 25,813 lines (which include code, comments, and blank lines). As mentioned in earlier posts, we have a very deep commitment to testing, as shown by the fact that this codebase includes 18,015 lines of testing code for the 7,798 lines of production code. This is a ratio of testing code to production code of 2.31x -- our highest to date (as we set out to do in earlier posts). Those 18K lines represent 429 unit and integration tests.

Easy Things You Can Do to Help

While Jacobin is still in pre-alpha mode, if you choose to build it or run one of the posted executables on GitHub, we’d love your feedback. We respond quickly to any and all feedback and questions. In this regard, Richard Elkins (@texadactyl) deserves our heartfelt thanks for running Jacobin on various test files and sharing his results with us.

If you’d just like to show your support for the project, we'd love a star on GitHub. Knowing people are interested in Jacobin really helps keep our motivation and spirits high. If you're on Twitter, please follow our handle (@jacobin_jvm) to keep abreast of what we’re doing.