Sunday, September 01, 2024

 Jacobin at the 3-year Mark

A Brief Look at the History

The project to build a JVM in Go, Jacobin, has just reached its third anniversary. This is a good time to reflect a bit on the project. Three years ago, I started work on Jacobin with the belief that I could have a JVM that would run Java 11 code seamlessly in 18-24 months. I am a bit chastened by how much I underestimated the timeline and the difficulty! (In the meantime, Jacobin has advanced to support Java 17.)

Initially, as I believe most JVM projects do, I wrote the parser for class files. Because Oracle's JVM spec is clear and greatly detailed about class anatomy, this work moved along quickly. Then came the implementation of the interpreter: interpreting and running the bytecodes.

This required a huge amount of code to get all the basics in place: not only the interpreter itself, but also the classloaders, method tables, frames and the frame stack, and many other small items--all of which had to be written accurately and integrated correctly with the other bits of the system. 
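For readers who have not seen the inside of a bytecode interpreter, the core idea can be sketched in a few dozen lines. This is a toy under loud assumptions: the frame type and run function are hypothetical, only five opcodes are handled, and there are no classloaders, method tables, locals, or stack-underflow checks. The opcode values themselves are the real ones from the JVM spec.

```go
package main

import (
	"errors"
	"fmt"
)

// A handful of real JVM opcode values; all others are omitted.
const (
	iconst2 = 0x05 // push the int constant 2
	bipush  = 0x10 // push the next byte as a signed int
	iadd    = 0x60 // pop two ints, push their sum
	imul    = 0x68 // pop two ints, push their product
	ireturn = 0xAC // pop an int and return it
)

// frame is a hypothetical, stripped-down method frame: bytecode,
// a program counter, and an operand stack.
type frame struct {
	code  []byte
	pc    int
	stack []int32
}

func (f *frame) push(v int32) { f.stack = append(f.stack, v) }

// pop assumes the stack is non-empty; a real interpreter would check.
func (f *frame) pop() int32 {
	v := f.stack[len(f.stack)-1]
	f.stack = f.stack[:len(f.stack)-1]
	return v
}

// run executes the frame's bytecode until IRETURN or an error.
func run(f *frame) (int32, error) {
	for f.pc < len(f.code) {
		op := f.code[f.pc]
		f.pc++
		switch op {
		case iconst2:
			f.push(2)
		case bipush:
			f.push(int32(int8(f.code[f.pc]))) // operand is a signed byte
			f.pc++
		case iadd:
			b, a := f.pop(), f.pop()
			f.push(a + b)
		case imul:
			b, a := f.pop(), f.pop()
			f.push(a * b)
		case ireturn:
			return f.pop(), nil
		default:
			return 0, fmt.Errorf("unsupported opcode 0x%02X", op)
		}
	}
	return 0, errors.New("fell off the end of the bytecode")
}

func main() {
	// Bytecode equivalent to: return (2 + 40) * 2;
	code := []byte{iconst2, bipush, 40, iadd, iconst2, imul, ireturn}
	result, err := run(&frame{code: code})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result) // prints 84
}
```

The loop itself is simple. The bulk of the work is in everything the loop depends on, which is why the supporting machinery takes so much code.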

The interpreter is almost finished. We've implemented 202.5 of the 204 bytecodes. INVOKEINTERFACE explains the decimal fraction--it's almost complete. The final remaining bytecode is INVOKEDYNAMIC, which, we expect, will not be finished before year-end. Maybe later, alas. It's a beast!

My principal collaborator, Richard Elkins, has made two enormous contributions (and many, many smaller ones). The first is that he wrote (and keeps extending, dang it!) a suite of end-to-end integration tests that give Jacobin a heavy workout. These are the kinds of tests that throw and catch an exception 100 times, or throw, catch, and rethrow a dozen layers deep to uncover seams or not-quite-correct details. These tests (along with 700+ unit tests) have pushed us to refine the implementation and make sure everything works correctly.

That is, except in one area where we have struggled greatly and where Richard has been working steadfastly for a long time. This is an issue I entirely failed to anticipate: how much of the standard JDK library is written in native code--that is, not in Java.

At first, we chose to reimplement those methods in Go. But that proved unsatisfactory for several reasons, most especially that one small native method could plug into a warren of rabbit holes, each with its own set of native functions that were all needed just to make the one original method work.

Recently, we've begun experimenting with using the PureGo project to bridge from Jacobin to the native function libraries in the JDK. If we can get it to work suitably for our needs, it would remove the pressure to reimplement the core Java libraries in Go.

The other unexpected problem (which we have solved) was static initialization blocks. They are a feature of Java that is almost never used by developers, but which is extensively employed by the JVM and the JDK libraries. Unfortunately, the JVM spec barely mentions static initialization blocks, so we had to proceed somewhat blindly until we worked out the details and got the blocks running as the JDK expects.
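For context, the Java compiler gathers a class's static initialization blocks into a special method named <clinit>, which the JVM must run once before the class is first used, after initializing the superclass. Here is a minimal single-threaded sketch of that ordering. The class type and ensureInitialized function are hypothetical names for this illustration. It is not how Jacobin does it, and it omits the locking and error handling a real JVM needs.

```go
package main

import "fmt"

type initState int

const (
	notInitialized initState = iota
	initializing
	initialized
)

// class is a hypothetical stand-in for a loaded class.
type class struct {
	name   string
	super  *class
	state  initState
	clinit func() // stands in for the compiled <clinit> method; may be nil
}

// ensureInitialized runs the superclass initializer first, then this
// class's own, and does so at most once per class.
func ensureInitialized(c *class) {
	if c == nil || c.state != notInitialized {
		return // nothing to do, already done, or a recursive request
	}
	c.state = initializing
	ensureInitialized(c.super)
	if c.clinit != nil {
		c.clinit()
	}
	c.state = initialized
}

func main() {
	var order []string

	base := &class{name: "Base"}
	base.clinit = func() { order = append(order, "Base.<clinit>") }

	derived := &class{name: "Derived", super: base}
	derived.clinit = func() { order = append(order, "Derived.<clinit>") }

	ensureInitialized(derived)
	ensureInitialized(derived) // second call is a no-op

	fmt.Println(order) // prints [Base.<clinit> Derived.<clinit>]
}
```

The intermediate "initializing" state is there so that a class whose initializer refers back to itself does not trigger a second run.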

Our goal now is to finish the last of the bytecodes, address the open seams revealed by Richard's test suite and then run the Computer Language Benchmarks. Once those are running, we'll begin inviting folks who follow the project to do early testing. (If you want to be among those early testers, give us a star on GitHub and open a ticket. Anyone who's opened a ticket with us is automatically enrolled in the alpha and beta programs.)

What's the Motivation for all this Effort?

I started this project because I thought it would be great to expand my knowledge of Java by understanding the JVM better. It was an educational passion project. But as the years have gone by, a new and more useful mission has emerged: Jacobin is the only Java 17-capable JVM written entirely in one language. This means that you can download the source code, compile it, run it in your IDE, and see Java instructions executing one at a time. That's really quite cool!

We've decided to explore making an elegant UI as a viewport into Jacobin, so that if you want to know exactly how your Java program executes, you can observe the whole thing at the level of detail you want. 

In the official JDKs (those based on OpenJDK), this is very difficult for the typical developer. Those JDKs are written in multiple languages and the code is difficult to follow. While heavily commented in places, those comments are aimed at JDK experts who have full contextual information. (This is not a critique, I should add. With 100+ developers working on the JDK, comments are necessarily oriented towards developers with the requisite knowledge and background.)

Jacobin, by contrast, is written in one language, heavily commented and, we believe, approachable by the average developer. The trade-off will be speed and a few missing features, mostly detailed on the project's GitHub status page.

That page also shows the overall project status. 

The Last Six Months

Jacobin 0.5.0, which first saw life on 28 Feb of this year, added all missing bytecodes other than INVOKEINTERFACE and INVOKEDYNAMIC, implemented the full exception throw and catch mechanisms, interned Java strings, added the Java file I/O libraries, and began deep exploration of PureGo for calling native methods.

Testing

As we've discussed many times in these updates, we're close to fanatical about testing. We currently rely on 712 unit tests and 155 end-to-end integration tests, for a total of 867 tests (up from 708 six months ago).

By the Numbers

Jacobin consists of 54,156 lines (including code, comments, and blank lines). The Jacotest suite consists of 28,486 lines, for a total project size of 82,642 lines. Of those, 25,588 lines are production code and 57,054 are testing code. This means our testing code is 223% the size of production code. This ratio is down from six months ago, when we were over 300%. The decrease is due to the great number of native functions we've translated into Go but will not test extensively until we find the definitive approach to integrating those functions with Jacobin (as touched on briefly above). Once that's worked out, those functions will see our usual heavy testing.

If you'd like to show your support for Jacobin JVM, we'd love a ⭐ on GitHub. That helps keep our motivation high! If you want more frequent updates, please follow us on Twitter (@jacobin_jvm).