Yesterday I spent some time thinking about the possibility of clustering applications written in "JRuby":http://jruby.codehaus.org/ with "Terracotta":http://www.terracotta.org/. Sounds like a crazy idea? Well, I don't know. Perhaps it is. But thinking ahead a bit, possible future deployments of "Ruby on Rails":http://www.rubyonrails.org/ applications on JRuby make it a bit more interesting. Anyway, let's give it a try.
h1. Chatter sample application
First let's start with writing a **very** simple chat application in JRuby.
h1. Run it
Now let's run it.
Of course, this application is completely useless. It is in-process, single-threaded and only receives input from STDIN, which means that it can only be used by one single user at a time. But what if we could make a single instance of the @messages list available on the network and then run multiple instances of the application (each on its own JVM instance, even on multiple machines) and have each one of them use this shared list?
This would solve our problem, and it is actually exactly what Terracotta could [conceptually] do for us. But Terracotta is a Java infrastructure service without any bindings or support for JRuby. So how can we proceed?
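
For reference, a single-process chat of the kind described in this section could be sketched roughly as follows. This is an illustration only, not the original Chatter listing: the class layout, the constructor parameters and the injectable input/output streams are all assumptions made so the sketch can run on its own in plain Ruby.

```ruby
# Hypothetical single-process chat; every name here is illustrative.
class Chatter
  attr_reader :messages

  def initialize(user, input: $stdin, output: $stdout)
    @user = user
    @input = input
    @output = output
    @messages = []   # in-process list; nothing is shared yet
  end

  # Read lines until EOF, store each non-empty one, and print the history.
  def run
    while (line = @input.gets)
      text = line.chomp
      next if text.empty?
      @messages << "#{@user}: #{text}"
      @output.puts @messages
    end
  end
end
```

Calling @Chatter.new("jonas").run@ reads from STDIN until end of input. Because @messages lives inside one process, two running copies never see each other's messages, which is the limitation discussed above.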
h1. Terracotta for JRuby
I can see two different ways we can bridge JRuby and Terracotta:
h2. Create a pattern language
We could hook into and extend the Terracotta pattern matching language (the language that is used to pick out the shared state and the methods modifying it) to support the Ruby syntax. This would mean allowing the user to define the patterns based on the Ruby language but then, under the hood, actually map the Ruby syntax to the real Java classes that are generated by JRuby (this assumes using the compiler and not the interpreter). The benefit here is that it would be "transparent", in the same way as it is for regular Java applications. This is perhaps the best solution long term, but requires quite a lot of work and requires a fully working JRuby compiler. The development of the JRuby compiler has just started. When I tried it today it did not even compile the samples shipped with the distribution, so the Terracotta support for the compiler naturally has to wait until the implementation gets more complete and stable.
h2. Create a JRuby API
The minimal set of abstractions we need in this API is:
1. State sharing: Be able to pick out the state that should be shared - i.e. the top-level "root" object of a shared object graph
2. State guarding: Be able to safely update this graph within the scope of a transaction
Ok, let's try to design this API.
h1. Designing the JRuby API
Most people are probably not aware that Terracotta actually has an API that can do all this (and much more) for us. It is well hidden in the "SVN repository":http://svn.terracotta.org/fisheye/browse/Terracotta and is used internally as the API called by the hooks that are added to the target application during the bytecode instrumentation process.
The class in question is "ManagerUtil":http://svn.terracotta.org/fisheye/browse/Terracotta/branches/2.2.0/community/open/code/base/dso-l1/src/com/tc/object/bytecode/ManagerUtil.java.
So, everything we need is in the ManagerUtil class. But before we roll up our sleeves and start hacking on the glue code, let's take a step back and think through what we want out of the API from a user's perspective.
First we need to be able to create a "shared root", i.e. a top-level instance of a shared object graph. One way of doing it would be to create some sort of factory method that can do the heavy lifting for us, similar to this:
Here we told some factory method createRoot to create a root instance of the type java.util.ArrayList. Seems reasonable, I think.
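
To make the idea of a root factory concrete, here is a minimal in-process sketch in plain Ruby. It does not talk to Terracotta at all: the RootSketch module, its create_root method and the idea of identifying a root by a name are assumptions for illustration, and a plain Ruby Array stands in for java.util.ArrayList.

```ruby
# In-process stand-in for a shared-root factory. Nothing here talks to
# Terracotta; ROOTS is a hypothetical registry used only for illustration.
module RootSketch
  ROOTS = {}
  LOCK = Mutex.new

  # Return the root registered under `name`, creating it from the block
  # the first time that name is seen.
  def self.create_root(name)
    LOCK.synchronize { ROOTS[name] ||= yield }
  end
end

first  = RootSketch.create_root("messages") { [] }
second = RootSketch.create_root("messages") { [] }
first << "hello"   # visible through `second` too, since both are one object
```

The point of the sketch is the "look up or create" behaviour: whoever asks for the root second gets the instance that already exists. In a real cluster that lookup would have to span JVMs, which is the part Terracotta would provide.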
The other thing we need to do is to create some sort of transaction. We want to be able to start a transaction, lock the [shared] instance being updated, update it, unlock the instance and finally commit the transaction. Here is some basic pseudo code:
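
The steps just listed can be spelled out by hand in plain Ruby. This is a sketch under stated assumptions: a Mutex stands in for the lock on the shared instance, a plain Array stands in for the shared instance, and the log array exists only to show the order of the steps. There is no real transaction or replication here.

```ruby
target = []            # stand-in for the shared instance
lock = Mutex.new       # stand-in for the lock on the target
log = []               # records the order of the steps (illustration only)

log << :begin_transaction
lock.lock
log << :lock_target
begin
  target << "hello"    # modify target: the only business logic
  log << :modify_target
ensure
  lock.unlock          # always release the lock, even if the update raises
  log << :unlock_target
end
log << :commit_transaction
```

Only one line of this does anything the application cares about; everything else is bookkeeping.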
All steps except 'modify target' are done by infrastructure code. Code that has nothing to do with your business logic, code that we want to eliminate and "untangle" as much as possible.
In Ruby, a common design pattern to accomplish this is to use a method that takes a "block/closure":http://en.wikipedia.org/wiki/Closure_(computer_science). Using this pattern we could design the API to look something like this:
Here the semantics are:
* the @messages list is guarded for concurrent access (both in a single JVM and in the cluster)
* the 'do' keyword takes the lock (on @messages) and initiates the transaction (unit of work)
* all updates to @messages that are done within the 'do-end' block are recorded in a change set (and are guaranteed to be done in isolation)
* when the 'end' of the block is reached, the transaction is committed, the change set is replicated and the lock (on @messages) is released
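
The block-taking pattern itself can be sketched in plain Ruby. This is an in-process stand-in only: GuardSketch and its per-object Mutex table are assumptions, and the sketch covers just the "take the lock, run the block, release the lock" part of the semantics. It records no change set and replicates nothing.

```ruby
# In-process stand-in for a guarded block. One Mutex per guarded object,
# looked up by object identity; all names here are hypothetical.
module GuardSketch
  LOCKS = Hash.new { |hash, key| hash[key] = Mutex.new }
  REGISTRY_LOCK = Mutex.new

  # Run the block while holding the lock for `target`; the lock is
  # released when the block ends, and the block's value is returned.
  def self.guard(target)
    lock = REGISTRY_LOCK.synchronize { LOCKS[target.object_id] }
    lock.synchronize { yield }
  end
end

messages = []
threads = 4.times.map do |i|
  Thread.new do
    100.times { |n| GuardSketch.guard(messages) { messages << "user#{i}: #{n}" } }
  end
end
threads.each(&:join)
count = GuardSketch.guard(messages) { messages.size }
```

Note that a plain Mutex is not reentrant, so nesting two guards on the same object would raise an error in this sketch; Java monitors behave differently in that respect.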
h1. Implementing the JRuby API
Now, let's take a look at the ManagerUtil class. It has a lot of useful stuff but the methods that we are currently interested in have the following signatures:
Based on these methods we can create the following JRuby module implementing the requirements we outlined above (and let's put it in a file called 'terracotta.rb' to be able to easily include it into the applications we want):
ManagerUtil has a whole bunch of other useful and cool methods, such as deepClone for creating optimistic concurrency transactions, etc. But I'll leave these for a later blog post.
h1. Creating a distributed version of the Chatter application
Great. Now we have a JRuby API with all the abstractions needed to create a distributed version of our little chat application. What we have to do is simply to create the @messages variable using the factory method for creating roots in the API. Then we also have to make sure that we guard the updating of the shared java.util.ArrayList using a guarded block.
Let's take a look at the final version of the application.
Pretty simple and fairly intuitive, right?
**Note 1**: Actually, we could have made it even simpler if we had taken a java.util.concurrent.LinkedBlockingQueue instead of a regular java.util.ArrayList as the list to hold our messages. Had we done that, we could have skipped the DSO.guard block altogether, since the Java concurrency abstractions are natively supported by Terracotta. But then I would have missed the opportunity to show you how to handle non-thread-safe data access.
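
As an in-process analogy only, plain Ruby's thread-safe Queue shows what "no explicit guard needed" looks like. This is not Terracotta and not java.util.concurrent.LinkedBlockingQueue; it is just a structure that does its own locking, used here to illustrate the point.

```ruby
require 'thread'   # harmless on current Rubies, needed on very old ones

queue = Queue.new  # thread-safe: pushes need no surrounding guard block
writers = 4.times.map do |i|
  Thread.new { 50.times { |n| queue << "user#{i}: #{n}" } }
end
writers.each(&:join)
total = queue.size
```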
**Note 2**: It currently only works for sharing native Java classes. In other words, you can currently **not** cluster native JRuby constructs, since that would mean clustering the internals of the JRuby interpreter (which is a lot of work and most likely not possible without rewriting parts of the interpreter). However, one feasible approach would be to not use the JRuby interpreter but the JRuby compiler and cluster its generated Java classes - but unfortunately the compiler is not ready for general use yet (see the footnote at the bottom).
h1. Enable Terracotta for JRuby
In order to enable Terracotta for JRuby we have to add a couple of JVM options to the startup of our application, or add them directly in the jruby.(sh|bat) script.
You have to change:
I know what you are thinking:
"Hey! What's up with that tc-config.xml file? Do I have to write that? I hate XML!"
Yes, unfortunately you have to feed Terracotta with a tiny bit of XML. This can perhaps be eliminated in the future (and replaced with a couple more command-line options or a JRuby config). But for now you have to write an XML file that looks like this:
As you can see it contains the name of the server where the Terracotta server resides, the path to the logs and a statement that says: include all classes for instrumentation. That's it.
Now let's run it.
h1. Run the distributed Chatter
Here is a sample of me trying it out with my wife Sara. It shows my session window:
Well, I hope that you find it a bit more exciting...
Feel like helping out? Drop me a line.
h1. And one more thing...
I also had some fun porting the JTable sample in the Terracotta distribution. It worked out nicely. I only had one problem and that was that I encountered a JRuby bug when trying to create a multi-dimensional java.lang.Object array. I was able to work around the bug by creating the array using reflection, but this unfortunately had the effect of making the final code much longer. Anyway, here is the code in case you want to try it out: