
JBoss and mod_cluster

Background on Clustering HTTP

JBoss clusters, as we all know.  You can fire up a farm of AS5 worker nodes, and they'll find each other (sometimes with the help of a Gossip router).  They'll share HTTP sessions and such, through the magic of JBoss Cache.

But then you end up with a farm of distinct HTTP listeners out there, each with their own IP address.  So we jam a proxy out front, normally, which dispatches requests to any one of the workers on the farm.

But then your proxy has to know all about the workers.  With many generic solutions, that means maintaining a list of worker nodes, normally by hand or with a ball of bash scripts.

Let me introduce JBoss mod_cluster, though, which goes a long way to making clustering a simple, happy, joyous event.

JBoss mod_cluster

While JBoss mod_cluster has a few different modes of operation, from standalone to HA, using HTTP or AJP to chat with the back-ends, we'll be looking at the top-of-the-line implementation, since I get to play with all the big toys.

First, as the name implies, mod_cluster is a module for Apache httpd.  In fact, it's a set of modules that work with mod_proxy and mod_proxy_ajp.

The Apache httpd configuration can be super-simple:

<VirtualHost *:80>
  ServerName test.local.oddthesis.org

  ManagerBalancerName mycluster
  ServerAdvertise off

  <Proxy *>
  </Proxy>

</VirtualHost>

I turn off proxy server advertising because multicast isn't available to me (see below).

In the HA mode, mod_cluster takes advantage of the fact that your cluster knows itself.  A worker is responsible for providing the entire cluster view to the front-end httpd processes.  It also tells the rest of the cluster which proxy front-ends it knows about.

Then mod_cluster pipes requests through mod_proxy_ajp dynamically to find their way to a worker.  You don't have to maintain worker lists yourself or through bash voodoo any more.  The front-end chats AJP with the workers, so things flow efficiently.  Add nodes, remove nodes, have nodes crash (never!), and the proxy responds.

[Figure: mod_cluster overview]

The source distribution seems to include httpd and a lot of its own dependencies.  I was able to compile just the modules, and they work fine in a stock Fedora 10 build of Apache httpd.  I'll be publishing an RPM shortly, also.

Multicast

Some of the magic involves multicast, which I've decided is a tool of the devil.  It seems to be disabled by default in VMware, and is permanently disabled on EC2.  So it might as well not exist.

With mod_cluster, the default is for the proxies to advertise over multicast so that workers can find them initially.  This is awesome if your environment supports it. Mine doesn't.

But the AS5 portion of mod_cluster realizes that sometimes you can't use multicast.  So you can provide a list of proxies in the mod-cluster service within AS5.  I've decided to go with property substitution and modifying my JBoss boot script to check for $JBOSS_PROXY_LIST in /etc/jboss-as5.conf.  This gets passed on in and consumed at AS5's boot time.  Basically:

run.sh -Djboss.modcluster.proxyList="$JBOSS_PROXY_LIST"

And in the mod-cluster service configuration:

<!-- Configure this node's communication with the load balancer -->
<bean name="HAModClusterConfig" class="org.jboss.modcluster.config.ha.HAModClusterConfig" mode="On Demand">

  <!-- Comma-separated list of address:port entries listing the httpd servers
       where mod_cluster is running. -->
  <property name="proxyList">${jboss.modcluster.proxyList}</property>

  <!-- remaining properties omitted -->
</bean>
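The boot-script side of this could look roughly like the sketch below. It is not my actual init script: the function name proxy_list_arg is made up for illustration, and it assumes /etc/jboss-as5.conf is a plain shell fragment containing a line such as JBOSS_PROXY_LIST=host:port,host:port.

```shell
#!/bin/sh
# Hypothetical helper (not part of JBoss or mod_cluster): print the extra
# run.sh argument if the given config file defines JBOSS_PROXY_LIST.
# Assumes the config file is a shell fragment of VAR=value lines.
proxy_list_arg() {
    conf_file="$1"
    JBOSS_PROXY_LIST=""
    # Source the config only if it exists and is readable
    if [ -r "$conf_file" ]; then
        . "$conf_file"
    fi
    # Emit the system property only when a proxy list was actually set
    if [ -n "$JBOSS_PROXY_LIST" ]; then
        printf '%s\n' "-Djboss.modcluster.proxyList=$JBOSS_PROXY_LIST"
    fi
}

# Possible use in an init script (paths are assumptions):
# exec "$JBOSS_HOME/bin/run.sh" $(proxy_list_arg /etc/jboss-as5.conf)
```

When the variable is missing or empty, nothing is printed, so run.sh starts without the property and the node is left to find its proxies some other way.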

In an EC2 environment, my puppet recipe will grab the proxy list from the boot metadata and reset my /etc/jboss-as5.conf appropriately, perhaps.
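As a plain shell stand-in for that Puppet recipe (which I haven't written yet), the config rewrite might be sketched like this. The function name set_proxy_list is hypothetical, and the commented-out usage assumes the instance user data carries a JBOSS_PROXY_LIST=... line, which is purely my own convention.

```shell
#!/bin/sh
# Hypothetical helper: set JBOSS_PROXY_LIST in a shell-style config file,
# replacing any earlier assignment and leaving other lines alone.
set_proxy_list() {
    conf_file="$1"
    proxy_list="$2"
    tmp_file="$(mktemp)" || return 1
    # Keep every existing line except a previous JBOSS_PROXY_LIST assignment
    if [ -f "$conf_file" ]; then
        grep -v '^JBOSS_PROXY_LIST=' "$conf_file" > "$tmp_file" || true
    fi
    printf 'JBOSS_PROXY_LIST=%s\n' "$proxy_list" >> "$tmp_file"
    # Copy the contents into place so the target keeps its permissions
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# On EC2 the value might come from instance user data (assumed format):
# proxies="$(curl -s http://169.254.169.254/latest/user-data | sed -n 's/^JBOSS_PROXY_LIST=//p')"
# [ -n "$proxies" ] && set_proxy_list /etc/jboss-as5.conf "$proxies"
```

Run before AS5 boots, this leaves exactly one JBOSS_PROXY_LIST line in the file for the boot script to pick up.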

You may argue that we've just replaced the maintenance of a worker list with the maintenance of a proxy list.  Which is somewhat true.  But the proxy list tends to be smaller, more static, and less crashy.  Workers tend to grow, shrink and crash more often.  And if you do have multicast available to you, mod_cluster will sprinkle magic end-to-end, and no list maintenance is required at all.

Overall, mod_cluster is definitely another useful tool for running Java apps in scalable environments.

Comments

Marek Goldmann, 02:19pm UTC, 27 December 2008

Short, but helpful. Thanks!

P.S. I'm waiting for the RPM's :)

Bob McWhirter 03:33pm UTC, 27 December 2008

@Marek--

Monday, I'll wander into the Starbucks and get the RPMs online. I'm bandwidth-challenged, and pushing the whole set including the AS5 RPM along with both mod_cluster RPMs takes forever over satellite.

Bob McWhirter 07:56pm UTC, 29 December 2008

Bret McMillan, 02:04am UTC, 20 January 2009

Very cool; I wonder whether, in EC2 particularly, you could build your list of proxies via queries against a certain AMI type, or have it percolate through via Cache...

suresh 07:23pm UTC, 08 April 2009

Good one;
Is this specific to AS5?? or does it work with AS4??

I'm working on Jboss AS 4 cluster setup on EC2 and stuck!!!

Thanks,
Suresh

Marek Goldmann 01:23pm UTC, 09 April 2009

@suresh,

mod_cluster was designed to work especially well with JBoss AS 5, but it could be used with JBossWeb or Tomcat too, just see documentation pages on mod_cluster site for more info. Be aware that when you use it in Tomcat there are some disadvantages (only non-clustered mode supported for example), see previous link.

BTW, we're working on a JBoss AS 5 cluster solution for EC2. A version of JBoss-Cloud with EC2 support will be released really soon.

--Marek


Creative Commons License Copyright 2008. Odd Thesis by Bob McWhirter is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.