RMI and System.gc() Unplugged

 In Censum, Garbage Collection, Java

I’ve just finished tuning a client’s application where one of the items on the table was to find the source of calls to System.gc(). Using Censum made easy work of understanding the source of the calls. The team I was working with, not only missed that these calls to System.gc() were creating havoc with their end users experience, they didn’t even realize that something, somewhere was messing things up!

It’s not to say that developers are completely oblivious to the issue. Developers have gotten better about not placing calls to System.gc() in their code, but they often still don’t see the call being made in their GC logs. One of the confusing causes of unwanted calls to System.gc() comes from Java (typically Java EE) applications that have remote objects, and therefore use RMI. In the early days, these calls came into applications at the rate of 1 / minute and, as you can imagine, it caused a lot of consternation for those analyzing the performance of Java EE applications making use of Session or Entity beans! After much discussion, the call rate was reduced to the current rate of 1 / hour. Better, but it can still be disruptive as much as it can be useful. Let’s expose a little bit of how the JVM works by digging into how RMI plays with the Garbage Collectors.

The Remote Procedure Call (RPC) protocol (originally written in C) allowed one process to call an exposed function in another process. The function in the other process was expected to manage all of its resources on it own. For example, if it allocated memory, it was responsible for freeing it. The only piece of management that relied on the client doing the right thing was to shutdown the connection. Even then, a broken socket would cause the RPC library to close down the connection on its own.

Move forward and we can see that the RPC style of remoting is re-implemented in Java with RMI and CORBA. Together, these protocols expose classes that you can instantiate and interact with, in a remote JVM. Without getting in too deeply into how all this works, one can quickly see that this leaves us with a problem. The problem of course being that in Java the garbage collector will only clean up objects that are unreachable. For a remote JVM to access a remote object we must have that remote JVM hold a pointer to it on our behalf. This implies that to clean up a remote object, the remote JVM has to signal when it’s finished with that remote object. Given the nature of the resources needed to make all this work, the implementors of RMI decided that they should force a call to a collection of tenured space on a regular basis. In practical terms this means, RMI makes regular calls to System.gc().

If you look deep inside RMI you’ll run across the class ObjectTable. That class is responsible for retaining the roots of all objects that support a remote JVM making a call against an object on a local JVM. All of the objects added to this table are wrapped in a WeakReference. Further more, ObjectTable manages it’s own ReferenceQueue using an internal Reaper (implements Runnable). In addition, the connection classes from java.io all rely on finalization to clean up resources such as file handles. As well as the ObjectTable holding onto the roots for the Distributed Garbage Collector (DGC – which cleans up for RMI), all newly disconnected objects must be identified by a regular garbage collection cycle, cleaned out of their respective ReferenceQueue, and then reclaimed with a subsequent collection.

To support all of this, you can find a declaration to the class GC.LatencyRequest. The class sun.misc.GC class is a singleton whose primary function is to make sure that a collection of tenured has run within a fixed period of time (otherwise known as gcLatency). The class is supported by the native method GC.LatencyRequest gcLatencyRequest(). This native call returns an estimate of the elapsed time since the last GC. In the case of the Serial, Parallel, CMS and G1 collectors, it returns the elapsed time since tenured space was collected. If that elapsed time exceeds the latency target, then a System.gc() is called. As long as there are objects in ObjectTable, the GC::run() method (listing below) will remain active.

public void run() {

  for (;;) {
    long l;
    
    synchronized (lock) {
      l = latencyTarget;
        if (l == NO_TARGET) {
        /* No latency target, so exit */
        GC.daemon = null;
        return;
      }

      long d = maxObjectInspectionAge();

      if (d >= l) {
        /* Do a full collection.  There is a remote possibility
         * that a full collection will occur between the time
         * we sample the inspection age and the time the GC
         * actually starts, but this is sufficiently unlikely
         * that it doesn't seem worth the more expensive JVM
         * interface that would be required.
         */
        System.gc();
        d = 0;
      }

      /* Wait for the latency period to expire,
       * or for notification that the period has changed
       */
      try {
        lock.wait(l - d);
      } catch (InterruptedException x) {
        continue;
      }
    }
  }
}

The default value for latencyTarget is 3600000ms. Oddly enough, you can override this by setting your own latency target in code as follows.

public static void main(String[] args) throws Throwable {
  System.out.println( “latency target: “ + GC.currentLatencyTarget());
  System.out.println("turn on GC");
  LatencyRequest request = GC.requestLatency( 1000);
  System.out.println( “latency target: “ GC.currentLatencyTarget());
  Thread.sleep(6000);
  System.out.println("turn off GC");
  request.cancel();
  Thread.sleep( 6000);
}

If you run this code with the flags -verbose:gc and -XX:+PrintGCDetails, you will see a call to System.gc() made once a second, for 6 seconds. The output you should see will look something like;

latency target: 0
turn on GC
latency target: 1000
[GC (System.gc())  5243K->792K(251392K), 0.0010124 secs]
[Full GC (System.gc())  792K->582K(251392K), 0.0045390 secs]
[GC (System.gc())  3203K->678K(251392K), 0.0006147 secs]
[Full GC (System.gc())  678K->571K(251392K), 0.0108860 secs]
[GC (System.gc())  1882K->571K(251392K), 0.0006402 secs]
[Full GC (System.gc())  571K->568K(251392K), 0.0060488 secs]
[GC (System.gc())  568K->568K(251392K), 0.0003195 secs]
[Full GC (System.gc())  568K->568K(251392K), 0.0018367 secs]
[GC (System.gc())  568K->568K(251392K), 0.0005905 secs]
[Full GC (System.gc())  568K->568K(251392K), 0.0028363 secs]
[GC (System.gc())  568K->568K(251392K), 0.0003654 secs]
[Full GC (System.gc())  568K->568K(251392K), 0.0018081 secs]
turn off GC

If ObjectTable empties, it will reset the GC latency to 0 which will overwrite your settings. If another RMI request is made, the target latency will be set to the preset value for RMI. This implies that any settings we make in code are subject to change at any time. A safer way to set the GC latency for RMI is to set the system properties sun.rmi.dgc.client.gcInterval and sun.rmi.dgc.server.gcInterval.

Generally I don’t find these extraneous calls to System.gc() to be useful so I often tune down the values to as much as MAX_LONG milliseconds. However, if you are using RMI and tenured is never collected, it’s likely that you want to schedule a Full collection simply to recycle things like file descriptors. File descriptors in particular are special as they are processed with finalization, which means a full GC must run in order for these objects to be inserted into the finalization ReferenceQueue.

Final note, there are other sources of JDK internal calls to System.gc(). These include:

  • Use of FileChannel.map
  • Use of ByteBuffer.allocateDirect()

The latter will make the call is reference processing isn’t keeping up and you are hitting MaxDirectMemorySize.

Yet Another Final Note, Yes, I know the Java API spec for System.gc() says that a JVM can ignore calls to System.gc() if it so chooses too. The reality is that in OpenJDK, all System.gc() calls are honoured unless -XX:+DIsableExplicitGC is set explicitly as a JVM command line option. Another nice trick (thanks Charlie Hunt!) is that you can also add -XX:+ExplicitGCInvokesConcurrent for CMS and G1 collectors to avoid a single threaded stop the world Full GCs (including of course the System GC), great if you have cores to spare!

Hopefully most of you are using Censum to see have these sorts of problems reported in seconds!

Cheers,
Kirk (CTO) and the jClarity team

No more memory leaks and application pauses!

Find out why my app is slow and tell me how to fix it!

Recent Posts