What’s required to make a Container aware Java Runtime


Hi all,

Java today has some fairly serious challenges when it comes to running efficiently on Container based technologies (Docker being the popular choice). This basically boils down to the container reporting incorrect values for important system resources and/or the Java runtime looking at default operating system locations as opposed to a container provided value. This mismatch in communication leads to the JVM using numbers that aren’t representative of the resources that a container is actually providing and so Java tends to run inefficiently in a container environment.

For example, at JVM startup the JVM interrogates the runtime environment so that it can decide how best to adapt to it. The default max heap, for instance, is set to 1/4 of real RAM. If the container doesn’t report how much RAM is available, then the JVM will configure itself based on all the real RAM on the machine. Similar heuristics are used when deciding how many helper threads to use for the common thread pool, the parallel phases of the garbage collector and so on.
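To see what a given JVM has decided for itself, a minimal probe like the one below can be run both on the host and inside a container. It only uses the standard Runtime API; the class name ErgonomicsProbe, the helper defaultMaxHeap and the 64 GiB / 2 GiB figures are made up for illustration, and the 1/4 rule is the one described above rather than an exact model of every JVM version.

```java
final class ErgonomicsProbe {

    /** Default max heap under the "1/4 of real RAM" rule described above. */
    static long defaultMaxHeap(long physicalRamBytes) {
        return physicalRamBytes / 4;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // What this JVM actually concluded at startup.
        System.out.println("CPUs seen by the JVM:     " + rt.availableProcessors());
        System.out.println("Max heap chosen (bytes):  " + rt.maxMemory());

        // Hypothetical numbers: a 64 GiB host running a container limited to 2 GiB.
        long hostRam = 64L << 30;
        long containerLimit = 2L << 30;
        System.out.println("Heap sized from host RAM: " + defaultMaxHeap(hostRam));
        System.out.println("Heap sized from limit:    " + defaultMaxHeap(containerLimit));
    }
}
```

With those illustrative numbers, a heap sized from the host's RAM is eight times larger than the whole container limit, which is the kind of mismatch that gets a process killed.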

This is one of the main reasons that very few jClarity customers are running containerised Java applications in production today.

Luckily, Oracle are stepping up their efforts to provide first class support for Containers with regards to the Java Runtime.

The official ‘bug’ tracking their JDK Enhancement Proposal (JEP) work on this is at: https://bugs.openjdk.java.net/browse/JDK-8182070.

We’ll add some of our own commentary and insights here, deliberately mapping against the titles of that JEP. I’ll quote various sections of the Oracle JEP (cutting out some less interesting stuff) and add our commentary beneath.

Goals

Enhance the JVM and Core libraries to detect running in a container and adapt to the system resources available to it. This JEP will only support Docker on Linux-x64 although the design should be flexible enough to allow support for other platforms and container technologies. The initial focus will be on Linux low level container technology such as cgroups so that we will be able to easily support other container technologies running on Linux in addition to Docker.

There’s a danger here that this will make running Java a 2nd class citizen on containers that run on a Mac OS X or Windows machine (two of the main platforms that we see out in the wild alongside Linux). We’re hoping that a very rough implementation for another container or O/S target will be built (even if it’s a throwaway prototype) in order to help validate the flexibility of the design for future container technologies (and non Linux operating systems).

A mailing list thread suggests that this is being taken into consideration.

Non-Goals

It is not a goal of this JEP to support any platform other than Docker container technology running on Linux x64.

See above! That said, Java on Linux is a very high percentage of mission critical apps in production out there today and it makes sense to target this first.

Success Metrics

Success will be measured by the improved efficiency of running multiple Java containers on a host system with out of the box options.

We’d like to see some before and after numbers on this: CPU, memory, socket and disk latencies and throughputs under certain loads. In fact, if anyone has already run such tests on their applications and is willing to share the results, we’d love to hear from you!

Motivation

Container technology is becoming more and more prevalent in Cloud based applications. This technology provides process isolation and allows the platform vendor to specify limits and alter the behavior of a process running inside a container that the Java runtime is not aware of. This causes the Java runtime to potentially attempt to use more system resources than are available to it causing performance degradation or even termination.

Totally agree here. We ourselves are increasingly using Docker for development and QA purposes, but would not consider it in production yet due to the concerns listed above.

Description

This enhancement will be made up of the following work items:

B. Exposing container resource limits and configuration.

There are several configuration options and limits that can be imposed upon a running container. Not all of these are important to a running Java process. We clearly want to be able to detect how many CPUs have been allocated to our process along with the maximum amount of memory that will be allocated but there are other options that we might want to base runtime decisions on.

In addition, since containers typically impose limits on system resources, they also provide the ability to easily access the amount of consumption of these resources. I intend to provide this information in addition to the configuration data.

I propose adding a new jdk.internal.Platform class that will allow access to this information. Since some of this information is needed during the startup of the VM, I propose that much of the implementation of the methods in the Platform class be done in the VM and exposed as JVM_xxxxxx functions. In hotspot, the JVM_xxxxxx function will be implemented via the os.hpp interface.

Here are the categories of configuration and consumption statistics that will be made available (The exact API is TBD):

isContainerized
Memory Limit 
Total Memory Limit
Soft Memory Limit
Max Memory Usage
Current Memory Usage 
Maximum Kernel Memory
CPU Shares
CPU Period
CPU Quota
Number of CPUs
CPU Sets
CPU Set Memory Nodes
CPU Usage
CPU Usage Per CPU
Block I/O Weight
Block I/O Device Weight 
Device I/O Read Rate
Device I/O Write Rate
OOM Kill Enabled
OOM Score Adjustment
Memory Swappiness
Shared Memory Size

This looks like a pretty good list to us. It’s interesting to note that the performance of a managed runtime actually comes down to quite a limited set of resources that all balance against each other.

C. Adjusting Java runtime configuration based on limits.

Java startup normally queries the operating system in order to set up runtime defaults for things such as the number of GC threads and default memory limits. When running in a container, the operating system functions used provide information about the host and do not include the container’s configuration and limits. The VM and core libraries will be modified as part of this JEP to first determine if the current running process is running in a container. It will then cause the runtime to use the container values rather than the general operating system functions for configuring and managing the Java process. There have been a few attempts to correct some of these issues in the VM but they are not complete. The CPU detection in the VM currently only handles a container that limits CPU usage via CPU sets. If the Docker --cpu or --cpu-period along with --cpu-quota options are specified, it currently has no effect on the VM’s configuration.

The experimental memory detection that has been implemented only impacts the Heap selection and does not apply to the os::physical_memory or os::available_memory low level functions. This leaves other parts of the VM and core libraries to believe there is more memory available than there actually is.

The NUMA support available in the VM is also not correct when running in a container. The number of available memory nodes and enabled nodes as reported by the libnuma library does not take into account the impact of the Docker --cpuset-mems option which restricts which memory nodes the container can use. Inside the container, the file /proc/{pid}/self does report the correct Cpus_allowed and Mems_Allowed but libnuma doesn’t respect this. This has been verified via the numactl utility.
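As a rough illustration of reading those per-process values directly, the sketch below parses the list-style fields (Cpus_allowed_list and Mems_allowed_list) out of the text of a Linux process status file such as /proc/self/status. This is our own assumption about where a runtime could look, not the JEP's implementation; the class and method names are hypothetical, and the sample text is invented.

```java
final class ProcStatusAffinity {

    /** Returns the value of "<name>:<whitespace><value>" in the status text, or null if absent. */
    static String field(String statusText, String name) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(name + ":")) {
                return line.substring(name.length() + 1).trim();
            }
        }
        return null;
    }

    /** Counts the IDs in a kernel list string such as "0-3,8" (ranges and single IDs). */
    static int countIds(String list) {
        int count = 0;
        for (String part : list.trim().split(",")) {
            if (part.isEmpty()) {
                continue;
            }
            String[] range = part.split("-");
            int lo = Integer.parseInt(range[0].trim());
            int hi = Integer.parseInt(range[range.length - 1].trim());
            count += hi - lo + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        // Invented sample: a process restricted to CPUs 0-3 and 8, and memory node 0.
        String sample = "Name:\tjava\nCpus_allowed_list:\t0-3,8\nMems_allowed_list:\t0\n";
        System.out.println("CPUs allowed:       " + countIds(field(sample, "Cpus_allowed_list")));
        System.out.println("Memory nodes allowed: " + countIds(field(sample, "Mems_allowed_list")));
    }
}
```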

To correct these shortcomings and make this support more robust, here’s a list of the current cgroup subsystems that will be examined in order to update the internal VM and core library configuration.

Number of CPUs

Use a combination of number_of_cpus() and cpu_sets() in order to determine how many processors are available to the process and adjust the JVM’s os::active_processor_count appropriately. The number_of_cpus() will be calculated based on the cpu_quota() and cpu_period() using this formula: number_of_cpus() = cpu_quota() / cpu_period(). Since it’s not currently possible to understand the relative weight of the running container against all other containers, altering the cpu_shares of a running container will have no effect on altering Java’s configuration.

Also add a new VM flag that allows the number of CPUs to be overridden. This flag will be honored even if UseContainerSupport is not enabled.

The last option will actually be very useful for those who want to pin their JVM’s ergonomics to a subset of the available CPU on a server. For example, your JVM process might not be as important as that Python routine on the host. It makes sense that you can more easily restrict what the JVM is using.
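The quota/period formula quoted above can be sketched in a few lines of Java. This is only our reading of the JEP text, not the HotSpot code: the class and method names are hypothetical, rounding a fractional result up and never going below one CPU are our assumptions, and a quota of -1 is treated as "no quota" following the cgroup v1 convention.

```java
final class ContainerCpus {

    /**
     * Estimates the processor count a container-aware runtime might use.
     *
     * @param quotaMicros  CPU time allowed per period, or -1 when no quota is set
     * @param periodMicros length of the scheduling period
     * @param cpusetCount  number of CPUs in the cpuset, or 0 when no cpuset is set
     * @param hostCpus     number of CPUs on the host
     */
    static int numberOfCpus(long quotaMicros, long periodMicros, int cpusetCount, int hostCpus) {
        // Start from the cpuset if there is one, otherwise from the host count.
        int limit = cpusetCount > 0 ? Math.min(cpusetCount, hostCpus) : hostCpus;

        if (quotaMicros > 0 && periodMicros > 0) {
            // number_of_cpus = cpu_quota / cpu_period, rounded up (assumption).
            long fromQuota = (quotaMicros + periodMicros - 1) / periodMicros;
            limit = (int) Math.min(limit, fromQuota);
        }
        return Math.max(1, limit);
    }

    public static void main(String[] args) {
        // A quota of 200ms per 100ms period on an 8 CPU host: two CPUs' worth of time.
        System.out.println(numberOfCpus(200_000, 100_000, 0, 8));
        // No quota, but a cpuset of four CPUs.
        System.out.println(numberOfCpus(-1, 100_000, 4, 8));
    }
}
```

Taking the smaller of the cpuset and the quota-derived value reflects the "combination" wording in the JEP; how the real implementation combines them may differ.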

Total available memory

Use the memory_limit() value from the cgroup file system to initialize the os::physical_memory() value in the VM. This value will propagate to all other parts of the Java runtime.

We might also consider examining the soft_memory_limit and total_memory_limit in addition to the memory_limit during the ergonomics startup processing in order to fine-tune some of the other VM settings.

We’d like to see these extra options enabled. In really latency sensitive applications, being able to tune that last dial is sometimes required!
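To make the memory_limit() idea concrete, here is a minimal sketch of how a limit read from the cgroup file system could replace the host's RAM figure. It assumes the cgroup v1 layout, where the limit lives in /sys/fs/cgroup/memory/memory.limit_in_bytes; the class and method names and the 16 GiB host figure are hypothetical, and treating any limit at or above host RAM as "no limit" is our own simplification.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ContainerMemory {

    // cgroup v1 location of the memory limit (assumed layout; cgroup v2 differs).
    static final Path LIMIT_FILE = Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");

    /** Returns the container limit if it is a usable number below host RAM, else host RAM. */
    static long effectivePhysicalMemory(String limitFileContents, long hostRamBytes) {
        try {
            long limit = Long.parseLong(limitFileContents.trim());
            return (limit > 0 && limit < hostRamBytes) ? limit : hostRamBytes;
        } catch (NumberFormatException e) {
            // Missing, empty or non-numeric content: fall back to the host value.
            return hostRamBytes;
        }
    }

    public static void main(String[] args) throws IOException {
        long hostRam = 16L << 30; // hypothetical 16 GiB host
        String raw = Files.exists(LIMIT_FILE)
                ? new String(Files.readAllBytes(LIMIT_FILE), StandardCharsets.UTF_8)
                : "";
        long physical = effectivePhysicalMemory(raw, hostRam);
        System.out.println("Effective physical memory: " + physical);
        System.out.println("Default max heap (1/4):    " + physical / 4);
    }
}
```

Falling back to the host value on anything unparseable keeps the sketch safe on machines without that cgroup file, at the cost of silently ignoring limits expressed in another format.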

D. Adding container configuration to error crash logs and Unified JVM logging.

As a troubleshooting aid, we will dump any available container statistics to the hotspot error log and add container specific information to the JVM logging system.

This would be invaluable. Virtualisation and containerisation complexities make for some of the trickier analyses that we perform for customers (both with our tooling and sometimes a human eye). Having to correlate logging information from so many disparate layers can be frustrating. As a side note, the fact that a GC log can provide some useful CPU usage and safepointing information actually allows you to diagnose a whole host of performance issues that are not caused by the GC subsystem. This built-in correlation has proven to be most helpful!

E. Adding a startup flag to enable/disable this support.

Add a -XX:+UseContainerSupport VM option that will be used to enable this support. The default will be off until this feature is proven.

The Oracle engineers are (once more) showing great wisdom in feature toggling. If you don’t apply this technique to your mission critical apps, it’s time to start looking at things like LaunchDarkly!

F. Configuration change notifications

An additional API will be provided to allow an application to receive a notification when configuration changes occur. Configuration change events will not necessarily cause the VM and Java core libraries to reconfigure their usage of resources. This support will be optional.

We’d hope to see this accessible via logs or JMX so there’s a chance to warn the end user that something has changed and that the JVM may or may not have been able to adjust.

Conclusion

Java’s evolution can seem frustratingly slow at times (if you were to read Hacker News and Reddit you’d think that Docker had been a production standard for several years now) but we think that the timing of this JEP is about right for Java and will meet the industry need for moving Java applications onto containers in an efficient manner. We look forward to helping test out the early implementations!

Cheers,
Martijn (CEO) and the jClarity Team.
