Imagine a Java application that can start up in milliseconds, without compromising on throughput, memory, development-production parity or Java language features. Sounds out of this world, right? Well, through the use of technologies like CRIU support in Eclipse OpenJ9 and Liberty’s InstantOn, we’ve taken one giant leap forward for innovation within Java, offering exactly this! Join this session to learn more about these innovations and how you could utilise OSS technologies to deliver highly scalable and performant applications that are optimized for today’s cloud-native environments.
2. Java’s Popularity
• Language is easy to read and write
• Fantastic community - OSS
• Great Libraries
• Robust documentation
• Platform independence
• Continuous development and innovation
• History of supporting enterprise
• Java Virtual Machine
@gracejansen27
3. The JVM
• Bytecode
• The JVM’s machine-level instruction set
• Ensures portability
• Offers optimisations
• JIT (Just In Time) compiler
• Uses 2 compilers
• C1 – fast compile, low optimisation
• C2 – slow compile, high optimisation
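As a minimal sketch of the bytecode/JIT pipeline above (assuming a local JDK install; HelloWorld is a placeholder class name), you can dump the bytecode the JVM executes and watch the C1/C2 tiers at work:

```shell
# Compile a class and dump the bytecode the JVM will interpret/JIT
javac HelloWorld.java
javap -c HelloWorld            # prints the JVM bytecode instructions

# Watch JIT activity: each line shows a method being compiled,
# with the tier level (1-3 = C1, 4 = C2) in the output
java -XX:+PrintCompilation HelloWorld
```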
6. Challenges in the era of Cloud-native
The shift to cloud-native has changed the demands placed on the underlying JVM technologies that drive application frameworks and runtimes.
• Serverless computing
• Cloud economics
• Cloud native JVM
7. Cloud Economics
Pay-as-you-go pricing model based on CPU and memory usage.
Elastic scaling based on demand is crucial for success.
Requirements:
• Scale-to-zero
• Low memory usage
• Minimal latency – fast startup
[Diagram: Legacy JVM vs. Cloud Native JVM – doing more with less!]
8. Cloud Native Runtimes
Portability and flexibility – write once, run everywhere.
Slower startup times but high peak throughput.
The cloud demands a shift in the performance characteristics of JVMs.
Solutions:
- Dynamic AOT compilation and class metadata persistence
- Static compilation – native image
[Chart: Native Image vs. JDK compared on startup time, peak performance, total image footprint, build time, and usability]
10. AOT compilation
• Ahead of time, static compilation
• The process of compiling high-level Java code into native executable code
• Machine code is tailored to a specific operating system and hardware architecture
• E.g. GraalVM approach
[Diagram: Sources –(Compile)→ Bytecode (JAR) –(AOT Processing, inc. running bytecode)→ Bytecode + Metadata –(AOT Compilation, native-image)→ Native Executable (Package)]
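A minimal sketch of the GraalVM approach described above, assuming a GraalVM distribution with the native-image tool installed (app.jar and myapp are placeholder names):

```shell
# AOT-compile the application jar into a native executable
native-image -jar app.jar myapp

# Run it: starts in milliseconds, with no JVM warm-up phase
./myapp
```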
11. Native Solution to Fast Startup
• Pros:
• Much faster startup
• Closed world allows for smaller overall binary size
• Cons:
• Lower peak performance
• More costly memory management
• Closed world assumption does not apply to all applications
• All reflection must be known at compile time
• Long compile times
• Difference in deployed environment vs. development environment
12. AOT vs JIT compilation
https://twitter.com/thomaswue/status/1145603781108928513?s=20&t=-6ufSBjc46mfN5d_6Y2-Rg
14. Checkpoint/Restore In Userspace - CRIU
• Linux Project - OSS
• Aims to reduce start-up time for Java applications
• Enables a snapshot (checkpoint) to be taken of a running application
• App can then be restarted from the snapshot
• Potential for Java applications to scale to zero and be run in serverless environments
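At the CLI level, the checkpoint/restore cycle looks roughly like this minimal sketch (assuming the criu package is installed and commands are run as root; 12345 is a placeholder PID of a running Java process):

```shell
# Snapshot the process state (memory pages, file descriptors, etc.) to disk
mkdir -p /tmp/checkpoint
criu dump -t 12345 --images-dir /tmp/checkpoint --shell-job

# Later (or on another machine), resume the process from the snapshot
criu restore --images-dir /tmp/checkpoint --shell-job
```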
15. Capable of reducing startup time dramatically
• Up to 18x reduction in first response time using the Liberty InstantOn beta with CRIU
• The Liberty InstantOn beta is an effort to productize the use of CRIU with Liberty
• Main design goals
• Usability: make it easy for an end user to consume the feature
• Functionality: make it possible for end users to run their application without changes
First Response Time (ms), lower is better:

Benchmark     Liberty baseline    Liberty InstantOn prototype    Improvement
PingPerf      1470                128                            ~12x
Rest CRUD     3082                213                            ~15x
Daytrader7    5462                310                            ~18x

System Configuration:
-------------------------------
SUT: LinTel – SLES 15 sp3 - Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz, 2 physical cores (4 cpu), 32GB RAM.
18. Semeru Runtimes
IBM Semeru Runtimes is a production-ready JDK based on the Eclipse OpenJ9 JVM. The OpenJ9 JVM is designed with memory efficiency and startup performance in mind. OpenJ9 continues to innovate by introducing new features like CRIU Support, which is the basis of Semeru InstantOn and Liberty InstantOn.
Key features:
• Broad platform support
• Tuned for the cloud
• Zero usage restrictions
[Diagram: Jakarta EE Application on Liberty InstantOn / Semeru InstantOn, running in containers]
19. Semeru InstantOn
OpenJ9 CRIU Support provides an alternative to static compilation and traditional JVMs in the form of checkpoint and restore. It offers fast startup while keeping the benefits of traditional JVMs.
Workflow:
• Checkpoint run at build phase
• Restore run in deployment
Traditional run: JVM startup → application initialization → application ready state
InstantOn: the build phase runs through startup and initialization, ending with a checkpoint image; the deployment run simply initiates restore and reaches the application ready state – much faster startup!
• Use the Linux CRIU capability to snapshot/restore the Liberty process
• Linux CRIU snapshots process state including memory pages, file handles, network connections
• Liberty and OpenJ9 run pre-snapshot and post-restore hooks to enable a seamless user experience
Liberty InstantOn
Build time: Liberty JVM starts up → reaches snapshot point → runs pre-snapshot hooks → initiates snapshot → exits after snapshot
Run time: restore Liberty JVM from snapshot → Liberty JVM runs post-restore hooks → Liberty JVM continues running
Goal: “instant on” without substantially limiting programming model/capabilities
21. Where to checkpoint?
Liberty InstantOn leverages Semeru to provide a seamless checkpoint/restore solution for developers. With checkpoint/restore, there is a tradeoff between startup time and the complexity of the restore.
Checkpoint phases:
• features
• deployment
• applications
[Diagram: kernel start and feature runtime processing (features checkpoint, 300–2200ms) → process application (deployment checkpoint, 100–3000ms) → start application (applications checkpoint, 0–???ms) → accept requests]
25. Where to checkpoint?
Later checkpoint == faster restore time
Later checkpoint == more complexity
[Diagram: kernel start and feature runtime processing (features, 300–2200ms) → process application (deployment, 100–3000ms) → start application (applications, 0–???ms) → accept requests]
30. JITServer aka JIT-as-a-Service
Decouple the JIT compiler from the JVM and let it run as an independent process
Offload JIT compilation to a remote process; treat JIT compilation as a cloud service.
[Diagram: JVMs with their local JIT crossed out, offloading compilations to remote JIT services managed by the Kubernetes Control Plane]
• Auto-managed by orchestrator
• A mono-to-micro solution
• Local JIT still available
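A minimal sketch of running this with an OpenJ9/Semeru JDK, which ships the JITServer as a separate launcher (app.jar is a placeholder; 38400 is OpenJ9's default JITServer port):

```shell
# Start the remote JIT compilation server as its own process
jitserver &

# Point a client JVM at it; compilations are offloaded to the server,
# and the JVM falls back to its local JIT if the server is unreachable
java -XX:+UseJITServer \
     -XX:JITServerAddress=localhost \
     -XX:JITServerPort=38400 \
     -jar app.jar
```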
32. Resources:
• OpenJ9 resources: https://blog.openj9.org/
• Liberty InstantOn Beta blogpost: https://openliberty.io/blog/2022/09/29/instant-on-beta.html
• Liberty InstantOn Demo video: https://www.youtube.com/watch?v=EpaCdR_KXNQ
• CRIU: https://criu.org/Main_Page
• Longer Education session on InstantOn (ExpertTV recording on Community Page): https://community.ibm.com/community/user/wasdevops/discussion/instant-on-java-cloud-applications-with-checkpoint-and-restore-replay-of-a-lets-code-episode-on-ibm-expert-tv
36. Container support
Liberty InstantOn provides tools for users to easily create container images with their checkpointed applications for fast startup in the cloud.
Steps:
• Build application image
Dockerfile:
FROM open-liberty:beta-checkpoint
COPY --chown=1001:0 server.xml /config/server.xml
COPY --chown=1001:0 demo.war /config/dropins/demo.war
RUN configure.sh
build:
podman build -t demo-application .
37. Container support
Steps:
• Build application image
• Run application to checkpoint in container
run checkpoint:
podman run --name demo-checkpoint --privileged --env WLP_CHECKPOINT=applications demo-application
38. Container support
Steps:
• Build application image
• Run application to checkpoint in container
• Commit container to checkpoint image
commit
podman commit …
39. Container support
Steps:
• Build application image
• Run application to checkpoint in container
• Commit container to checkpoint image
• Restore checkpointed image
restore
podman run …
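The four steps across these slides can be sketched end-to-end. This is a hedged sketch using the image/container names from the Dockerfile example above; demo-application-instanton is a placeholder name, 9080 is assumed to be the app's HTTP port:

```shell
# 1. Build the application image
podman build -t demo-application .

# 2. Run the application and take a checkpoint at the applications phase;
#    the container exits once the checkpoint is written
podman run --name demo-checkpoint --privileged \
  --env WLP_CHECKPOINT=applications demo-application

# 3. Commit the stopped container (with its checkpoint) to a new image
podman commit demo-checkpoint demo-application-instanton

# 4. Restore: start from the checkpointed image for near-instant startup
#    (restore needs extra Linux capabilities; --privileged is the blunt option)
podman run --rm --privileged -p 9080:9080 demo-application-instanton
```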
42. JITServer improves container density and cost
[Diagram: container packing under the default config vs. the JITServer config, showing higher application density with JITServer – 6.3 GB less total memory for the same set of applications]
Legend:
AM: AcmeAir monolithic
A: Auth service
B: Booking service
C: Customer service
D: Database (mongo/postgres)
F: Flight service
J: JITServer
M: Main service
P: Petclinic
Q: Quarkus
My plan was to cover mostly what we've done before: why devs care about this, what alternatives there are and their potential issues, where CRIU fits in, and how that then leads on to InstantOn. Then we could do a quick code demo, like has been done before by Ian, Alasdair, you, etc.
Language is easy to read and write – Moreover, rich ecosystem of APIs
Community – JUGs, Java Champions, Conferences, Newsletters/Podcasts, Blogs, etc
Libraries – not just proprietary, also lots of OSS too!
Platform independence - it can run on various operating systems like Windows, Linux/Unix, and macOS
History of supporting enterprise – omnipresence, used throughout banking and other very large enterprise organizations and applications
Java Virtual Machine - “write once, run everywhere” - it can run on various types of computers and mobile devices without any change in the output, backwards compatibility, managed runtime environment
When Java was released in 1995, all computer programs were written to a specific operating system, and program memory was managed by the software developer. The JVM was a revelation.
Ensures portability of Java programs across different architectures
The JVM is how we run our Java programs. We configure the settings and then rely on the JVM to manage program resources during execution.
Java bytecode is the bytecode-structured instruction set of the Java virtual machine (JVM) - machine-level language code that runs on the JVM
The JVM converts Java bytecode into machine language
Java code is compiled into bytecode. This bytecode gets interpreted on different machines
To be able to run a Java program, the JVM interprets the bytecode. Since interpreters are usually a lot slower than native code executing on a real processor, the JVM can run another compiler which will now compile our bytecode into the machine code that can be run by the processor. This so-called just-in-time compiler is much more sophisticated than the javac compiler, and it runs complex optimizations to generate high-quality machine code.
It contains two conventional JIT-compilers: the client compiler, also called C1 and the server compiler, called opto or C2.
C1 is designed to run faster and produce less optimized code, while C2, on the other hand, takes a little more time to run but produces a better-optimized code.
But we want to move to the cloud
The cloud-native environment demands a change in the way we develop and deploy applications to the cloud
But it also demands a change in how the applications are run in the cloud
There are different demands on the JVM when running in a micro-service architectures on the cloud as opposed to a more traditional on-prem setup
And this difference can have economic implications as well
The JVM was designed with portability and flexibility in mind. This meant that bytecodes were loaded lazily and optimized while the application was running. The result was slower startup times but high peak throughput. The cloud demands a shift in the performance characteristics of JVMs.
The shift to the cloud has enabled businesses to focus more of their attention on developing apps and features and less on IT infrastructure
While this has been a beneficial move, it does mean that you need to optimize for costs in a different way
Typically pricing models are a function of CPU, memory and the time it's used for
As a result, to minimize costs you need to make sure that you are only paying for what you use, and no more
So when there is high load, you scale up to meet demand
When there is Low load you scale down to save on costs
Elastic/dynamic scaling
This approach is known as scale-to-zero
Scale to zero requires that you have very fast startup times, as startup becomes part of the latency end users may experience during increasing demand
Next slide…
The JVM was originally designed with the goal of enabling portability and flexibility of Java apps – that is, write once, run everywhere
To achieve this the JVM was designed to load bytecodes as late as possible and optimize at runtime, so you can achieve peak performance on your platform and never have to customize your application for it
The result of this design point was slower startup times, due to the fact that you need to interpret bytecode initially, but high peak throughput once the JIT has had a chance to compile and optimize the code
The nature of cloud-native demands that JVMs not only have high throughput but also fast startup
This has given rise to new technologies attempting to optimize JVMs for the cloud
New class persistence and dynamic AOT compilation features have been added to traditional JDKs in order to improve startup times
- these essentially save class metadata and reload it on subsequent runs so that there is a much shorter path for loading classes and compiling methods
- this has shown benefits to startup perf, but it pales in comparison to native image approaches
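One concrete example of this class metadata persistence is OpenJ9's shared class cache. A minimal sketch, assuming an OpenJ9/Semeru JDK (app.jar and mycache are placeholder names):

```shell
# First run populates a shared class cache with class metadata
# and dynamically AOT-compiled code
java -Xshareclasses:name=mycache -jar app.jar

# Subsequent runs reuse the cache for a much shorter class-load/compile path
java -Xshareclasses:name=mycache -jar app.jar

# Inspect what the cache contains
java -Xshareclasses:name=mycache,printStats -version
```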
- bigger is better in the chart
Native image is a new (but also really old) approach where the entire application is compiled statically -closed world analysis
- this means that no interpretation is required, since everything is compiled, you start at peak performance
- so really fast startup times
However, there are tradeoffs with native image
- since it doesn't have a JIT, it cannot benefit from profile-guided optimizations in the way that traditional JDKs do, so peak performance is often lower than traditional JDKs
- in addition, JVMs have tried and tested GC policies that are very efficient at reclaiming memory
- memory footprint under peak load can often be measured to be lower on traditional JVMs in comparison to native image
Due to its closed world analysis, native image can often strip out the parts of the app that are not needed, keeping image sizes small
- that being said, with technologies like multilayer SCC, there is sharing when you have multiple JVMs in a container image
- so while the JVM app might be larger, sharing the JVM image layer may result in a smaller footprint
Full static compilation takes time; this means you will experience longer build times for your apps in comparison to traditional JDKs
The JVM is a dynamic platform that lets you import code, re-write your bytecodes at runtime, etc.
- with native image these things aren't possible; you need to know upfront all the code that will be run and specify it
- this is an additional burden on users
- also, common debugging tools will not work on native images, because it's a native exe, not a JVM
This summarizes some of the tradeoffs between the native image and JVM approach…
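Declaring that upfront code typically means writing a reflect-config.json for native-image. A minimal sketch (the class name com.example.demo.GreetingService is hypothetical):

```shell
# Tell native-image which classes will be accessed reflectively at runtime,
# so they survive the closed-world analysis
mkdir -p META-INF/native-image
cat > META-INF/native-image/reflect-config.json <<'EOF'
[
  {
    "name": "com.example.demo.GreetingService",
    "allDeclaredConstructors": true,
    "allDeclaredMethods": true
  }
]
EOF
```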
Ahead of Time Compilation (GraalVM approach)
Aims to improve warming-up period
Provide faster start-up for Java applications
Ahead-Of-Time (AOT) Compilation is the process of compiling high-level Java code into native executable code.
Usually, this is made by the JVM's Just-in-time compiler (JIT) at runtime, which allows for observation and optimization while executing the application.
This advantage is lost in the case of AOT compilation.
Typically, before AOT compilation, there can optionally be a separate step called AOT processing, i.e. collecting metadata from the code and providing them to the AOT compiler. The division into these 2 steps makes sense because AOT processing can be framework specific, while the AOT compiler is more generic.
You may have heard of CRaC, which uses this technology – a project led by Azul
The CRaC (Coordinated Restore at Checkpoint) Project researches coordination of Java programs with mechanisms to checkpoint (make an image of, snapshot) a Java instance while it is executing.
PingPerf: a very simple application (no access to an external database)
Rest CRUD: uses an external database, more realistic of an application, uses APIs
Daytrader: a much more sophisticated application, more like a real enterprise application
Before I describe Semeru Runtimes, I'll briefly describe OpenJ9
OpenJ9 is an open source JVM that's been around for a while – open-sourced in 2017, closed source for over 10 years before that
It's a JVM that has good performance characteristics in cloud environments, like low memory footprint
And it's open source, so there are no usage restrictions
IBM Semeru Runtimes is a production-ready JDK based on the Eclipse OpenJ9 JVM
So what we will look at next is Semeru InstantOn, which is based on OpenJ9 CRIU support
The Semeru InstantOn feature based on OpenJ9 CRIU support uses the checkpoint/restore model
- it's a feature based on Linux Checkpoint/Restore In Userspace (CRIU)
This is a service that lets one take a checkpoint of a running application, serialize its memory space, registers, etc. to disk, then restore it later on the same machine or a different machine
InstantOn leverages this technology to improve JVM startup time
On the right you'll see what a typical JVM run looks like
- first you need to start the JVM; this involves JVM init and loading/initializing core JCL classes
- Then you need to load and initialize your application classes
- once that is done, your application is ready to run
With instanton you start your application in the build phase and run it to a pre-determined point before the application is ready,
- then you save the image by initiating a checkpoint
At deployment
- you can restore the image; this means you fast-forward to when the application is ready, rather than starting the application again from the beginning
And that’s how you can achieve fast startup with instanton
Next we'll see how Open Liberty makes use of this…
The feature is called InstantOn, and it is provided by Open Liberty and the OpenJ9 JVM.
It essentially allows you to create a checkpoint of your application image, which you can then deploy again and again, starting from the point of your checkpoint.
And the best part is you keep all the best parts of Java and the JVM, but now with great start-up times.
So, as previously mentioned, Liberty makes use of Semeru InstantOn to provide fast startup
And the way to do this is very simple, and it doesn't require users to re-write their applications
Basically Liberty InstantOn lets you take an existing app and transform it to an InstantOn app by selecting from 3 pre-defined checkpoint stages
Kernel start: base Liberty platform (no features) [~200ms] + start configured features [~100–2,000ms] – this heavily depends on the number of features you're using (e.g. if you're using the whole Jakarta EE and/or MicroProfile platform it could take several seconds, but if you're only enabling the servlet container it could be much less, like several hundred milliseconds)
Now able to process app and serve app (deployment stage)
Scan application metadata, annotations, descriptors, this is heavily application dependent
Now able to start application --> run application startup code, again very application dependent
Once that’s done we get the ports open and then we’re ready to serve your app
Need to pick a safe place to checkpoint (which depends on your application)
The first stage is features phase
- perform a checkpoint just after all the feature bundles have been activated, before the application manager has processed anything
- the point at which the liberty kernel has loaded all the kernel features
The second is the deployment phase
- this is after inspecting the application, and parsing the application annotations, metadata, etc.
- Liberty essentially knows everything about the app, but no application code has run yet
The third is the applications phase
- this is application startup code
- app code that has to run before Liberty reports that the app has started and is ready to receive requests
So the phase that you choose to checkpoint depends on your application
- ideally you want to checkpoint as late as possible
- if you are doing very little in the applications phase then that is a good position to take a checkpoint
- however, if you are connecting to a remote database in the applications phase
- or doing something where you don't want the state saved in the checkpoint
- then the applications phase is not a good choice
- in this case it's better to checkpoint at deployment
- everything up to this point in Liberty is safe to checkpoint
- the features phase is essentially a fallback in case there are issues in processing application configs
So to summarize, the later you checkpoint, the faster the startup, however, the more complexity and considerations that you need to take into account
With all the benefits of the JVM and none of the compromises of Native Image
Cons of GraalVM:
Lower peak performance
More costly memory management
Closed world assumption (no dynamic loading, limitations on use of reflection), which doesn't apply to all apps
All reflection must be known at compile time
Long compile times
Difference in deployed environment vs development environment
Using getting started app (open liberty guides)
Only pre-req is Java SDK to build it and a container runtime to run the container in, in this case we’re using Podman to run the container in
First step is to clone the repo with the app in, which we’ve already done
To show InstantOn capability, there is one change needed to the Dockerfile to use a version of Liberty that has the new capability in it.
Edit docker file to change Liberty version to the Liberty beta InstantOn version of the container which we’ll then build the app on top of
Simple MP app, set of capabilities that it provides and set of Liberty features that are defined to provide these capabilities
Now build the normal container image as described in the guide
First thing is standard Maven package to produce WAR file that we’ll use to build into a container image
Create container image using Docker file that we just edited
Now build a new version of this container with checkpoint built into it
Do that by running the container I just created specifying that it should take a checkpoint and then exit. In this case we’re specifying that Liberty should take a checkpoint at the latest point possible after the app is initialized by specifying Checkpoint=applications.
Now create new version of container image and I’m going to name this one: Liberty Demo InstantOn
Add this checkpoint into this new image
Can now run both of them and we’ll see what the respective differences are
So first up is the container image that doesn’t have the checkpoint in it
Relatively simple application so it comes up fairly quickly anyway
As you can see this started in about 6 seconds
Now, let’s start the version that has the checkpoint in it
You'll notice the three cap-add options used when running the container – these grant the container the three Linux capabilities that are required to process the restore
As you can see, this version of the container came up in 0.3 seconds which is about 20x faster
We can then take a look at the running container – it's a simple REST app that exposes health endpoints and provides metrics for the applications running there; we've only hit the application once so the request count is one, and all it does is simply return the Java system properties of the application
So it’s just a simple rest application
The checkpoint that is burnt into the image is completely portable – the image we just created has a checkpoint taken while running in a Fedora VM on the Windows Subsystem for Linux.
We can push this up to an image repo and deploy to a RHEL VM in OpenShift
The restore in that environment is identical using exactly the same checkpoint
To demo this, we’ll tag and push the InstantOn version of the container to a remote repo
So there’s the tag
Let’s push it to our remote repository
Can now deploy that image to our OCP environment using either a full Kubernetes deployment YAML or can use Liberty operator that we have installed in OCP environment and use a more concise Liberty custom resource definition
Let’s take a look at that
Here’s our CR for our application container that we’ve just built and pushed to this remote repo here
If we switch over to OCP environment, here is the topology view where we can see there are no resources deployed
So if we move that out the way and now apply that custom resource to our OCP environment, we’ll see the operator installing the application from the custom resource
We’ll see we’ve got two pods that have been created here, we can view the logs of one of the pods and as you can see, the image was restored in this completely different environment in 0.24 seconds.
Works really well with other innovations, like JITServer
Java Virtual Machines (JVMs) employ Just-in-Time (JIT) compilers to improve the throughput of Java applications.
However, JIT compilers do not come for free: they consume resources, in terms of CPU and memory, and therefore they can interfere with the smooth running of Java applications.
Wouldn’t it be nice to keep all the advantages of dynamic compilation and eliminate its disadvantages?
This is where JITServer technology comes in.
In the Eclipse OpenJ9 JVM, the JIT compiler has been decoupled from the rest of the JVM and is run in its own independent process, either locally or on a remote machine. This process is referred to as JITServer and can easily be containerized and run in the cloud as a service.
This mechanism prevents your Java™ application from suffering possible negative effects due to CPU and memory consumption caused by JIT compilation.
This technology can improve quality of service, robustness, and performance of Java applications.
Here we show a couple of JVMs on the left, and remote JIT services on the right.
The JVMs will no longer use their local JIT, and will offload their JIT compilations to the remote JIT services.
Here we show the remote JIT processes containerized and made available as a cloud service.
This gives us an added benefit – the service can be managed by orchestrators like Kubernetes or Docker Swarm, which can make sure it is always running and scaled properly to handle demand.
And this solution is just like any other monolith-to-microservices conversion – in this case the JVM is the monolith that is turned into 2 loosely coupled micro-services: the JIT and the rest of the JVM.
Note that in the diagram we show the JVM's JIT crossed out, but it can still be used if the remote JIT should become unavailable.
***
All the disadvantages mentioned previously can be alleviated by decoupling the JIT compiler from the JVM and letting it run as an independent process, possibly on a remote machine.
On the surface it may look like JITServer just moves the compilation overhead from one place to another, but such a compilation consolidation strategy can reap the following benefits:
Spikes in memory consumption due to compilation activity at the client JVM are eliminated reducing the likelihood of a spurious out-of-memory occurrence.
The JIT compiler no longer steals CPU cycles from the Java application, thus eliminating performance hiccups, improving the quality-of-service (QoS) and possibly leading to a faster startup/rampup experience.
Application resource provisioning is greatly simplified because the user can ignore the unpredictable effects of JIT compilation and focus on the CPU/memory needs of the Java application alone.
Container sizing can be based solely on the resource usage of the Java application resulting in smaller containers, increased application density and reduced overall cost.
Robustness of applications is increased because compile time crashes due to bugs in the JIT no longer bring down the JVM.
Greater control over the resources devoted to compilation can be achieved because the number of JITServer container instances (and their size) can be scaled up and down independently of the Java applications. It’s worth mentioning that a single JITServer can fulfill the compilation needs for tens (if not hundreds) of JVMs. Moreover, for a typical usage pattern where requests from N client JVMs are staggered over time, the memory consumption at the server does not have to increase N times.
Now, the ability to take a checkpoint and restore it is great; however, that alone isn't enough
As we discussed earlier in this presentation, cloud native is also about how you develop and deploy your applications
Containers are a great way for packaging your applications
Liberty InstantOn integrates seamlessly with containers,
on the right-hand side you'll see how that works
First you start by setting up your application image, (typically this will be based on another image that contains your OS dependencies and your runtime)
- this is likely something you are already doing if you are currently deploying to the cloud
Next the checkpoint run is initiated; this runs the application in the container and takes a checkpoint at the specified checkpoint phase
In this example, the checkpoint is taken at the applications phase
Once the checkpoint is taken and the container stops, the state of the container is committed. This is effectively your container checkpoint image
Now, when you want to deploy your image, you restore from your container checkpoint image, with an entrypoint that triggers Liberty to run the saved checkpoint image
And that’s it.
Liberty provides scripts to let you build a checkpoint image, so if you are doing this yourself there are only two additional steps: run checkpoint and commit
Also, build times are really fast since you are not doing a full compilation
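The build flow above can be sketched as a sequence of container commands. This is only an illustrative outline: the image names are placeholders, the command that triggers the checkpoint is shown as a placeholder path (the actual script name and the privileges CRIU requires depend on your Liberty release and container engine), so check the InstantOn documentation for your version.

```shell
# 1. Build the normal application image (OS deps + runtime + app),
#    as you would for any cloud deployment today.
docker build -t myapp .

# 2. Checkpoint run: start the application in a container and take a
#    checkpoint at the chosen phase. CRIU needs elevated privileges,
#    and the trigger command below is a placeholder for the script
#    your Liberty release provides.
docker run --name myapp-checkpoint --privileged myapp /path/to/checkpoint-script

# 3. Commit the stopped container's state -- this is effectively
#    your container checkpoint image.
docker commit myapp-checkpoint myapp-instanton

# 4. At deploy time, run the checkpoint image; its entrypoint restores
#    the saved process state instead of starting the JVM from scratch.
docker run --rm -p 9080:9080 myapp-instanton
```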
Next, Jarek will show you how this works with a demo…
Now that we understand the role of compilers, let’s talk about when compilation is performed. There are two main compilation strategies in Java: Just-in-Time (JIT) compilation and Ahead-of-Time (AOT) compilation. The former generates machine code during the execution of the program itself (i.e., shortly before the first invocation of a Java method). The latter generates machine code before the execution of the program (i.e., during the bytecode verification and build phase of the application).
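A minimal sketch of the JIT behaviour described above: timing the same method across repeated calls typically shows it getting faster once the JVM has compiled the hot code. The class and method names are made up for illustration, and the exact timings depend on your JVM and hardware.

```java
// WarmupDemo: repeatedly time the same hot method to observe JIT warm-up.
// On most JVMs the later rounds run noticeably faster than the first,
// once the method has been JIT-compiled (and further optimized).
public class WarmupDemo {
    // A small "hot" method that the JIT will eventually compile.
    static long sumOfSquares(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) s += (long) i * i;
        return s;
    }

    public static void main(String[] args) {
        long first = 0, last = 0;
        for (int round = 0; round < 20; round++) {
            long t0 = System.nanoTime();
            long result = sumOfSquares(1_000_000);
            long elapsed = System.nanoTime() - t0;
            if (round == 0) first = elapsed;
            last = elapsed;
            // Use the result so the JIT cannot dead-code the loop away.
            if (result != 333332833333500000L) throw new AssertionError();
        }
        // Later rounds are typically much faster than the first;
        // timings vary by JVM and machine, so no fixed numbers here.
        System.out.println("first=" + first + "ns last=" + last + "ns");
    }
}
```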
Java Virtual Machines (JVMs) employ Just-in-Time (JIT) compilers to improve the throughput of Java applications.
However, JIT compilers do not come for free: they consume resources, in terms of CPU and memory, and therefore they can interfere with the smooth running of Java applications.
Wouldn’t it be nice to keep all the advantages of dynamic compilation and eliminate its disadvantages?
This is where JITServer technology comes in.
In the Eclipse OpenJ9 JVM, the JIT compiler has been decoupled from the rest of the JVM and is run in its own independent process, either locally or on a remote machine. This process is referred to as JITServer and can easily be containerized and run in the cloud as a service.
This mechanism prevents your Java™ application from suffering possible negative effects due to CPU and memory consumption caused by JIT compilation.
This technology can improve quality of service, robustness, and performance of Java applications.
Here we show a couple of JVMs on the left, and remote JIT services on the right.
The JVMs will no longer use their local JIT, and will offload their JIT compilations to the remote JIT services.
Here we show the remote JIT processes containerized and made available as a cloud service.
This gives us an added benefit: the service can be managed by orchestrators like Kubernetes or Docker Swarm, which can make sure it is always running and scaled properly to handle demand.
And this solution is just like any other monolith-to-microservices conversion – in this case the JVM is the monolith that is turned into two loosely coupled microservices – the JIT and the rest of the JVM.
Note that on the diagram we show the JVM JIT crossed-out, but it still can be used if the remote JIT should become unavailable.
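As a sketch of how this is wired up in practice: recent OpenJ9 documentation describes a dedicated `jitserver` launcher for the compilation service and `-XX:+UseJITServer` options on the client side. The hostname and jar below are placeholders, and option names should be verified against the OpenJ9 release you are using.

```shell
# Start the remote compilation service (OpenJ9 JDK builds with JITServer
# support ship a dedicated launcher; by default it listens on port 38400).
jitserver

# Point a client JVM at the service. If the server becomes unreachable,
# the JVM falls back to its local JIT, as noted above.
java -XX:+UseJITServer \
     -XX:JITServerAddress=jitserver.example.com \
     -XX:JITServerPort=38400 \
     -jar myapp.jar
```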
***
All the disadvantages mentioned previously can be alleviated by decoupling the JIT compiler from the JVM and letting it run as an independent process, possibly on a remote machine.
On the surface it may look like JITServer just moves the compilation overhead from one place to another, but such a compilation-consolidation strategy can reap the following benefits:
Spikes in memory consumption due to compilation activity at the client JVM are eliminated, reducing the likelihood of a spurious out-of-memory occurrence.
The JIT compiler no longer steals CPU cycles from the Java application, thus eliminating performance hiccups, improving the quality-of-service (QoS) and possibly leading to a faster startup/rampup experience.
Application resource provisioning is greatly simplified because the user can ignore the unpredictable effects of JIT compilation and focus on the CPU/memory needs of the Java application alone.
Container sizing can be based solely on the resource usage of the Java application, resulting in smaller containers, increased application density and reduced overall cost.
Robustness of applications is increased because compile-time crashes due to bugs in the JIT no longer bring down the JVM.
Greater control over the resources devoted to compilation can be achieved because the number of JITServer container instances (and their size) can be scaled up and down independently of the Java applications. It’s worth mentioning that a single JITServer can fulfill the compilation needs for tens (if not hundreds) of JVMs. Moreover, for a typical usage pattern where requests from N client JVMs are staggered over time, the memory consumption at the server does not have to increase N times.
So these slides show how the OpenShift scheduler decided to place the various pods on the worker nodes. Note that each application has a different color, and each application is replicated multiple times.
The size of the shape indicates its relative container size. The number in the shape is the memory limit for that container.
As you can see in the top row, all 3 worker nodes are used, and the containers are all larger than in the bottom graph – this is due to building in extra memory to avoid OOM issues and improve throughput.
The bottom row uses the JITServer, which results in only 2 worker nodes being used, despite the fact that the JITServer containers (shown in brown) are the largest containers in the node. The savings come from being able to scale down each of the application containers.
The end result is a 33% cost savings by using one less worker node.
***
This slide is an illustration of how OpenShift scheduler decided to place the various containers on nodes.
There are two different configurations: above we have the default configuration without JITServer which needs 3 worker nodes.
Below we have the JITServer configuration that only uses 2 worker nodes.
The colored boxes represent containers and the legend on the right will help you decipher which application each container is running.
The number in each box represents the memory limit for that container. These values were experimentally determined so that the application can run without an OOM or drastically reduced throughput.
The boxes were drawn to scale, meaning that a bigger box represents a proportionally larger amount of memory given to that container.
At a glance you can see that in the default config you cannot fit all those containers in just 2 nodes; you need 3 nodes.
In contrast, the JITServer config uses 6.3 GB less memory and we are able to fit all those containers in just 2 nodes. This happens even after we account for the presence of JITServer
(as you can see, we have two JITServer instances, one on each node).
The takeaway is that JITServer technology allows you to increase application density in the cloud and therefore reduce cost.
Now let's take a look at how each of the applications performed. The orange line represents the top row from the previous page, and the blue line represents the bottom row from the previous page, which uses the JITServer.
Each graph corresponds to one of the applications.
You can see that the performance is pretty even, despite the fact that the JITServer configuration is working with fewer worker-node CPUs. The small lags in the blue lines are likely caused by the noisy-neighbor effect, due to all the apps being loaded up at the same time.
***
I have here 4 graphs, one for each application and the blue line shows how throughput with JITServer varies in time, while the orange line represents the throughput of the baseline.
As you can see, the steady state throughput for the 2 configurations is the same.
From the ramp-up point of view JITServer is doing quite well for Petclinic and Quarkus, while for AcmeAir mono and micro there is a minuscule ramp-up lag, which I would say is negligible.
On the graphs you can also notice some dips in throughput, more pronounced for AcmeAir monolithic.
This is due to interference between applications, or the so-called noisy neighbor effect.
Since in practice applications are not likely to be loaded at the exact same time, in these experiments we apply load to the 4 applications in a staggered fashion, 2 minutes apart, starting with AcmeAir microservices and continuing with AcmeAir monolithic, Petclinic and Quarkus.
Those throughput dips correspond to these 2 minute intervals when the next application starts to become exercised causing a flurry of JIT compilations to happen.
If you pay close attention, you'll observe that the Baseline configuration is affected by the noisy-neighbor effect too, but to a lower extent because Baseline has 50% more CPUs at its disposal (3 nodes vs 2).