  1. The Next Frontier in Open Source Java Compilers: Just-In-Time Compilation as a Service 1
  2. Rich Hagarty IBM Software Developer / Advocate @rhagarty8 2
  3. Grace Jansen IBM Software Developer Advocate & Java Champion @gracejansen27 jansen/ 3
  4. Agenda 4 JIT-as-a-Service
  5. Agenda 5 JIT-as-a-Service PROBLEM: Java on Cloud - a bad fit for microservices
  6. Agenda REASON: JVM and JIT compiler – the good and the bad 6 JIT-as-a-Service PROBLEM: Java on Cloud - a bad fit for microservices
  7. Agenda REASON: JVM and JIT compiler – the good and the bad 7 JIT-as-a-Service SOLUTION: JIT-as-a-Service to the rescue PROBLEM: Java on Cloud - a bad fit for microservices
  8. Java on Cloud A Good Fit? 8
  9. Legacy Java Apps 9 •Java monolith on dedicated server •Plenty of CPU power and memory •Never went down •6 month upgrade/refresh schedule
  10. Moving to the Cloud 10 -Running in containers -Managed by Cloud Provider (and K8s) -Auto-scaling to meet demand Cloud native App talking to Microservices
  11. Main Motivators 11 -Flexible & scalable -Easier to roll-out new releases more frequently -Take advantage of latest-greatest Cloud technologies -Less infrastructure to maintain and manage -Saving money
  12. Performance vs Cost 12 Performance Cost Variables – Container Size and Container Instances Why is this so hard to get right?
  13. JVM and JIT compiler: The Good and The Bad 13
  14. Java Virtual Machine (JVM) 14 The Good • Device independent – write once, run anywhere • > 25 years of improvements • JIT produces optimized machine code through use of Profilers • Efficient garbage collection • The longer it runs, the better it runs (JVM collects more profile data, JIT compiles more methods)
  15. Java Virtual Machine (JVM) 15 The Bad • Initial execution run is “interpreted”, which is relatively slow • “Hot Spot” methods compiled by JIT can create CPU and memory spikes • CPU spikes cause lower QoS • Memory spikes cause OOM issues, including crashes • Slow start-up time • Slow ramp-up time
  16. Java Virtual Machine (JVM) 16 [Charts: Daytrader7 CPU consumption (CPU utilization % vs. time) — CPU spikes caused by JIT compilation; Daytrader7 memory footprint (resident set size in KB vs. time) — footprint spikes caused by JIT compilation]
  17. Cost vs Performance - Revisited 17 Finding the ”Sweet Spot” • Right-sized provisioned containers • Well-behaved efficient auto-scaling Let’s revisit our 2 variables
  18. Container Size 18 Main issues: • Need to over-provision to avoid OOM • Very hard to do – JVMs have non-deterministic behavior [Chart: Daytrader7 memory footprint (resident set size in KB vs. time) — footprint spikes caused by JIT compilation]
  19. Container Instances (Auto-Scaling) 19 Main issues: •Slow start-up and ramp-up times •CPU spikes can cause auto-scaler to incorrectly launch additional instances
  20. Cost vs Performance - Revisited 20 Solution •Minimize/eliminate CPU and memory spikes •Improve start-up and ramp-up times
  21. JIT-as-a-Service to the rescue 21
  22. JIT-as-a-Service Decouple the JIT compiler from the JVM and let it run as an independent process. Offload JIT compilation to a remote process. [Diagram: JVMs with local JIT crossed out, offloading compilations to Remote JIT services managed by the Kubernetes Control Plane] Treat JIT compilation as a cloud service • Auto-managed by orchestrator • A mono-to-micro solution • Local JIT still available 22
  23. Eclipse OpenJ9 JITServer • JITServer feature is available in the Eclipse OpenJ9 JVM • “Semeru Cloud Compiler” when used with Semeru Runtimes • OpenJ9 combines with OpenJDK to form a full JDK Link to GitHub repo: 23
  24. Overview of Eclipse OpenJ9 Designed from the start to span all the operating systems needed by IBM products This JVM can go from small to large Can handle constrained environments or memory rich ones Renowned for its small footprint, fast start-up and ramp-up time Is used by the largest enterprises on the planet 24
  25. Eclipse OpenJ9 Performance 25
  26. IBM Semeru Runtimes “The part of Java that’s really in the clouds” IBM-built OpenJDK runtimes powered by the Eclipse OpenJ9 JVM No cost, stable, secure, high performance, cloud optimized, multi-platform, ready for development and production use Open Edition • Open source license (GPLv2+CE) • Available for Java 8, 11, 17, 18 (soon 19) Certified Edition • IBM license • Java SE TCK certified • Available for Java 11, 17 26
  27. JITServer advantages for JVM Clients 27 Provisioning Easier to size; only consider the needs of the application Performance Improved ramp-up time due to JITServer supplying extra CPU power when the JVM needs it the most. Reduced CPU consumption with JITServer AOT cache Cost Reduced memory consumption means increased application density and reduced operational cost. Efficient auto-scaling – only pay for what you need/use. Resiliency If the JITServer crashes, the JVM can continue to run and compile with its local JIT
  28. Performance graphs 28
  29. Improve ramp-up time with JITServer • Experiment in docker containers • Show that JITServer improves ramp-up • Show that JITServer allows a lower memory limit for JVM containers 29 [Diagram of the test bed: three OpenLiberty+AcmeAir containers (1P; 400 MB, 200 MB, and 200 MB limits) run the application; MongoDB provides data persistence services; a JITServer (4P, 1 GB limit) provides JIT compilation services; JMeter instances apply load to the AcmeAir instances; InfluxDB collects throughput data from JMeter and Grafana displays it; Prometheus scrapes JITServer metrics and Grafana displays them]
  30. 30
  31. 31
  32. JITServer value in Kubernetes • cloud-an-aws-experiment/ • Experimental test bed • ROSA (RedHat OpenShift Service on AWS) • Demonstrate that JITServer is not tied to IBM HW or SW • OCP cluster: 3 master nodes, 2 infra nodes, 3 worker nodes • Worker nodes have 8 vCPUs and 16 GB RAM (only ~12.3 GB available) • Four different applications • AcmeAir Microservices • AcmeAir Monolithic • Petclinic (Springboot framework) • Quarkus • Low amount of load to simulate conditions seen in practice • OpenShift Scheduler to manage pod and node deployments/placement 32
  33. JITServer improves container density and cost [Diagram: pod placement across worker nodes; the number in each box is that container’s memory limit in MB. The default config spreads the containers over three worker nodes (per-node totals 8,250 MB, 8,550 MB, and 8,600 MB). The JITServer config fits everything on two nodes (per-node totals 9,250 MB and 9,850 MB) — 6.3 GB less overall, even with a JITServer instance (1,200 MB) on each node. Legend: AM: AcmeAir monolithic, A: Auth service, B: Booking service, C: Customer service, D: Database (mongo/postgres), F: Flight service, J: JITServer, M: Main service, P: Petclinic, Q: Quarkus] 33
  34. Throughput comparison [Charts: throughput (pages/sec) vs. time for AcmeAir Microservices, AcmeAir Monolithic, Petclinic, and Quarkus, comparing the JITServer and Baseline configurations] Machine load: 17.5% from apps, 7% from OpenShift 34 → JITServer and the default configuration achieve the same level of throughput at steady-state
  35. Conclusions from high density experiments • JITServer can improve container density and reduce operational costs of Java applications running in the cloud by 20-30% • Steady-state throughput is the same despite using fewer nodes 35
  36. Horizontal Pod Autoscaling in Kubernetes • Better autoscaling behavior with JITServer due to faster ramp-up • Less risk of tricking the HPA due to transient JIT compilation overhead 36 Setup: single-node MicroK8s cluster (16 vCPUs, 16 GB RAM); JVMs limited to 1 CPU, 500 MB; JITServer limited to 8 CPUs and has AOT cache enabled; load applied with JMeter, 100 threads, 10 ms think-time, 60 s ramp-up time; autoscaler scales up when average CPU utilization exceeds 0.5P, up to 15 AcmeAir instances [Chart: AcmeAir throughput (pages/sec) vs. time when using Kubernetes autoscaling — Baseline vs. JITServer+AOTcache]
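The autoscaler described on this slide could be expressed as a Kubernetes HorizontalPodAutoscaler roughly like the sketch below; the Deployment name and metadata are illustrative assumptions, not taken from the talk.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: acmeair-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: acmeair            # hypothetical Deployment running the AcmeAir JVMs
  minReplicas: 1
  maxReplicas: 15            # up to 15 AcmeAir instances, as in the experiment
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # scale up when average CPU exceeds 0.5P
```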
  37. How to use it 37
  38. JITServer usage basics • One JDK, three different personas • Normal JVM: $JAVA_HOME/bin/java MyApp • JITServer: $JAVA_HOME/bin/jitserver • Client JVM: $JAVA_HOME/bin/java -XX:+UseJITServer MyApp • Optional further configuration through JVM command line options • At the server: -XX:JITServerPort=… default: 38400 • At the client: -XX:JITServerAddress=… default: ‘localhost’ -XX:JITServerPort=… default: 38400 • Full list of options: • Note: Java version and OpenJ9 release at client and server must match 38
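The three personas above can be combined into a minimal command-line sketch; it assumes $JAVA_HOME points at an OpenJ9/Semeru JDK with the JITServer feature, and that server and client run the same Java version and OpenJ9 release.

```shell
# 1. Start the JITServer persona (listens on port 38400 by default):
$JAVA_HOME/bin/jitserver -XX:JITServerPort=38400 &

# 2. Run the application as a client JVM, offloading JIT compilations
#    to the server (address/port shown are the documented defaults):
$JAVA_HOME/bin/java -XX:+UseJITServer \
    -XX:JITServerAddress=localhost \
    -XX:JITServerPort=38400 \
    MyApp
```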
  39. JITServer usage in Kubernetes • Typically we create/configure • JITServer deployment • JITServer service (clients interact with service) • Use • Yaml files • Helm charts: repo • Certified Operators • Tutorial: kubernetes/ 39
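A minimal sketch of the Deployment plus Service pair described above; the container image, names, and labels are illustrative assumptions — in practice you would use the Helm chart or Operator mentioned on the slide.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jitserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jitserver
  template:
    metadata:
      labels:
        app: jitserver
    spec:
      containers:
      - name: jitserver
        image: my-semeru-image     # assumed: an OpenJ9/Semeru image with jitserver
        command: ["jitserver"]
        ports:
        - containerPort: 38400
---
apiVersion: v1
kind: Service
metadata:
  name: jitserver                  # clients then set -XX:JITServerAddress=jitserver
spec:
  selector:
    app: jitserver
  ports:
  - port: 38400
    targetPort: 38400
```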
  40. JITServer encryption/authentication through TLS • Needs additional JVM options • Server: -XX:JITServerSSLKey=key.pem -XX:JITServerSSLCert=cert.pem • Client: -XX:JITServerSSLRootCerts=cert.pem • Certificates and keys can be provided using Kubernetes TLS Secrets • Create TLS secret: • kubectl create secret tls my-tls-secret --key <private-key-filename> --cert <certificate-filename> • Use a volume to map “pem” files 40 apiVersion: v1 kind: Pod metadata: name: my-pod spec: containers: - name: my-container-name image: my-image volumeMounts: - name: secret-volume mountPath: /etc/secret-volume volumes: - name: secret-volume secret: secretName: my-tls-secret
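Tying the pieces together: with the secret mounted at /etc/secret-volume as in the pod spec above, the JVM options can point at the mapped files. The tls.crt/tls.key file names follow the standard keys of a Kubernetes TLS secret; the exact paths here are an assumption based on the mountPath shown.

```shell
# Server side: needs both the private key and the certificate.
$JAVA_HOME/bin/jitserver \
    -XX:JITServerSSLKey=/etc/secret-volume/tls.key \
    -XX:JITServerSSLCert=/etc/secret-volume/tls.crt

# Client side: only needs the root certificate to verify the server.
$JAVA_HOME/bin/java -XX:+UseJITServer \
    -XX:JITServerSSLRootCerts=/etc/secret-volume/tls.crt \
    MyApp
```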
  41. Monitoring • Support for custom metrics for Prometheus • Metrics scraping: GET request to http://<jitserveraddress>:<port>/metrics • Command line options: -XX:+JITServerMetrics -XX:JITServerMetricsPort=<port> • Metrics available • jitserver_cpu_utilization • jitserver_available_memory • jitserver_connected_clients • jitserver_active_threads • Verbose logging • Print client/server connections -XX:+JITServerLogConnections • Heart-beat: periodically print some JITServer stats to the verbose log • -Xjit:statisticsFrequency=<period-in-ms> • Print detailed information about client/server behavior -Xjit:verbose={JITServer},verbose={compilePerformance},vlog=… 41
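As a quick manual check of the metrics endpoint described above (the port number here is an arbitrary example):

```shell
# Start the server with metrics enabled on a chosen port:
$JAVA_HOME/bin/jitserver -XX:+JITServerMetrics -XX:JITServerMetricsPort=38500 &

# Scrape the Prometheus-format endpoint by hand; the response should
# include gauges such as jitserver_connected_clients:
curl http://localhost:38500/metrics
```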
  42. JITServer usage recommendations When to use it: • JVM needs to compile many methods in a relatively short time • JVM is running in a CPU/memory constrained environment, which can worsen interference from the JIT compiler • The network latency between JITServer and client VM is relatively low (<1ms) • To keep network latency low, use “latency-performance” profile for tuned and configure your VM with SR-IOV 42
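The tuned profile recommended above can be applied on RHEL-family hosts with a one-liner (assumes the tuned daemon is installed and running):

```shell
# Switch to the low-latency profile recommended for JITServer traffic:
sudo tuned-adm profile latency-performance

# Confirm which profile is active:
tuned-adm active
```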
  43. JITServer usage recommendations Recommendations: • 10-20 client JVMs connected to a single JITServer instance • JITServer needs 1-2 GB of RAM • Better performance if the compilation phases from different JVM clients do not overlap (stagger) • Encryption adds to the communication overhead; avoid if possible • In K8s use “sessionAffinity” to ensure a client always connects to the same server • Enable JITServer AOT cache: -XX:+JITServerUseAOTCache (client needs to have shared class cache enabled) 43
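The sessionAffinity recommendation above maps onto the Service definition roughly as sketched here (service name and selector are illustrative), so that a given client JVM keeps reconnecting to the same JITServer pod:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: jitserver
spec:
  selector:
    app: jitserver
  sessionAffinity: ClientIP        # pin each client IP to one backend pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800        # keep the affinity for 3 hours (example value)
  ports:
  - port: 38400
```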
  44. Final thoughts • JIT provides an advantage, but compilation adds overhead… so… • Disaggregate the JIT from the JVM → JIT compilation as a service • Eclipse OpenJ9 JITServer (a.k.a. Semeru Cloud Compiler) • Available now on Linux for Java 8, Java 11 and Java 17 (IBM Semeru Runtimes) • Especially good for constrained environments (micro-containers) • Kubernetes ready (Helm chart available, Prometheus integration) • Can improve ramp-up, autoscaling and performance of short-lived applications • Can reduce peak memory footprint, increase app density and reduce costs • A Java solution to a Java problem, with no compromise 44
  45. Thank You! 45 @gracejansen27 /grace-jansen/

Editor's notes

  1. This talk is entitled “The next frontier in Open Source Java Compilers: Just-in-Time Compilation as a Service”
  2. Introduction
  3. Introduction
  4. I assume everyone here is a Java developer and knows what a JVM and a JIT is. As you know the JVM, or Java Virtual Machine, executes your Java application, And the JIT, or Just-in-time Compiler is invoked by the JVM during run time to compile the most frequently called, or HOT, methods. With this in mind, today we will be talking about the concept of a JIT-as-a-Service, and why we need it.
  5. We are going to break this talk down into 3 parts: First we’ll discuss the problem we want to address: running Java on the cloud is not a good fit, specifically in a distributed and dynamic architecture like microservices
  6. 2. Then we’ll talk about the reason for this, by taking a look at the JVM and the JIT compiler, which have a great history but some side effects that can affect performance at start-up
  7. 3. Finally, we’ll discuss a way to get around these start-up issues by using JIT-as-a-Service
  8. Let’s start with some background on running Java apps in cloud
  9. For contrast, let’s start with how we all used to typically run our Java enterprise apps. It was a monolith application running on a dedicated server, and to make sure we didn't have any performance issues, we loaded that server with plenty of CPUs and memory. And, of course, that application ran great. It didn't matter if it took 10 minutes to start because it never went down. Maybe once every 6 months it would be taken offline to launch a new version with some library upgrades, a couple of new features, and some bug fixes.
  10. Now let's fast forward to today, where the trend is to deploy apps to the cloud. That same monolith application is now composed of many small microservices that all talk to each other, all running in containers, and managed by some cloud provider. And, depending on demand, there may be multiple instances of each microservice.
  11. And we do this for a couple of reasons: more agile and dynamic; can implement new releases more easily and frequently; positioned to take advantage of new cloud technologies; less infrastructure to maintain and manage – going from a constrained to a utility use model. And of course, a major motivator is to save money. But how do we ensure that performance is still acceptable to our customers while still minimizing cost so that we actually do save money?
  12. The main variables controlling cost and performance are how big our containers are, and how many instances of each are running. Container size we can control, but scaling of instances is left to the cloud orchestrator to manage. But we can do a lot to ensure that scaling is efficient and effective. More on this later. This graph shows the various ways we can get these variables wrong. Of course, if we under-provision our containers and not enough instances can be efficiently run, we save money, but the performance is unacceptable. On the opposite side, if we over-provision our containers and have too many instances running, we have great performance, but we’re wasting money. Our goal is to get to the bottom-right quadrant – the sweet spot. But this is extremely hard to accomplish. Getting this right is the new focus for Java vendors, all coming out with new technologies to address this problem. Why is this so hard? To better understand, we need to go over some background on how Java applications execute.
  13. For Java applications, it’s all about the JVM and the JIT, which are great and time-tested technologies, but they have some not so good side-effects, especially during start-up
  14. One reason Java really took off early on was because it was device independent - write once, run anywhere. And it’s been around for a long time, constantly improving over time. It uses a JIT to dynamically compile “hot” methods using Profilers to generate very optimized machine code – much more than you can get with using a static or Ahead-of-Time compiler. It has great garbage collection. And because it takes time for the JVM to profile and the JIT to compile, Java apps actually run better the longer they run.
  15. But there are some trade-offs. Before the JIT is invoked, the code is “interpreted”, which is relatively slow. And when the JIT is invoked, it can cause CPU and memory spikes. CPU spikes at the very least can lower QoS, and memory spikes cause OOM issues, including crashes. One of the main reasons JVMs crash is due to OOM issues. Both CPU and memory spikes slow down start-up time and ramp-up time. Start-up time is the time it takes for the app to be ready to process its first request, and ramp-up time is the time it takes for the JIT to compile all of the hot methods and to be running fully optimized.
  16. Here’s a graph of a typical Java app at start-up, and you can see the CPU spikes on the left, and memory spikes on the right. A lot of the CPU spikes are caused by JIT compilations, and you can see the biggest spikes occur at the start, when the JIT is the most active. These spikes can result in lower QoS, which means sluggish performance. This is also true for memory spikes. Again, you can see the biggest spikes are related to JIT compiles during ramp-up time. Memory spikes are particularly bad because they can cause OOM issues, including crashing the JVM.
  17. So now that we have some good background info, lets revisit our 2 variables for determining cost vs performance. Remember, our goal is to find that sweet spot - where we have just the right amount of resources provisioned for our containers, and we have containers set up for efficient auto-scaling.
  18. For container size, we now know why it’s hard to get the right size. We need to over-provision in order to avoid any OOM issues. We need to handle the initial spikes, but those resources are wasted once the app reaches steady-state. And the amount to over-provision is hard to determine because Java is non-deterministic, meaning we can run the same application twice and get different spike levels. You really need to run a series of different load tests to get this even close to right.
  19. For auto-scaling container instances, we now know we have 2 main issues. Slow start-up and ramp-up times make scaling ineffective – new instances take too long to start up, causing QoS issues. The alternative is to just start more instances than you think you need and effectively eliminate any auto-scaling. Another problem is that CPU spikes due to JIT compiles can cause issues with auto-scalers. These spikes may be interpreted incorrectly by the auto-scaler as demand load and may result in unnecessary instance launches. One way to minimize this problem is to set your thresholds very high, but again, this makes your auto-scaler less effective.
  20. The solution is pretty clear – we need to flatten out those CPU and memory spikes, and we need to improve start-up and ramp-up times.
  21. Which leads us to JIT-as-a-Service.
  22. The basic premise here is to decouple the JIT compiler from the JVM and let it run as an independent process. Here we show a couple of JVMs on the left, and remote JIT services on the right. The JVMs will no longer use their local JIT, and will offload their JIT compilations to the remote JIT services. Here we show the remote JIT processes containerized and made available as a cloud service. This gives us an added benefit – the service can be managed by orchestrators like Kubernetes, which can make sure it is always running and scaled properly to handle demand. And this solution is just like any other monolith-to-microservices conversion – in this case the JVM is the monolith that is turned into 2 loosely coupled microservices – the JIT and the rest of the JVM. Note that on the diagram we show the JVM JIT crossed out, but it still can be used if the remote JIT should become unavailable.
  23. This service already exists: it is called the JITServer and is a feature of the Eclipse OpenJ9 JVM – which is totally open source and free to download. It also goes by the name “Semeru Cloud Compiler”, because it is mostly distributed with the IBM Semeru Runtimes (which we will talk about in a minute). For distribution, OpenJ9 combines with the OpenJDK binaries to form a full JDK. As I mentioned, the Eclipse OpenJ9 JVM, and by extension the JITServer technology, is completely open source - here is a link to its GitHub repo.
  24. A little background on the OpenJ9 JVM – It originally started life as the J9 JVM, which was developed by IBM over 20 years ago to run all of their Java based workloads on IBM hardware. It was open sourced to the Eclipse Foundation around 5 years ago and re-branded as OpenJ9. It works with any Java workload, from microservices to monoliths, and is specifically designed to work in constrained environments. And it's well known for its small footprint and fast startup and ramp-up times. Over time it has been used by many Fortune 500 companies to run their enterprise Java applications. So it has a long history of success.
  25. Here we show how it compares to the popular HotSpot JVM. OpenJ9 is the green, and HotSpot is orange. And this comparison is independent of any JITServer advantages. These graphs are based on startup and ramp-up times. Remember, the distinction is start-up time is initial application load time, while ramp-up time is the time it takes to be running at peak throughput with optimized compiled code. Going left to right, you can see that start-up time can be 51% faster than HotSpot. Next we see that OpenJ9 has a 50% smaller footprint after start-up, which means more resources for the application. Next, we see faster ramp-up time. Notice how much longer it takes HotSpot to match the level of OpenJ9. And finally, you see OpenJ9 still has a smaller footprint, even after fully ramping up. All of these metrics are important when running in constrained environments.
  26. Semeru Runtimes is IBM's distribution of the OpenJDK binaries, and it is the only distribution that comes with the OpenJ9 JVM. It comes in 2 flavors – an Open and a Certified Edition. Both are free to download; the only difference is licensing and supported platforms. If you are wondering where the name came from, the connection is that Mount Semeru is the tallest mountain on the island of Java.
  27. Back to the JITServer - let’s take a look at the advantages, from the perspective of the JVM clients that will be utilizing the JITServer. For provisioning: Since there are no more JIT compilation spikes, sizing becomes much easier. There is no need to add in any “just-in-case” resources, and you can just focus on what the application needs. As for performance: It will be much more predictable – the JIT will no longer be stealing CPU cycles. And because the JITServer can provide additional CPU cycles from the start, ramp-up times will be improved. And this is especially true for short-lived apps, since a majority of their life span is spent during ramp-up. The JITServer also has its own AOT cache, which means that any new replicated instances can have access to already compiled methods. As for cost: Fewer resources are needed, and more efficient auto-scaling means only paying for what you need and use. And finally for resiliency: The JVM and the JITServer are separate processes, so the JVM can continue if the JITServer crashes. And the JVM still has use of its local JIT. Let’s take a closer look at some test results which show both cost savings and improved performance.
  28. This is the setup for the demo AcmeAir app: This application shows an implementation of a fictitious airline called "Acme Air". The application was built with some key business requirements: the ability to scale to billions of web API calls per day, and the need to develop and deploy the application targeting multiple cloud platforms (including public, private and hybrid). The application can be deployed both on-prem as well as on cloud platforms. This and the next few slides can be substituted by running the actual demo, or showing a copy of the demo located at
  29. Now let's take a look at another test to see how JITServer can help with provisioning. The experiment was conducted on a Red Hat OpenShift cluster on AWS. It has 3 worker nodes, with around 12 GB of RAM to play with. We will be running 4 test applications – 2 versions of the AcmeAir application, one as a monolith and one with microservices, plus a Springboot and a Quarkus application. We will apply a real-world load to the applications to simulate activity, and we will let the OpenShift Scheduler determine how to deploy and replicate the applications. *** Let’s look at a more complex example that demonstrates the value of JITServer in a Kubernetes setting. These experiments were performed on RedHat OpenShift Service on AWS (for those of you that don’t know, OpenShift is a Kubernetes distribution from RedHat). Our cluster has 3 worker nodes with 8 vCPUs and 16 GB of RAM, out of which only 12.3 GB are available (the rest is used by OS and OCP related applications). As workload we have 4 different applications: AcmeAir Microservices and AcmeAir Monolithic based on OpenLiberty, Petclinic (which is based on the Springboot framework) and a Quarkus based app. We apply a low amount of load to better reflect conditions seen in practice (I have seen quite a few studies that show that the level of utilization is somewhere between 6 and 15%, while another study from Google gives more generous numbers between 10 and 50%).
  30. So these slides show how the OpenShift scheduler decided to place the various pods on the worker nodes. Note that each application has a different color, and each application is replicated multiple times. The size of the shape indicates its relative container size. The number in the shape is the memory limit for that container. As you can see in the top row, all 3 worker nodes are used, and the sizes of the containers are all larger than in the bottom graph – this is due to building in extra memory to avoid OOM issues and improve throughput. The bottom row uses the JITServer, which results in only 2 worker nodes being used, despite the fact that the JITServer containers (shown in brown) are the largest containers in the node. The savings come from being able to scale down each of the application containers. The end result is a 33% cost saving by using one less worker node. *** This slide is an illustration of how the OpenShift scheduler decided to place the various containers on nodes. There are two different configurations: above we have the default configuration without JITServer, which needs 3 worker nodes. Below we have the JITServer configuration that only uses 2 worker nodes. The colored boxes represent containers and the legend on the right will help you decipher which application each container is running. The number in each box represents the memory limit for that container. These values were experimentally determined so that the application can run without an OOM or drastically reduced throughput. The boxes were drawn to scale, meaning that a bigger box represents a proportionally larger amount of memory given to that container. At a glance you can see that in the default config you cannot fit all those containers in just 2 nodes; you need 3 nodes. In contrast, the JITServer config uses 6.3 GB less and we are able to fit all those containers in just 2 nodes. 
This happens even after we account for the presence of JITServer (as you can see, we have two JITServer instances, one on each node). The takeaway is that JITServer technology allows you to increase application density in the cloud and therefore reduce cost.
  31. Now let's take a look at how each of the applications performed. The orange line represents the top row from the previous page, and the blue line represents the bottom row from the previous page, which uses the JITServer. Each graph represents one of the applications. You can see that the performance is pretty even, despite the fact that the JITServer is working with fewer worker-node CPUs. The small blue lags are likely caused by the noisy-neighbor effect due to all the apps being loaded up at the same time. *** I have here 4 graphs, one for each application; the blue line shows how throughput with JITServer varies in time, while the orange line represents the throughput of the baseline. As you can see, the steady-state throughput for the 2 configurations is the same. From the ramp-up point of view JITServer is doing quite well for Petclinic and Quarkus, while for AcmeAir mono and micro there is a minuscule ramp-up lag, which I would say is negligible. On the graphs you can also notice some dips in throughput, more pronounced for AcmeAir monolithic. This is due to interference between applications, or the so-called noisy neighbor effect. Since in practice applications are not likely to be loaded at the exact same time, in these experiments we apply load to the 4 applications in a staggered fashion, 2 minutes apart, starting with AcmeAir microservices and continuing with AcmeAir monolithic, Petclinic and Quarkus. Those throughput dips correspond to these 2-minute intervals when the next application starts to become exercised, causing a flurry of JIT compilations to happen. If you pay close attention, you'll observe that the Baseline configuration is affected too by the noisy neighbor effect, but to a lower extent because Baseline has 50% more CPUs at its disposal (3 nodes vs 2).
  32. So, to summarize, the experiments demonstrate that JITServer can increase container density in the cloud without sacrificing throughput. And this results in reducing the operational cost of Java applications by 20 to 30%. And just to note - ramp-up time can be slightly affected in high density scenarios due to limited computing resources, but how much depends on the level of load and the number of pods concurrently active. **** In conclusion, the experiments conducted on Amazon AWS demonstrate that JITServer can increase container density in the cloud without sacrificing throughput and therefore reduces the operational cost of Java applications by 20 to 30%. Ramp-up can be slightly affected in high density scenarios due to limited computing resources, but the extent of this depends on the level of load and the number of pods concurrently active.
  33. And for out last experiment, we wanted to see how JITServer affects autoscaling in Kubernetes. You can see a description of the test bed on the right. The autoscaler instantiates a new pod when the average CPU utilization exceeds 50%. The graph shows the throughput of the AcmeAir app while increasing amounts of load are applied. The orange line is the baseline, and the blue line represents using the JITServer. As you can see, the throughput curve continues to rise, and then it plateaus. The dips are associated with the launch of new pods, which burn a lot of CPU. Comparing the two curves, JITSever gives you a better behavior because it is able the warm-up the newly spawn pods faster. Also, without the JITServer there is the danger that the autoscaler will interpret the CPU used for compilation as load and be misled into launching even more pods. The JITServer makes this less likely. And because the JITServer caches compiled methods, new instances of the same app will have these methods available to them at start-up. *** HPA = Horizontal Pod Autoscaler We performed some experiments to see how JITServer affects the autoscaling behavior in Kubernetes. A description of the experimental test-bed can be found on the right. Let’s focus on the graph on the left which shows the throughput of AcmeAir app while increasing load is applied to it. HPA monitors the average CPU usage of the AcmeAir pods and if that exceeds our target of 0.5P, new pods are instantiated. Overall the throughput curve goes up and at some point it plateaus. Interestingly, the curve shows some transient dips in throughput and those correlate with HPA decisions to launch new pods. The new pods will burn a lot of CPU but yield poor performance until they warm-up. To maintain fairness the load balancer gives an equal number of requests to each pod, but because the new pods can only process a limited number of requests per second, the older pods will match that level of performance. 
Comparing the two curves, JITServer gives you better behavior because it is able to warm up the newly spawned pods faster. Moreover, without JITServer there is the danger that the autoscaler will interpret the CPU used for compilation as load and be misled into launching even more pods. This can be avoided by making the autoscaler less sensitive to CPU spikes; however, not reacting fast enough to a legitimate increase in load is also bad.
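The 50% CPU trigger described in this experiment can be expressed as a Kubernetes HorizontalPodAutoscaler resource. This is a minimal sketch; the deployment name and replica bounds are assumptions for illustration, and only the 50% CPU target comes from the talk:

```yaml
# Hypothetical HPA for the AcmeAir deployment used in the experiment.
# Names and min/max replica counts are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: acmeair-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: acmeair
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # new pod when average CPU exceeds 50%
```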
  34. If running from the command line, the JITServer can be started from the OpenJ9 bin directory by simply typing "jitserver". I would like to point out that the JITServer is just an OpenJ9 JVM, but running under a different persona. To use it from your app, just add the UseJITServer option when starting your Java application. And there are a number of other options, such as specifying the address and port number if you need more than the default values. Here is a link to all of the options. And note that the JITServer and its clients need to be on the same Java version and OpenJ9 release. ##### The port can be changed if there is a conflict with another service using the same port number.
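As a concrete sketch of the commands just described (the hostname is illustrative, and port 38400 is OpenJ9's documented default; check the linked options page for your release):

```shell
# Start the JITServer from the OpenJ9/Semeru bin directory.
# It is an ordinary OpenJ9 JVM running under a different persona.
jitserver

# Point a client JVM at it. -XX:+UseJITServer enables remote compilation;
# address and port default to localhost:38400 if omitted.
java -XX:+UseJITServer \
     -XX:JITServerAddress=jitserver.example.com \
     -XX:JITServerPort=38400 \
     -jar myapp.jar
```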
  35. For Kubernetes, you need to set up a JITServer deployment and service. You do this with YAML files or Helm charts, and an Operator is now available. Here is a link to a tutorial that walks you through the steps.
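A minimal sketch of what such a deployment and service might look like; the image name, labels, and replica count here are assumptions, and the tutorial linked on the slide has the authoritative version:

```yaml
# Illustrative JITServer Deployment and Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jitserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jitserver
  template:
    metadata:
      labels:
        app: jitserver
    spec:
      containers:
      - name: jitserver
        # Illustrative image; any OpenJ9/Semeru image with jitserver works.
        image: icr.io/appcafe/ibm-semeru-runtimes:open-17-jre
        command: ["jitserver"]
        ports:
        - containerPort: 38400
---
apiVersion: v1
kind: Service
metadata:
  name: jitserver
spec:
  selector:
    app: jitserver
  ports:
  - port: 38400
```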
  36. - You can establish trust and encrypt the communication between client and server using TLS - This can be done with the command-line options shown in blue, which specify the certificate file and private key to be used. - The certificate and the private key files can be stored as Kubernetes TLS secrets and mapped into the container using volumes. - I am showing here an excerpt of a YAML file, though other approaches are possible.
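The TLS options mentioned above look roughly like this; the file names are illustrative, and the exact option names should be verified against your OpenJ9 release:

```shell
# Server side: supply the certificate and private key.
jitserver -XX:JITServerSSLKey=key.pem -XX:JITServerSSLCert=cert.pem

# Client side: trust the server's certificate.
java -XX:+UseJITServer -XX:JITServerSSLRootCerts=cert.pem -jar myapp.jar
```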
  37. The JITServer can be queried for metrics. Here is the list, which is what we showed in the demo. You can also specify logging options.
  38. Here are some recommendations on when to use the JITServer. One use case is when your JVM needs to compile many methods in a relatively short time. Or you are running in a constrained environment where you can't afford CPU spikes from compilations. And only use it if you have low network latency; communication between the JVM and JITServer can create a lot of traffic. And you should use any latency-performance settings to tune your environment. ***** SR-IOV = single root I/O virtualization. Kubernetes uses "requests" and "limits" for resources like CPU and memory. "requests" is the amount of CPU/memory that a process is guaranteed to get; this value is used for pod scheduling. "limits" is the maximum amount of CPU/memory a pod is allowed to consume. By setting a smaller CPU "request" value, Kubernetes has more flexibility in scheduling the JITServer. By setting a larger CPU "limit" value, you give the JITServer the ability to use any unused CPU cycles. Typically the "request" value should be set to what the process consumes at steady state (if you can define such a thing for JITServer). -Xshareclasses on the client turns on AOT.
  39. As for recommendations on how to use it: The JITServer does require additional resources, so to get the most net benefit, you should try to have between 10 and 20 connected client JVMs. It needs at least 1-2 GB of RAM. If you are using it with Kubernetes, always set the CPU/memory limits (the maximum) much larger than the requests (the minimum); this can really help with handling CPU usage spikes. And as we saw in one of our experiments, it performs better if all of its clients are not started at the same time. Don't use encryption unless you really need it; it does add a lot of overhead. In Kubernetes, definitely use "sessionAffinity" to make sure JITServers and their clients stay connected. And the last tip is to use the AOT cache feature of OpenJ9: if both the client and JITServer have this enabled, AOT code can be cached on the server side and shared with all of the JVM instances.
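Two of the tips above, limits set well above requests and sessionAffinity, can be sketched in YAML like this; the actual resource values and names are illustrative:

```yaml
# Fragment of the JITServer container spec: a small guaranteed request
# with a much larger limit lets the server absorb compilation spikes
# using otherwise-idle CPU cycles.
resources:
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "4"
    memory: 8Gi
---
# Service with sessionAffinity so each client JVM keeps talking to the
# same JITServer instance (and its warm method cache).
apiVersion: v1
kind: Service
metadata:
  name: jitserver
spec:
  selector:
    app: jitserver
  sessionAffinity: ClientIP
  ports:
  - port: 38400
```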
  40. So, what did we learn today? JIT compilation adds overhead, and one solution is to disaggregate the JIT from the JVM and perform JIT compilations as a service. We demonstrated the OpenJ9 implementation, which is the JITServer, also known as the Semeru Cloud Compiler.