2. INTRODUCTION
jHiccup is an open source tool designed to measure the pauses
and stalls (or “hiccups”) associated with an application’s underlying
Java runtime platform. The new tool captures the aggregate
effects of the Java Virtual Machine (JVM), operating system,
hypervisor (if used) and hardware on application stalls and
response time.
jHiccup allows developers, systems operators and performance
engineers to easily create and analyze response time profiles, and
to clearly identify whether causes of application delays reside in
the application code or in the underlying runtime platform. jHiccup
is completely transparent and non-intrusive to the application, has
zero performance overhead in operation, and is compatible with all
Java applications using any JVM.
4. MEASUREMENTS
The hiccups measured are NOT stalls caused by the application's
code. They are stalls caused by the platform that would be visible
to and affect any application thread running on the platform at the
time of the stall. jHiccup shows graphically via 'Hiccup Charts' just
how responsive the runtime platform really is. By understanding
the pauses associated with the underlying platform, IT
organizations can better isolate latency and delays and the
contributing components.
If jHiccup experiences and records a certain level of measured
platform hiccups, it is safe to assume that the application running
on the same JVM platform during that time had experienced
hiccup/stall effects that are at least as large as the measured
level.
5. FUNCTIONALITIES
jHiccup helps you avoid common pitfalls in application
performance characterization. Most performance reporting
assumes a normal distribution of response times (single mode),
yet most 'real' systems are multi-modal due to GC pauses, OS or
virtualization 'hiccups', swapping, etc. jHiccup can help you find
and identify the causes of these delays that affect response times.
jHiccup works with any Java application. It is non-intrusive, runs
on all major operating systems and is compatible with all JVMs.
jHiccup is run as a wrapper around other applications so that
measurements can be done without any changes to application
code.
Other tools focus on the application itself; jHiccup is unique in
looking at the underlying platform.
6. HOW TO RUN JHICCUP
To use jHiccup, include the script command in your Java application start
command.
For example:
If your program were normally executed as:
java <Java args> UsefulProgram -a -b –c
Then the launch line would become:
jHiccup java <Java args> UsefulProgram -a -b -c.
7. JHICCUP LOG FILES
jHiccup records hiccup durations observed by the application it
wraps. It produces two log files that are updated at each recording
interval (which defaults to 5 seconds). The two log files are:
hiccup.<identifier>.log:
A single line entry is appended to the log file at every
recording interval. Each entry line lists the hiccup duration of
certain percentiles (50%, 90%, Max) of samples collected in the past
a aecording interval, as well as aggregate percentiles (50%,
90%, 99%, 99.9%, 99.99%, Max) for samples accumulated since
the beginning of the jHiccup run.
hiccup.<identifier>.hgrm:
This file is updated in it's entirety with each recording interval.
It contains detailed histogram data showing the distribution of
hiccup duration by percentile for the entire jHiccup run.
8. ANALYZING JHICCUP DATA &
CARTS
The data collected by jHiccup represents actual stall times observed
on the runtime platform (at the JVM level and below) while the
application being monitored is actually running on it. While jHiccup
does not measure application response times, application response
time will inevitably include any stall times observed at the runtime
platform level. jHiccup's Hiccup Charts, at any time point and at any
percentile, represent the absolute "best case" scenario data for
application response times - the time in which an empty amount of
work could be done. jHiccup data and Hiccup Charts can quickly
identify whether application responsiveness issues are dominated by
application code or by the underlying runtime platform's behavior:
1. If application response time behavior is similar to that observed by jHiccup, observed as a
correlation of dominant stall events in time and of response time %'ile distribution and
magnitude, the runtime platform's behavior and occasional stalls are likely the dominant
factor in application responsiveness. Improvement in runtime stall behavior through
tuning or eliminating runtime stall causes will most likely improve the application response
time characteristics.
2. If application response time behavior does not correlate with jHiccup data and Hiccup
Charts, the application code and/or external resources are most likely the dominant
factor, and runtime tuning and improvements in the area of stalls and responsiveness are
unlikely to result in improved application response behavior.