Tomcat Expert Series

                     Troubleshooting in Production
                                                                                    Filip Hanik
                                                                                   SpringSource
                                                                                        2009




Copyright 2007 SpringSource. Copying, publishing or distributing without express written permission is prohibited.
Topics in this Session


 • Brief overview of JVM memory layout
 • Understanding Out Of Memory Errors
    – Causes
    – Solutions
 • Error logs and stack traces
 • The tale of the thread dump




Topics in this Session


 • Using OS utilities to narrow down the
   problem
 • JMX – what it can do for you




The JVM process heap


           OS Memory (RAM)


       Process Heap (java/java.exe)


    Java Object Heap

                   Everything else…
Storing data in memory


• JVM manages the process heap (in most cases)
   – JNI managed memory would be an exception, and there
     are others
• No shared memory between processes
   – At least not available through the Java API
• JVM creates a Java Heap
   – Part of the process heap
• Configured through the -Xmx and -Xms settings




JVM Process Heap


• Maximum size is limited
   – 32-bit: roughly 2GB
   – 32-bit JVM on a 64-bit OS: roughly 3.7GB
   – 64-bit: much, much larger
• If 2GB is the max for the process
   – -Xmx1800m -Xms1800m is not a good choice
   – Leaves no room for anything else




Gotcha #1


• -Xmx and -Xms
   – Control only the Java Object Heap

• Often misunderstood to control the process heap

• Confusion leads to incorrect tuning
   – And in some cases, the situation worsens
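To see the distinction on a running JVM, the standard java.lang.management API reports the object heap separately from the non-heap areas it tracks. The sketch below is illustrative only: the class name HeapVersusNonHeap is made up for this example, and the non-heap figure does not include thread stacks, socket buffers or JNI allocations, which are not reported through this API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

final class HeapVersusNonHeap {
    /** Maximum Java object heap in bytes, the area -Xmx limits (-1 if undefined). */
    static long heapMaxBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getMax();
    }

    /** Bytes in use outside the object heap that the JVM still tracks. */
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("heap max (bytes):      " + heapMaxBytes());
        System.out.println("non-heap used (bytes): " + nonHeapUsedBytes());
    }
}
```

Running it with different -Xmx values should change only the first number.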




Java Object Heap Allocation


• Aggressive Heap Allocation
• -XX:MinHeapFreeRatio=
   – Default is 40 (40%)
   – When the JVM allocates memory, it allocates enough to
     get 40% free
• -XX:MaxHeapFreeRatio=
   – Default 70%
   – To give back memory when a majority of it is not used
• Not important when -Xms == -Xmx
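The two ratios bound how large the heap is kept relative to what is in use. The arithmetic below is a simplified model of that relationship, not the JVM's actual resizing code; the class and method names are hypothetical.

```java
final class HeapFreeRatio {
    /** Smallest capacity at which `used` still leaves minFreePercent of the heap free. */
    static long minCapacity(long used, int minFreePercent) {
        long divisor = 100 - minFreePercent;
        return (used * 100 + divisor - 1) / divisor; // round up
    }

    /** Largest capacity at which `used` leaves no more than maxFreePercent free. */
    static long maxCapacity(long used, int maxFreePercent) {
        return used * 100 / (100 - maxFreePercent);
    }

    public static void main(String[] args) {
        long usedMb = 600; // assumed amount of heap in use, in MB
        System.out.println("grow to at least: " + minCapacity(usedMb, 40) + " MB");
        System.out.println("shrink to at most: " + maxCapacity(usedMb, 70) + " MB");
    }
}
```

With 600MB in use and the default ratios, this model keeps the heap between 1000MB and 2000MB, provided -Xms and -Xmx allow it.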




Java Object Heap


        Java Object Heap (-Xmx/-Xms)


   Young Generation

                         Old Generation

A good size for the YG is 33% of the total heap
Java Object Heap


• Young Generation
   – All new objects are created here
   – Only moved to Old Gen if they survive one or more minor
     GCs
• Sized using
   – -Xmn – fixed value
   – -XX:NewRatio=<value> - dynamic sizing
   – -XX:MaxNewSize/-XX:NewSize – similar to -Xmx/-Xms
• Survivor Spaces
   – 2, used during the GC algorithm (minor collections)
   – Mainly to alleviate fragmentation
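As a worked example of the ratio flags: with -XX:NewRatio=N the old generation is N times the young generation, and with -XX:SurvivorRatio=S eden is S times one survivor space. The helper below only does that arithmetic; its names are made up for illustration and the 900MB heap is an assumed figure.

```java
final class GenerationSizing {
    /** Young generation size when old:young is newRatio:1. */
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** Size of ONE survivor space when eden:survivor is survivorRatio:1. */
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden is what remains of the young generation after both survivor spaces. */
    static long edenSize(long young, int survivorRatio) {
        return young - 2 * survivorSize(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapMb = 900; // assumed total heap
        long young = youngSize(heapMb, 2);
        System.out.println("young=" + young + "MB eden=" + edenSize(young, 8)
                + "MB survivor=" + survivorSize(young, 8) + "MB each");
    }
}
```

NewRatio=2 gives a young generation of one third of the heap.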


Young Generation

                        Young size(-XX:NewRatio )
                     Survivor Ratio(-XX:SurvivorRatio )


       Eden Space                      Survivor Space



        New Objects
           2MB default

                               To         From
                                  64KB default

Old Generation



              Tenured Space
             5MB min, 44MB max (default)


     Garbage collection section will
     explain in detail how these spaces
     are used during the GC process.

Java Heap Space


 • java.lang.OutOfMemoryError: Java heap space
    – Most common out of memory error
    – Caused by too many objects in the Java heap

 • Solution
    – Increase -Xmx if possible
    – Add -XX:+HeapDumpOnOutOfMemoryError
    – Fix memory leak if application is consuming more
      memory than expected

 • Side effects
    – Increasing -Xmx can have other side effects

JVM Process Heap

• Yes, there is more...
  –   Permanent Space
  –   Code Generation
  –   Socket Buffers
  –   Thread stacks
  –   Direct Memory Space
  –   JNI Code
  –   Garbage Collection
  –   JNI allocated memory
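Some of these areas can be inspected from inside the JVM. The sketch below lists the memory pools the JVM exposes through java.lang.management; pool names vary by JVM version and collector, the class name MemoryPools is hypothetical, and areas such as thread stacks and JNI allocations will not appear in the list.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {
    /** One line per pool: type (HEAP or NON_HEAP), name and bytes used. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long used = usage == null ? -1 : usage.getUsed();
            lines.add(pool.getType() + " | " + pool.getName() + " | used=" + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```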




Permanent Space


• Permanent Generation
  – Also called Permanent Space
  – 4MB initial, 64MB max
• Stores classes, methods and other metadata
  – -XX:PermSize=<value> (initial)
  – -XX:MaxPermSize=<value> (max)
• Common source of OOM on webapp reloads
  – A separate space for historical reasons
  – In the early days of Java, class GC was not common; a
    separate space reduces the size of the Java Heap
  – Does not exist in the IBM JVM


Permanent Space


• Permanent Space Memory Errors
  – Too many classes loaded
  – Classes not unloaded/not being GC'ed
  – Unaffected by the -Xmx flag
• java.lang.OutOfMemoryError: PermGen space
  – In many situations, increasing the max perm size will help
  – i.e. there is no leak, just not enough memory
  – Other situations require fixing the leak




Socket Buffers


• Each connection contains two buffers
   – Receive buffer ~37k
   – Send buffer ~25k
• Configured in Java code
   – Default and Max limits set in kernel
• Very commonly tuned parameter for content delivery
• You usually hit limits other than memory before you
  run out of memory
   – IOException: Too many open files (for example)
   – SocketException: No buffer space available



Thread Stacks


• Each thread has a separate memory space called
  “thread stack”
• Configured by -Xss
• Default value depends on OS/JVM
• As the number of threads increases, memory usage
  increases




Thread Stacks


• java.lang.OutOfMemoryError: unable to create new
  native thread
• Solution
   – Decrease -Xmx (or another space) and/or
   – Decrease -Xss
   – Or you have a thread leak: fix the program
• Gotcha
   – Increasing -Xmx (32-bit systems) leaves less room for
     threads if the heap is being used, which is the opposite
     of the solution
   – Too low a -Xss value can cause
     java.lang.StackOverflowError
• A thread dump will lead to an instant answer
   – It returns all the threads
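A back-of-the-envelope calculation shows why a larger -Xmx leaves room for fewer threads on a 32-bit system. Everything in this sketch is an assumption made for illustration: the 2GB process limit, the 128MB reserved for other areas, the 512KB stack size and the helper name. Real limits depend on the OS and JVM.

```java
final class ThreadBudget {
    /**
     * Rough upper bound on thread count: the address space left after the
     * heap and other areas, divided by the per-thread stack size.
     */
    static long estimateMaxThreads(long processLimitMb, long heapMb, long otherMb, long stackKb) {
        long remainingKb = (processLimitMb - heapMb - otherMb) * 1024;
        return remainingKb <= 0 ? 0 : remainingKb / stackKb;
    }

    public static void main(String[] args) {
        // Assumed figures: 2GB process limit, 128MB for other areas, 512KB stacks.
        System.out.println("-Xmx1800m: " + estimateMaxThreads(2048, 1800, 128, 512));
        System.out.println("-Xmx1024m: " + estimateMaxThreads(2048, 1024, 128, 512));
    }
}
```

Under these assumptions the estimate drops from 1792 threads to 240 when the heap grows from 1024MB to 1800MB.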
Garbage Collection


• If there is excessive GC, the JVM throws

• java.lang.OutOfMemoryError: GC overhead limit
  exceeded

• Thrown when more than 98% of the time is spent in GC
   – and less than 2% of the heap is recovered

• To disable
   – -XX:-UseGCOverheadLimit
• To enable
   – -XX:+UseGCOverheadLimit
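To get a feel for how much time a JVM spends collecting, the accumulated collection time can be read from the GarbageCollectorMXBeans and compared with uptime. This is only a rough sketch with a made-up class name: it averages over the whole life of the JVM, which is not how the JVM itself evaluates the overhead limit.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    /** Total accumulated collection time in milliseconds across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of elapsed time spent in GC, between 0.0 and 1.0 for sane inputs. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("GC fraction since start: " + gcFraction(totalGcMillis(), uptime));
    }
}
```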
GC: How It Works

       Eden Space               Survivor Space



                              From       To


                Tenured Space
1. A new object is created in Eden
2. When Eden is full, a minor collection occurs
3. Surviving objects are copied into the 1st survivor space
4. The next time Eden is full, surviving objects are copied from Eden and from the 1st survivor space into the 2nd
5. If the 2nd fills and objects remain in Eden or the 1st, these get copied to the tenured space
GC: Debugging it


 •   -Xloggc:%CATALINA_BASE%\logs\gc.log
 •   -XX:+PrintGCDetails
 •   -XX:+PrintGC
 •   -XX:+PrintGCApplicationStoppedTime
 •   -XX:+PrintGCTimeStamps
 •   -XX:+PrintHeapAtGC




Troubleshooting steps


 • What are the symptoms?
 • How does the problem manifest itself?
 • How can I dig into the problem?




Tomcat Logs


• Tomcat is really good at logging

• It doesn’t spit out info you don’t need to know

• When an error happens it can, however, generate
  tons of log entries

• Common IO exceptions are swallowed
   – Normal TCP behavior




Tomcat Logs


• Unexpected errors are always logged
   – Application and container errors

• Logs are always a resource

• Log entries are categorized
   –   DEBUG
   –   INFO
   –   WARNING
   –   SEVERE
   –   FATAL


Tomcat Logs


• INFO – no error, just information given to you

• WARNING – you might care a little bit

• SEVERE – yes, now you have an error

• FATAL – whatever this is, it can’t be good!

• Always pay attention to the log level




Tomcat Logs


 • So what do I need to look at?

 • SEVERE – yes, now you have an error

• It’s easy to ‘grep’ logs for these entries
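On a Unix shell, something like grep SEVERE on the catalina log is usually enough. Where a shell is not at hand, the same filter is a few lines of Java. The helper below is a hypothetical sketch that assumes the log level starts the line, as in the log excerpts in this deck.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class SevereFilter {
    /** Returns only the lines that start with the SEVERE level marker. */
    static List<String> severeEntries(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.startsWith("SEVERE:")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a log file, e.g. a catalina.<date>.log
        for (String hit : severeEntries(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(hit);
        }
    }
}
```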




Tomcat Logs


    • Very basic error
    • Tomcat can’t log errors that the application handles itself
        – Only if the application doesn’t ‘trap’ the error will Tomcat
          catch it and log some information
// An uncaught application error could look like this
SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NullPointerException
      at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:55)


    • Found in catalina.2008-03-28.log
        – Uncaught application exception




Java Stack Traces


• Shows the code execution path up until the error
  happened
   // An uncaught application error could look like this
   SEVERE: Servlet.service() for servlet jsp threw exception
   java.lang.NullPointerException
         at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:55)
         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
         at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:374)
         at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:337)
         at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
         at org.apache.catalina.valves.RequestDumperValve.invoke(RequestDumperValve.java:151)
         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
         at java.lang.Thread.run(Thread.java:595)

Java Stack Traces


• Traces can be chained; only the root cause is the
  real error
   // An uncaught application error could look like this
   SEVERE: Servlet.service() for servlet jsp threw exception
   java.lang.RuntimeException: java.lang.NullPointerException
         at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:59)
         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
   …
   Caused by: java.lang.NullPointerException
         at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:57)
         ... 19 more




• Always find the topmost error and the bottommost
  root cause
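In code, the bottommost "Caused by" entry is what you reach by following Throwable.getCause() until it returns null. A minimal sketch, with a made-up class name:

```java
final class RootCause {
    /** Walks the cause chain and returns the innermost throwable. */
    static Throwable of(Throwable top) {
        Throwable current = top;
        // Stop at the end of the chain; the second check guards against self-reference.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException(new NullPointerException("root"));
        System.out.println("top:  " + wrapped.getClass().getName());
        System.out.println("root: " + of(wrapped).getClass().getName());
    }
}
```

For the chained trace on this slide, the top is the RuntimeException and the root is the NullPointerException.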
Tomcat Logs


 • If the logs are not yielding any information
    – Gather more information about the error
 • Can you reproduce it?
 • Is it happening on your server?
    – Trace the request down




Viewing Requests


• Access logs can help
   – They record every request and its response code
   – You can print out headers, cookies, request and session
     information

• Often very useful to see how traffic is flowing

• When using httpd in front
   – httpd access log combined with Tomcat access log
   – Excellent way to consolidate requests




Seeing traffic


 • When using httpd in front
    – httpd has a mod_dumpio module
    – Print out everything one wants to know
    – Useful when privileges for a sniffer are not present



 • RequestDumper Valve
    – Valve that spits out everything, similar to mod_dumpio
    – Poorly designed, it breaks the output into multiple
      log statements
    – Not recommended



Seeing traffic


 • Network sniffers (client/server)
    – Nothing compares to getting exact data
    – Wireshark, Ethereal, tcpdump, etc.

 • Many choices, just pick one.
    – Often one is already installed
    – Requires root privileges

 • Client side visualizer
    – MS Fiddler is an excellent tool for Windows users
    – Firebug for Firefox



Seeing traffic


 • Ability to map
    – Request to error
    – Error to a time frame
    – Error to a client

 • Traffic pattern can
    – Help you reproduce the error
    – Resolve the issue faster




Thread dumps


• Displays the state of all threads in a virtual
  machine

• Provides plenty of information about activity and
  any deadlocks

• Provides a trace from where each thread started to
  its current point of execution




Thread dumps


• On Unix -> kill -3 <tomcat pid>
  – jstack -l also works
  – On Solaris, powerful pstack utility

• On Windows -> Ctrl+Break
  – With JDK 1.6+ you also have jstack to help

• Tanuki Wrapper
  – telnet <host:port> D

• Thread dump is printed to stdout
  – Or wherever stdout is redirected to
  – Don't send it to /dev/null !
The tale of Thread dumps


• Alternate way to dump
     – jstack – tool that comes with the JDK
     – Use -l option with jstack to get lock information
• More than just threads
 Heap
 def new generation total 9088K, used 5497K [0x04070000, 0x04a40000, 0x067d0000)
  eden space 8128K, 60% used [0x04070000, 0x045402a0, 0x04860000)
  from space 960K, 59% used [0x04950000, 0x049de428, 0x04a40000)
  to space 960K, 0% used [0x04860000, 0x04860000, 0x04950000)
 tenured generation total 121024K, used 1656K [0x067d0000, 0x0de00000, 0x24070000)
   the space 121024K, 1% used [0x067d0000, 0x0696e068, 0x0696e200, 0x0de00000)
 compacting perm gen total 12288K, used 4482K [0x24070000, 0x24c70000, 0x2c070000)
   the space 12288K, 36% used [0x24070000, 0x244d0920, 0x244d0a00, 0x24c70000)
    ro space 8192K, 66% used [0x2c070000, 0x2c5bd978, 0x2c5bda00, 0x2c870000)
    rw space 12288K, 52% used [0x2c870000, 0x2ceb9cb8, 0x2ceb9e00, 0x2d470000)




• JDK 1.6+ also prints out memory stats

The tale of Thread dumps


• More than just threads
• Deadlock detection

        Found one Java-level deadlock:
        =============================
        "pool-2-thread-6":
         waiting for ownable synchronizer 0x482eaeb0,
        (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
         which is held by "http-8081-2"
        "http-8081-2":
         waiting to lock monitor 0x08143b14 (object 0x482eade0,
        a org.apache.catalina.ha.session.DeltaRequest),
         which is held by "pool-2-thread-6"
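The same kind of deadlock check is available programmatically through ThreadMXBean, which can be handy from a monitoring hook. A minimal sketch with a hypothetical class name follows; it reports thread ids only, so the thread dump is still what names the locks involved.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class DeadlockCheck {
    /** Ids of deadlocked threads, or an empty array when none are found. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks, where supported.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadIds().length);
    }
}
```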




The tale of Thread dumps


 • Thread information
 • First line for each thread contains critical info

 "http-8082-7"              thread name
 daemon                     type of thread [daemon; empty means non-daemon]
 prio=10                    thread priority [1..10]
 tid=0x09142000             C++ pointer to JVM OSThread object
 nid=0x67c                  kernel thread identifier
 in Object.wait()           what the thread is doing
 [0xba3ff000..0xba3ff4e0]   address space


The tale of Thread dumps


• Thread information
    – CPU usage very high, caused by a spinning thread, but
      which one?

 nid=0x67c  kernel thread identifier

    – On Linux, for example, one can get CPU usage per thread

 ps -eL -o pid,%cpu,lwp | grep -i `ps -ef | grep -v grep | grep java | awk '{print $2}'` | grep -v 0.0



• Lists all threads that run inside a java process with
  a CPU usage higher than 0.0%
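One practical detail when matching the two outputs: the thread dump prints nid in hexadecimal, while ps prints the lwp column in decimal. A small conversion helper (the name is hypothetical) makes the lookup mechanical:

```java
final class NidToLwp {
    /** Converts a thread-dump nid such as "nid=0x67c" or "0x67c" to a decimal id. */
    static long toDecimal(String nid) {
        String value = nid.startsWith("nid=") ? nid.substring(4) : nid;
        return Long.decode(value); // Long.decode understands the 0x prefix
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("nid=0x67c")); // prints 1660
    }
}
```

The thread shown as nid=0x67c in the dump is therefore the one listed as lwp 1660 by ps.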
The tale of Thread dumps


•   Thread dumps can help you identify
•   Which threads are waiting for a lock
•   Deadlocks
•   Spinning threads, and what code they belong to
•   Memory usage
•   The cause of an unresponsive JVM




The tale of Thread dumps


 • When taking thread dumps ALWAYS take two or
   more dumps

 • This will help you see if threads are changing
   execution path

 • A single thread dump can cause a lot of “false
   positives”, where you think a thread is stuck but
   it’s not




The tale of Thread dumps


 • When taking thread dumps ALWAYS take two or
   more dumps

 • Sometimes when an OS is overloaded, maybe
   swapping, threads move, but very slowly

 • Without two dumps, you’d never know
   that!
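The comparison between two dumps can be sketched in code: capture the stacks twice and list the threads whose stack did not change. This is an in-process illustration with hypothetical names, not a replacement for kill -3 or jstack; the 5 second gap is arbitrary, and an unchanged stack is only a hint, since idle pool threads look the same in every dump.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DumpComparison {
    /** Captures thread name -> stack frames for every live thread in this JVM. */
    static Map<String, List<String>> snapshot() {
        Map<String, List<String>> dump = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            List<String> frames = new ArrayList<>();
            for (StackTraceElement frame : entry.getValue()) {
                frames.add(frame.toString());
            }
            // Threads sharing a name overwrite each other; fine for a sketch.
            dump.put(entry.getKey().getName(), frames);
        }
        return dump;
    }

    /** Names of threads whose non-empty stack is identical in both dumps. */
    static List<String> unchanged(Map<String, List<String>> first, Map<String, List<String>> second) {
        List<String> same = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : first.entrySet()) {
            List<String> later = second.get(entry.getKey());
            if (later != null && !later.isEmpty() && later.equals(entry.getValue())) {
                same.add(entry.getKey());
            }
        }
        Collections.sort(same);
        return same;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, List<String>> first = snapshot();
        Thread.sleep(5000); // arbitrary gap between the two dumps
        Map<String, List<String>> second = snapshot();
        System.out.println("Unchanged between dumps: " + unchanged(first, second));
    }
}
```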




The tale of Thread dumps


 • Examples of thread dumps and GC logs
   –   Stuck but not deadlocked (jvm-locked.txt)
   –   What went wrong? (five-thread-dumps.txt)
   –   Memory leak confirmed (gc.log/gc-logs-explained.txt)
   –   Excessive GC (gc-excessive-object.creation.txt)




OS utilities


 • Troubleshooting a Tomcat server
    – often involves more than just Tomcat and JVM
      troubleshooting
    – OS utilities come in very handy

 • Tomcat logs internal errors
    – If they spawned an exception
    – Container bugs are hard to troubleshoot, but rare




OS utilities


 • File descriptors are a common problem
    – Most OSes limit the number of FDs
    – An FD can be an open file or a socket

 • File descriptor leaks are also very common in web
   applications
    – java.io.IOException: Too many open files

 • Use utilities to track it down




OS utilities


 • File descriptors
    – Sockets
    – Open files
 • On Linux – lsof -p <process id>
    – Will list all open FDs
    – This will often put you on the right path to what is
      going wrong
    – On Solaris – pfiles
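The open descriptor count is also visible from inside the JVM on Unix-like systems. The sketch below relies on com.sun.management.UnixOperatingSystemMXBean, a JDK-specific extension rather than part of the standard API, so treat its availability as an assumption; the class name and the -1 fallback are choices made for this example.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class OpenFileDescriptors {
    /** Open file descriptors for this JVM, or -1 where the count is unavailable. */
    static long openCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** The descriptor limit for this JVM, or -1 where it is unavailable. */
    static long maxCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("open FDs: " + openCount() + " of " + maxCount());
    }
}
```

A count that climbs steadily toward the limit points to a leak; lsof then shows which files or sockets are piling up.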




OS utilities


 • Active Connections
    – Can be HTTP/AJP connections
    – Useful to track database or other connections

 • netstat is an excellent tool to view sockets and
   their current state

 • On Unix
    – Able to track socket buffers and their usage
    – Send buffer filling up, slow client or bad network
    – Receive buffer filling up, application is not reading fast
      enough

OS utilities


 • Linux
    – nmon
    – Very nice system stats collector
    – CPU,Memory,disk,network and more

 • Windows
    – Expand your task manager
    – It reports as much as you need (thread counts, IO
      activity, virtual vs resident memory)




JMX


 • Both Tomcat and the JVM
      – Make information available through JMX

 • jconsole
      – Utility that comes with the JDK
      – Lets you attach to a JVM and get information

 • JVM or application inoperable
      – JMX may not report accurately
      – Don't rely on it exclusively
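What jconsole displays can also be read in code through the platform MBean server. As a small sketch, the snippet below reads the HeapMemoryUsage attribute of the standard java.lang:type=Memory MBean from inside the same JVM; the class name is made up, and connecting to a remote Tomcat would need a JMX connector instead.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

final class JmxHeapQuery {
    /** Reads the used heap, in bytes, through JMX rather than the typed MXBean API. */
    static long heapUsedViaJmx() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData usage = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (Exception e) {
            // JMX lookups throw several checked exceptions; wrap them for brevity.
            throw new IllegalStateException("JMX query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedViaJmx());
    }
}
```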




So far…


 • We’ve covered tons of options to troubleshoot
   your systems
    – without actually adding any software
    – This is usually a requirement for production
      troubleshooting

 • There are additional options available

 • Such as profilers and monitoring applications




Profilers


 • Come in all shapes and sizes (and prices)

 • My preferences
    – www.yourkit.com
    – Very biased, as a Tomcat developer, I get it for free

 • But it has many advantages such as:
    –   Inexpensive
    –   Works fairly well in production environments
    –   Excellent support
    –   Great tool to use


Profilers


 • Only a last resort for production
    – When you are unable to reproduce in UAT/QA/etc
    – But sometimes required to solve the problem




Summary


 • Gather all the information first
    – Even if you think you don't need it
    – Create a check list
       •   Thread dumps
       •   Logs
       •   Configuration files
       •   OS statistics
       •   etc
 • Chasing a problem without all that, you can easily
   miss something
    – And you'll chase a needle in a haystack




  • 5. Storing data in memory • JVM manages the process heap (in most cases) – JNI managed memory would be an exception, and there are others • No shared memory between processes – At least not available through the Java API • JVM creates a Java Heap – Part of the process heap • Configured through -Xmx and -Xms settings 5
  • 6. JVM Process Heap • Maximum size is limited – 32 bit size, roughly 2GB – 32 bit JVM on 64 bit OS, roughly 3.7GB – 64 bit, much much larger ? • If 2GB is the max for the process – -Xmx1800m -Xms1800m – not very good – Leaves no room for anything else 6
  • 7. Gotcha #1 • -Xmx and -Xms – Only controls the Java Object Heap • Often misunderstood to control the process heap • Confusion leads to incorrect tuning – And in some cases, the situation worsens 7
  • 8. Java Object Heap Allocation • Aggressive Heap Allocation • -XX:MinHeapFreeRatio= – Default is 40 (40%) – When the JVM allocates memory, it allocates enough to get 40% free • -XX:MaxHeapFreeRatio= – Default 70% – To give back memory when a majority of it is not used • Not important when -Xms == -Xmx 8
  • 9. Java Object Heap Java Object Heap (-Xmx/-Xms) Young Generation Old Generation A good size for the YG is 33% of the total heap 9
  • 10. Java Object Heap • Young Generation – All new objects are created here – Only moved to Old Gen if they survive one or more minor GC • Sized using – -Xmn – fixed value – -XX:NewRatio=<value> - dynamic sizing – -XX:MaxNewSize/-XX:NewSize – similar to -Xmx/-Xms • Survivor Spaces – 2, used during the GC algorithm (minor collections) – Mainly to alleviate fragmentation 10
  • 11. Young Generation Young size(-XX:NewRatio ) Survivor Ratio(-XX:SurvivorRatio ) Eden Space Survivor Space New Objects 2Mb default To From 64Kb default 11
  • 12. Old Generation Tenured Space 5Mb min 44Mb max (default) Garbage collection section will explain in detail how these spaces are used during the GC process. 12
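The generations and spaces described on the slides above can also be inspected from inside a running JVM. The following is a minimal sketch, assuming a JVM that exposes its memory pools through the standard `java.lang.management` API; the class name `MemoryPools` is made up for illustration, and the pool names printed depend on the JVM version and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not from the slides): prints one line per JVM memory pool.
final class MemoryPools {

    // Formats the name, type (heap / non-heap) and current usage of one pool.
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage();
        if (usage == null) {
            // getUsage() returns null when the pool is no longer valid
            return pool.getName() + " (" + pool.getType() + "): not available";
        }
        long max = usage.getMax(); // -1 means no maximum is defined
        return String.format("%s (%s): used=%dK committed=%dK max=%s",
                pool.getName(), pool.getType(),
                usage.getUsed() / 1024, usage.getCommitted() / 1024,
                max < 0 ? "undefined" : (max / 1024) + "K");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```

Pools reported as non-heap are outside the space sized by -Xmx and -Xms, which is the point of the "Gotcha #1" slide.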
  • 13. Java Heap Space • java.lang.OutOfMemoryError: Java heap space – Most common out of memory error – Caused by too many objects in the Java heap • Solution – Increase -Xmx if possible – Add -XX:+HeapDumpOnOutOfMemoryError – Fix memory leak if application is consuming more memory than expected • Side effects – Increasing -Xmx can have other side effects 13
  • 14. JVM Process Heap • Yes, there is more... – Permanent Space – Code Generation – Socket Buffers – Thread stacks – Direct Memory Space – JNI Code – Garbage Collection – JNI allocated memory 14
  • 15. Permanent Space • Permanent Generation – Permanent Space (name for it) – 4Mb initial, 64Mb max • Stores classes, methods and other meta data – -XX:PermSize=<value> (initial) – -XX:MaxPermSize=<value> (max) • Common OOM for webapp reloads – Separate space for pre-historic reasons – Early days of Java, class GC was not common, reduces size of the Java Heap – Does not exist in IBM JVM 15
  • 16. Permanent Space • Permanent Space Memory Errors – Too many classes loaded – Classes not unloaded/not being GC'ed – Unaffected by the -Xmx flag • java.lang.OutOfMemoryError: PermGen space – In many situations, increasing max perm size will help – i.e. no leak, but just not enough memory – Others will require fixing the leak 16
  • 17. Socket Buffers • Each connection contains two buffers – Receive buffer ~37k – Send buffer ~25k • Configured in Java code – Default and Max limits set in kernel • Very common tune parameter for content delivery • Usually hit other limits than memory before you run out of memory – IOException: Too many open files (for example) – SocketException: No buffer space available 17
  • 18. Thread Stacks • Each thread has a separate memory space called “thread stack” • Configured by -Xss • Default value depends on OS/JVM • As the number of threads increases, memory usage increases 18
  • 19. Thread Stacks • java.lang.OutOfMemoryError: unable to create new native thread • Solution – Decrease -Xmx (or other space) and/or – Decrease -Xss – Or, you have a thread leak, fix the program • Gotcha – Increasing -Xmx (32bit systems) will leave less room for threads if it is being used, hence the opposite of the solution – Too low -Xss value can cause java.lang.StackOverflowError • Thread dump will lead to instant answer – Returns all the threads 19
  • 20. Garbage Collection • However, if there is excessive GC • java.lang.OutOfMemoryError: GC overhead limit exceeded • Thrown when 98% of the time is spent in GC and less than 2% of the heap is recovered • To disable – -XX:-UseGCOverheadLimit • To enable – -XX:+UseGCOverheadLimit 20
  • 21. GC: How It Works • Spaces: Eden Space, Survivor Space (From, To), Tenured Space • 1. New object is created in Eden • 2. When Eden is full, a minor collection runs • 3. Surviving objects are copied into the 1st survivor space • 4. Next time Eden is full, survivors are copied from Eden and from the 1st survivor space into the 2nd • 5. If the 2nd fills and objects remain in Eden or the 1st, these get copied to the tenured space 21
  • 22. GC: Debugging it • -Xloggc:%CATALINA_BASE%\logs\gc.log • -XX:+PrintGCDetails • -XX:+PrintGC • -XX:+PrintGCApplicationStoppedTime • -XX:+PrintGCTimeStamps • -XX:+PrintHeapAtGC 22
  • 23. Troubleshooting steps • What are the symptoms • How does the problem manifest itself • How can I dig into the problem 23
  • 24. Tomcat Logs • Tomcat is really good at logging • It doesn’t spit out info you don’t need to know • When an error happens it can, however, generate tons of log entries • Common IO exceptions are swallowed – Normal TCP behavior 24
  • 25. Tomcat Logs • Unexpected errors are always logged – Application and container errors • Logs are always a resource • Log entries are categorized – DEBUG – INFO – WARNING – SEVERE – FATAL 25
  • 26. Tomcat Logs • INFO – no error, just information given to you • WARNING – you might care a little bit • SEVERE – yes, now you got an error • FATAL – whatever this is, it can’t be good! • Always pay attention to the log level 26
  • 27. Tomcat Logs • So what do I need to look at • SEVERE – yes, now you got an error • It’s easy to ‘grep’ logs for these entries 27
  • 28. Tomcat Logs • Very basic error • Tomcat can’t log errors that the application handles itself – Only when the application doesn’t ‘trap’ the error will Tomcat catch it and log some info
      // An uncaught application error could look like this
      SEVERE: Servlet.service() for servlet jsp threw exception
      java.lang.NullPointerException
          at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:55)
    • Found in catalina.2008-03-28.log – Uncaught application exception
  • 29. Java Stack Traces • Shows the code execution path up until the error happened
      // An uncaught application error could look like this
      SEVERE: Servlet.service() for servlet jsp threw exception
      java.lang.NullPointerException
          at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:55)
          at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
          at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:374)
          at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:337)
          at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:266)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
          at org.apache.catalina.valves.RequestDumperValve.invoke(RequestDumperValve.java:151)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
          at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
          at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
          at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
          at java.lang.Thread.run(Thread.java:595)
  • 30. Java Stack Traces • Traces can be chained, only the root cause is the real error
      // An uncaught application error could look like this
      SEVERE: Servlet.service() for servlet jsp threw exception
      java.lang.RuntimeException: java.lang.NullPointerException
          at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:59)
          at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
          …
      Caused by: java.lang.NullPointerException
          at org.apache.jsp.npe_jsp._jspService(npe_jsp.java:57)
          ... 19 more
    • Always find the top most error and the bottom most root cause
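Finding the "bottom most root cause" of a chained trace can be done in code as well as by eye. The sketch below is illustrative only: the class `RootCause` and its method `of` are hypothetical names, and it relies solely on the standard `Throwable.getCause()` method.

```java
// Hypothetical helper (not from the slides): walks a chained exception
// down to its root cause.
final class RootCause {

    static Throwable of(Throwable top) {
        Throwable current = top;
        // Each getCause() call follows one "Caused by:" link in the trace.
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Same shape as the slide: a RuntimeException wrapping a NullPointerException.
        Throwable chained = new RuntimeException(new NullPointerException());
        System.out.println("top-most error: " + chained);
        System.out.println("root cause:     " + RootCause.of(chained));
    }
}
```

An application-level error handler could log both values together, so that a `grep` for the top-most error also surfaces the root cause.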
  • 31. Tomcat Logs • If the logs are not yielding any information – Gather more information about the error • Can you reproduce it • Is it happening on your server – Trace the request down 31
  • 32. Viewing Requests • Access logs can help – They record every request and its response code – You can print out headers, cookies, request and session information • Often very useful to see how traffic is flowing • When using httpd in front – httpd access log combined with Tomcat access log – Excellent way to consolidate requests 32
  • 33. Seeing traffic • When using httpd in front – httpd has a mod_dumpio module – Print out everything one wants to know – Useful when privileges for a sniffer are not present • RequestDumper Valve – Valve that spits out everything, similar to mod_dumpio – Poorly designed, it splits the output across multiple log statements – Not recommended 33
  • 34. Seeing traffic • Network sniffers (client/server) – Nothing compares to getting exact data – Wireshark (formerly Ethereal), tcpdump, etc • Many choices, just pick one. – Often one is already installed – Requires root privileges • Client side visualizer – MS Fiddler is an excellent tool for Windows users – Firebug for Firefox 34
  • 35. Seeing traffic • Ability to map – Request to error – Error to a time frame – Error to a client • Traffic pattern can – Help you reproduce the error – Resolve the issue faster 35
  • 36. Thread dumps • Displays the state of all threads in a virtual machine • Provides plenty of information about activity and any deadlocks • Provides a trace from where each thread started to its current point of execution 36
  • 37. Thread dumps • On Unix -> kill -3 <tomcat pid> – jstack -l also works – On Solaris, powerful pstack utility • On Windows Ctrl+Break – JDK 1.6+ you have jstack to help • Tanuki Wrapper – telnet <host:port> D • Thread dump is printed to stdout – Or wherever stdout is redirected to – Don't send it to /dev/null ! 37
  • 38. The tale of Thread dumps • Alternate way to dump – jstack – tool that comes with the JDK – Use -l option with jstack to get lock information • More than just threads
      Heap
       def new generation   total 9088K, used 5497K [0x04070000, 0x04a40000, 0x067d0000)
        eden space 8128K,  60% used [0x04070000, 0x045402a0, 0x04860000)
        from space 960K,  59% used [0x04950000, 0x049de428, 0x04a40000)
        to   space 960K,   0% used [0x04860000, 0x04860000, 0x04950000)
       tenured generation   total 121024K, used 1656K [0x067d0000, 0x0de00000, 0x24070000)
         the space 121024K,   1% used [0x067d0000, 0x0696e068, 0x0696e200, 0x0de00000)
       compacting perm gen  total 12288K, used 4482K [0x24070000, 0x24c70000, 0x2c070000)
         the space 12288K,  36% used [0x24070000, 0x244d0920, 0x244d0a00, 0x24c70000)
          ro space 8192K,  66% used [0x2c070000, 0x2c5bd978, 0x2c5bda00, 0x2c870000)
          rw space 12288K,  52% used [0x2c870000, 0x2ceb9cb8, 0x2ceb9e00, 0x2d470000)
    • JDK 1.6+ also prints out memory stats
  • 39. The tale of Thread dumps • More than just threads • Deadlock detection
      Found one Java-level deadlock:
      =============================
      "pool-2-thread-6":
        waiting for ownable synchronizer 0x482eaeb0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
        which is held by "http-8081-2"
      "http-8081-2":
        waiting to lock monitor 0x08143b14 (object 0x482eade0, a org.apache.catalina.ha.session.DeltaRequest),
        which is held by "pool-2-thread-6"
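The same kind of deadlock check that a thread dump prints can be requested programmatically. This is a minimal sketch, assuming the standard `java.lang.management.ThreadMXBean` API is available in the running JVM; the class name `DeadlockCheck` and the report format are invented for illustration and do not reproduce the JVM's own dump output.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not from the slides): reports Java-level deadlocks.
final class DeadlockCheck {

    static final String NONE = "no deadlock detected";

    static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() also covers java.util.concurrent locks, but is
        // only usable when the JVM can monitor ownable synchronizers.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        if (ids == null) {
            return NONE; // both methods return null when nothing is deadlocked
        }
        StringBuilder out = new StringBuilder("deadlocked threads:\n");
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            out.append("  \"").append(info.getThreadName())
               .append("\" waits for ").append(info.getLockName())
               .append(" held by \"").append(info.getLockOwnerName())
               .append("\"\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Because this runs inside the JVM it is examining, it may not respond when the JVM itself is unresponsive; an external thread dump remains the more dependable option in that case.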
  • 40. The tale of Thread dumps • Thread information • First line for each thread contains critical info
      "http-8082-7"               thread name
      daemon                      type of thread (daemon; empty means non-daemon)
      prio=10                     thread priority [1..10]
      tid=0x09142000              C++ pointer to JVM OSThread object
      nid=0x67c                   kernel thread identifier
      in Object.wait()            what the thread is doing
      [0xba3ff000..0xba3ff4e0]    address space
  • 41. The tale of Thread dumps • Thread information – CPU usage very high, caused by a spinning thread, but which one?
      nid=0x67c    kernel thread identifier
    – On Linux, for example, one can get CPU usage per thread
      ps -eL -o pid,%cpu,lwp | grep -i `ps -ef |grep -v grep |grep java|awk '{print $2}'` |grep -v 0.0
    • Lists all threads that run inside a java process with a CPU usage higher than 0.0%
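Matching the two outputs takes one conversion: the thread dump prints `nid` in hexadecimal, while `ps` prints the LWP column in decimal. The sketch below assumes that on Linux the `nid` value is the same kernel thread id that `ps -eL` reports as `lwp`; the class name `NativeThreadId` is hypothetical.

```java
// Hypothetical helper (not from the slides): converts the hexadecimal nid
// from a thread dump into the decimal form shown in the ps lwp column.
final class NativeThreadId {

    // Accepts "nid=0x67c", "0x67c" or "67c".
    static long toDecimal(String nid) {
        String hex = nid.trim();
        if (hex.startsWith("nid=")) {
            hex = hex.substring("nid=".length());
        }
        if (hex.startsWith("0x") || hex.startsWith("0X")) {
            hex = hex.substring(2);
        }
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        // The nid from the slide's example thread
        System.out.println(toDecimal("nid=0x67c")); // prints 1660
    }
}
```

Going the other way, `Long.toHexString(lwp)` turns a busy LWP from `ps` into the value to search for in the dump.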
  • 42. The tale of Thread dumps • Thread dumps can help you identify • Threads that are waiting for a lock • Deadlocks • Spinning threads, and the code they are running • Memory usage • The cause of an unresponsive JVM 42
  • 43. The tale of Thread dumps • When taking thread dumps ALWAYS take two or more dumps • This will help you see if threads are changing execution path • Single thread dump can cause a lot of “false positives”, where you think a thread is stuck but it’s not 43
  • 44. The tale of Thread dumps • When taking thread dumps ALWAYS take two or more dumps • Sometimes when an OS is overloaded, maybe swapping, threads move, but very slowly • If you didn’t have two dumps, you’d never know that! 44
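The "two or more dumps" advice can be illustrated in-process with the standard `Thread.getAllStackTraces()` method. This is only a sketch under stated assumptions: the class `DumpDiff`, the method `unchanged`, and the two-second gap are all made up for illustration, and a production diagnosis would normally compare dumps taken externally with `kill -3` or `jstack`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the slides): compares two stack samples.
final class DumpDiff {

    // Returns the names of threads whose whole stack is identical in both samples.
    static List<String> unchanged(Map<Thread, StackTraceElement[]> first,
                                  Map<Thread, StackTraceElement[]> second) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : first.entrySet()) {
            StackTraceElement[] later = second.get(entry.getKey());
            if (later != null && later.length > 0
                    && Arrays.equals(entry.getValue(), later)) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Thread, StackTraceElement[]> first = Thread.getAllStackTraces();
        Thread.sleep(2000); // arbitrary gap between the two samples
        Map<Thread, StackTraceElement[]> second = Thread.getAllStackTraces();
        for (String name : unchanged(first, second)) {
            System.out.println("same stack in both samples: " + name);
        }
    }
}
```

An identical stack is only a hint: idle pool threads parked in a wait look the same in every sample, so the result still has to be read alongside the thread state.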
  • 45. The tale of Thread dumps • Examples of thread dumps and GC logs – Stuck but not dead locked (jvm-locked.txt) – What went wrong? (five-thread-dumps.txt) – Memory leak confirmed (gc.log/gc-logs-explained.txt) – Excessive GC (gc-excessive-object.creation.txt) 45
  • 46. OS utilities • Troubleshooting a Tomcat server – often involves more than just Tomcat and JVM troubleshooting – OS utilities come in very handy • Tomcat logs internal errors – If they spawned an exception – Container bugs are hard to troubleshoot, but rare 46
  • 47. OS utilities • File descriptors are common problems – Most OS have a limit on FD – A FD can be an open file or a socket • File descriptor leaks are also very common in web applications – java.io.IOException: Too many open files • Use utilities to track it down 47
  • 48. OS utilities • File descriptors – Sockets – Open files • On Linux – lsof -p <process id> – Will list all open FD – This will often put you on the right path on what is going wrong – Solaris – pfiles 48
  • 49. OS utilities • Active Connections – Can be HTTP/AJP connections – Useful to track database or other connections • netstat is an excellent tool to view sockets and their current state • On Unix – Able to track socket buffers and their usage – Send buffer filling up, slow client or bad network – Receive buffer filling up, application is not reading fast enough 49
  • 50. OS utilities • Linux – nmon – Very nice system stats collector – CPU, memory, disk, network and more • Windows – Expand your task manager – It reports as much as you need (thread counts, IO activity, virtual vs resident memory) 50
  • 51. JMX • Both Tomcat and the JVM – Make information available through JMX • jconsole – Utility that comes with the JDK – Lets you attach to a JVM and get information • JVM or application inoperable – JMX may not report accurately – Don't rely on it exclusively 51
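What jconsole shows is also reachable from code through the platform MBean server. The following is a minimal sketch that reads two standard JVM attributes from within the same process; the class name `JmxPeek` is hypothetical, and the example deliberately uses only the JVM's own `java.lang` MBeans rather than guessing at Tomcat's MBean names.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper (not from the slides): reads one attribute of one MBean.
final class JmxPeek {

    static Object read(String objectName, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            // Covers a malformed name, a missing MBean and a missing attribute.
            throw new IllegalStateException(
                    "cannot read " + attribute + " of " + objectName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("live threads: "
                + read("java.lang:type=Threading", "ThreadCount"));
        System.out.println("uptime (ms):  "
                + read("java.lang:type=Runtime", "Uptime"));
    }
}
```

As the slide warns, values read this way come from the JVM being examined, so they should be cross-checked against OS-level tools when the JVM is in trouble.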
  • 52. So far… • We’ve covered tons of options to troubleshoot your systems – without actually adding any software – This is usually a requirement for production troubleshooting • There are additional options available • Such as profilers and monitoring applications 52
  • 53. Profilers • Comes in all shapes and sizes (and prices) • My preferences – www.yourkit.com – Very biased, as a Tomcat developer, I get it for free • But it has many advantages such as: – Inexpensive – Works fairly well in production environments – Excellent support – Great tool to use 53
  • 54. Profilers • Only last resort for production – When you are unable to reproduce in UAT/QA/etc – But sometimes required to solve the problem 54
  • 55. Summary • Gather all the information first – Even if you think you don't need it – Create a check list • Thread dumps • Logs • Configuration files • OS statistics • etc • Chasing a problem without all that, you can easily miss something – And you'll chase a needle in a haystack 55