Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1ygAXJ2.
Eva Andreasson has a fairly entertaining and yet painful presentation around garbage collection that will probably will end with more questions than answers. Filmed at qconlondon.com.
Eva Andreasson has been working with Java virtual machine technologies, SOA, Cloud, and other enterprise middleware solutions for the past 10 years. She joined Cloudera in 2012. Recently launched Cloudera Search - a new real-time search workload natively integrated with the Hadoop stack - and drove the efforts that lead to 10x user uptake of Hue - the UI of the Hadoop ecosystem.
2. InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/garbage-collection-benefits
3. Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
6. WHAT IS GARBAGE COLLECTION (GC)?
4
• What is a Java Virtual Machine (JVM)?
• Runtime code compilation
• Dynamic memory management
• What is dynamic memory management?
• Does not require explicit memory allocation (when programming)
• Frees up memory no longer referenced
7. JAVA HEAP - SPOTLIGHT
5
• The mythical (?) Java heap
• -Xmx, -Xms
• Top: RES / RSS --- total memory footprint
• Full = a thread fails to allocate a new object
Java Heap: Where all Java Objects get allocated
JVM Internal
Memory:
• Code cache
• VM Threads
• VM & GC Structs
8. REFERENCE COUNTING VS. TRACING
6
• Reference Counting
• Plus a counter for each reference to an object
• Subtract for each removed reference to an object
• When 0, reclaim the heap space occupied by that object
• Simple to reclaim
• Hard to maintain counters
• Difficult (costly) to handle cyclic structures
9. REFERENCE COUNTING VS. TRACING
7
• Tracing
• Identify roots (thread stacks, …, etc)
• Trace all references from those objects, recursively
• Anything not found is garbage
• Simple to maintain, handles cyclic structures
• Needs to trace all live objects before reclaiming memory
10. SIDE NOTE: WHAT IS FRAGMENTATION?
8
New
Object
FULL
HEAP
POST
GC
FRAGMENTS
New
Object
11. COPYING VS. MARK’N’SWEEP
9
• Copying
• Split the Java Heap into sections
• Allocate only in the one section, until full
• Stop the world
• Trace all reachable objects in the section and move (copy) them to another section
• Reclaim the original section as “free”
• Prevents fragmentation
• Wasteful in space
• Stops the world
12. COPYING VS. MARK’N’SWEEP
10
• Mark’n’sweep
• Allocate objects in the entire heap space, until full
• Trace and mark live objects (no moving)
• When all live objects are marked, sweep all non-marked areas (build free lists)
• Allocation can now happen at the address spaces of the free lists
• Allows entire heap for allocation
• Suffers from fragmentation
• Over time free list chunks too small to fit new objects
13. PARALLEL VS. CONCURRENT
11
• Parallel
• Stop the world
• Allow all available threads to do GC work
• Allow allocation once the entire GC cycle is complete
• Concurrent
• Allow some of the available threads to do as much GC work in the background as
possible, without impacting running applications too much
• Iterative marking, track areas where running applications have made
modifications and re-mark
• Needs to start in time…
14. GENERATIONAL
12
• Generational (-Xns)
• Most objects die young
• Define a space (could be distributed) on the heap for allocation
• The rest of the heap is considered “old space” or “old generation”
• As objects “survive” garbage collection in the young space (a.k.a. nursery), promote
them to the old space
• Reduces the speed of fragmentation of the heap
• Can use different algorithm than old space: copying, parallel and copying…
15. COMPACTING
13
• Compaction
• The operation of moving objects on the heap together
• Opens up larger consecutive spaces of free memory
• Mitigates fragmentation
• Compaction area size
• Incremental
• Intelligent
• Parallel mark’n’sweep and copying implementations usually do this during their normal
stop the world phase
• Concurrent mark’n’sweep needs to handle this somehow, eventually…
18. THE GOOD
16
• Garbage means you are using Java the way it was intended – truly object oriented!!
• Without GC, the world would have looked differently
• Java helped software (and hardware) innovation
• Programming became “mainstream” (no offence…)
• Coding could be done faster
• More jobs were created
• More products and businesses popped up
• If tuned “right”…
20. THE DESERT OF TUNING
18
• Endless tuning and re-tuning
• Rant-warning!
• Ok for some application profiles
• Time window applications
• Client applications
• Specially architected applications (new-objects only)
• Applications not sensitive to latency
21. A DESERT SURVIVAL KIT
19
• Chose the right GC algorithm for your application
• Understand your application allocation rate (in production) and allocate enough heap
• Measure the right thing!!
• Test != Production
• Not average or std dev – latency is not a standard distribution!
• Check out: Gil Tene’s jHiccup tool (and his talk) – a great new approach!
23. RECOGNIZE THIS?
21
• Initially everything is fine, GCs are happening without much impact
• Over time application seems to freeze up on occasion, or starts responding slower and
slower
• Soon, the entire application hangs, affecting other servers to start firing up
• Eventually the JVM gives up and “crashes”
• GC logs show back-to-back GCs and in the end some sort of Out Of Memory, Allocation,
or Promotion Error
24. WHAT REALLY HAPPENS
22
• When allocation fails, GC is triggered
• GC is doing its job, but no memory opens up (everything is live)
• Back-to-back GCs, still no new memory => OOME..
• OOME indicates not enough heap for your allocation rate
28. REMEMBER COMPACTION?
26
• Most GC implementations do not handle compaction well
• Moving objects is costly – stop the world is easy
• Generational added
• Tuning options and heuristics added
• Only one JVM that I know of that does compaction concurrently today (Zing)
29. PREPARE FOR THE REAL VILLAIN…
27
"There are two kinds of people in the world my friend, those with a rope around their
neck and the people that have the job of doing the cutting!"
33. I CHALLENGE THEE
31
“You see in this world, there’s two kinds of people my friend…
…those with loaded guns and those who dig…you dig!”
34. JOIN THE FUTURE OF JAVA!
32
• Open JDK is a great opportunity for innovation - join the community!
• Have all GC algorithms been invented yet?
• How do we enable a better world of self-tuning, adaptive JVMs?
• Relieve the admin of the pain of the tuning!
• How about fixing the core problem?
• Implement concurrent compaction
• Be creative around allocation / dynamic allocation rates!