2. @tall_chris#Devoxx #TCoffheap
Who Am I?
• Trained as a Physicist, clearly not trained as a Computer
Scientist.
• 4 Years Doing Unnatural Things With Bytecode In Academia
• 3 Years Doing Unnatural Things With Bytecode For Money
• 4 Years Doing Unnatural Things With ByteBuffers
• 11 Years Doing Java Development
• Software Engineer working at Terracotta (Software AG)
A Bit of History
2010 Started development as a caching ‘tier’ within Ehcache.
2011 Integrated as a caching tier in front of Oracle BDB in the
Terracotta Server.
2013 Legal complications push it into service as the primary
storage for the Terracotta Server.
2015 Open Sourced (https://github.com/Terracotta-OSS/offheap-store).
Problem Statement
• “a lot of caching” leads to
• a lot of heap, which leads to
• a lot of work for the garbage collector, which leads to
• a lot of GC pausing/overhead
• The situation is markedly better now than when the bulk of this
library was written. (Please don’t tell my employer I said that)
Map/Cache Best Practices
• Immutable Keys
• please do this!
• Immutable Values
• please do this!
• So with immutability everywhere, who cares about object
identity?
• If I don’t need object identity, do I need a heap?
• If I don’t need a heap, do I need a garbage collector?
Solution
• Replace heavy (large) map/cache usage with an ‘outside the
heap’ but ‘inside the process’ implementation.
• Benefits at two scales:
• At moderate scale, the GC offload reduces overheads.
• At large scale, we can still function: -Xmx6T
• Caveats
• Marshalling/unmarshalling costs time (and CPU)
• Trading away average latency to control the tail.
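A minimal sketch of the "outside the heap, inside the process" idea, using only the JDK's direct ByteBuffer rather than the offheap-store API. The class and method names (OffHeapSketch, store, load) are hypothetical, and the length-prefixed layout is an assumption made for illustration. The explicit encode and decode steps are where the marshalling cost listed under Caveats comes from.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: values live in native memory, the heap only holds small ints.
final class OffHeapSketch {

    // Marshal the value into the direct buffer and return the offset it was written at.
    static int store(ByteBuffer storage, String value) {
        byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
        int address = storage.position();
        storage.putInt(encoded.length).put(encoded); // length prefix, then payload
        return address;
    }

    // Unmarshal the value found at the given offset back into a heap object.
    static String load(ByteBuffer storage, int address) {
        int length = storage.getInt(address);        // absolute read, position untouched
        byte[] encoded = new byte[length];
        ByteBuffer view = storage.duplicate();       // shared content, independent position
        view.position(address + Integer.BYTES);
        view.get(encoded);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer storage = ByteBuffer.allocateDirect(1024); // not part of the Java heap
        int first = store(storage, "hello");
        int second = store(storage, "world");
        System.out.println(load(storage, first) + " " + load(storage, second));
    }
}
```

The garbage collector never scans the stored payloads; it only sees the buffer object and the int offsets.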
Options with 64 bits available
• 64 bit combined pointer
• 32 bit key pointer & 32 bit value pointer
• int key directly + 32 bit pointer
• long key directly + 32 bit pointer
• …anything else you like
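The "32 bit key pointer & 32 bit value pointer" option can be sketched as plain bit packing into a long. PackedSlot and its method names are hypothetical; which half holds which pointer is an assumption, not something the slides specify.

```java
// Hypothetical helper: one 64-bit table slot holding two 32-bit offheap pointers.
final class PackedSlot {

    // Key pointer in the high 32 bits, value pointer in the low 32 bits.
    static long pack(int keyPointer, int valuePointer) {
        // Mask the low half so a negative value pointer does not sign-extend into the key.
        return ((long) keyPointer << 32) | (valuePointer & 0xFFFFFFFFL);
    }

    static int keyPointer(long slot) {
        return (int) (slot >>> 32);
    }

    static int valuePointer(long slot) {
        return (int) slot;
    }

    public static void main(String[] args) {
        long slot = pack(1, 2);
        System.out.println(Long.toHexString(slot));
        System.out.println(keyPointer(slot) + " " + valuePointer(slot));
    }
}
```

The "int key directly + 32 bit pointer" option uses the same layout, with the key itself stored in the high half instead of a pointer to it.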
A Native Heap Allocator
• malloc/free performed using a Java port of dlmalloc
• http://g.oswego.edu/dl/html/malloc.html
• Works well for our use cases as we do not generally control
or even know the malloc size distribution.
“Java Serialization Sucks”
• Serialization is self describing.
• It supports
• object identity
• cycles
• complex versioning
• Pretty heavyweight, especially for short streams…
• …but it’s the default serialization mechanism available in
Ehcache 2.x
“Java Serialization Sucks”
• serialize(new Integer(42))
• results in these 81 bytes:
   0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0  AC ED 00 05 73 72 00 11 6A 61 76 61 2E 6C 61 6E
1  67 2E 49 6E 74 65 67 65 72 12 E2 A0 A4 F7 81 87
2  38 02 00 01 49 00 05 76 61 6C 75 65 78 72 00 10
3  6A 61 76 61 2E 6C 61 6E 67 2E 4E 75 6D 62 65 72
4  86 AC 95 1D 0B 94 E0 8B 02 00 00 78 70 00 00 00
5  2A
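The 81-byte figure can be reproduced with the stock JDK streams. In this sketch the class name DefaultSerializationSize and its serialize helper are hypothetical, and Integer.valueOf(42) stands in for the deprecated new Integer(42).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper that measures what default Java serialization writes.
final class DefaultSerializationSize {

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] stream = serialize(Integer.valueOf(42));
        // Most of the stream is the self-describing class metadata
        // for java.lang.Integer and java.lang.Number, not the 4-byte value.
        System.out.println(stream.length + " bytes");
    }
}
```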
OffHeap’s Serialization Sucks Less?
• serialize(new Integer(42))
• results in 22 bytes
   0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0  AC ED 00 05 73 72 00 00 00 00 78 72 00 00 00 01
1  78 70 00 00 00 2A
With some structure
STREAM_MAGIC (AC ED) STREAM_VERSION (00 05)
TC_OBJECT (73)
TC_CLASSDESC (72) descriptor(0) (00 00 00 00)
TC_ENDBLOCKDATA (78)
TC_CLASSDESC (72) descriptor(1) (00 00 00 01)
TC_ENDBLOCKDATA (78)
TC_NULL (70)
0000002A (the int value 42)
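The structure above can be checked by walking the 22 bytes with the constants from java.io.ObjectStreamConstants (where the end-of-block marker is spelled TC_ENDBLOCKDATA). This is a sketch for this one stream shape only; CompactStreamWalk, SLIDE_BYTES and decode are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical walker for the compact 22-byte stream shown on the slide.
final class CompactStreamWalk {

    static final byte[] SLIDE_BYTES = {
        (byte) 0xAC, (byte) 0xED, 0x00, 0x05, 0x73, 0x72, 0x00, 0x00,
        0x00, 0x00, 0x78, 0x72, 0x00, 0x00, 0x00, 0x01,
        0x78, 0x70, 0x00, 0x00, 0x00, 0x2A
    };

    // Returns { first descriptor index, second descriptor index, int payload }.
    static int[] decode(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            expect(in.readShort() == ObjectStreamConstants.STREAM_MAGIC, "STREAM_MAGIC");
            expect(in.readShort() == ObjectStreamConstants.STREAM_VERSION, "STREAM_VERSION");
            expect(in.readByte() == ObjectStreamConstants.TC_OBJECT, "TC_OBJECT");
            expect(in.readByte() == ObjectStreamConstants.TC_CLASSDESC, "TC_CLASSDESC");
            int first = in.readInt();   // index replacing the java.lang.Integer descriptor
            expect(in.readByte() == ObjectStreamConstants.TC_ENDBLOCKDATA, "TC_ENDBLOCKDATA");
            expect(in.readByte() == ObjectStreamConstants.TC_CLASSDESC, "TC_CLASSDESC");
            int second = in.readInt();  // index replacing the java.lang.Number descriptor
            expect(in.readByte() == ObjectStreamConstants.TC_ENDBLOCKDATA, "TC_ENDBLOCKDATA");
            expect(in.readByte() == ObjectStreamConstants.TC_NULL, "TC_NULL");
            int value = in.readInt();   // the field data itself
            return new int[] { first, second, value };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void expect(boolean matches, String token) {
        if (!matches) {
            throw new IllegalArgumentException("expected " + token);
        }
    }

    public static void main(String[] args) {
        int[] decoded = decode(SLIDE_BYTES);
        System.out.println(decoded[0] + " " + decoded[1] + " " + decoded[2]);
    }
}
```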
Where did the 59 bytes go?
• How many types are in my map?
• All keys the same type: really common
• All values the same type: fairly common
• Stick those common ObjectStreamClass instances in a look-aside
structure
• Map<Integer, ObjectStreamClass> for reading streams
• Map<SerializableDataKey, Integer> for writing streams
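One way to sketch the look-aside idea with JDK hooks alone is to override ObjectOutputStream.writeClassDescriptor and ObjectInputStream.readClassDescriptor so that an int index is written in place of the full descriptor. This is an assumption about the shape of the technique, not the library's actual code: LookAsideSerializer is a hypothetical name, the class name stands in for the slide's SerializableDataKey, and the sketch is not thread-safe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical serializer that keeps class descriptors out of the stream.
final class LookAsideSerializer {

    private final Map<Integer, ObjectStreamClass> readLookup = new HashMap<>();
    private final Map<String, Integer> writeLookup = new HashMap<>(); // keyed by class name (assumption)

    byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes) {
            @Override
            protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
                Integer index = writeLookup.get(desc.getName());
                if (index == null) {
                    index = writeLookup.size();        // next free index
                    writeLookup.put(desc.getName(), index);
                    readLookup.put(index, desc);
                }
                writeInt(index);                       // 4 bytes instead of the full descriptor
            }
        }) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    Object deserialize(byte[] stream) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream)) {
            @Override
            protected ObjectStreamClass readClassDescriptor() throws IOException {
                int index = readInt();
                ObjectStreamClass desc = readLookup.get(index);
                if (desc == null) {
                    throw new InvalidClassException("unknown descriptor index " + index);
                }
                return desc;
            }
        }) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LookAsideSerializer serializer = new LookAsideSerializer();
        byte[] stream = serializer.serialize(Integer.valueOf(42));
        System.out.println(stream.length + " bytes, value " + serializer.deserialize(stream));
    }
}
```

The trade-off is that the stream is no longer self-describing: it can only be read by something that holds the same look-aside table.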
j.u.c.ConcurrentMap
• What does a concurrent map provide?
• happens-before relationship: “actions in a thread prior to placing an
object into a ConcurrentMap as a key or value happen-before actions
subsequent to the access or removal of that object from the
ConcurrentMap in another thread”
• atomic operations: “…except that the action is performed atomically.”
• What do we want?
• concurrent access (readers and writers)
Happens-Before Relationships
• volatile write/read
• but not on offheap memory locations
• synchronized
• needs a heap object
• other j.u.c classes (Lock, Atomic…)
• needs a heap object
• There is no way within the JDK to enforce a happens-before
relationship between writes/reads of an offheap location…
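Since the ordering guarantee has to come from a heap object, one JDK-only approach is to route every offheap access through a heap lock. A minimal sketch, assuming a ReentrantReadWriteLock guards a single offheap long; LockedOffHeapCell is a hypothetical name and this is not necessarily how offheap-store arranges its locking.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cell: the data is offheap, the happens-before edge comes from a heap lock.
final class LockedOffHeapCell {

    private final ByteBuffer storage = ByteBuffer.allocateDirect(Long.BYTES);
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void write(long value) {
        lock.writeLock().lock();
        try {
            storage.putLong(0, value);   // plain offheap write
        } finally {
            lock.writeLock().unlock();   // unlock publishes the write to later lockers
        }
    }

    long read() {
        lock.readLock().lock();          // lock acquisition orders this read after prior unlocks
        try {
            return storage.getLong(0);   // plain offheap read
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockedOffHeapCell cell = new LockedOffHeapCell();
        Thread writer = new Thread(() -> cell.write(42L));
        writer.start();
        writer.join();
        System.out.println(cell.read());
    }
}
```

The absolute getLong and putLong calls leave the buffer's position untouched, so concurrent readers holding the read lock do not interfere with each other.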
No Unsafe please, we’re a library
• Our testing has never shown our offheap implementation to
be a bottleneck in our usages.
• Unnecessary complexity costs $$$
• support
• maintenance
• bugs
A ‘Concurrent’ Map
✅ happens-before relationship: “actions in a thread prior to
placing an object into a ConcurrentMap as a key or value
happen-before actions subsequent to the access or removal of
that object from the ConcurrentMap in another thread”
✅ atomic operations: “…except that the action is performed
atomically.”
⚠ concurrent access (readers and writers)
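A sketch of how a lock-based map can earn the two ticks and the warning sign: each stripe has its own read/write lock, so operations are atomic and ordered, but writers that hash to the same stripe still queue behind each other. Everything here is an assumption for illustration: StripedMapSketch is a hypothetical name, the striping scheme is not taken from the slides, and plain HashMap instances stand in for offheap segments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical striped map: one lock per segment, HashMap as a stand-in for offheap storage.
final class StripedMapSketch<K, V> {

    private final List<Map<K, V>> segments = new ArrayList<>();
    private final List<ReentrantReadWriteLock> locks = new ArrayList<>();

    StripedMapSketch(int stripes) {
        for (int i = 0; i < stripes; i++) {
            segments.add(new HashMap<>());
            locks.add(new ReentrantReadWriteLock());
        }
    }

    private int stripe(Object key) {
        return Math.floorMod(key.hashCode(), segments.size());
    }

    V get(K key) {
        int i = stripe(key);
        locks.get(i).readLock().lock();      // readers of one stripe can proceed together
        try {
            return segments.get(i).get(key);
        } finally {
            locks.get(i).readLock().unlock();
        }
    }

    V putIfAbsent(K key, V value) {
        int i = stripe(key);
        locks.get(i).writeLock().lock();     // check-then-act is atomic under the write lock
        try {
            return segments.get(i).putIfAbsent(key, value);
        } finally {
            locks.get(i).writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        StripedMapSketch<String, Integer> map = new StripedMapSketch<>(4);
        System.out.println(map.putIfAbsent("a", 1));
        System.out.println(map.putIfAbsent("a", 2));
        System.out.println(map.get("a"));
    }
}
```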