SlideShare una empresa de Scribd logo
1 de 29
Descargar para leer sin conexión
Dealing with JVM limitations
in Apache Cassandra
Jonathan Ellis / @spyced
Pain points for Java databases

✤   GC
✤   GC
✤   GC
Pain points for Java databases

✤   GC
✤   Platform specific code
GC

✤   Concurrent and compacting: choose one
    ✤   G1
    ✤   Azul C4 / Zing?
Fragmentation

✤   Bloom filter arrays
✤   Compression offsets
Automatic mitigation?

✤   http://www.research.ibm.com/people/d/dfb/papers/Bacon03Controlling.pdf
✤   http://researcher.ibm.com/files/us-hirzel/pldi10-arraylets.pdf
Fragmentation, 2

✤   Arena allocation for memtables
(Memtables?)

    write( k1 , c1:v1 )

                                       Memory




                          Memtable




   Commit log



                                     Hard drive
write( k1 , c1:v )

                                             Memory
                      k1 c1:v




                                Memtable



     k1 c1:v




Commit log



                                           Hard drive
write( k1 , c2:v )

                                      Memory
                     k1 c1:v c2:v




    k1 c1:v
    k1 c2:v




                                    Hard drive
write( k2 ,     c1:v c2:v   )
                                                   Memory
                                k1 c1:v c2:v

                                k2   c1:v c2:v




   k1 c1:v
   k1 c2:v
 k2 c1:v c2:v




                                                 Hard drive
write( k1 ,     c1:v c3:v   )
                                                      Memory
                                k1 c1:v c2:v c3:v

                                k2   c1:v c2:v




   k1 c1:v
   k1 c2:v
 k2 c1:v c2:v
 k1 c1:v c3:v




                                                    Hard drive
Memory




          flush




                 index
cleanup    k1 c1:v c2:v c3:v

           k2   c1:v c2:v


                               SSTable




                                         Hard drive
“Java is a memory hog”

✤   Large overhead for typical objects and collections
✤   How large?
✤   java.lang.instrument.Instrumentation

    ✤   JAMM: Java Agent for Memory Measurements
    ✤   https://github.com/jbellis/jamm
org.apache.cassandra.cache.SerializingCache


✤   Live objects are about 85% JVM bookeeping
✤   org.apache.cassandra.cache.FreeableMemory   using reference
    counting
✤   Considering doing reference-counted, off-heap memtables
    as well
Don’t forget about young gen

✤   Always stop-the-world for ~100ms
Platform-specific code

✤   OS
✤   JVM
m[un]map

✤   Log-structured storage wants to remove old files post-
    compaction; some platforms disallow deleting open files
✤   Old workaround (pre-1.0):
    ✤   use PhantomReference to tell when mmap’d file is GC (hence
        unmapped)
    ✤   Poor user experience and messy corner cases
✤   New workaround:
    ✤   Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner")
mmap part 2

✤   2GB limit via ByteBuffer:
    public abstract byte get(int index)

✤   Workaround: MmappedSegmentedFile
    public Iterator<DataInput> iterator(long position)
link

✤   Used for snapshots
✤   Old workaround: JNA
✤   New workaround: supported directly by Java7
mlockall

✤   swappiness: pissing off database developers since 2001 (?)
✤   mlockall(MCL_CURRENT)
Low-level i/o

✤   posix_fadvise
✤   mincore/fincore
✤   fctl


✤   ... JNA
A plug for JNA

✤   https://github.com/twall/jna

     static {
         try {
              Native.register("c");
       ...

     private static native int mlockall(int flags)
     throws LastErrorException;
The fallacy of choosing portability over power

✤   Applets have been dead for years
✤   Python gets it right
    ✤   import readline
The fallacy of choosing safety over power

✤   Allowing munmap would expose developers to segfaults
✤   But, relying on the GC to clean up external resources is a
    well-known antipattern
    ✤   File.close
✤   We need munmap badly enough that we resort to
    unnatural and unportable code to get it
    ✤   You haven’t kept us from risking segfaults, you’ve just made us
        miserable
Compatibility through obscurity?

✤   sun.misc.Unsafe
✤   Used by high-profile libraries like high-scale-lib
... even public options




     http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
Too negative?
Still true

✤   "Many concurrent algorithms are very easy to write with a
    GC and totally hard (to down right impossible) using
    explicit free." -- Cliff Click

Más contenido relacionado

La actualidad más candente

Disk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMDisk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMnknytk
 
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixXPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixThe Linux Foundation
 
Ceph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph Community
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practiceEugene Fidelin
 
Intel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeIntel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeCeph Community
 
Improvements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseImprovements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseDeepak Shetty
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinEd Balduf
 
Nexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta Systems
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Danielle Womboldt
 
Redis cluster
Redis clusterRedis cluster
Redis clusteriammutex
 
Automatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiAutomatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiCeph Community
 
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauDoing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauCeph Community
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replicationEd Balduf
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRaz Tamir
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph Community
 
2021.06. Ceph Project Update
2021.06. Ceph Project Update2021.06. Ceph Project Update
2021.06. Ceph Project UpdateCeph Community
 
OSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitOSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitDon Marti
 

La actualidad más candente (20)

Disk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVMDisk Performance Comparison Xen v.s. KVM
Disk Performance Comparison Xen v.s. KVM
 
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, CitrixXPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
XPDS14 - Scaling Xen's Aggregate Storage Performance - Felipe Franciosi, Citrix
 
Ceph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance BarriersCeph on All Flash Storage -- Breaking Performance Barriers
Ceph on All Flash Storage -- Breaking Performance Barriers
 
ceph-barcelona-v-1.2
ceph-barcelona-v-1.2ceph-barcelona-v-1.2
ceph-barcelona-v-1.2
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practice
 
Intel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMeIntel QLC: Cost-effective Ceph on NVMe
Intel QLC: Cost-effective Ceph on NVMe
 
Improvements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecaseImprovements in GlusterFS for Virtualization usecase
Improvements in GlusterFS for Virtualization usecase
 
Cinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit AustinCinder Live Migration and Replication - OpenStack Summit Austin
Cinder Live Migration and Replication - OpenStack Summit Austin
 
Nexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on LabNexenta at VMworld Hands-on Lab
Nexenta at VMworld Hands-on Lab
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Automatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You JiAutomatic Operation Bot for Ceph - You Ji
Automatic Operation Bot for Ceph - You Ji
 
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex LauDoing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
Doing QoS Before Ceph Cluster QoS is available - David Byte, Alex Lau
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
Cinder - status of replication
Cinder - status of replicationCinder - status of replication
Cinder - status of replication
 
RHEVM - Live Storage Migration
RHEVM - Live Storage MigrationRHEVM - Live Storage Migration
RHEVM - Live Storage Migration
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-Gene
 
2021.06. Ceph Project Update
2021.06. Ceph Project Update2021.06. Ceph Project Update
2021.06. Ceph Project Update
 
OSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration SummitOSv presentation from Linux Foundation Collaboration Summit
OSv presentation from Linux Foundation Collaboration Summit
 

Similar a Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)

Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012pcmanus
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conferencepcmanus
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State DrivesDataStax Academy
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanJimin Hsieh
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingHostedbyConfluent
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingYaroslav Tkachenko
 
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Jean-Philippe BEMPEL
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best PracticesJeff Larkin
 
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoNext Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoJessica Tams
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Yoshiyasu SAEKI
 
Introduction to Khronos SYCL
Introduction to Khronos SYCLIntroduction to Khronos SYCL
Introduction to Khronos SYCLMin-Yih Hsu
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationKhai Le
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionjbellis
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDBLuca Garulli
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...confluent
 
Colin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfColin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfxiso
 
Haskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCHaskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCdterei
 

Similar a Dealing with JVM limitations in Apache Cassandra (Fosdem 2012) (20)

Fosdem 2012
Fosdem 2012Fosdem 2012
Fosdem 2012
 
33rd degree conference
33rd degree conference33rd degree conference
33rd degree conference
 
Cassandra and Solid State Drives
Cassandra and Solid State DrivesCassandra and Solid State Drives
Cassandra and Solid State Drives
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
 
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent HashingDynamic Change Data Capture with Flink CDC and Consistent Hashing
Dynamic Change Data Capture with Flink CDC and Consistent Hashing
 
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu...
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
 
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii VasylenkoNext Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
Next Level Mobile Graphics | Munseong Kang, Oleksii Vasylenko
 
Lev
LevLev
Lev
 
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
Apache Sparkにおけるメモリ - アプリケーションを落とさないメモリ設計手法 -
 
Introduction to Khronos SYCL
Introduction to Khronos SYCLIntroduction to Khronos SYCL
Introduction to Khronos SYCL
 
Nytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_AccelerationNytro-XV_NWD_VM_Performance_Acceleration
Nytro-XV_NWD_VM_Performance_Acceleration
 
Jvm Language Summit Rose 20081016
Jvm Language Summit Rose 20081016Jvm Language Summit Rose 20081016
Jvm Language Summit Rose 20081016
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solution
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDB
 
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
Cross the streams thanks to Kafka and Flink (Christophe Philemotte, Digazu) K...
 
Colin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdfColin-Ian-King-Mentorship-Stress-ng.pdf
Colin-Ian-King-Mentorship-Stress-ng.pdf
 
Haskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCHaskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHC
 
Starburn
StarburnStarburn
Starburn
 

Más de jbellis

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databasesjbellis
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloudjbellis
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015jbellis
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014jbellis
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014jbellis
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0jbellis
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandrajbellis
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1jbellis
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Javajbellis
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprisejbellis
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011jbellis
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)jbellis
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from javajbellis
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011jbellis
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandrajbellis
 

Más de jbellis (20)

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databases
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandra
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Java
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprise
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from java
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandra
 

Último

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 

Último (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)

  • 1. Dealing with JVM limitations in Apache Cassandra Jonathan Ellis / @spyced
  • 2. Pain points for Java databases ✤ GC ✤ GC ✤ GC
  • 3. Pain points for Java databases ✤ GC ✤ Platform specific code
  • 4. GC ✤ Concurrent and compacting: choose one ✤ G1 ✤ Azul C4 / Zing?
  • 5. Fragmentation ✤ Bloom filter arrays ✤ Compression offsets
  • 6. Automatic mitigation? ✤ http://www.research.ibm.com/people/d/dfb/papers/Bacon03Controlling.pdf ✤ http://researcher.ibm.com/files/us-hirzel/pldi10-arraylets.pdf
  • 7. Fragmentation, 2 ✤ Arena allocation for memtables
  • 8. (Memtables?) write( k1 , c1:v1 ) Memory Memtable Commit log Hard drive
  • 9. write( k1 , c1:v ) Memory k1 c1:v Memtable k1 c1:v Commit log Hard drive
  • 10. write( k1 , c2:v ) Memory k1 c1:v c2:v k1 c1:v k1 c2:v Hard drive
  • 11. write( k2 , c1:v c2:v ) Memory k1 c1:v c2:v k2 c1:v c2:v k1 c1:v k1 c2:v k2 c1:v c2:v Hard drive
  • 12. write( k1 , c1:v c3:v ) Memory k1 c1:v c2:v c3:v k2 c1:v c2:v k1 c1:v k1 c2:v k2 c1:v c2:v k1 c1:v c3:v Hard drive
  • 13. Memory flush index cleanup k1 c1:v c2:v c3:v k2 c1:v c2:v SSTable Hard drive
  • 14. “Java is a memory hog” ✤ Large overhead for typical objects and collections ✤ How large? ✤ java.lang.instrument.Instrumentation ✤ JAMM: Java Agent for Memory Measurements ✤ https://github.com/jbellis/jamm
  • 15. org.apache.cassandra.cache.SerializingCache ✤ Live objects are about 85% JVM bookeeping ✤ org.apache.cassandra.cache.FreeableMemory using reference counting ✤ Considering doing reference-counted, off-heap memtables as well
  • 16. Don’t forget about young gen ✤ Always stop-the-world for ~100ms
  • 18. m[un]map ✤ Log-structured storage wants to remove old files post- compaction; some platforms disallow deleting open files ✤ Old workaround (pre-1.0): ✤ use PhantomReference to tell when mmap’d file is GC (hence unmapped) ✤ Poor user experience and messy corner cases ✤ New workaround: ✤ Class.forName("sun.nio.ch.DirectBuffer").getMethod("cleaner")
  • 19. mmap part 2 ✤ 2GB limit via ByteBuffer: public abstract byte get(int index) ✤ Workaround: MmappedSegmentedFile public Iterator<DataInput> iterator(long position)
  • 20. link ✤ Used for snapshots ✤ Old workaround: JNA ✤ New workaround: supported directly by Java7
  • 21. mlockall ✤ swappiness: pissing off database developers since 2001 (?) ✤ mlockall(MCL_CURRENT)
  • 22. Low-level i/o ✤ posix_fadvise ✤ mincore/fincore ✤ fctl ✤ ... JNA
  • 23. A plug for JNA ✤ https://github.com/twall/jna static { try { Native.register("c"); ... private static native int mlockall(int flags) throws LastErrorException;
  • 24. The fallacy of choosing portability over power ✤ Applets have been dead for years ✤ Python gets it right ✤ import readline
  • 25. The fallacy of choosing safety over power ✤ Allowing munmap would expose developers to segfaults ✤ But, relying on the GC to clean up external resources is a well-known antipattern ✤ File.close ✤ We need munmap badly enough that we resort to unnatural and unportable code to get it ✤ You haven’t kept us from risking segfaults, you’ve just made us miserable
  • 26. Compatibility through obscurity? ✤ sun.misc.Unsafe ✤ Used by high-profile libraries like high-scale-lib
  • 27. ... even public options http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
  • 29. Still true ✤ "Many concurrent algorithms are very easy to write with a GC and totally hard (to down right impossible) using explicit free." -- Cliff Click