SlideShare una empresa de Scribd logo
1 de 20
Cassandra Essentials
Tutorial Series


   An Overview of
Apache Cassandra
Agenda

 What   is Cassandra?
 History
 Architecture
 Key Features and Benefits
 Who’s using Cassandra?
 Where to get Cassandra
Definition of Cassandra
Apache Cassandra™ is a free
Distributed…
High performance…
Extremely scalable…
Fault tolerant (i.e. no single point of failure)…
post-relational database solution. Cassandra can serve
as both real-time datastore (the “system of record”) for
online/transactional applications, and as a read-
intensive database for business intelligence systems.
The History of Cassandra
     Bigtable              Dynamo
Architecture Overview
   Cassandra was designed with the understanding that
    system/hardware failures can and do occur
   Peer-to-peer, distributed system
   All nodes the same
   Data partitioned among all nodes in the cluster
   Custom data replication to ensure fault tolerance
   Read/Write-anywhere design
Architecture Overview
   Each node communicates with each other through the
    Gossip protocol, which exchanges information across the
    cluster every second
   A commit log is used on each node to capture write
    activity. Data durability is assured
   Data also written to an in-memory structure (memtable)
    and then to disk once the memory structure is full (an
    SStable)
Architecture Overview
   The schema used in Cassandra is mirrored after Google
    Bigtable. It is a row-oriented, column structure
   A keyspace is akin to a database in the RDBMS world
   A column family is similar to an RDBMS table but is more
    flexible/dynamic
   A row in a column family is indexed by its key. Other
    columns may be indexed as well



                              Portfolio Keyspace
                                  Customer Column Family

                                   ID     Name     SSN   DOB
Why Cassandra?
 Gigabyte   to Petabyte scalability
 Linear performance gains through adding nodes
 No single point of failure
 Easy replication / data distribution
 Multi-data center and Cloud capable
 No need for separate caching layer
 Tunable data consistency
 Flexible schema design
 Data Compression
 CQL language (like SQL)
 Support for key languages and platforms
 No need for special hardware or software
Big Data Scalability
 Capable of comfortably scaling to petabytes
 New nodes = Linear performance increases
 Add new nodes online



       1                                 1




                                                 2
                                     4

                 Double Throughput
                    Capabilities

       2                                     3
No Single Point of Failure
 Allnodes the same
 Customized replication affords tunable data
  redundancy
 Read/write from any node
 Can replicate data among different physical
  data center racks
Easy Replication / Data Distribution
 Transparently   handled by Cassandra
 Multi-data center capable
 Exploits all the benefits of Cloud computing
 Able to do hybrid Cloud/On-premise setup
No Need for Caching Software
 Peer-to-peer   architecture removes need for
  special caching layer and the programming that
  goes with it
 The database cluster uses the memory from all
  participating nodes to cache the data assigned
  to each node
 No irregularities between a memory cache and
  database are encountered
        Application Servers
Reads




                              Writes




        Memcached Servers




          Database Server
Tunable Data Consistency
 Choose  between strong and eventual
  consistency (All to any node responding)
  depending on the need
 Can be done on a per-operation basis, and for
  both reads and writes
 Handles Multi-data center operations

                                                     1



                                                 6           2




           Writes             Reads
              Any               One            5           3
              One               Quorum
              Quorum            Local_Quorum
               Local_Quorum       Each_Quorum
                                                         4
                             
              Each_Quorum       All
              All
Flexible Schema
 Dynamic   schema design allows for much more
  flexible data storage than rigid RDBMS
 Handles structured, semi-structured, and
  unstructured data. Counters also supported
 No offline/downtime for schema changes
 Supports primary and secondary indexes


                     Portfolio Keyspace
                         Customer Column Family

                          ID    Name      SSN   DOB
Data Compression
 Uses Google’s Snappy data compression
  algorithm
 Compresses data on a per column family level
 Internal tests at DataStax show up to 80%+
  compression of raw data
 No performance penalty (and some increases in
  overall performance due to less physical I/O)!
CQL Language
 Verysimilar to RDBMS SQL syntax
 Create objects via DDL (e.g. CREATE…)
 Core DML commands supported: INSERT, UPDATE,
  DELETE
 Query data with SELECT


                                      1



                                  6           2




           SELECT *
           FROM   USERS           5
           WHERE  STATE = ‘TX’;               3



                                          4
Who’s Using Cassandra?
http://www.datastax.com/cassandrausers#all
Where to get Cassandra?
 Go to www.datastax.com
 DataStax makes free smart start installers
  available for Cassandra that include:
      The most up-to-date Cassandra version that is
       production quality
      A version of DataStax OpsCenter, which is a visual,
       browser-based management tool for managing
       and monitoring Cassandra
      Drivers and connectors for popular development
       languages
      Same database and application
      Automatic configuration assistance for ensuring
       optimal performance and setup for either stand-
       alone or cluster implementations
      Getting Started Guide
Where Can I Learn More?




          www.datastax.com

            Free Online Documentation
            Technical White Papers
            Technical Articles
            Tutorials
            User Forums
            User/Customer Case Studies
            FAQ’s
            Videos
            Blogs
            Software downloads
Cassandra Essentials
Tutorial Series

An Overview of
Apache Cassandra

Thanks…!

Más contenido relacionado

La actualidad más candente

Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
Sean Murphy
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 

La actualidad más candente (20)

Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Cassandra
CassandraCassandra
Cassandra
 
Cql – cassandra query language
Cql – cassandra query languageCql – cassandra query language
Cql – cassandra query language
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
MongoDB
MongoDBMongoDB
MongoDB
 
Data Stores @ Netflix
Data Stores @ NetflixData Stores @ Netflix
Data Stores @ Netflix
 
Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Introduction to Apache Cassandra
Introduction to Apache Cassandra Introduction to Apache Cassandra
Introduction to Apache Cassandra
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
A Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache CassandraA Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache Cassandra
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 

Destacado

Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
DataStax
 
Cassandra Tutorial
Cassandra TutorialCassandra Tutorial
Cassandra Tutorial
mubarakss
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
Eric Evans
 

Destacado (11)

Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012
 
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
Replication and Consistency in Cassandra... What Does it All Mean? (Christoph...
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global Cassandra
 
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseSolr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Cassandra Tutorial
Cassandra TutorialCassandra Tutorial
Cassandra Tutorial
 
Apache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide DeckApache Cassandra Developer Training Slide Deck
Apache Cassandra Developer Training Slide Deck
 
NoSQL Essentials: Cassandra
NoSQL Essentials: CassandraNoSQL Essentials: Cassandra
NoSQL Essentials: Cassandra
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 

Similar a An Overview of Apache Cassandra

Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011
sandeep_tata
 

Similar a An Overview of Apache Cassandra (20)

Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
cassandra
cassandracassandra
cassandra
 
Understanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache CassandraUnderstanding Data Consistency in Apache Cassandra
Understanding Data Consistency in Apache Cassandra
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
 
Scaling SQL and NoSQL Databases in the Cloud
Scaling SQL and NoSQL Databases in the Cloud Scaling SQL and NoSQL Databases in the Cloud
Scaling SQL and NoSQL Databases in the Cloud
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
 
Data Storage Management
Data Storage ManagementData Storage Management
Data Storage Management
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011
 
Managing Objects and Data in Apache Cassandra
Managing Objects and Data in Apache CassandraManaging Objects and Data in Apache Cassandra
Managing Objects and Data in Apache Cassandra
 
Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS
 
Scaling Your Database In The Cloud
Scaling Your Database In The CloudScaling Your Database In The Cloud
Scaling Your Database In The Cloud
 
Apache Cassandra introduction
Apache Cassandra introductionApache Cassandra introduction
Apache Cassandra introduction
 
Couchbase - Yet Another Introduction
Couchbase - Yet Another IntroductionCouchbase - Yet Another Introduction
Couchbase - Yet Another Introduction
 

Más de DataStax

Más de DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

An Overview of Apache Cassandra

  • 1. Cassandra Essentials Tutorial Series An Overview of Apache Cassandra
  • 2. Agenda  What is Cassandra?  History  Architecture  Key Features and Benefits  Who’s using Cassandra?  Where to get Cassandra
  • 3. Definition of Cassandra Apache Cassandra™ is a free Distributed… High performance… Extremely scalable… Fault tolerant (i.e. no single point of failure)… post-relational database solution. Cassandra can serve as both real-time datastore (the “system of record”) for online/transactional applications, and as a read- intensive database for business intelligence systems.
  • 4. The History of Cassandra Bigtable Dynamo
  • 5. Architecture Overview  Cassandra was designed with the understanding that system/hardware failures can and do occur  Peer-to-peer, distributed system  All nodes the same  Data partitioned among all nodes in the cluster  Custom data replication to ensure fault tolerance  Read/Write-anywhere design
  • 6. Architecture Overview  Each node communicates with each other through the Gossip protocol, which exchanges information across the cluster every second  A commit log is used on each node to capture write activity. Data durability is assured  Data also written to an in-memory structure (memtable) and then to disk once the memory structure is full (an SStable)
  • 7. Architecture Overview  The schema used in Cassandra is mirrored after Google Bigtable. It is a row-oriented, column structure  A keyspace is akin to a database in the RDBMS world  A column family is similar to an RDBMS table but is more flexible/dynamic  A row in a column family is indexed by its key. Other columns may be indexed as well Portfolio Keyspace Customer Column Family ID Name SSN DOB
  • 8. Why Cassandra?  Gigabyte to Petabyte scalability  Linear performance gains through adding nodes  No single point of failure  Easy replication / data distribution  Multi-data center and Cloud capable  No need for separate caching layer  Tunable data consistency  Flexible schema design  Data Compression  CQL language (like SQL)  Support for key languages and platforms  No need for special hardware or software
  • 9. Big Data Scalability  Capable of comfortably scaling to petabytes  New nodes = Linear performance increases  Add new nodes online 1 1 2 4 Double Throughput Capabilities 2 3
  • 10. No Single Point of Failure  Allnodes the same  Customized replication affords tunable data redundancy  Read/write from any node  Can replicate data among different physical data center racks
  • 11. Easy Replication / Data Distribution  Transparently handled by Cassandra  Multi-data center capable  Exploits all the benefits of Cloud computing  Able to do hybrid Cloud/On-premise setup
  • 12. No Need for Caching Software  Peer-to-peer architecture removes need for special caching layer and the programming that goes with it  The database cluster uses the memory from all participating nodes to cache the data assigned to each node  No irregularities between a memory cache and database are encountered Application Servers Reads Writes Memcached Servers Database Server
  • 13. Tunable Data Consistency  Choose between strong and eventual consistency (All to any node responding) depending on the need  Can be done on a per-operation basis, and for both reads and writes  Handles Multi-data center operations 1 6 2 Writes Reads  Any  One 5 3  One  Quorum  Quorum  Local_Quorum Local_Quorum Each_Quorum 4    Each_Quorum  All  All
  • 14. Flexible Schema  Dynamic schema design allows for much more flexible data storage than rigid RDBMS  Handles structured, semi-structured, and unstructured data. Counters also supported  No offline/downtime for schema changes  Supports primary and secondary indexes Portfolio Keyspace Customer Column Family ID Name SSN DOB
  • 15. Data Compression  Uses Google’s Snappy data compression algorithm  Compresses data on a per column family level  Internal tests at DataStax show up to 80%+ compression of raw data  No performance penalty (and some increases in overall performance due to less physical I/O)!
  • 16. CQL Language  Verysimilar to RDBMS SQL syntax  Create objects via DDL (e.g. CREATE…)  Core DML commands supported: INSERT, UPDATE, DELETE  Query data with SELECT 1 6 2 SELECT * FROM USERS 5 WHERE STATE = ‘TX’; 3 4
  • 18. Where to get Cassandra?  Go to www.datastax.com  DataStax makes free smart start installers available for Cassandra that include:  The most up-to-date Cassandra version that is production quality  A version of DataStax OpsCenter, which is a visual, browser-based management tool for managing and monitoring Cassandra  Drivers and connectors for popular development languages  Same database and application  Automatic configuration assistance for ensuring optimal performance and setup for either stand- alone or cluster implementations  Getting Started Guide
  • 19. Where Can I Learn More? www.datastax.com  Free Online Documentation  Technical White Papers  Technical Articles  Tutorials  User Forums  User/Customer Case Studies  FAQ’s  Videos  Blogs  Software downloads
  • 20. Cassandra Essentials Tutorial Series An Overview of Apache Cassandra Thanks…!