SlideShare una empresa de Scribd logo
1 de 20
Cassandra
How Stuff Works
Sergey Enin
(sergey_enin@epam.com)
AGENDA

Agenda

Introduction
Architecture

Partitioning & Replication
Data management
Data model
2
Introduction

3
INTRODUCTION: SELECTED CASES

Selected Cases
Who use Cassandra?
eBay has Cassandra supporting multiple
applications (Social Signals, Hunch, and many
time series use cases) with clusters spanning
several data centers.
Netflix is using Cassandra on AWS as a key
infrastructure component of its globally distributed
streaming product.
Shazam uses Cassandra cluster to power their
recommendations system.

and many others…

Check - http://www.datastax.com/cassandrausers

4
INTRODUCTION: MOST ADVANTAGES

Most advantages
Most advantages of Cassandra are:
• Fast writes.
• Tunable consistency.
• Decentralization.

• Integration with Hadoop.
5
Architecture

6
ARCHITECTURE: FAST WRITES

Fast writes
Cassandra is very fast on writes, cause of
use of Log-structured merge tree.

Process of inserting new record into Cassandra

7
ARCHITECTURE: FAST WRITE

How LSM-tree is done: Memtables and SSTables
2
1

3

1

Commit log – all data is written to the
commit log for durability.

2

SSTables are immutable. A row is typically stored across multiple
SSTable files.

3

Each SSTable has a bloom filter associated with it. The
bloom filter is used to check if a requested row key exists in
the SSTable before doing any disk seeks.

4

Deleted data is not immediately removed from disk.
A deleted column can reappear. Tombstones.
8
ARCHITECTURE: NETWORK ARCHITECTURE

Network architecture

• All nodes – are peers
(no master).
• Client specify set of Cassandra nodes and get
connected to first live node.
• Nodes are using gossip protocol.
9
Partitioning & replication

10
PARTITIONING & REPLICATION: DATA PARTITIONING

Data partitioning

Partitioner – determines, where first replica would live in the ring.

•

RandomPartitioner – default strategy, provides ±same load of all
nodes.

•

ByteOrderedPartitioner - orders rows lexically by key bytes, allows
range scans, not recommended.

11
PARTITIONING & REPLICATION: REPLICATION

Replication
Replication = replication factor
+ replica placement strategy
Replica placement strategy:
SimpleStrategy:
•
default strategy;
•
not taking
network topology
into account;

NetworkTopology
Strategy:
•
preferred, when
you have information
about network map
of your nodes;
12
Data management

13
DATA MANAGEMENT: DATA ACCESSING

Data accessing
READ + WRITES:

• Tunable consistency. Consistency level specify
how many nodes should answer for read/write
request(but writes goes to all replicas).
• Batches - sets a global consistency level and
client-supplied timestamp for all columns
written by the statements in the batch.

14
DATA MANAGEMENT: ACID

ACID
ACID

• Atomicity – writes are atomic at row level.
• Consistency – tunable consistency.
• Isolation – writes are invisible until they are
complete.
• Durability – writes are durable.
• Read-repair, anti-entropy node repair, hinted
handoff.

15
Data model

16
DATA MODEL: CASSANDRA`S DATA MODEL

Cassandra`s data model
Relational databases – you design
schema, based on entities and
relationships.
Cassandra – you design schema, based
on what queries you would like to
perform.

17
DATA MODEL: INDEXES

Indexes
An index is a data structure that allows for
fast, efficient lookup of data matching a given
condition.
Primary key – the unique key used to identify
each row in a table.
Secondary indexes – refer to indexes on
column values.

18
DATA MODEL: CQL3

CQL3
cqlsh> INSERT INTO users
(user_name, password)
VALUES ('jsmith', 'ch@ngem3a');
cqlsh> SELECT * FROM users WHERE
user_name='jsmith';
user_name | password | state
-----------+-----------+------jsmith | ch@ngem3a | null
Confidential

19
THANK YOU!

Thank you!

20

Más contenido relacionado

La actualidad más candente

Connecting kafka message systems with scylla
Connecting kafka message systems with scylla   Connecting kafka message systems with scylla
Connecting kafka message systems with scylla Maheedhar Gunturu
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudCassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudDataStax
 
DynomiteDB - No spof High-availability Redis cluster solution
DynomiteDB -  No spof High-availability Redis cluster solutionDynomiteDB -  No spof High-availability Redis cluster solution
DynomiteDB - No spof High-availability Redis cluster solutionLeandro Totino Pereira
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesDataStax Academy
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 
Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Clustrix
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProRedis Labs
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical dataOleksandr Semenov
 
Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Clustrix
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsDataStax Academy
 
Discover some "Big Data" architectural concepts with Redis
Discover some  "Big Data" architectural concepts with  Redis Discover some  "Big Data" architectural concepts with  Redis
Discover some "Big Data" architectural concepts with Redis Maturin BADO
 
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraCisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraDataStax Academy
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScyllaDB
 
vSphere With OpenStack
vSphere With OpenStackvSphere With OpenStack
vSphere With OpenStackKenneth Hui
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScyllaDB
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2ScyllaDB
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosDiscover Pinterest
 

La actualidad más candente (20)

Connecting kafka message systems with scylla
Connecting kafka message systems with scylla   Connecting kafka message systems with scylla
Connecting kafka message systems with scylla
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudCassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
 
DynomiteDB - No spof High-availability Redis cluster solution
DynomiteDB -  No spof High-availability Redis cluster solutionDynomiteDB -  No spof High-availability Redis cluster solution
DynomiteDB - No spof High-availability Redis cluster solution
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
Zabbix at scale with Elasticsearch
Zabbix at scale with ElasticsearchZabbix at scale with Elasticsearch
Zabbix at scale with Elasticsearch
 
Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
 
Discover some "Big Data" architectural concepts with Redis
Discover some  "Big Data" architectural concepts with  Redis Discover some  "Big Data" architectural concepts with  Redis
Discover some "Big Data" architectural concepts with Redis
 
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraCisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
 
vSphere With OpenStack
vSphere With OpenStackvSphere With OpenStack
vSphere With OpenStack
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and Mesos
 

Destacado

Sergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin
 
Approaching graph db
Approaching graph dbApproaching graph db
Approaching graph dbSergey Enin
 
Cassandra Read/Write Paths
Cassandra Read/Write PathsCassandra Read/Write Paths
Cassandra Read/Write Pathsjdsumsion
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBMongoDB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDBAlex Sharp
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 

Destacado (13)

122201661137PM
122201661137PM122201661137PM
122201661137PM
 
Sergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_Certificate
 
BigDataNerds
BigDataNerdsBigDataNerds
BigDataNerds
 
Approaching graph db
Approaching graph dbApproaching graph db
Approaching graph db
 
Cassandra Read/Write Paths
Cassandra Read/Write PathsCassandra Read/Write Paths
Cassandra Read/Write Paths
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
Graph database
Graph databaseGraph database
Graph database
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

Similar a Cassandra presentation

cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataChen Robert
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdfhothyfa
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseJoe Alex
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overviewPritamKathar
 
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...DataStax
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Clustrix
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDBI Goo Lee
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelRishikese MR
 
Apache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranApache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranData Con LA
 
Breakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkBreakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkEvan Chan
 

Similar a Cassandra presentation (20)

cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed Database
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
No sql
No sqlNo sql
No sql
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra Model
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Apache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranApache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda Moran
 
Breakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkBreakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and Spark
 

Último

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Último (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Cassandra presentation

  • 1. Cassandra How Stuff Works Sergey Enin (sergey_enin@epam.com)
  • 4. INTRODUCTION: SELECTED CASES Selected Cases Who use Cassandra? eBay has Cassandra supporting multiple applications (Social Signals, Hunch, and many time series use cases) with clusters spanning several data centers. Netflix is using Cassandra on AWS as a key infrastructure component of its globally distributed streaming product. Shazam uses Cassandra cluster to power their recommendations system. and many others… Check - http://www.datastax.com/cassandrausers 4
  • 5. INTRODUCTION: MOST ADVANTAGES Most advantages Most advantages of Cassandra are: • Fast writes. • Tunable consistency. • Decentralization. • Integration with Hadoop. 5
  • 7. ARCHITECTURE: FAST WRITES Fast writes Cassandra is very fast on writes, cause of use of Log-structured merge tree. Process of inserting new record into Cassandra 7
  • 8. ARCHITECTURE: FAST WRITE How LSM-tree is done: Memtables and SSTables 2 1 3 1 Commit log – all data is written to the commit log for durability. 2 SSTables are immutable. A row is typically stored across multiple SSTable files. 3 Each SSTable has a bloom filter associated with it. The bloom filter is used to check if a requested row key exists in the SSTable before doing any disk seeks. 4 Deleted data is not immediately removed from disk. A deleted column can reappear. Tombstones. 8
  • 9. ARCHITECTURE: NETWORK ARCHITECTURE Network architecture • All nodes – are peers (no master). • Client specify set of Cassandra nodes and get connected to first live node. • Nodes are using gossip protocol. 9
  • 11. PARTITIONING & REPLICATION: DATA PARTITIONING Data partitioning Partitioner – determines, where first replica would live in the ring. • RandomPartitioner – default strategy, provides ±same load of all nodes. • ByteOrderedPartitioner - orders rows lexically by key bytes, allows range scans, not recommended. 11
  • 12. PARTITIONING & REPLICATION: REPLICATION Replication Replication = replication factor + replica placement strategy Replica placement strategy: SimpleStrategy: • default strategy; • not taking network topology into account; NetworkTopology Strategy: • preferred, when you have information about network map of your nodes; 12
  • 14. DATA MANAGEMENT: DATA ACCESSING Data accessing READ + WRITES: • Tunable consistency. Consistency level specify how many nodes should answer for read/write request(but writes goes to all replicas). • Batches - sets a global consistency level and client-supplied timestamp for all columns written by the statements in the batch. 14
  • 15. DATA MANAGEMENT: ACID ACID ACID • Atomicity – writes are atomic at row level. • Consistency – tunable consistency. • Isolation – writes are invisible until they are complete. • Durability – writes are durable. • Read-repair, anti-entropy node repair, hinted handoff. 15
  • 17. DATA MODEL: CASSANDRA`S DATA MODEL Cassandra`s data model Relational databases – you design schema, based on entities and relationships. Cassandra – you design schema, based on what queries you would like to perform. 17
  • 18. DATA MODEL: INDEXES Indexes An index is a data structure that allows for fast, efficient lookup of data matching a given condition. Primary key – the unique key used to identify each row in a table. Secondary indexes – refer to indexes on column values. 18
  • 19. DATA MODEL: CQL3 CQL3 cqlsh> INSERT INTO users (user_name, password) VALUES ('jsmith', 'ch@ngem3a'); cqlsh> SELECT * FROM users WHERE user_name='jsmith'; user_name | password | state -----------+-----------+------jsmith | ch@ngem3a | null Confidential 19