SlideShare una empresa de Scribd logo
1 de 52
Descargar para leer sin conexión
Evaluating Apache Cassandra as a Cloud Database 
Christian Johannsen, Solutions Engineer
About me 
2 
Christian Johannsen 
Solutions Engineer @ DataStax 
Twitter: @cjohannsen81
What is a cloud database not? 
3 
• A Cloud database is not simply taking a traditional RDBMS and 
running it in a Cloud provider’s environment
What is a cloud database? 
4 
• A cloud database is a database optimised to handle the 
infrastructure limitations and features of running on 
shared, geographically dispersed infrastructure. 
• Some of them are: Outages, noisy neighbours, security, 
variable networks etc…
Key attributes of a cloud database 
5 
• Transparent Elasticity 
• being able to add and subtract nodes (physical or virtual machines) with 
load balancing when the underlying application and business demands it. 
• Transparent Scalability 
• addition of nodes increases both (1) performance throughput; (2) ability to 
handle Big Data and maintain high performance 
• High Availability 
• always up; no single point of failure 
• Easy Data Distribution 
• able to span multiple geographies, data centers, and cloud provider 
zones. Can read/write to any node
6 
Key attributes of a cloud database 
• Support for multiple data types 
• Able to handle structured, semi-structured, and unstructured data. 
• Easy to manage 
• easy to administer a logical database across many nodes 
• Lower Cost 
• It shouldn’t break the bank and you shouldn’t have to have a massive 
upfront cost 
• Multiple Infrastructure Support 
• It should support being deployed on multiple cloud providers (public and 
private) as well as traditional infrastructure 
• Security 
• Data should be secure in flight, at rest and audited.
Key Attributes at a glance 
7 
1. Transparent Elasticity 
2. Transparent Scalability 
3. High Availability 
4. Easy Data Distribution 
5. Data Redundancy 
6. Support for multiple data types 
7. Easy to manage 
8. Lower Cost 
9. Multiple Infrastructure Support 
10. Security
Apache Cassandra 
8
Apache Cassandra 
9 
• Cassandra is a massively scalable, open source, NoSQL, distributed 
database built for modern, mission-critical online applications. 
• Written in Java and is a hybrid of Amazon Dynamo and Google BigTable 
• Masterless with no single point of failure 
• Distributed, data centre and rack aware 
• 100% uptime 
• Predictable scaling Dynamo 
BigTable 
BigTable: http://research.google.com/archive/bigtable-osdi06.pdf 
Dynamo: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
Apache Cassandra Overview 
• Cassandra was designed with the understanding that system/hardware failures 
10 
can and do occur 
• Peer-to-peer, distributed system 
• All nodes the same - masterless with no single point of failure 
• Data partitioned among all nodes in the cluster 
• Custom data replication to ensure fault tolerance 
• Tunable data consistency 
• Data Center and Rack aware 
• Familiar SQL-Like language – CQL 
• More capacity? Add a server! 
• More throughput? Add a server!
Cassandra Adoption 
11
Cassandra - Locally Distributed 
12 
• Replication factor (RF): How many copies of your 
data? 
• RF = 3 in this example 
• Each node is storing 60% of the clusters total data 
i.e. 3/5 
Handy Calculator: http://www.ecyrd.com/cassandracalculator/ 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy 
• Client reads or writes to any node 
• Node coordinates with others 
• Data read or replicated in parallel 
Node 2 
2nd copy
Cassandra - Rack/Zone aware 
Rack 1 
13 
Node 1 
1st copy 
Rack 2 Rack 2 
Node 4 
Node 2 
Node 3 
2nd copy 
Rack 3 
Rack 1 
Node 5 
3rd copy 
• Cassandra is aware of which rack or 
zone each nod resides in 
• It will attempt to place each data copy 
in a different rack 
• RF=3 in this example
Cassandra - DC/Region aware 
14 
• Active Everywhere – reads/writes in multiple data centres 
• Client writes local 
• Data syncs across WAN 
• Replication Factor per DC 
• Different number of nodes per 
data center 
DC: USA DC: EUROPE 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy
Cassandra - Tuneable Consistency 
15 
• Consistency Level (CL) 
• Client specifies per operation 
• Handles multi-data center operations 
ALL = All replicas ack 
QUORUM = > 51% of replicas ack 
LOCAL_QUORUM = > 51% in local DC ack 
ONE = Only one replica acks 
Plus more…. (see docs) 
Blog: Eventual Consistency != Hopeful Consistency 
http://planetcassandra.org/blog/post/a-netflix-experiment-eventual-consistency-hopeful-consistency- 
by-christos-kalantzis/ 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy 
Parallel 
Write 
Write 
CL=QUORUM 
5 μs ack 
12 μs ack 
500 μs ack 
12 μs ack
Cassandra - Node failure 
16 
• A single node failure shouldn’t bring failure. 
• Replication Factor + Consistency Level = Success 
• This example: 
• RF = 3 
• CL = QUORUM 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy 
Parallel 
Write 
Write 
CL=QUORUM 
5 μs ack 
12 μs ack 
12 μs ack 
>51% ack – so request is a success
Cassandra - Node Recovery 
17 
• When a write is performed and a replica node for the row is unavailable the 
coordinator will store a hint locally (3 hours) 
• When the node recovers, the coordinator replays the missed writes. 
• Note: a hinted write does not count the consistency level 
• Note: you should still run repairs across your cluster 
Node 1 
1st copy 
Node 4 
Node 5 
Node 2 
2nd copy 
Node 3 
3rd copy 
Stores Hints while Node 3 is offline
Cassandra Rack/Zone Failure 
Rack 1 
18 
• Cassandra will place the data in as many 
different racks or availability zones as it can. 
• This example: 
• RF = 3 
• CL = QUORUM 
• AZ/Rack 2 fails 
• Data copies still available in Node 1 and 
Node 5 
• Quorum can be honored i.e. > 51% ack 
request is a success 
Node 1 
1st copy 
Rack 2 Rack 2 
Node 4 
Node 2 
Node 3 
2nd copy 
Rack 3 
Rack 1 
Node 5 
3rd copy
Operational Simplicity 
19 
• Cassandra is a complete product – there is not a multitude 
of components to install, set-up and monitor. 
• Extremely simple to administer and deploy 
• Backups are instantaneous and simple to restore 
• Supports snapshots, incremental backups and point-in-time recovery. 
• Cassandra can handle non-uniform hardware and disks. 
o This enables the mixing of solid state and spinning disks in a single cluster and pinning 
tables to workload-appropriate disks. 
• No downtime is required in Cassandra for upgrades or 
adding/removing servers from the cluster.
What ´s up with DataStax? 
20
DataStax at a glance 
21 
330+ 
Employees Percent Customers 
Founded in April 2010 
~25 500+ 
Santa Clara, Austin, New York, London, Sydney
DataStax delivers value 
22 
Certified, 
Enterprise-ready 
Cassandra 
Security Analytics Search Visual 
Monitoring 
Management 
Services 
In-Memory 
Dev. IDE & 
Drivers 
Professional 
Services 
Support & 
Training 
Commercial 
Confidence 
Enterprise 
Functionality
Enterprise Integrations 
23 
• DataStax adds Enterprise Features like: Hadoop, Solr, 
Spark
DataStax OpsCenter 
24 
• DataStax OpsCenter is a browser-based, visual management and 
monitoring solution for Apache Cassandra and DataStax Enterprise 
• Functionality is also exposed via HTTP APIs
OpsCenter - New Cluster Example 
25 
A new, 10-node Cassandra (or A new, 10-node DSE cluste Hr awdiothop O) cplsuCsteern wteitrh r OunpnsCinegn toenr rAunWnSin gin in 3 3 m miinnuutteess…… 
1 2 3 Done
OpsCenter 5.0 
26 
• Manage multiple clusters and nodes 
• Add and remove nodes 
• Administer individual nodes or in bulk 
• Configure clusters 
• Perform rolling restarts 
• Automatically repair data 
• Rebalance data 
• Backup management 
• Capacity planning
DataStax Office Demo 
27 
• 32 Raspberry Pi´s 
• 16 per DataStax Enterprise 4.5 Cluster 
• Managed in OpsCenter 5.0 
• “Red Button” downs one DataCenter 
• Not the Performance-Demo but 
• Availability 
• Commodity Hardware
Native Drivers 
28 
• Different Native Drivers available: Java, Python etc. 
• Load Balancing Policies (Client Driver receives Updates) 
• Data Centre Aware 
• Latency Aware 
• Token Aware 
• Reconnection policies 
• Retry policies 
• Downgrading Consistency 
• Plus others.. 
• http://www.datastax.com/download/clientdrivers
CQL 
29 
• Cassandra Query Language 
• CQL is intended to provide a common, simpler and easier to use 
interface into Cassandra - and you probably already know it! 
• e.g. SELECT * FROM users 
• Usual statements: 
• CREATE / DROP / ALTER TABLE / SELECT
CQL Basics 
CREATE KEYSPACE league WITH REPLICATION = {‘class’:’NetworkTopologyStrategy’, ‘DataCentre1’:3, 
‘DataCentre2’: 2}; 
30 
USE league; 
CREATE TABLE teams ( 
team_name varchar, 
player_name varchar, 
jersey int, 
PRIMARY KEY (team_name, player_name) 
); 
SELECT * FROM teams WHERE team_name = ‘Mighty Mutts’ and player_name = ‘Lucky’; 
INSERT INTO teams (team_name, player_name, jersey) VALUES ('Mighty Mutts',’Felix’,90);
CQL Data Types 
31
DevCenter 1.1 
32 
• Visual Query Tool for Developers and Administrators 
• Easily create and run Cassandra Queries 
• Visually navigate database objects 
• Context-based suggestions
Do you remember the facts? 
33
Key Attributes of a cloud database 
34 
1. Transparent Elasticity 
2. Transparent Scalability 
3. High Availability 
4. Easy Data Distribution 
5. Data Redundancy 
6. Support for multiple data types 
7. Easy to manage 
8. Lower Cost 
9. Multiple Infrastructure Support 
10. Security
Transparent Elasticity 
• Nodes can be added and removed from Cassandra online, with no 
35 
downtime being experienced. 
1 
2 
3 
4 
6 
5 
1 
7 
10 
4 
2 
3 
5 
6 
8 
9 
11 
12
Transparent Scalability 
36 
• Addition of Cassandra nodes increases performance linearly and 
ability to manage TB’s-PB’s of data. 
1 
2 
3 
4 
6 
5 
1 
7 
10 
4 
2 
3 
5 
6 
8 
9 
11 
12 
Performance 
throughput = N 
Performance 
throughput = N x 2
High Availability 
37 
• Cassandra, with its peer-to-peer masterless architecture has no 
single point of failure. 
• 100% uptime is the norm!
Easy Data Distribution 
38 
• Cassandra allows a single logical database to span 1-N Data 
Centers that are geographically dispersed. 
• Also supports a hybrid Cloud implementation.
Data Redundancy 
39 
• Cassandra allows for customisable data redundancy so that data 
is completely protected. 
• Also supports rack/zone awareness (data can be replicated 
between different racks to guard against machine/rack failures).
Support for multiple data types 
40 
• Cassandra’s data model (based on Google’s Bigtable) allows a 
user to store structured, semi-structured, and unstructured data 
with ease. 
Portfolio Keyspace 
Customer Table 
ID Name SSN DOB
Easy to manage 
41 
• OpsCenter, AMI installers etc.. allow you to install and configure 
an entire multi-node Cloud implementation in minutes. 
• All can be managed and monitored via Web-based console or via 
RESTful APIs.
Lower Cost 
42 
• Cassandra is open source software and is freely available. 
• Commercial/advanced versions of Cassandra are available from 
DataStax along with support and additional services.
Multiple Infrastructure Support 
43 
• Cassandra is supported on popular Cloud provider platforms and 
operating systems.
Security 
44 
• Cassandra has multiple security features - authentication, 
authorisation, inflight encryption 
• Advanced “enterprise level” security can also provided via 
DataStax Enterprise
What´s new?! 
45
What is Spark? 
46 
• Apache Project since 2010 
• 10-100x faster than Hadoop MapReduce 
• In-Memory Storage 
• Single JVM Processor per node 
• Rich Scala, Java and Python API´s 
• 2x-5x less code 
• Interactive Shell
Why Spark on Cassandra? 
47 
• Data model independent queries 
• cross-table operations (JOIN, UNION, etc.)! 
• complex analytics (e.g. machine learning) 
• data transformation, aggregation etc. 
• stream processing (coming soon) 
• all nodes are Spark workers 
• by default resilient to worker failures 
• first node promoted as Spark Master 
• Standby Master promoted on failure 
• Master HA available in Dactastax Enterprise
How to Spark on Cassandra? 
48 
• DataStax Cassandra Spark Driver on GitHub 
• Compatible with: 
• Spark 0.9+ 
• Cassandra 2.0+ 
• DataStax Enterprise 4.5+ 
• Cassandra Tables exposes as Spark RDDs (Resilient Distributed 
Datasets 
• Read from and write to Cassandra 
• Mapping of Cassandra tables and rows to Scala objects 
• Scala only driver for now
Spark type mapping 
49
2.1 Release - User Defined Types 
50 
CREATE TYPE address ( 
street text, 
city text, 
zip_code int, 
phones set<text> 
) 
CREATE TABLE users ( 
id uuid PRIMARY KEY, 
name text, 
addresses map<text, address> 
) 
SELECT id, name, addresses.city, addresses.phones FROM users; 
id | name | addresses.city | addresses.phones 
--------------------+----------------+-------------------------- 
63bf691f | chris | Berlin | {’0201234567', ’0796622222'}
2.1 Release - Secondary Indexes on 
collections 
51 
CREATE TABLE songs ( 
id uuid PRIMARY KEY, 
artist text, 
album text, 
title text, 
data blob, 
tags set<text> 
); 
CREATE INDEX song_tags_idx ON songs(tags); 
SELECT * FROM songs WHERE tags CONTAINS 'blues'; 
id | album | artist | tags | title 
----------+---------------+-------------------+-----------------------+------------------ 
5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind
Thanks! Let´s see a demo! 
52

Más contenido relacionado

La actualidad más candente

Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsJulien Anguenot
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...DataStax
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... CassandraInstaclustr
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsAcunu
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...DataStax
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraGokhan Atil
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...DataStax
 
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016DataStax
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value StoreSantal Li
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecturenickmbailey
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Mesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run CassandraMesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run CassandraDataStax Academy
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsJulien Anguenot
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Benoit Perroud
 

La actualidad más candente (20)

Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problems
 
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
Building a Multi-Region Cluster at Target (Aaron Ploetz, Target) | Cassandra ...
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft...
 
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value Store
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Mesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run CassandraMesosphere and Contentteam: A New Way to Run Cassandra
Mesosphere and Contentteam: A New Way to Run Cassandra
 
Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
Advanced Operations
Advanced OperationsAdvanced Operations
Advanced Operations
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Spark
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
Cassandra ppt 2
Cassandra ppt 2Cassandra ppt 2
Cassandra ppt 2
 

Destacado

Nutanix - Expert Session - Metro Availability
Nutanix -  Expert Session - Metro AvailabilityNutanix -  Expert Session - Metro Availability
Nutanix - Expert Session - Metro AvailabilityChristian Johannsen
 
2016 11-16 Citrix XenServer & Nutanix Master Class
2016 11-16 Citrix XenServer & Nutanix Master Class2016 11-16 Citrix XenServer & Nutanix Master Class
2016 11-16 Citrix XenServer & Nutanix Master ClassMarc Trouard-Riolle
 
Webinar: Network Automation [Tips & Tricks]
Webinar: Network Automation [Tips & Tricks]Webinar: Network Automation [Tips & Tricks]
Webinar: Network Automation [Tips & Tricks]Cumulus Networks
 
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open Networking
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open NetworkingNutanix + Cumulus Linux: Deploying True Hyper Convergence with Open Networking
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open NetworkingCumulus Networks
 
Demystifying Networking Webinar Series- Routing on the Host
Demystifying Networking Webinar Series- Routing on the HostDemystifying Networking Webinar Series- Routing on the Host
Demystifying Networking Webinar Series- Routing on the HostCumulus Networks
 
Network Architecture for Containers
Network Architecture for ContainersNetwork Architecture for Containers
Network Architecture for ContainersCumulus Networks
 
Cloud Businesses: Strategic Considerations
Cloud Businesses: Strategic ConsiderationsCloud Businesses: Strategic Considerations
Cloud Businesses: Strategic ConsiderationsTathagat Varma
 
Nutanix vdi workshop presentation
Nutanix vdi workshop presentationNutanix vdi workshop presentation
Nutanix vdi workshop presentationHe Hariyadi
 
Microservices Network Architecture 101
Microservices Network Architecture 101Microservices Network Architecture 101
Microservices Network Architecture 101Cumulus Networks
 
Nutanix Community Meetup #1 - Nutanix入門編
Nutanix Community Meetup #1 - Nutanix入門編Nutanix Community Meetup #1 - Nutanix入門編
Nutanix Community Meetup #1 - Nutanix入門編Satoshi Shimazaki
 
Building Scalable Data Center Networks
Building Scalable Data Center NetworksBuilding Scalable Data Center Networks
Building Scalable Data Center NetworksCumulus Networks
 
Nutanix NEXT on Tour - Maarssen, Netherlands
Nutanix NEXT on Tour - Maarssen, Netherlands  Nutanix NEXT on Tour - Maarssen, Netherlands
Nutanix NEXT on Tour - Maarssen, Netherlands NEXTtour
 
SDN/NFV Building Block Introduction
SDN/NFV Building Block IntroductionSDN/NFV Building Block Introduction
SDN/NFV Building Block IntroductionMichelle Holley
 
Demystifying Networking: Data Center Networking Trends 2017
Demystifying Networking: Data Center Networking Trends 2017Demystifying Networking: Data Center Networking Trends 2017
Demystifying Networking: Data Center Networking Trends 2017Cumulus Networks
 
Performance out of the box developers
Performance   out of the box developersPerformance   out of the box developers
Performance out of the box developersMichelle Holley
 
Business Transformation Readiness Assessment- The BTEP Way
Business Transformation Readiness Assessment- The BTEP WayBusiness Transformation Readiness Assessment- The BTEP Way
Business Transformation Readiness Assessment- The BTEP WayChandrashekhar More
 
SYN 104: Citrix and Nutanix
SYN 104: Citrix and Nutanix SYN 104: Citrix and Nutanix
SYN 104: Citrix and Nutanix Citrix
 

Destacado (20)

DataStax TechDay - Munich 2014
DataStax TechDay - Munich 2014DataStax TechDay - Munich 2014
DataStax TechDay - Munich 2014
 
Nutanix - Expert Session - Metro Availability
Nutanix -  Expert Session - Metro AvailabilityNutanix -  Expert Session - Metro Availability
Nutanix - Expert Session - Metro Availability
 
2016 11-16 Citrix XenServer & Nutanix Master Class
2016 11-16 Citrix XenServer & Nutanix Master Class2016 11-16 Citrix XenServer & Nutanix Master Class
2016 11-16 Citrix XenServer & Nutanix Master Class
 
Webinar: Network Automation [Tips & Tricks]
Webinar: Network Automation [Tips & Tricks]Webinar: Network Automation [Tips & Tricks]
Webinar: Network Automation [Tips & Tricks]
 
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open Networking
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open NetworkingNutanix + Cumulus Linux: Deploying True Hyper Convergence with Open Networking
Nutanix + Cumulus Linux: Deploying True Hyper Convergence with Open Networking
 
Demystifying Networking Webinar Series- Routing on the Host
Demystifying Networking Webinar Series- Routing on the HostDemystifying Networking Webinar Series- Routing on the Host
Demystifying Networking Webinar Series- Routing on the Host
 
Network Architecture for Containers
Network Architecture for ContainersNetwork Architecture for Containers
Network Architecture for Containers
 
Cloud Businesses: Strategic Considerations
Cloud Businesses: Strategic ConsiderationsCloud Businesses: Strategic Considerations
Cloud Businesses: Strategic Considerations
 
Nutanix vdi workshop presentation
Nutanix vdi workshop presentationNutanix vdi workshop presentation
Nutanix vdi workshop presentation
 
Microservices Network Architecture 101
Microservices Network Architecture 101Microservices Network Architecture 101
Microservices Network Architecture 101
 
Nutanix Community Meetup #1 - Nutanix入門編
Nutanix Community Meetup #1 - Nutanix入門編Nutanix Community Meetup #1 - Nutanix入門編
Nutanix Community Meetup #1 - Nutanix入門編
 
Building Scalable Data Center Networks
Building Scalable Data Center NetworksBuilding Scalable Data Center Networks
Building Scalable Data Center Networks
 
Nutanix NEXT on Tour - Maarssen, Netherlands
Nutanix NEXT on Tour - Maarssen, Netherlands  Nutanix NEXT on Tour - Maarssen, Netherlands
Nutanix NEXT on Tour - Maarssen, Netherlands
 
LB for type2
LB for type2LB for type2
LB for type2
 
SDN/NFV Building Block Introduction
SDN/NFV Building Block IntroductionSDN/NFV Building Block Introduction
SDN/NFV Building Block Introduction
 
Demystifying Networking: Data Center Networking Trends 2017
Demystifying Networking: Data Center Networking Trends 2017Demystifying Networking: Data Center Networking Trends 2017
Demystifying Networking: Data Center Networking Trends 2017
 
Performance out of the box developers
Performance   out of the box developersPerformance   out of the box developers
Performance out of the box developers
 
Business Transformation Readiness Assessment- The BTEP Way
Business Transformation Readiness Assessment- The BTEP WayBusiness Transformation Readiness Assessment- The BTEP Way
Business Transformation Readiness Assessment- The BTEP Way
 
SYN 104: Citrix and Nutanix
SYN 104: Citrix and Nutanix SYN 104: Citrix and Nutanix
SYN 104: Citrix and Nutanix
 
Nutanix
NutanixNutanix
Nutanix
 

Similar a BigData Developers MeetUp

Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Johnny Miller
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideMohammed Fazuluddin
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonSpark Summit
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaHelena Edelson
 
Aruman Cassandra database
Aruman Cassandra databaseAruman Cassandra database
Aruman Cassandra databaseUmesh Dande
 
High Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureHigh Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureDataStax Academy
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial Na Zhu
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresmkorremans
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...DataStax Academy
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster inwin stack
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesPhil Peace
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcturesabnees
 

Similar a BigData Developers MeetUp (20)

Devops kc
Devops kcDevops kc
Devops kc
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and Akka
 
Aruman Cassandra database
Aruman Cassandra databaseAruman Cassandra database
Aruman Cassandra database
 
High Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureHigh Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & Azure
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-features
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
 

Último

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainAbdul Ahad
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotEdgard Alejos
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 

Último (20)

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Data modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software DomainData modeling 101 - Basics - Software Domain
Data modeling 101 - Basics - Software Domain
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Copilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform CopilotCopilot para Microsoft 365 y Power Platform Copilot
Copilot para Microsoft 365 y Power Platform Copilot
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 

BigData Developers MeetUp

  • 1. Evaluating Apache Cassandra as a Cloud Database Christian Johannsen, Solutions Engineer
  • 2. About me 2 Christian Johannsen Solutions Engineer @ DataStax Twitter: @cjohannsen81
  • 3. What is a cloud database not? 3 • A Cloud database is not simply taking a traditional RDBMS and running it in a Cloud provider’s environment
  • 4. What is a cloud database? 4 • A cloud database is a database optimised to handle the infrastructure limitations and features of running on shared, geographically dispersed infrastructure. • Some of them are: Outages, noisy neighbours, security, variable networks etc…
  • 5. Key attributes of a cloud database 5 • Transparent Elasticity • being able to add and subtract nodes (physical or virtual machines) with load balancing when the underlying application and business demands it. • Transparent Scalability • addition of nodes increases both (1) performance throughput; (2) ability to handle Big Data and maintain high performance • High Availability • always up; no single point of failure • Easy Data Distribution • able to span multiple geographies, data centers, and cloud provider zones. Can read/write to any node
  • 6. 6 Key attributes of a cloud database • Support for multiple data types • Able to handle structured, semi-structured, and unstructured data. • Easy to manage • easy to administer a logical database across many nodes • Lower Cost • It shouldn’t break the bank and you shouldn’t have to have a massive upfront cost • Multiple Infrastructure Support • It should support being deployed on multiple cloud providers (public and private) as well as traditional infrastructure • Security • Data should be secure in flight, at rest and audited.
  • 7. Key Attributes at a glance 7 1. Transparent Elasticity 2. Transparent Scalability 3. High Availability 4. Easy Data Distribution 5. Data Redundancy 6. Support for multiple data types 7. Easy to manage 8. Lower Cost 9. Multiple Infrastructure Support 10. Security
  • 9. Apache Cassandra 9 • Cassandra is a massively scalable, open source, NoSQL, distributed database built for modern, mission-critical online applications. • Written in Java and is a hybrid of Amazon Dynamo and Google BigTable • Masterless with no single point of failure • Distributed, data centre and rack aware • 100% uptime • Predictable scaling Dynamo BigTable BigTable: http://research.google.com/archive/bigtable-osdi06.pdf Dynamo: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
  • 10. Apache Cassandra Overview • Cassandra was designed with the understanding that system/hardware failures 10 can and do occur • Peer-to-peer, distributed system • All nodes the same - masterless with no single point of failure • Data partitioned among all nodes in the cluster • Custom data replication to ensure fault tolerance • Tunable data consistency • Data Center and Rack aware • Familiar SQL-Like language – CQL • More capacity? Add a server! • More throughput? Add a server!
  • 12. Cassandra - Locally Distributed 12 • Replication factor (RF): How many copies of your data? • RF = 3 in this example • Each node is storing 60% of the clusters total data i.e. 3/5 Handy Calculator: http://www.ecyrd.com/cassandracalculator/ Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy • Client reads or writes to any node • Node coordinates with others • Data read or replicated in parallel Node 2 2nd copy
  • 13. Cassandra - Rack/Zone aware Rack 1 13 Node 1 1st copy Rack 2 Rack 2 Node 4 Node 2 Node 3 2nd copy Rack 3 Rack 1 Node 5 3rd copy • Cassandra is aware of which rack or zone each nod resides in • It will attempt to place each data copy in a different rack • RF=3 in this example
  • 14. Cassandra - DC/Region aware 14 • Active Everywhere – reads/writes in multiple data centres • Client writes local • Data syncs across WAN • Replication Factor per DC • Different number of nodes per data center DC: USA DC: EUROPE Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy
  • 15. Cassandra - Tuneable Consistency 15 • Consistency Level (CL) • Client specifies per operation • Handles multi-data center operations ALL = All replicas ack QUORUM = > 51% of replicas ack LOCAL_QUORUM = > 51% in local DC ack ONE = Only one replica acks Plus more…. (see docs) Blog: Eventual Consistency != Hopeful Consistency http://planetcassandra.org/blog/post/a-netflix-experiment-eventual-consistency-hopeful-consistency- by-christos-kalantzis/ Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy Parallel Write Write CL=QUORUM 5 μs ack 12 μs ack 500 μs ack 12 μs ack
  • 16. Cassandra - Node failure 16 • A single node failure shouldn’t bring failure. • Replication Factor + Consistency Level = Success • This example: • RF = 3 • CL = QUORUM Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy Parallel Write Write CL=QUORUM 5 μs ack 12 μs ack 12 μs ack >51% ack – so request is a success
  • 17. Cassandra - Node Recovery 17 • When a write is performed and a replica node for the row is unavailable the coordinator will store a hint locally (3 hours) • When the node recovers, the coordinator replays the missed writes. • Note: a hinted write does not count the consistency level • Note: you should still run repairs across your cluster Node 1 1st copy Node 4 Node 5 Node 2 2nd copy Node 3 3rd copy Stores Hints while Node 3 is offline
  • 18. Cassandra Rack/Zone Failure Rack 1 18 • Cassandra will place the data in as many different racks or availability zones as it can. • This example: • RF = 3 • CL = QUORUM • AZ/Rack 2 fails • Data copies still available in Node 1 and Node 5 • Quorum can be honored i.e. > 51% ack request is a success Node 1 1st copy Rack 2 Rack 2 Node 4 Node 2 Node 3 2nd copy Rack 3 Rack 1 Node 5 3rd copy
  • 19. Operational Simplicity 19 • Cassandra is a complete product – there is not a multitude of components to install, set-up and monitor. • Extremely simple to administer and deploy • Backups are instantaneous and simple to restore • Supports snapshots, incremental backups and point-in-time recovery. • Cassandra can handle non-uniform hardware and disks. o This enables the mixing of solid state and spinning disks in a single cluster and pinning tables to workload-appropriate disks. • No downtime is required in Cassandra for upgrades or adding/removing servers from the cluster.
  • 20. What ´s up with DataStax? 20
  • 21. DataStax at a glance 21 330+ Employees Percent Customers Founded in April 2010 ~25 500+ Santa Clara, Austin, New York, London, Sydney
  • 22. DataStax delivers value 22 Certified, Enterprise-ready Cassandra Security Analytics Search Visual Monitoring Management Services In-Memory Dev. IDE & Drivers Professional Services Support & Training Commercial Confidence Enterprise Functionality
  • 23. Enterprise Integrations 23 • DataStax adds Enterprise Features like: Hadoop, Solr, Spark
  • 24. DataStax OpsCenter 24 • DataStax OpsCenter is a browser-based, visual management and monitoring solution for Apache Cassandra and DataStax Enterprise • Functionality is also exposed via HTTP APIs
  • 25. OpsCenter - New Cluster Example 25 A new, 10-node Cassandra (or A new, 10-node DSE cluste Hr awdiothop O) cplsuCsteern wteitrh r OunpnsCinegn toenr rAunWnSin gin in 3 3 m miinnuutteess…… 1 2 3 Done
  • 26. OpsCenter 5.0 26 • Manage multiple clusters and nodes • Add and remove nodes • Administer individual nodes or in bulk • Configure clusters • Perform rolling restarts • Automatically repair data • Rebalance data • Backup management • Capacity planning
  • 27. DataStax Office Demo 27 • 32 Raspberry Pi´s • 16 per DataStax Enterprise 4.5 Cluster • Managed in OpsCenter 5.0 • “Red Button” downs one DataCenter • Not the Performance-Demo but • Availability • Commodity Hardware
  • 28. Native Drivers 28 • Different Native Drivers available: Java, Python etc. • Load Balancing Policies (Client Driver receives Updates) • Data Centre Aware • Latency Aware • Token Aware • Reconnection policies • Retry policies • Downgrading Consistency • Plus others.. • http://www.datastax.com/download/clientdrivers
  • 29. CQL 29 • Cassandra Query Language • CQL is intended to provide a common, simpler and easier to use interface into Cassandra - and you probably already know it! • e.g. SELECT * FROM users • Usual statements: • CREATE / DROP / ALTER TABLE / SELECT
  • 30. CQL Basics CREATE KEYSPACE league WITH REPLICATION = {‘class’:’NetworkTopologyStrategy’, ‘DataCentre1’:3, ‘DataCentre2’: 2}; 30 USE league; CREATE TABLE teams ( team_name varchar, player_name varchar, jersey int, PRIMARY KEY (team_name, player_name) ); SELECT * FROM teams WHERE team_name = ‘Mighty Mutts’ and player_name = ‘Lucky’; INSERT INTO teams (team_name, player_name, jersey) VALUES ('Mighty Mutts',’Felix’,90);
  • 32. DevCenter 1.1 32 • Visual Query Tool for Developers and Administrators • Easily create and run Cassandra Queries • Visually navigate database objects • Context-based suggestions
  • 33. Do you remember the facts? 33
  • 34. Key Attributes of a cloud database 34 1. Transparent Elasticity 2. Transparent Scalability 3. High Availability 4. Easy Data Distribution 5. Data Redundancy 6. Support for multiple data types 7. Easy to manage 8. Lower Cost 9. Multiple Infrastructure Support 10. Security
  • 35. Transparent Elasticity • Nodes can be added and removed from Cassandra online, with no 35 downtime being experienced. 1 2 3 4 6 5 1 7 10 4 2 3 5 6 8 9 11 12
  • 36. Transparent Scalability 36 • Addition of Cassandra nodes increases performance linearly and ability to manage TB’s-PB’s of data. 1 2 3 4 6 5 1 7 10 4 2 3 5 6 8 9 11 12 Performance throughput = N Performance throughput = N x 2
  • 37. High Availability 37 • Cassandra, with its peer-to-peer masterless architecture has no single point of failure. • 100% uptime is the norm!
  • 38. Easy Data Distribution 38 • Cassandra allows a single logical database to span 1-N Data Centers that are geographically dispersed. • Also supports a hybrid Cloud implementation.
  • 39. Data Redundancy 39 • Cassandra allows for customisable data redundancy so that data is completely protected. • Also supports rack/zone awareness (data can be replicated between different racks to guard against machine/rack failures).
  • 40. Support for multiple data types 40 • Cassandra’s data model (based on Google’s Bigtable) allows a user to store structured, semi-structured, and unstructured data with ease. Portfolio Keyspace Customer Table ID Name SSN DOB
  • 41. Easy to manage 41 • OpsCenter, AMI installers etc.. allow you to install and configure an entire multi-node Cloud implementation in minutes. • All can be managed and monitored via Web-based console or via RESTful APIs.
  • 42. Lower Cost 42 • Cassandra is open source software and is freely available. • Commercial/advanced versions of Cassandra are available from DataStax along with support and additional services.
  • 43. Multiple Infrastructure Support 43 • Cassandra is supported on popular Cloud provider platforms and operating systems.
  • 44. Security 44 • Cassandra has multiple security features - authentication, authorisation, inflight encryption • Advanced “enterprise level” security can also provided via DataStax Enterprise
  • 46. What is Spark? 46 • Apache Project since 2010 • 10-100x faster than Hadoop MapReduce • In-Memory Storage • Single JVM Processor per node • Rich Scala, Java and Python API´s • 2x-5x less code • Interactive Shell
  • 47. Why Spark on Cassandra? 47 • Data model independent queries • cross-table operations (JOIN, UNION, etc.)! • complex analytics (e.g. machine learning) • data transformation, aggregation etc. • stream processing (coming soon) • all nodes are Spark workers • by default resilient to worker failures • first node promoted as Spark Master • Standby Master promoted on failure • Master HA available in Dactastax Enterprise
  • 48. How to Spark on Cassandra? 48 • DataStax Cassandra Spark Driver on GitHub • Compatible with: • Spark 0.9+ • Cassandra 2.0+ • DataStax Enterprise 4.5+ • Cassandra Tables exposes as Spark RDDs (Resilient Distributed Datasets • Read from and write to Cassandra • Mapping of Cassandra tables and rows to Scala objects • Scala only driver for now
  • 50. 2.1 Release - User Defined Types 50 CREATE TYPE address ( street text, city text, zip_code int, phones set<text> ) CREATE TABLE users ( id uuid PRIMARY KEY, name text, addresses map<text, address> ) SELECT id, name, addresses.city, addresses.phones FROM users; id | name | addresses.city | addresses.phones --------------------+----------------+-------------------------- 63bf691f | chris | Berlin | {’0201234567', ’0796622222'}
  • 51. 2.1 Release - Secondary Indexes on collections 51 CREATE TABLE songs ( id uuid PRIMARY KEY, artist text, album text, title text, data blob, tags set<text> ); CREATE INDEX song_tags_idx ON songs(tags); SELECT * FROM songs WHERE tags CONTAINS 'blues'; id | album | artist | tags | title ----------+---------------+-------------------+-----------------------+------------------ 5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind
  • 52. Thanks! Let´s see a demo! 52