Rimas Silkaitis
From Postgres to Cassandra
NoSQL vs SQL
||
&&
Rimas Silkaitis
Product
@neovintage
app cloud
DEPLOY MANAGE SCALE
$ git push heroku master
Counting objects: 11, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (11/11), 22.29 KiB | 0 bytes/s, done.
Total 11 (delta 1), reused 0 (delta 0)
remote: Compressing source files... done.
remote: Building source:
remote:
remote: -----> Ruby app detected
remote: -----> Compiling Ruby
remote: -----> Using Ruby version: ruby-2.3.1
Heroku Postgres
Over 1 Million Active DBs
Heroku Redis
Over 100K Active Instances
Apache Kafka on Heroku
Runtime
Runtime
Workers
$ psql
psql => \d
List of relations
schema | name | type | owner
--------+----------+-------+-----------
public | users | table | neovintage
public | accounts | table | neovintage
public | events | table | neovintage
public | tasks | table | neovintage
public | lists | table | neovintage
Ugh… Database Problems
Site Traffic
Events
* Totally Not to Scale
One
Big Table
Problem
CREATE TABLE users (
  id bigserial,
  account_id bigint,
  name text,
  email text,
  encrypted_password text,
  created_at timestamptz,
  updated_at timestamptz
);
CREATE TABLE accounts (
  id bigserial,
  name text,
  owner_id bigint,
  created_at timestamptz,
  updated_at timestamptz
);
CREATE TABLE events (
  user_id bigint,
  account_id bigint,
  session_id text,
  occurred_at timestamptz,
  category text,
  action text,
  label text,
  attributes jsonb
);
Table
events
events
events_20160901
events_20160902
events_20160903
events_20160904
Add Some Triggers
$ psql
neovintage::DB=> \e
INSERT INTO events (
  user_id,
  account_id,
  category,
  action,
  occurred_at)
VALUES (1,
  2,
  'in_app',
  'purchase_upgrade',
  '2016-09-07 11:00:00 -07:00');
events_20160901
events_20160902
events_20160903
events_20160904
events
INSERT
query
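The routing a partitioning trigger performs can be sketched outside the database. This is a minimal Python illustration, not the trigger itself. The function name partition_table_for is hypothetical, and bucketing by UTC day is an assumption the slides do not state.

```python
from datetime import datetime, timedelta, timezone


def partition_table_for(occurred_at: datetime, parent: str = "events") -> str:
    """Return the daily child table name for a timezone-aware event timestamp."""
    if occurred_at.tzinfo is None:
        raise ValueError("occurred_at must be timezone-aware")
    # Assumption: partitions are cut on UTC day boundaries, so one instant
    # always maps to exactly one child table.
    day = occurred_at.astimezone(timezone.utc).strftime("%Y%m%d")
    return f"{parent}_{day}"


pacific = timezone(timedelta(hours=-7))
event_time = datetime(2016, 9, 7, 11, 0, 0, tzinfo=pacific)
print(partition_table_for(event_time))  # events_20160907
```

An INSERT against the parent events table would be redirected to the table name this function returns. Queries that filter on occurred_at then only need to touch the matching daily tables.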
Constraints
• Data has little value after a period of time
• Small range of data has to be queried
• Old data can be archived or aggregated
There’s A Better Way
&&
One
Big Table
Problem
Why Introduce
Cassandra?
• Linear Scalability
• No Single Point of Failure
• Flexible Data Model
• Tunable Consistency
Runtime
Workers
New Architecture
I only know relational databases.
How do I do this?
Understanding Cassandra
Two Dimensional
Table Spaces
RELATIONAL
Associative Arrays
or Hash
KEY-VALUE
Postgres is Typically Run as Single Instance*
• Partitioned Key-Value Store
• Has a Grouping of Nodes (data
center)
• Data is distributed amongst the
nodes
Cassandra Cluster with 2 Data Centers
Cassandra Query Language
SQL-like
[sēkwel lahyk]
adjective
Resembling SQL in appearance,
behavior or character
adverb
In the manner of SQL
Let’s Talk About Primary Keys
Partition
Table
Partition Key
• 5 Node Cluster
• Simplest terms: Data is partitioned
amongst all the nodes using the
hashing function.
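A rough Python sketch of "partitioned using the hashing function" follows. It is a simplification under stated assumptions: the node names are hypothetical, MD5 stands in for Cassandra's real partitioner, and a plain modulo replaces the token ranges that nodes own on the ring.

```python
import hashlib

# Hypothetical node names for the 5 node cluster.
NODES = ["node-1", "node-2", "node-3", "node-4", "node-5"]


def token_for(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (MD5 used only as a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_for(partition_key: str, nodes=NODES) -> str:
    """Pick the node responsible for a key; the same key always lands on the same node."""
    return nodes[token_for(partition_key) % len(nodes)]


for key in ("user:1", "user:2", "user:3"):
    print(key, "->", node_for(key))
```

The point of the sketch is that placement depends only on the hash of the partition key, so any node can work out where a row lives without a central lookup.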
Replication Factor
Setting this parameter
tells Cassandra how
many nodes to copy
incoming data to
This is a replication factor of 3
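One way to picture a replication factor of 3 is to start at the node that owns the row and walk the ring to the next nodes. The Python sketch below assumes that simple "adjacent nodes" placement. The node names and the replicas_for helper are hypothetical, and data-center-aware placement is not modeled.

```python
def replicas_for(primary_index: int, nodes: list, replication_factor: int) -> list:
    """Return the nodes holding copies of a row, starting at its primary node."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and the node count")
    # Walk clockwise around the ring, wrapping at the end of the list.
    return [nodes[(primary_index + i) % len(nodes)] for i in range(replication_factor)]


nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]
print(replicas_for(3, nodes, 3))  # ['node-4', 'node-5', 'node-1']
```

With 5 nodes and a replication factor of 3, every row lives on 3 of the 5 nodes, so the cluster can lose a node and still hold a copy of everything.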
But I thought
Cassandra had
tables?
Prior to 3.0, tables were called column families
Let’s Model Our Events
Table in Cassandra
We’re not going to go
through any setup
Plenty of tutorials exist
for that sort of thing
Let’s assume we’re
working with a 5 node
cluster
$ psql
neovintage::DB=> \d events
Table "public.events"
Column | Type | Modifiers
---------------+--------------------------+-----------
user_id | bigint |
account_id | bigint |
session_id | text |
occurred_at | timestamp with time zone |
category | text |
action | text |
label | text |
attributes | jsonb |
$ cqlsh
cqlsh> CREATE KEYSPACE
  IF NOT EXISTS neovintage_prod
  WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3
  };
$ cqlsh
cqlsh> CREATE SCHEMA
  IF NOT EXISTS neovintage_prod
  WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3
  };
KEYSPACE ==
SCHEMA
• CQL can use KEYSPACE and SCHEMA
interchangeably
• SCHEMA in Cassandra is somewhere between
`CREATE DATABASE` and `CREATE SCHEMA` in
Postgres
Replication Strategy
Replication Factor
Replication Strategies
• NetworkTopologyStrategy - You have to define the
network topology by defining the data centers. No
magic here
• SimpleStrategy - Has no idea of the topology and
doesn’t care to. Data is replicated to adjacent nodes.
$ cqlsh
cqlsh> CREATE TABLE neovintage_prod.events (
  user_id bigint primary key,
  account_id bigint,
  session_id text,
  occurred_at timestamp,
  category text,
  action text,
  label text,
  attributes map<text, text>
);
Remember the Primary
Key?
• Postgres defines a PRIMARY KEY as a constraint
that a column or group of columns can be used as a
unique identifier for rows in the table.
• CQL shares that same constraint but extends the
definition even further: its main purpose is
to order information in the cluster.
• CQL includes partitioning and sort order of the data
on disk (clustering).
Single Column Primary
Key
• The single column is the partition key; there are no clustering columns.
• Syntactically, can be defined inline or as a separate
line within the DDL statement.
$ cqlsh
cqlsh> CREATE TABLE neovintage_prod.events (
  user_id bigint,
  account_id bigint,
  session_id text,
  occurred_at timestamp,
  category text,
  action text,
  label text,
  attributes map<text, text>,
  PRIMARY KEY (
    (user_id, occurred_at),
    account_id,
    session_id
  )
);
Composite
Partition Key
Clustering Keys
PRIMARY KEY (
(user_id, occurred_at),
account_id,
session_id
)
Composite Partition Key
• This means that both the user_id and the occurred_at
columns are going to be used to partition data.
• If you were to not include the inner parentheses, the
first column listed in this PRIMARY KEY definition
would be the sole partition key.
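The effect of the inner parentheses can be sketched in Python. The split_primary_key helper is hypothetical: it models a PRIMARY KEY definition as a list whose first element is either one column name or a tuple of column names, mirroring the parenthesized group.

```python
def split_primary_key(primary_key):
    """Split a PRIMARY KEY definition into (partition key columns, clustering columns)."""
    if not primary_key:
        raise ValueError("a primary key needs at least one column")
    first, *rest = primary_key
    # A parenthesized group (tuple) is a composite partition key;
    # a bare column name is the sole partition key.
    partition_key = list(first) if isinstance(first, (tuple, list)) else [first]
    return partition_key, list(rest)


# With inner parentheses: PRIMARY KEY ((user_id, occurred_at), account_id, session_id)
print(split_primary_key([("user_id", "occurred_at"), "account_id", "session_id"]))
# (['user_id', 'occurred_at'], ['account_id', 'session_id'])

# Without them: PRIMARY KEY (user_id, occurred_at, account_id, session_id)
print(split_primary_key(["user_id", "occurred_at", "account_id", "session_id"]))
# (['user_id'], ['occurred_at', 'account_id', 'session_id'])
```

Dropping the parentheses does not change which columns make up the primary key, only which of them decide the partition and which decide the sort order within it.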
PRIMARY KEY (
(user_id, occurred_at),
account_id,
session_id
)
Clustering Columns
• Defines how the data is sorted on disk. In this case, it’s
by account_id and then session_id
• It is possible to change the direction of the sort order
$ cqlsh
cqlsh> CREATE TABLE neovintage_prod.events (
  user_id bigint,
  account_id bigint,
  session_id text,
  occurred_at timestamp,
  category text,
  action text,
  label text,
  attributes map<text, text>,
  PRIMARY KEY (
    (user_id, occurred_at),
    account_id,
    session_id
  )
) WITH CLUSTERING ORDER BY (
  account_id desc, session_id asc
);
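As a rough picture of what this clustering order means for rows inside one partition, the Python sketch below sorts some made-up rows by account_id descending and then session_id ascending. The sample rows and the clustered helper are illustrative only; Cassandra maintains this order on disk itself.

```python
# Made-up rows that all belong to the same partition.
rows = [
    {"account_id": 1, "session_id": "b"},
    {"account_id": 2, "session_id": "b"},
    {"account_id": 2, "session_id": "a"},
    {"account_id": 1, "session_id": "a"},
]


def clustered(rows):
    """Order rows by account_id descending, then session_id ascending."""
    return sorted(rows, key=lambda r: (-r["account_id"], r["session_id"]))


for row in clustered(rows):
    print(row["account_id"], row["session_id"])
# 2 a
# 2 b
# 1 a
# 1 b
```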
Ahhhhh… Just
like SQL
Data Types
Postgres Type | Cassandra Type
--------------+-----------------
bigint        | bigint
int           | int
decimal       | decimal
float         | float
text          | text
varchar(n)    | varchar
bytea         | blob
json          | N/A
jsonb         | N/A
hstore        | map<type, type>
Challenges
• JSON / JSONB columns don't have 1:1 mappings in
Cassandra
• You’ll need to nest MAP type in Cassandra or flatten
out your JSON
• Be careful about timestamps!! Time zones are already
challenging in Postgres.
• If you don’t specify a time zone in Cassandra the time
zone of the coordinator node is used. Always specify
one.
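Two of these challenges can be handled in application code before the data reaches Cassandra. The Python sketch below is one possible approach, not a prescribed one. The flatten and cql_timestamp helpers are hypothetical, the dotted-key naming is an assumption, and empty objects or arrays are simply dropped.

```python
import json
from datetime import datetime, timedelta, timezone


def flatten(value, prefix=""):
    """Flatten nested JSON into a flat {text: text} dict suited to map<text, text>."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}.{index}" if prefix else str(index)))
    else:
        # Strings pass through; numbers, booleans and null are JSON-encoded as text.
        flat[prefix] = value if isinstance(value, str) else json.dumps(value)
    return flat


def cql_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime with an explicit UTC offset."""
    if moment.tzinfo is None:
        raise ValueError("refusing a naive datetime: always specify a time zone")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S+0000")


attributes = {"plan": {"name": "pro", "seats": 5}, "tags": ["a", "b"], "trial": False}
print(flatten(attributes))
# {'plan.name': 'pro', 'plan.seats': '5', 'tags.0': 'a', 'tags.1': 'b', 'trial': 'false'}

pacific = timezone(timedelta(hours=-7))
print(cql_timestamp(datetime(2016, 9, 7, 11, 0, 0, tzinfo=pacific)))
# 2016-09-07 18:00:00+0000
```

Flattening loses type information, since every value becomes text. The application has to know how to parse values back when it reads them.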
Ready for
Webscale
General Tips
• Just like Table Partitioning in Postgres, you need to
think about how you’re going to query the data in
Cassandra. This dictates how you set up your keys.
• We just walked through the semantics on the
database side. Tackling this change on the
application-side is a whole extra topic.
• This is just enough information to get you started.
Runtime
Workers
Runtime
Workers
Foreign Data Wrapper
fdw
=>
fdw
We’re not going to go through
any setup, again……..
https://bitbucket.org/openscg/cassandra_fdw
$ psql
neovintage::DB=> CREATE EXTENSION cassandra_fdw;
CREATE EXTENSION
neovintage::DB=> CREATE SERVER cass_serv
  FOREIGN DATA WRAPPER cassandra_fdw
  OPTIONS (host '127.0.0.1');
CREATE SERVER
neovintage::DB=> CREATE USER MAPPING FOR public SERVER cass_serv
  OPTIONS (username 'test', password 'test');
CREATE USER MAPPING
neovintage::DB=> CREATE FOREIGN TABLE cass.events (id int)
  SERVER cass_serv
  OPTIONS (schema_name 'neovintage_prod',
    table_name 'events', primary_key 'id');
CREATE FOREIGN TABLE
neovintage::DB=> INSERT INTO cass.events (
  user_id,
  occurred_at,
  label
)
VALUES (
  1234,
  '2016-09-08 11:00:00 -0700',
  'awesome'
);
Some Gotchas
• No Composite Primary Key Support in
cassandra_fdw
• No support for UPSERT
• Postgres 9.5+ and Cassandra 3.0+ Supported
From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016

 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 

From Postgres to Cassandra (Rimas Silkaitis, Heroku) | C* Summit 2016

  • 3. ||
  • 4. &&
  • 10. $ git push heroku master Counting objects: 11, done. Delta compression using up to 8 threads. Compressing objects: 100% (10/10), done. Writing objects: 100% (11/11), 22.29 KiB | 0 bytes/s, done. Total 11 (delta 1), reused 0 (delta 0) remote: Compressing source files... done. remote: Building source: remote: remote: -----> Ruby app detected remote: -----> Compiling Ruby remote: -----> Using Ruby version: ruby-2.3.1
  • 11. Heroku Postgres Over 1 Million Active DBs
  • 12. Heroku Redis Over 100K Active Instances
  • 13. Apache Kafka on Heroku
  • 16. $ psql psql => \d List of relations schema | name | type | owner --------+----------+-------+----------- public | users | table | neovintage public | accounts | table | neovintage public | events | table | neovintage public | tasks | table | neovintage public | lists | table | neovintage
  • 21. $ psql psql => \d List of relations schema | name | type | owner --------+----------+-------+----------- public | users | table | neovintage public | accounts | table | neovintage public | events | table | neovintage public | tasks | table | neovintage public | lists | table | neovintage
  • 24. CREATE TABLE users ( id bigserial, account_id bigint, name text, email text, encrypted_password text, created_at timestamptz, updated_at timestamptz ); CREATE TABLE accounts ( id bigserial, name text, owner_id bigint, created_at timestamptz, updated_at timestamptz );
  • 25. CREATE TABLE events ( user_id bigint, account_id bigint, session_id text, occurred_at timestamptz, category text, action text, label text, attributes jsonb );
  • 26. Table
  • 29. $ psql neovintage::DB=> \e INSERT INTO events ( user_id, account_id, category, action, occurred_at) VALUES (1, 2, 'in_app', 'purchase_upgrade', '2016-09-07 11:00:00 -07:00');
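The deck splits `events` into daily child tables such as `events_20160901` and routes inserts to them with triggers. The routing rule can be sketched in Python as below. This is only an illustration: the function name `partition_table_for` is hypothetical, and bucketing by the UTC date of the event is an assumption, not something the slides state.

```python
from datetime import datetime, timezone, timedelta


def partition_table_for(occurred_at: datetime, parent: str = "events") -> str:
    """Return the daily child table an event would be routed to.

    Assumption: partitions are keyed on the UTC calendar date of the event.
    """
    if occurred_at.tzinfo is None:
        # Refuse naive timestamps so the bucket never depends on server settings.
        raise ValueError("occurred_at must be timezone-aware")
    day = occurred_at.astimezone(timezone.utc).date()
    return f"{parent}_{day:%Y%m%d}"


# The timestamp from the INSERT on the slide: 11:00 at UTC-7 is 18:00 UTC.
example = datetime(2016, 9, 7, 11, 0, tzinfo=timezone(timedelta(hours=-7)))
print(partition_table_for(example))
```

Old days can then be archived or dropped one table at a time, which is what makes this scheme attractive when data loses value quickly.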
  • 31. Constraints • Data has little value after a period of time • Small range of data has to be queried • Old data can be archived or aggregated
  • 33. &&
  • 35. $ psql psql => \d List of relations schema | name | type | owner --------+----------+-------+----------- public | users | table | neovintage public | accounts | table | neovintage public | events | table | neovintage public | tasks | table | neovintage public | lists | table | neovintage
  • 36. Why Introduce Cassandra? • Linear Scalability • No Single Point of Failure • Flexible Data Model • Tunable Consistency
  • 38. I only know relational databases. How do I do this?
  • 42. Postgres is Typically Run as Single Instance*
  • 43. • Partitioned Key-Value Store • Has a Grouping of Nodes (data center) • Data is distributed amongst the nodes
  • 44. Cassandra Cluster with 2 Data Centers
  • 46. SQL-like [sēkwel lahyk] adjective: Resembling SQL in appearance, behavior or character. adverb: In the manner of SQL.
  • 47. Let’s Talk About Primary (Partition) Keys
  • 48. Table
  • 51. • 5 Node Cluster • Simplest terms: Data is partitioned amongst all the nodes using the hashing function.
  • 53. Replication Factor: Setting this parameter tells Cassandra how many nodes to copy the incoming data to. This is a replication factor of 3.
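To make hash partitioning and the replication factor concrete, here is a minimal Python sketch of a 5 node ring. It is a toy model, not Cassandra's implementation: the node names are made up, MD5 stands in for the real partitioner, tokens are evenly spaced, and replicas are simply the next nodes around the ring (roughly the "adjacent nodes" behaviour of SimpleStrategy).

```python
import hashlib
from bisect import bisect_left

NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names
RING_SIZE = 2 ** 32  # toy token space


def token_for(partition_key: str) -> int:
    """Hash a partition key onto the ring (MD5 is a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(nodes):
    """Give each node one evenly spaced token, sorted ascending."""
    step = RING_SIZE // len(nodes)
    return [(i * step, node) for i, node in enumerate(nodes)]


def replicas_for(partition_key: str, ring, replication_factor: int = 3):
    """Return the nodes that hold a copy of the given partition key."""
    tokens = [token for token, _ in ring]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the first node at the end of the ring.
    start = bisect_left(tokens, token_for(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


ring = build_ring(NODES)
print(replicas_for("user:1234", ring))  # three distinct, neighbouring nodes
```

With a replication factor of 3 on 5 nodes, every key lives on three nodes, so losing any single node still leaves two copies.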
  • 54. But I thought Cassandra had tables?
  • 55. Prior to 3.0, tables were called column families
  • 56. Let’s Model Our Events Table in Cassandra
  • 58. We’re not going to go through any setup. Plenty of tutorials exist for that sort of thing. Let’s assume we’re working with a 5 node cluster.
  • 59. $ psql neovintage::DB=> \d events Table "public.events" Column | Type | Modifiers ---------------+--------------------------+----------- user_id | bigint | account_id | bigint | session_id | text | occurred_at | timestamp with time zone | category | text | action | text | label | text | attributes | jsonb |
  • 60. $ cqlsh cqlsh> CREATE KEYSPACE IF NOT EXISTS neovintage_prod WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'us-east': 3 };
  • 61. $ cqlsh cqlsh> CREATE SCHEMA IF NOT EXISTS neovintage_prod WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'us-east': 3 };
  • 62. KEYSPACE == SCHEMA • CQL can use KEYSPACE and SCHEMA interchangeably • SCHEMA in Cassandra is somewhere between `CREATE DATABASE` and `CREATE SCHEMA` in Postgres
  • 63. $ cqlsh cqlsh> CREATE SCHEMA IF NOT EXISTS neovintage_prod WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'us-east': 3 }; (Replication Strategy)
  • 64. $ cqlsh cqlsh> CREATE SCHEMA IF NOT EXISTS neovintage_prod WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'us-east': 3 }; (Replication Factor)
  • 65. Replication Strategies • NetworkTopologyStrategy - You have to define the network topology by defining the data centers. No magic here. • SimpleStrategy - Has no idea of the topology and doesn’t care to. Data is replicated to adjacent nodes.
  • 66. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint primary key, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text> );
  • 67. Remember the Primary Key? • Postgres defines a PRIMARY KEY as a constraint that a column or group of columns can be used as a unique identifier for rows in the table. • CQL shares that same constraint but extends the definition even further, although the main purpose is to order information in the cluster. • CQL includes partitioning and sort order of the data on disk (clustering).
  • 68. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint primary key, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text> );
  • 69. Single Column Primary Key • Used for both partitioning and clustering. • Syntactically, can be defined inline or as a separate line within the DDL statement.
  • 70. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text>, PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) );
  • 71. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text>, PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) ); (Composite Partition Key)
  • 72. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text>, PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) ); (Clustering Keys)
  • 73. PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) Composite Partition Key • This means that both the user_id and the occurred_at columns are going to be used to partition data. • If you were to not include the inner parentheses, the first column listed in this PRIMARY KEY definition would be the sole partition key.
  • 74. PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) Clustering Columns • Defines how the data is sorted on disk. In this case, it’s by account_id and then session_id. • It is possible to change the direction of the sort order.
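The split between partition key and clustering columns can be pictured with a small Python model: rows are first grouped by the partition key columns, then sorted inside each group by the clustering columns. This is a sketch of the idea only. The `layout` helper and the sample rows are hypothetical, and real Cassandra storage is considerably more involved.

```python
from collections import defaultdict


def layout(rows, partition_cols, clustering_cols, descending=()):
    """Group rows by partition key, then order each group by clustering columns."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partition_cols)
        partitions[key].append(row)
    for group in partitions.values():
        # Stable sorts applied from the last clustering column to the first
        # yield an ordering by the first column, then the second, and so on.
        for col in reversed(clustering_cols):
            group.sort(key=lambda r, c=col: r[c], reverse=col in descending)
    return dict(partitions)


rows = [
    {"user_id": 1, "occurred_at": "t1", "account_id": 2, "session_id": "b"},
    {"user_id": 1, "occurred_at": "t1", "account_id": 1, "session_id": "z"},
    {"user_id": 1, "occurred_at": "t1", "account_id": 2, "session_id": "a"},
    {"user_id": 1, "occurred_at": "t2", "account_id": 9, "session_id": "a"},
]

# Mirrors PRIMARY KEY ((user_id, occurred_at), account_id, session_id).
table = layout(rows, ("user_id", "occurred_at"), ("account_id", "session_id"))
for key, group in table.items():
    print(key, [(r["account_id"], r["session_id"]) for r in group])
```

Note that with `(user_id, occurred_at)` as the partition key, each distinct timestamp for a user lands in its own partition, so a lookup needs both values to find the right one.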
  • 75. $ cqlsh cqlsh> CREATE TABLE neovintage_prod.events ( user_id bigint, account_id bigint, session_id text, occurred_at timestamp, category text, action text, label text, attributes map<text, text>, PRIMARY KEY ( (user_id, occurred_at), account_id, session_id ) ) WITH CLUSTERING ORDER BY ( account_id desc, session_id asc ); (Ahhhhh… Just like SQL)
  • 77. Postgres Type | Cassandra Type
        --------------+----------------
        bigint        | bigint
        int           | int
        decimal       | decimal
        float         | float
        text          | text
        varchar(n)    | varchar
        blob          | blob
        json          | N/A
        jsonb         | N/A
        hstore        | map<type, type>
  • 79. Challenges • JSON / JSONB columns don’t have 1:1 mappings in Cassandra • You’ll need to nest MAP types in Cassandra or flatten out your JSON • Be careful about timestamps!! Time zones are already challenging in Postgres. • If you don’t specify a time zone in Cassandra, the time zone of the coordinator node is used. Always specify one.
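One way to handle both challenges on the application side is sketched below in Python: flatten a nested JSON document into string pairs that would fit a `map<text, text>` column, and always render timestamps with an explicit UTC offset. The helper names, the dotted-key convention and the sample document are assumptions made for illustration; they are not from the talk.

```python
import json
from datetime import datetime, timezone, timedelta


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a flat {str: str} mapping."""
    flat = {}
    if isinstance(value, dict):
        for key, item in value.items():
            flat.update(flatten(item, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            flat.update(flatten(item, f"{prefix}[{index}]"))
    else:
        # Strings are kept as-is; other scalars keep their JSON spelling.
        flat[prefix] = value if isinstance(value, str) else json.dumps(value)
    return flat


def timestamp_with_offset(moment: datetime) -> str:
    """Render a timestamp with an explicit offset, refusing naive values."""
    if moment.tzinfo is None:
        raise ValueError("timestamp must carry a time zone")
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


attributes = {"plan": {"name": "pro", "seats": 5}, "tags": ["a", "b"]}
print(flatten(attributes))

pacific = timezone(timedelta(hours=-7))
print(timestamp_with_offset(datetime(2016, 9, 8, 11, 0, tzinfo=pacific)))
```

Flattening loses type information (everything becomes text), so it suits attributes that are only read back or filtered by the application.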
  • 81. General Tips • Just like Table Partitioning in Postgres, you need to think about how you’re going to query the data in Cassandra. This dictates how you set up your keys. • We just walked through the semantics on the database side. Tackling this change on the application-side is a whole extra topic. • This is just enough information to get you started.
  • 86. fdw
  • 87. We’re not going to go through any setup, again… https://bitbucket.org/openscg/cassandra_fdw
  • 88. $ psql neovintage::DB=> CREATE EXTENSION cassandra_fdw; CREATE EXTENSION
  • 89. $ psql neovintage::DB=> CREATE EXTENSION cassandra_fdw; CREATE EXTENSION neovintage::DB=> CREATE SERVER cass_serv FOREIGN DATA WRAPPER cassandra_fdw OPTIONS (host '127.0.0.1'); CREATE SERVER
  • 90. $ psql neovintage::DB=> CREATE EXTENSION cassandra_fdw; CREATE EXTENSION neovintage::DB=> CREATE SERVER cass_serv FOREIGN DATA WRAPPER cassandra_fdw OPTIONS (host '127.0.0.1'); CREATE SERVER neovintage::DB=> CREATE USER MAPPING FOR public SERVER cass_serv OPTIONS (username 'test', password 'test'); CREATE USER MAPPING
  • 91. $ psql neovintage::DB=> CREATE EXTENSION cassandra_fdw; CREATE EXTENSION neovintage::DB=> CREATE SERVER cass_serv FOREIGN DATA WRAPPER cassandra_fdw OPTIONS (host '127.0.0.1'); CREATE SERVER neovintage::DB=> CREATE USER MAPPING FOR public SERVER cass_serv OPTIONS (username 'test', password 'test'); CREATE USER MAPPING neovintage::DB=> CREATE FOREIGN TABLE cass.events (id int) SERVER cass_serv OPTIONS (schema_name 'neovintage_prod', table_name 'events', primary_key 'id'); CREATE FOREIGN TABLE
  • 92. neovintage::DB=> INSERT INTO cass.events ( user_id, occurred_at, label ) VALUES ( 1234, '2016-09-08 11:00:00 -0700', 'awesome' );
  • 94. Some Gotchas • No Composite Primary Key Support in cassandra_fdw • No support for UPSERT • Postgres 9.5+ and Cassandra 3.0+ Supported