SlideShare una empresa de Scribd logo
1 de 29
Descargar para leer sin conexión
Chicago 2015
CQL: This is not the SQL
you are looking for
Aaron Ploetz
Wait... CQL is not SQL?
lCQL3 introduced in Cassandra 1.1.
lCQL is beneficial to new users who have a relational background
(which is most of us).
lHowever similar, CQL is NOT a direct implementation of SQL.
lNew users leave themselves open to issues and frustration when
they use CQL with SQL-based expectations.
$ whoami
lAaron Ploetz
l@APloetz
lLead Database Engineer
lUsing Cassandra since version 0.8.
lContributor to the Cassandra tag on
l2014/15 DataStax MVP for Apache Cassandra
1 SQL features/keywords not present in CQL
2 Differences between CQL and SQL keywords
3 Secondary Indexes
4 Anti-Patterns
5 Questions
SQL features/keywords not present in CQL
lJOINs
lLIKE
lSubqueries
lAggregation
lArithmetic
lExcept for counters and collections.
Differences between CQL and SQL keywords
lWHERE
lPRIMARY KEY
lORDER BY
lIN
lDISTINCT
lCOUNT
lLIMIT
lINSERT vs. UPDATE (“upsert”)
WHERE
lOnly supports AND, IN, =, >, >=, <, <=.
lSome only function under certain conditions.
lAlso: CONTAINS, CONTAINS KEY for indexed collections.
lDoes not exist: OR, !=
lConditions can only operate on PRIMARY KEY components, and
in the defined order of the keys.
WHERE (cont)
lSELECT * FROM shipcrewregistry WHERE
shipname='Serenity';
lStart with partition key(s); cannot skip PRIMARY KEY
components.
lCREATE TABLE shipcrewregistry
(shipname text, lastname text, firstname
text, citizenid uuid,
aliases set<text>, PRIMARY KEY
(shipname, lastname, firstname,
citizenid));
ALLOW FILTERING
lActually I lied, you can skip primary key components if you apply
the ALLOW FILTERING clause.
lSELECT * FROM shipcrewregistry WHERE
lastname='Washburne';
ALLOW FILTERING (cont)
lSELECT * FROM shipcrewregistry WHERE
lastname='Washburne' ALLOW FILTERING;
lBut I don't recommend that.
lALLOW FILTERING pulls back all rows and then applies your
WHERE conditions.
lThe folks at DataStax have proposed some alternate names...
lBottom line, if you are using ALLOW FILTERING, you are doing it
wrong.
PRIMARY KEY
lPRIMARY KEYs function differently between Cassandra and
relational databases.
lCassandra uses primary keys to determine data distribution and
on-disk sort order.
lPartition keys are the equivalent of “old school” row keys.
lClustering keys determine on-disk sort order within a
partitioning key.
ORDER BY
lOne of the most misunderstood aspects of CQL.
lCan only order by clustering columns, in the key order of the
clustering columns listed in the table definition (CLUSTERING
ORDER).
lWhich means, that you really don't need ORDER BY.
lSo what does it do? It can reverse the sort direction (ASCending
vs. DESCending) of the first clustering column.
PRIMARY KEY / ORDER BY Example:
Table Definition
lCREATE TABLE postsByUserYear ( userid text,
year bigint, tag text, posttime timestamp,
content text, postid UUID, PRIMARY KEY
((userid, year), posttime, tag))
WITH CLUSTERING ORDER BY
(posttime desc, tag asc);
PRIMARY KEY / ORDER BY Example:
Queries
lSELECT * FROM postsByUserYear WHERE userid='2';
lSELECT * FROM postsByUserYear ORDER BY
posttime;
lSELECT * FROM postsByUserYear WHERE userid='2'
AND year=2015 ORDER BY posttime DESC;
lSELECT * FROM postsByUserYear WHERE userid='2'
AND year=2015 ORDER BY tag;
IN
lCan only operate on the last partition key and/or the last
clustering key.
lAnd only when the first partition/clustering keys are restricted
by an equals relation.
lDoes not perform well...especially with large clusters.
Testing IN
lCREATE TABLE bladerunners (id text, type text, ts
timestamp, name text, data text, PRIMARY KEY (id));
Testing IN (cont)
l SELECT * FROM bladerunners WHERE id IN
('B26354','B26354');
DISTINCT
lReturns a list of the queried partition keys.
lCan only operate on partition key column(s).
lIn Cassandra DISTINCT returns the partition (row) keys, so it is a
fairly light operation (relative to the size of the cluster and/or data
set).
lWhereas in the relational world, DISTINCT is a very resource
intensive operation.
COUNT
lCounts the number of rows returned, dependent on the WHERE
clause.
lDoes not aggregate.
lSimilar to its RDBMs counterpart.
lCan be (inadvertently) restricted by LIMIT.
lResource intensive command; especially because it has to scan
each row in the table (which may be on different nodes), and
apply the WHERE conditions.
Limit
lLimits your query to N rows (where N is a positive integer).
lSELECT * FROM bladerunners LIMIT 2;
lDoes not allow you to specify a start point.
lYou cannot use LIMIT to “page” through your result set.
Cassandra “Upserts”
lUnder the hood, INSERT and UPDATE are treated the same by
Cassandra.
lColloquially known as an “Upsert.”
lBoth INSERT and UPDATE operations require the complete
PRIMARY KEY.
So why the different syntax?
lFlexibility. Some situations call for one or the other.
lCounter columns/tables can only be incremented with an
UPDATE.
lINSERTs can save you some dev time in the application layer if
your PRIMARY KEY changes.
“Upsert” example
lUPDATE bladerunners SET data='This guy
is a one-man slaughterhouse.',name='Harry
Bryant',ts='2015-03-30 14:47:00-0600',type='Captain'
WHERE id='B16442';
lUPDATE bladerunners SET data = 'Drink
some for me, huh pal?' WHERE id='B16442';
“Upsert” example (cont)
lINSERT INTO bladerunners (id, type, ts, data, name)
VALUES ('B29591','Blade Runner','2015-03-30 14:34:00-
0600','Captain Bryant would like a word.','Eduardo Gaff');
lINSERT INTO bladerunners (id,data) VALUES
('B29591','It''s too bad she won't live. But then again,
who does?');
Secondary Indexes
lCassandra provides secondary indexes to allow queries on non-
partition key columns.
lIn 2.1.x you can even create indexes on collections and user
defined types.
lDesigned for convenience, not for performance.
lDoes not perform well on high-cardinality columns.
lExtremely low cardinality is also not a good idea.
lLow performance on a frequently updated column.
lIn my opinion, try to avoid using them all together.
Anti-Patterns
lMulti-Key queries: IN
lSecondary Index queries
lDELETEs or INSERTing null values
Summary
lWhile CQL is designed to make use of our previous experience
using SQL, it is important to remember that the two do not behave
the same.
lEven if you are at an expert level in SQL, read the CQL
documentation before making any assumptions.
Additional Reading
lGetting Started with Time Series Data Modeling –
Patrick McFadin
lSELECT – DataStax CQL 3.1 documentation
lCounting Keys in Cassandra –
Richard Low
lCassandra High Availability –
Robbie Strickland
Questions?

Más contenido relacionado

La actualidad más candente

Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1SolarWinds
 
Maximum Overdrive: Tuning the Spark Cassandra Connector
Maximum Overdrive: Tuning the Spark Cassandra ConnectorMaximum Overdrive: Tuning the Spark Cassandra Connector
Maximum Overdrive: Tuning the Spark Cassandra ConnectorRussell Spitzer
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizerMauro Pagano
 
SQL Pattern Matching – should I start using it?
SQL Pattern Matching – should I start using it?SQL Pattern Matching – should I start using it?
SQL Pattern Matching – should I start using it?Andrej Pashchenko
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...DataStax
 
How to Find Patterns in Your Data with SQL
How to Find Patterns in Your Data with SQLHow to Find Patterns in Your Data with SQL
How to Find Patterns in Your Data with SQLChris Saxon
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hoodAndriy Rymar
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Oracle Data Protection - 2. část
Oracle Data Protection - 2. částOracle Data Protection - 2. část
Oracle Data Protection - 2. částMarketingArrowECS_CZ
 
DOAG Oracle Unified Audit in Multitenant Environments
DOAG Oracle Unified Audit in Multitenant EnvironmentsDOAG Oracle Unified Audit in Multitenant Environments
DOAG Oracle Unified Audit in Multitenant EnvironmentsStefan Oehrli
 
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAsOracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAsZohar Elkayam
 
MySQL Cluster: El ‘qué’ y el ‘cómo’.
MySQL Cluster: El ‘qué’ y el ‘cómo’.MySQL Cluster: El ‘qué’ y el ‘cómo’.
MySQL Cluster: El ‘qué’ y el ‘cómo’.Keith Hollman
 
REST in Piece - Administration of an Oracle Cluster/Database using REST
REST in Piece - Administration of an Oracle Cluster/Database using RESTREST in Piece - Administration of an Oracle Cluster/Database using REST
REST in Piece - Administration of an Oracle Cluster/Database using RESTChristian Gohmann
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQLYousun Jeong
 
[TDC2016] Apache Cassandra Estratégias de Modelagem de Dados
[TDC2016]  Apache Cassandra Estratégias de Modelagem de Dados[TDC2016]  Apache Cassandra Estratégias de Modelagem de Dados
[TDC2016] Apache Cassandra Estratégias de Modelagem de DadosEiti Kimura
 
Part1 of SQL Tuning Workshop - Understanding the Optimizer
Part1 of SQL Tuning Workshop - Understanding the OptimizerPart1 of SQL Tuning Workshop - Understanding the Optimizer
Part1 of SQL Tuning Workshop - Understanding the OptimizerMaria Colgan
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk ModellingScyllaDB
 
Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4VMware Tanzu
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Adapting and adopting spm v04
Adapting and adopting spm v04Adapting and adopting spm v04
Adapting and adopting spm v04Carlos Sierra
 

La actualidad más candente (20)

Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
Stop the Chaos! Get Real Oracle Performance by Query Tuning Part 1
 
Maximum Overdrive: Tuning the Spark Cassandra Connector
Maximum Overdrive: Tuning the Spark Cassandra ConnectorMaximum Overdrive: Tuning the Spark Cassandra Connector
Maximum Overdrive: Tuning the Spark Cassandra Connector
 
Chasing the optimizer
Chasing the optimizerChasing the optimizer
Chasing the optimizer
 
SQL Pattern Matching – should I start using it?
SQL Pattern Matching – should I start using it?SQL Pattern Matching – should I start using it?
SQL Pattern Matching – should I start using it?
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
 
How to Find Patterns in Your Data with SQL
How to Find Patterns in Your Data with SQLHow to Find Patterns in Your Data with SQL
How to Find Patterns in Your Data with SQL
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hood
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Oracle Data Protection - 2. část
Oracle Data Protection - 2. částOracle Data Protection - 2. část
Oracle Data Protection - 2. část
 
DOAG Oracle Unified Audit in Multitenant Environments
DOAG Oracle Unified Audit in Multitenant EnvironmentsDOAG Oracle Unified Audit in Multitenant Environments
DOAG Oracle Unified Audit in Multitenant Environments
 
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAsOracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
 
MySQL Cluster: El ‘qué’ y el ‘cómo’.
MySQL Cluster: El ‘qué’ y el ‘cómo’.MySQL Cluster: El ‘qué’ y el ‘cómo’.
MySQL Cluster: El ‘qué’ y el ‘cómo’.
 
REST in Piece - Administration of an Oracle Cluster/Database using REST
REST in Piece - Administration of an Oracle Cluster/Database using RESTREST in Piece - Administration of an Oracle Cluster/Database using REST
REST in Piece - Administration of an Oracle Cluster/Database using REST
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
 
[TDC2016] Apache Cassandra Estratégias de Modelagem de Dados
[TDC2016]  Apache Cassandra Estratégias de Modelagem de Dados[TDC2016]  Apache Cassandra Estratégias de Modelagem de Dados
[TDC2016] Apache Cassandra Estratégias de Modelagem de Dados
 
Part1 of SQL Tuning Workshop - Understanding the Optimizer
Part1 of SQL Tuning Workshop - Understanding the OptimizerPart1 of SQL Tuning Workshop - Understanding the Optimizer
Part1 of SQL Tuning Workshop - Understanding the Optimizer
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 
Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4Staying Ahead of the Curve with Spring and Cassandra 4
Staying Ahead of the Curve with Spring and Cassandra 4
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Adapting and adopting spm v04
Adapting and adopting spm v04Adapting and adopting spm v04
Adapting and adopting spm v04
 

Similar a CQL: This is not the SQL you are looking for.

Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking for
Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking forCassandra Day Chicago 2015: CQL: This is not he SQL you are looking for
Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking forDataStax Academy
 
Introdcution to Cassandra for RDB Folks
Introdcution to Cassandra for RDB FolksIntrodcution to Cassandra for RDB Folks
Introdcution to Cassandra for RDB FolksAlexander Graebe
 
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruTim Callaghan
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsDave Stokes
 
Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Prosanta Ghosh
 
Steps towards of sql server developer
Steps towards of sql server developerSteps towards of sql server developer
Steps towards of sql server developerAhsan Kabir
 
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008Arrays and lists in sql server 2008
Arrays and lists in sql server 2008nxthuong
 
NoSql with cassandra
NoSql with cassandraNoSql with cassandra
NoSql with cassandraMarek Koniew
 
Exploring collections with example
Exploring collections with exampleExploring collections with example
Exploring collections with examplepranav kumar verma
 
Green plum培训材料
Green plum培训材料Green plum培训材料
Green plum培训材料锐 张
 
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of View
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of ViewScylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of View
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of ViewScyllaDB
 
Cassandra, Modeling and Availability at AMUG
Cassandra, Modeling and Availability at AMUGCassandra, Modeling and Availability at AMUG
Cassandra, Modeling and Availability at AMUGMatthew Dennis
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
How to leave the ORM at home and write SQL
How to leave the ORM at home and write SQLHow to leave the ORM at home and write SQL
How to leave the ORM at home and write SQLMariaDB plc
 

Similar a CQL: This is not the SQL you are looking for. (20)

Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking for
Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking forCassandra Day Chicago 2015: CQL: This is not he SQL you are looking for
Cassandra Day Chicago 2015: CQL: This is not he SQL you are looking for
 
Introdcution to Cassandra for RDB Folks
Introdcution to Cassandra for RDB FolksIntrodcution to Cassandra for RDB Folks
Introdcution to Cassandra for RDB Folks
 
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query Optimizations
 
Cassandra20141113
Cassandra20141113Cassandra20141113
Cassandra20141113
 
Chapter2
Chapter2Chapter2
Chapter2
 
Cassandra20141009
Cassandra20141009Cassandra20141009
Cassandra20141009
 
Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013Dbms ii mca-ch7-sql-2013
Dbms ii mca-ch7-sql-2013
 
Structured Query Language
Structured Query LanguageStructured Query Language
Structured Query Language
 
Steps towards of sql server developer
Steps towards of sql server developerSteps towards of sql server developer
Steps towards of sql server developer
 
Sql
SqlSql
Sql
 
Arrays and lists in sql server 2008
Arrays and lists in sql server 2008Arrays and lists in sql server 2008
Arrays and lists in sql server 2008
 
NoSql with cassandra
NoSql with cassandraNoSql with cassandra
NoSql with cassandra
 
Exploring collections with example
Exploring collections with exampleExploring collections with example
Exploring collections with example
 
Green plum培训材料
Green plum培训材料Green plum培训材料
Green plum培训材料
 
SQL
SQLSQL
SQL
 
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of View
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of ViewScylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of View
Scylla Summit 2017: How to Run Cassandra/Scylla from a MySQL DBA's Point of View
 
Cassandra, Modeling and Availability at AMUG
Cassandra, Modeling and Availability at AMUGCassandra, Modeling and Availability at AMUG
Cassandra, Modeling and Availability at AMUG
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
How to leave the ORM at home and write SQL
How to leave the ORM at home and write SQLHow to leave the ORM at home and write SQL
How to leave the ORM at home and write SQL
 

Último

Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceMartin Humpolec
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdfJamie (Taka) Wang
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIUdaiappa Ramachandran
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 

Último (20)

Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
RAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AIRAG Patterns and Vector Search in Generative AI
RAG Patterns and Vector Search in Generative AI
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 

CQL: This is not the SQL you are looking for.

  • 1. Chicago 2015 CQL: This is not the SQL you are looking for Aaron Ploetz
  • 2. Wait... CQL is not SQL? lCQL3 introduced in Cassandra 1.1. lCQL is beneficial to new users who have a relational background (which is most of us). lHowever similar, CQL is NOT a direct implementation of SQL. lNew users leave themselves open to issues and frustration when they use CQL with SQL-based expectations.
  • 3. $ whoami lAaron Ploetz l@APloetz lLead Database Engineer lUsing Cassandra since version 0.8. lContributor to the Cassandra tag on l2014/15 DataStax MVP for Apache Cassandra
  • 4. 1 SQL features/keywords not present in CQL 2 Differences between CQL and SQL keywords 3 Secondary Indexes 4 Anti-Patterns 5 Questions
  • 5. SQL features/keywords not present in CQL lJOINs lLIKE lSubqueries lAggregation lArithmetic lExcept for counters and collections.
  • 6. Differences between CQL and SQL keywords lWHERE lPRIMARY KEY lORDER BY lIN lDISTINCT lCOUNT lLIMIT lINSERT vs. UPDATE (“upsert”)
  • 7. WHERE lOnly supports AND, IN, =, >, >=, <, <=. lSome only function under certain conditions. lAlso: CONTAINS, CONTAINS KEY for indexed collections. lDoes not exist: OR, != lConditions can only operate on PRIMARY KEY components, and in the defined order of the keys.
  • 8. WHERE (cont) lSELECT * FROM shipcrewregistry WHERE shipname='Serenity'; lStart with partition key(s); cannot skip PRIMARY KEY components. lCREATE TABLE shipcrewregistry (shipname text, lastname text, firstname text, citizenid uuid, aliases set<text>, PRIMARY KEY (shipname, lastname, firstname, citizenid));
  • 9. ALLOW FILTERING lActually I lied, you can skip primary key components if you apply the ALLOW FILTERING clause. lSELECT * FROM shipcrewregistry WHERE lastname='Washburne';
  • 10. ALLOW FILTERING (cont) lSELECT * FROM shipcrewregistry WHERE lastname='Washburne' ALLOW FILTERING; lBut I don't recommend that. lALLOW FILTERING pulls back all rows and then applies your WHERE conditions. lThe folks at DataStax have proposed some alternate names... lBottom line, if you are using ALLOW FILTERING, you are doing it wrong.
  • 11. PRIMARY KEY lPRIMARY KEYs function differently between Cassandra and relational databases. lCassandra uses primary keys to determine data distribution and on-disk sort order. lPartition keys are the equivalent of “old school” row keys. lClustering keys determine on-disk sort order within a partitioning key.
  • 12. ORDER BY lOne of the most misunderstood aspects of CQL. lCan only order by clustering columns, in the key order of the clustering columns listed in the table definition (CLUSTERING ORDER). lWhich means, that you really don't need ORDER BY. lSo what does it do? It can reverse the sort direction (ASCending vs. DESCending) of the first clustering column.
  • 13. PRIMARY KEY / ORDER BY Example: Table Definition lCREATE TABLE postsByUserYear ( userid text, year bigint, tag text, posttime timestamp, content text, postid UUID, PRIMARY KEY ((userid, year), posttime, tag)) WITH CLUSTERING ORDER BY (posttime desc, tag asc);
  • 14. PRIMARY KEY / ORDER BY Example: Queries lSELECT * FROM postsByUserYear WHERE userid='2'; lSELECT * FROM postsByUserYear ORDER BY posttime; lSELECT * FROM postsByUserYear WHERE userid='2' AND year=2015 ORDER BY posttime DESC; lSELECT * FROM postsByUserYear WHERE userid='2' AND year=2015 ORDER BY tag;
  • 15. IN lCan only operate on the last partition key and/or the last clustering key. lAnd only when the first partition/clustering keys are restricted by an equals relation. lDoes not perform well...especially with large clusters.
  • 16. Testing IN lCREATE TABLE bladerunners (id text, type text, ts timestamp, name text, data text, PRIMARY KEY (id));
  • 17. Testing IN (cont) l SELECT * FROM bladerunners WHERE id IN ('B26354','B26354');
  • 18. DISTINCT lReturns a list of the queried partition keys. lCan only operate on partition key column(s). lIn Cassandra DISTINCT returns the partition (row) keys, so it is a fairly light operation (relative to the size of the cluster and/or data set). lWhereas in the relational world, DISTINCT is a very resource intensive operation.
  • 19. COUNT lCounts the number of rows returned, dependent on the WHERE clause. lDoes not aggregate. lSimilar to its RDBMs counterpart. lCan be (inadvertently) restricted by LIMIT. lResource intensive command; especially because it has to scan each row in the table (which may be on different nodes), and apply the WHERE conditions.
  • 20. Limit lLimits your query to N rows (where N is a positive integer). lSELECT * FROM bladerunners LIMIT 2; lDoes not allow you to specify a start point. lYou cannot use LIMIT to “page” through your result set.
  • 21. Cassandra “Upserts” lUnder the hood, INSERT and UPDATE are treated the same by Cassandra. lColloquially known as an “Upsert.” lBoth INSERT and UPDATE operations require the complete PRIMARY KEY.
  • 22. So why the different syntax? lFlexibility. Some situations call for one or the other. lCounter columns/tables can only be incremented with an UPDATE. lINSERTs can save you some dev time in the application layer if your PRIMARY KEY changes.
  • 23. “Upsert” example lUPDATE bladerunners SET data='This guy is a one-man slaughterhouse.',name='Harry Bryant',ts='2015-03-30 14:47:00-0600',type='Captain' WHERE id='B16442'; lUPDATE bladerunners SET data = 'Drink some for me, huh pal?' WHERE id='B16442';
  • 24. “Upsert” example (cont) lINSERT INTO bladerunners (id, type, ts, data, name) VALUES ('B29591','Blade Runner','2015-03-30 14:34:00- 0600','Captain Bryant would like a word.','Eduardo Gaff'); lINSERT INTO bladerunners (id,data) VALUES ('B29591','It''s too bad she won't live. But then again, who does?');
  • 25. Secondary Indexes lCassandra provides secondary indexes to allow queries on non- partition key columns. lIn 2.1.x you can even create indexes on collections and user defined types. lDesigned for convenience, not for performance. lDoes not perform well on high-cardinality columns. lExtremely low cardinality is also not a good idea. lLow performance on a frequently updated column. lIn my opinion, try to avoid using them all together.
  • 26. Anti-Patterns lMulti-Key queries: IN lSecondary Index queries lDELETEs or INSERTing null values
  • 27. Summary lWhile CQL is designed to make use of our previous experience using SQL, it is important to remember that the two do not behave the same. lEven if you are at an expert level in SQL, read the CQL documentation before making any assumptions.
  • 28. Additional Reading lGetting Started with Time Series Data Modeling – Patrick McFadin lSELECT – DataStax CQL 3.1 documentation lCounting Keys in Cassandra – Richard Low lCassandra High Availability – Robbie Strickland