Cassandra Hands On
Niall Milton, CTO, DigBigData
Examples courtesy of Patrick Callaghan, DataStax
Introduction
—  We will be walking through Cassandra use cases
from Patrick Callaghan on github.
—  https://github.com/PatrickCallaghan/
—  Patrick sends his apologies: because of the Aer Lingus
strike on Friday, he couldn’t get a flight back to
the UK
—  This presentation will cover the important points
from each sample application
Agenda
—  Transactions Example
—  Paging Example
—  Analytics Example
—  Risk Sensitivity Example
Transactions Example
Scenario
—  We want to add products, each with a quantity to
an order
—  Orders come in concurrently from random buyers
—  Products that have sold out will return “OUT OF
STOCK”
—  We want to use lightweight transactions to
guarantee that we do not allow orders to complete
when no stock is available
Lightweight Transactions
—  Guarantee a serial isolation level (the isolation in ACID)
for the conditional operation
—  Use the Paxos consensus algorithm to achieve this in a
distributed system. See:
—  http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf
—  Every node is still equal, no master or locks
—  Allows for conditional inserts & updates
—  The cost of linearizable consistency is higher latency,
not suitable for high volume writes where low latency is
required
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-transaction-demo.git
2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.transactions.Main" -Dload=true -DcontactPoints=127.0.0.1 -DnoOfThreads=10
Schema
1.  create keyspace if not exists
datastax_transactions_demo WITH replication =
{'class': 'SimpleStrategy',
'replication_factor': '1' };
2.  create table if not exists products(productId
text, capacityleft int, orderIds set<text>,
PRIMARY KEY (productId));
3.  create table if not exists
buyers_orders(buyerId text, orderId text,
productId text, PRIMARY KEY(buyerId, orderId));
Model
public class Order {	
	
	private String orderId;	
	private String productId;	
	private String buyerId;	
		
	…	
}
Method
—  Find the current product quantity at CL.SERIAL
—  This allows us to execute a Paxos read without
proposing an update, i.e. read the current value
SELECT capacityleft FROM products WHERE
productId = '1234';
e.g. capacityleft = 5
Method Contd.
—  Do a conditional update using IF operator to make
sure product quantity has not changed since last
quantity check
—  Note the use of the set collection type here.
—  This statement will only succeed if the IF condition is
met
UPDATE products SET orderIds = orderIds +
{'3'}, capacityleft = 4 WHERE productId =
'1234' IF capacityleft = 5;
Method Contd.
—  If the last query succeeds, simply insert the order.
INSERT INTO buyers_orders (buyerId, orderId,
productId) VALUES ('1', '3', '1234');
—  This guarantees that no order will be placed where
there is insufficient quantity to fulfill it.
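The read, conditional update, and insert steps above form an optimistic compare-and-set loop. A minimal in-memory sketch of that loop follows, assuming a `ConcurrentHashMap` stands in for the `products` table. It is an analogy for the `IF capacityleft = ...` check, not Cassandra driver code, and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogy for the conditional update; this is not Cassandra driver code.
final class StockReservation {
    // Hypothetical stand-in for the products table: productId -> capacity left.
    private final ConcurrentHashMap<String, Integer> capacityLeft = new ConcurrentHashMap<>();

    void addProduct(String productId, int capacity) {
        capacityLeft.put(productId, capacity);
    }

    // Returns true if one unit was reserved, false if the product is unknown or sold out.
    boolean reserveOne(String productId) {
        while (true) {
            Integer current = capacityLeft.get(productId); // step 1: read the current value
            if (current == null || current <= 0) {
                return false;                              // "OUT OF STOCK"
            }
            // Step 2: write the new value only if the stored value is still what we read.
            if (capacityLeft.replace(productId, current, current - 1)) {
                return true;                               // step 3: safe to insert the order
            }
            // Another buyer changed the value in between: re-read and retry.
        }
    }

    int remaining(String productId) {
        return capacityLeft.getOrDefault(productId, 0);
    }

    public static void main(String[] args) {
        StockReservation stock = new StockReservation();
        stock.addProduct("1234", 2);
        System.out.println(stock.reserveOne("1234")); // true
        System.out.println(stock.reserveOne("1234")); // true
        System.out.println(stock.reserveOne("1234")); // false
    }
}
```

The retry on a failed `replace` corresponds to re-running the CL.SERIAL read when the conditional update is not applied.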
Comments
—  Using LWT incurs a cost of higher latency because
a quorum of replicas must take part in several Paxos
round trips before a value is committed / returned.
—  CL.SERIAL does not propose a new value but is
used to read the possibly uncommitted Paxos
state
—  The IF operator can also be used as IF NOT EXISTS
which is useful for user creation for example
Paging Example
Scenario
—  We have 1000s of products in our product
catalogue
—  We want to browse these using a simple select
—  We don’t want to retrieve all at once!
Cursors
—  We are often dealing with wide rows in Cassandra
—  Reading entire rows or multiple rows at once could
lead to OOM errors
—  Traditionally this meant using range queries to
retrieve content
—  Cassandra 2.0 (and Java driver) introduces cursors
—  Makes row based queries more efficient (no need to
use the token() function)
—  This will simplify client code
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-paging-demo.git
2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.paging.Main"
Schema
create table if not exists
products(productId text, capacityleft int,
orderIds set<text>, PRIMARY KEY
(productId));
—  N.B. With the default partitioner, products will be
ordered by their Murmur3 hash value. The old way, we
would need to use the token() function to retrieve
them in order
Model
public class Product {	
	
	private String productId;	
	private int capacityLeft;	
	private Set<String> orderIds;	
	
	…	
}
Method
1.  Create a simple select query for the products
table.
2.  Set the fetch size parameter
3.  Execute the statement
Statement stmt = new SimpleStatement("SELECT * FROM products");
stmt.setFetchSize(100);
ResultSet resultSet = this.session.execute(stmt);
Method Contd.
1.  Get an iterator for the result set
2.  Use a while loop to iterate over the result set
Iterator<Row> iterator = resultSet.iterator();	
while (iterator.hasNext()){	
	Row row = iterator.next();	
// do stuff with the row	
}
Comments
—  Very easy to transparently iterate in a memory
efficient way over a large result set
—  Cursor state is maintained by driver.
—  Allows for failover between different page
responses, i.e. the state is not lost if a page fails to
load from a node in the replica set, the page will be
requested from another node
—  See: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
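To make the cursor idea concrete, here is a minimal sketch in plain Java of an iterator that holds only one page in memory and fetches the next page when the current one runs out. It illustrates the concept only and is not how the driver is implemented: the `PagedIterator` class and its page source are hypothetical, and a numeric offset is assumed for simplicity where the driver keeps its own paging state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Illustration of cursor-style paging; hypothetical, not driver internals.
final class PagedIterator<T> implements Iterator<T> {
    // (offset, fetchSize) -> next page of rows; an assumed interface for this sketch.
    private final BiFunction<Integer, Integer, List<T>> pageSource;
    private final int fetchSize;
    private int offset = 0;
    private Iterator<T> currentPage = Collections.emptyIterator();
    private boolean exhausted = false;
    int pagesFetched = 0;

    PagedIterator(BiFunction<Integer, Integer, List<T>> pageSource, int fetchSize) {
        this.pageSource = pageSource;
        this.fetchSize = fetchSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one has been consumed.
        while (!currentPage.hasNext() && !exhausted) {
            List<T> page = pageSource.apply(offset, fetchSize);
            pagesFetched++;
            offset += page.size();
            exhausted = page.size() < fetchSize; // a short page means no more data
            currentPage = page.iterator();
        }
        return currentPage.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return currentPage.next();
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            table.add(i);
        }
        PagedIterator<Integer> rows = new PagedIterator<Integer>(
                (off, size) -> table.subList(off, Math.min(off + size, table.size())), 100);
        int count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
        }
        System.out.println(count + " rows in " + rows.pagesFetched + " pages");
    }
}
```

The calling code is the same `while (iterator.hasNext())` loop as in the slides; the page boundaries never surface to the caller.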
Analytics Example
Scenario
—  Don’t have Hadoop but want to run some HIVE type
analytics on our large dataset
—  Example: Get the Top10 financial transactions
ordered by monetary value for each user
—  May want to add more complex filtering later
(where value > 1000) or even do mathematical
groupings, percentiles, means, min, max
Cassandra for Analytics
—  Useful for many scenarios when no other analytics
solution is available
—  Using cursors, queries are bounded & memory efficient
depending on the operation
—  Can be applied anywhere we can do iterative or recursive
processing, SUM, AVG, MIN, MAX etc.
—  NB: The example code also includes a
CQLSSTableWriter, which is fast & convenient if we want
to manually create SSTables of large datasets rather
than send millions of insert queries to Cassandra
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
2.  export MAVEN_OPTS=-Xmx512M (up the memory)
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
4.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.analytics.TopTransactionsByAmountForUserRunner"
Schema
create table IF NOT EXISTS transactions (	
	accid text,	
	txtnid uuid,	
	txtntime timestamp,	
	amount double,	
	type text,	
	reason text,	
	PRIMARY KEY(accid, txtntime)	
);
Model
public class Transaction {	
	private String txtnId;
	private String acountId;	
	private double amount;	
	private Date txtnDate;	
	private String reason;	
	private String type;	
	…	
}
Method
—  Pass a blocking queue into the DAO method which cursors the
data, allows us to pop items off as they are added
—  NB: Could also use a callback here to update the queue
public void getAllProducts(BlockingQueue<Transaction> processorQueue) {
	Statement stmt = new SimpleStatement("SELECT * FROM transactions");
	stmt.setFetchSize(2500);
	ResultSet resultSet = this.session.execute(stmt);
	…
}
Method Contd.
1.  Get an iterator for the result set
2.  Use a while loop to iterate over the result set, add each row
into the queue
Iterator<Row> iterator = resultSet.iterator();
while (iterator.hasNext()) {
	Row row = iterator.next();
	// createTransactionFromRow is a convenience method
	Transaction transaction = createTransactionFromRow(row);
	processorQueue.offer(transaction);
}
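The hand-off between the DAO and the processor can be sketched with the JDK alone. The example below is a minimal sketch under stated assumptions: plain `Long` values stand in for `Transaction` objects, and a hypothetical sentinel value marks the end of the result set (the example project may signal completion differently).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueHandOff {
    // Hypothetical sentinel marking the end of the result set.
    private static final Long END = Long.MIN_VALUE;

    // Producer: stands in for the DAO loop that reads rows page by page.
    static void produce(BlockingQueue<Long> queue, long rows) throws InterruptedException {
        for (long i = 1; i <= rows; i++) {
            queue.put(i); // waits while the queue is full, so memory stays bounded
        }
        queue.put(END);
    }

    // Consumer: pops items as they arrive and sums them.
    static long consume(BlockingQueue<Long> queue) throws InterruptedException {
        long sum = 0;
        while (true) {
            Long item = queue.take(); // waits until an item is available
            if (item.equals(END)) {
                return sum;
            }
            sum += item;
        }
    }

    // Runs the producer on its own thread and consumes on the calling thread.
    static long run(long rows, int capacity) {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                produce(queue, rows);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            long sum = consume(queue);
            producer.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(10_000, 100)); // 50005000
    }
}
```

This sketch uses `put` rather than `offer` because, on a bounded queue, `offer` returns false instead of waiting when the queue is full, and the item would be lost unless the return value is checked.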
Method Contd.
1.  Use Java Collections & Transaction comparator to
track Top results
private Set<Transaction> orderedSet = new
BoundedTreeSet<Transaction>(10, new
TransactionAmountComparator());
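`BoundedTreeSet` is not a JDK class; it comes with the example project. A minimal sketch of how such a collection might behave is shown below, assuming it keeps only the N largest elements according to the comparator. The `TopN` class is hypothetical and is not the project's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical bounded collection that keeps only the `limit` largest elements.
final class TopN<E> {
    private final int limit;
    private final TreeSet<E> items;

    TopN(int limit, Comparator<? super E> comparator) {
        this.limit = limit;
        this.items = new TreeSet<>(comparator);
    }

    void add(E element) {
        items.add(element);
        if (items.size() > limit) {
            items.pollFirst(); // evict the smallest element to stay within the bound
        }
    }

    // Largest element first.
    List<E> topDescending() {
        return new ArrayList<>(items.descendingSet());
    }

    public static void main(String[] args) {
        TopN<Double> top = new TopN<>(3, Comparator.<Double>naturalOrder());
        for (double amount : new double[] {5.0, 1.0, 9.0, 3.0, 7.0}) {
            top.add(amount);
        }
        System.out.println(top.topDescending()); // [9.0, 7.0, 5.0]
    }
}
```

A `TreeSet` treats elements that compare as equal as duplicates, so a comparator on amount alone would keep only one of two transactions with the same amount; adding a tie-breaker such as the transaction id avoids that.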
Comments
—  Entirely possible, but probably not to be thought of as a
complete replacement for dedicated analytics solutions
—  Issues are token distribution across replicas and mixed write
and read patterns
—  Running analytics or MR operations can be a read heavy
operation (as well as memory and i/o intensive)
—  Transaction logging tends to be write heavy
—  Cassandra can handle it, but in practice it is better to split
workloads except for smaller cases, where latency doesn’t
matter or where the cluster is not generally under significant
load
—  Consider DSE Hadoop, Spark, Storm as alternatives
Risk Sensitivity Example
Scenario
—  In financial risk systems, positions have sensitivity to
certain variables
—  Positions are hierarchical: each is associated with a trader
at a desk, which is part of an asset type in a certain
location.
—  E.g. Frankfurt/FX/desk10/trader7/position23
—  Sensitivity values are inserted for each position. We
need to aggregate them for each level in the hierarchy
—  Sensitivities are represented as deltas, so the sum of all
sensitivities over time is the new sensitivity.
Scenario Contd.
—  E.g. Aggregations for:
—  Frankfurt/FX/desk10/trader7
—  Frankfurt/FX/desk10
—  Frankfurt/FX
—  As new positions are entered the risk sensitivities will
change and will need to be aggregated for each level
for the new value to be available
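The levels that need re-aggregating can be derived from the position path itself. The sketch below assumes "/" as the separator, as in the examples, and stops at a minimum depth of two segments (location and asset type) to match the list above; the class, method, and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HierarchyLevels {
    // Returns the ancestor paths of a position, nearest first,
    // keeping only those with at least minSegments segments.
    static List<String> ancestors(String path, int minSegments) {
        List<String> result = new ArrayList<>();
        String[] parts = path.split("/");
        for (int end = parts.length - 1; end >= minSegments; end--) {
            result.add(String.join("/", Arrays.copyOfRange(parts, 0, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String level : ancestors("Frankfurt/FX/desk10/trader7/position23", 2)) {
            System.out.println(level);
        }
        // Frankfurt/FX/desk10/trader7
        // Frankfurt/FX/desk10
        // Frankfurt/FX
    }
}
```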
Queries
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX';
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX/desk4' and
sub_hier_path='trader3';
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX/desk4' and
sub_hier_path='trader3' and
risk_sens_name='irDelta';
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
2.  export MAVEN_OPTS=-Xmx512M (up the memory)
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
4.  mvn clean compile exec:java -Dexec.mainClass="com.heb.finance.analytics.Main" -DstopSize=1000000
Schema
create table if not exists risk_sensitivities_hierarchy ( 	
	hier_path text,	
	sub_hier_path text, 	
	risk_sens_name text, 	
	value double, 	
	PRIMARY KEY (hier_path, sub_hier_path,
risk_sens_name)	
) WITH compaction={'class': 'LeveledCompactionStrategy'};	
NB: Notice the use of LCS as we want the table to be efficient for
reads also
Model
public class RiskSensitivity {
	public final String name;	
	public final String path;	
	public final String position;	
	public final BigDecimal value;	
	…	
}
Method
—  Write a service to write new sensitivities to
Cassandra periodically.
insert into risk_sensitivities_hierarchy
(hier_path, sub_hier_path, risk_sens_name,
value) VALUES (?, ?, ?, ?)
Method Contd.
—  In our aggregator do the following periodically
—  Select data for hierarchies we wish to aggregate
select * from risk_sensitivities_hierarchy where
hier_path = 'Frankfurt/FX/desk10/trader4';
—  Will get all positions related to this hierarchy
—  Add the values (represented as deltas) to each other to get
the new sensitivity
—  E.g. S1 = -3, S2 = 2, S3 = -1 gives a new sensitivity of -2
—  Write it back for 'Frankfurt/FX/desk10/trader4'
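The aggregation step itself is a per-variable sum of deltas. A minimal sketch follows, assuming the selected rows have already been read into memory; the `Row` holder and the position names are hypothetical, and the three example deltas are labelled `irDelta` purely for illustration.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SensitivityAggregator {
    // Hypothetical row holder mirroring sub_hier_path, risk_sens_name and value.
    static final class Row {
        final String subHierPath;
        final String riskSensName;
        final BigDecimal value;

        Row(String subHierPath, String riskSensName, String value) {
            this.subHierPath = subHierPath;
            this.riskSensName = riskSensName;
            this.value = new BigDecimal(value);
        }
    }

    // Sums the deltas for each risk sensitivity name across all rows.
    static Map<String, BigDecimal> sumByName(List<Row> rows) {
        Map<String, BigDecimal> totals = new TreeMap<>();
        for (Row row : rows) {
            totals.merge(row.riskSensName, row.value, BigDecimal::add);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("position1", "irDelta", "-3"),
                new Row("position2", "irDelta", "2"),
                new Row("position3", "irDelta", "-1"));
        System.out.println(sumByName(rows)); // {irDelta=-2}
    }
}
```

Each entry in the resulting map would then be written back as one row for the parent path.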
Comments
—  Simple way to maintain up-to-date risk sensitivity
on an ongoing basis based on previous data
—  Will mean (N Hierarchies) * (N variables) queries
are executed periodically (keep an eye on this)
—  Cursors, blocking queue and bounded collections
help us achieve the same result without reading
entire rows
—  Has other applications such as roll ups for stream
data provided you have a reasonably low cardinality
in terms of number of (time resolution) * variables.
—  Thanks Patrick Callaghan for the hard work coding
the examples!
— Questions?

Wagholi & High Class Call Girls Pune Neha 8005736733 | 100% Gennuine High Cla...
 
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
 
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
 
Al Barsha Night Partner +0567686026 Call Girls Dubai
Al Barsha Night Partner +0567686026 Call Girls  DubaiAl Barsha Night Partner +0567686026 Call Girls  Dubai
Al Barsha Night Partner +0567686026 Call Girls Dubai
 
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
 
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
 
APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53
 

Cassandra hands on

  • 1. Cassandra Hands On Niall Milton, CTO, DigBigData Examples courtesy of Patrick Callaghan, DataStax Sponsored By
  • 2. Introduction —  We will be walking through Cassandra use cases from Patrick Callaghan on GitHub. —  https://github.com/PatrickCallaghan/ —  Patrick sends his apologies, but due to the Aer Lingus strike on Friday he couldn’t get a flight back to the UK —  This presentation will cover the important points from each sample application
  • 3. Agenda —  Transactions Example —  Paging Example —  Analytics Example —  Risk Sensitivity Example
  • 5. Scenario —  We want to add products, each with a quantity to an order —  Orders come in concurrently from random buyers —  Products that have sold out will return “OUT OF STOCK” —  We want to use lightweight transactions to guarantee that we do not allow orders to complete when no stock is available
  • 6. Lightweight Transactions —  Guarantee a serial isolation level, ACID —  Uses PAXOS consensus algorithm to achieve this in a distributed system. See: —  http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf —  Every node is still equal, no master or locks —  Allows for conditional inserts & updates —  The cost of linearizable consistency is higher latency, not suitable for high volume writes where low latency is required
  • 7. Retrieve & Run the Code 1.  git clone https://github.com/PatrickCallaghan/datastax-transaction-demo.git 2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup" 3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.transactions.Main" -Dload=true -DcontactPoints=127.0.0.1 -DnoOfThreads=10
  • 8. Schema 1.  create keyspace if not exists datastax_transactions_demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1' }; 2.  create table if not exists products(productId text, capacityleft int, orderIds set<text>, PRIMARY KEY (productId)); 3.  create table if not exists buyers_orders(buyerId text, orderId text, productId text, PRIMARY KEY(buyerId, orderId));
  • 9. Model public class Order { private String orderId; private String productId; private String buyerId; … }
  • 10. Method —  Find current product quantity at CL.SERIAL —  This allows us to execute a PAXOS query without proposing an update, i.e. read the current value SELECT capacityLeft from products WHERE productId = '1234'; e.g. capacityLeft = 5
  • 11. Method Contd. —  Do a conditional update using IF operator to make sure product quantity has not changed since last quantity check —  Note the use of the set collection type here. —  This statement will only succeed if the IF condition is met UPDATE products SET orderIds=orderIds + {'3'}, capacityleft = 4 WHERE productId = '1234' IF capacityleft = 5;
  • 12. Method Contd. —  If last query succeeds, simply insert the order. INSERT into buyers_orders (buyerId, orderId, productId) values ('1', '3', '1234'); —  This guarantees that no order will be placed where there is insufficient quantity to fulfill it.
  • 13. Comments —  Using LWT incurs a cost of higher latency because a quorum of replicas must be consulted, over several round trips, before a value is committed / returned. —  CL.SERIAL does not propose a new value but is used to read the possibly uncommitted PAXOS state —  The IF operator can also be used as IF NOT EXISTS, which is useful for user creation, for example
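The read-then-conditional-update flow from the Method slides can be tried without a cluster. What follows is a minimal in-memory sketch, not the demo's code and not the DataStax driver API: the class `InMemoryProducts` and all of its methods are hypothetical, and `AtomicInteger.compareAndSet` stands in for `UPDATE ... IF capacityleft = ?`. It only illustrates why a failed condition means "re-read and retry" and why stock never goes below zero.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for the products table (assumption, not the demo's code).
class InMemoryProducts {
    private final ConcurrentMap<String, AtomicInteger> capacityLeft = new ConcurrentHashMap<>();

    void addProduct(String productId, int capacity) {
        capacityLeft.put(productId, new AtomicInteger(capacity));
    }

    // Plays the role of: SELECT capacityleft FROM products WHERE productId = ?
    int readCapacity(String productId) {
        return capacityLeft.get(productId).get();
    }

    // Plays the role of: UPDATE products SET capacityleft = ? WHERE productId = ? IF capacityleft = ?
    boolean conditionalUpdate(String productId, int expected, int newValue) {
        return capacityLeft.get(productId).compareAndSet(expected, newValue);
    }

    // Read the quantity, then update only if it is unchanged; retry if another buyer got in first.
    String placeOrder(String productId) {
        while (true) {
            int current = readCapacity(productId);
            if (current <= 0) {
                return "OUT OF STOCK";
            }
            if (conditionalUpdate(productId, current, current - 1)) {
                return "OK";
            }
            // Condition failed: the capacity changed since the read, so loop and re-read.
        }
    }

    // Runs `buyers` concurrent orders against one product and returns how many succeeded.
    static int simulate(int capacity, int buyers) {
        InMemoryProducts products = new InMemoryProducts();
        products.addProduct("1234", capacity);
        AtomicInteger succeeded = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(10);
        for (int i = 0; i < buyers; i++) {
            pool.submit(() -> {
                if ("OK".equals(products.placeOrder("1234"))) {
                    succeeded.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return succeeded.get();
    }
}
```

Calling `InMemoryProducts.simulate(5, 50)` sends 50 concurrent orders at a product with 5 units; only 5 can succeed, the rest see "OUT OF STOCK". In Cassandra the same guarantee comes from the Paxos round behind the `IF` clause rather than from a local atomic variable.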
  • 15. Scenario —  We have 1000s of products in our product catalogue —  We want to browse these using a simple select —  We don’t want to retrieve all at once!
  • 16. Cursors —  We are often dealing with wide rows in Cassandra —  Reading entire rows or multiple rows at once could lead to OOM errors —  Traditionally this meant using range queries to retrieve content —  Cassandra 2.0 (and Java driver) introduces cursors —  Makes row-based queries more efficient (no need to use the token() function) —  This will simplify client code
  • 17. Retrieve & Run the Code 1.  git clone https://github.com/PatrickCallaghan/datastax-paging-demo.git 2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup" 3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.paging.Main"
  • 18. Schema create table if not exists products(productId text, capacityleft int, orderIds set<text>, PRIMARY KEY (productId)); —  N.B. With the default partitioner, products will be ordered based on Murmur3 hash value. The old way, we would need to use the token() function to retrieve them in order
  • 19. Model public class Product { private String productId; private int capacityLeft; private Set<String> orderIds; … }
  • 20. Method 1.  Create a simple select query for the products table. 2.  Set the fetch size parameter 3.  Execute the statement Statement stmt = new SimpleStatement("Select * from products"); stmt.setFetchSize(100); ResultSet resultSet = this.session.execute(stmt);
  • 21. Method Contd. 1.  Get an iterator for the result set 2.  Use a while loop to iterate over the result set Iterator<Row> iterator = resultSet.iterator(); while (iterator.hasNext()){ Row row = iterator.next(); // do stuff with the row }
  • 22. Comments —  Very easy to transparently iterate in a memory efficient way over a large result set —  Cursor state is maintained by driver. —  Allows for failover between different page responses, i.e. the state is not lost if a page fails to load from a node in the replica set, the page will be requested from another node —  See: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
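To make the memory argument concrete, here is a small self-contained sketch of a cursor-style iterator that holds only one page at a time. It is an illustration only: `PagingIterator`, `drain` and the offset-based `fetchPage` function are hypothetical names, and the real driver tracks an opaque paging state rather than a numeric offset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Hypothetical cursor: fetches the next page only when the current one is used up.
class PagingIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> fetchPage; // (offset, fetchSize) -> page
    private final int fetchSize;
    private List<T> page = Collections.emptyList();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    private int pagesFetched = 0;

    PagingIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int fetchSize) {
        this.fetchPage = fetchPage;
        this.fetchSize = fetchSize;
    }

    int pagesFetched() {
        return pagesFetched;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // Current page consumed: replace it with the next one, so memory stays bounded.
        page = fetchPage.apply(offset, fetchSize);
        pagesFetched++;
        offset += page.size();
        indexInPage = 0;
        if (page.size() < fetchSize) {
            exhausted = true; // a short page means there is nothing after it
        }
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(indexInPage++);
    }

    // Iterates a fake table of totalRows rows and returns {rows seen, pages fetched}.
    static int[] drain(int totalRows, int fetchSize) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            table.add(i);
        }
        PagingIterator<Integer> rows = new PagingIterator<Integer>(
            (from, size) -> table.subList(Math.min(from, table.size()),
                                          Math.min(from + size, table.size())),
            fetchSize);
        int seen = 0;
        while (rows.hasNext()) {
            rows.next();
            seen++;
        }
        return new int[] { seen, rows.pagesFetched() };
    }
}
```

With 250 rows and a fetch size of 100, the loop sees all 250 rows but requests only three pages, and never holds more than 100 rows at once. The calling code is the same `hasNext()` / `next()` loop shown on the Method slide.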
  • 24. Scenario —  Don’t have Hadoop but want to run some HIVE type analytics on our large dataset —  Example: Get the Top10 financial transactions ordered by monetary value for each user —  May want to add more complex filtering later (where value > 1000) or even do mathematical groupings, percentiles, means, min, max
  • 25. Cassandra for Analytics —  Useful for many scenarios when no other analytics solution is available —  Using cursors, queries are bounded & memory efficient depending on the operation —  Can be applied anywhere we can do iterative or recursive processing, SUM, AVG, MIN, MAX etc. —  NB: The example code also includes a CQLSSTableWriter which is fast & convenient if we want to manually create SSTables of large datasets rather than send millions of insert queries to Cassandra
  • 26. Retrieve & Run the Code 1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git 2.  export MAVEN_OPTS=-Xmx512M (up the memory) 3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main" 4.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.analytics.TopTransactionsByAmountForUserRunner"
  • 27. Schema create table IF NOT EXISTS transactions ( accid text, txtnid uuid, txtntime timestamp, amount double, type text, reason text, PRIMARY KEY(accid, txtntime) );
  • 28. Model public class Transaction { private String txtnId; private String acountId; private double amount; private Date txtnDate; private String reason; private String type; … }
  • 29. Method —  Pass a blocking queue into the DAO method which cursors the data, allows us to pop items off as they are added —  NB: Could also use a callback here to update the queue public void getAllProducts(BlockingQueue<Transaction> processorQueue) { Statement stmt = new SimpleStatement("SELECT * FROM transactions"); stmt.setFetchSize(2500); ResultSet resultSet = this.session.execute(stmt);
  • 30. Method Contd. 1.  Get an iterator for the result set 2.  Use a while loop to iterate over the result set, add each row into the queue while (iterator.hasNext()) { Row row = iterator.next(); Transaction transaction = createTransactionFromRow(row); /* convenience */ queue.offer(transaction); }
  • 31. Method Contd. 1.  Use Java Collections & Transaction comparator to track Top results private Set<Transaction> orderedSet = new BoundedTreeSet<Transaction>(10, new TransactionAmountComparator());
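`BoundedTreeSet` is not a JDK class, so the slide relies on a helper from the example project. As a rough idea of what such a helper does, here is a minimal sketch built on `java.util.TreeSet`. The class `BoundedTopSet` and the method `topAmounts` are hypothetical, plain `Double` amounts stand in for `Transaction` objects, and the demo's own implementation may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical bounded set: keeps at most maxSize items, best first under the comparator.
class BoundedTopSet<T> {
    private final int maxSize;
    private final TreeSet<T> items;

    BoundedTopSet(int maxSize, Comparator<T> bestFirst) {
        this.maxSize = maxSize;
        this.items = new TreeSet<>(bestFirst);
    }

    void add(T item) {
        items.add(item);
        if (items.size() > maxSize) {
            items.pollLast(); // drop the worst entry so the set never grows past maxSize
        }
    }

    List<T> toList() {
        return new ArrayList<>(items);
    }

    // Returns the n largest amounts, largest first.
    static List<Double> topAmounts(double[] amounts, int n) {
        BoundedTopSet<Double> top =
            new BoundedTopSet<Double>(n, Comparator.<Double>reverseOrder());
        for (double amount : amounts) {
            top.add(amount);
        }
        return top.toList();
    }
}
```

Because each insert is followed by at most one eviction, memory stays at `maxSize` entries no matter how many rows stream through. One caveat of any `TreeSet`-based approach: items that compare as equal are treated as duplicates, so a comparator on amount alone would need a tie-breaker (for example the transaction id) to keep two transactions with the same amount.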
  • 32. Comments —  Entirely possible, but probably not to be thought of as a complete replacement for dedicated analytics solutions —  Issues are token distribution across replicas and mixed write and read patterns —  Running analytics or MR operations can be a read heavy operation (as well as memory and i/o intensive) —  Transaction logging tends to be write heavy —  Cassandra can handle it, but in practice it is better to split workloads except for smaller cases, where latency doesn’t matter or where the cluster is not generally under significant load —  Consider DSE Hadoop, Spark, Storm as alternatives
  • 34. Scenario —  In financial risk systems, positions have sensitivity to certain variables —  Positions are hierarchical and are associated with a trader at a desk which is part of an asset type in a certain location. —  E.g. Frankfurt/FX/desk10/trader7/position23 —  Sensitivity values are inserted for each position. We need to aggregate them for each level in the hierarchy —  The sum of all sensitivities over time is the new sensitivity, as they are represented by deltas.
  • 35. Scenario —  E.g. Aggregations for: —  Frankfurt/FX/desk10/trader7 —  Frankfurt/FX/desk10 —  Frankfurt/FX —  As new positions are entered the risk sensitivities will change and will need to be aggregated for each level for the new value to be available
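The levels that need re-aggregating can be derived from the position path itself. The sketch below is an assumption about how one might do that, not code from the demo; `HierarchyLevels.ancestors` is a hypothetical helper that strips one path segment at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists every hierarchy level above a position, nearest first.
class HierarchyLevels {
    static List<String> ancestors(String positionPath) {
        List<String> levels = new ArrayList<>();
        String path = positionPath;
        int slash = path.lastIndexOf('/');
        while (slash > 0) {
            path = path.substring(0, slash); // drop the last segment
            levels.add(path);
            slash = path.lastIndexOf('/');
        }
        return levels;
    }
}
```

For `Frankfurt/FX/desk10/trader7/position23` this yields the trader, desk and asset-type levels listed on the slide, followed by the location `Frankfurt`. Whether the top-level location is aggregated too is a choice the slides leave open.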
  • 36. Queries select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX'; select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX/desk4' and sub_hier_path='trader3'; select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX/desk4' and sub_hier_path='trader3' and risk_sens_name='irDelta';
  • 37. Retrieve & Run the Code 1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git 2.  export MAVEN_OPTS=-Xmx512M (up the memory) 3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main" 4.  mvn clean compile exec:java -Dexec.mainClass="com.heb.finance.analytics.Main" -DstopSize=1000000
  • 38. Schema create table if not exists risk_sensitivities_hierarchy ( hier_path text, sub_hier_path text, risk_sens_name text, value double, PRIMARY KEY (hier_path, sub_hier_path, risk_sens_name) ) WITH compaction={'class': 'LeveledCompactionStrategy'}; NB: Notice the use of LCS as we want the table to be efficient for reads also
  • 39. Model public class RiskSensitivity { public final String name; public final String path; public final String position; public final BigDecimal value; … }
  • 40. Method —  Write a service to write new sensitivities to Cassandra periodically. insert into risk_sensitivities_hierarchy (hier_path, sub_hier_path, risk_sens_name, value) VALUES (?, ?, ?, ?)
  • 41. Method Contd. —  In our aggregator do the following periodically —  Select data for hierarchies we wish to aggregate select * from risk_sensitivities_hierarchy where hier_path = 'Frankfurt/FX/desk10/trader4' —  Will get all positions related to this hierarchy —  Add the values (represented as deltas) to each other to get the new sensitivity —  E.g. S1 = -3, S2 = 2, S3= -1 —  Write it back for 'Frankfurt/FX/desk10/trader4'
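The "add the deltas" step can be sketched in a few lines of plain Java. This is an illustration under assumptions: `SensitivityRollup` and its nested `Row` class are hypothetical, rows are passed in as a list instead of being read from Cassandra, and grouping by `risk_sens_name` is inferred from the schema rather than stated on the slide.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical roll-up of the rows returned for one hier_path.
class SensitivityRollup {
    // One row of risk_sensitivities_hierarchy, reduced to the fields needed here.
    static final class Row {
        final String subHierPath;
        final String riskSensName;
        final BigDecimal value;

        Row(String subHierPath, String riskSensName, BigDecimal value) {
            this.subHierPath = subHierPath;
            this.riskSensName = riskSensName;
            this.value = value;
        }
    }

    // Sums the deltas per sensitivity name; the totals are what would be written back.
    static Map<String, BigDecimal> aggregate(List<Row> rows) {
        Map<String, BigDecimal> totals = new LinkedHashMap<>();
        for (Row row : rows) {
            totals.merge(row.riskSensName, row.value, BigDecimal::add);
        }
        return totals;
    }
}
```

With the slide's example deltas S1 = -3, S2 = 2 and S3 = -1 for a single sensitivity name, the aggregated value is -2. `BigDecimal` is used because the Model slide declares the value as `BigDecimal`, even though the table column is a `double`.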
  • 42. Comments —  Simple way to maintain up-to-date risk sensitivity on an ongoing basis based on previous data —  Will mean (N Hierarchies) * (N variables) queries are executed periodically (keep an eye on this) —  Cursors, blocking queue and bounded collections help us achieve the same result without reading entire rows —  Has other applications such as roll-ups for stream data, provided you have a reasonably low cardinality in terms of number of (time resolution) * variables.
  • 43. —  Thanks Patrick Callaghan for the hard work coding the examples! — Questions?