The document discusses the .NET driver for Cassandra. It provides an overview of the driver and how to connect to Cassandra and execute queries from .NET applications. Key points covered include how to connect to a cluster, execute queries using simple and prepared statements, handle paging of large result sets, and map query results to .NET objects. Examples are provided showing common operations like creating a session, executing queries, and updating data using batches in a .NET application connecting to Cassandra.
Introduction to the .NET Driver for Cassandra
1. Introduction to the .NET Driver
Luke Tillman
Technical Evangelist
@LukeTillman
2. The DataStax Drivers for Cassandra
• Currently Available
– C# (.NET)
– Python
– Java
– NodeJS
– Ruby
– C++
– PHP
• Will Probably Happen
– Scala
– JDBC
• Early Discussions
– Go
– Rust
• Open source, Apache 2 licensed, available on GitHub
– https://github.com/datastax/
3. The DataStax Drivers for Cassandra
Bootstrapping code by language:
C#
    Cluster cluster = Cluster.Builder().AddContactPoint("127.0.0.1").Build();
    ISession session = cluster.Connect("killrvideo");
Python
    from cassandra.cluster import Cluster
    cluster = Cluster(contact_points=['127.0.0.1'])
    session = cluster.connect('killrvideo')
Java
    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect("killrvideo");
NodeJS
    var cassandra = require('cassandra-driver');
    var client = new cassandra.Client({
        contactPoints: ['127.0.0.1'], keyspace: 'killrvideo'
    });
4. KillrVideo
A video sharing web application built on DataStax Enterprise and Microsoft Azure
www.killrvideo.com
5. .NET and Cassandra
• Available via NuGet
• Bootstrap using the Builder and then reuse the ISession object
Cluster cluster = Cluster.Builder()
    .AddContactPoint("127.0.0.1")
    .Build();
ISession session = cluster.Connect("killrvideo");
6. .NET and Cassandra
• Executing CQL with SimpleStatement
• Sync and Async API available for executing statements
• Use Async API for executing queries in parallel
var videoId = Guid.NewGuid();
var statement = new SimpleStatement("SELECT * FROM videos WHERE videoid = ?",
videoId);
RowSet rows = await session.ExecuteAsync(statement);
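One way to use the async API for parallel execution is to start several queries without awaiting each one, then await them together with Task.WhenAll. The following is a minimal sketch, assuming the videos table from the slide with videoid as its full primary key; the names VideoQueries and GetVideosAsync are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Cassandra;

public static class VideoQueries
{
    // Hypothetical helper: looks up several videos by id, running the queries concurrently.
    public static async Task<List<Row>> GetVideosAsync(ISession session, IEnumerable<Guid> videoIds)
    {
        // Start every query without awaiting; each call returns a Task<RowSet> right away.
        List<Task<RowSet>> pending = videoIds
            .Select(id => session.ExecuteAsync(
                new SimpleStatement("SELECT * FROM videos WHERE videoid = ?", id)))
            .ToList();

        // Wait for all of the in-flight queries to complete.
        RowSet[] results = await Task.WhenAll(pending);

        // Collect the rows from every result set into one list.
        var allRows = new List<Row>();
        foreach (RowSet rows in results)
        {
            allRows.AddRange(rows);
        }
        return allRows;
    }
}
```

Awaiting each query in turn would serialize the round trips; starting them all first lets the driver keep several requests in flight on the same ISession.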
7. .NET and Cassandra
• Getting values from a RowSet is easy
• RowSet is a collection of Row (IEnumerable<Row>)
RowSet rows = await _session.ExecuteAsync(statement);
foreach (Row row in rows)
{
    var videoId = row.GetValue<Guid>("videoid");
    var addedDate = row.GetValue<DateTimeOffset>("added_date");
    var name = row.GetValue<string>("name");
}
8. CQL 3 Data Types to .NET Types
• Full listing available in driver docs (http://www.datastax.com/docs)
CQL 3 Data Type     .NET Type
bigint, counter     long
boolean             bool
decimal             decimal
float               float
double              double
int                 int
uuid, timeuuid      System.Guid
text, varchar       string (Encoding.UTF8)
timestamp           System.DateTimeOffset
varint              System.Numerics.BigInteger
9. Use Prepared Statements
• Performance optimization for queries you run repeatedly
• Pay the cost of preparing once (causes roundtrip to Cassandra)
• KillrVideo: looking up a user’s credentials by email address
• Save and reuse the PreparedStatement instance after preparing
PreparedStatement prepared = session.Prepare(
"SELECT * FROM user_credentials WHERE email = ?");
10. Use Prepared Statements
• Bind variable values when ready to execute
• Execution only has to send variable values over the wire
• Cassandra doesn’t have to reparse the CQL string each time
• Remember: Prepare once, bind and execute many
BoundStatement bound = prepared.Bind("luke.tillman@datastax.com");
RowSet rows = await _session.ExecuteAsync(bound);
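The "prepare once, bind and execute many" rule can be sketched as a small class that prepares in its constructor and binds on every call. This is only an illustration under stated assumptions: the class name UserCredentialsReader and the method GetCredentialsAsync are hypothetical, and the query is the one shown on the slide.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Cassandra;

// Hypothetical class illustrating the prepare-once pattern.
public class UserCredentialsReader
{
    private readonly ISession _session;
    private readonly PreparedStatement _getByEmail;

    public UserCredentialsReader(ISession session)
    {
        _session = session;
        // Prepare once, when the object is created, and keep the result.
        _getByEmail = _session.Prepare(
            "SELECT * FROM user_credentials WHERE email = ?");
    }

    // Bind and execute many: each call binds a new value to the saved statement.
    public async Task<Row> GetCredentialsAsync(string email)
    {
        BoundStatement bound = _getByEmail.Bind(email);
        RowSet rows = await _session.ExecuteAsync(bound);
        // Returns null when no row matches the email address.
        return rows.SingleOrDefault();
    }
}
```

Creating one instance of such a class per application (alongside the shared ISession) keeps the preparation cost to a single round trip per query.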
11. Statement Options
• Options like Consistency Level and Retry Policy are available at the Statement level
• If not set on a statement, the driver will fall back to the defaults set when building/configuring the Cluster
IStatement bound =
    prepared.Bind("luke.tillman@datastax.com")
        .SetPageSize(100)
        .SetConsistencyLevel(ConsistencyLevel.LocalOne)
        .SetRetryPolicy(new DefaultRetryPolicy())
        .EnableTracing();
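The cluster-level defaults that statements fall back to can be set on the Builder. The sketch below shows one possible configuration; the ClusterFactory class, the contact point, and the specific option values (LocalQuorum, a page size of 500) are assumptions chosen for illustration rather than recommendations from the deck.

```csharp
using Cassandra;

// Hypothetical factory: the contact point and option values are illustrative only.
public static class ClusterFactory
{
    public static Cluster BuildWithDefaults()
    {
        return Cluster.Builder()
            .AddContactPoint("127.0.0.1")
            // Defaults used by any statement that does not set its own values.
            .WithQueryOptions(new QueryOptions()
                .SetConsistencyLevel(ConsistencyLevel.LocalQuorum)
                .SetPageSize(500))
            // Default retry policy for statements that do not specify one.
            .WithRetryPolicy(new DefaultRetryPolicy())
            .Build();
    }
}
```

A statement that calls SetConsistencyLevel or SetPageSize itself, as in the slide's example, overrides these defaults for that statement only.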
12. Batch Statements: Use and Misuse
• You can mix and match Simple/Bound statements in a batch
• Batches are Logged (atomic) by default
• Use when you want a group of mutations (statements) to all
succeed or all fail (denormalizing at write time)
• Large batches are an anti-pattern (Cassandra will warn you)
• Not a performance optimization for bulk-loading data
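For bulk loading, a common alternative to one large batch is to send individual prepared writes concurrently while capping how many are in flight at once. The sketch below is an assumption about how that might look and is not taken from the deck: BulkLoader, InsertAllAsync, the maxInFlight default of 64, and the videos (videoid, name) column list are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Cassandra;

// Hypothetical bulk loader: individual concurrent writes instead of one large batch.
public static class BulkLoader
{
    public static async Task InsertAllAsync(
        ISession session,
        IEnumerable<KeyValuePair<Guid, string>> videos,
        int maxInFlight = 64)
    {
        // Assumed table shape; adjust the column list to the real schema.
        PreparedStatement insert = session.Prepare(
            "INSERT INTO videos (videoid, name) VALUES (?, ?)");

        var throttle = new SemaphoreSlim(maxInFlight);
        var writes = new List<Task>();

        foreach (KeyValuePair<Guid, string> video in videos)
        {
            // Wait for a free slot so that at most maxInFlight writes run at once.
            await throttle.WaitAsync();
            BoundStatement bound = insert.Bind(video.Key, video.Value);
            writes.Add(ExecuteAndReleaseAsync(session, bound, throttle));
        }

        // Surfaces any write failure once all writes have finished.
        await Task.WhenAll(writes);
    }

    private static async Task ExecuteAndReleaseAsync(
        ISession session, IStatement statement, SemaphoreSlim throttle)
    {
        try
        {
            await session.ExecuteAsync(statement);
        }
        finally
        {
            // Free the slot whether the write succeeded or failed.
            throttle.Release();
        }
    }
}
```

Unlike a logged batch, these writes are independent: some can succeed while others fail, so this approach fits loading data rather than keeping denormalized tables in step.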
13. KillrVideo: Update a Video’s Name with a Batch
using System.Threading.Tasks;
using Cassandra;

public class VideoCatalogDataAccess
{
    private readonly ISession _session;
    private readonly PreparedStatement _prepared;

    public VideoCatalogDataAccess(ISession session)
    {
        _session = session;
        // Prepare once and reuse for every call
        _prepared = _session.Prepare(
            "UPDATE user_videos SET name = ? WHERE userid = ? AND videoid = ?");
    }

    public async Task UpdateVideoName(UpdateVideoDto video)
    {
        BoundStatement bound = _prepared.Bind(video.Name, video.UserId, video.VideoId);
        var simple = new SimpleStatement("UPDATE videos SET name = ? WHERE videoid = ?",
            video.Name, video.VideoId);

        // Use an atomic batch to send over all the mutations
        var batchStatement = new BatchStatement();
        batchStatement.Add(bound);
        batchStatement.Add(simple);
        RowSet rows = await _session.ExecuteAsync(batchStatement);
    }
}
14. Lightweight Transactions when you need them
• Use when you don’t want writes to step on each other
– Sometimes called Linearizable Consistency
– Similar to the Serializable isolation level from RDBMS
• Essentially a Compare and Set (CAS) operation using Paxos
• Read the fine print: has a latency cost associated with it
• The canonical example: unique user accounts
15. KillrVideo: LWT to create user accounts
• Returns a column called [applied] indicating success/failure
• Different from the relational world, where you might expect an Exception (e.g. PrimaryKeyViolationException or similar)
string cql = "INSERT INTO user_credentials (email, password, userid) " +
    "VALUES (?, ?, ?) IF NOT EXISTS";
var statement = new SimpleStatement(cql, user.Email, hashedPassword, user.UserId);
RowSet rows = await _session.ExecuteAsync(statement);
// [applied] is true when the row was inserted, false when the email already existed
var userInserted = rows.Single().GetValue<bool>("[applied]");
16. Automatic Paging
• The Problem: Loading big result sets into memory is a recipe
for disaster (OutOfMemoryExceptions, etc.)
• Better to load and process a large result set in pages (chunks)
• Automatic Paging makes paging on a large RowSet
transparent
17. Automatic Paging
• Set a page size on a statement
• Iterate over the resulting RowSet
• As you iterate, new pages are fetched transparently when the
Rows in the current page are exhausted
• Will allow you to iterate until all pages are exhausted
boundStatement = boundStatement.SetPageSize(100);
RowSet rows = await _session.ExecuteAsync(boundStatement);
foreach (Row row in rows)
{
    // Process each row; the next page is fetched when the current one is exhausted
}
18. Mapping Rows to Objects – Mapper Component
• Micro ORM: Write CQL queries, RowSets are mapped to POCOs
• Mappings are based on conventions, can be configured via
code (fluent-style interface) or attributes on your POCOs
public class User
{
    public Guid UserId { get; set; }
    public string Name { get; set; }
}
// Create a mapper from your session object
var mapper = new Mapper(session);
// Get a user by id from Cassandra or null if not found
var user = mapper.SingleOrDefault<User>(
    "SELECT userid, name FROM users WHERE userid = ?", someUserId);
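The fluent-style configuration mentioned above might look like the following sketch. It assumes a users table with userid and name columns, as in the slide's query; the UserMappings class and its method names are hypothetical, and the mapping is registered globally as one of several possible approaches.

```csharp
using System;
using System.Threading.Tasks;
using Cassandra;
using Cassandra.Mapping;

public class User
{
    public Guid UserId { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class showing code-based mapping configuration.
public static class UserMappings
{
    // Intended to be called once at startup, before any Mapper is used.
    public static void Register()
    {
        MappingConfiguration.Global.Define(
            new Map<User>()
                .TableName("users")
                .PartitionKey(u => u.UserId)
                .Column(u => u.UserId, cm => cm.WithName("userid"))
                .Column(u => u.Name, cm => cm.WithName("name")));
    }

    // Async lookup by id; returns null when no user is found.
    public static Task<User> GetUserAsync(ISession session, Guid userId)
    {
        IMapper mapper = new Mapper(session);
        return mapper.SingleOrDefaultAsync<User>(
            "SELECT userid, name FROM users WHERE userid = ?", userId);
    }
}
```

Keeping the mapping in code leaves the POCO free of driver attributes, which can be useful when the same class is shared with layers that do not reference the driver.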
19. Mapping Rows to Objects – LINQ Provider
• Write LINQ queries instead of CQL, results mapped to POCOs
[Table("users")]
public class User
{
    [Column("userid"), PartitionKey]
    public Guid UserId { get; set; }
    [Column("name")]
    public string Name { get; set; }
}
var user = session.GetTable<User>()
    .SingleOrDefault(u => u.UserId == someUserId)
    .Execute();