Cassandra

Published in: Technology

  1. Cassandra, by Diego Pacheco (@diego_pacheco, diegopacheco)
  2. http://cassandra.apache.org/
  3. Why Apache Cassandra?
     ❏ Open source
     ❏ Written in Java
     ❏ Scalability & high availability
     ❏ Fault tolerance
     ❏ Replication across multiple datacenters
     ❏ Asynchronous masterless replication
     ❏ No single point of failure
     ❏ Based on the Amazon Dynamo paper
     ❏ Created at Facebook, open sourced in 2008, entered the Apache Incubator in 2009
  4. Battle tested by: http://planetcassandra.org/apache-cassandra-use-cases/
  5. Benchmark: 1 million writes per second. http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
  6. CAP: Consistency vs. Availability
  7. Cluster
  8. Murmur3Partitioner
     ❏ Murmur3Partitioner
        ❏ Default
        ❏ 3-5x faster than RandomPartitioner
        ❏ Based on token hash values
        ❏ Uniform distribution
     ❏ RandomPartitioner
        ❏ Uniform distribution
        ❏ MD5 hash
     ❏ ByteOrderedPartitioner
        ❏ Lexically ordered by key bytes
        ❏ Ordered partitions
        ❏ Not recommended: difficult load balancing, hot spots, uneven load balancing across multiple tables
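
The idea of a hash partitioner can be sketched in a few lines. This is an illustration only, not Cassandra's code: it hashes keys with MD5 in the spirit of RandomPartitioner (Murmur3 is not in the Python standard library), and the `TokenRing` class, the node names, and the evenly spaced tokens are all assumptions made for the example.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 128  # size of the MD5 output space (assumed token range for this sketch)


def md5_token(partition_key: str) -> int:
    """Hash a partition key to an integer token using MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class TokenRing:
    """Hypothetical ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the key token; wrap to the first node
        # when the key token is past the largest node token.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]

    def owner(self, partition_key: str) -> str:
        return self.owner_of_token(md5_token(partition_key))


# Three hypothetical nodes with evenly spaced tokens.
ring = TokenRing({
    "node-a": 0,
    "node-b": TOKEN_SPACE // 3,
    "node-c": 2 * TOKEN_SPACE // 3,
})
print(ring.owner("user:42"))
```

Because the hash output is close to uniform, evenly spaced tokens give each node a similar share of keys. An order-preserving partitioner loses that property, which is where the hot spots mentioned above come from.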
  9. Replication
     ❏ Concepts:
        ❏ Virtual nodes: assign data ownership to physical machines
        ❏ Partitioner: partitions data across the cluster
        ❏ Replication strategy: determines the replicas for each row of data
        ❏ Snitch: topology information used to place replicas according to the strategy
     ❏ Client writes to any node
     ❏ That node coordinates with the replicas
     ❏ Replication happens in parallel
     ❏ Replication factor = how many nodes hold the same data (e.g. 3)
     ❏ SimpleStrategy vs. NetworkTopologyStrategy
     ❏ Design: nodes, racks, data centers. Great for cloud computing!
  10. Replication
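
A rough sketch of how a SimpleStrategy-style placement picks replicas: start at the node that owns the key's token and walk the ring clockwise until the replication factor is met. The function name, the small integer tokens, and the node names are assumptions for illustration. Racks and data centers, which NetworkTopologyStrategy takes into account, are ignored here.

```python
import bisect


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Return the nodes that would hold a key under a SimpleStrategy-like rule.

    ring: list of (token, node) pairs sorted by token (a node may appear
          several times, as it does with virtual nodes).
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_token)
    replicas = []
    # Walk clockwise from the owning position, skipping nodes already chosen.
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


# Hypothetical four-node ring with small tokens for readability.
ring = [(0, "a"), (100, "b"), (200, "c"), (300, "d")]
print(simple_strategy_replicas(ring, key_token=150, replication_factor=3))
```

If the replication factor exceeds the number of distinct nodes, this sketch simply returns every node once.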
  11. Consistency
     ❏ Tunable consistency for reads and writes
     ❏ Consistency vs. availability trade-offs:
        ❏ ONE, TWO, THREE
        ❏ QUORUM (majority = floor(N / 2) + 1, where N is the replication factor)
        ❏ LOCAL_QUORUM (majority in the local data center)
        ❏ EACH_QUORUM (majority in every data center)
        ❏ LOCAL_ONE
        ❏ ALL
        ❏ ANY (writes only)
     ❏ Disaster recovery scenarios
     ❏ SERIAL
     ❏ LOCAL_SERIAL
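
The quorum arithmetic above can be worked through in a short sketch. The helper names are hypothetical and only a subset of the levels is modeled. The overlap rule it checks (read acknowledgements + write acknowledgements > replication factor) is the usual way to reason about whether a read is guaranteed to see the latest write.

```python
FIXED_LEVELS = {"ONE": 1, "TWO": 2, "THREE": 3}


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level in FIXED_LEVELS:
        return FIXED_LEVELS[level]
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError("level not modeled in this sketch: " + level)


def reads_see_latest_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > N)."""
    reads = required_acks(read_level, replication_factor)
    writes = required_acks(write_level, replication_factor)
    return reads + writes > replication_factor


print(quorum(3))                                     # 2
print(reads_see_latest_write("QUORUM", "QUORUM", 3)) # True
print(reads_see_latest_write("ONE", "ONE", 3))       # False
```

With a replication factor of 3, QUORUM reads and writes tolerate one replica being down while still overlapping. ONE/ONE favors availability and latency at the cost of possibly stale reads.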
  12. Reads and Index
     ❏ Partition key cache
        ❏ Off heap
        ❏ Configurable
     ❏ Row cache
        ❏ Off heap
        ❏ Configurable
     ❏ Secondary index
        ❏ Filter data on a table by a non-primary-key column
        ❏ ALLOW FILTERING: could be problematic
     ❏ Cassandra 3.4: SASI secondary index
        ❏ Better performance
        ❏ In memory mapped B+ tree
        ❏ Can't be used with collections
  13. Storage
     ❏ Log-Structured Merge Tree (not a B-tree)
     ❏ Avoids read-before-write
     ❏ Favors latency
     ❏ Cassandra groups inserts/updates in memory and periodically syncs them to disk (sequential append)
     ❏ Immutable data
     ❏ Need to check before writing? Use lightweight transactions.
     ❏ Writes: commit log -> memtable -> flush -> SSTable on disk
     ❏ All writes are versioned
     ❏ No in-place delete: tombstones
     ❏ Reads: Bloom filter
        ❏ Off-heap structure for each SSTable
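
The write path above (commit log -> memtable -> flush -> SSTable, with tombstones instead of deletes) can be mimicked with a toy in-memory model. Everything here is an assumption made for illustration: the `TinyLsmStore` class, the counter used as a timestamp, and the tiny flush threshold. Real SSTables live on disk and are later merged by compaction, which this sketch leaves out.

```python
import itertools

TOMBSTONE = object()  # marker written in place of a value when a key is deleted


class TinyLsmStore:
    """Toy model of a log-structured write path (not Cassandra's implementation)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # key -> (timestamp, value), held in memory
        self.sstables = []     # flushed tables, oldest first, never modified
        self.memtable_limit = memtable_limit
        self._clock = itertools.count(1)  # stand-in for write timestamps

    def write(self, key, value):
        timestamp = next(self._clock)
        self.commit_log.append((timestamp, key, value))  # durability first
        self.memtable[key] = (timestamp, value)          # no read before write
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just another versioned write carrying a tombstone.
        self.write(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))  # treated as immutable
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Collect every version of the key and keep the newest one.
        tables = [self.memtable] + self.sstables
        versions = [table[key] for table in tables if key in table]
        if not versions:
            return None
        _, value = max(versions, key=lambda version: version[0])
        return None if value is TOMBSTONE else value


store = TinyLsmStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)   # memtable is full, so it is flushed to an SSTable
store.write("a", 3)   # newer version of "a"; the old SSTable is untouched
store.delete("b")     # tombstone; triggers a second flush
print(store.read("a"), store.read("b"))
```

Reads have to reconcile several versions of a key, which is why a real implementation keeps a Bloom filter per SSTable to skip tables that cannot contain the key.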
  14. Java Driver
     ❏ Specific to Cassandra
     ❏ Prepared statements
     ❏ Connection pooling
     ❏ Reconnection policies
     ❏ Load balancing policies
     ❏ Retry policies
     ❏ Async (Netty)
     ❏ Native protocol
  15. Cassandra, by Diego Pacheco (@diego_pacheco, diegopacheco)
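
To give a feel for what a reconnection policy decides, here is a minimal sketch of an exponential schedule: the wait between reconnection attempts doubles until it reaches a cap. This is not the Java driver's API. The generator name and the millisecond parameters are assumptions chosen for the example.

```python
import itertools


def exponential_reconnect_delays(base_delay_ms, max_delay_ms):
    """Yield an endless schedule of delays: base, 2x, 4x, ... capped at the maximum."""
    delay = base_delay_ms
    while True:
        yield min(delay, max_delay_ms)
        delay = min(delay * 2, max_delay_ms)


# First six delays for a hypothetical 1 s base and 10 s cap.
schedule = list(itertools.islice(exponential_reconnect_delays(1000, 10000), 6))
print(schedule)  # [1000, 2000, 4000, 8000, 10000, 10000]
```

Backing off this way keeps a client from hammering a node that is still restarting, while the cap bounds how long a recovered node stays unused.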
