Aerospike aer·o·spike [air-oh-spahyk]
noun, 1. tip of a rocket that enhances speed and stability
STORM PERSISTENCE AND REAL-TIME ANALYTICS
APRIL 1, 2014
IN-MEMORY NOSQL DATABASE
brian@aerospike.com
Follow
Join Us!
< meta description=“Aerospike is a
Rule CaptainWord {
strings:
$header = {D0 CF 11 E0 A1 B1 1A E1}
$author = {00 00 00 63 61 70 74 61 69 6E 00}
condition:
$header at 0 and $author
Content=“malware,Exec Code, Overflow, ExecCode Bypass”
Content’=“ Java 0day. Gh0st RAT, Mac Control RAT, MS09-027”
Content=“Hosh Hewer, Jenwediki yighingha iltimas qilish Jediwili”
tag
<head>
<meta name=“”>
<meta desc=“”>
</head>
© 2014 Aerospike. All rights reserved. Pg. 2
“keywords”
<TITLE>
✓ I’M ATTENDING
@Aerospikedb
Real-time analytics with
Storm and Aerospike
Brian Bulkowski
Founder and CTO
Aerospike
Streaming architecture
Data Warehouse, Hadoop Cluster
Real-time Interactions Server
Batch Analytics
•  User segmentation
•  Location patterns
•  Similar audience
Real-time Interactions
•  Frequency caps
•  Recent ads served
•  Recent search terms
User Data
Streaming (Storm)
Hadoop
Why not other databases?
➤  Database requests in bolts
➤  Flash optimized
§  Do you need more than 30 GB?
➤  Read / write optimized
➤  Faster & more reliable than Kafka (Cassandra based)
➤  Faster than Mongo
➤  More scale than Redis
Examples
➤  Recommendations
§  Multiple recommendation systems
§  Multi-arm bandit
§  https://github.com/tdunning/storm-counts/wiki/Bayesian-Bandit
➤  Simple fraud counts
§  Store recent requests for payment
§  Store recent users
§  Calculate fraud scores, drop events if past threshold
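The simple fraud count pattern above could be sketched roughly as follows. This is an illustration only, not code from the talk: the `FraudCounter` class, the 60-second window, and the threshold of 3 are assumptions, and a plain in-memory dict stands in for the database a bolt would call.

```python
from collections import defaultdict, deque


class FraudCounter:
    """Keeps recent payment request timestamps per user (hypothetical helper)."""

    def __init__(self, window_seconds=60.0, threshold=3):
        self.window_seconds = window_seconds  # assumed look-back window
        self.threshold = threshold            # assumed max requests per window
        self.recent = defaultdict(deque)      # stand-in for the database

    def should_drop(self, user_id, timestamp):
        """Record the request and return True if the event should be dropped."""
        requests = self.recent[user_id]
        # Forget requests that fell out of the window.
        while requests and timestamp - requests[0] > self.window_seconds:
            requests.popleft()
        requests.append(timestamp)
        # The "fraud score" here is just the count of recent requests.
        return len(requests) > self.threshold
```

In a topology, a check like `should_drop` would run inside a bolt, with the per-user history read from and written back to the key-value store instead of local memory.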
© 2014 Aerospike. All rights reserved. Confidential | Pg. 5
Aerospike Bolts
➤  Aerospike has speed, reliability, scale for Storm
§  Free version at http://aerospike.com/
§  Internap – free high performance SSD servers for trial
➤  Bolts available on github
§  https://github.com/aerospike/storm-aerospike
➤  EnrichBolt
§  Add fields from column after looking up a key
➤  PersistBolt
§  Store fields based on a key
➤  Benefits
§  In memory with FLASH
§  Clustered for high performance
§  HA state matches Storm’s stateless model
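The enrich and persist ideas can be illustrated in plain Python. This sketch does not use the storm-aerospike API or Storm itself: `KeyValueStore`, `persist`, and `enrich` are hypothetical names, tuples are modeled as dicts, and an in-memory dict stands in for the database.

```python
class KeyValueStore:
    """In-memory stand-in for the database (hypothetical, for illustration)."""

    def __init__(self):
        self._records = {}

    def get(self, key):
        return self._records.get(key, {})

    def put(self, key, fields):
        # Merge the new fields into whatever is already stored under the key.
        self._records.setdefault(key, {}).update(fields)


def persist(store, tup, key_field, fields):
    """PersistBolt idea: store selected tuple fields under the tuple's key."""
    store.put(tup[key_field], {name: tup[name] for name in fields})
    return tup


def enrich(store, tup, key_field, fields):
    """EnrichBolt idea: look up the key and add stored fields to the tuple."""
    record = store.get(tup[key_field])
    enriched = dict(tup)
    for name in fields:
        if name in record:
            enriched[name] = record[name]
    return enriched
```

Because all state lives in the store, either function can run on any worker, which is the sense in which a highly available database fits Storm's stateless model.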
The power of Flash
OTHER DATABASE
OS FILE SYSTEM
PAGE CACHE
BLOCK INTERFACE
SSD HDD
BLOCK INTERFACE
SSD SSD
OPEN NVM
SSD
OTHER DATABASE
AEROSPIKE FLASH OPTIMIZED IN-MEMORY DATABASE
Ask me and I’ll tell you the answer.
Ask me. I’ll look up the answer and then tell it to you.
AEROSPIKE HYBRID MEMORY SYSTEM™
Flash-optimization Delivers Disruptive Performance
                                              DRAM & HDD                SSD & DRAM
Servers                                       186                       14
Storage / server                              180 GB (196 GB server)    2.4 TB (4 x 700 GB)
TPS / server                                  500,000                   500,000
Cost / server                                 $8,000                    $11,000
Server costs                                  $1,488,000                $154,000
Power / server                                0.9 kW                    1.1 kW
Power (2 years, $0.12 per kWh, US average)    $352,000                  $32,400
Maintenance (2 years, $3,600 / server)        $670,000                  $50,400
Total                                         $2,510,000                $236,800
…at 1/10 the hardware cost
Actual customer analysis: 500K TPS, 10 TB storage, 2x replication
186 SERVERS (OTHER DATABASES ONLY) vs. 14 SERVERS
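The totals in the cost comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures on the slide; treating two years as 730 days of continuous operation is an assumption, and the slide's dollar amounts appear to be rounded.

```python
HOURS_IN_TWO_YEARS = 2 * 365 * 24   # 17,520 hours (assumes continuous operation)
PRICE_PER_KWH = 0.12                # from the slide
MAINTENANCE_PER_SERVER = 3_600      # from the slide, for 2 years


def two_year_cost(servers, cost_per_server, kw_per_server):
    """Return (server cost, power cost, maintenance cost, total) in dollars."""
    server_cost = servers * cost_per_server
    power_cost = servers * kw_per_server * HOURS_IN_TWO_YEARS * PRICE_PER_KWH
    maintenance = servers * MAINTENANCE_PER_SERVER
    return server_cost, power_cost, maintenance, server_cost + power_cost + maintenance


dram_hdd = two_year_cost(servers=186, cost_per_server=8_000, kw_per_server=0.9)
ssd_dram = two_year_cost(servers=14, cost_per_server=11_000, kw_per_server=1.1)
print(dram_hdd, ssd_dram)
```

The computed totals land within rounding distance of the slide's $2,510,000 and $236,800, a ratio of a little over 10 to 1.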
Measure your drives!
Aerospike Certification Tool (ACT)
http://github.com/aerospike/act
Transactional database workload
Reads: 1.5KB (can’t batch / cache reads, random)
Writes: 128K blocks (log based layout, plus defragmentation)
Turn up the load until latency is over required SLA
Micron P320h – ACT results
[root@144.bm-general.dev.nym2 act]# latency_calc/act_latency.py -l actconfig_micron_75x_1d_rssdb_20130503232823.out
        trans %>(ms)            device %>(ms)
hour       1      8     64         1      8     64
-----  ------ ------ ------   ------ ------ ------
    1    0.17   0.00   0.00     0.03   0.00   0.00
    2    0.17   0.00   0.00     0.03   0.00   0.00
    3    0.18   0.00   0.00     0.03   0.00   0.00
    4    0.18   0.00   0.00     0.03   0.00   0.00
    5    0.18   0.00   0.00     0.03   0.00   0.00
    6    0.19   0.00   0.00     0.04   0.00   0.00
150K read IOPS @ 1.5K
225MB writes @ 128K
225MB reads @ 128K
$8/GB
Test data – the next generation
6K reads per second, 9MB/sec write load
Drive and load                          > 1 ms   > 8 ms   > 64 ms
Intel s3700, 20% OP - 6k iops             1.6      0        0       ($3/GB)
Intel s3700, 20% OP - 12k iops            5.4      0        0
Intel s3700, 20% OP - 24k iops           12.29     0        0
Intel s3700, NO OP - 24k iops            15.33     0        0
FusionIO ioDrive 2 – 6k iops              2.63     0.01     0       ($8/GB)
FusionIO ioDrive 2 – 12k iops             7.32     0.1      0
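One way to read results like the table above is to pick the highest tested load that still meets a latency target. The sketch below is an assumption-laden illustration: the table values are taken to be percentages of transactions, and the limits in `SLA_LIMITS` are hypothetical and should be replaced with your own SLA.

```python
# Results copied from the table above: load (reads/sec) -> share of
# transactions slower than 1 ms, 8 ms and 64 ms (assumed to be percentages).
INTEL_S3700_20_OP = {
    6_000: (1.6, 0.0, 0.0),
    12_000: (5.4, 0.0, 0.0),
    24_000: (12.29, 0.0, 0.0),
}

# Hypothetical SLA: at most this percentage may exceed 1 ms, 8 ms, 64 ms.
SLA_LIMITS = (5.0, 1.0, 0.1)


def max_load_within_sla(results, limits=SLA_LIMITS):
    """Return the highest tested load whose latencies stay within the limits."""
    passing = [
        load
        for load, observed in results.items()
        if all(value <= limit for value, limit in zip(observed, limits))
    ]
    return max(passing) if passing else None
```

With these assumed limits, only the 6k load qualifies for this drive; a looser 1 ms limit would admit the higher loads.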
Test data – the previous generation
2K reads per second, 3MB/sec write load
Drive                              > 1 ms   > 8 ms   > 64 ms
Intel X25-M w/ no OP (160G)         17.9%    0.6%     0.4%
Intel X25-M + OP (126G)              3.4%    0.1%     0.08%
OCZ Deneva 2 SLC + OP (95G)          0.9%    0.08%    0%
Samsung SS805 (100G)                 2.0%    0.09%    0%
Intel 710 + OP (158G)                4.0%    0.01%    0%
Intel 320 + OP (126G)                5.6%    0%       0%
OCZ Vertex 2 + OP (190G)             6.3%    0.5%     0.01%
SMART XceedIOPS + OP (158G)          5.4%    0.4%     0%
Intel 510 + OP (95G)                 6.2%    4.0%     0.03%
Micron P300 + OP (79GB)              1.3%    1.0%     0.7%
Test data – the previous generation
6K reads per second, 18MB/sec write load
Drive                              > 1 ms   > 8 ms   > 64 ms
OCZ Deneva 2 SLC + OP (95G)          3.2%    0.4%     0%
Samsung SS805 (100G)                10.1%    0.8%     0.02%
Intel 320 + OP (126G)               22.0%    0.3%     0.03%
OCZ Deneva 2 MLC (Sync)              8.8%    0.6%     0.06%
OCZ Vertex 2 + OP (190G)            27.6%    4.6%     0.4%
SMART XceedIOPS + OP (158G)         24.5%    5.4%     1.0%
Why Aerospike?
➤  Key Value API
➤  Real-time Performance
➤  Read/Write Workloads
➤  Clustering
➤  High Availability
➤  Commodity Hardware
➤  RAM + Flash
➤  XDR
Distributed Key Value Database +
Global Data Management
Challenges
1.  Handle extremely high rates of persistent read/write transactions
2.  Avoid hot spots to maintain tight latency SLAs
3.  Provide immediate consistency with replication
4.  Allow long running tasks with transactions
5.  Scale linearly as data sizes increase
6.  Add capacity with no service interruption
Aerospike: the gold standard for high throughput, low latency, high reliability transactions
Performance
• Over ten trillion transactions per month
• 99% of transactions faster than 2 ms
• 150K TPS per server
Scalability
• Billions of Internet users
• Clustered Software
• Automatic Data Rebalancing
Reliability
• 50 customers; zero service downtime
• Immediate Consistency
• Rapid Failover; Data Center Replication
Price/Performance
• Makes impossible projects affordable
• Flash-optimized
• 1/10 the servers required
10x Performance
[Chart: throughput (TPS) on Balanced and Read-Heavy workloads for Aerospike, Cassandra, MongoDB, and Couchbase 2.0*]
“*We were forced to exclude Couchbase...since when run with either disk or replica durability on it was unable to complete the test.” – Thumbtack Technology
[Chart: Balanced Workload Read Latency, average latency (ms) vs. throughput (ops/sec), for Aerospike, Cassandra, MongoDB]
[Chart: Balanced Workload Update Latency, average latency (ms) vs. throughput (ops/sec), for Aerospike, Cassandra, MongoDB]
HIGH THROUGHPUT, LOW LATENCY
High Availability
Phases
1) 100K TPS – 4 nodes
2) Clients at Max
3) 400K TPS – 4 nodes
4) 400K TPS – 3 nodes
5) 400K TPS – 4 nodes
Aerospike Node Specs:
CentOS 6.3
Intel i5-2400 @ 3.1 GHz (Quad core)
16 GB RAM @ 1333 MHz
Better than the Competition
➤ Hard to Maintain
➤ Performance
➤ Latency
➤ Number of Servers
➤ Stability
➤ Cost of RAM
➤ Cost of RAM
➤ Scalability
OHIO
Simpler Scaling: Fewer Servers, ACID, Zero Touch
1)  No Hotspots – DHT with RIPEMD160 simplifies data partitioning
2)  Smart Client – 1 hop to data, no load balancers
3)  Shared Nothing Architecture, every node identical
4)  Single row ACID – synch replication in cluster
5)  Smart Cluster, Zero Touch – auto-failover, rebalancing, rolling upgrades..
6)  Transactions and long running tasks prioritized real-time
7)  XDR – asynch replication across data centers ensures Zero Downtime
Intelligent Client
•  Implements Aerospike API
•  Optimistic row locking
•  Optimized binary protocol
•  Cluster tracking
–  Learns about cluster changes, partition map
–  Gossip protocol
•  Transaction semantics
–  Global transaction ID
–  Retransmit and timeout
Shields Applications from the Complexity of the Cluster
ACID Transactions
Writing with Immediate Consistency
1.  Write sent to row master
2.  Latch against simultaneous writes
3.  Apply write synchronously to master memory and replica memory
4.  Queue operations to disk
5.  Signal completed transaction (optional storage commit wait)
6.  Master applies conflict resolution policy (rollback/rollforward)
[Diagram labels: master, replica]
Adding a Node
1.  Cluster discovers new node via gossip protocol
2.  Paxos vote determines new data organization
3.  Partition migrations scheduled
4.  When a partition migration starts, write journal starts on destination
5.  Partition moves atomically
6.  Journal is applied and source data deleted
[Diagram label: transactions continue]
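The write steps listed under "Writing with Immediate Consistency" could be sketched as below. This is a toy model, not Aerospike's implementation: `Node` and `RowMaster` are hypothetical names, a single lock stands in for the per-row latch, and step 6 (conflict resolution) is left out.

```python
import threading


class Node:
    """One cluster node: in-memory records plus a queue of pending disk writes."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk_queue = []


class RowMaster:
    """Hypothetical master for a set of rows, with a single replica."""

    def __init__(self, master, replica):
        self.master = master
        self.replica = replica
        self._latch = threading.Lock()  # one latch for all rows, for brevity

    def write(self, key, value):
        # Step 1 is the call itself: the client sends the write to the master.
        # Step 2: latch against simultaneous writes.
        with self._latch:
            # Step 3: apply synchronously to master and replica memory.
            self.master.memory[key] = value
            self.replica.memory[key] = value
            # Step 4: queue the operation to disk on both nodes.
            self.master.disk_queue.append((key, value))
            self.replica.disk_queue.append((key, value))
        # Step 5: signal completion (no storage commit wait in this sketch).
        return True
```

The point of the ordering is that the caller is only told the write succeeded after both in-memory copies agree, while the disk write happens asynchronously from the queue.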
Distribution
➤  Distributed Hash Table with No Hotspots
§  Every key hashed with RIPEMD160 into a 20 byte (fixed length) string; NO KNOWN COLLISIONS
§  Hash + additional (fixed 64 bytes) data stored in DRAM in the index
§  Some bits from hash value are used to calculate the Partition ID (4096 partitions)
§  Partition ID maps to Node ID in the cluster
➤  1 Hop to data
§  Smart Client simply calculates Partition ID to determine Node ID
§  No Load Balancers required
➤  Shared Nothing architecture
§  Every node is identical
cookie-abcdefg-12345678
182023kh15hh3kahdjsh
Partition ID    Master Node ID    Replica Node ID
…               1                 4
1820            2                 3
1821            3                 2
4096            4                 1
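The key-to-node lookup described on this slide can be sketched as follows. Several details are assumptions: only the key string is hashed, the partition ID is taken from the low 12 bits of the first two digest bytes, and SHA-1 is swapped in as a stand-in when the local OpenSSL build does not provide RIPEMD-160. The `partition_map` argument is a hypothetical client-side table.

```python
import hashlib

PARTITIONS = 4096  # from the slide


def digest(key):
    """Hash the key to 20 bytes, preferring RIPEMD-160 when available."""
    data = key.encode("utf-8")
    try:
        return hashlib.new("ripemd160", data).digest()
    except ValueError:
        # Some OpenSSL builds disable RIPEMD-160; SHA-1 is only a stand-in
        # here because it also yields 20 bytes.
        return hashlib.sha1(data).digest()


def partition_id(key):
    """Take 12 bits of the digest as the partition ID (0..4095)."""
    d = digest(key)
    # Assumption: use the low 12 bits of the first two bytes (little-endian).
    return int.from_bytes(d[:2], "little") % PARTITIONS


def node_for(key, partition_map):
    """Look up the master node for a key, as a smart client might."""
    return partition_map[partition_id(key)]
```

Because every client can compute the partition ID locally and holds the partition map, a request goes straight to the owning node without a load balancer in between.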
Replication that Works
➤  Super Storm Sandy 2012
§  NYC down for 17 hours
§  Back up and synched in 1 hour via Aerospike Cross-Data Center Replication (XDR)
“Aerospike allows us to handle business continuity and reliability across 4 data centers seamlessly. And we can now expand our deployment to new data centers in less than a week.” - Elad Efraim, CTO
NOSQL EXTENSIBILITY
➤  Namespaces (policy containers)
§  Determine storage - DRAM or Flash
§  Determine replication factor
§  Contain records and sets
➤  Sets (tables) of records
§  Arbitrary grouping
➤  Records (rows)
§  Max 128k, contain key and bins
§  Bin with same name can contain values of different types
–  String, integer, bytes (raw, blob, etc)
–  list (an ordered collection of values)
–  map (a collection of keys and values)
§  Bins can be added anytime
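The namespace, set, record, and bin hierarchy can be pictured with nested dicts. This is only a mental model, not the Aerospike client API; the `put` helper and the names "test" and "users" are made up for the example.

```python
# namespace -> set -> key -> record (a dict of bin name -> value)
database = {}


def put(namespace, set_name, key, bins):
    """Create or update a record; new bins can be added at any time."""
    sets = database.setdefault(namespace, {})
    records = sets.setdefault(set_name, {})
    record = records.setdefault(key, {})
    record.update(bins)
    return record


put("test", "users", "u1", {"name": "Ann", "visits": 3})
# The same bin name may hold a different type in another record.
put("test", "users", "u2", {"name": 42})
# Bins (here a list and a map) are added to an existing record later.
put("test", "users", "u1", {"tags": ["a", "b"], "prefs": {"lang": "en"}})
```

Nothing in the set declares a schema up front, which is the extensibility the slide refers to.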
Real-time Analytics on Operational Data
DISTRIBUTED QUERIES
1.  “Scatter” requests to all nodes
2.  Indexes in DRAM for fast map of secondary → primary keys
3.  Indexes co-located with data to guarantee ACID, manage migrations
4.  Records read in parallel from all SSDs using lock free concurrency control
5.  Aggregate results on each node
6.  “Gather” results from all nodes on client
STREAM AGGREGATIONS
1.  Push Code/ Security Policies/ Rules to Data with UDFs
2.  Pipe Query results through UDFs to Filter, Transform, Aggregate.. Map, Reduce
REAL-TIME ANALYTICS on OPERATIONAL DATA (No ETL)
➤  In Database, within the same Cluster
➤  On the same Data, on XDR Replicated Clusters
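The scatter, per-node aggregate, and gather steps can be sketched in a few lines. Everything here is illustrative: `NODES` is made-up data where each inner list plays the role of one node's records, and `aggregate_on_node` stands in for a UDF running next to the data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the set of records held by one node.
NODES = [
    [{"user": "a", "spend": 10}, {"user": "b", "spend": 5}],
    [{"user": "c", "spend": 7}],
    [{"user": "d", "spend": 1}, {"user": "e", "spend": 20}],
]


def aggregate_on_node(records, min_spend):
    """Filter and aggregate locally, as a UDF would on each node."""
    matching = [r["spend"] for r in records if r["spend"] >= min_spend]
    return {"count": len(matching), "total": sum(matching)}


def scatter_gather(nodes, min_spend):
    """Scatter the query to all nodes in parallel, then gather on the client."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda recs: aggregate_on_node(recs, min_spend), nodes))
    return {
        "count": sum(p["count"] for p in partials),
        "total": sum(p["total"] for p in partials),
    }
```

Only the small per-node summaries travel to the client, which is why aggregating on each node before the gather step keeps network traffic low.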
brian@aerospike.com
srini@aerospike.com
QUESTIONS

Más contenido relacionado

La actualidad más candente

Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timeAerospike, Inc.
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Community
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...Equnix Business Solutions
 
Ceph Day San Jose - Object Storage for Big Data
Ceph Day San Jose - Object Storage for Big Data Ceph Day San Jose - Object Storage for Big Data
Ceph Day San Jose - Object Storage for Big Data Ceph Community
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph Community
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedEqunix Business Solutions
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureDanielle Womboldt
 
Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Community
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitecturePatrick McGarry
 
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...inwin stack
 
Ceph Day San Jose - Ceph at Salesforce
Ceph Day San Jose - Ceph at Salesforce Ceph Day San Jose - Ceph at Salesforce
Ceph Day San Jose - Ceph at Salesforce Ceph Community
 
Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Community
 
Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersSeveralnines
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New FeaturesAmazon Web Services
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Community
 
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu Yong
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu YongUnlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu Yong
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu YongCeph Community
 
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John HaanBasic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John HaanCeph Community
 

La actualidad más candente (18)

Predictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-timePredictable Big Data Performance in Real-time
Predictable Big Data Performance in Real-time
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
 
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
PGConf.ASIA 2019 Bali - Building PostgreSQL as a Service with Kubernetes - Ta...
 
Ceph Day San Jose - Object Storage for Big Data
Ceph Day San Jose - Object Storage for Big Data Ceph Day San Jose - Object Storage for Big Data
Ceph Day San Jose - Object Storage for Big Data
 
Ceph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-GeneCeph on 64-bit ARM with X-Gene
Ceph on 64-bit ARM with X-Gene
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
 
Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage Ceph Day KL - Ceph on All-Flash Storage
Ceph Day KL - Ceph on All-Flash Storage
 
MySQL Head-to-Head
MySQL Head-to-HeadMySQL Head-to-Head
MySQL Head-to-Head
 
QCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference ArchitectureQCT Ceph Solution - Design Consideration and Reference Architecture
QCT Ceph Solution - Design Consideration and Reference Architecture
 
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
Intel - optimizing ceph performance by leveraging intel® optane™ and 3 d nand...
 
Ceph Day San Jose - Ceph at Salesforce
Ceph Day San Jose - Ceph at Salesforce Ceph Day San Jose - Ceph at Salesforce
Ceph Day San Jose - Ceph at Salesforce
 
Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute
 
Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB Clusters
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
 
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance BarriersCeph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers
 
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu Yong
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu YongUnlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu Yong
Unlock Bigdata Analytic Efficiency with Ceph Data Lake - Zhang Jian, Fu Yong
 
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John HaanBasic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan
 

Destacado

Tectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessTectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessAerospike, Inc.
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourAerospike, Inc.
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Real-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormReal-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormNati Shalom
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesAerospike, Inc.
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike, Inc.
 

Destacado (8)

Tectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessTectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven Business
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Real-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using StormReal-Time Big Data at In-Memory Speed, Using Storm
Real-Time Big Data at In-Memory Speed, Using Storm
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use Cases
 
The Future of Apache Storm
The Future of Apache StormThe Future of Apache Storm
The Future of Apache Storm
 
Aerospike Hybrid Memory Architecture
Aerospike Hybrid Memory ArchitectureAerospike Hybrid Memory Architecture
Aerospike Hybrid Memory Architecture
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 

Similar a Storm Persistence and Real-Time Analytics

Real Time Big Data (w/ NoSQL)
Real Time Big Data (w/ NoSQL)Real Time Big Data (w/ NoSQL)
Real Time Big Data (w/ NoSQL)Stein Writes Inc.
 
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...Aerospike
 
Building low latency java applications with ehcache
Building low latency java applications with ehcacheBuilding low latency java applications with ehcache
Building low latency java applications with ehcacheChris Westin
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton insertsChris Adkin
 
How To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - SlidesHow To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - SlidesSeveralnines
 
Amazon EC2 Container Service in Action
Amazon EC2 Container Service in ActionAmazon EC2 Container Service in Action
Amazon EC2 Container Service in ActionRemotty
 
Proving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobProving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobKapil Goyal
 
System Capa Planning_DBA oracle edu
System Capa Planning_DBA oracle eduSystem Capa Planning_DBA oracle edu
System Capa Planning_DBA oracle edu엑셈
 
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1DataStax Academy
 
Site Performance - From Pinto to Ferrari
Site Performance - From Pinto to FerrariSite Performance - From Pinto to Ferrari
Site Performance - From Pinto to FerrariJoseph Scott
 
Scale and Throughput @ Clicktale with Akka
Scale and Throughput @ Clicktale with AkkaScale and Throughput @ Clicktale with Akka
Scale and Throughput @ Clicktale with AkkaYuval Itzchakov
 
Load Balancing MySQL with HAProxy - Slides
Load Balancing MySQL with HAProxy - SlidesLoad Balancing MySQL with HAProxy - Slides
Load Balancing MySQL with HAProxy - SlidesSeveralnines
 
Presenta completaoow2013
Presenta completaoow2013Presenta completaoow2013
Presenta completaoow2013Fran Navarro
 
Pluggable Databases: What they will break and why you should use them anyway!
Pluggable Databases: What they will break and why you should use them anyway!Pluggable Databases: What they will break and why you should use them anyway!
Pluggable Databases: What they will break and why you should use them anyway!Guatemala User Group
 
Redis on NVMe SSD - Zvika Guz, Samsung
 Redis on NVMe SSD - Zvika Guz, Samsung Redis on NVMe SSD - Zvika Guz, Samsung
Redis on NVMe SSD - Zvika Guz, SamsungRedis Labs
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpDataStax
 

Similar a Storm Persistence and Real-Time Analytics (20)

Real Time Big Data (w/ NoSQL)
Real Time Big Data (w/ NoSQL)Real Time Big Data (w/ NoSQL)
Real Time Big Data (w/ NoSQL)
 
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...
Combining Real-time and Batch Analytics with NoSQL, Storm and Hadoop - NoSQL ...
 
Building low latency java applications with ehcache
Building low latency java applications with ehcacheBuilding low latency java applications with ehcache
Building low latency java applications with ehcache
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
Adaptec maxCache 3.0
Adaptec maxCache 3.0Adaptec maxCache 3.0
Adaptec maxCache 3.0
 
How To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - SlidesHow To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - Slides
 
Amazon EC2 Container Service in Action
Amazon EC2 Container Service in ActionAmazon EC2 Container Service in Action
Amazon EC2 Container Service in Action
 
Os Gopal
Os GopalOs Gopal
Os Gopal
 
Oracle on AWS RDS Migration - 성기명
Oracle on AWS RDS Migration - 성기명Oracle on AWS RDS Migration - 성기명
Oracle on AWS RDS Migration - 성기명
 
Proving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slobProving out flash storage array performance using swingbench and slob
Proving out flash storage array performance using swingbench and slob
 
System Capa Planning_DBA oracle edu
System Capa Planning_DBA oracle eduSystem Capa Planning_DBA oracle edu
System Capa Planning_DBA oracle edu
 
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
Cassandra Summit 2014: Lesser Known Features of Cassandra 2.1
 
Site Performance - From Pinto to Ferrari
Site Performance - From Pinto to FerrariSite Performance - From Pinto to Ferrari
Site Performance - From Pinto to Ferrari
 
Scale and Throughput @ Clicktale with Akka
Scale and Throughput @ Clicktale with AkkaScale and Throughput @ Clicktale with Akka
Scale and Throughput @ Clicktale with Akka
 
Load Balancing MySQL with HAProxy - Slides
Load Balancing MySQL with HAProxy - SlidesLoad Balancing MySQL with HAProxy - Slides
Load Balancing MySQL with HAProxy - Slides
 
Presenta completaoow2013
Presenta completaoow2013Presenta completaoow2013
Presenta completaoow2013
 
Pluggable Databases: What they will break and why you should use them anyway!
Pluggable Databases: What they will break and why you should use them anyway!Pluggable Databases: What they will break and why you should use them anyway!
Pluggable Databases: What they will break and why you should use them anyway!
 
Redis on NVMe SSD - Zvika Guz, Samsung
 Redis on NVMe SSD - Zvika Guz, Samsung Redis on NVMe SSD - Zvika Guz, Samsung
Redis on NVMe SSD - Zvika Guz, Samsung
 
IO Dubi Lebel
IO Dubi LebelIO Dubi Lebel
IO Dubi Lebel
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
 



Storm Persistence and Real-Time Analytics

  • 1. STORM PERSISTENCE AND REAL-TIME ANALYTICS
      April 1, 2014
      Aerospike: in-memory NoSQL database
      Aerospike (aer·o·spike) [air-oh-spahyk], noun: 1. tip of a rocket that enhances speed and stability
      brian@aerospike.com
  • 2. Real-time analytics with Storm and Aerospike
      Brian Bulkowski, Founder and CTO, Aerospike
      @Aerospikedb
      © 2014 Aerospike. All rights reserved.
  • 3. Streaming architecture
      [Diagram components: User Data; Streaming (Storm); Real-time Interactions Server; Data Warehouse, Hadoop Cluster; Hadoop]
      Batch Analytics
         • User segmentation
         • Location patterns
         • Similar audience
      Real-time Interactions
         • Frequency caps
         • Recent ads served
         • Recent search terms
  • 4. Why not other databases?
      ➤ Database requests in bolts
      ➤ Flash optimized
         § Do you need more than 30G?
      ➤ Read / write optimized
      ➤ Faster & more reliable than Kafka (Cassandra based)
      ➤ Faster than Mongo
      ➤ More scale than Redis
  • 5. Examples
      ➤ Recommendations
         § Multiple recommendation systems
         § Multi-arm bandit
         § https://github.com/tdunning/storm-counts/wiki/Bayesian-Bandit
      ➤ Simple fraud counts
         § Store recent requests for payment
         § Store recent users
         § Calculate fraud scores, drop events if past threshold
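The "simple fraud counts" example above can be sketched roughly as follows. This is only an illustration, not code from the talk: the class name `FraudCounter`, the 60-second window, and the threshold of 3 are all assumptions, and an in-process dictionary stands in for the database that a Storm bolt would query.

```python
from collections import defaultdict, deque


class FraudCounter:
    """Hypothetical helper: keeps recent payment-request times per user."""

    def __init__(self, window_seconds=60.0, threshold=3):
        self.window_seconds = window_seconds  # assumed look-back window
        self.threshold = threshold            # assumed maximum allowed requests
        self.recent = defaultdict(deque)      # user_id -> recent timestamps

    def score(self, user_id, timestamp):
        """Record the request and return how many fall inside the window."""
        queue = self.recent[user_id]
        queue.append(timestamp)
        # Discard requests that are older than the look-back window.
        while queue and timestamp - queue[0] > self.window_seconds:
            queue.popleft()
        return len(queue)

    def should_drop(self, user_id, timestamp):
        """Drop the event once the score is past the threshold."""
        return self.score(user_id, timestamp) > self.threshold


counter = FraudCounter(window_seconds=60.0, threshold=3)
events = [("alice", 0.0), ("alice", 10.0), ("bob", 12.0),
          ("alice", 20.0), ("alice", 30.0), ("alice", 200.0)]
decisions = [counter.should_drop(user, ts) for user, ts in events]
print(decisions)
```

In a real topology the per-user state would live in the shared store rather than in bolt memory, so that any bolt instance can compute the score for any user.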
  • 6. Aerospike Bolts
      ➤ Aerospike has speed, reliability, scale for Storm
         § Free version at http://aerospike.com/
         § Internap – free high performance SSD servers for trial
      ➤ Bolts available on github
         § https://github.com/aerospike/storm-aerospike
      ➤ EnrichBolt
         § Add fields from column after looking up a key
      ➤ PersistBolt
         § Store fields based on a key
      ➤ Benefits
         § In memory with FLASH
         § Clustered for high performance
         § HA state matches Storm’s stateless model
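To make the two bolt roles concrete, here is a minimal sketch of the enrich and persist patterns. It does not use the storm-aerospike API: `KeyValueStore`, `persist_bolt`, and `enrich_bolt` are hypothetical names, tuples are modelled as plain dictionaries, and a dictionary stands in for the database.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value database."""

    def __init__(self):
        self._records = {}

    def put(self, key, bins):
        # Merge the new bins into whatever is already stored for the key.
        self._records.setdefault(key, {}).update(bins)

    def get(self, key):
        # Return a copy so callers cannot mutate stored state.
        return dict(self._records.get(key, {}))


def persist_bolt(store, tup, key_field, fields):
    """Persist pattern: store selected tuple fields under the tuple's key."""
    store.put(tup[key_field], {name: tup[name] for name in fields})


def enrich_bolt(store, tup, key_field, fields):
    """Enrich pattern: look up the key and add stored fields to the tuple."""
    record = store.get(tup[key_field])
    enriched = dict(tup)
    for name in fields:
        enriched[name] = record.get(name)  # None when the field is missing
    return enriched


store = KeyValueStore()
persist_bolt(store, {"user": "u1", "segment": "sports", "ts": 1},
             key_field="user", fields=["segment"])
enriched = enrich_bolt(store, {"user": "u1", "ad": "a9"},
                       key_field="user", fields=["segment"])
print(enriched)
```

Because neither function keeps state of its own, the bolts stay stateless and all durable state sits in the store, which is the pairing the slide describes.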
  • 7. The power of Flash
  • 8. Flash-optimization Delivers Disruptive Performance
      [Diagram, other database: OS file system; page cache; block interface; SSD, HDD]
      [Diagram, Aerospike flash-optimized in-memory database (Aerospike Hybrid Memory System™): block interface; SSD, SSD; Open NVM; SSD]
      Captions: "Ask me and I’ll tell you the answer." / "Ask me. I’ll look up the answer and then tell it to you."
  • 9. …at 1/10 the hardware cost
      Actual customer analysis: 500K TPS, 10 TB storage, 2x replication
      186 servers (other databases) vs. only 14 servers

                                               DRAM & HDD                SSD & DRAM
      Storage / server                         180 GB (196 GB server)    2.4 TB (4 x 700 GB)
      TPS / server                             500,000                   500,000
      Cost / server                            $8,000                    $11,000
      Server costs                             $1,488,000                $154,000
      Power / server                           0.9 kW                    1.1 kW
      Power (2 years), $0.12 per kWh ave. US   $352,000                  $32,400
      Maintenance (2 years), $3,600 / server   $670,000                  $50,400
      Total                                    $2,510,000                $236,800
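The totals on this slide can be reproduced from the per-server figures. The sketch below is a worked check under the slide's own inputs (186 vs. 14 servers, two years of continuous operation at $0.12 per kWh, $3,600 maintenance per server); the function and variable names are illustrative only.

```python
HOURS_IN_TWO_YEARS = 2 * 365 * 24      # 17,520 hours, assuming 24x7 operation
PRICE_PER_KWH = 0.12                   # average US price quoted on the slide
MAINTENANCE_PER_SERVER = 3_600         # two-year maintenance per server


def two_year_cost(servers, cost_per_server, kw_per_server):
    """Return (hardware, power, maintenance, total) in dollars."""
    hardware = servers * cost_per_server
    power = servers * kw_per_server * HOURS_IN_TWO_YEARS * PRICE_PER_KWH
    maintenance = servers * MAINTENANCE_PER_SERVER
    return hardware, power, maintenance, hardware + power + maintenance


dram_hdd = two_year_cost(servers=186, cost_per_server=8_000, kw_per_server=0.9)
ssd_dram = two_year_cost(servers=14, cost_per_server=11_000, kw_per_server=1.1)

for label, row in (("DRAM & HDD", dram_hdd), ("SSD & DRAM", ssd_dram)):
    print(label, [round(value) for value in row])
print("cost ratio:", round(dram_hdd[3] / ssd_dram[3], 1))
```

The computed figures agree with the slide's rounded values, and the ratio of the two totals is a little over 10, which is where the "1/10" headline comes from.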
  • 10. Measure your drives!
      Aerospike Certification Tool (ACT)
      http://github.com/aerospike/act
      Transactional database workload
         Reads: 1.5KB (can’t batch / cache reads, random)
         Writes: 128K blocks (log based layout) (plus defragmentation)
      Turn up the load until latency is over required SLA
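The drive results on the following slides are reported as the percentage of operations slower than 1 ms, 8 ms, and 64 ms. A minimal sketch of that metric is shown below; it is not ACT's own code, and the function name and sample latencies are made up for illustration.

```python
def percent_over(latencies_ms, thresholds_ms=(1, 8, 64)):
    """Percentage of samples whose latency exceeds each threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    total = len(latencies_ms)
    return {
        threshold: 100.0 * sum(1 for value in latencies_ms if value > threshold) / total
        for threshold in thresholds_ms
    }


# Made-up sample: 95 fast reads, 4 reads at 2 ms, 1 read at 10 ms.
samples = [0.2] * 95 + [2.0] * 4 + [10.0]
result = percent_over(samples)
print(result)
```

Raising the load and re-running until one of these percentages breaches the required SLA gives the sustainable rate for the drive.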
  • 11. Micron P320h – ACT results
      [root@144.bm-general.dev.nym2 act]# latency_calc/act_latency.py -l actconfig_micron_75x_1d_rssdb_20130503232823.out

              trans                   device
              %>(ms)                  %>(ms)
      hour    1       8       64      1       8       64
      -----   ------  ------  ------  ------  ------  ------
      1       0.17    0.00    0.00    0.03    0.00    0.00
      2       0.17    0.00    0.00    0.03    0.00    0.00
      3       0.18    0.00    0.00    0.03    0.00    0.00
      4       0.18    0.00    0.00    0.03    0.00    0.00
      5       0.18    0.00    0.00    0.03    0.00    0.00
      6       0.19    0.00    0.00    0.04    0.00    0.00

      150K read IOPS @ 1.5K, 225MB writes @ 128K, 225MB reads @ 128K, $8/GB
  • 12. Test data – the next generation
      6K reads per second, 9MB/sec write load

                                          > 1 ms   > 8 ms   > 64 ms
      Intel s3700, 20% OP - 6k iops       1.6      0        0         ($3/GB)
      Intel s3700, 20% OP - 12k iops      5.4      0        0
      Intel s3700, 20% OP - 24k iops      12.29    0        0
      Intel s3700, NO OP - 24k iops       15.33    0        0
      FusionIO Iodrive 2 – 6k iops        2.63     0.01     0         ($8/GB)
      FusionIO iodrive 2 – 12k iops       7.32     0.1      0
  • 13. Test data – the previous generation
      2K reads per second, 3MB/sec write load

                                          > 1 ms   > 8 ms   > 64 ms
      Intel X25-M + w/No OP (160G)        17.9%    0.6%     0.4%
      Intel X25-M + OP (126G)             3.4%     0.1%     0.08%
      OCZ Deneva 2 SLC + OP (95G)         0.9%     0.08%    0%
      Samsung SS805 (100G)                2.0%     0.09%    0%
      Intel 710 + OP (158G)               4.0%     0.01%    0%
      Intel 320 + OP (126G)               5.6%     0%       0%
      OCZ Vertex 2 + OP (190G)            6.3%     0.5%     0.01%
      SMART XceedIOPS + OP (158G)         5.4%     0.4%     0%
      Intel 510 + OP (95G)                6.2%     4.0%     0.03%
      Micron P300 + OP (79GB)             1.3%     1.0%     0.7%
  • 14. Test data – the previous generation
      6K reads per second, 18MB/sec write load

                                          > 1 ms   > 8 ms   > 64 ms
      OCZ Deneva 2 SLC + OP (95G)         3.2%     0.4%     0%
      Samsung SS805 (100G)                10.1%    0.8%     0.02%
      Intel 320 + OP (126G)               22.0%    0.3%     0.03%
      OCZ Deneva 2 MLC (Sync)             8.8%     0.6%     0.06%
      OCZ Vertex 2 + OP (190G)            27.6%    4.6%     0.4%
      SMART XceedIOPS + OP (158G)         24.5%    5.4%     1.0%
  • 15. Why Aerospike?
  • 16. Distributed Key Value Database + Global Data Management
      ➤ Key Value API
      ➤ Real-time Performance
      ➤ Read/Write Workloads
      ➤ Clustering
      ➤ High Availability
      ➤ Commodity Hardware
      ➤ RAM + Flash
      ➤ XDR
  • 17. Challenges
      1. Handle extremely high rates of persistent read/write transactions
      2. Avoid hot spots to maintain tight latency SLAs
      3. Provide immediate consistency with replication
      4. Allow long running tasks with transactions
      5. Scale linearly as data sizes increase
      6. Add capacity with no service interruption
  • 18. Aerospike: the gold standard for high throughput, low latency, high reliability transactions
      Performance
         • Over ten trillion transactions per month
         • 99% of transactions faster than 2 ms
         • 150K TPS per server
      Scalability
         • Billions of Internet users
         • Clustered Software
         • Automatic Data Rebalancing
      Reliability
         • 50 customers; zero service downtime
         • Immediate Consistency
         • Rapid Failover; Data Center Replication
      Price/Performance
         • Makes impossible projects affordable
         • Flash-optimized
         • 1/10 the servers required
  • 19. 10x Performance
      [Chart, HIGH THROUGHPUT: throughput (TPS) for Balanced and Read-Heavy workloads; series: Aerospike, Cassandra, MongoDB, Couchbase 2.0*]
      [Chart, LOW LATENCY: Balanced Workload Read Latency; average latency (ms) vs. throughput (ops/sec); series: Aerospike, Cassandra, MongoDB]
      [Chart, LOW LATENCY: Balanced Workload Update Latency; average latency (ms) vs. throughput (ops/sec); series: Aerospike, Cassandra, MongoDB]
      * “We were forced to exclude Couchbase...since when run with either disk or replica durability on it was unable to complete the test.” – Thumbtack Technology
  • 20. High Availability
      [Chart annotated with phases 1 to 5]
      Phases
         1) 100K TPS – 4 nodes
         2) Clients at Max
         3) 400K TPS – 4 nodes
         4) 400K TPS – 3 nodes
         5) 400K TPS – 4 nodes
      Aerospike node specs: CentOS 6.3; Intel i5-2400 @ 3.1 GHz (quad core); 16 GB RAM @ 1333 MHz
  • 21. ➤ Hard to Maintain ➤ Performance Better than the Competition ➤ Latency ➤ Number of Servers ➤ Stability ➤ Cost of RAM ➤ Cost of RAM ➤ Scalability
  • 22. Simpler Scaling: Fewer Servers, ACID, Zero Touch
      [Diagram label: OHIO]
      1) No Hotspots – DHT with RIPEMD160 simplifies data partitioning
      2) Smart Client – 1 hop to data, no load balancers
      3) Shared Nothing Architecture, every node identical
      4) Single row ACID – synchronous replication in cluster
      5) Smart Cluster, Zero Touch – auto-failover, rebalancing, rolling upgrades
      6) Transactions and long running tasks prioritized in real time
      7) XDR – asynchronous replication across data centers ensures Zero Downtime
  • 23. Intelligent Client: Shields Applications from the Complexity of the Cluster
      • Implements Aerospike API
      • Optimistic row locking
      • Optimized binary protocol
      • Cluster tracking
         – Learns about cluster changes, partition map
         – Gossip protocol
      • Transaction semantics
         – Global transaction ID
         – Retransmit and timeout
  • 24. ACID Transactions
      Writing with Immediate Consistency [diagram: master, replica]
         1. Write sent to row master
         2. Latch against simultaneous writes
         3. Apply write synchronously to master memory and replica memory
         4. Queue operations to disk
         5. Signal completed transaction (optional storage commit wait)
         6. Master applies conflict resolution policy (rollback / rollforward)
      Adding a Node [diagram: transactions continue]
         1. Cluster discovers new node via gossip protocol
         2. Paxos vote determines new data organization
         3. Partition migrations scheduled
         4. When a partition migration starts, write journal starts on destination
         5. Partition moves atomically
         6. Journal is applied and source data deleted
  • 25. Distribution
      ➤ Distributed Hash Table with No Hotspots
         § Every key hashed with RIPEMD160 into a 20 byte (fixed length) string; NO KNOWN COLLISIONS
         § Hash + additional (fixed 64 bytes) data stored in DRAM in the index
         § Some bits from hash value are used to calculate the Partition ID (4096 partitions)
         § Partition ID maps to Node ID in the cluster
      ➤ 1 Hop to data
         § Smart Client simply calculates Partition ID to determine Node ID
         § No Load Balancers required
      ➤ Shared Nothing architecture
         § Every node is identical

      Example: key cookie-abcdefg-12345678 hashes to 182023kh15hh3kahdjsh

      Partition ID   Master Node ID   Replica Node ID
      …              1                4
      1820           2                3
      1821           3                2
      4096           4                1
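The key-to-node lookup described on this slide can be sketched as below. Several details are assumptions rather than facts from the talk: which 12 bits of the digest select the partition, the round-robin partition map, and the node names are all hypothetical. Some Python builds ship without RIPEMD-160, so the sketch falls back to SHA-1 (also 20 bytes) purely as a stand-in.

```python
import hashlib

N_PARTITIONS = 4096  # partition count from the slide


def key_digest(key):
    """Hash a key to a fixed 20-byte digest."""
    try:
        hasher = hashlib.new("ripemd160")
    except ValueError:
        # Stand-in only: RIPEMD-160 is unavailable in this Python build.
        hasher = hashlib.sha1()
    hasher.update(key.encode("utf-8"))
    return hasher.digest()


def partition_id(digest):
    """Take 12 bits of the digest (assumed: low bits of the first 2 bytes)."""
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def build_partition_map(nodes):
    """Hypothetical round-robin map: partition -> (master, replica)."""
    return {
        pid: (nodes[pid % len(nodes)], nodes[(pid + 1) % len(nodes)])
        for pid in range(N_PARTITIONS)
    }


nodes = ["node-1", "node-2", "node-3", "node-4"]  # made-up node IDs
partition_map = build_partition_map(nodes)

digest = key_digest("cookie-abcdefg-12345678")
pid = partition_id(digest)
master, replica = partition_map[pid]
print(pid, master, replica)
```

Since the client holds the partition map and computes the partition ID itself, it can send each request straight to the master node, which is why no load balancer sits in the path.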
  • 26. Replication that Works
      ➤ Super Storm Sandy 2012
         § NYC down for 17 hours
         § Back up and synched in 1 hour via Aerospike Cross-Data Center Replication (XDR)
      “Aerospike allows us to handle business continuity and reliability across 4 data centers seamlessly. And we can now expand our deployment to new data centers in less than a week.” - Elad Efraim, CTO
  • 27. NOSQL EXTENSIBILITY
      ➤ Namespaces (policy containers)
         § Determine storage - DRAM or Flash
         § Determine replication factor
         § Contain records and sets
      ➤ Sets (tables) of records
         § Arbitrary grouping
      ➤ Records (rows)
         § Max 128k, contain key and bins
         § Bin with same name can contain values of different types
            • String, integer, bytes (raw, blob, etc)
            • list (an ordered collection of values)
            • map (a collection of keys and values)
         § Bins can be added anytime
  • 28. Real-time Analytics on Operational Data
      DISTRIBUTED QUERIES
         1. “Scatter” requests to all nodes
         2. Indexes in DRAM for fast map of secondary → primary keys
         3. Indexes co-located with data to guarantee ACID, manage migrations
         4. Records read in parallel from all SSDs using lock free concurrency control
         5. Aggregate results on each node
         6. “Gather” results from all nodes on client
      STREAM AGGREGATIONS
         1. Push Code / Security Policies / Rules to Data with UDFs
         2. Pipe Query results through UDFs to Filter, Transform, Aggregate (Map, Reduce)
      REAL-TIME ANALYTICS on OPERATIONAL DATA (No ETL)
         ➤ In Database, within the same Cluster
         ➤ On the same Data, on XDR Replicated Clusters
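The scatter, per-node aggregate, and gather steps listed above can be sketched as follows. This is a plain-Python illustration under assumed names: each "node" is just a list of record dictionaries, `node_aggregate` plays the role of a filter-and-aggregate UDF, and nothing here is the Aerospike query API.

```python
from collections import Counter


def node_aggregate(records, predicate, group_key):
    """Runs on each node: filter local records, then count per group."""
    counts = Counter()
    for record in records:
        if predicate(record):
            counts[record[group_key]] += 1
    return counts


def scatter_gather(nodes, predicate, group_key):
    """Client side: send the query to every node, then merge partial results."""
    # Scatter: every node aggregates only its own records.
    partials = [node_aggregate(records, predicate, group_key) for records in nodes]
    # Gather: Counter.update adds the per-node counts together.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Made-up data spread over three nodes.
nodes = [
    [{"user": "u1", "city": "NYC", "clicks": 5}, {"user": "u2", "city": "SF", "clicks": 0}],
    [{"user": "u3", "city": "NYC", "clicks": 2}],
    [{"user": "u4", "city": "SF", "clicks": 7}, {"user": "u5", "city": "NYC", "clicks": 0}],
]
active_by_city = scatter_gather(nodes, lambda r: r["clicks"] > 0, "city")
print(active_by_city)
```

Aggregating on each node first means only small partial results cross the network, rather than every matching record.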
  • 29. QUESTIONS
      brian@aerospike.com
      srini@aerospike.com