TimeSpaceDB
1. $ whoami
Name: Zvi Avraham
Title: Founder & CEO
Company: ZADATA Ltd
Email: zvi@zadata.com
ZΛDΛTΛ © 2015
3. Market for Realtime Data Feeds
[Diagram: ZΛDΛTΛ marketplace connecting Data Sellers and Data Buyers]
• Data Sellers: 70% of fees
• ZΛDΛTΛ: 30% commission + access fees
• Subscription fees
• Realtime Data Analytics, Data Feeds, Historical Data
• Crowdsourcing & Sensor Apps, Connected Apps
• Participants (can be both Data Sellers & Data Buyers):
– Data Scientists, Data Analysts, Researchers
– Quants, Algo-traders, Financial Analysts
– Businesses
– Content & App Publishers
– Utilities / M2M / Internet-of-Things
– Independent Developers & ISVs: Web/Mobile/Devices
4. Data Sources & Destinations
[Diagram: data sources feeding ZΛDΛTΛ, which feeds destinations]
• Sources: Physical, Alt. Finance, Social Media, Sports, Entertainment, Traffic & Transit
• Destinations: web apps, analytics apps, mobile apps
32. Timeseries/metrics for Riak
• 4 closed source implementations:
– Boundary Kobayashi
– Hosted Graphite
– Kivra Metyr
– Temetra (smart meter data)
ZΛDΛTΛ © 2013
38. What is TimeSpace DB?
Scalable GPU-accelerated* Geospatial Timeseries* Database*
39. TSDB (TimeSeries DataBase)
A distributed system for storage and analysis of geo-temporal data collected from different sources.
The TSDB project was divided into two parts – Storage (A) & Analysis (B)
Dr. Yehuda Ben-Shimol Mr Zvi Avraham
Eyal Segal
Shahar Ben-David
Alon Rolnik
Ron Schmid Morag
42. 1st Prototype used Graphite
• Graphite problems
– 1-sec resolution
– Losing datapoints
– Designed for monitoring
– Only supports simple timeseries of {timestamp, numeric_value}
43. Requirements
• 1-ms resolution
• Support for geolocation
• Multiple pre-defined schemas
– Not only {timestamp, numeric_value}
• Ability to run pre-defined functions on stored data
• Use Data Locality – running Computations near Data
• Use efficient compute language – OpenCL
• Near realtime queries shouldn’t interfere with online
• Both REST APIs & Push (Graphite protocol, MQTT)
• Bulk import/export (CSV, JSON)
44. Query/Workload types
OLTP / Online OLAP / Analytics
Pre-defined queries
UDF queries
- GET
- PUT
- UPDATE
- DELETE
- 2i
- Full-text search
- Statistics
- Aggregations
- Rollups
- Reporting
- Scan
- etc.
Ad-hoc queries - SQL injection ;-)
- http://TrySQL.com ?
SELECT *
FROM …
WHERE …
[Diagram: Online Cluster (NoSQL, Dynamo-style) serving GET, PUT, DELETE and pre-defined queries; batch import into an Analytics Cluster (MPP or in-memory) serving ad-hoc queries, Map/Reduce and pre-defined queries]
46. OLTP vs OLAP DBs
OLTP                         | OLAP
Online                       | Analytics
Realtime                     | Interactive
Pre-defined queries          | Ad-hoc queries
Low predictable latency      | Latency small enough that the analyst will not lose concentration
Many clients                 | Not many clients
Read and/or Write-intensive  | Batch import / ETL
47. No Need for an Analytics Cluster
48. Online & pre-defined analytics in the same DB cluster
• Each node has a dedicated compute device for M/R
• M/R runs on dedicated CPU cores, on GPUs, or on accelerators (like Xeon Phi)
[Diagram: a single Online Cluster (NoSQL, Dynamo-style) serving GET, PUT, DELETE and pre-defined queries as M/R in OpenCL; batch import into an In-Memory Analytics DB for ad-hoc queries]
49. TimeSpace DB Stack
• Application
• TimeSpace DB: Stats/Timeseries, Geo, NLP/Search
• Riak – modular, ops-friendly, distributed K/V store
• Erlang VM – open, reliable, cross-platform software for concurrent, distributed computing
• OpenCL – open, heterogeneous (CPU+GPU), standard compute language
• Hardware: CPU, GPU, FPGA, Accelerator
57. Erlang/OTP vs OpenCL
                     | Erlang/OTP                 | OpenCL
Parallelism          | Task-parallel              | Data-parallel(*)
T-put                | Moderate to bad(*)         | Optimized for high t-put
Latency              | Optimized for low latency  | bad
Floating Point / HPC | bad                        | excellent
Self-hosted          | Yes                        | No – requires host code
IO & Network         | Yes                        | No
58. Data Representation
• Each timeseries is divided into a number of “tablets”
• Each “tablet” has header & payload
• Everything is in binary format
– Binary
– Little Endian
– Aligned for OpenCL data types
– Essentially unpacked OpenCL struct
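A minimal Python sketch of such a tablet, under stated assumptions: the slides do not give the real field list, so the header fields (record count, schema id, first and last timestamp) and the record fields (timestamp in ms, lat, lon, value) are hypothetical. The record is chosen so that every field sits on its natural alignment and no padding is needed, which is what lets the same bytes be read as an array of OpenCL structs.

```python
import struct

# Hypothetical layouts, little endian ("<"); not the deck's actual schema.
HEADER = struct.Struct("<IIQQ")  # record count, schema id, first ts, last ts
RECORD = struct.Struct("<Qffd")  # timestamp (ms), lat, lon, value


def pack_tablet(schema_id, records):
    """Serialize a non-empty list of (ts_ms, lat, lon, value) into header + payload."""
    records = sorted(records)  # order by timestamp
    header = HEADER.pack(len(records), schema_id, records[0][0], records[-1][0])
    payload = b"".join(RECORD.pack(*r) for r in records)
    return header + payload


tablet = pack_tablet(1, [(1382513460150, 37.7756, -122.4193, 20.35)])
print(HEADER.size, RECORD.size, len(tablet))  # 24 24 48
```

Because the 8-byte fields start at offsets 0 and 16 and the 4-byte fields at 8 and 12, the packed record matches the naturally aligned struct layout byte for byte.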
59. Marshaling Erlang ↔ OpenCL
• Erlang Binary Syntax & Binary Comprehensions are used to marshal & unmarshal “tablets”
• Tablets are arrays of unpacked, aligned OpenCL structs
• No parsing needed
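The round trip can be sketched in Python with the `struct` module, which plays roughly the role that binary comprehensions play in Erlang (something like `<< <<Ts:64/little, V:64/little-float>> || {Ts, V} <- Points >>`). The two-field record below is a hypothetical schema chosen for illustration, not one taken from the deck.

```python
import struct

RECORD = struct.Struct("<Qd")  # hypothetical record: timestamp (ms), value


def marshal(points):
    """Pack [(ts_ms, value), ...] into one contiguous little-endian buffer."""
    return b"".join(RECORD.pack(ts, value) for ts, value in points)


def unmarshal(payload):
    """Decode the buffer back into tuples; fixed-size records, so no parser."""
    return [(ts, value) for ts, value in RECORD.iter_unpack(payload)]


points = [(1000, 1.5), (2000, -2.25)]
assert unmarshal(marshal(points)) == points
```

Since every record has the same size, decoding is pure offset arithmetic, which is the sense in which nothing has to be parsed.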
68. 3 W-s
• WHAT?
– Topic name
• WHEN?
– timestamp
• WHERE?
– Location (lat, lon)
• PAYLOAD:
– Value(s)
(WHAT, WHEN and WHERE are metadata; PAYLOAD is data)
69. 3 W-s
• WHAT?
– “/weather/us/ca/san-francisco/temp_c”
• WHEN?
– 2013-10-23T07:31:00.150Z
• WHERE?
– (37.7756, -122.4193)
• PAYLOAD:
– 20.35
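As an illustration, the example point above could be held in a record like the following Python sketch. The class name `DataPoint`, its field names and the helper `to_epoch_ms` are hypothetical; the millisecond timestamp follows the 1-ms resolution requirement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class DataPoint:  # hypothetical name, for illustration only
    topic: str         # WHAT
    timestamp_ms: int  # WHEN, milliseconds since the Unix epoch (UTC)
    lat: float         # WHERE
    lon: float
    value: float       # PAYLOAD


def to_epoch_ms(iso_utc):
    """Convert an ISO 8601 UTC timestamp such as 2013-10-23T07:31:00.150Z."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S.%fZ")
    return (dt.replace(tzinfo=timezone.utc) - EPOCH) // timedelta(milliseconds=1)


point = DataPoint(
    topic="/weather/us/ca/san-francisco/temp_c",
    timestamp_ms=to_epoch_ms("2013-10-23T07:31:00.150Z"),
    lat=37.7756,
    lon=-122.4193,
    value=20.35,
)
print(point.timestamp_ms)  # 1382513460150
```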
70. Other Pre-defined Schemas
• Raw & Aggregated Geospatial Timeseries
– Timestamp, Lat, Lon, Value
– Timestamp, Bounding Box, Count, Min, Max, Sum
• Financial Timeseries
– Timestamp, Bid, Ask, Last, Volume
– Timestamp, Open, High, Low, Close, Volume
• Raw & Aggregated Analytics data
– Clickstream, CTR, etc.
• Twitter data (see demo)
74. Multiple Storage Backends
• ETS
– In-memory, mostly for testing
• Riak PB
– Using Riak as external DB
• Riak Local Client – Data Locality
– Native Erlang client
– Useful in M/R & riak-core
• DynamoDB
– Using AWS DynamoDB as external DB
76. Riak Storage
• LevelDB backend, since we need 2i
• Unlike Bitcask, LevelDB has no auto expiration, so we have a process deleting old “tablets”
80. Semantic Keys for Tablets
• Tablet Key:
– “timeframe|topic_name|first_timestamp”
• 2i for time range:
– Integer index: [first_timestamp, last_timestamp]
• 2i for location bounding box:
– Binary index: [southwest_geohash, northeast_geohash]
• where timeframe:
– raw|sec|min|hour|day
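The key and index scheme above can be sketched in Python as follows. The function names are hypothetical, the geohash encoder is the standard public algorithm rather than the deck's own code, and the sketch assumes topic names never contain the `|` separator.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lat, lon, precision=8):
    """Standard geohash: interleave longitude/latitude bisection bits."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        value <<= 1
        if coord >= mid:
            value |= 1
            rng[0] = mid
        else:
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def tablet_key(timeframe, topic, first_ts):
    return f"{timeframe}|{topic}|{first_ts}"


def tablet_indexes(points):
    """points: non-empty list of (ts_ms, lat, lon); returns the two 2i ranges."""
    ts = [p[0] for p in points]
    lats = [p[1] for p in points]
    lons = [p[2] for p in points]
    return {
        "time_range": (min(ts), max(ts)),
        "bounding_box": (geohash(min(lats), min(lons)),   # southwest corner
                         geohash(max(lats), max(lons))),  # northeast corner
    }


print(tablet_key("min", "/weather/us/ca/san-francisco/temp_c", 1382513460150))
print(geohash(37.7756, -122.4193, 3))  # 9q8
```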
81. Problem
• Max recommended Riak Object size is 5MB
– (theoretical limit is 50MB)
• But OpenCL needs much larger buffers to be efficient!
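To make the tension concrete, here is a hedged Python sketch of one possible approach, not something the slides describe: store a long payload as several tablets that each stay under the recommended object size, and concatenate them again into one large buffer before handing it to a kernel. The function names are hypothetical, and 5MB is assumed to mean 5 × 1024 × 1024 bytes.

```python
MAX_OBJECT_BYTES = 5 * 1024 * 1024  # assumed meaning of "5MB"


def split_into_tablets(payload, record_size, max_bytes=MAX_OBJECT_BYTES):
    """Cut a payload of fixed-size records into chunks of whole records."""
    per_tablet = max_bytes // record_size * record_size
    return [payload[i:i + per_tablet] for i in range(0, len(payload), per_tablet)]


def gather(tablets):
    """Rebuild one contiguous buffer, e.g. to upload to a compute device."""
    return b"".join(tablets)


payload = bytes(112 * 100_000)  # 100,000 records of 112 bytes each
tablets = split_into_tablets(payload, record_size=112)
print(len(tablets), len(tablets[0]))  # 3 5242832
```

With 112-byte records, 46,811 records fit in one 5MB object, so even a modest 11.2MB series already spans three objects.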
82. TimeSpace DB API
• API calls:
– Insert value with timestamp
– Insert value with timestamp & location
– Insert bulk (from CSV)
– Fetch/Delete by time range
– Fetch/Delete by time range & location bounding box
• Common params for all API calls:
– Topic name
– Timeframe (raw/sec/min/hour/day)
83. TimeSpace DB API (2)
• Rollups for time range
– Convert from one timeframe to another
– Store result in a new timeseries topic
• Reduce for time range
– Calculate statistics (min/max/sum/avg/etc.)
• Run OpenCL kernel on time range using M/R:
– e.g. Sentiment Analysis for tweets
– Calculate Correlations between timeseries
– Etc.
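A rollup from raw points to a coarser timeframe can be sketched in Python as below. The function name and the output tuple layout (bucket start, count, min, max, sum) are assumptions, loosely modelled on the aggregated schema listed earlier in the deck; in TimeSpace DB this kind of work is described as running in Erlang or OpenCL, not Python.

```python
from collections import defaultdict

TIMEFRAME_MS = {"sec": 1_000, "min": 60_000, "hour": 3_600_000, "day": 86_400_000}


def rollup(points, timeframe):
    """Aggregate (ts_ms, value) pairs into (bucket_start, count, min, max, sum)."""
    width = TIMEFRAME_MS[timeframe]
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % width].append(value)  # align to the bucket start
    return [(start, len(vs), min(vs), max(vs), sum(vs))
            for start, vs in sorted(buckets.items())]


raw = [(0, 1.0), (500, 3.0), (999, 2.0), (1000, 10.0)]
print(rollup(raw, "sec"))
# [(0, 3, 1.0, 3.0, 6.0), (1000, 1, 10.0, 10.0, 10.0)]
```

Keeping count and sum rather than the average means coarser rollups (sec to min, min to hour) can be computed from finer ones without losing accuracy.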
84. Riak Map/Reduce Languages
• JavaScript
– Slow (not V8)
• Erlang
– ~6 times faster than JS, but still slow
• OpenCL (“OpenCL from Erlang”)
– as fast as it gets, but
– Has overhead for small buffers
– Can interfere with Erlang VM Scheduling, if
running on host CPUs
86. Geo on GPU
• Check if a point is inside a bounding box
• Check if a point is inside a circle
• Clustering of nearby points
– Using Naïve Grid-based Clustering + CoGs
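The three operations can be sketched on the CPU in Python as follows; on a GPU each point would be handled by its own work-item. This is an illustration under assumptions: the slides do not say how the circle test is computed (the haversine formula is used here), "CoGs" is read as centers of gravity, and the bounding-box test ignores boxes that cross the antimeridian.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def in_bounding_box(lat, lon, south, west, north, east):
    return south <= lat <= north and west <= lon <= east


def in_circle(lat, lon, center_lat, center_lon, radius_km):
    """Great-circle distance test using the haversine formula."""
    phi1, phi2 = math.radians(lat), math.radians(center_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(center_lon - lon)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a)) <= radius_km


def grid_cluster(points, cell_deg):
    """Naive grid clustering: one cluster per occupied cell, placed at its centroid."""
    cells = defaultdict(list)
    for lat, lon in points:
        cells[(math.floor(lat / cell_deg), math.floor(lon / cell_deg))].append((lat, lon))
    return [(sum(p[0] for p in ps) / len(ps), sum(p[1] for p in ps) / len(ps), len(ps))
            for ps in cells.values()]


print(grid_cluster([(0.25, 0.25), (0.75, 0.75), (5.5, 5.5)], cell_deg=1.0))
# [(0.5, 0.5, 2), (5.5, 5.5, 1)]
```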
88. NLP on GPU
• Implemented using Prime Encoding:
–Full-text Search
–Sentiment Analysis
• By counting negative vs. positive words
–Language Detection
• By counting language-specific stopwords
89. Prime Encoding
• Assign primes to each unique token in corpus:
– most frequent word is assigned “2”
– the next most frequent “3”, and so on
• To encode a tweet, calculate product of
primes for each token in the tweet:
– the product is stored in a ulong (64-bit unsigned int)
– If there is overflow, then start new 64-bit product
• Erlang:
-spec prime_encode(Str::binary()) -> [cl_ulong()].
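The encoding steps can be sketched in Python as below. Only the name `prime_encode` comes from the slide; `build_vocab`, `primes`, whitespace tokenization and lowercasing are assumptions made for the example, and Python's unbounded integers are checked against the 64-bit limit explicitly.

```python
from collections import Counter
from itertools import count

ULONG_MAX = 2**64 - 1


def primes():
    """Yield 2, 3, 5, ... by trial division (fine for a small vocabulary)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def build_vocab(corpus):
    """Map each token to a prime; more frequent tokens get smaller primes."""
    freq = Counter(tok for text in corpus for tok in text.lower().split())
    return {tok: p for (tok, _), p in zip(freq.most_common(), primes())}


def prime_encode(text, vocab):
    """Return the list of 64-bit products; raises KeyError on unknown tokens."""
    products, current = [], 1
    for tok in text.lower().split():
        p = vocab[tok]
        if current * p > ULONG_MAX:  # would overflow: start a new product
            products.append(current)
            current = p
        else:
            current *= p
    if current > 1:
        products.append(current)
    return products


vocab = build_vocab(["riak is fast", "riak is distributed", "riak rocks"])
print(prime_encode("riak is fast", vocab))  # [30]
```

Giving the smallest primes to the most frequent words keeps typical products small, so more tokens fit into each 64-bit value before a new product has to be started.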
90. Prime Encoding
Check if a word is in a tweet
• If at least one of the products is divisible without remainder by the prime of the search token
– then the tweet has this token
• If none of the products is divisible without remainder
– then this token is not in the tweet
• Erlang:
-spec prime_test(Tweet::[cl_ulong()],
Token::cl_ulong()) -> boolean().
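The membership test reduces to a modulo per product, as in this Python sketch. `prime_test` mirrors the Erlang spec above; `sentiment` is a hypothetical helper showing how counting positive versus negative words could be built on top of it, and the primes used are arbitrary illustrative values.

```python
def prime_test(products, token_prime):
    """True if the token's prime divides at least one of the tweet's products."""
    return any(product % token_prime == 0 for product in products)


def sentiment(products, positive_primes, negative_primes):
    """Positive minus negative word matches (distinct words, not occurrences)."""
    pos = sum(prime_test(products, p) for p in positive_primes)
    neg = sum(prime_test(products, p) for p in negative_primes)
    return pos - neg


tweet = [2 * 3 * 7, 5 * 11]      # an encoded tweet made of two products
print(prime_test(tweet, 7))      # True
print(prime_test(tweet, 13))     # False
```

Each test is a handful of integer operations with no string handling, which is what makes the scheme a natural fit for a data-parallel kernel.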
92. Anatomy of a tweet
• JSON
• ~ 4KB
• Need to parse
• Many duplications
• Nested objects
• Only a few fields are actually needed
93. Tweet Schema – OpenCL struct
Only 112B (with calculated fields)
95. CSV API & Export example
• Twitter data exported as CSV:
98. Full Scan + Location Clustering
(in pseudo SQL)
then run Location Clustering before returning
results to the demo app in browser
99. Future Directions (1)
• To Open Source or not to Open Source?
• Benchmarks, Benchmarks, Benchmarks!
• Build library of reusable OpenCL kernels
• Kernels optimized for specific devices:
–Xeon Phi, NVIDIA Tesla, AMD, latest CPUs
100. Future Directions (2)
• Migrate from AoS to SoA / true Column Store
• Implement NULL-columns / fields
• Consider using Parse Transforms or Elixir metaprogramming for various OpenCL & Erlang code generation (schemas, marshalling, etc.)
101. Future Directions (3)
• Workarounds for Riak’s max 5MB/object limit
• Riak Core
• Consider using Riak Pipe instead of M/R
• CRDT in Riak 2.0
– counters, sets, maps, etc.
102. Thank You! Now Q&A
All images are taken from Google Image search and various other places on the Internet
© Copyright of corresponding owners