Low Latency “OLAP” with HBase - HBaseCon 2012
- 1. Low Latency “OLAP” with HBase
Cosmin Lehene | Adobe
© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
- 2. What we needed … and built
OLAP Semantics
Low Latency Ingestion
High Throughput
Real-time Query API
Not hardcoded to web analytics or x-, y-, z-analytics, but extensible
- 3. Building Blocks
Dimensions, Metrics
Aggregations
Roll-up, drill-down, slicing and dicing, sorting
- 4. OLAP 101 – Queries example
Date        Country  City     OS       Browser  Sale
2012-05-21  USA      NY       Windows  FF        0.0
2012-05-21  USA      NY       Windows  FF       10.0
2012-05-22  USA      SF       OSX      Chrome   25.0
2012-05-22  Canada   Ontario  Linux    Chrome    0.0
2012-05-23  USA      Chicago  OSX      Safari   15.0
Totals: 5 visits over 3 days; 3 sales totaling 50.0
Countries (2): USA: 4, Canada: 1
Cities (4): NY: 2, SF: 1
OS-es (3): Win: 2, OSX: 2
Browsers (3): FF: 2, Chrome: 2
- 5. OLAP 101 – Queries example
Rolling up to country level:
SELECT COUNT(visits), SUM(sales)
GROUP BY country
Country  visits  sales
USA      4       $50
Canada   1       0
“Slicing” by browser:
SELECT COUNT(visits), SUM(sales)
GROUP BY country
HAVING browser = “FF”
Country  visits  sales
USA      2       $10
Canada   0       0
Top browsers by sales:
SELECT SUM(sales), COUNT(visits)
GROUP BY browser
ORDER BY sales DESC
Browser  sales  visits
Chrome   $25    2
Safari   $15    1
FF       $10    2
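The three queries above can be reproduced over the five sample visits with a small in-memory sketch. The `aggregate` helper and its `where` parameter are hypothetical names introduced for illustration; they are not part of SaasBase.

```python
from collections import defaultdict

# Sample rows from the slide: (date, country, city, os, browser, sale)
VISITS = [
    ("2012-05-21", "USA", "NY", "Windows", "FF", 0.0),
    ("2012-05-21", "USA", "NY", "Windows", "FF", 10.0),
    ("2012-05-22", "USA", "SF", "OSX", "Chrome", 25.0),
    ("2012-05-22", "Canada", "Ontario", "Linux", "Chrome", 0.0),
    ("2012-05-23", "USA", "Chicago", "OSX", "Safari", 15.0),
]
COLUMNS = ("date", "country", "city", "os", "browser", "sale")


def aggregate(rows, group_by, where=None):
    """Return {group value: (visit count, total sales)} for one dimension."""
    idx = COLUMNS.index(group_by)
    totals = defaultdict(lambda: [0, 0.0])
    for row in rows:
        if where is not None and not where(row):
            continue  # slicing: skip rows outside the requested segment
        bucket = totals[row[idx]]
        bucket[0] += 1        # COUNT(visits)
        bucket[1] += row[-1]  # SUM(sales)
    return {key: (visits, sales) for key, (visits, sales) in totals.items()}


# Roll-up to country level
rollup = aggregate(VISITS, "country")
# Slice by browser = FF, grouped by country
ff_slice = aggregate(VISITS, "country", where=lambda r: r[4] == "FF")
# Top browsers by sales, highest first
by_browser = aggregate(VISITS, "browser")
top_browsers = sorted(by_browser.items(), key=lambda kv: kv[1][1], reverse=True)
```

Unlike the slide's result table, a group with no matching rows (Canada in the FF slice) is simply absent from the result here rather than reported as 0.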
- 6. OLAP – Runtime Aggregation vs. Pre-aggregation
Aggregate at runtime
Most flexible
Fast – scatter gather
Space efficient
But:
I/O, CPU intensive
slow for larger data
low throughput
Pre-aggregate
Fast
Efficient – O(1)
High throughput
But:
More effort to process (latency)
Combinatorial explosion (space)
No flexibility
- 7. Pre-aggregation
Data needs to be summarized
Can’t visualize 1B data points (no, not even with Retina display)
Difficult to comprehend correlations among more than 3 dimensions
Not all dimension groups are relevant
Index on a needed basis (view selection problem)
Runtime aggregation == TeraSort for every query?
Pre-aggregate to reduce cardinality
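A rough sketch of why indexing only the needed dimension groups matters: with n dimensions there are 2^n - 1 candidate groups, and pre-aggregating all of them multiplies the number of stored cells. The function names and the choice of "selected" groups below are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

DIMENSIONS = ("country", "city", "os", "browser")


def dimension_groups(dimensions):
    """All non-empty subsets of the dimensions: 2**n - 1 candidate groups."""
    return [
        group
        for size in range(1, len(dimensions) + 1)
        for group in combinations(dimensions, size)
    ]


def pre_aggregate(events, groups):
    """Count visits per (group, values) cell, for the given groups only."""
    cells = Counter()
    for event in events:
        for group in groups:
            cells[(group, tuple(event[d] for d in group))] += 1
    return cells


EVENTS = [
    {"country": "USA", "city": "NY", "os": "Windows", "browser": "FF"},
    {"country": "USA", "city": "NY", "os": "Windows", "browser": "FF"},
    {"country": "USA", "city": "SF", "os": "OSX", "browser": "Chrome"},
    {"country": "Canada", "city": "Ontario", "os": "Linux", "browser": "Chrome"},
    {"country": "USA", "city": "Chicago", "os": "OSX", "browser": "Safari"},
]

all_groups = dimension_groups(DIMENSIONS)
# Hypothetical selection: only the groups some report actually needs
selected = [("country",), ("country", "city"), ("browser",)]

full_cube = pre_aggregate(EVENTS, all_groups)
selected_cells = pre_aggregate(EVENTS, selected)
```

Answering "visits for USA" is then a single lookup, `selected_cells[(("country",), ("USA",))]`, at the cost of storing one cell per distinct value combination in each selected group.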
- 8. SaasBase
We tune both
pre-aggregation level vs. runtime post-aggregation
(ingestion speed + space ) vs. (query speed)
Think materialized views from RDBMS
- 9. SaasBase Domain Model Mapping
- 10. SaasBase - Domain Model Mapping
- 11. SaasBase - Ingestion, Processing, Indexing, Querying
- 12. SaasBase - Ingestion, Processing, Indexing, Querying
- 14. Ingestion throughput vs. latency
Historical data (large batches)
Optimize for throughput
Increments (latest data, smaller)
Optimize for latency
- 15. Large, granular input strategies
Slow listing in HDFS
Archive processed files
Filtering input
FileDateFilter (log name patterns: log-YYYY-MM-dd-HH.log)
TableInputFormat start/stop row
File Index in HBase (track processed/new files)
Map tasks overhead - stitching input splits
400K files => 400K map tasks => overhead, slow reduce copy
CombineFileInputFormat – 2GB-splits => 500 splits for 1TB
FixedMappersTableInputFormat (e.g. 5-region splits)
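Two of the strategies above, filtering input by log name and stitching many small files into fewer splits, can be sketched as follows. This is plain Python, not the Hadoop classes named on the slide; `accept` and `combine` are hypothetical helpers that only illustrate the idea.

```python
import re
from datetime import datetime

# Log name pattern from the slide: log-YYYY-MM-dd-HH.log
LOG_NAME = re.compile(r"^log-(\d{4}-\d{2}-\d{2}-\d{2})\.log$")


def accept(file_name, start, end):
    """True if the file's hour stamp falls in the half-open range [start, end)."""
    match = LOG_NAME.match(file_name)
    if match is None:
        return False
    try:
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d-%H")
    except ValueError:  # matches the pattern but is not a real date/hour
        return False
    return start <= stamp < end


def combine(file_sizes, max_split_bytes):
    """Greedily pack consecutive (name, size) files into splits of bounded size."""
    splits, current, current_size = [], [], 0
    for name, size in file_sizes:
        if current and current_size + size > max_split_bytes:
            splits.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        splits.append(current)
    return splits
```

With 2GB splits, 1TB of input packs into roughly 1024 / 2 = 512 splits, which is the "500 splits" order of magnitude quoted above, instead of one map task per file.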
- 16. Ingestion – Bulk Import
HFileOutputFormat (HFOF)
100s X faster than HBase API
No need to recover from failed jobs
No unnecessary load on machines
* No shuffle - global reduce order required!
e.g. first reduce key needs to be in the first region, last one in the last region
Watch for uneven partitions
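The global-order requirement can be illustrated with a range partitioner: every row key goes to the reduce partition whose key range contains it, so partition i only ever holds keys smaller than those in partition i+1. This is a minimal sketch in the spirit of a total-order partitioner; the region start keys and sample row keys are made up for the example.

```python
from bisect import bisect_right


def partition_for(row_key, region_start_keys):
    """Index of the region whose [start, next start) range contains row_key.

    region_start_keys must be sorted and begin with b"" (the first region).
    """
    return bisect_right(region_start_keys, row_key) - 1


def partition_sizes(row_keys, region_start_keys):
    """Number of keys landing in each partition, to spot uneven partitions."""
    counts = [0] * len(region_start_keys)
    for key in row_keys:
        counts[partition_for(key, region_start_keys)] += 1
    return counts


# Hypothetical region boundaries and row keys
REGION_STARTS = [b"", b"2011-01-01", b"2012-01-01"]
KEYS = [
    b"2010-12-04/a",
    b"2011-06-01/b",
    b"2012-05-21/c",
    b"2012-05-22/d",
    b"2012-05-22/e",
]

sizes = partition_sizes(KEYS, REGION_STARTS)
```

Here `sizes` shows the last partition receiving most of the keys, which is the kind of skew the slide warns about.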
- 17. HFOF – FileSizeDatePartitioner
1 partition(reduce) / day for initial import
Uneven reduce (partitions) due to data growth over time
Reduce k: 2010-12-04 = 500MB
Reduce n: 2012-05-22 = 5GB => slow and will result in a 5GB region
Balance reduce buckets based on input file sizes and the reduce key
Generate sub-partitions based on predefined size (e.g. 1GB)
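A sketch of the balancing idea, assuming the total input size per day is known up front: each day gets ceil(size / target) reduce buckets, numbered in date order so the global ordering is preserved. `sub_partitions` is a hypothetical helper, not the actual FileSizeDatePartitioner code, and how keys are split within a day is left out.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def sub_partitions(bytes_per_day, target_bytes=GB):
    """Map each day to a list of reduce bucket ids, numbered in date order."""
    plan, next_bucket = {}, 0
    for day in sorted(bytes_per_day):  # ISO dates sort chronologically
        count = max(1, math.ceil(bytes_per_day[day] / target_bytes))
        plan[day] = list(range(next_bucket, next_bucket + count))
        next_bucket += count
    return plan


# The two days quoted on the slide
plan = sub_partitions({"2010-12-04": 500 * MB, "2012-05-22": 5 * GB})
```

With a 1GB target, the 500MB day keeps a single bucket while the 5GB day is spread over five, so no single reducer (or resulting region) grows to 5GB.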
- 19. Processing
Processing involves reading the input (files, tables, events), pre-aggregating it
(reducing cardinality) and generating tables that can be queried in real-time
1 year: 1B events => 100B data points indexed
Query => scan 365 data points (e.g. daily page views)
Processing could be either MR or real-time (e.g. Storm)
- 20. Processing for OLAP semantics
GROUP BY (process, query)
COUNT, SUM, AVG, etc. (process, query)
SORT (process, query)
HAVING (mostly query, can define pre-process constraints)
- 21. SaasBase vs. SQL Views Comparison
- 23. Processing Performance
read, map, partition, combine, copy, sort, reduce, write
Read:
Scan.setCaching() (I/O ~ buffer)
Scan.setBatch() (avoid timeouts for abnormal input, e.g. 1M hits/visit)
Even region distribution across cluster (distributes CPU, I/O)
Map:
No unnecessary transformations: Bytes.toString(bytes) + Bytes.toBytes(string) (CPU)
Avoid GC : new X() (CPU, Memory)
Avoid system calls (context switching)
Stripping unnecessary data (I/O)
- 24. Processing Performance
Hot (in memory) vs. Cold (on disk, on network) data
Minimize I/O from disk/network
Single shot MR job: SuperProcessor
Emit all groups from one map() call
Incremental processing
Date format YYYY-MM-DD prefixed rowkey (HH:mm for more granularity)
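With a date-prefixed row key, incremental processing only needs to scan the key range for the days not yet processed. The sketch below assumes "/" as the separator between key parts and uses hypothetical helper names; it computes start/stop rows only and does not talk to HBase.

```python
from datetime import date, timedelta


def row_key(day, *dimensions):
    """Date-prefixed row key, e.g. b"2012-05-21/USA/NY"."""
    return "/".join([day.isoformat(), *dimensions]).encode("utf-8")


def scan_range(first_day, last_day):
    """(start_row, stop_row) covering first_day..last_day; stop is exclusive."""
    start = first_day.isoformat().encode("utf-8")
    stop = (last_day + timedelta(days=1)).isoformat().encode("utf-8")
    return start, stop


def incremental_range(last_processed_day, today):
    """Range for every day after last_processed_day, up to and including today."""
    return scan_range(last_processed_day + timedelta(days=1), today)


start_row, stop_row = incremental_range(date(2012, 5, 20), date(2012, 5, 22))
```

Because ISO dates sort lexicographically in chronological order, every key written for the new days falls inside `[start_row, stop_row)` and older data is never re-read.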
- 26. HBase natural order: hierarchical representation
- 27. Indexing - Why
Example: top 10 cities
~50K [country, city] combinations per day
Top 10 cities for 1 year =>
365 (days) X 50K ~= 18M data points scanned
If you add gender => ~36M
If you add Device, OS, Browser …
Might compress well, but think about the environment
How much energy would you spend for just top 10 cities?
* Image from: http://my.neutralexistence.com/images/Green-Earth.jpg
- 28. Indexing with HBase “10” < “2”
GROUP BY year, month, country, city ORDER BY visits DESC LIMIT 10
Lexicographic sorting
2012/05/USA/0000000000/
2012/05/USA/4294966296/San Francisco = 1000 visits*
2012/05/USA/4294966396/New York = 900 visits*
. . .
2012/05/USA/9999999999/
scan 't', {STARTROW => '2012/05/USA/', LIMIT => 10}
* Padding numbers for lexicographic sorting:
1000 -> MAX – 1000, here 2^32 – 1000 = 4294966296 (Long.MAX_VALUE would give a 19-digit value)
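The sorting trick can be checked with a small sketch: subtracting the metric from a large constant and zero-padding the result makes plain lexicographic order equal to descending numeric order, so the first N keys under a prefix are the top N. `MAX = 2**32` is an assumption chosen to produce 10-digit keys, and `index_key` and `top_n` are hypothetical helpers, with a sorted Python list standing in for an HBase table.

```python
from bisect import bisect_left

MAX = 2 ** 32  # assumption: any constant larger than every count works
WIDTH = 10     # digits needed to print MAX, so all keys have equal length


def index_key(year, month, country, visits, city):
    """Key that sorts cities by visits descending within year/month/country."""
    inverted = MAX - visits
    return f"{year:04d}/{month:02d}/{country}/{inverted:0{WIDTH}d}/{city}"


def top_n(sorted_keys, prefix, limit):
    """Mimic a prefix scan with a limit over lexicographically sorted keys."""
    start = bisect_left(sorted_keys, prefix)
    hits = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix) or len(hits) == limit:
            break
        hits.append(key)
    return hits


# Hypothetical visit counts per city
VISITS = {"New York": 900, "San Francisco": 1000, "Chicago": 10, "Boston": 2}
keys = sorted(index_key(2012, 5, "USA", v, c) for c, v in VISITS.items())
top2 = top_n(keys, "2012/05/USA/", 2)
```

Without the inversion and padding, the raw counts would sort as strings, which is the “10” < “2” problem in the slide title.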
- 29. Query Engine
Always reads indexed, compact data
Query parsing
Scan strategy
Single vs. multiple scans
Start/stop rows (prefixes, index positions, etc.)
Index selection (volatile indexes with incremental processing)
Deserialization
Post-aggregation, sorting, fuzzy-sorting etc.
Paging
Custom dimension/metric class loading
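The post-aggregation, sorting and paging steps might look roughly like this once the scans have returned pre-aggregated rows. This is a simplified sketch with hypothetical names and made-up daily counts; the actual query engine is not shown in the slides.

```python
from collections import Counter


def post_aggregate(scanned_rows):
    """Sum the metric of pre-aggregated rows that share a dimension value."""
    totals = Counter()
    for dimension_value, metric in scanned_rows:
        totals[dimension_value] += metric
    return totals


def page(totals, page_number, page_size):
    """Sort by metric descending (name as tie-breaker) and return one page."""
    ordered = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    first = page_number * page_size
    return ordered[first:first + page_size]


# Hypothetical daily pre-aggregates returned by a scan: (city, visits)
SCANNED = [("NY", 400), ("SF", 300), ("NY", 500), ("Chicago", 100), ("SF", 700)]

totals = post_aggregate(SCANNED)
first_page = page(totals, 0, 2)
second_page = page(totals, 1, 2)
```

The work at query time is proportional to the number of pre-aggregated rows scanned, not to the number of raw events behind them.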
- 30. Conclusions
OLAP semantics on a simple data model
Data as first class citizen
Domain Specific “Language” for Dimensions, Metrics, Aggregations
Tunable performance, resource allocation
Framework for vertical analytics systems
- 31. Thank you!
Cosmin Lehene @clehene
http://hstack.org
Credits:
Andrei Dragomir
Adrian Muraru
Andrei Dulvac
Raluca Podiuc
Tudor Scurtu
Bogdan Dragu
Bogdan Drutu
- 33. OLAP 101 - Rollup
Country  Visits  Sale
USA      4       $50
Canada   1       $0
Rollup: SELECT COUNT(visits), SUM(sales) GROUP BY country
- 34. OLAP 101 - Slicing
Date        Country  City     OS       Browser  Sale
2012-03-02  USA      NY       Windows  FF        0.0
2012-03-02  USA      NY       Windows  FF       10.0
2012-03-03  USA      SF       OSX      Chrome   25.0
2012-03-03  Canada   Ontario  Linux    Chrome    0.0
2012-03-04  USA      Chicago  OSX      Safari   15.0
Totals: 5 visits over 3 days; 3 sales totaling 50.0
Countries (2): USA: 4, Canada: 1
Cities (4): NY: 2, SF: 1
OS-es (3): Win: 2, OSX: 2
Browsers (3): FF: 2, Chrome: 2
Filter or Segment or Slice (WHERE or HAVING)
- 35. OLAP 101 – Sorting, TOP n
Browser  Sale
Chrome   $25
Safari   $15
Firefox  $10
SELECT SUM(sales) as total GROUP BY browser ORDER BY total DESC
Editor's notes
- How many HBase users?
- Data as first class citizen
- Check contrast on projector
- Just like speed vs. space in general CS/algorithms. Queries always hit indexes
- Dimensions – read, transform, serialize, deserialize data attributes. Metrics – read/transform/aggregate/serialize. Constraints: ingestion filtering. Report: instrument dimension groups + metrics with aggregations, sorting
- QUERY ENGINE -> INDEX(always realtime)
- Initial import/process and NEW reports (not covered) on historical data
- 18K regions, upgrade to 0.92
- Diagram: HARD TO DIGEST (TOO MUCH INFO, TOO CONDENSED)
- Process = aggregate, generate indexes (natural). Query = uses indexes, can do extra aggregation
- LEFT: report definition, NOT a QUERY. LIKE A VIEW - CREATED - THEN QUERIED
- Inconsistent
- Rowkey = dimensions group -> metrics (right)
- GO BACK to EXPLAIN
- >100K/sec/thread. REALTIME
- Data analysts work with familiar concepts