Data Integration
8. 1990 Data Warehousing
- Drop relational assumption
- Programmability
- Open Source
2008 Hadoop + MapReduce
- Batch → Real-time
- Daily → Continuous
2015 Kafka + Streaming data
19. Batch Processing
[Diagram: Data Lake, Batch Processing (two mapreduce stages), DB, Data Analysis]
- Pageviews: [url, timestamp]
- Rollups: [url, hour, count]
- Key-value form: {url+hour : count}
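The batch rollup on this slide can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the deck's implementation: the function names `rollup_pageviews` and `to_rollup_rows`, the Unix-second timestamps, and the UTC hour format are all hypothetical, and an in-memory `Counter` stands in for the MapReduce jobs.

```python
from collections import Counter
from datetime import datetime, timezone


def rollup_pageviews(pageviews):
    """Group [url, timestamp] records into {(url, hour): count}.

    Timestamps are assumed to be Unix seconds (an assumption, not from the slide).
    """
    counts = Counter()
    for url, ts in pageviews:
        # Truncate the timestamp to its UTC hour, the "hour" part of the key.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:00Z")
        counts[(url, hour)] += 1
    return counts


def to_rollup_rows(counts):
    """Flatten {(url, hour): count} into sorted [url, hour, count] rows for a DB."""
    return sorted((url, hour, n) for (url, hour), n in counts.items())


# A tiny batch of pageviews, processed all at once.
pageviews = [("/home", 0), ("/home", 1800), ("/home", 3600), ("/about", 10)]
rows = to_rollup_rows(rollup_pageviews(pageviews))
for row in rows:
    print(row)
```

The whole input is read before any row is produced, which is the property that distinguishes this from the stream processing shown on the next slide.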
20. Stream Processing
Real Time Technologies
[Diagram: Data Source (Events / DB writes), flume, Kafka producer, Event Stream, Process Stream, Output Stream]
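As a rough sketch of the event stream, process stream, and output stream on this slide, a Python generator can play the role of the stream processor. The names `process_stream` and `events`, and the integer hour bucket, are assumptions made for illustration; a real deployment would read from something like a Kafka topic rather than an in-memory iterator.

```python
from collections import defaultdict


def process_stream(event_stream):
    """Consume (url, timestamp) events one at a time.

    After each event, emit the updated (url, hour, count) on the output stream.
    """
    counts = defaultdict(int)
    for url, ts in event_stream:
        hour = ts // 3600  # hour bucket: whole hours since the Unix epoch (assumed)
        counts[(url, hour)] += 1
        yield (url, hour, counts[(url, hour)])


# An in-memory iterator standing in for the event stream.
events = iter([("/home", 0), ("/home", 1800), ("/about", 10), ("/home", 3600)])
output = list(process_stream(events))
for update in output:
    print(update)
```

Each event produces an output record immediately, so counts are available continuously instead of after a daily batch run.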
24. Lambda Architecture
[Diagram: a New Data Stream feeds both a Batch Layer and a Real-Time Layer; the Serving Layer answers each query from a Merged View]
- Batch Layer: Batch Processing (MapReduce), Precompute Views, Batch Views
- Real-Time Layer: Stream Processing (Process Stream) over Real-Time Data, Increment Views, Real-Time Views
- Serving Layer: Partial Aggregates, merge, Merged View, query
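The merge step of the Lambda Architecture can be illustrated with a small Python sketch. It assumes pageview counts as the view and simple addition as the merge function; `batch_view`, `RealTimeView`, and `query` are hypothetical names, not part of any framework named in the slides.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: precompute {(url, hour): count} from the full dataset."""
    return Counter((url, ts // 3600) for url, ts in master_dataset)


class RealTimeView:
    """Real-time layer: increment counts for events the batch run has not seen."""

    def __init__(self):
        self.counts = Counter()

    def update(self, url, ts):
        self.counts[(url, ts // 3600)] += 1


def query(batch, realtime, url, hour):
    """Serving layer: merge the two partial aggregates by adding them."""
    return batch[(url, hour)] + realtime.counts[(url, hour)]


# Two events are already covered by the batch view; one arrived afterwards.
batch = batch_view([("/home", 0), ("/home", 100)])
realtime = RealTimeView()
realtime.update("/home", 200)

total = query(batch, realtime, "/home", 0)
print(total)
```

Addition works here because counts are partial aggregates that combine cleanly; views that do not combine this way would need a different merge function.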