IOT Big Data Ingestion and Processing in Hadoop by Silver Spring Networks
© 2016 Silver Spring Networks. All rights reserved.
1. Silver Spring Networks
Greg Brosman
Product Manager
SilverLink Data Platform
2. Silver Spring Networks
• Silver Spring Networks helps global utilities and cities
connect, optimize, and manage smart energy and smart city
infrastructure
• Over 22 million connected devices
• 200B records read per year
• 2 million remote operations per year
[Figure labels: Integrate Renewables; Engage Customers; Improve Operational Efficiency; Improve Reliability; Manage Peak; Automate Measurement; Improve Energy Efficiency; Reduce Truck Rolls for Device Maintenance]
3. More Devices, More Data
• How can we do more with our network?
- We deployed a network to support meter reading. It works great, but
we’re ready for the next thing to leverage these investments
• How do we manage these new devices and make all this
data accessible and secure?
- There are lots of opportunities to enhance our service by making use of
advanced analytics, but we can’t get the data to the right people
• How can we reduce the cost, time, and pain of integrating
with 3rd party apps?
- The ecosystem of 3rd party apps is growing, but we need a scalable way to
connect apps with data
Managing the volume, variety, and velocity of data
4. SilverLink Data Platform
• Automatically ingest smart grid data
• Enrich data with valuable context
• Enable real-time and batch
applications
• Archive raw and enriched data
• Connect apps through standard APIs
• Explore data through BI tool
integrations
Seamlessly connecting apps with sensor data
[Figure: SilverLink Data Platform diagram. Platform layers: Security & API Management; Real-Time; Storage & Batch; Data Ingestion. Data Sources: Devices; Silver Spring Networks Data; Utility Data; 3rd Party Data. Applications: Silver Spring Networks Apps; 3rd Party Apps; In-House Apps.]
5. Starfish
• A Worldwide Wireless IPv6 Network Service for the IoT.
Starfish enables cities, utilities, enterprises, and developers to
connect and manage a new generation of intelligent devices
• Focus areas include water, energy, food, traffic, transportation
and safety
• 2016 Global IoT Hackathon Series: an opportunity to develop
and test innovations and collaborate with leading IoT
technologists
Building a new ecosystem of IoT services
6. IOT Big Data Ingestion & Processing in Hadoop
Darin Nee
Silver Spring Networks
7. Agenda
• Context & scope of our use case
• Tour a DataTorrent app we built
• Some technical hurdles & solutions we came up with
• Q & A
8. Kinds of Data
• Sensor reads
• Meter register reads & interval data
• Threshold events, traps
• Device metadata
9. Data Flow
• NICs collect data from meters
• Head-end software polls NICs
• Some data is sent asynchronously to the head end
• Agents send data to SilverLink
• Data processing using DataTorrent + more
• Data consumed via APIs and SQL
10. Security
• Encryption of data at rest & in-transit
• Ranger & Knox
• Custom requirements to satisfy local laws
• Auditing
• No data leakage across tenants
• Not enough to be secure: need to prove it
11. Multi-Tenancy
• Shared resources to cut costs
• Customers with millions of devices, and pilots with a handful of them
• Centralized management of software & operations
• Challenge in selling shared anything to our customers
12. Scalability
• 23 million network endpoints in service today
• Up to 96 intervals a day
• Each interval has 4 channels
• So, approximately 8 billion intervals per day
• Keep this data forever
• Also, 100 million events a day
• And, sensors that can collect data every 10s
• 19.4 GB per million meters per day
• ½ TB per day
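The sizing figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; treating every endpoint as a meter that reports the maximum 96 intervals is an assumption made here, and the function names are illustrative.

```python
# Figures quoted on the slide.
ENDPOINTS = 23_000_000                 # network endpoints in service
MAX_INTERVALS_PER_DAY = 96             # "up to 96" intervals a day
CHANNELS_PER_INTERVAL = 4
GB_PER_MILLION_METERS_PER_DAY = 19.4


def max_interval_values_per_day(endpoints=ENDPOINTS):
    """Upper bound: every endpoint reports every interval on every channel."""
    return endpoints * MAX_INTERVALS_PER_DAY * CHANNELS_PER_INTERVAL


def daily_volume_gb(endpoints=ENDPOINTS):
    """Daily data volume in GB, assuming every endpoint is a meter."""
    return endpoints / 1_000_000 * GB_PER_MILLION_METERS_PER_DAY


if __name__ == "__main__":
    print(f"{max_interval_values_per_day():,} interval values per day at most")
    print(f"{daily_volume_gb():.1f} GB per day")
```

The upper bound works out to roughly 8.8 billion values per day, in line with the slide's "approximately 8 billion" given that 96 is a maximum, and the volume comes to about 446 GB per day, which the slide rounds to ½ TB.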
13. High Availability & Disaster Recovery
• Clustering
• Automated Fail-overs
• Rolling upgrades
14. Tech Architecture
• HDFS
• Kafka
• DataTorrent
• Elasticsearch
• OpenTSDB & HBase
• Oozie
• Hive
• Mule
• Apigee
• Tableau
15. Why DT?
• Management UI Console
• Malhar Library + Java
• Support
• Rapid Development
• Stats, Operability, Auto-Scaling
16. Strengths
• Resilient operators (availability)
• Easily partition operators (scalability)
• Any Java programmer can build a simple app
• Facilitate management hand-off to operations
• Easy to detect failures with UI and stats
17. Our focus areas for improvement
• No “back pressure”
• If a container crashes with an out-of-memory (OOM) error, recovery restores the container to the same OOM state
• No good way to stop an app and save context
• Can be difficult to navigate logs
18. Example DT App: AMM Export Ingestion
19. Example App: AMM Export Ingestion
Input Operator
• Scans last 2 days’ HDFS directories
• Emits filenames
• Too fast!
20. Example App: AMM Export Ingestion
AMM File Reader
• Parses different types
• Emits Avro tuples
• XML parsing can be slow
• File & tuple sizes vary a lot
21. Example App: AMM Export Ingestion
Enricher
• Adds metadata to every tuple
• External dependency on Elasticsearch
• Uses a thread pool, since one YARN container is too big for
a single client
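The thread-pool idea on the Enricher slide can be sketched as follows. This is plain Python rather than the DataTorrent operator itself: `lookup_metadata` is a hypothetical stand-in for the Elasticsearch query, the in-memory `FAKE_METADATA` table replaces the real index, and the reading that several concurrent lookups are needed to keep one container busy is an interpretation of the slide.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the Elasticsearch metadata index.
FAKE_METADATA = {
    "meter-1": {"utility": "A"},
    "meter-2": {"utility": "B"},
}


def lookup_metadata(device_id):
    """Stand-in for one blocking metadata lookup against an external store."""
    return FAKE_METADATA.get(device_id, {})


def enrich(record):
    """Return a copy of the record with device metadata attached."""
    enriched = dict(record)
    enriched["metadata"] = lookup_metadata(record["device_id"])
    return enriched


def enrich_batch(records, pool):
    """Enrich records concurrently; Executor.map keeps the input order."""
    return list(pool.map(enrich, records))


if __name__ == "__main__":
    batch = [{"device_id": "meter-2", "kwh": 1.5},
             {"device_id": "meter-1", "kwh": 0.7}]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for row in enrich_batch(batch, pool):
            print(row)
```

Because each lookup mostly waits on the external store, running several in parallel lets a single operator instance issue many requests at once instead of one at a time.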
22. Example App: AMM Export Ingestion
Avro Converter
• Normalizes tuples across schema versions
• Outputs many tuples from one
23. Example App: AMM Export Ingestion
Enriched Persister
• Writes Avro tuples to HDFS files
• Names output files by date, input file, part, etc.
• HDFS can be slow, which makes it another external dependency
• Container death causes rewriting of tuples
24. Example App: AMM Export Ingestion
TSDB Writer
• Embedded instance of OpenTSDB
• External dependency on HBase
• Slow during metric creation and HBase Region splits
25. AMM Export Ingestion
Continuing to extend the DAG with new operators
26. Scalability & Throughput
• The classic YARN application solution is to spin up more containers
• Not so simple due to external dependencies, and
• Highly variable loads
- Tuple mix
- Tuple size
- Kind of tuple
• Buffering tuples in the DAG
• Static partitioning means the DAG has to be slow
• Throughput: how many tuples an operator can emit per window
• We need dynamic throughput management
27. Throughput Management
We use a Stats Listener to “auto-tune” the throughput rate
28. Throughput Management
First implementation
• Works on any pair of logical operators
• Adjusts upstream operator throughput every N windows
• Scales it by a factor based on downstream operator
backlog threshold levels
• A lagging correction, since it is based on operator stats from
prior windows
• Observed: the overall processing rate across the DAG oscillates
• Control theory says this is not going to work, since it will
never converge to a reasonable value
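A toy simulation makes the oscillation easier to see. The sketch below is not the Stats Listener code: the thresholds, the scale factors of 0.5 and 2.0, the two-window stats lag, and the rates are all assumptions picked for illustration.

```python
def scale_factor(backlog, low=1_000, high=10_000):
    """Threshold rule: halve the rate above `high`, double it below `low`."""
    if backlog > high:
        return 0.5
    if backlog < low:
        return 2.0
    return 1.0


def simulate(steps=12, downstream_rate=1_000, upstream_rate=4_000, lag=2):
    """Return the upstream rate chosen at each step.

    The controller only sees the backlog from `lag` steps earlier, which
    mimics tuning from stats reported for prior windows.
    """
    backlog = 0
    stale_backlogs = [0] * lag      # queue of backlog values not yet visible
    rates = []
    for _ in range(steps):
        observed = stale_backlogs.pop(0)
        upstream_rate = max(1, int(upstream_rate * scale_factor(observed)))
        backlog = max(0, backlog + upstream_rate - downstream_rate)
        stale_backlogs.append(backlog)
        rates.append(upstream_rate)
    return rates


if __name__ == "__main__":
    print(simulate())
```

In this model the rate keeps doubling while the stale stats still show an empty backlog, then keeps halving long after the backlog has built up, so it swings past the downstream rate in both directions instead of settling near it.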
29. Throughput Management
Second implementation
• Compute a backlog
• Try to maintain a target backlog that is a multiple of the
downstream operator processing rate
• Problem: starvation
- Stats not reported when throughput set to zero
- Solution 1: small, positive min throughput
- Solution 2: fractional/probabilistic emit
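One way to read the second implementation is sketched below. The slide does not give the formula, so the rule used here (allow the upstream operator to emit whatever is missing from the target backlog) is an assumption, as are the parameter names and defaults. The slide lists the two starvation fixes as alternatives; the sketch combines them so that a fractional minimum throughput still emits an occasional tuple.

```python
import random


def target_throughput(backlog, downstream_rate, multiple=2.0, min_throughput=0.1):
    """Tuples per window the upstream operator may emit.

    The target backlog is `multiple` times the downstream processing rate.
    The floor keeps the rate positive so stats keep being reported.
    """
    target_backlog = multiple * downstream_rate
    return max(min_throughput, target_backlog - backlog)


def tuples_to_emit(throughput, rng=random):
    """Turn a fractional throughput into a whole number of tuples.

    The fractional part is treated as the probability of emitting one
    extra tuple, so 0.1 emits a tuple in about one window out of ten.
    """
    whole = int(throughput)
    fraction = throughput - whole
    return whole + (1 if rng.random() < fraction else 0)


if __name__ == "__main__":
    rng = random.Random(42)
    for backlog in (0, 1_500, 5_000):
        rate = target_throughput(backlog, downstream_rate=1_000)
        print(backlog, rate, tuples_to_emit(rate, rng))
```

Tying the target to the downstream processing rate means a fast consumer is fed a deeper queue and a slow one a shallower queue, without hand-picked thresholds per operator.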
30. Throughput Management
Successes
• Operators don’t run out of memory and crash
• Overall throughput across the DAG is much higher
• Can adapt to a wide mix of loads
• General enough that we are using it in all our apps
• We ingested 4 multi-month pilot datasets successfully
• Reduced the time it takes to ingest 1 day’s worth of data
from 1½ hrs to 15 min
• Hands off, automated tuning
31. Throughput Management
Remaining problems
• Throughput management is based on tuple count and not
all tuples are the same
• Garbage Collection causes uneven performance
• Slow to converge
• Hard to test and debug
32. Stopping DAGs
• Persist processed state for files & Kafka messages
- Save Kafka offsets in ZooKeeper
- Rename input files to .processed
• Checkpoint Listener
- Wait to persist state until a tuple fully transits the DAG
- Prevent loss of data
• However, some tuples get processed twice
• Suspend script
- Use REST API to set a flag on Input Operator
- Wait until no more activity
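The "wait to persist state" step can be sketched as a small bookkeeping class. This is plain Python loosely modelled on a "window committed" callback and is not the DataTorrent API: `DeferredStateCommitter`, `record`, and `committed` are names chosen here, and `persist` stands in for saving a Kafka offset or renaming a file to .processed.

```python
class DeferredStateCommitter:
    """Hold back 'processed' markers until their window is fully committed."""

    def __init__(self, persist):
        self._persist = persist   # callable that durably marks one item done
        self._pending = {}        # window id -> items read in that window

    def record(self, window_id, item):
        """Remember an input item read in this window; persist nothing yet."""
        self._pending.setdefault(window_id, []).append(item)

    def committed(self, window_id):
        """Called once every tuple up to `window_id` has transited the DAG."""
        done = sorted(w for w in self._pending if w <= window_id)
        for wid in done:
            for item in self._pending.pop(wid):
                self._persist(item)


if __name__ == "__main__":
    marked = []
    committer = DeferredStateCommitter(marked.append)
    committer.record(1, "export-a.xml")
    committer.record(2, "export-b.xml")
    committer.committed(1)
    print(marked)
```

If the app stops after a file's tuples have been written downstream but before its window is committed, the file is never marked and is read again on restart. That trade keeps data from being lost, and it is one plausible reason for the slide's note that some tuples get processed twice.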
33. Versions
• Hadoop 2.3.0
• DataTorrent 3.1.1