Safety is one of the most crucial concerns for Uber's ride-sharing platform. To respond to safety issues more quickly, Uber runs a Flink pipeline that joins multiple high-volume (>10 TB/day) streams of sensor data with trip information to extract contextual features. On top of these features, a TensorFlow deep learning model deployed inside Flink detects potential car crashes and identifies general trip and driving anomalies. The results are passed to business operations teams, who proactively reach out to riders and drivers to check on their experience and provide prompt safety assistance if needed.
5. Agenda
■ A First Model
■ Choosing and Integrating Flink
■ 1st Iteration: A Modular, Light Topology
■ 2nd Iteration: On a Reusable Sensor Platform
■ 3rd Iteration: On-Trip Detection
8. Trip Context Features
● Distance from dropoff to destination
● Rider/Driver cancellation vs. normal trip completion
● Overall length of trip (time, distance)
● Location context (highway, movie theatre, airport, etc.)
10. Building a Model
● Uber has very high-accuracy labels
● Extremely imbalanced dataset
● Model trained using Apache Spark
● Can host the model for streaming scoring using a platform called Michelangelo
11. Agenda
■ A First Model
■ Choosing and Integrating Flink
■ 1st Iteration: A Modular, Light Topology
■ 2nd Iteration: On a Reusable Sensor Platform
■ 3rd Iteration: On-Trip Detection
12. Why Flink?
● Uber migrated from Samza to Flink
● Rich API: keyBy, join, window, etc.
● Supports batch processing
● Exactly-once guarantees
13. Uber Infrastructure: Schema’d Kafka
● Many Kafka topics at Uber have enforced schemas
● Centralized schema registry that stores Avro schemas
● Wrote a custom, config-driven SourceFunction/SinkFunction that loads into and out of generated Java classes
(Config excerpt showing keys such as topic-name and feature-name.)
// Attach the config-driven source for this topic, with Avro type
// information so Flink can (de)serialize the generated Java class.
DataStream<T> inputStream = env.addSource(
    (SourceFunction<T>) getInputs().get(topicName),
    new AvroTypeInfo<>(tClass)
);
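The sink side is symmetric. A minimal sketch, assuming a getOutputs() registry analogous to the getInputs() call shown above (the registry and outputTopicName are hypothetical names):

// Hypothetical sink-side counterpart: write the Avro-generated records
// back out to a schema'd Kafka topic via the config-driven registry.
outputStream.addSink((SinkFunction<T>) getOutputs().get(outputTopicName));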
14. Uber Infrastructure: M3 Metrics
● In-house, open-source metrics system, M3, roughly compatible with Prometheus
● Implemented a custom MetricReporter, lightly adapted from Flink's PrometheusReporter
● A Prometheus scraper then ingests into Uber's metrics system
● Utilize M3 with our internal alerting and monitoring
(Slide shows an example query in the M3 query language.)
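A minimal sketch of such a reporter against Flink's standard MetricReporter interface; the class name and forwarding details are hypothetical, and the real reporter is adapted from Flink's PrometheusReporter:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

// Minimal sketch: track registered metrics under their full identifiers so
// a Prometheus-style scrape endpoint can read them and forward to M3.
public class M3StyleReporter implements MetricReporter {
    private final Map<String, Metric> metrics = new ConcurrentHashMap<>();

    @Override
    public void open(MetricConfig config) {
        // e.g. read the scrape port and common tags from the reporter config
    }

    @Override
    public void close() {
        // shut down the scrape endpoint
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.put(group.getMetricIdentifier(metricName), metric);
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.remove(group.getMetricIdentifier(metricName));
    }
}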
15. Agenda
■ A First Model
■ Choosing and Integrating Flink
■ 1st Iteration: A Modular, Light Topology
■ 2nd Iteration: On a Reusable Sensor Platform
■ 3rd Iteration: On-Trip Detection
16. Sensor Data at Uber
● GPS: points sent up one at a time; 0.5 Hz, latitude/longitude/speed, ~3 TB/day
● Accelerometer: ~5-minute batched payloads; 25 Hz, 3 dimensions, ~10 TB/day
● Uber operates tens of millions of trips daily
● Sensor data is MBs per trip
(Diagram: batched payloads of varying length, e.g. 5, 3, and 6 minutes, landing in Hive and Cassandra.)
17. Joining TBs of Sensor Streams
● Managing state is difficult; state is sensitive to failures
● Trade-offs between state size and data coverage
● Focus on reducing stream joins
(Diagram: how do we join GPS, points sent up one or two at a time, 0.5 Hz, latitude/longitude/speed, ~3 TB/day, with accelerometer, ~5-minute payloads, 25 Hz, 3 dimensions, ~10 TB/day?)
19. Condensing Prior to Trip Joins/Aggregations
● Accelerometer payloads (~10 TB) → Detect Spikes → Aggregate Spikes by Trip (~60 GB)
● Trip GPS (~3 TB) → Detect Stops (~1 GB)
● Join Stops and Spikes by Trip
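A minimal Flink sketch of the condensation step; the AccelPayload, Reading, Spike, and TripSpikes types, the threshold, and the aggregator are all hypothetical stand-ins for the real detector:

// Turn ~10 TB/day of raw 25 Hz payloads into a small stream of spike records.
DataStream<Spike> spikes = accelPayloads
    .flatMap((AccelPayload p, Collector<Spike> out) -> {
        for (Reading r : p.getReadings()) {
            if (r.getMagnitude() > SPIKE_THRESHOLD) {  // hypothetical threshold
                out.collect(new Spike(p.getTripId(), r.getTimestamp(), r.getMagnitude()));
            }
        }
    })
    .returns(Spike.class);

// Aggregate the condensed spikes per trip before any cross-stream join.
DataStream<TripSpikes> spikesByTrip = spikes
    .keyBy(Spike::getTripId)
    .window(EventTimeSessionWindows.withGap(Time.minutes(30)))  // gap is illustrative
    .aggregate(new SpikeAggregator());  // hypothetical AggregateFunction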
20. A Modular Post-Trip Crash Detection Topology
(Diagram: accelerometer payloads flow through Detect Spikes and Aggregate Spikes by Trip; a trip-end-event Kafka topic drives Fetch GPS Route via the Location Service, Detect Stops, and Fetch Trip Context via the Trip Service; Stops and Spikes are joined, Scored by Model on Michelangelo, the machine learning platform, and sent to the RideCheck service.)
Why so many jobs?
● Resource isolation
● "Paper trail"/debuggability
● Reuse intermediate features
● Facilitates cross-team collaboration
21. The Power of Flink: Joining by Trip ID
● Use SessionWindow
● First ensure that both streams are deduplicated by trip ID
● The configured "gap" roughly acts as an expiry time
● Shows the power of windows in Flink (see the sketch below):
○ Triggered the moment both sides have arrived, immediately freeing state
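A minimal sketch of the trip-ID join, assuming the hypothetical Stop, TripSpikes, and ScoringInput types from earlier sketches. The standard session-window join below shows the shape; on top of it, a custom Trigger can fire as soon as one element from each deduplicated side is present:

// Join the two condensed, deduplicated streams by trip ID. The session gap
// doubles as an expiry: if one side never arrives, the window and its state
// are discarded once the gap elapses.
DataStream<ScoringInput> joined = stops
    .join(spikesByTrip)
    .where(Stop::getTripId)
    .equalTo(TripSpikes::getTripId)
    .window(EventTimeSessionWindows.withGap(Time.minutes(30)))  // gap is illustrative
    .apply((JoinFunction<Stop, TripSpikes, ScoringInput>) ScoringInput::new);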
22. Agenda
■ A First Model
■ Choosing and Integrating Flink
■ 1st Iteration: A Modular, Light Topology
■ 2nd Iteration: On a Reusable Sensor Platform
■ 3rd Iteration: On-Trip Detection
23. Platformizing: When Use Cases Diverge
● Example: stop detection is defined differently for RideCheck's crash model than for trip-anomaly detection
● Different products have different criteria for data latency, data quality, precision, and definitions of contextual features
● The same feature may be engineered differently for different applications
24. Platformizing: Demand for Sensor Data
● The effort of joining large data streams to add context is not unique
● Other use cases: fraud detection, ETA calculation
● Example aggregations: per trip, rider/driver match, geolocation (street segments, region), time
25. Adding Sensor Embeddings to the Model
● Use deep learning to learn features from raw sensor data:
○ GPS
○ Accelerometer
○ Gyroscope
● Produce a 100-dimension embedding
● Add the output as features for the existing model
● TensorFlow sub-model runs within Flink
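A minimal sketch of hosting a TensorFlow sub-model inside a Flink operator via the TensorFlow Java API; the model path, tensor names, and output shape are hypothetical:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;

// Load the SavedModel once per task in open(), then score per element.
public class EmbeddingMapper extends RichMapFunction<float[][], float[]> {
    private transient SavedModelBundle model;

    @Override
    public void open(Configuration parameters) {
        model = SavedModelBundle.load("/models/sensor_embedding", "serve");  // hypothetical path
    }

    @Override
    public float[] map(float[][] sensorWindow) {
        try (Tensor<?> input = Tensor.create(sensorWindow);
             Tensor<?> output = model.session().runner()
                 .feed("sensor_input", input)   // hypothetical input tensor name
                 .fetch("embedding")            // hypothetical output: a flat [100] vector
                 .run().get(0)) {
            float[] embedding = new float[100]; // 100-dimension embedding per the slide
            output.copyTo(embedding);
            return embedding;
        }
    }

    @Override
    public void close() {
        if (model != null) {
            model.close();
        }
    }
}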
26. Sensor Trip Aggregation
● A few things make this more feasible:
○ More demand for clean, convenient sensor data from other teams within Uber
○ Reliable GPS now included in batched sensor payloads
● Time for a Sensor Platform that does the aggregation once for everybody
● Unlocks:
○ Full-trip raw data analysis
○ Easy use of trip context data
○ Data quality guarantees
27. Consolidated Crash Detection
(Diagram, "Consolidation": accel/gyro/GPS payloads and trip-event Kafka topics feed a Trip Aggregation job, which emits per-trip sensor data and trip events; a single consolidated job extracts stops, spikes, and embeddings, fetches trip context from the Trip Service, is scored by the model via Michelangelo's hosted ML models, and sends results to RideCheck.)
Why move to a single job now?
● The platform has simplified things
● Much more stable now; less need to isolate
● Rapid iteration has slowed; less need for debugging
28. Agenda
■ A First Model
■ Choosing and Integrating Flink
■ 1st Iteration: A Modular, Light Topology
■ 2nd Iteration: On a Reusable Sensor Platform
■ 3rd Iteration: On-Trip Detection
29. On-Trip Crash Detection: A Hybrid Solution
● Hardest part is forgoing some valuable trip context
● Model performance is inevitably lower due to:
○ Giving up post-trip features
○ Considering only a sliding window of data
● Meant to be run in tandem with the post-trip pipeline
30. On-Trip Crash Detection
(Diagram: Trip Aggregation now also emits 1-minute payloads, while still emitting the original per-trip sensor data; the On-Trip Crash Detection job consumes those payloads and trip events, retains at most 5 minutes of data, and sends detections to RideCheck.)
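A minimal sketch of the 5-minute retention, assuming a KeyedProcessFunction over trip-keyed 1-minute payloads; the MinutePayload and CrashAlert types and the scoring call are hypothetical:

import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keep a rolling buffer of at most 5 minutes of payloads per trip.
public class OnTripDetector extends KeyedProcessFunction<String, MinutePayload, CrashAlert> {
    private transient ListState<MinutePayload> recent;

    @Override
    public void open(Configuration parameters) {
        recent = getRuntimeContext().getListState(
            new ListStateDescriptor<>("recent-payloads", MinutePayload.class));
    }

    @Override
    public void processElement(MinutePayload payload,
                               Context ctx,
                               Collector<CrashAlert> out) throws Exception {
        // Rebuild the buffer, dropping anything older than 5 minutes.
        long cutoff = payload.getTimestamp() - Duration.ofMinutes(5).toMillis();
        List<MinutePayload> window = new ArrayList<>();
        for (MinutePayload p : recent.get()) {
            if (p.getTimestamp() >= cutoff) {
                window.add(p);
            }
        }
        window.add(payload);
        recent.update(window);

        // Hypothetical scoring over the retained window.
        CrashAlert alert = scoreWindow(ctx.getCurrentKey(), window);
        if (alert != null) {
            out.collect(alert);
        }
    }

    private CrashAlert scoreWindow(String tripId, List<MinutePayload> window) {
        return null; // placeholder: model scoring goes here
    }
}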
31. The Future
● Drive down the delay further
○ There would be enormous value in being able to respond in seconds
● On-device heuristics/model
○ Trigger early upload of batched sensor data
○ Backend still does the heavy lifting