Collecting Uncertain Data the Reactive Way
Jeff Smith
@jeffksmithjr
x.ai is a personal assistant
who schedules meetings for you
Reactive Machine Learning
Machine Learning Systems
Traits of Reactive Systems
Reactive Strategies
Reactive Machine Learning
Collecting Data
What’s for
dinner?
Reactive Data Collection
Modeling Uncertain Data
Certain Data Model
case class ZebraReading(sensorId: Int,
                        locationId: Int,
                        timestamp: Long,
                        count: Int)
Uncertainty Interval
[27, 33]
Uncertain Data Model
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)
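To make the uncertain model concrete, here is a minimal sketch of turning one PreyReading into an interval estimate of the zebra count. The `Interval` type and the `zebraInterval` helper are hypothetical names that do not appear in the talk, and the sketch assumes `percentZebras` is a fraction between 0 and 1 that applies equally to both bounds.

```scala
// Same fields as the PreyReading on the slide.
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)

// Hypothetical helper type (not from the talk): a closed interval of estimates.
case class Interval(lower: Double, upper: Double) {
  require(lower <= upper, "lower bound must not exceed upper bound")
  def midpoint: Double = (lower + upper) / 2
  def width: Double = upper - lower
}

object UncertainReadings {
  // Assumption: percentZebras is a fraction in [0, 1] and scales both bounds.
  def zebraInterval(r: PreyReading): Interval =
    Interval(r.animalsLowerBound * r.percentZebras,
             r.animalsUpperBound * r.percentZebras)
}
```

For a reading with bounds 27.0 and 33.0 and `percentZebras` of 0.5, `UncertainReadings.zebraInterval` returns `Interval(13.5, 16.5)`. Keeping the bounds instead of a single count lets downstream consumers see how wide the estimate is.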
Scaling Data Collection
Simple Data Architecture
Mutable State
case class Region(id: Int)
import collection.mutable.HashMap
// One shared, mutable map from region to its current density estimate
var densities = new HashMap[Region, Double]()
densities.put(Region(4), 52.4)
Scaling with Queues
Out of Order Updates
// One arrival order: the second write overwrites the first
densities.put(Region(6), 73.6)
densities.put(Region(6), 0.5)
densities.get(Region(6)).get  // 0.5
// The same two updates in the opposite order: a different value survives
densities.put(Region(6), 0.5)
densities.put(Region(6), 73.6)
densities.get(Region(6)).get  // 73.6
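The two orderings above can be replayed in a small self-contained sketch. The `OutOfOrderDemo` object and its `applyInOrder` helper are hypothetical names introduced only for illustration; they apply a sequence of updates to a fresh mutable map in arrival order.

```scala
import scala.collection.mutable

case class Region(id: Int)

object OutOfOrderDemo {
  // Applies (region, density) updates in arrival order to a fresh mutable map
  // and returns the final state. Each put overwrites the previous value.
  def applyInOrder(updates: Seq[(Region, Double)]): Map[Region, Double] = {
    val densities = mutable.HashMap.empty[Region, Double]
    updates.foreach { case (region, density) => densities.put(region, density) }
    densities.toMap
  }

  // The two updates from the slide, in both arrival orders.
  val updates: Seq[(Region, Double)] = Seq(Region(6) -> 73.6, Region(6) -> 0.5)
  val inOrder: Map[Region, Double] = applyInOrder(updates)
  val reversed: Map[Region, Double] = applyInOrder(updates.reverse)
}
```

The same set of updates yields 0.5 for Region(6) in one order and 73.6 in the other, so the final state depends on arrival order rather than on the data alone.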
Concurrent Collections
import collection.mutable._
// SynchronizedMap is deprecated since Scala 2.11 and was removed in 2.13
var synchronizedDensities = new LinkedHashMap[Region, Double]()
  with SynchronizedMap[Region, Double]
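Because `SynchronizedMap` is no longer available in current Scala versions, the following is a sketch of the same idea using `scala.collection.concurrent.TrieMap` from the standard library. This is a swapped-in alternative, not what the talk shows, and `ConcurrentDensities` and `recordFromThreads` are hypothetical names.

```scala
import scala.collection.concurrent.TrieMap

case class Region(id: Int)

object ConcurrentDensities {
  // Writes perThread distinct regions from each of `threads` threads into one
  // shared TrieMap and returns the map once every writer has finished.
  def recordFromThreads(threads: Int, perThread: Int): TrieMap[Region, Double] = {
    val densities = TrieMap.empty[Region, Double]
    val workers = (0 until threads).map { t =>
      new Thread(new Runnable {
        def run(): Unit =
          (0 until perThread).foreach { i =>
            // update is thread-safe on a TrieMap
            densities.update(Region(t * perThread + i), i.toDouble)
          }
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    densities
  }
}
```

A thread-safe map keeps concurrent writers from corrupting the structure, but two writes to the same key are still resolved by whichever arrives last.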
Scaling with Locks
Immutable Facts
import play.api.libs.json.Json
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)
// JSON formatter so readings can be written and read as documents
implicit val preyReadingFormatter = Json.format[PreyReading]
Immutable Facts
val reading = PreyReading(36,
                          12,
                          System.currentTimeMillis(),
                          12.0,
                          18.0,
                          0.60)
// readingId is not defined on the slides; presumably it builds the document key
val setDoc = bucket.set[PreyReading](readingId(reading), reading)
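One way to see why immutable facts help is to derive the current view from the facts instead of overwriting it. The sketch below uses an in-memory `Seq` as a stand-in for the database, and `FactLog` and `latestBySensor` are hypothetical names that are not part of the talk or of any database client.

```scala
// Same fields as the PreyReading on the slide.
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)

object FactLog {
  // Derives the newest reading per sensor from a collection of facts.
  // The result depends on the timestamps inside the facts, not on the
  // order in which the facts were written (assuming timestamps are
  // distinct per sensor).
  def latestBySensor(facts: Seq[PreyReading]): Map[Int, PreyReading] =
    facts.groupBy(_.sensorId).map { case (sensorId, readings) =>
      sensorId -> readings.maxBy(_.timestamp)
    }
}
```

Feeding the same readings in any order gives the same derived view, because each reading carries its own timestamp and nothing is ever overwritten.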
Scaling with Distributed Databases
Handling Incomplete Data
Distributed Data Storage
Querying Complete Data
(bucket.searchValues[PreyReading]("prey", "by_sensor_id")
  (new Query().setIncludeDocs(true)))
  .enumerate.apply(Iteratee.foreach { doc =>
    println(s"Prey Reading: $doc")
  })
Complete Data
Partition Tolerance
Querying Incomplete Data
(bucket.searchValues[PreyReading]("prey", "by_sensor_id")
  (new Query().setIncludeDocs(true)))
  .enumerate.apply(Iteratee.foreach { doc =>
    println(s"Prey Reading: $doc")
  })
Incomplete Data
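Since the query is identical in the complete and incomplete cases, one possible way to surface the difference is to carry completeness alongside the results. The sketch below is an assumption-heavy illustration: `PartialResult`, `nodesResponded`, and `nodesTotal` are hypothetical, and nothing here claims that the database client reports these counts.

```scala
// Hypothetical wrapper (not from the talk or any client library): query
// results plus how much of the cluster contributed to them.
case class PartialResult[A](values: Seq[A], nodesResponded: Int, nodesTotal: Int) {
  require(nodesTotal > 0, "cluster must have at least one node")
  require(nodesResponded >= 0 && nodesResponded <= nodesTotal,
          "responding nodes must be between 0 and nodesTotal")

  // Fraction of nodes that answered, between 0.0 and 1.0.
  def completeness: Double = nodesResponded.toDouble / nodesTotal

  // True only when every node contributed to the result.
  def isComplete: Boolean = nodesResponded == nodesTotal
}
```

With three of four nodes responding, `completeness` is 0.75 and `isComplete` is false, so a caller can treat the result as an uncertain view instead of silently assuming it saw every reading.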
Reactive Data Collection
For Later
reactivemachinelearning.com
medium.com/data-engineering
Manning
Jeff Smith
x.ai
@xdotai
hello@human.x.ai
New York, New York
skillsmatter.com/conferences/6862-scala-exchange-2015#skillscasts
Thank You