SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
Mark Hamilton, Microsoft
Anand Raman, Microsoft
Unsupervised Object
Detection using the Azure
Cognitive Services on Spark
#SAISExp4
Hamilton and Raman, #SAISExp4 2
Hamilton and Raman, #SAISExp4 3
Classifying all 1.3 million images will
take 20k hours
Hamilton and Raman, #SAISExp4 5
Hamilton and Raman, #SAISExp4 6
Deep Learning
• Brain-like algorithms trained with gradient
descent
• Has become a recent favorite because:
– Spectacular performance in many
domains
– Quick training
– Low memory
– Large space of possible model
architectures
– Automatic differentiation software makes
it very easy
Hamilton and Raman, #SAISExp4 7
8Hamilton and Raman, #SAISExp4
Tire
Dog
Scotty Gu
Input Images
Low Level
Features
Mid Level
Features
High Level
Features
Output
Classes
Filters from Zeiler + Fergus 2013
9Hamilton and Raman, #SAISExp4
Input Images
Low Level
Features
Mid Level
Features
Logistic
Regression
Output
Classes
Filters from Zeiler + Fergus 2013
Any
SparkML
Model
Other
Leopard
Predictions
ML
• High level library for distributed
machine learning
• More general than SciKit-Learn
• All models have a uniform
interface
– Can compose models into
complex pipelines
– Can save, load, and
transport models
Load Data
Tokenizer
Term Hashing
Logistic Reg
Evaluate
Pipeline
Evaluate
Load Data
● Estimator
● Transformer
Hamilton and Raman, #SAISExp4 10
 GPU/CPU acceleration
 Automated gradient calculations
 Flexible language for defining
huge space of models
 State of the art performance
and accuracy
 ONNX Runtime
 Large scale parallelism
 Fault tolerance
 Auto-scaling / elasticity
 High throughput streaming
 Data source + orchestrator
agnostic
Distributed Computing
Hamilton and Raman, #SAISExp4 11
Deep Learning
Microsoft ML for
Apache Spark
• Open Source Distributed ML
library
• Unifies major computing
paradigms and Microsoft tools
into a single library and API
– Python, Java, Scala, R
– Batch, Streaming, Serving
– Local, Static Cluster,
Dynamic Cluster,
Serverless
– Databricks, HDI, AZTK,
Kubernetes, …
www.aka.ms/spark
Hamilton and Raman, #SAISExp4 12
Scalable Deep
Learning
Stress Free
Serving
Distributed
Microservices
Fast Gradient
Boosting
CNTK on Spark Spark Serving
HTTP on Spark LightGBM on Spark
MMLSpark
Training a Deep Transfer
Learner:
Hamilton and Raman, #SAISExp4 13
Performance
With Deep Featurization,
Augmentation, and Temporal
Ensembling
Without Deep
Featurization
Accuracy 65.6% Accuracy 94.7%
Hamilton and Raman, #SAISExp4 14
…
Time
1 2 3 4 5 6
Time
1 2 3 4 5 6
Deep Network
…
Hamilton and Raman, #SAISExp4 15
Real Time monitoring
with PowerBI
Stream from Kafka /
Cosmos
OpenCV on Spark
CNTK on Spark
SparkML Logistic Reg
Stream Results to
PowerBI
Camera Image Upload in
Kyrgyzstan
MMLSpark
Contribution
27 Lines of
Code
~1k imgs/s
throughput
on CPU
~3 second
latencies
Hamilton and Raman, #SAISExp4 16
Simple APIs matter
Hamilton and Raman, #SAISExp4 17
18Hamilton and Raman, #SAISExp4
Lightning Fast Web Services on Any Spark Cluster
• Sub-millisecond latencies
• Fully Distributed
• Spins up in seconds
• Same API as Batch and
Streaming
• Scala, Python, R and Java
• Fully Open Source
www.aka.ms/spark
JIRA: SPARK-25350
Serving
113
0.9
0
50
100
150
Latency (ms)
Spark Serving Spark Continuous Serving
Announcing: 100x Latency Reduction
with MMLSpark v0.14
Web Serving with the Same API:
Hamilton and Raman, #SAISExp4 19
• 3 main modes:
– server: 1 service on the head node
– distributedServer: 1 service per
executor
– continuousServer: 1 service per
partition
• Built on top of Spark Streaming
• Each worker keeps a running
service, and a routing table
• Use HTTP on Spark objects to
represent requests and
responses in Spark SQL
Spark Serving Architecture
Hamilton and Raman, #SAISExp4 20
Going Further: No Labels, No Problem
Hamilton and Raman, #SAISExp4 21
Bing Powered Snow Leopard
Detection Pipeline
Bing Photo Search of
“Snow Leopard” +
Related terms
Logistic
Regression
Add Class +
Union +
Distinct
Truncated
ResNet50
Cloud Service
CNTK on Spark
Key
Bing Image Search of
random nouns Spark ML
Hamilton and Raman, #SAISExp4 22
Bing Image Search API
Hamilton and Raman, #SAISExp4 23
Announcing: Intelligent and Distributed
Microservices in SparkML
www.aka.ms/spark
Microsoft
Cognitive Services
+
Bing Text Vision Face
Hamilton and Raman, #SAISExp4 24
Celebrity Quote Analysis
Hamilton and Raman, #SAISExp4 25
HTTP on Spark
• Allows Spark users to call into any
HTTP service
• Leverage parallel networking of
cluster
• Allows Spark to work with
microservice architectures
• Mini-batching, asynchrony, and
buffering supported
• Statically typed and usable from
Spark, PySpark, and SparklyR
Hamilton and Raman, #SAISExp4 26
Model Interpretability with LIME
Source: Marco Tulio Ribeiro et al.
Hamilton and Raman, #SAISExp4 27
Model Interpretability with LIME
Source: Marco Tulio Ribeiro et al.
Hamilton and Raman, #SAISExp4 28
Announcing: LIME on
• Elastic and Distributed
• Works with any image classification
model
• Example notebook instructions at:
– www.aka.ms/spark
Hamilton and Raman, #SAISExp4 29
Using LIME to generate
bounding boxes
30Hamilton and Raman, #SAISExp4
Transferring LIME’s Knowledge
to FasterRCNN
31Hamilton and Raman, #SAISExp4
𝑚𝑎𝑠𝑘 = 𝑢𝑛𝑖𝑜𝑛 𝑠𝑢𝑝𝑒𝑟𝑝𝑖𝑥𝑒𝑙𝑠 𝑡𝑜𝑝 70% 𝑜𝑓 𝑤𝑒𝑖𝑔ℎ𝑡𝑠)
𝑥1, 𝑦1 = min 𝑚𝑎𝑠𝑘 𝑥 , min 𝑚𝑎𝑠𝑘 𝑦
𝑥2, 𝑦2 = max 𝑚𝑎𝑠𝑘 𝑥 , max(𝑚𝑎𝑠𝑘 𝑦)
Results
32Hamilton and Raman, #SAISExp4
Human Labels
Unsupervised
FRCNN Outputs Human Labels
Unsupervised
FRCNN Outputs
33Hamilton and Raman, #SAISExp4
Next Steps: Hotspotter
34Hamilton and Raman, #SAISExp4
Source: HotSpotter - Patterned Species Instance Recognition
In Conclusion
www.aka.ms/spark
Contributions Welcome!
github.com/Azure/mmlspark
Hamilton and Raman, #SAISExp4 35
• Big Releases in v0.14:
– Cognitive Services on Spark
– Sub-Millisecond latency
distributed web services
– Distributed Model
Interpretability
• Easy batch, streaming apps,
PowerBI dashboards, or RESTful
web services
• We aim to give all work back to the
community!
• Easy to Get Started on Databricks:
– 16 Jupiter notebook guide
MMLSpark v0.14
Thanks to
• You all!
• Microsoft AI Development Acceleration Program: Abhiram Eswaran, Ari
Green, Courtney Cochrane, Janhavi Suresh Mahajan, Karthik Rajendran,
Minsoo Thigpen, Casey Hong, Soundar Srinivasan
• MMLSpark Team: Sudarshan Raghunathan, Ilya Matiach, Eli Barzilay, Tong
Wen, Ben Brodsky
• The Snow Leopard Trust: Rhetick Sengupta, Koustubh Sharma, Jeff Brown,
Michael Despines
• Microsoft: Joseph Sirosh, Lucas Joppa, WeeHyong Tok
Get in touch: marhamil@microsoft.com
MMLSpark Website: aka.ms/spark
Hamilton and Raman, #SAISExp4 36

Más contenido relacionado

La actualidad más candente

MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...Spark Summit
 
Portable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache BeamPortable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache Beamconfluent
 
Load Balancing Projects for Master Thesis Students
Load Balancing Projects for Master Thesis StudentsLoad Balancing Projects for Master Thesis Students
Load Balancing Projects for Master Thesis StudentsPhdtopiccom
 
簡単にレビュー環境が作れる仕組みを作ってる話し
簡単にレビュー環境が作れる仕組みを作ってる話し簡単にレビュー環境が作れる仕組みを作ってる話し
簡単にレビュー環境が作れる仕組みを作ってる話し正貴 小川
 
Simulation of phase diagram and its advantages
Simulation of phase diagram and its advantages Simulation of phase diagram and its advantages
Simulation of phase diagram and its advantages Md Musavvir Mahmud
 
Rails 5 subjective overview
Rails 5 subjective overviewRails 5 subjective overview
Rails 5 subjective overviewJan Berdajs
 
MATLAB Simulation for Master Thesis
MATLAB Simulation for Master ThesisMATLAB Simulation for Master Thesis
MATLAB Simulation for Master ThesisPhdtopiccom
 
Matlab Master Thesis Writing Service
Matlab Master Thesis Writing ServiceMatlab Master Thesis Writing Service
Matlab Master Thesis Writing ServicePhdtopiccom
 
Matlab Projects USA
Matlab Projects USAMatlab Projects USA
Matlab Projects USAPhdtopiccom
 
Large Graph Processing
Large Graph ProcessingLarge Graph Processing
Large Graph ProcessingZuhair khayyat
 
A Pluggable Autoscaling System @ UCC
A Pluggable Autoscaling System @ UCCA Pluggable Autoscaling System @ UCC
A Pluggable Autoscaling System @ UCCChris Bunch
 
Matlab Thesis for Phd Students
Matlab Thesis for Phd StudentsMatlab Thesis for Phd Students
Matlab Thesis for Phd StudentsPhdtopiccom
 
Tech Talk on Autoscaling in Apache Stratos
Tech Talk on Autoscaling in Apache StratosTech Talk on Autoscaling in Apache Stratos
Tech Talk on Autoscaling in Apache StratosVishanth Bala
 
Learning Kafka Streams with Scala
Learning Kafka Streams with ScalaLearning Kafka Streams with Scala
Learning Kafka Streams with ScalaKnoldus Inc.
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBChris Bunch
 
Matlab-Assignment-Help-Qatar
Matlab-Assignment-Help-QatarMatlab-Assignment-Help-Qatar
Matlab-Assignment-Help-QatarPhdtopiccom
 
Matlab-Programming-Homework-Help
Matlab-Programming-Homework-HelpMatlab-Programming-Homework-Help
Matlab-Programming-Homework-HelpPhdtopiccom
 

La actualidad más candente (20)

MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...  MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
MLLeap, or How to Productionize Data Science Workflows Using Spark by Mikha...
 
What's new on Rails 5
What's new on Rails 5What's new on Rails 5
What's new on Rails 5
 
Portable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache BeamPortable Streaming Pipelines with Apache Beam
Portable Streaming Pipelines with Apache Beam
 
Load Balancing Projects for Master Thesis Students
Load Balancing Projects for Master Thesis StudentsLoad Balancing Projects for Master Thesis Students
Load Balancing Projects for Master Thesis Students
 
簡単にレビュー環境が作れる仕組みを作ってる話し
簡単にレビュー環境が作れる仕組みを作ってる話し簡単にレビュー環境が作れる仕組みを作ってる話し
簡単にレビュー環境が作れる仕組みを作ってる話し
 
Simulation of phase diagram and its advantages
Simulation of phase diagram and its advantages Simulation of phase diagram and its advantages
Simulation of phase diagram and its advantages
 
Rails 5 subjective overview
Rails 5 subjective overviewRails 5 subjective overview
Rails 5 subjective overview
 
MATLAB Simulation for Master Thesis
MATLAB Simulation for Master ThesisMATLAB Simulation for Master Thesis
MATLAB Simulation for Master Thesis
 
Pregel and giraph
Pregel and giraphPregel and giraph
Pregel and giraph
 
Matlab Master Thesis Writing Service
Matlab Master Thesis Writing ServiceMatlab Master Thesis Writing Service
Matlab Master Thesis Writing Service
 
Matlab Projects USA
Matlab Projects USAMatlab Projects USA
Matlab Projects USA
 
Reasonable app development
Reasonable app developmentReasonable app development
Reasonable app development
 
Large Graph Processing
Large Graph ProcessingLarge Graph Processing
Large Graph Processing
 
A Pluggable Autoscaling System @ UCC
A Pluggable Autoscaling System @ UCCA Pluggable Autoscaling System @ UCC
A Pluggable Autoscaling System @ UCC
 
Matlab Thesis for Phd Students
Matlab Thesis for Phd StudentsMatlab Thesis for Phd Students
Matlab Thesis for Phd Students
 
Tech Talk on Autoscaling in Apache Stratos
Tech Talk on Autoscaling in Apache StratosTech Talk on Autoscaling in Apache Stratos
Tech Talk on Autoscaling in Apache Stratos
 
Learning Kafka Streams with Scala
Learning Kafka Streams with ScalaLearning Kafka Streams with Scala
Learning Kafka Streams with Scala
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDB
 
Matlab-Assignment-Help-Qatar
Matlab-Assignment-Help-QatarMatlab-Assignment-Help-Qatar
Matlab-Assignment-Help-Qatar
 
Matlab-Programming-Homework-Help
Matlab-Programming-Homework-HelpMatlab-Programming-Homework-Help
Matlab-Programming-Homework-Help
 

Similar a Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton and Anand Raman

Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyJim Dowling
 
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Spark Summit
 
R4ML: An R Based Scalable Machine Learning Framework
R4ML: An R Based Scalable Machine Learning FrameworkR4ML: An R Based Scalable Machine Learning Framework
R4ML: An R Based Scalable Machine Learning FrameworkAlok Singh
 
Spark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit
 
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScyllaDB
 
Spark Summit East 2016 - MLeap Presentation
Spark Summit East 2016 -   MLeap PresentationSpark Summit East 2016 -   MLeap Presentation
Spark Summit East 2016 - MLeap PresentationMikhail Semeniuk
 
Media_Entertainment_Veriticals
Media_Entertainment_VeriticalsMedia_Entertainment_Veriticals
Media_Entertainment_VeriticalsPeyman Mohajerian
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the unionDatabricks
 
Operationalize Apache Spark Analytics
Operationalize Apache Spark AnalyticsOperationalize Apache Spark Analytics
Operationalize Apache Spark AnalyticsDatabricks
 
MATLAB Project Topics
MATLAB Project TopicsMATLAB Project Topics
MATLAB Project TopicsPhdtopiccom
 
Recent Developments In SparkR For Advanced Analytics
Recent Developments In SparkR For Advanced AnalyticsRecent Developments In SparkR For Advanced Analytics
Recent Developments In SparkR For Advanced AnalyticsDatabricks
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Databricks
 
Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Steffen Gebert
 
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...Databricks
 
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Amazon Web Services
 
Grow and Shrink - Dynamically Extending the Ruby VM Stack
Grow and Shrink - Dynamically Extending the Ruby VM StackGrow and Shrink - Dynamically Extending the Ruby VM Stack
Grow and Shrink - Dynamically Extending the Ruby VM StackKeitaSugiyama1
 
Fighting Fraud with Apache Spark
Fighting Fraud with Apache SparkFighting Fraud with Apache Spark
Fighting Fraud with Apache SparkMiklos Christine
 

Similar a Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton and Anand Raman (20)

Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
 
MLeap: Release Spark ML Pipelines
MLeap: Release Spark ML PipelinesMLeap: Release Spark ML Pipelines
MLeap: Release Spark ML Pipelines
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
 
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
 
R4ML: An R Based Scalable Machine Learning Framework
R4ML: An R Based Scalable Machine Learning FrameworkR4ML: An R Based Scalable Machine Learning Framework
R4ML: An R Based Scalable Machine Learning Framework
 
Spark Technology Center IBM
Spark Technology Center IBMSpark Technology Center IBM
Spark Technology Center IBM
 
Spark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef HabdankSpark Summit EU talk by Josef Habdank
Spark Summit EU talk by Josef Habdank
 
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and BeyondScaling Up Machine Learning Experimentation at Tubi 5x and Beyond
Scaling Up Machine Learning Experimentation at Tubi 5x and Beyond
 
Spark Summit East 2016 - MLeap Presentation
Spark Summit East 2016 -   MLeap PresentationSpark Summit East 2016 -   MLeap Presentation
Spark Summit East 2016 - MLeap Presentation
 
Media_Entertainment_Veriticals
Media_Entertainment_VeriticalsMedia_Entertainment_Veriticals
Media_Entertainment_Veriticals
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the union
 
Operationalize Apache Spark Analytics
Operationalize Apache Spark AnalyticsOperationalize Apache Spark Analytics
Operationalize Apache Spark Analytics
 
MATLAB Project Topics
MATLAB Project TopicsMATLAB Project Topics
MATLAB Project Topics
 
Recent Developments In SparkR For Advanced Analytics
Recent Developments In SparkR For Advanced AnalyticsRecent Developments In SparkR For Advanced Analytics
Recent Developments In SparkR For Advanced Analytics
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
 
Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0Monitoring Akka with Kamon 1.0
Monitoring Akka with Kamon 1.0
 
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...
Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton an...
 
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
Building Machine Learning models with Apache Spark and Amazon SageMaker | AWS...
 
Grow and Shrink - Dynamically Extending the Ruby VM Stack
Grow and Shrink - Dynamically Extending the Ruby VM StackGrow and Shrink - Dynamically Extending the Ruby VM Stack
Grow and Shrink - Dynamically Extending the Ruby VM Stack
 
Fighting Fraud with Apache Spark
Fighting Fraud with Apache SparkFighting Fraud with Apache Spark
Fighting Fraud with Apache Spark
 

Más de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Más de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Último

Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 

Último (20)

Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 

Deep Reality Simulation for Automated Poacher Detection with Mark Hamilton and Anand Raman

  • 1. Mark Hamilton, Microsoft Anand Raman, Microsoft Unsupervised Object Detection using the Azure Cognitive Services on Spark #SAISExp4
  • 2. Hamilton and Raman, #SAISExp4 2
  • 3. Hamilton and Raman, #SAISExp4 3
  • 4.
  • 5. Classifying all 1.3 million images will take 20k hours Hamilton and Raman, #SAISExp4 5
  • 6. Hamilton and Raman, #SAISExp4 6
  • 7. Deep Learning • Brain-like algorithms trained with gradient descent • Has become a recent favorite because: – Spectacular performance in many domains – Quick training – Low memory – Large space of possible model architectures – Automatic differentiation software makes it very easy Hamilton and Raman, #SAISExp4 7
  • 8. 8Hamilton and Raman, #SAISExp4 Tire Dog Scotty Gu Input Images Low Level Features Mid Level Features High Level Features Output Classes Filters from Zeiler + Fergus 2013
  • 9. 9Hamilton and Raman, #SAISExp4 Input Images Low Level Features Mid Level Features Logistic Regression Output Classes Filters from Zeiler + Fergus 2013 Any SparkML Model Other Leopard Predictions
  • 10. ML • High level library for distributed machine learning • More general than SciKit-Learn • All models have a uniform interface – Can compose models into complex pipelines – Can save, load, and transport models Load Data Tokenizer Term Hashing Logistic Reg Evaluate Pipeline Evaluate Load Data ● Estimator ● Transformer Hamilton and Raman, #SAISExp4 10
  • 11.  GPU/CPU acceleration  Automated gradient calculations  Flexible language for defining huge space of models  State of the art performance and accuracy  ONNX Runtime  Large scale parallelism  Fault tolerance  Auto-scaling / elasticity  High throughput streaming  Data source + orchestrator agnostic Distributed Computing Hamilton and Raman, #SAISExp4 11 Deep Learning
  • 12. Microsoft ML for Apache Spark • Open Source Distributed ML library • Unifies major computing paradigms and Microsoft tools into a single library and API – Python, Java, Scala, R – Batch, Streaming, Serving – Local, Static Cluster, Dynamic Cluster, Serverless – Databricks, HDI, AZTK, Kubernetes, … www.aka.ms/spark Hamilton and Raman, #SAISExp4 12 Scalable Deep Learning Stress Free Serving Distributed Microservices Fast Gradient Boosting CNTK on Spark Spark Serving HTTP on Spark LightGBM on Spark MMLSpark
  • 13. Training a Deep Transfer Learner: Hamilton and Raman, #SAISExp4 13
  • 14. Performance With Deep Featurization, Augmentation, and Temporal Ensembling Without Deep Featurization Accuracy 65.6% Accuracy 94.7% Hamilton and Raman, #SAISExp4 14
  • 15. … Time 1 2 3 4 5 6 Time 1 2 3 4 5 6 Deep Network … Hamilton and Raman, #SAISExp4 15
  • 16. Real Time monitoring with PowerBI Stream from Kafka / Cosmos OpenCV on Spark CNTK on Spark SparkML Logistic Reg Stream Results to PowerBI Camera Image Upload in Kyrgyzstan MMLSpark Contribution 27 Lines of Code ~1k imgs/s throughput on CPU ~3 second latencies Hamilton and Raman, #SAISExp4 16
  • 17. Simple APIs matter Hamilton and Raman, #SAISExp4 17
  • 18. 18Hamilton and Raman, #SAISExp4 Lightning Fast Web Services on Any Spark Cluster • Sub-millisecond latencies • Fully Distributed • Spins up in seconds • Same API as Batch and Streaming • Scala, Python, R and Java • Fully Open Source www.aka.ms/spark JIRA: SPARK-25350 Serving 113 0.9 0 50 100 150 Latency (ms) Spark Serving Spark Continuous Serving Announcing: 100x Latency Reduction with MMLSpark v0.14
  • 19. Web Serving with the Same API: Hamilton and Raman, #SAISExp4 19
  • 20. • 3 main modes: – server: 1 service on the head node – distributedServer: 1 service per executor – continuousServer: 1 service per partition • Built on top of Spark Streaming • Each worker keeps a running service, and a routing table • Use HTTP on Spark objects to represent requests and responses in Spark SQL Spark Serving Architecture Hamilton and Raman, #SAISExp4 20
  • 21. Going Further: No Labels, No Problem Hamilton and Raman, #SAISExp4 21
  • 22. Bing Powered Snow Leopard Detection Pipeline Bing Photo Search of “Snow Leopard” + Related terms Logistic Regression Add Class + Union + Distinct Truncated ResNet50 Cloud Service CNTK on Spark Key Bing Image Search of random nouns Spark ML Hamilton and Raman, #SAISExp4 22
  • 23. Bing Image Search API Hamilton and Raman, #SAISExp4 23
  • 24. Announcing: Intelligent and Distributed Microservices in SparkML www.aka.ms/spark Microsoft Cognitive Services + Bing Text Vision Face Hamilton and Raman, #SAISExp4 24
  • 25. Celebrity Quote Analysis Hamilton and Raman, #SAISExp4 25
  • 26. HTTP on Spark • Allows Spark users to call into any HTTP service • Leverage parallel networking of cluster • Allows Spark to work with microservice architectures • Mini-batching, asynchrony, and buffering supported • Statically typed and usable from Spark, PySpark, and SparklyR Hamilton and Raman, #SAISExp4 26
  • 27. Model Interpretability with LIME Source: Marco Tulio Ribeiro et al. Hamilton and Raman, #SAISExp4 27
  • 28. Model Interpretability with LIME Source: Marco Tulio Ribeiro et al. Hamilton and Raman, #SAISExp4 28
  • 29. Announcing: LIME on • Elastic and Distributed • Works with any image classification model • Example notebook instructions at: – www.aka.ms/spark Hamilton and Raman, #SAISExp4 29
  • 30. Using LIME to generate bounding boxes 30Hamilton and Raman, #SAISExp4
  • 31. Transferring LIME’s Knowledge to FasterRCNN 31Hamilton and Raman, #SAISExp4 𝑚𝑎𝑠𝑘 = 𝑢𝑛𝑖𝑜𝑛 𝑠𝑢𝑝𝑒𝑟𝑝𝑖𝑥𝑒𝑙𝑠 𝑡𝑜𝑝 70% 𝑜𝑓 𝑤𝑒𝑖𝑔ℎ𝑡𝑠) 𝑥1, 𝑦1 = min 𝑚𝑎𝑠𝑘 𝑥 , min 𝑚𝑎𝑠𝑘 𝑦 𝑥2, 𝑦2 = max 𝑚𝑎𝑠𝑘 𝑥 , max(𝑚𝑎𝑠𝑘 𝑦)
  • 32. Results 32Hamilton and Raman, #SAISExp4 Human Labels Unsupervised FRCNN Outputs Human Labels Unsupervised FRCNN Outputs
  • 34. Next Steps: Hotspotter 34Hamilton and Raman, #SAISExp4 Source: HotSpotter - Patterned Species Instance Recognition
  • 35. In Conclusion www.aka.ms/spark Contributions Welcome! github.com/Azure/mmlspark Hamilton and Raman, #SAISExp4 35 • Big Releases in v0.14: – Cognitive Services on Spark – Sub-Millisecond latency distributed web services – Distributed Model Interpretability • Easy batch, streaming apps, PowerBI dashboards, or RESTful web services • We aim to give all work back to the community! • Easy to Get Started on Databricks: – 16 Jupiter notebook guide MMLSpark v0.14
  • 36. Thanks to • You all! • Microsoft AI Development Acceleration Program: Abhiram Eswaran, Ari Green, Courtney Cochrane, Janhavi Suresh Mahajan, Karthik Rajendran, Minsoo Thigpen, Casey Hong, Soundar Srinivasan • MMLSpark Team: Sudarshan Raghunathan, Ilya Matiach, Eli Barzilay, Tong Wen, Ben Brodsky • The Snow Leopard Trust: Rhetick Sengupta, Koustubh Sharma, Jeff Brown, Michael Despines • Microsoft: Joseph Sirosh, Lucas Joppa, WeeHyong Tok Get in touch: marhamil@microsoft.com MMLSpark Website: aka.ms/spark Hamilton and Raman, #SAISExp4 36