STARBUCKS
TECHNOLOGY
Simplifying Deep Learning
with HorovodRunner at Starbucks
About the presenters
Denny Lee
Denny Lee is a Technology Evangelist with Databricks; he is a hands-on data science engineer with more than 15 years of experience developing internet-scale infrastructure, data platforms, and distributed systems for both on-premises and cloud. His key focuses surround solving complex large-scale data problems, providing not only architectural direction but also the hands-on implementation of these systems.
Vishwanath Subramanian
Vishwanath Subramanian is a Director of Data and Analytics Engineering at Starbucks. Vishwanath has over 15 years of experience with a background in distributed systems, product management, software engineering, and analytics. At Starbucks, his key focus is on providing next-generation analytics platforms and enabling large-scale data processing and machine learning to support Business Intelligence and Data Services across Starbucks.
Scenarios
• On-Demand one click Provisioning
of Seamlessly integrated
Infrastructure Bill of Material for
Data Science and Intelligent Apps.
• Secured Connectivity to Enterprise
Data Platform completely
abstracted from Analytics teams.
• Solution template containing
organization of deployments to
enable Adhoc experiments, shared
data engineering and Intelligent
App Development
• Smarter checkout experiences
• Predicting customer traffic
• Planogram Analysis
• And more…
Current State
• Complex and streaming image and video analytics problems are hard to solve
• Solving them typically involves distributing the work across multiple nodes
• But how do I run Keras + TensorFlow in a distributed environment?
Convolutional Neural Networks
[Figure: CNN architecture. Feature map sizes: 28 x 28, 28 x 28, 14 x 14. Feature extraction: convolution (32 filters), convolution (64 filters), subsampling with stride (2,2). Classification: fully connected layer with dropout; output labels shown: 0, 1, 8, 9.]
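The sizes on this slide (28 x 28, 28 x 28, 14 x 14) follow from the usual output-size formula for convolution and pooling layers. The sketch below reproduces them under stated assumptions: the 3 x 3 kernel, padding of 1, and 2 x 2 pooling window are not given on the slide, which states only the filter counts and the (2,2) stride.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

size = 28                                         # input image is 28 x 28
conv1 = output_size(size, kernel=3, padding=1)    # convolution, 32 filters (assumed 3 x 3, padded)
conv2 = output_size(conv1, kernel=3, padding=1)   # convolution, 64 filters (assumed 3 x 3, padded)
pooled = output_size(conv2, kernel=2, stride=2)   # subsampling with stride (2,2) (assumed 2 x 2 window)
flat_features = pooled * pooled * 64              # values handed to the fully connected layer

print(conv1, conv2, pooled, flat_features)
```

With unpadded convolutions the feature maps would shrink at each step instead, so the padding assumption is what keeps the first two sizes at 28 x 28.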
DEMO
Running Keras CNNs Standalone
Keras, TensorFlow, HorovodRunner, and MLflow: https://dbricks.co/2D58PDw
Introducing HorovodRunner
• HorovodRunner is a general API for running distributed deep learning workloads on Databricks using Uber’s Horovod framework
• Combining Horovod with Apache Spark’s barrier mode makes long-running deep learning training jobs on Spark more stable
• A Horovod MPI job is embedded as a Spark job using barrier execution mode
HorovodRunner
• HorovodRunner takes a Python method that contains DL training code with Horovod hooks
• It pickles that method on the driver and distributes it to the Spark workers
• The first executor collects the IP addresses of all the task executors using BarrierTaskContext
• It then triggers a Horovod job using mpirun
• Each Python MPI process loads the pickled program, deserializes it, and runs it
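The hand-off in the last bullet can be illustrated without a cluster. This is a minimal sketch, not HorovodRunner's implementation: it uses the standard library's `marshal` and `pickle` to serialize a function body and its arguments by value, then rebuilds and runs them once per simulated rank. The names `train`, `ship`, and `run_on_worker` are hypothetical, and real workers would be separate MPI processes started by `mpirun` rather than a loop.

```python
import builtins
import marshal
import pickle
import types

def train(rank, size, total_epochs):
    # Stand-in for DL training code; it uses only its arguments and builtins.
    epochs = -(-total_epochs // size)  # ceiling division
    return "rank %d/%d trains %d epochs" % (rank, size, epochs)

def ship(fn, kwargs):
    """Driver side: serialize the function's code object and its arguments."""
    return pickle.dumps({
        "code": marshal.dumps(fn.__code__),
        "name": fn.__name__,
        "kwargs": kwargs,
    })

def run_on_worker(payload, rank, size):
    """Worker side: load the payload back, rebuild the function, and run it."""
    program = pickle.loads(payload)
    code = marshal.loads(program["code"])
    fn = types.FunctionType(code, {"__builtins__": builtins}, program["name"])
    return fn(rank=rank, size=size, **program["kwargs"])

payload = ship(train, {"total_epochs": 12})
results = [run_on_worker(payload, rank, 4) for rank in range(4)]
for line in results:
    print(line)
```

Serializing the code itself, rather than a reference to it, is what lets a process that never imported the notebook run the function. The trade-off in this simplified version is that the function may not rely on globals or closures.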
HorovodRunner
driver
workers
def runCNN():
    # Feature extraction: two convolution layers, then subsampling
    model.add(Conv2D(32, …))
    model.add(Conv2D(64, …))
    model.add(MaxPooling2D(…))
    # Classification: fully connected layers ending in a 10-way softmax
    model.add(Dense(128, …))
    model.add(Dense(10, 'softmax'))
    optimizer = keras.optimizers.Adadelta(1.0)
In standalone or hvd local mode, the code runs on the driver
HorovodRunner
driver
workers
variables
def runCNN_hvd():
    hvd.init()
    config = tf.ConfigProto()
    # Original code
    runCNN()
    callbacks = []
With HorovodRunner, we wrap the original code; the code and variables are pushed to the workers
HorovodRunner
driver
workers
Variables are transferred from driver to workers
Code is executed at the workers
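In notebook code, this hand-off from driver to workers usually comes down to a couple of lines. The sketch below assumes a Databricks Runtime ML cluster where `sparkdl.HorovodRunner` and Horovod are installed; the function names `train_hvd` and `launch` and the `learning_rate` parameter are hypothetical. Imports are kept inside the functions so the definitions load anywhere and the deep learning libraries are resolved on whichever machine runs them.

```python
def train_hvd(learning_rate=1.0):
    """Training entry point; HorovodRunner runs this once per worker process."""
    import horovod.tensorflow.keras as hvd  # assumption: Horovod built with Keras support

    hvd.init()  # must be called before any other Horovod API
    # ... build the Keras model, wrap the optimizer with
    # hvd.DistributedOptimizer, and call model.fit() here ...
    return hvd.rank(), hvd.size()


def launch(np=2, **train_kwargs):
    """Driver side: ship train_hvd to `np` worker processes and run it there."""
    from sparkdl import HorovodRunner  # assumption: Databricks Runtime ML

    hr = HorovodRunner(np=np)
    return hr.run(train_hvd, **train_kwargs)
```

According to the HorovodRunner documentation, a negative value such as `np=-1` runs the function in local mode on the driver, which can be handy for debugging before moving to the cluster.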
Migrate to HorovodRunner
# Primary code differences are noted below
+ hvd.init()
+ config = tf.ConfigProto()
+ config.gpu_options.allow_growth = True
+ config.gpu_options.visible_device_list = str(hvd.local_rank())
+ epochs = int(math.ceil(12.0 / hvd.size()))
+ callbacks = [
+     hvd.callbacks.BroadcastGlobalVariablesCallback(0),
+ ]
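The `epochs` line divides a fixed training budget across workers: the constant 12.0 (presumably the single-node epoch count) is divided by `hvd.size()` and rounded up, so each worker trains for fewer epochs as workers are added. A small worked example, with a plain integer `size` standing in for `hvd.size()`:

```python
import math

def epochs_per_worker(total_epochs, size):
    """Same arithmetic as the slide: ceil(total_epochs / size)."""
    return int(math.ceil(float(total_epochs) / size))

schedule = {size: epochs_per_worker(12, size) for size in (1, 2, 4, 5, 8, 16)}
for size, epochs in schedule.items():
    print("workers=%2d  epochs each=%2d  total=%3d" % (size, epochs, size * epochs))
```

Because of the rounding, the combined total can exceed 12 when the worker count does not divide it evenly (for example, 5 workers at 3 epochs each).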
Comparing the runs using MLflow
DEMO
Object Detection
Keras, TensorFlow, HorovodRunner, and MLflow
Object Detection Approaches
R-CNN (2014)
• A region proposal algorithm gives you a set of regions in the image that are likely to contain objects.
• Each proposed region (bounding box) is run through a pre-trained AlexNet to compute the features for that bounding box.
• A support vector machine classifies the object in each box from those features.
• The box is run through a linear regression model to output tighter coordinates for the box.
• R-CNN -> Fast R-CNN -> Faster R-CNN
Rich feature hierarchies for accurate object detection and semantic segmentation - Girshick, Donahue, Darrell, Malik
Fast R-CNN - Girshick
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks - Ren, He, Girshick, Sun
Object Detection Approaches (contd.)
• YOLO: detection as a regression problem
• Not a traditional classifier
• Divides the image into a grid; each cell is responsible for predicting n bounding boxes
• Outputs a confidence score for each predicted bounding box
• Gives a probability distribution over all the classes it is trained on
• The confidence score and class prediction are combined into a score for object classification
• Based on a threshold, we determine the relevant boxes.
• All the boxes are predicted in a single pass of the image through the neural network.
You Only Look Once: Unified, Real-Time Object Detection - Redmon, Divvala, Girshick, Farhadi
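The scoring and thresholding steps on this slide can be sketched in a few lines. The box coordinates, class names, and the 0.5 threshold below are made-up illustrations, not values from the talk; a real detector would also apply non-maximum suppression, which the slide does not cover.

```python
def class_scores(confidence, class_probs):
    """Combine a box's confidence with its per-class probabilities."""
    return {label: confidence * p for label, p in class_probs.items()}

def keep_boxes(predictions, threshold=0.5):
    """Keep (box, label, score) for boxes whose best class score passes the threshold."""
    kept = []
    for box, confidence, class_probs in predictions:
        scores = class_scores(confidence, class_probs)
        label = max(scores, key=scores.get)
        if scores[label] >= threshold:
            kept.append((box, label, scores[label]))
    return kept

# Each prediction: (box as x1, y1, x2, y2), box confidence, class distribution.
predictions = [
    ((10, 20, 50, 80), 0.9, {"cup": 0.8, "lid": 0.2}),
    ((30, 40, 60, 90), 0.4, {"cup": 0.5, "lid": 0.5}),
    ((5, 5, 25, 25), 0.7, {"cup": 0.1, "lid": 0.9}),
]
detections = keep_boxes(predictions)
for box, label, score in detections:
    print(box, label, round(score, 2))
```

The second box is dropped even though the network proposed it: a low box confidence pulls every class score below the threshold.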
TALENTED TECHNOLOGISTS
DELIVERING TODAY
LEADING INTO THE FUTURE
https://www.starbucks.com/careers/

Simplify Distributed TensorFlow Training for Fast Image Categorization at Starbucks
