SlideShare una empresa de Scribd logo
1 de 61
Descargar para leer sin conexión
Distributed Deep Learning (on Hopsworks)
Jfokus, February 5th, 2019
jim_dowling
©2018 Logical Clocks AB. All Rights Reserved
Massive Increase in Compute for AI*
Distributed Systems
3.5 month-doubling time
*https://blog.openai.com/ai-and-compute
2
©2018 Logical Clocks AB. All Rights Reserved
All Roads Lead to Distribution
3
Distributed
Deep Learning
Hyper
Parameter
Optimization
Distributed
Training
Larger
Training
Datasets
Elastic
Model
Serving
Parallel
Experiments
(Commodity)
GPU Clusters
Auto
ML
©2018 Logical Clocks AB. All Rights Reserved
Model Training just One Piece of the Puzzle
4
MODEL TRAINING
[Diagram adapted from “technical debt of machine learning”]
This is a GPU
(graphical processing unit)
Deep Learning needs Big Data and GPUs
•Lots of data is needed to train
machine learning (ML) models
Big Data
Analytics
Features
DL
©2018 Logical Clocks AB. All Rights Reserved
Hopsworks – the Data and AI Platform
6
Hopsworks
Batch
Deep Learning
Streaming
©2018 Logical Clocks AB. All Rights Reserved
Hopsworks – the Data and AI Platform
7
Hopsworks
Batch
Deep Learning
Streaming
DATA
AI
Data
Sources
HopsFS
Kafka
Airflow
Stream
Processing
Batch
Processing
Feature
Store
Data
Warehouse
Deep
Learning
BI Tools &
Reporting
Notebooks
Model
Serving
Hopsworks
On-Premise, AWS, Azure, GCE
External
Service
Hopsworks
Service
Data
Sources
HopsFS
Kafka
Airflow
Stream
Processing
Batch
Processing
Feature
Store
Data
Warehouse
Deep
Learning
BI Tools &
Reporting
Notebooks
Model
Serving
Hopsworks
On-Premise, AWS, Azure, GCE
External
Service
Hopsworks
Service
BATCH
STREAMING
DEEP LEARNING
©2018 Logical Clocks AB. All Rights Reserved
ML Pipelines in Hopsworks
10
Event Data
StoreRaw Data Pre-ProcessIngest Explore/Deploy
REST API
HopsFS
FeatureStore
BI Tools
Experiment/Train
Serving
Airflow
©2018 Logical Clocks AB. All Rights Reserved
Scalable ML Pipeline
11
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
Test
Distributed Storage
Potential Bottlenecks
Object Stores (S3, GCS), HDFS, Ceph
StandaloneSingle-Host Single GPU
TensorFlow Extended (TFX) Pipeline
122/5/2019
Model Monitoring?
TF
Serving
TF Model
Analysis
TF
Estimator
TF
Transform
TF Data
Validation
©2018 Logical Clocks AB. All Rights Reserved
A Fully Distributed Deep Learning Pipeline
13
PySpark KubernetesDistributed TensorFlow
Input Data Training Data Model
DO NOT feed Spark
DataFrames to
TensorFlow
©2018 Logical Clocks AB. All Rights Reserved
Hopsworks Machine Learning Pipeline
HopsFS
PySpark Kubernetes
Tensorboard
Logs +
Experiments
Feature
Store
Training
Data
Models
Kafka
REST
API
Keras, TensorFlow, PyTorch
©2018 Logical Clocks AB. All Rights Reserved
Orchestrating ML Pipelines with Airflow
15
Airflow
Spark
ETL
Dist Training
TensorFlow
Test &
Optimize
Kubernetes
ModelServing
SparkStreaming
Monitoring
Data Problem
“Data is the hardest part of ML and the most important
piece to get right. Modelers spend most of their time
selecting and transforming features at training time and
then building the pipelines to deliver those features to
production models. Broken data is the most common cause
of problems in production ML systems”
– Uber
16
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
TestFEATURE STORE
Hopsworks’ Feature Store
- Reusability of features
between models and
teams
- Automatic backfilling of
features
- Automatic feature
documentation and
analysis
- Feature versioning
- Standardized access of
features between training
and serving
- Feature discovery
17
Without a Feature Store
18
With a Feature Store
19
Modelling Data in the Feature Store
20
•The Storage Layer: For storing feature data in the feature store
•The Metadata Layer: For storing feature metadata (versioning, feature
analysis, documentation, jobs)
•The Feature Engineering Jobs: For computing features
•The Feature Registry: A user interface to share and discover features
•The Feature Store API: For writing/reading to/from the feature store
Building Blocks of a Feature Store
21
Feature Storage
22
Feature Metadata
23
©2018 Logical Clocks AB. All Rights Reserved
24
Reading from the Feature Store (Data Scientist API)
from hops import featurestore
raw_data = spark.read.parquet(filename)
polynomial_features = raw_data.map(lambda x: x^2)
featurestore.insert_into_featuregroup(polynomial_features,
"polynomial_featuregroup")
from hops import featurestore
features_df = featurestore.get_features([
"average_attendance", "average_player_age“
])
Writing to the Feature Store (Data Engineer API)
Distribution with PySpark
25
Data
Collection
Experimentation Training Serving
Feature
Extraction
Data
Transformation
& Verification
Test
©2018 Logical Clocks AB. All Rights Reserved
PySpark gets GPUs from a Resource Manager
YARN/Kubernetes/Mesos
PySpark Application
10 GPUs,
please
©2018 Logical Clocks AB. All Rights Reserved
PySpark with GPUs in Hopsworks
27
Executor Executor
Driver
©2018 Logical Clocks AB. All Rights Reserved
PySpark Hello World
28
©2018 Logical Clocks AB. All Rights Reserved
PySpark in Code
29
Hello GPU World
1
*
Executor
print(“Hello from GPU”)
Driver
experiment.launch(..)
©2018 Logical Clocks AB. All Rights Reserved
Driver with 4 Executors
30
print(“Hello from GPU”)
Driver
print(“Hello from GPU”) print(“Hello from GPU”) print(“Hello from GPU”)
©2018 Logical Clocks AB. All Rights Reserved
Driver with 4 Executors in Code
31
©2018 Logical Clocks AB. All Rights Reserved
Hopsworks: Replicated Conda Environment
32
conda_env
conda_env conda_env conda_env conda_env
©2018 Logical Clocks AB. All Rights Reserved
33
A Conda Environment Per Project in Hopsworks
©2018 Logical Clocks AB. All Rights Reserved
34
Executor 1 Executor N
Driver
HopsFSTensorBoard
HopsFS for:
• Training/Test Datasets (Big Data)
• Model checkpointing, Model Serving
• State for Distributed Training & Parallel Experiments
Model Serving
Distributed Filesystem for Coordination
Filesystems are not good enough
Uber on Petastorm:
“[Using files] is hard to implement at large scale, especially
using modern distributed file systems such
as HDFS and S3 (these systems are typically optimized for
fast reads of large chunks of data).”
https://eng.uber.com/petastorm/
35
NVMe Disks – Game Changer
•HDFS (and S3) are designed around large blocks
(optimized to overcome slow random I/O on disks), while
new NVMe hardware supports orders of magnitude
faster random disk I/O.
•Can we support faster random disk I/O with HDFS?
-Yes with HopsFS.
36
©2018 Logical Clocks AB. All Rights Reserved
Small files on NVMe*
37
•Spotify’s HDFS:
-33% of files < 64KB in size
-42% of operations are on
files < 16KB in size
*Size Matters: Improving the Performance of Small Files in Hadoop, Middleware 2018. Niazi et al
©2018 Logical Clocks AB. All Rights Reserved
HopsFS – NVMe Performance
•HopsFS is HDFS with
Distributed Metadata
-Winner IEEE Scale Prize 2017
•Small files stored replicated
in the metadata layer on
NVMe disks*
- Read/Writes tens of 1000s of
images/second with HopsFS + NVMe
*Size Matters: Improving the Performance of Small Files in Hadoop, Middleware 2018. Niazi et al
©2018 Logical Clocks AB. All Rights Reserved
Hyperparameter Optimization
39
(Because DL Theory Sucks!)
©2018 Logical Clocks AB. All Rights Reserved
Faster Experimentation
40
LearningRate (LR): [0.0001-0.001]
NumOfLayers (NL): [5-10]
……
LR: 0.0001
NL: 5
Error: 0.35
LR: 0.0001
NL: 7
Error: 0.34
LR: 0.0005
NL: 5
Error: 0.38
LR: 0.0005
NL: 10
Error: 0.37
LR: 0.001
NL: 6
Error: 0.31
LR: 0.001
NL: 9
HyperParameter Optimization
LR: 0.001
NL: 6
Error: 0.31
TensorFlow Program
Hyperparameters
Blacklist Executor
LR: 0.001
NL: 9
Error: 0.36
©2018 Logical Clocks AB. All Rights Reserved
def train(learning_rate, dropout):
[TensorFlow Code here]
args_dict = {'learning_rate': [0.001, 0.005, 0.01],
'dropout': [0.5, 0.6]}
experiment.launch(train, args_dict)
GridSearch for Hyperparameters on HopsML
Launch 6 Spark Executors
©2018 Logical Clocks AB. All Rights Reserved
Data Parallel (Distributed) Training
42
Training Time
Generalization
Error
(Synchronous Stochastic Gradient Descent (SGD))
©2018 Logical Clocks AB. All Rights Reserved
Aggregate
Gradients
New Model
Barrier
43
GPU0 GPU1 GPU99
Distributed Filesystem
32 files 32 files 32 files
Batch Size
of 3200
Copy of the Model
on each GPU
Data Parallel Training
Frameworks for Distributed Training
©2018 Logical Clocks AB. All Rights Reserved
Distributed TensorFlow / TfOnSpark
45
TF_CONFIG
Bring your own
Distribution!
1. Start all processes for
P1,P2, G1-G4 yourself
2. Enter all IP addresses in
TF_CONFIG along with
GPU device IDs.
Parameter Servers
G1 G2 G3 G4
P1 P2
GPU Servers
TF_CONFIG
©2018 Logical Clocks AB. All Rights Reserved
Ring AllReduce
46
•Bandwidth optimal
•Builds the Ring, runs
AllReduce using MPI
and NCCL2
-Horovod
-TensorFlow 1.11+
©2018 Logical Clocks AB. All Rights Reserved
Tf CollectiveAllReduceStrategy
47
TF_CONFIG, again.
Bring your own
Distribution!
1. Start all processes for
G1-G4 yourself
2. Enter all IP addresses
in TF_CONFIG along
with GPU device IDs.
G1
G2
G3
G4
TF_CONFIG
Available from TensorFlow 1.11
©2018 Logical Clocks AB. All Rights Reserved
HopsML CollectiveAllReduceStrategy
48
•Uses Spark/YARN to add
distribution to TensorFlow’s
CollectiveAllReduceStrategy
-Automatically builds the ring
(Spark/YARN)
-Allocates GPUs to Spark Executors
https://github.com/logicalclocks/hops-examples/tree/master/tensorflow/notebooks/Distributed_Training
©2018 Logical Clocks AB. All Rights Reserved
Estimator APIs in TensorFlow
49
©2018 Logical Clocks AB. All Rights Reserved
Estimators log to the Distributed Filesystem
50
#over-simplified code – see
#notebook for full example
tf.estimator.RunConfig(
‘CollectiveAllReduceStrategy’
model_dir
tensorboard_logs
checkpoints
)
experiment.collective_all_reduce(…)
/Experiments/appId/run.ID/<nam
/Experiments/appId/run.ID/<nam
/Experiments/appId/run.ID/<nam
HopsFS (HDFS)
/Experiments/appId/run.ID/<nam
/Experiments/appId/run.ID/<nam
©2018 Logical Clocks AB. All Rights Reserved
#over-simplified code – see notebook for full example
def distributed_training():
…
rc = tf.estimator.RunConfig(‘CollectiveAllReduceStrategy’)
tf.estimator.train_and_evaluate(…)
experiment.collective_all_reduce(distributed_training)
HopsML CollectiveAllReduceStrategy with Keras/Tf
©2018 Logical Clocks AB. All Rights Reserved
Experiments/Versioning in Hopsworks
©2018 Logical Clocks AB. All Rights Reserved
53
Experiments
Tensorboard
©2018 Logical Clocks AB. All Rights Reserved
54
Experiment
Details
©2018 Logical Clocks AB. All Rights Reserved
55
Tensorboard
Visualizations
©2018 Logical Clocks AB. All Rights Reserved
Model Serving
56
Hopsworks
REST API
External
Apps
Internal
Apps
Logging
Load
Balancer
Model Serving
Containers
Streaming
Notifications
Retrain
Features:
• Canary
• Multiple Models
• Scale-Out/In
Frameworks:
 TensorFlow Serving
 MLeap for Spark
 scikit-learn
Summary
•The future of Deep Learning is Distributed
https://www.oreilly.com/ideas/distributed-tensorflow
•Hopsworks is a new Data and AI Platform with
first-class support for
Python / Deep Learning / ML / Data Governance / GPUs
57
HopsML Meetup
•https://www.meetup.com/HopsML-Stockholm/
•3 months old with already 320+ members
•Next Event in March 2019
©2018 Logical Clocks AB. All Rights Reserved
Hops / Hopsworks Open Source Projects
Contributors:
Jim Dowling, Seif Haridi, Gautier Berthou, Salman Niazi, Mahmoud Ismail, Theofilos
Kakantousis, Ermias Gebremeskel, Antonios Kouzoupis, Alex Ormenisan, Fabio Buso,
Robin Andersson, Kim Hammar, Vasileios Giannokostas, Johan Svedlund
Nordström,Rizvi Hasan, Paul Mälzer, Bram Leenders, Juan Roca, Misganu Dessalegn,
K “Sri” Srijeyanthan, Jude D’Souza, Alberto Lorente, Andre Moré, Ali Gholami, Davis
Jaunzems, Stig Viaene, Hooman Peiro, Evangelos Savvidis, Steffen Grohsschmiedt, Qi
Qi, Gayana Chandrasekara, Nikolaos Stanogias, Daniel Bali, Ioannis Kerkinos, Peter
Buechler, Pushparaj Motamari, Hamid Afzali, Wasif Malik, Lalith Suresh, Mariano
Valles, Ying Lieu, Fanti Machmount Al Samisti, Braulio Grana, Adam Alpire, Zahin
Azher Rashid, ArunaKumari Yedurupaka, Tobias Johansson , Roberto Bampi, Roshan
Sedar, August Bonds.
www.hops.io
@hopshadoop
©2018 Logical Clocks AB. All Rights Reserved
60
@logicalclocks
www.logicalclocks.com
1. Register for an account at:
www.hops.site
2. Enter your Firstname/lastname here:
https://bit.ly/2UEixTr
61

Más contenido relacionado

La actualidad más candente

Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyJim Dowling
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...Databricks
 
Deploying End-to-End Deep Learning Pipelines with ONNX
Deploying End-to-End Deep Learning Pipelines with ONNXDeploying End-to-End Deep Learning Pipelines with ONNX
Deploying End-to-End Deep Learning Pipelines with ONNXDatabricks
 
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...Kim Hammar
 
Automated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and TrackingAutomated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and TrackingDatabricks
 
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...Srivatsan Ramanujam
 
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Kim Hammar
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Jim Dowling
 
Hopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIHopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIQAware GmbH
 
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonApache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonDataWorks Summit
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Databricks
 
Operationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer NoriOperationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer NoriDatabricks
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...DataWorks Summit/Hadoop Summit
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyDatabricks
 
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan PanayotovSpark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan PanayotovDatabricks
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowDatabricks
 
Introduction to machine learning with GPUs
Introduction to machine learning with GPUsIntroduction to machine learning with GPUs
Introduction to machine learning with GPUsCarol McDonald
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...Databricks
 

La actualidad más candente (20)

Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
 
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...
 
Deploying End-to-End Deep Learning Pipelines with ONNX
Deploying End-to-End Deep Learning Pipelines with ONNXDeploying End-to-End Deep Learning Pipelines with ONNX
Deploying End-to-End Deep Learning Pipelines with ONNX
 
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
 
Automated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and TrackingAutomated Hyperparameter Tuning, Scaling and Tracking
Automated Hyperparameter Tuning, Scaling and Tracking
 
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
PyMADlib - A Python wrapper for MADlib : in-database, parallel, machine learn...
 
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021
 
Hopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIHopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AI
 
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with PythonApache Spark 2.3 boosts advanced analytics and deep learning with Python
Apache Spark 2.3 boosts advanced analytics and deep learning with Python
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
Operationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer NoriOperationalizing Machine Learning at Scale with Sameer Nori
Operationalizing Machine Learning at Scale with Sameer Nori
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
 
Apache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise CompanyApache Spark for Cyber Security in an Enterprise Company
Apache Spark for Cyber Security in an Enterprise Company
 
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan PanayotovSpark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
Spark ML with High Dimensional Labels Michael Zargham and Stefan Panayotov
 
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflowImproving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
 
Introduction to machine learning with GPUs
Introduction to machine learning with GPUsIntroduction to machine learning with GPUs
Introduction to machine learning with GPUs
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...
Model Parallelism in Spark ML Cross-Validation with Nick Pentreath and Bryan ...
 

Similar a Jfokus 2019-dowling-logical-clocks

Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018 Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018 Codemotion
 
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfDagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfHong Ong
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Joachim Schlosser
 
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018Codemotion
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...ijceronline
 
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker WorkshopAWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker WorkshopJulien SIMON
 
Diagnose Your Microservices
Diagnose Your MicroservicesDiagnose Your Microservices
Diagnose Your MicroservicesMarcus Hirt
 
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...Amazon Web Services
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V International
 
Toward Hybrid Cloud Serverless Transparency with Lithops Framework
Toward Hybrid Cloud Serverless Transparency with Lithops FrameworkToward Hybrid Cloud Serverless Transparency with Lithops Framework
Toward Hybrid Cloud Serverless Transparency with Lithops FrameworkLibbySchulze
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of MetadataJim Dowling
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesJim Dowling
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors DataWorks Summit/Hadoop Summit
 
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...Indrajit Poddar
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...Big Data Value Association
 
All AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIAll AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIJim Dowling
 
Machine Learning for Capacity Management
 Machine Learning for Capacity Management Machine Learning for Capacity Management
Machine Learning for Capacity ManagementEDB
 

Similar a Jfokus 2019-dowling-logical-clocks (20)

Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018 Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Rome 2018
 
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdfDagster - DataOps and MLOps for Machine Learning Engineers.pdf
Dagster - DataOps and MLOps for Machine Learning Engineers.pdf
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
 
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018
Deep learning beyond the learning - Jörg Schad - Codemotion Amsterdam 2018
 
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
Matlab Based High Level Synthesis Engine for Area And Power Efficient Arithme...
 
OpenPOWER Boot camp in Zurich
OpenPOWER Boot camp in ZurichOpenPOWER Boot camp in Zurich
OpenPOWER Boot camp in Zurich
 
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker WorkshopAWS re:Invent 2018 - ENT321 - SageMaker Workshop
AWS re:Invent 2018 - ENT321 - SageMaker Workshop
 
Diagnose Your Microservices
Diagnose Your MicroservicesDiagnose Your Microservices
Diagnose Your Microservices
 
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
Build, Train, and Deploy Machine Learning for the Enterprise with Amazon Sage...
 
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML AcceleratorsRISC-V & SoC Architectural Exploration for AI and ML Accelerators
RISC-V & SoC Architectural Exploration for AI and ML Accelerators
 
Toward Hybrid Cloud Serverless Transparency with Lithops Framework
Toward Hybrid Cloud Serverless Transparency with Lithops FrameworkToward Hybrid Cloud Serverless Transparency with Lithops Framework
Toward Hybrid Cloud Serverless Transparency with Lithops Framework
 
Data Science with the Help of Metadata
Data Science with the Help of MetadataData Science with the Help of Metadata
Data Science with the Help of Metadata
 
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML PipelinesPyData Meetup - Feature Store for Hopsworks and ML Pipelines
PyData Meetup - Feature Store for Hopsworks and ML Pipelines
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
 
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
 
All AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIAll AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AI
 
BSC LMS DDL
BSC LMS DDL BSC LMS DDL
BSC LMS DDL
 
Machine Learning for Capacity Management
 Machine Learning for Capacity Management Machine Learning for Capacity Management
Machine Learning for Capacity Management
 

Más de Jim Dowling

ARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfJim Dowling
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfJim Dowling
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleJim Dowling
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfJim Dowling
 
_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdfJim Dowling
 
Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Jim Dowling
 
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022Jim Dowling
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupJim Dowling
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Jim Dowling
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigmJim Dowling
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
 
GANs for Anti Money Laundering
GANs for Anti Money LaunderingGANs for Anti Money Laundering
GANs for Anti Money LaunderingJim Dowling
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingJim Dowling
 
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityInvited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityJim Dowling
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020Jim Dowling
 
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019Jim Dowling
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceJim Dowling
 
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraJim Dowling
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsJim Dowling
 
Odsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsOdsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsJim Dowling
 

Más de Jim Dowling (20)

ARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdf
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdf
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData Seattle
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
 
_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf
 
Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning
 
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science Meetup
 
Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21Hopsworks MLOps World talk june 21
Hopsworks MLOps World talk june 21
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigm
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
GANs for Anti Money Laundering
GANs for Anti Money LaunderingGANs for Anti Money Laundering
GANs for Anti Money Laundering
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowling
 
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityInvited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020
 
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in Finance
 
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
 
Odsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsOdsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on Hops
 

Último

SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 

Último (20)

SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 

Jfokus 2019-dowling-logical-clocks

  • 1. Distributed Deep Learning (on Hopsworks) Jfokus, February 5th, 2019 jim_dowling
  • 2. ©2018 Logical Clocks AB. All Rights Reserved Massive Increase in Compute for AI* Distributed Systems 3.5 month-doubling time *https://blog.openai.com/ai-and-compute 2
  • 3. ©2018 Logical Clocks AB. All Rights Reserved All Roads Lead to Distribution 3 Distributed Deep Learning Hyper Parameter Optimization Distributed Training Larger Training Datasets Elastic Model Serving Parallel Experiments (Commodity) GPU Clusters Auto ML
  • 4. ©2018 Logical Clocks AB. All Rights Reserved Model Training just One Piece of the Puzzle 4 MODEL TRAINING [Diagram adapted from “technical debt of machine learning”]
  • 5. This is a GPU (graphical processing unit) Deep Learning needs Big Data and GPUs •Lots of data is needed to train machine learning (ML) models Big Data Analytics Features DL
  • 6. ©2018 Logical Clocks AB. All Rights Reserved Hopsworks – the Data and AI Platform 6 Hopsworks Batch Deep Learning Streaming
  • 7. ©2018 Logical Clocks AB. All Rights Reserved Hopsworks – the Data and AI Platform 7 Hopsworks Batch Deep Learning Streaming DATA AI
  • 10. ©2018 Logical Clocks AB. All Rights Reserved ML Pipelines in Hopsworks 10 Event Data StoreRaw Data Pre-ProcessIngest Explore/Deploy REST API HopsFS FeatureStore BI Tools Experiment/Train Serving Airflow
  • 11. ©2018 Logical Clocks AB. All Rights Reserved Scalable ML Pipeline 11 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification Test Distributed Storage Potential Bottlenecks Object Stores (S3, GCS), HDFS, Ceph StandaloneSingle-Host Single GPU
  • 12. TensorFlow Extended (TFX) Pipeline 122/5/2019 Model Monitoring? TF Serving TF Model Analysis TF Estimator TF Transform TF Data Validation
  • 13. ©2018 Logical Clocks AB. All Rights Reserved A Fully Distributed Deep Learning Pipeline 13 PySpark KubernetesDistributed TensorFlow Input Data Training Data Model DO NOT feed Spark DataFrames to TensorFlow
  • 14. ©2018 Logical Clocks AB. All Rights Reserved Hopsworks Machine Learning Pipeline HopsFS PySpark Kubernetes Tensorboard Logs + Experiments Feature Store Training Data Models Kafka REST API Keras, TensorFlow, PyTorch
  • 15. ©2018 Logical Clocks AB. All Rights Reserved Orchestrating ML Pipelines with Airflow 15 Airflow Spark ETL Dist Training TensorFlow Test & Optimize Kubernetes ModelServing SparkStreaming Monitoring
  • 16. Data Problem “Data is the hardest part of ML and the most important piece to get right. Modelers spend most of their time selecting and transforming features at training time and then building the pipelines to deliver those features to production models. Broken data is the most common cause of problems in production ML systems” – Uber 16 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification TestFEATURE STORE
  • 17. Hopsworks’ Feature Store - Reusability of features between models and teams - Automatic backfilling of features - Automatic feature documentation and analysis - Feature versioning - Standardized access of features between training and serving - Feature discovery 17
  • 18. Without a Feature Store 18
  • 19. With a Feature Store 19
  • 20. Modelling Data in the Feature Store 20
  • 21. •The Storage Layer: For storing feature data in the feature store •The Metadata Layer: For storing feature metadata (versioning, feature analysis, documentation, jobs) •The Feature Engineering Jobs: For computing features •The Feature Registry: A user interface to share and discover features •The Feature Store API: For writing/reading to/from the feature store Building Blocks of a Feature Store 21
  • 24. ©2018 Logical Clocks AB. All Rights Reserved 24 Reading from the Feature Store (Data Scientist API) from hops import featurestore raw_data = spark.read.parquet(filename) polynomial_features = raw_data.map(lambda x: x^2) featurestore.insert_into_featuregroup(polynomial_features, "polynomial_featuregroup") from hops import featurestore features_df = featurestore.get_features([ "average_attendance", "average_player_age“ ]) Writing to the Feature Store (Data Engineer API)
  • 25. Distribution with PySpark 25 Data Collection Experimentation Training Serving Feature Extraction Data Transformation & Verification Test
  • 26. ©2018 Logical Clocks AB. All Rights Reserved PySpark gets GPUs from a Resource Manager YARN/Kubernetes/Mesos PySpark Application 10 GPUs, please
  • 27. ©2018 Logical Clocks AB. All Rights Reserved PySpark with GPUs in Hopsworks 27 Executor Executor Driver
  • 28. ©2018 Logical Clocks AB. All Rights Reserved PySpark Hello World 28
  • 29. ©2018 Logical Clocks AB. All Rights Reserved PySpark in Code 29 Hello GPU World 1 * Executor print(“Hello from GPU”) Driver experiment.launch(..)
  • 30. ©2018 Logical Clocks AB. All Rights Reserved Driver with 4 Executors 30 print(“Hello from GPU”) Driver print(“Hello from GPU”) print(“Hello from GPU”) print(“Hello from GPU”)
  • 31. ©2018 Logical Clocks AB. All Rights Reserved Driver with 4 Executors in Code 31
  • 32. ©2018 Logical Clocks AB. All Rights Reserved Hopsworks: Replicated Conda Environment 32 conda_env conda_env conda_env conda_env conda_env
  • 33. ©2018 Logical Clocks AB. All Rights Reserved 33 A Conda Environment Per Project in Hopsworks
  • 34. ©2018 Logical Clocks AB. All Rights Reserved 34 Executor 1 Executor N Driver HopsFSTensorBoard HopsFS for: • Training/Test Datasets (Big Data) • Model checkpointing, Model Serving • State for Distributed Training & Parallel Experiments Model Serving Distributed Filesystem for Coordination
  • 35. Filesystems are not good enough Uber on Petastorm: “[Using files] is hard to implement at large scale, especially using modern distributed file systems such as HDFS and S3 (these systems are typically optimized for fast reads of large chunks of data).” https://eng.uber.com/petastorm/ 35
  • 36. NVMe Disks – Game Changer •HDFS (and S3) are designed around large blocks (optimized to overcome slow random I/O on disks), while new NVMe hardware supports orders of magnitude faster random disk I/O. •Can we support faster random disk I/O with HDFS? -Yes with HopsFS. 36
  • 37. ©2018 Logical Clocks AB. All Rights Reserved Small files on NVMe* 37 •Spotify’s HDFS: -33% of files < 64KB in size -42% of operations are on files < 16KB in size *Size Matters: Improving the Performance of Small Files in Hadoop, Middleware 2018. Niazi et al
  • 38. ©2018 Logical Clocks AB. All Rights Reserved HopsFS – NVMe Performance •HopsFS is HDFS with Distributed Metadata -Winner IEEE Scale Prize 2017 •Small files stored replicated in the metadata layer on NVMe disks* - Read/Writes tens of 1000s of images/second with HopsFS + NVMe *Size Matters: Improving the Performance of Small Files in Hadoop, Middleware 2018. Niazi et al
  • 39. ©2018 Logical Clocks AB. All Rights Reserved Hyperparameter Optimization 39 (Because DL Theory Sucks!)
  • 40. ©2018 Logical Clocks AB. All Rights Reserved Faster Experimentation 40 LearningRate (LR): [0.0001-0.001] NumOfLayers (NL): [5-10] …… LR: 0.0001 NL: 5 Error: 0.35 LR: 0.0001 NL: 7 Error: 0.34 LR: 0.0005 NL: 5 Error: 0.38 LR: 0.0005 NL: 10 Error: 0.37 LR: 0.001 NL: 6 Error: 0.31 LR: 0.001 NL: 9 HyperParameter Optimization LR: 0.001 NL: 6 Error: 0.31 TensorFlow Program Hyperparameters Blacklist Executor LR: 0.001 NL: 9 Error: 0.36
  • 41. ©2018 Logical Clocks AB. All Rights Reserved def train(learning_rate, dropout): [TensorFlow Code here] args_dict = {'learning_rate': [0.001, 0.005, 0.01], 'dropout': [0.5, 0.6]} experiment.launch(train, args_dict) GridSearch for Hyperparameters on HopsML Launch 6 Spark Executors
  • 42. ©2018 Logical Clocks AB. All Rights Reserved Data Parallel (Distributed) Training 42 Training Time Generalization Error (Synchronous Stochastic Gradient Descent (SGD))
  • 43. ©2018 Logical Clocks AB. All Rights Reserved Aggregate Gradients New Model Barrier 43 GPU0 GPU1 GPU99 Distributed Filesystem 32 files 32 files 32 files Batch Size of 3200 Copy of the Model on each GPU Data Parallel Training
  • 45. ©2018 Logical Clocks AB. All Rights Reserved Distributed TensorFlow / TfOnSpark 45 TF_CONFIG Bring your own Distribution! 1. Start all processes for P1,P2, G1-G4 yourself 2. Enter all IP addresses in TF_CONFIG along with GPU device IDs. Parameter Servers G1 G2 G3 G4 P1 P2 GPU Servers TF_CONFIG
  • 46. ©2018 Logical Clocks AB. All Rights Reserved Ring AllReduce 46 •Bandwidth optimal •Builds the Ring, runs AllReduce using MPI and NCCL2 -Horovod -TensorFlow 1.11+
  • 47. ©2018 Logical Clocks AB. All Rights Reserved Tf CollectiveAllReduceStrategy 47 TF_CONFIG, again. Bring your own Distribution! 1. Start all processes for G1-G4 yourself 2. Enter all IP addresses in TF_CONFIG along with GPU device IDs. G1 G2 G3 G4 TF_CONFIG Available from TensorFlow 1.11
  • 48. ©2018 Logical Clocks AB. All Rights Reserved HopsML CollectiveAllReduceStrategy 48 •Uses Spark/YARN to add distribution to TensorFlow’s CollectiveAllReduceStrategy -Automatically builds the ring (Spark/YARN) -Allocates GPUs to Spark Executors https://github.com/logicalclocks/hops-examples/tree/master/tensorflow/notebooks/Distributed_Training
  • 49. ©2018 Logical Clocks AB. All Rights Reserved Estimator APIs in TensorFlow 49
  • 50. ©2018 Logical Clocks AB. All Rights Reserved Estimators log to the Distributed Filesystem 50 #over-simplified code – see #notebook for full example tf.estimator.RunConfig( ‘CollectiveAllReduceStrategy’ model_dir tensorboard_logs checkpoints ) experiment.collective_all_reduce(…) /Experiments/appId/run.ID/<nam /Experiments/appId/run.ID/<nam /Experiments/appId/run.ID/<nam HopsFS (HDFS) /Experiments/appId/run.ID/<nam /Experiments/appId/run.ID/<nam
  • 51. ©2018 Logical Clocks AB. All Rights Reserved #over-simplified code – see notebook for full example def distributed_training(): … rc = tf.estimator.RunConfig(‘CollectiveAllReduceStrategy’) tf.estimator.train_and_evaluate(…) experiment.collective_all_reduce(distributed_training) HopsML CollectiveAllReduceStrategy with Keras/Tf
  • 52. ©2018 Logical Clocks AB. All Rights Reserved Experiments/Versioning in Hopsworks
  • 53. ©2018 Logical Clocks AB. All Rights Reserved 53 Experiments Tensorboard
  • 54. ©2018 Logical Clocks AB. All Rights Reserved 54 Experiment Details
  • 55. ©2018 Logical Clocks AB. All Rights Reserved 55 Tensorboard Visualizations
  • 56. ©2018 Logical Clocks AB. All Rights Reserved Model Serving 56 Hopsworks REST API External Apps Internal Apps Logging Load Balancer Model Serving Containers Streaming Notifications Retrain Features: • Canary • Multiple Models • Scale-Out/In Frameworks:  TensorFlow Serving  MLeap for Spark  scikit-learn
  • 57. Summary •The future of Deep Learning is Distributed https://www.oreilly.com/ideas/distributed-tensorflow •Hopsworks is a new Data and AI Platform with first-class support for Python / Deep Learning / ML / Data Governance / GPUs 57
  • 58. HopsML Meetup •https://www.meetup.com/HopsML-Stockholm/ •3 months old with already 320+ members •Next Event in March 2019
  • 59. ©2018 Logical Clocks AB. All Rights Reserved Hops / Hopsworks Open Source Projects Contributors: Jim Dowling, Seif Haridi, Gautier Berthou, Salman Niazi, Mahmoud Ismail, Theofilos Kakantousis, Ermias Gebremeskel, Antonios Kouzoupis, Alex Ormenisan, Fabio Buso, Robin Andersson, Kim Hammar, Vasileios Giannokostas, Johan Svedlund Nordström,Rizvi Hasan, Paul Mälzer, Bram Leenders, Juan Roca, Misganu Dessalegn, K “Sri” Srijeyanthan, Jude D’Souza, Alberto Lorente, Andre Moré, Ali Gholami, Davis Jaunzems, Stig Viaene, Hooman Peiro, Evangelos Savvidis, Steffen Grohsschmiedt, Qi Qi, Gayana Chandrasekara, Nikolaos Stanogias, Daniel Bali, Ioannis Kerkinos, Peter Buechler, Pushparaj Motamari, Hamid Afzali, Wasif Malik, Lalith Suresh, Mariano Valles, Ying Lieu, Fanti Machmount Al Samisti, Braulio Grana, Adam Alpire, Zahin Azher Rashid, ArunaKumari Yedurupaka, Tobias Johansson , Roberto Bampi, Roshan Sedar, August Bonds. www.hops.io @hopshadoop
  • 60. ©2018 Logical Clocks AB. All Rights Reserved 60 @logicalclocks www.logicalclocks.com
  • 61. 1. Register for an account at: www.hops.site 2. Enter your Firstname/lastname here: https://bit.ly/2UEixTr 61