SlideShare una empresa de Scribd logo
1 de 114
Descargar para leer sin conexión
H2O at Poznan R Meetup
Introduction to H2O, IoT Use Cases and Deep Water
Jo-fai (Joe) Chow
Data Scientist
joe@h2o.ai
@matlabulous
Poznan R
20th April, 2017
About Me
• Civil (Water) Engineer
• 2010 – 2015
• Consultant (UK)
• Utilities
• Asset Management
• Constrained Optimization
• Industrial PhD (UK)
• Infrastructure Design Optimization
• Machine Learning +
Water Engineering
• Discovered H2O in 2014
• Data Scientist
• 2015
• Virgin Media (UK)
• Domino Data Lab (Silicon Valley)
• 2016 – Present
• H2O.ai (Silicon Valley)
2
About Me
3
Side Project #1 – Crime Data Visualization
4
https://github.com/woobe/rApps/tree/master/crimemap
http://insidebigdata.com/2013/11/30/visualization-week-crimemap/
Side Project #2 – Data Visualization Contest
5
https://github.com/woobe/rugsmaps http://blog.revolutionanalytics.com/2014/08/winner-for-revolution-analytics-user-group-map-contest.html
Side Project #3
6
Developing R Packages for Fun
rPlotter (2014)
Side Project #4 – Kaggle Blog Post
7
R + H2O + Domino for Kaggle
Guest Blog Post for Domino & H2O (2014)
• The Long Story
• bit.ly/joe_kaggle_story
Agenda
• Introduction
• Company
• Machine Learning Platform
• IoT Use Cases
• Predictive Maintenance
• Outlier Detection
• Break (25 mins)
• Deep Water
• Motivation / Benefits
• Interfaces / Demo
8
About H2O.ai
9
Company Overview
Founded 2011 Venture-backed, debuted in 2012
Products • H2O Open Source In-Memory AI Prediction Engine
• Sparkling Water
• Steam
Mission Operationalize Data Science, and provide a platform for users to build beautiful data products
Team 70 employees
• Distributed Systems Engineers doing Machine Learning
• World-class visualization designers
Headquarters Mountain View, CA
10
11
Our Team
Joe
Scientific Advisory Council
12
13
0
10000
20000
30000
40000
50000
60000
70000
1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16
# H2O Users
H2O Community Growth
Tremendous Momentum Globally
65,000+ users globally
(Sept 2016)
• 65,000+ users from
~8,000 companies in 140
countries. Top 5 from:
Large User Circle
* DATA FROM GOOGLE ANALYTICS EMBEDDED IN THE END USER PRODUCT
14
0
2000
4000
6000
8000
10000
1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16
# Companies Using H2O ~8,000+ companies
(Sept 2016)
+127%
+60%
#AroundTheWorldWithH2Oai
15
H2O for Kaggle Competitions
16
H2O for Academic Research
17
http://www.sciencedirect.com/science/article/pii/S0377221716308657
https://arxiv.org/abs/1509.01199
Users In Various Verticals Adore H2O
Financial Insurance MarketingTelecom Healthcare
18
19
Joe (2015)
http://www.h2o.ai/gartner-magic-quadrant/
20
Check
out our
website
h2o.ai
H2O Machine Learning Platform
21
22
23
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
24
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
25
Import Data from
Multiple Sources
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
26
Fast, Scalable & Distributed
Compute Engine Written in
Java
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
27
Fast, Scalable & Distributed
Compute Engine Written in
Java
Supervised Learning
• Generalized Linear Models: Binomial,
Gaussian, Gamma, Poisson and Tweedie
• Naïve Bayes
Statistical
Analysis
Ensembles
• Distributed Random Forest: Classification
or regression models
• Gradient Boosting Machine: Produces an
ensemble of decision trees with increasing
refined approximations
Deep Neural
Networks
• Deep learning: Create multi-layer feed
forward neural networks starting with an
input layer followed by multiple layers of
nonlinear transformations
Algorithms Overview
Unsupervised Learning
• K-means: Partitions observations into k
clusters/groups of the same spatial size.
Automatically detect optimal k
Clustering
Dimensionality
Reduction
• Principal Component Analysis: Linearly transforms
correlated variables to independent components
• Generalized Low Rank Models: extend the idea of
PCA to handle arbitrary data consisting of numerical,
Boolean, categorical, and missing data
Anomaly
Detection
• Autoencoders: Find outliers using a
nonlinear dimensionality reduction using
deep learning
28
H2O Deep Learning in Action
29
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
30
Multiple Interfaces
H2O + R
31
Package ‘h2o’ from CRAN
or H2O’s website
Start a local H2O (Java
Virtual Machine) cluster
Simple ‘iris’ example
H2O + R
32
H2O + Python
33
34
H2O Flow (Web) Interface
HDFS
S3
NFS
Distributed
In-Memory
Load Data
Loss-less
Compression
H2O Compute Engine
Production Scoring Environment
Exploratory &
Descriptive
Analysis
Feature
Engineering &
Selection
Supervised &
Unsupervised
Modeling
Model
Evaluation &
Selection
Predict
Data & Model
Storage
Model Export:
Plain Old Java Object
Your
Imagination
Data Prep Export:
Plain Old Java Object
Local
SQL
High Level Architecture
35
Export Standalone Models
for Production
H2O Overview
36
37
docs.h2o.ai
H2O Tutorials
• Introduction to Machine
Learning with H2O and Python
• Basic Extract, Transform and Load
(ETL)
• Supervised Learning
• Parameters Tuning
• Stacking
• GitHub Repository
• bit.ly/joe_h2o_tutorials
• R Code Examples (available soon)
38
H2O IoT Use Cases
Predictive Maintenance and Outlier Detection
39
Predictive Maintenance – SECOM Dataset
40
Predictive Maintenance – SECOM Dataset
41
We want to predict fails in the future.
Predictive Maintenance – SECOM Dataset
42
Predictive Maintenance – SECOM Dataset
• Dataset Summary
• Inputs:
• 591 numerical features
• Binary Outcome:
• Pass (-1)
• Fail (1)
• Size:
• 1567 samples
43
Basic H2O Usage – GBM with Default Settings
• Link to Jupyter Notebook
• https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive
_maintenance/step_01_basics.ipynb
44
45
H2O’s R package
Start a local H2O Cluster (JVM)
“nthreads = -1” means using all
available virtual cores
Information of the
H2O Cluster
46
Import data into H2O cluster
(instead of R’s memory)
47
Convert numerical to
categorical values
48
49
Split data into training / test in
order to evaluate out-of-bag
performance later
50
H2O automatically ignore
columns with constant values
Classification Performance – Confusion Matrix
51
Confusion Matrix
52
53
Looking at metrics based on
training (in-sample) data only
It doesn’t represent out-of-
bag performance
54
Users could reduce number of
features based on these
findings
55
Metrics based on test (out-of-
bag) samples
56
H2O returns predicted class as
well as probabilities of each class
Advanced H2O Usage – Random Grid Search
• Link to Jupyter Notebook
• https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_
maintenance/step_02_random_grid_search.ipynb
• Using Random Grid Search to fine-tune hyper-parameters
57
58
API for performing a Random
Grid Search
59
Grid search results sorted by
specific metric. Best model on top.
60
Extract the best model
61
Performance metrics based on
5-fold cross-validation
Comparison – Default vs. Tuned
62GBM with Default Settings GBM with Random Grid Search
Still not perfect but
better than default
settings
H2O Tutorials
• Introduction to Machine
Learning with H2O and Python
• Basic Extract, Transform and Load
(ETL)
• Supervised Learning
• Parameters Tuning
• Stacking
• GitHub Repository
• bit.ly/joe_h2o_tutorials
• R Code Examples (available soon)
63
Outlier Detection
64
Photo credit: www.dbta.com
Outlier Detection – MNIST Dataset
65
Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
Outlier Detection – MNIST Dataset
• 784 Inputs
• 28 x 28 = 784 pixels
• 1 Output
• 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9
• File (from Kaggle)
• Train (42k Records)
• kaggle_mnist_train.csv.gz
• Source
• https://www.kaggle.com/c/digit-
recognizer/data
66
Photo credit: https://ml4a.github.io/ml4a/neural_networks/
Advanced H2O Usage – Random Grid Search
• Link to Data and Code
• https://github.com/woobe/h2o_tutorials/tree/master/use_cases/outlier_det
ection
67
68
69
Import data
(directly from a gz)
Remove the label (not needed
for unsupervised learning)
70
Input =
784 pixels
Output =
784 pixels
Nodes =
50
71
Quantify the average
reconstruction error per sample
- ) / no. of pixels = Avg. Error(
72
Helper functions for
plotting graphs only
(not related to H2O
algorithms)
73
74
Reconstruction
Original Image
75
Reconstruction
Original Image
76
Reconstruction
Original Image
77
End of First Talk
Let’s have a break ☺
78
H2O Deep Water
H2O’s Integration with TensorFlow, mxnet and Caffe
79
TensorFlow
• Open source machine learning
framework by Google
• Python / C++ API
• TensorBoard
• Data Flow Graph Visualization
• Multi CPU / GPU
• v0.8+ distributed machines support
• Multi devices support
• desktop, server and Android devices
• Image, audio and NLP applications
• HUGE Community
• Support for Spark, Windows …
80
https://github.com/tensorflow/tensorflow
81
https://github.com/dmlc/mxnet
https://www.zeolearn.com/magazine/amazon-to-use-mxnet-as-deep-learning-framework
Caffe
• Convolution Architecture For
Feature Extraction (CAFFE)
• Pure C++ / CUDA architecture for
deep learning
• Command line, Python and
MATLAB interface
• Model Zoo
• Open collection of models
82
https://docs.google.com/presentation/d/1UeKXVgRvvxg9OUdh_UiC5G71UMscNPlvArsWER41PsU/
H2O Deep Learning
83
Both TensorFlow and H2O are widely used
84
TensorFlow , MXNet, Caffe and H2O DL
democratize the power of deep learning.
H2O platform democratizes artificial
intelligence & big data science.
There are other open source deep learning libraries like Theano and Torch too.
Let’s have a party, this will be fun!
85
86
Deep Water
Next-Gen Distributed Deep Learning with H2O
H2O integrates with existing GPU backends
for significant performance gains
One Interface - GPU Enabled - Significant Performance Gains
Inherits All H2O Properties in Scalability, Ease of Use and Deployment
Recurrent Neural Networks
enabling natural language processing,
sequences, time series, and more
Convolutional Neural Networks enabling
Image, video, speech recognition
Hybrid Neural Network Architectures
enabling speech to text translation, image
captioning, scene parsing and more
Deep Water
87
Deep Water Architecture
Node 1 Node N
Scala
Spark
H2O
Java
Execution Engine
TensorFlow/mxnet/Caffe
C++
GPU CPU
TensorFlow/mxnet/Caffe
C++
GPU CPU
RPC
R/Py/Flow/Scala client
REST API
Web server
H2O
Java
Execution Engine
grpc/MPI/RDMA
Scala
Spark
88
Available Networks in Deep Water
• LeNet
• AlexNet
• VGGNet
• Inception (GoogLeNet)
• ResNet (Deep Residual
Learning)
• Build Your Own
89
ResNet
90
Example: Deep Water + H2O Flow
Choosing different network structures
91
Choosing different backends
(TensorFlow, MXNet, Caffe)
Unified Interface (Deep Water + R)
92
Choosing different network structures
Unified Interface (Deep Water + Python)
93
Change backend to
“mxnet”, “caffe” or “auto”
Choosing different network structures
Easy Stacking with other H2O Models
94
Ensemble of Deep Water, Gradient Boosting
Machine & Random Forest models
95
docs.h2o.ai
96
https://github.com/h2oai/h2o-3/tree/master/examples/deeplearning/notebooks
Deep Water
Example notebooks
Deep Water Cat/Dog/Mouse
Demo
97
Deep Water R Demo
• H2O + MXNet + TensorFlow
• Dataset – Cat/Dog/Mouse
• MXNet & TF as GPU backend
• Train LeNet (CNN) models
• R Demo
• Code and Data
• github.com/h2oai/deepwater
98
Data – Cat/Dog/Mouse Images
99
Data – CSV
100
Deep Water – Basic Usage
Live Demo if Possible
101
Start and Connect to H2O Deep Water Cluster
102
• Download Latest Nightly Build
• https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar
• In Terminal
• cd to the folder containing h2o.jar
• java –jar h2o.jar (this is the default command)
• java –jar –Xmx16g h2o.jar (this is the command to allocate 16GB of memory)
• In R
• library(h2o) (latest stable release from h2o.ai website or CRAN)
• h2o.connect(ip = “xxx.xxx.xxx.xxx”, strict_version_check = FALSE)
Import CSV
103
Train a CNN (LeNet) Model on GPU
104
Train a CNN (LeNet) Model on GPU
105
Using GPU for training
Model
106
Deep Water – Custom Network
107
Xxx
108
Saving the custom network
structure as a file
Configure custom
network structure
(MXNet syntax)
Train a Custom Network
109
Point it to the custom
network structure file
Model
110
Conclusions
111
Project “Deep Water”
• H2O + TF + MXNet + Caffe
• A powerful combination of widely
used open source machine
learning libraries.
• All Goodies from H2O
• Inherits all H2O properties in
scalability, ease of use and
deployment.
• Unified Interface
• Allows users to build, stack and
deploy deep learning models from
different libraries efficiently.
112
• Latest Nightly Build
• https://s3.amazonaws.com/h2o-
deepwater/public/nightly/latest/h
2o.jar
• 100% Open Source
• The party will get bigger!
Other H2O Developments
• H2O + xgboost [Link]
• Stacked Ensembles [Link]
• Automatic Machine Learning
[Link]
• Time Series [Link]
• High Availability Mode in
Sparkling Water [Link]
• Model Interpretation [Link]
• word2vec [Link]
113
• Previous Talks
• https://github.com/h2oai/h2o-
meetups/blob/master/2017_04_0
6_Amsterdam/2017_04_06_Latest
_H2O_Developments.pdf
• Organizers & Sponsors
• Poznan R Users Group (PAZUR)
• H2O.ai
114
Thanks!
• Code, Slides & Documents
• bit.ly/h2o_meetups
• docs.h2o.ai
• Contact
• joe@h2o.ai
• @matlabulous
• github.com/woobe
• Please search/ask questions on
Stack Overflow
• Use the tag `h2o` (not H2 zero)

Más contenido relacionado

La actualidad más candente

H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneJo-fai Chow
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"Jo-fai Chow
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonSri Ambati
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEJo-fai Chow
 
H2O Big Join Slides
H2O Big Join SlidesH2O Big Join Slides
H2O Big Join SlidesSri Ambati
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LASri Ambati
 
Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2Oodsc
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use CasesJo-fai Chow
 
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, AetnaFrom H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, AetnaSri Ambati
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OSri Ambati
 
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyMaking Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyJo-fai Chow
 
ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217Sri Ambati
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleSri Ambati
 
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIAH2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIASri Ambati
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling WaterSri Ambati
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2OIan Gomez
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Sri Ambati
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OSri Ambati
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEJo-fai Chow
 

La actualidad más candente (19)

H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
 
H2O Big Join Slides
H2O Big Join SlidesH2O Big Join Slides
H2O Big Join Slides
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LA
 
Scalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2OScalable Data Science and Deep Learning with H2O
Scalable Data Science and Deep Learning with H2O
 
H2O Machine Learning Use Cases
H2O Machine Learning Use CasesH2O Machine Learning Use Cases
H2O Machine Learning Use Cases
 
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, AetnaFrom H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
From H2O to Steam - Dr. Bingwei Liu, Sr. Data Engineer, Aetna
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and ShinyMaking Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny
 
ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217ArnoCandelAIFrontiers011217
ArnoCandelAIFrontiers011217
 
Intro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize SeattleIntro to H2O Machine Learning in Python - Galvanize Seattle
Intro to H2O Machine Learning in Python - Galvanize Seattle
 
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIAH2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
 
Applying Machine Learning using H2O
Applying Machine Learning using H2OApplying Machine Learning using H2O
Applying Machine Learning using H2O
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
Automatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIMEAutomatic and Interpretable Machine Learning in R with H2O and LIME
Automatic and Interpretable Machine Learning in R with H2O and LIME
 

Similar a H2O at Poznan R Meetup

Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterSri Ambati
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonJo-fai Chow
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIMatt Stubbs
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R MeetupSri Ambati
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSSri Ambati
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OSri Ambati
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
 
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2OIntroduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2OData Science Milan
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Sri Ambati
 
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetupH2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetupPyData Piraeus
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Demi Ben-Ari
 
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityWes McKinney
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformGoDataDriven
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Codemotion
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Demi Ben-Ari
 
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdfOSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdfAltinity Ltd
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupSri Ambati
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to ProductionMostafa Majidpour
 
Machine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2OMachine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2OSri Ambati
 

Similar a H2O at Poznan R Meetup (20)

Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
 
Introduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and PythonIntroduction to Machine Learning with H2O and Python
Introduction to Machine Learning with H2O and Python
 
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AIBig Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
Big Data LDN 2017: H2O.ai Driverless AI: Fast, Accurate, Interpretable AI
 
Berlin R Meetup
Berlin R MeetupBerlin R Meetup
Berlin R Meetup
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Hambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2OHambug R Meetup - Intro to H2O
Hambug R Meetup - Intro to H2O
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Milan 2017 - D...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2OIntroduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
Introduction to Machine Learning with H2O - Jo-Fai (Joe) Chow, H2O
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
 
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetupH2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
H2o.ai presentation at 2nd Virtual Pydata Piraeus meetup
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
 
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
 
Workshop on Google Cloud Data Platform
Workshop on Google Cloud Data PlatformWorkshop on Google Cloud Data Platform
Workshop on Google Cloud Data Platform
 
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems Done "The Simple Way" - Demi Ben-Ari - Codemotion...
 
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
 
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdfOSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
OSA Con 2022 - Scaling your Pandas Analytics with Modin - Doris Lee - Ponder.pdf
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville Meetup
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
Machine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2OMachine Learning on Google Cloud with H2O
Machine Learning on Google Cloud with H2O
 

Más de Jo-fai Chow

Kaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesKaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesJo-fai Chow
 
Using H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters OptimizationUsing H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters OptimizationJo-fai Chow
 
Introduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing ValuesIntroduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing ValuesJo-fai Chow
 
Improving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters TuningImproving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters TuningJo-fai Chow
 
Kaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunitiesKaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunitiesJo-fai Chow
 
Deploying your Predictive Models as a Service via Domino
Deploying your Predictive Models as a Service via DominoDeploying your Predictive Models as a Service via Domino
Deploying your Predictive Models as a Service via DominoJo-fai Chow
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...Jo-fai Chow
 
Designing Sustainable Drainage Systems
Designing Sustainable Drainage SystemsDesigning Sustainable Drainage Systems
Designing Sustainable Drainage SystemsJo-fai Chow
 
Developing a New Decision Support System for SuDS
Developing a New Decision Support System for SuDSDeveloping a New Decision Support System for SuDS
Developing a New Decision Support System for SuDSJo-fai Chow
 
Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)Jo-fai Chow
 
Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I, Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I, Jo-fai Chow
 
Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)Jo-fai Chow
 
Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)Jo-fai Chow
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...Jo-fai Chow
 

Más de Jo-fai Chow (14)

Kaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New OpportunitiesKaggle Competitions, New Friends, New Skills and New Opportunities
Kaggle Competitions, New Friends, New Skills and New Opportunities
 
Using H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters OptimizationUsing H2O Random Grid Search for Hyper-parameters Optimization
Using H2O Random Grid Search for Hyper-parameters Optimization
 
Introduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing ValuesIntroduction to Generalised Low-Rank Model and Missing Values
Introduction to Generalised Low-Rank Model and Missing Values
 
Improving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters TuningImproving Model Predictions via Stacking and Hyper-parameters Tuning
Improving Model Predictions via Stacking and Hyper-parameters Tuning
 
Kaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunitiesKaggle competitions, new friends, new skills and new opportunities
Kaggle competitions, new friends, new skills and new opportunities
 
Deploying your Predictive Models as a Service via Domino
Deploying your Predictive Models as a Service via DominoDeploying your Predictive Models as a Service via Domino
Deploying your Predictive Models as a Service via Domino
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
 
Designing Sustainable Drainage Systems
Designing Sustainable Drainage SystemsDesigning Sustainable Drainage Systems
Designing Sustainable Drainage Systems
 
Developing a New Decision Support System for SuDS
Developing a New Decision Support System for SuDSDeveloping a New Decision Support System for SuDS
Developing a New Decision Support System for SuDS
 
Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)Udacity Statement (Introduction to Statistics, August 2012)
Udacity Statement (Introduction to Statistics, August 2012)
 
Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I, Coursera Statement (Computational Investing, Part I,
Coursera Statement (Computational Investing, Part I,
 
Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)Coursera Statement (Computing for Data Analysis, Oct 2013)
Coursera Statement (Computing for Data Analysis, Oct 2013)
 
Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)Coursera Statement (Data Analysis, Mar 2013)
Coursera Statement (Data Analysis, Mar 2013)
 
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
A Systematic, Multi-Criteria Decision Support Framework for Sustainable Drain...
 

Último

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Intelisync
 

Último (20)

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)
 

H2O at Poznan R Meetup

  • 1. H2O at Poznan R Meetup Introduction to H2O, IoT Use Cases and Deep Water Jo-fai (Joe) Chow Data Scientist joe@h2o.ai @matlabulous Poznan R 20th April, 2017
  • 2. About Me • Civil (Water) Engineer • 2010 – 2015 • Consultant (UK) • Utilities • Asset Management • Constrained Optimization • Industrial PhD (UK) • Infrastructure Design Optimization • Machine Learning + Water Engineering • Discovered H2O in 2014 • Data Scientist • 2015 • Virgin Media (UK) • Domino Data Lab (Silicon Valley) • 2016 – Present • H2O.ai (Silicon Valley) 2
  • 4. Side Project #1 – Crime Data Visualization 4 https://github.com/woobe/rApps/tree/master/crimemap http://insidebigdata.com/2013/11/30/visualization-week-crimemap/
  • 5. Side Project #2 – Data Visualization Contest 5 https://github.com/woobe/rugsmaps http://blog.revolutionanalytics.com/2014/08/winner-for-revolution-analytics-user-group-map-contest.html
  • 6. Side Project #3 6 Developing R Packages for Fun rPlotter (2014)
  • 7. Side Project #4 – Kaggle Blog Post 7 R + H2O + Domino for Kaggle Guest Blog Post for Domino & H2O (2014) • The Long Story • bit.ly/joe_kaggle_story
  • 8. Agenda • Introduction • Company • Machine Learning Platform • IoT Use Cases • Predictive Maintenance • Outlier Detection • Break (25 mins) • Deep Water • Motivation / Benefits • Interfaces / Demo 8
  • 10. Company Overview Founded 2011 Venture-backed, debuted in 2012 Products • H2O Open Source In-Memory AI Prediction Engine • Sparkling Water • Steam Mission Operationalize Data Science, and provide a platform for users to build beautiful data products Team 70 employees • Distributed Systems Engineers doing Machine Learning • World-class visualization designers Headquarters Mountain View, CA 10
  • 13. 13
  • 14. 0 10000 20000 30000 40000 50000 60000 70000 1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16 # H2O Users H2O Community Growth Tremendous Momentum Globally 65,000+ users globally (Sept 2016) • 65,000+ users from ~8,000 companies in 140 countries. Top 5 from: Large User Circle * DATA FROM GOOGLE ANALYTICS EMBEDDED IN THE END USER PRODUCT 14 0 2000 4000 6000 8000 10000 1-Jan-15 1-Jul-15 1-Jan-16 1-Oct-16 # Companies Using H2O ~8,000+ companies (Sept 2016) +127% +60%
  • 16. H2O for Kaggle Competitions 16
  • 17. H2O for Academic Research 17 http://www.sciencedirect.com/science/article/pii/S0377221716308657 https://arxiv.org/abs/1509.01199
  • 18. Users In Various Verticals Adore H2O Financial Insurance MarketingTelecom Healthcare 18
  • 21. H2O Machine Learning Platform 21
  • 22. 22
  • 23. 23
  • 24. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 24
  • 25. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 25 Import Data from Multiple Sources
  • 26. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 26 Fast, Scalable & Distributed Compute Engine Written in Java
  • 27. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 27 Fast, Scalable & Distributed Compute Engine Written in Java
  • 28. Supervised Learning • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie • Naïve Bayes Statistical Analysis Ensembles • Distributed Random Forest: Classification or regression models • Gradient Boosting Machine: Produces an ensemble of decision trees with increasing refined approximations Deep Neural Networks • Deep learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations Algorithms Overview Unsupervised Learning • K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detect optimal k Clustering Dimensionality Reduction • Principal Component Analysis: Linearly transforms correlated variables to independent components • Generalized Low Rank Models: extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data Anomaly Detection • Autoencoders: Find outliers using a nonlinear dimensionality reduction using deep learning 28
  • 29. H2O Deep Learning in Action 29
  • 30. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 30 Multiple Interfaces
  • 31. H2O + R 31 Package ‘h2o’ from CRAN or H2O’s website Start a local H2O (Java Virtual Machine) cluster Simple ‘iris’ example
  • 34. 34 H2O Flow (Web) Interface
  • 35. HDFS S3 NFS Distributed In-Memory Load Data Loss-less Compression H2O Compute Engine Production Scoring Environment Exploratory & Descriptive Analysis Feature Engineering & Selection Supervised & Unsupervised Modeling Model Evaluation & Selection Predict Data & Model Storage Model Export: Plain Old Java Object Your Imagination Data Prep Export: Plain Old Java Object Local SQL High Level Architecture 35 Export Standalone Models for Production
  • 38. H2O Tutorials • Introduction to Machine Learning with H2O and Python • Basic Extract, Transform and Load (ETL) • Supervised Learning • Parameters Tuning • Stacking • GitHub Repository • bit.ly/joe_h2o_tutorials • R Code Examples (available soon) 38
  • 39. H2O IoT Use Cases Predictive Maintenance and Outlier Detection 39
  • 40. Predictive Maintenance – SECOM Dataset 40
  • 41. Predictive Maintenance – SECOM Dataset 41 We want to predict fails in the future.
  • 42. Predictive Maintenance – SECOM Dataset 42
  • 43. Predictive Maintenance – SECOM Dataset • Dataset Summary • Inputs: • 591 numerical features • Binary Outcome: • Pass (-1) • Fail (1) • Size: • 1567 samples 43
  • 44. Basic H2O Usage – GBM with Default Settings • Link to Jupyter Notebook • https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive _maintenance/step_01_basics.ipynb 44
  • 45. 45 H2O’s R package Start a local H2O Cluster (JVM) “nthreads = -1” means using all available virtual cores Information of the H2O Cluster
  • 46. 46 Import data into H2O cluster (instead of R’s memory)
  • 48. 48
  • 49. 49 Split data into training / test in order to evaluate out-of-bag performance later
  • 50. 50 H2O automatically ignore columns with constant values
  • 51. Classification Performance – Confusion Matrix 51
  • 53. 53 Looking at metrics based on training (in-sample) data only It doesn’t represent out-of- bag performance
  • 54. 54 Users could reduce number of features based on these findings
  • 55. 55 Metrics based on test (out-of- bag) samples
  • 56. 56 H2O returns predicted class as well as probabilities of each class
  • 57. Advanced H2O Usage – Random Grid Search • Link to Jupyter Notebook • https://github.com/woobe/h2o_tutorials/blob/master/use_cases/predictive_ maintenance/step_02_random_grid_search.ipynb • Using Random Grid Search to fine-tune hyper-parameters 57
  • 58. 58 API for performing a Random Grid Search
  • 59. 59 Grid search results sorted by specific metric. Best model on top.
  • 61. 61 Performance metrics based on 5-fold cross-validation
  • 62. Comparison – Default vs. Tuned 62GBM with Default Settings GBM with Random Grid Search Still not perfect but better than default settings
  • 63. H2O Tutorials • Introduction to Machine Learning with H2O and Python • Basic Extract, Transform and Load (ETL) • Supervised Learning • Parameters Tuning • Stacking • GitHub Repository • bit.ly/joe_h2o_tutorials • R Code Examples (available soon) 63
  • 65. Outlier Detection – MNIST Dataset 65 Photo credit: http://www.opendeep.org/v0.0.5/docs/tutorial-classifying-handwritten-mnist-images
  • 66. Outlier Detection – MNIST Dataset • 784 Inputs • 28 x 28 = 784 pixels • 1 Output • 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9 • File (from Kaggle) • Train (42k Records) • kaggle_mnist_train.csv.gz • Source • https://www.kaggle.com/c/digit- recognizer/data 66 Photo credit: https://ml4a.github.io/ml4a/neural_networks/
  • 67. Advanced H2O Usage – Random Grid Search • Link to Data and Code • https://github.com/woobe/h2o_tutorials/tree/master/use_cases/outlier_det ection 67
  • 68. 68
  • 69. 69 Import data (directly from a gz) Remove the label (not needed for unsupervised learning)
  • 70. 70 Input = 784 pixels Output = 784 pixels Nodes = 50
  • 71. 71 Quantify the average reconstruction error per sample - ) / no. of pixels = Avg. Error(
  • 72. 72 Helper functions for plotting graphs only (not related to H2O algorithms)
  • 73. 73
  • 77. 77
  • 78. End of First Talk Let’s have a break ☺ 78
  • 79. H2O Deep Water H2O’s Integration with TensorFlow, mxnet and Caffe 79
  • 80. TensorFlow • Open source machine learning framework by Google • Python / C++ API • TensorBoard • Data Flow Graph Visualization • Multi CPU / GPU • v0.8+ distributed machines support • Multi devices support • desktop, server and Android devices • Image, audio and NLP applications • HUGE Community • Support for Spark, Windows … 80 https://github.com/tensorflow/tensorflow
  • 82. Caffe • Convolution Architecture For Feature Extraction (CAFFE) • Pure C++ / CUDA architecture for deep learning • Command line, Python and MATLAB interface • Model Zoo • Open collection of models 82 https://docs.google.com/presentation/d/1UeKXVgRvvxg9OUdh_UiC5G71UMscNPlvArsWER41PsU/
  • 84. Both TensorFlow and H2O are widely used 84
  • 85. TensorFlow , MXNet, Caffe and H2O DL democratize the power of deep learning. H2O platform democratizes artificial intelligence & big data science. There are other open source deep learning libraries like Theano and Torch too. Let’s have a party, this will be fun! 85
  • 86. 86
  • 87. Deep Water Next-Gen Distributed Deep Learning with H2O H2O integrates with existing GPU backends for significant performance gains One Interface - GPU Enabled - Significant Performance Gains Inherits All H2O Properties in Scalability, Ease of Use and Deployment Recurrent Neural Networks enabling natural language processing, sequences, time series, and more Convolutional Neural Networks enabling Image, video, speech recognition Hybrid Neural Network Architectures enabling speech to text translation, image captioning, scene parsing and more Deep Water 87
  • 88. Deep Water Architecture Node 1 Node N Scala Spark H2O Java Execution Engine TensorFlow/mxnet/Caffe C++ GPU CPU TensorFlow/mxnet/Caffe C++ GPU CPU RPC R/Py/Flow/Scala client REST API Web server H2O Java Execution Engine grpc/MPI/RDMA Scala Spark 88
  • 89. Available Networks in Deep Water • LeNet • AlexNet • VGGNet • Inception (GoogLeNet) • ResNet (Deep Residual Learning) • Build Your Own 89 ResNet
  • 90. 90 Example: Deep Water + H2O Flow Choosing different network structures
  • 92. Unified Interface (Deep Water + R) 92 Choosing different network structures
  • 93. Unified Interface (Deep Water + Python) 93 Change backend to “mxnet”, “caffe” or “auto” Choosing different network structures
  • 94. Easy Stacking with other H2O Models 94 Ensemble of Deep Water, Gradient Boosting Machine & Random Forest models
  • 98. Deep Water R Demo • H2O + MXNet + TensorFlow • Dataset – Cat/Dog/Mouse • MXNet & TF as GPU backend • Train LeNet (CNN) models • R Demo • Code and Data • github.com/h2oai/deepwater 98
  • 101. Deep Water – Basic Usage Live Demo if Possible 101
  • 102. Start and Connect to H2O Deep Water Cluster 102 • Download Latest Nightly Build • https://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o.jar • In Terminal • cd to the folder containing h2o.jar • java –jar h2o.jar (this is the default command) • java –jar –Xmx16g h2o.jar (this is the command to allocate 16GB of memory) • In R • library(h2o) (latest stable release from h2o.ai website or CRAN) • h2o.connect(ip = “xxx.xxx.xxx.xxx”, strict_version_check = FALSE)
  • 104. Train a CNN (LeNet) Model on GPU 104
  • 105. Train a CNN (LeNet) Model on GPU 105 Using GPU for training
  • 107. Deep Water – Custom Network 107
  • 108. Xxx 108 Saving the custom network structure as a file Configure custom network structure (MXNet syntax)
  • 109. Train a Custom Network 109 Point it to the custom network structure file
  • 112. Project “Deep Water” • H2O + TF + MXNet + Caffe • A powerful combination of widely used open source machine learning libraries. • All Goodies from H2O • Inherits all H2O properties in scalability, ease of use and deployment. • Unified Interface • Allows users to build, stack and deploy deep learning models from different libraries efficiently. 112 • Latest Nightly Build • https://s3.amazonaws.com/h2o- deepwater/public/nightly/latest/h 2o.jar • 100% Open Source • The party will get bigger!
  • 113. Other H2O Developments • H2O + xgboost [Link] • Stacked Ensembles [Link] • Automatic Machine Learning [Link] • Time Series [Link] • High Availability Mode in Sparkling Water [Link] • Model Interpretation [Link] • word2vec [Link] 113 • Previous Talks • https://github.com/h2oai/h2o- meetups/blob/master/2017_04_0 6_Amsterdam/2017_04_06_Latest _H2O_Developments.pdf
  • 114. • Organizers & Sponsors • Poznan R Users Group (PAZUR) • H2O.ai 114 Thanks! • Code, Slides & Documents • bit.ly/h2o_meetups • docs.h2o.ai • Contact • joe@h2o.ai • @matlabulous • github.com/woobe • Please search/ask questions on Stack Overflow • Use the tag `h2o` (not H2 zero)