Welcome to H2O World
Sri & H2O Team

Data Science is a Team Sport!
Culture Matters!

Open Source Breeds Courage!
Community Matters!
Every generation needs to make its own history!

Code is conversation with Customer!
Great Product Matters!

Accuracy with Speed and Scale
[Diagram labels: HDFS, S3, SQL, NoSQL; Classification, Regression, Feature Engineering; In-Memory, Map Reduce / Fork Join, Columnar Compression; Deep Learning; PCA, GLM, Cox; Random Forest / GBM; Ensembles; Fast Modeling Engine; Streaming; Nano Fast Java Scoring Engines; Matrix Factorization; Clustering; Munging]

What’s New in H2O-3
H2O-3 vs H2O-2:
• Total rewrite of the core in Java: built for data scientists AND developers!
• Unique Flow GUI (Notebook and more)
• REST Schemas for self-describing API for all methods/algos
• New R client: cleaner, faster
• Sparkling Water: H2O is the Killer App on Spark
• Fully featured Python client (incl. Pipelines, scikit-learn look & feel)
• New expression parser & backend execution engine for R, Py, Flow
• New Algo: GLRM - Generalized Low Rank Modeling (unifies PCA, K-Means, Matrix Factorization, Imputation, etc.)
• New Solvers for GLM: Coordinate Descent and L-BFGS
continued…
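As background for the GLRM bullet, the generalized low rank model is usually written in the literature as the optimization problem below. The notation is an assumption brought in for illustration and does not come from the slides: A is the m × n data table, Ω the set of observed entries, x_i the i-th row of X, y_j the j-th column of Y, L_ij a per-entry loss, and r_i, r̃_j optional regularizers.

```latex
\min_{X \in \mathbb{R}^{m \times k},\; Y \in \mathbb{R}^{k \times n}}
\quad \sum_{(i,j) \in \Omega} L_{ij}\left(x_i y_j,\, A_{ij}\right)
\;+\; \sum_{i=1}^{m} r_i(x_i)
\;+\; \sum_{j=1}^{n} \tilde{r}_j(y_j)
```

Read this way, quadratic loss with no regularization gives a PCA-style factorization, quadratic loss with each x_i restricted to a standard basis vector gives k-means, and a missing entry A_ij can be imputed from the fitted product x_i y_j.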

What’s New in H2O-3
Additional New Features:
• Grid Search for all Algorithms (R/Py/Flow)
• N-fold Cross-Validation for all Algorithms
• Early Stopping (check for convergence) for GBM/DRF/DL
• Stochastic GBM (row/col sampling)
• Distributions (Gaussian, Laplace, Poisson, Gamma, Tweedie) for GBM/DL
• Improved sparse data handling for DL
• Multi-node auto-tuning for DL
• Multinomial GLM
• Scalable Scatter Plots for numeric and categorical data
• Big-Big Joins (“distributed data.table”) - in QA
…and many more!

Convergence-Based Early Stopping in H2O
Before: trains too long, but at least overwrite_with_best_model=true prevents overfitting (returns the model with lowest validation error)
Now: specify additional convergence criterion: E.g. stopping_rounds=5, stopping_metric="MSE", stopping_tolerance=1e-3, to stop as soon as the moving average (length 5) of the validation MSE does not improve by at least 0.1% for 5 consecutive scoring events
[Figure: training error and validation error vs. training time / epochs, Deep Learning with Higgs data; labels: "Best Model", "overwrite_with_best_model=true"]
Use Flow to inspect the model
Early stopping saves tons of time
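The stopping rule described on this slide can be sketched in plain Python. This is a simplified reading of the slide's wording, not H2O's actual implementation: the function names `moving_average`, `should_stop` and `first_stop_event` are hypothetical, the metric is assumed to be one where lower is better (such as MSE), and the choice to compare the last `stopping_rounds` moving averages against the moving average just before them is an assumption.

```python
from typing import Optional, Sequence


def moving_average(values: Sequence[float], end: int, length: int) -> float:
    """Mean of the `length` values ending at index `end` (inclusive)."""
    window = values[end - length + 1 : end + 1]
    return sum(window) / length


def should_stop(
    errors: Sequence[float],
    stopping_rounds: int = 5,
    stopping_tolerance: float = 1e-3,
) -> bool:
    """Return True when the smoothed validation error has stalled.

    `errors` holds one validation error per scoring event, oldest first.
    Assumption: the reference is the moving average ending just before the
    last `stopping_rounds` events; training stops when none of the later
    moving averages beats it by a relative margin of `stopping_tolerance`.
    """
    k = stopping_rounds
    if k <= 0 or len(errors) < 2 * k:
        return False  # disabled, or not enough scoring events yet
    last = len(errors) - 1
    reference = moving_average(errors, last - k, k)
    recent = [moving_average(errors, last - i, k) for i in range(k)]
    return min(recent) > reference * (1.0 - stopping_tolerance)


def first_stop_event(
    errors: Sequence[float],
    stopping_rounds: int = 5,
    stopping_tolerance: float = 1e-3,
) -> Optional[int]:
    """Number of scoring events after which training would stop, or None."""
    for n in range(1, len(errors) + 1):
        if should_stop(errors[:n], stopping_rounds, stopping_tolerance):
            return n
    return None


if __name__ == "__main__":
    # Made-up validation MSE curve: quick improvement, then a plateau.
    curve = [0.30, 0.25, 0.22, 0.20, 0.19] + [0.188] * 15
    print("would stop after scoring event:", first_stop_event(curve))
```

Smoothing with a moving average keeps a single noisy scoring event from triggering or delaying the stop, which is why the slide pairs a window length with a relative tolerance.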

What do these stickers mean?
• I have H2O installed
• I have Python installed
• I have R installed
• I have the H2O World data sets
Pick up stickers or get install help at the information booth

H2O World - Welcome to H2O World with Arno Candel
