Scaling out Driverless AI
in Enterprise Data
Centers with IBM
Spectrum Conductor
Kevin Doyle
Lead Architect IBM Spectrum Conductor
IBM
LinkedIn: https://www.linkedin.com/in/kevin-doyle-675a4031/
Benefits of managing H2O with IBM Spectrum
Conductor
• H2O Driverless AI can scale across compute nodes for multiple instances, with each instance
allocated to one host
• In a future IBM Spectrum Conductor release, integration improves at the GPU level: You will be
able to run multiple Driverless AI instances on the same host, where each instance is allocated to
an assigned GPU
• Shared file system for data and logs
• Failover to another host if Driverless AI goes down: IBM Spectrum Conductor starts it up on
another host (if resources are available)
• Easily start and stop H2O Driverless AI and maintain instances for each user or groups of users
through role-based access control (RBAC) and consumer association, along with all other
workloads in one shared compute cluster
• H2O Driverless AI and IBM POWER9 GPU systems bring together best-of-breed AI
innovation. To handle the increasingly complex workloads of AI you need an integrated system of
software and hardware:
• IBM POWER9 supports nearly 2.6x more RAM and 9.5x more I/O bandwidth than comparable systems
• Nearly 2X the data ingest speed and over 50% faster feature engineering
• With GPU accelerated machine learning delivering nearly 30X speedup on model building
• Support for up to 6 V100 GPUs on a single system
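The failover bullet above can be pictured with a small sketch. This is not IBM Spectrum Conductor code: the Host class, the failover function, and the host and instance names are all hypothetical, and the real scheduler applies much richer policies. It only illustrates the rule "restart on another host if resources are available", with one instance per host as in the current integration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    healthy: bool = True
    instance: Optional[str] = None  # at most one Driverless AI instance per host


def failover(hosts: List[Host], failed: Host) -> Optional[Host]:
    """Move the instance from a failed host to the first healthy, free host.

    Returns the new host, or None when no resources are available.
    """
    instance = failed.instance
    failed.instance = None
    failed.healthy = False
    for host in hosts:
        if host.healthy and host.instance is None:
            host.instance = instance
            return host
    return None  # nothing free: the instance stays down


hosts = [
    Host("node1", instance="dai-marketing"),
    Host("node2", instance="dai-fraud"),
    Host("node3"),
]
new_host = failover(hosts, hosts[0])
print(new_host.name if new_host else "no free host")
```

Because data and logs sit on a shared file system, a restarted instance can pick up the same files from its new host.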
What is IBM® Spectrum Conductor?
• IBM Spectrum Conductor confidently deploys modern computing frameworks and
services for a multitenant enterprise environment, both on-premises and in the cloud
• Provides multitenancy through application instances and Spark instance groups. You can
deploy modern computing frameworks and services, such as Spark, Anaconda, Driverless
AI, and H2O Sparkling Water efficiently and effectively, supporting multiple versions and
instances of each framework and service
• Increases performance and scale through granular and dynamic resource allocation for
application instances and Spark instance groups that share a resource pool
• Maximizes usage of resources and eliminates silos of resources that would otherwise
each be tied to separate application implementations
• Provides flexible and efficient data management for shared storage and high availability
by connecting to existing storage infrastructure, such as NFS mounts to a file system or
IBM Spectrum Scale™
Traditional vs Conductor Management
Application examples
• Simulation
• Analysis
• Design
• Big data
Traditional
• IT constrained: long wait times, low utilization, data access bottlenecks, IT sprawl
• Distinct resources for compute and storage
• Repeated for many apps and groups
IBM Spectrum Conductor (IBM Software Defined Infrastructure)
• Workloads: big data, simulation and modeling, analytics
• Long running services
• Virtualized view of compute, network and storage resources
• Make multiple computers look like one
• Prioritized matching of supply with demand
• Converged compute and storage
• Faster results, fewer resources
• Benefits: high utilization, throughput, performance, prioritization, reduced cost
IBM Systems
Shared Services Model for Spark, Machine Learning, and Deep Learning
• Physical view: IBM Spectrum Conductor installed on each Linux server
• Logical view: Users (groups) have their own Spark cluster (optional) that is isolated, protected, and
secured by Spark instance groups or application instances – Managed by SLA
[Diagram: an administrator manages a shared pool of Linux compute nodes on x86 systems. Four isolated instances (Instance #1 to Instance #4) serve lines of business (Marketing…, Fraud Detection…), data scientists (including a Driverless AI instance), and a researcher. Spectrum Conductor data connectors link to Cloud Object Storage (COS) and Spectrum Scale.]
IBM Spectrum Conductor
The most complete enterprise-grade solution for Data Science
• Anaconda Distributions
The solution supports multiple distributions of Anaconda running concurrently.
Users can add/remove Conda packages.
• Notebooks Integration
Out-of-the-box notebooks available: Jupyter, Zeppelin, RStudio, H2O
Sparkling Water. Other notebooks and distributed frameworks can be quickly
integrated.
• Spark Distributions
The solution supports multiple versions of Spark running concurrently.
• Workload Management / Scheduling
A proven workload scheduling engine that enhances the Spark master
scheduling logic to enable multi-tenancy.
• Services Management
Management of other long running application services on the same grid.
Spark applications commonly have dependencies on other services that can
now be managed as a single solution.
• Resource Management & Orchestration
Proven architecture at scale. Resources are dynamically allocated to Spark
workloads with fine-grained sharing across applications.
• IBM services and support
A single point of contact for your services and support needs.
[Diagram: IBM Spectrum Conductor stack on Red Hat Linux (x86…): Resource Management & Orchestration, Workload Management / Scheduling, Services Management, Notebooks, Monitoring & Reporting, and Services and Support.]
IBM Systems © 2016 IBM Corporation
Competitive advantage through faster, more predictable analytics
Throughput: 41% greater than Spark with YARN; 57% greater than Spark with Mesos

Scenario              Spectrum Conductor with Spark   Spark / YARN    Spark / Mesos
When minutes count    10 minutes                      14.1 minutes    15.7 minutes
At quarter-end        80 hours                        112.8 hours     125.6 hours
Product development   26 weeks                        36.7 weeks      40.8 weeks

Source: STAC Report: Spark Resource Managers, Phase 1 (March 28, 2016)
Note: IBM is an active contributor in the Mesos community, helping to advance its capabilities and integration with IBM solutions
Predictability: longest job duration compared with median (lower is better)

Spectrum Conductor with Spark   Spark / YARN    Spark / Mesos
1.51X                           1.62X           66.32X
STAC reported significant advantages, up to 2.2x, for IBM Spectrum Conductor with Spark
over YARN and Mesos.
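The duration rows in the throughput comparison above appear to follow from simple scaling: work that Spectrum Conductor with Spark finishes in a given time takes (1 + throughput advantage) times as long on the other resource manager. The sketch below reproduces that arithmetic; it is a reading of the slide, not a calculation taken from the STAC report, and the variable and function names are illustrative.

```python
# Baseline durations for Spectrum Conductor with Spark, as listed on the slide
BASELINE = {
    "When minutes count": (10, "minutes"),
    "At quarter-end": (80, "hours"),
    "Product development": (26, "weeks"),
}
# Throughput advantage of Spectrum Conductor over each alternative
THROUGHPUT_GAIN = {"Spark / YARN": 0.41, "Spark / Mesos": 0.57}


def equivalent_duration(baseline: float, gain: float) -> float:
    """Time the slower system needs for work the baseline finishes in `baseline`."""
    return round(baseline * (1 + gain), 1)


for scenario, (duration, unit) in BASELINE.items():
    others = ", ".join(
        f"{name}: {equivalent_duration(duration, gain)} {unit}"
        for name, gain in THROUGHPUT_GAIN.items()
    )
    print(f"{scenario}: {duration} {unit} -> {others}")
```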
PowerAI Enterprise ML/DL - Data Science Stack
[Diagram: layered stack, packaged as IBM PowerAI Enterprise. Components are tagged either as Open Source Frameworks Distribution or as Value-add Tools.]
• Layers: Data Science Apps; ML/DL UI and Flow; ML Libraries; DL Frameworks; Runtimes, Resource & WL Managers; Data Layer
• Frameworks and libraries: TensorFlow, Caffe, PyTorch, Chainer, MLLib, Graphx, Scikit-learn, R, xgboost, Anaconda, Python, Spark
• Tools: Watson Studio, PowerAI Vision, IBM Spectrum Conductor Deep Learning Impact (Data Prep / Parallel Training / Model Tuning / Model Evaluation / Inference Services…), Distributed Deep Learning (DDL), Elastic Distributed Training (EDT)
• IBM Spectrum Conductor: GPU Support / Distributed / BYOF / Session Scheduler / MPI / Containers…
• Data layer: IBM Spectrum Scale, IBM Cloud Object Store
Key concepts of IBM Spectrum Conductor
• Application instances
• Customizable feature to support running any long-running service within the cluster
• Application templates (yaml) are created to define the processes (services) that you
want to run in the cluster
• Driverless AI integration is done through application instances
• Spark instance groups
• An installation of Apache Spark that can run Spark core services (master, shuffle,
and history), Anaconda distribution instances, and notebooks as configured
• You can create and run multiple Spark instance groups, associating each instance
group with different Spark/Anaconda/notebook version packages as required
• H2O Sparkling Water integration is treated as a notebook within your Spark instance
groups
Key concepts of IBM Spectrum Conductor (cont.)
• Resource groups
• Provide a simple way of organizing and grouping resources (hosts)
• Define how to divide up the hosts in the group into slots
• Slots are used to decide if a host is available to place new workload on it
• Consumers
• A way to map organizations/teams to resources they are allowed to use
• Resource planning uses consumers to determine advanced policies for when
to borrow/lend resources to other consumers
• Resource groups map to consumers to allow users adding application
instances or Spark instance groups to only use those resource groups
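To make the relationship between resource groups, slots, and consumers concrete, here is a toy model. It is only a sketch of the concepts on this slide: the ResourceGroup class, the place function, and the consumer and host names are hypothetical and do not reflect the product's internal data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ResourceGroup:
    name: str
    slots_per_host: int                                   # how each host is divided into slots
    used: Dict[str, int] = field(default_factory=dict)    # host -> slots in use

    def free_slots(self, host: str) -> int:
        return self.slots_per_host - self.used.get(host, 0)


def place(consumer: str, slots: int, groups: List[ResourceGroup],
          allowed: Dict[str, Set[str]]) -> Optional[Tuple[str, str]]:
    """Pick the first host with enough free slots in a resource group
    that the consumer is mapped to. Returns (group, host) or None."""
    for group in groups:
        if group.name not in allowed.get(consumer, set()):
            continue  # consumer may not use this resource group
        for host in group.used:
            if group.free_slots(host) >= slots:
                group.used[host] += slots
                return group.name, host
    return None


gpu = ResourceGroup("gpu-hosts", slots_per_host=4, used={"gpu1": 4, "gpu2": 1})
cpu = ResourceGroup("cpu-hosts", slots_per_host=16, used={"cpu1": 0})
allowed = {"/marketing": {"cpu-hosts"}, "/fraud": {"gpu-hosts", "cpu-hosts"}}

fraud_placement = place("/fraud", 2, [gpu, cpu], allowed)
marketing_placement = place("/marketing", 2, [gpu, cpu], allowed)
print(fraud_placement, marketing_placement)
```

Borrowing and lending between consumers, which resource planning handles in the product, is left out of this sketch.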
Role-based access control
• Permissions are assigned to roles
• Roles are assigned to users
• Most permissions are based on a consumer
• Users will have the permissions/role assigned but only for the consumers they
have access to
• Ability to allow users to only access/control what they should
• Example: Each user can see only their Driverless AI instances as desired
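The permission model above (permissions assigned to roles, roles assigned to users, scoped by consumer) can be sketched in a few lines. The role names, permission names, users, and instances below are invented for illustration; they are not the roles that ship with IBM Spectrum Conductor.

```python
from typing import List

# Permissions are assigned to roles
ROLE_PERMISSIONS = {
    "consumer_user": {"view_instance", "start_instance", "stop_instance"},
    "consumer_viewer": {"view_instance"},
}

# Roles are assigned to users, but only for specific consumers
ASSIGNMENTS = {
    ("alice", "/marketing"): "consumer_user",
    ("bob", "/fraud"): "consumer_user",
    ("bob", "/marketing"): "consumer_viewer",
}

# Each Driverless AI application instance belongs to a consumer
INSTANCES = {"dai-alice": "/marketing", "dai-bob": "/fraud"}


def is_allowed(user: str, permission: str, consumer: str) -> bool:
    """A permission holds only if the user's role for that consumer grants it."""
    role = ASSIGNMENTS.get((user, consumer))
    return role is not None and permission in ROLE_PERMISSIONS[role]


def visible_instances(user: str) -> List[str]:
    return sorted(
        name for name, consumer in INSTANCES.items()
        if is_allowed(user, "view_instance", consumer)
    )


print(visible_instances("alice"))
print(is_allowed("bob", "stop_instance", "/marketing"))
```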
How does the integration work?
• H2O Driverless AI is launched on a single host
• The host can have either GPUs or just run with CPUs
• If using GPUs the entire host is taken (with current integration)
• An application instance is created for each user of Driverless AI
• Maintains security for the data this user has access to
• Environment variables through parameters are used to configure Driverless AI
• H2O Sparkling Water runs as a notebook in a Spark instance group
• When the notebook is started up it forms a mini cluster of executors
• These executors stay alive for the entire duration of the notebook
• IBM Spectrum Conductor disables preemption to not reclaim these hosts
• Multiple users can share a Sparkling Water notebook instance or have
dedicated ones per user
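As an illustration of configuring Driverless AI through environment variables, the sketch below turns per-instance parameters into an environment for the launched process. It rests on assumptions: the DRIVERLESS_AI_ prefix with an upper-cased option name is taken to be the override convention, and the option names, values, and paths are examples only. Check the Driverless AI configuration documentation and the application template shipped with the integration for the actual names.

```python
import os
from typing import Dict


def build_environment(params: Dict[str, object],
                      prefix: str = "DRIVERLESS_AI_") -> Dict[str, str]:
    """Turn per-instance parameters into environment variables.

    Assumption: each setting is exposed as prefix + upper-cased option name.
    """
    env = dict(os.environ)  # start from the current environment
    for option, value in params.items():
        env[prefix + option.upper()] = str(value)
    return env


# Hypothetical parameters for one user's application instance
alice_env = build_environment({
    "port": 12345,
    "data_directory": "/shared/dai/alice",  # on the shared file system
})
print(alice_env["DRIVERLESS_AI_PORT"], alice_env["DRIVERLESS_AI_DATA_DIRECTORY"])
```

A dictionary like this can be passed as the env argument of subprocess.Popen when starting the service. Giving each user a separate data directory is one way to keep the data a user can reach isolated per instance.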
Current Integration
[Diagram: IBM Spectrum Conductor layers. Base: Resource, Cluster, Service Management (K8s/EGO) with Container, GPU and Acceleration support. Shared services: Multi-tenancy, Security, Data Connector, Report/log management, Notebook, Spark, Python, ELK. Schedulers: Batch Scheduler and Session Schedulers. Workloads: Instance Group #1, Instance Group #2, Instance Group #5, application instances for marketing and fraud, Elastic Distributed Training (EDT), and other apps ….]
Demo
Future Plans (short term)
• Log retrieval from IBM Spectrum Conductor web UI
• Ability to deploy Driverless AI with IBM Spectrum Conductor instead
of installing on all systems (new application template)
• Ability to modify application instance outputs more effectively
• Enhance job monitor to check when Driverless AI is up
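One simple way a job monitor could check that Driverless AI is up is to poll its port until it accepts connections. The sketch below does that with the Python standard library; the function name, host, port, and timeouts are placeholders, and this is not how the IBM Spectrum Conductor job monitor is implemented.

```python
import socket
import time


def wait_until_up(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Poll a TCP port until it accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet, try again
    return False


# Demo against a throwaway local listener standing in for the service
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
server.listen(1)
demo_port = server.getsockname()[1]
print(wait_until_up("127.0.0.1", demo_port, timeout=5.0))
server.close()
```

An open port only shows that the process is listening. A stricter check would request a page from the web UI and inspect the response.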
Future Plans (longer term)
• Improved port management
• Today you can specify the ports to use; however, you don’t know if they are
already in use on existing hosts
• The ports might work at first but not later if something else is using the ports
• Improve handling of running Driverless AI with a subset of GPUs on
hosts in the cluster
• Integrate Driverless AI authentication with IBM Spectrum Conductor
authentication/authorization for easier setup
• Look at supporting Driverless AI to run across multiple machines
• Investigate the best approaches to connect to data sources
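The port problem in the first bullet can be illustrated with a probe that tries to bind to a candidate port on the local host. This is a sketch with hypothetical function names, not part of the integration. The result is only valid at the moment of the check, which is exactly the weakness the slide describes: another process can take the port afterwards.

```python
import socket
from typing import Iterable, Optional


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to the port on this host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates: Iterable[int], host: str = "0.0.0.0") -> Optional[int]:
    """Return the first candidate port that is free right now, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


# Demo: occupy a port locally, then probe it
busy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
busy.bind(("127.0.0.1", 0))
busy.listen(1)
busy_port = busy.getsockname()[1]
print(port_is_free(busy_port, "127.0.0.1"))
busy.close()
```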
Long term architecture vision for Driverless AI
integrated with IBM Spectrum Conductor
[Diagram: H2O Driverless AI, a Batch Scheduler, a Session Scheduler, and a pool of Linux hosts]
(1) Start Driverless AI
(2) Find a host to run Driverless AI
(3) Run workload (training, experiment, etc.)
(4) Find hosts to run the workload on to speed up execution
It’s available now
• Contact Richard Shedrick ( rshedrick@us.ibm.com ) to get access to
the integration and learn more
• Future announcements and contact points on the integration at:
• IBM Spectrum Conductor Blog:
http://ibm.biz/ConductorBlogs
• IBM Spectrum Conductor’s Slack channel:
http://ibm.biz/ConductorSlack
Top Reasons to Choose PowerAI Enterprise
• Simplicity: Integrated Platform that Just Works
• Curate, Test, and Support Fast Moving Open Source
• Provide Enterprise Distribution on RedHat
• Easy to deploy Enterprise AI Platform
• Ease of Use, Unique Capabilities
• Faster Model Training Time
• Large data & model support due to NVLink
• Acceleration of Analytics & ML
• AutoML: PowerAI Vision
• Elastic Training: Scale GPUs as Required
• Faster Training Times in Single Server
• Scalability to 100s of Servers (Cluster level Integration)
• Leads to Faster Insights and Better Economics
• Platform that Partners can build on
• Software Partners: H2O, IBM, Anaconda
• SIs, Solution Vendors & Accelerator Partners
• Open AI Platform w/ Ecosystem Partners
[Diagram: stack of Power9 CPU, GPU, PowerAI, IBM SW, ISV SW, Solution, SIs]

 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 

Último

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 

Último (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

Scaling out Driverless AI with IBM Spectrum Conductor - Kevin Doyle - H2O AI World London 2018

  • 1. Scaling out Driverless AI in Enterprise Data Centers with IBM Spectrum Conductor Kevin Doyle Lead Architect IBM Spectrum Conductor IBM LinkedIn: https://www.linkedin.com/in/kevin-doyle-675a4031/
  • 2. Benefits of managing H2O with IBM Spectrum Conductor • H2O Driverless AI can scale across compute nodes for multiple instances, with each instance allocated to one host • In a future IBM Spectrum Conductor release, integration improves at the GPU level: you will be able to run multiple Driverless AI instances on the same host, where each instance is allocated to an assigned GPU • Shared file system for data and logs • Failover to another host if Driverless AI goes down: IBM Spectrum Conductor starts it up on another host (if resources are available) • Easily start and stop H2O Driverless AI and maintain instances for each user or group of users through role-based access control (RBAC) and consumer association, along with all other workloads in one shared compute cluster • H2O Driverless AI and IBM POWER9 GPU systems bring together best-of-breed AI innovation. To handle the increasingly complex workloads of AI you need an integrated system of software and hardware: • IBM POWER9 supports nearly 2.6x more RAM and 9.5x more I/O bandwidth than comparable systems • Nearly 2X the data ingest speed and over 50% faster feature engineering • GPU-accelerated machine learning delivers nearly 30X speedup on model building • Support for up to 6 V100 GPUs on a single system
  • 3. What is IBM® Spectrum Conductor? • IBM Spectrum Conductor confidently deploys modern computing frameworks and services for a multitenant enterprise environment, both on-premises and in the cloud • Provides multitenancy through application instances and Spark instance groups. You can deploy modern computing frameworks and services, such as Spark, Anaconda, Driverless AI, and H2O Sparkling Water efficiently and effectively, supporting multiple versions and instances of each framework and service • Increases performance and scale through granular and dynamic resource allocation for application instances and Spark instance groups that share a resource pool • Maximizes usage of resources and eliminates silos of resources that would otherwise each be tied to separate application implementations • Provides flexible and efficient data management for shared storage and high availability by connecting to existing storage infrastructure, such as NFS mounts to a file system or IBM Spectrum Scale™
  • 4. Traditional vs Conductor Management • Application examples: Simulation, Analysis, Design, Big data • Traditional: distinct resources for compute and storage, repeated for many apps and groups • IT constrained: Long wait times, Low utilization, Data access bottlenecks, IT sprawl • IBM Spectrum Conductor (IBM Software Defined Infrastructure): virtualized view of compute, network and storage resources; converged compute and storage; make multiple computers look like one; prioritized matching of supply with demand • Workloads: Big data, Simulation and modeling, Analytics, Long running services • Benefits: High utilization, Throughput, Performance, Prioritization, Reduced cost • Faster results, fewer resources
  • 5. IBM Systems Shared Services Model for Spark, Machine Learning, and Deep Learning • Physical view: IBM Spectrum Conductor installed on each Linux server • Logical view: Users (groups) have their own Spark cluster (optional) that is isolated, protected, and secured by Spark instance groups or application instances, managed by SLA • [Diagram: an administrator; a pool of Linux compute nodes on x86 systems; Instance #1 through Instance #4 used by LOBs (Marketing…, Fraud Detection…), data scientists, and a researcher, with Driverless AI as one instance; Spectrum Conductor data connectors to Cloud Object Storage (COS) and Spectrum Scale]
  • 6. IBM Spectrum Conductor: The most complete enterprise-grade solution for Data Science • Anaconda Distributions: The solution supports multiple distributions of Anaconda running concurrently. Users can add/remove Conda packages. • Notebooks Integration: Out-of-the-box notebooks available: Jupyter, Zeppelin, RStudio, H2O Sparkling Water. Other notebooks and distributed frameworks can be quickly integrated. • Spark Distributions: The solution supports multiple versions of Spark running concurrently. • Workload Management / Scheduling: A proven workload scheduling engine that enhances the Spark master scheduling logic to enable multi-tenancy. • Services Management: Management of other long running application services on the same grid. Spark applications commonly have dependencies on other services that can now be managed as a single solution. • Resource Management & Orchestration: Proven architecture at scale. Resources are dynamically allocated to Spark workload with fine grain sharing across applications. • IBM services and support: A single point of contact for your services and support needs. • [Diagram layers: Monitoring & Reporting; Notebooks; Workload Management / Scheduling; Resource Management & Orchestration; Services Management; Services and Support; Red Hat Linux x86…]
  • 7. Competitive advantage through faster, more predictable analytics (© 2016 IBM Corporation) • Throughput: 41% greater than Spark with YARN; 57% greater than Spark with Mesos • When minutes count: 10 minutes (Spectrum Conductor with Spark), 14.1 minutes (Spark / YARN), 15.7 minutes (Spark / Mesos) • At quarter-end: 80 hours (Spectrum Conductor with Spark), 112.8 hours (Spark / YARN), 125.6 hours (Spark / Mesos) • Product development: 26 weeks (Spectrum Conductor with Spark), 36.7 weeks (Spark / YARN), 40.8 weeks (Spark / Mesos) • Predictability, longest job duration compared with median (lower is better): 1.51X (Spectrum Conductor with Spark), 1.62X (Spark / YARN), 66.32X (Spark / Mesos) • Source: STAC Report: Spark Resource Managers, Phase 1 (March 28, 2016) • Note: IBM is an active contributor in the Mesos community, helping to advance its capabilities and integration with IBM solutions
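The three scenario rows on slide 7 follow from the two throughput ratios: if Spectrum Conductor's throughput is 41% greater than YARN's, the same work takes 1.41 times as long on YARN (and 1.57 times as long on Mesos). A minimal Python sketch of that arithmetic is below; the function name and structure are illustrative and do not come from the STAC report.

```python
# Illustrative arithmetic only: scale a Spectrum Conductor duration by the
# throughput ratios quoted on the slide (1.41x for YARN, 1.57x for Mesos).
RATIOS = {"Spark / YARN": 1.41, "Spark / Mesos": 1.57}


def equivalent_durations(conductor_duration):
    """Return the time each alternative needs for the same amount of work."""
    return {
        name: round(conductor_duration * ratio, 1)
        for name, ratio in RATIOS.items()
    }


# The three baselines used on the slide: 10 minutes, 80 hours, 26 weeks.
for unit, baseline in [("minutes", 10), ("hours", 80), ("weeks", 26)]:
    print(unit, equivalent_durations(baseline))
```

Rounded to one decimal place, the results reproduce the figures shown for each scenario.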
  • 8. STAC reported significant advantages, up to 2.2x, for IBM Spectrum Conductor with Spark over YARN and Mesos.
  • 9. PowerAI Enterprise ML/DL - Data Science Stack • [Stack diagram; layer labels: Open Source Frameworks Distribution; Data Layer; Runtimes, Resource & WL Managers; DL Frameworks; ML Libraries; ML/DL UI and Flow; Data Science Apps; Value-add Tools] • DL frameworks: TensorFlow, Caffe, PyTorch, Chainer • ML libraries: MLlib, GraphX, scikit-learn, R, xgboost • IBM Spectrum Conductor: GPU Support / Distributed / BYOF / Session Scheduler / MPI / Containers… • Anaconda, Python, Spark • Distributed Deep Learning (DDL) • Elastic Distributed Training (EDT) • IBM Spectrum Conductor Deep Learning Impact: Data Prep / Parallel Training / Model Tuning / Model Evaluation / Inference Services… • PowerAI Vision • Watson Studio • IBM PowerAI Enterprise • IBM Spectrum Scale • IBM Cloud Object Store
  • 10. Key concepts of IBM Spectrum Conductor • Application instances • Customizable feature to support running any long-running service within the cluster • Application templates (YAML) are created to define the processes (services) that you want to run in the cluster • Driverless AI integration is done through application instances • Spark instance groups • An installation of Apache Spark that can run Spark core services (master, shuffle, and history), Anaconda distribution instances, and notebooks as configured • You can create and run multiple Spark instance groups, associating each instance group with different Spark/Anaconda/notebook version packages as required • H2O Sparkling Water integration is treated as a notebook within your Spark instance groups
  • 11. Key concepts of IBM Spectrum Conductor (cont.) • Resource groups • Provide a simple way of organizing and grouping resources (hosts) • Define how to divide the hosts in the group into slots • Slots are used to decide whether a host is available for new workload • Consumers • A way to map organizations/teams to the resources they are allowed to use • Resource planning uses consumers to determine advanced policies for when to borrow/lend resources to other consumers • Resource groups map to consumers, so that users adding application instances or Spark instance groups can use only those resource groups
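The relationship between hosts, slots, resource groups, and consumers on slide 11 can be sketched as a small data model. This is a hypothetical illustration in Python: the class names, the `place` function, and the first-fit policy are assumptions made for the example and are not IBM Spectrum Conductor's API or scheduling algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    total_slots: int       # how many slots the resource group carves out
    used_slots: int = 0

    @property
    def free_slots(self):
        return self.total_slots - self.used_slots


@dataclass
class ResourceGroup:
    name: str
    hosts: list = field(default_factory=list)


def place(consumer, slots_needed, consumer_groups, groups):
    """First-fit placement (an assumed policy): return the name of the first
    host with enough free slots in a resource group mapped to the consumer,
    or None if nothing fits."""
    for group_name in consumer_groups.get(consumer, []):
        for host in groups[group_name].hosts:
            if host.free_slots >= slots_needed:
                host.used_slots += slots_needed
                return host.name
    return None


# Hypothetical cluster: one resource group of two 4-slot hosts.
groups = {"gpu-hosts": ResourceGroup("gpu-hosts", [Host("gpu01", 4), Host("gpu02", 4)])}
consumer_groups = {"marketing": ["gpu-hosts"]}  # consumer -> allowed groups
print(place("marketing", 4, consumer_groups, groups))
```

A consumer with no mapped resource group gets `None`, which mirrors the idea that users can place workload only on the resource groups their consumer is allowed to use.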
  • 12. Role-based access control • Permissions are assigned to roles • Roles are assigned to users • Most permissions are based on a consumer • Users will have the permissions/role assigned but only for the consumers they have access to • Ability to allow users to only access/control what they should • Example: Each user can see only their Driverless AI instances as desired
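The chain on slide 12 (permissions assigned to roles, roles assigned to users, scoped by consumer) can be sketched in a few lines of Python. The role names, permission names, users, and consumer paths below are all hypothetical and only illustrate the lookup; they are not IBM Spectrum Conductor's actual roles or API.

```python
# Permissions are assigned to roles (hypothetical names).
ROLE_PERMISSIONS = {
    "instance_admin": {"view", "start", "stop"},
    "viewer": {"view"},
}

# Roles are assigned to users, but only for specific consumers.
ASSIGNMENTS = {
    ("alice", "/marketing"): "instance_admin",
    ("bob", "/fraud"): "viewer",
}


def is_allowed(user, action, consumer):
    """True if the user's role on this consumer grants the action."""
    role = ASSIGNMENTS.get((user, consumer))
    return role is not None and action in ROLE_PERMISSIONS[role]


def visible_instances(user, instances):
    """Filter instances down to those whose consumer the user may view."""
    return [
        name
        for name, consumer in instances.items()
        if is_allowed(user, "view", consumer)
    ]


# Hypothetical Driverless AI instances, each owned by a consumer.
instances = {"dai-marketing": "/marketing", "dai-fraud": "/fraud"}
print(visible_instances("alice", instances))
print(is_allowed("bob", "stop", "/fraud"))
```

With this scoping, each user sees only the Driverless AI instances that belong to consumers they have a role on.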
  • 13. How does the integration work? • H2O Driverless AI is launched on a single host • The host can have GPUs or run with CPUs only • If using GPUs, the entire host is taken (with the current integration) • An application instance is created for each user of Driverless AI • Maintains security for the data this user has access to • Environment variables, set through parameters, are used to configure Driverless AI • H2O Sparkling Water runs as a notebook in a Spark instance group • When the notebook is started, it forms a mini cluster of executors • These executors stay alive for the entire duration of the notebook • IBM Spectrum Conductor disables preemption so that these hosts are not reclaimed • Multiple users can share a Sparkling Water notebook instance or have dedicated ones per user
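Slide 13 says that parameters are turned into environment variables that configure Driverless AI. A minimal Python sketch of that mapping is below. It assumes the `DRIVERLESS_AI_` upper-case prefix convention for overriding configuration settings; the helper name, the parameter names, and the values are hypothetical, and the slides do not show the actual application template.

```python
import os


def build_env(params, prefix="DRIVERLESS_AI_", base=None):
    """Return a process environment with each parameter exported as
    PREFIX + UPPER_CASE_NAME. The prefix is an assumption; adjust it to
    whatever the deployed Driverless AI version expects."""
    env = dict(os.environ if base is None else base)
    for key, value in params.items():
        env[prefix + key.upper()] = str(value)
    return env


# Hypothetical per-user parameters supplied when the instance is created.
user_params = {"port": 12345, "data_directory": "/shared/dai/alice"}
env = build_env(user_params, base={})
for name in sorted(env):
    print(f"{name}={env[name]}")
```

The resulting dictionary could be passed as the `env` argument of `subprocess.Popen` when starting the service, so each application instance runs with its own settings.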
  • 14. Current Integration • [Architecture diagram; components: Resource, Cluster, Service Management (K8s/EGO); Container; GPU and Acceleration; Multi-tenancy; Security; Data Connector; Report/log management; Notebook; Spark; ELK; Python; Batch Scheduler; Session Schedulers; Instance Group #1; Instance Group #2; Instance Group # 5; App instance # marketing; App instance # fraud; Elastic Distributed Training (EDT); # other apps …]
  • 15. Demo
  • 16. Future Plans (short term) • Log retrieval from IBM Spectrum Conductor web UI • Ability to deploy Driverless AI with IBM Spectrum Conductor instead of installing on all systems (new application template) • Ability to modify application instance outputs more effectively • Enhance job monitor to check when Driverless AI is up
  • 17. Future Plans (longer term) • Improved port management • Today you can specify the ports to use, but you don’t know whether they are already in use on existing hosts • The ports might work at first but not later if something else is using them • Improve handling of running Driverless AI with a subset of GPUs on hosts in the cluster • Integrate Driverless AI authentication with IBM Spectrum Conductor authentication/authorization for easier setup • Look at supporting Driverless AI to run across multiple machines • Investigate the best approaches to connect to data sources
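The port problem described on slide 17 can be illustrated with a local check. The Python sketch below tests whether a TCP port can be bound on the machine where it runs; the function name and the example port are illustrative, and this is not how IBM Spectrum Conductor manages ports.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to (host, port) right now.
    This is a point-in-time check on the local machine only."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


# Example port chosen for illustration; the result depends on the host.
print(port_is_free(12345))
```

Because the check is only valid at the moment it runs, another process can still claim the port afterwards, which is the "works at first but not later" situation the slide describes.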
  • 18. Long term architecture vision for Driverless AI integrated with IBM Spectrum Conductor • (1) Start Driverless AI • (2) Find a host to run Driverless AI • (3) Run workload (training, experiment, etc) • (4) Find hosts to run the workload on to speed up execution • [Diagram components: H2O Driverless AI, Batch Scheduler, Session Scheduler, pool of Linux hosts]
  • 19. It’s available now • Contact Richard Shedrick (rshedrick@us.ibm.com) to get access to the integration and learn more • Future announcements and contact points on the integration at: • IBM Spectrum Conductor Blog: http://ibm.biz/ConductorBlogs • IBM Spectrum Conductor’s Slack channel: http://ibm.biz/ConductorSlack
  • 20. Top Reasons to Choose PowerAI Enterprise • Simplicity: Integrated Platform that Just Works • Curate, Test, and Support Fast Moving Open Source • Provide Enterprise Distribution on RedHat • Easy to deploy Enterprise AI Platform • Ease of Use, Unique Capabilities • Faster Model Training Time • Large data & model support due to NVLink • Acceleration of Analytics & ML • AutoML: PowerAI Vision • Elastic Training: Scale GPUs as Required • Faster Training Times in Single Server • Scalability to 100s of Servers (Cluster level Integration) • Leads to Faster Insights and Better Economics • Platform that Partners can build on • Software Partners: H2O, IBM, Anaconda • SIs, Solution Vendors & Accelerator Partners • Open AI Platform w/ Ecosystem Partners • [Stack diagram: Power9, CPU, GPU, PowerAI, IBM SW, ISV SW, Solution, SIs]