SlideShare una empresa de Scribd logo
1 de 45
Open, Secure & Transparent AI
Pipelines
Nick Pentreath, Principal Engineer IBM
About
@MLnick on Twitter & Github
Principal Engineer, IBM
CODAIT - Center for Open-Source Data
& AI Technologies
Machine Learning & AI
Apache Spark committer & PMC
Author of Machine Learning with Spark
Various conferences & meetups
Center for Open Source
Data and AI Technologies
Watson West Building
505 Howard St.
San Francisco, California
CODAIT aims to make AI solutions dramatically
easier to create, deploy, and manage in the
enterprise.
Relaunch of the IBM Spark Technology Center
(STC) to reflect expanded mission.
We contribute to foundational open source
software across the enterprise AI lifecycle.
36 open-source developers!
Improving Enterprise AI Lifecycle in Open Source
CODAIT
codait.org
What is AI?
What about Machine Learning
/ Deep Learning?
Learn from data to make predictions
What about Machine Learning
/ Deep Learning?
Learn from historical data to make
predictions about the future
Applied Machine Learning
Learn from historical data to make
predictions about the future, in order
to make decisions
Intelligent Systems
Source: https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/
Automated decision-making
Continual learning (new data & feedback)
Adapting to environment
Memory & generalization
Trust in AI
The Machine Learning
Workflow
Perception
In reality the workflow spans
teams …
… and tools …
… and is a small (but critical!)
piece of the puzzle
*Source: Hidden Technical Debt in Machine Learning Systems
Training
* Logos trademarks of their respective projects
17
Fabric for Deep Learning
https://developer.ibm.com/ope
n/projects/fabric-for-deep-
learning-ffdl/
https://github.com/IBM/FfDL• Fabric for Deep Learning or FfDL (pronounced ‘fiddle’) aims
at making Deep Learning easily accessible to Data
Scientists and AI developers.
• FfDL provides a consistent way to train and visualize Deep
Learning jobs across multiple frameworks like TensorFlow,
Caffe, PyTorch, Keras etc.
FfDL
Community Partners
FfDL is one of InfoWorld’s 2018 Best of Open
Source Software Award winners for machine
learning and deep learning!
AI Deployment
What, Where, How?
• What are you deploying?
• What is a “model”?
• Where are you deploying?
• Target environment
• Batch, streaming, real-time?
• How are you deploying?
• “devops” deployment mechanism
• Serving framework
We will talk mostly about the what
What is a “model”?
Pipelines, not Models
• Deploying just the model part of the
workflow is not enough
• Entire pipeline must be deployed
• Data transform
• Feature extraction & pre-processing
• ML model itself
• Prediction transformation
• Technically even ETL is part of the
pipeline!
• Pipelines in open-source frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)
Challenges
• Proliferation of formats
• Open source, open standard: PMML, PFA, ONNX
• Open-source, non-standard: MLeap, Spark,
TensorFlow, Keras, PyTorch, Caffe, …
• Proprietary formats: lock-in, not portable
• Lack of standardization leads to custom
solutions
• Where standards exist, limitations lead to
custom extensions, eliminating the benefits
• Need to manage and bridge many different:
• Languages - Python, R, Notebooks, Scala / Java / C
• Frameworks – too many to count!
• Dependencies
• Versions
• Performance characteristics can be highly
variable across these dimensions
• Friction between teams
• Data scientists & researchers – latest & greatest
• Production – stability, control, minimize changes,
performance
• Business – metrics, business impact, product must
always work!
Containers for ML
Pipelines
23
Containers for ML Deployment
• But …
• What goes in the container is most
important
• Performance can be highly variable across
language, framework, version
• Requires devops knowledge, CI /
deployment pipelines, good practices
• Does not solve the issue of standardization
• Formats
• APIs exposed
• A serving framework is still required on top
• Container-based deployment has
significant benefits
• Repeatability
• Ease of configuration
• Separation of concerns – focus on what, not
how
• Allow data scientists & researchers to use
their language / framework of choice
• Container frameworks take care of (certain)
monitoring, fault tolerance, HA, etc.
IBM Developer
Model Asset eXchange
http://ibm.biz/model-exchange
Free, open-source deep learning models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a web service in 30 seconds.
Start training on Fabric for Deep Learning (FfDL) or
Watson Machine Learning in minutes.
MAX Model Metadata
Open Standards for Model Deployment
Predictive Model Markup
Language (PMML)
• Shortcomings
• Cannot represent arbitrary programs / analytic
applications
• Flexibility comes from custom plugins => lose
benefits of standardization
• Data Mining Group (DMG)
• Model interchange format in XML with
operators
• Widely used and supported; open standard
• Spark support lacking natively but 3rd party
projects available: jpmml-sparkml
• Comprehensive support for Spark ML components
(perhaps surprisingly!)
• Watch SPARK-11237
• Other exporters include scikit-learn, R,
XGBoost and LightGBM
Portable Format for Analytics
• Shortcomings
• PFA is still young and needs to gain adoption
• Production robustness and performance at
scale?
• Limitations of PFA – especially for deep learning
applications
• A standard can move slowly in terms of new
features, fixes and enhancements
• PFA is being championed by the DMG
• PFA consists of:
• JSON serialization format
• AVRO schemas for data types
• Encodes functions (actions) that are applied to inputs to
create outputs with a set of built-in functions and
language constructs (e.g. control-flow, conditionals)
• Essentially a mini functional math language + schema
specification
• Portability across languages, frameworks, run
times and versions
https://github.com/CODAIT/aardpfark
Open Neural Network Exchange (ONNX)
• Shortcomings
• Relatively poor support for “traditional” ML or
general language constructs (currently)
• String / categorical processing
• Datetime, collection operators
• Intermediate variables
• Evolving standard – coverage of operators (e.g.
TensorFlow)
• Championed by Facebook & Microsoft
• Protobuf serialization format
• Describes computation graph (including
operators)
• In this way the serialized graph is “self-describing”
similarly to PFA
• More focused on Deep Learning / tensor
operations
• Baked into PyTorch 1.0.0 / Caffe2 as the
serialization & interchange format
• ONNX-ML covers “traditional” ML operators
https://github.com/onnx
Monitoring & Feedback
DBG / May 10, 2018 / © 2018 IBM Corporation
Monitoring
Performance Business
Monitoring
Software
Traditional software monitoring
Latency, throughput, resource usage, etc
Model metrics
Traditional ML evaluation measures +
bias, robustness, explainability
Business metrics
Impact of predictions on business
outcomes
• Additional revenue - e.g. uplift from recommender
• Cost savings – e.g. value of fraud prevented
• Metrics implicitly influencing these – e.g. user
engagement
Feedback
Adapt
An intelligent system must automatically
learn from & adapt to the world around it
Continual learning
Retraining, online learning,
reinforcement learning
Feedback loops
Explicit: models create or directly
influence their own training data
Implicit: predictions influence behavior in
longer-term or indirect ways
Humans in the loop
Data
Transform
TrainDeploy
Feedback
AI Fairness & Transparency
34
35
AI Fairness 360
Toolbox:
Fairness metrics (30+)
Fairness metric
explanations
Bias mitigation
algorithms (10)
AIF360AIF360 toolkit is an open-source library to help
detect and remove bias in machine learning
models.
The AI Fairness 360 Python package includes
a comprehensive set of metrics for datasets
and models to test for biases, explanations for
these metrics, and algorithms to mitigate bias
in datasets and models.
https://github.com/IBM/AIF360
https://developer.ibm.com/patterns/ensuring-
fairness-when-processing-loan-applications/
36
AIF360 Demo: http://aif360.mybluemix.net
Defending machine
learning systems
Adversarial Attacks
38Sources: Explaining and Harnessing Adversarial Examples
Robust Physical-World Attacks on Deep Learning Visual Classification
IBM Adversarial Robustness
Toolbox
ART
ART is a library dedicated to adversarial
machine learning. Its purpose is to allow rapid
crafting and analysis of attack and defense
methods for machine learning models. The
Adversarial Robustness Toolbox provides an
implementation for many state-of-the-art
methods for attacking and defending
classifiers.
39
https://github.com/IBM/adversarial-robustness-toolbox
https://developer.ibm.com/patterns/integrate-adversarial-
attacks-model-training-pipeline/
Toolbox
Evasion attacks (11)
Defenses (9)
Detection methods for
adversarial samples &
poisoning attacks
Robustness metrics
Getting CLEVER
40
Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER):
• Published by IBM Research
• Attack-agnostic measure
• Efficient computation
https://medium.com/@MITIBMLab/getting-clever-er-expanding-
the-scope-of-a-robustness-metric-for-neural-networks-81c6c6ecb
https://arxiv.org/abs/1801.10578
https://github.com/IBM/CLEVER-Robustness-Score
41
ART Demo: https://art-demo.mybluemix.net/
Wrapping Up
CODAIT: End-to-end Enterprise AI
in Open Source
44
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
FfDL
Sign up for IBM Cloud and try Watson Studio!
https://ibm.biz/Bd2Gb7
MAX
Thank you.

Más contenido relacionado

La actualidad más candente

Index conf sparkai-feb20-n-pentreath
Index conf sparkai-feb20-n-pentreathIndex conf sparkai-feb20-n-pentreath
Index conf sparkai-feb20-n-pentreathChester Chen
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...Databricks
 
Ugif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutesUgif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutesUGIF
 
ER/Studio 2016: Build a Business-Driven Data Architecture
ER/Studio 2016: Build a Business-Driven Data ArchitectureER/Studio 2016: Build a Business-Driven Data Architecture
ER/Studio 2016: Build a Business-Driven Data ArchitectureEmbarcadero Technologies
 
Querix 4 gl app analyzer 2016 journey to the center of your 4gl application
Querix 4 gl app analyzer 2016 journey to the center of your 4gl applicationQuerix 4 gl app analyzer 2016 journey to the center of your 4gl application
Querix 4 gl app analyzer 2016 journey to the center of your 4gl applicationBeGooden-IT Consulting
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Databricks
 
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Databricks
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowJan Kirenz
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...Databricks
 
Serverless machine learning operations
Serverless machine learning operationsServerless machine learning operations
Serverless machine learning operationsStepan Pushkarev
 
ML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkFaisal Siddiqi
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonSimon Frid
 
IBM Rhapsody and MATLAB/Simulink
IBM Rhapsody and MATLAB/SimulinkIBM Rhapsody and MATLAB/Simulink
IBM Rhapsody and MATLAB/Simulinkgjuljo
 
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)Simpler and Safer Java Types (via the Vavr and Lambda Libraries)
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)Garth Gilmour
 
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...Bill Liu
 
Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Nikhil Garg
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...MLconf
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsAndrzej Michałowski
 
DIY Analytics with Apache Spark
DIY Analytics with Apache SparkDIY Analytics with Apache Spark
DIY Analytics with Apache SparkAdam Roberts
 

La actualidad más candente (20)

Index conf sparkai-feb20-n-pentreath
Index conf sparkai-feb20-n-pentreathIndex conf sparkai-feb20-n-pentreath
Index conf sparkai-feb20-n-pentreath
 
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens... ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
ML at the Edge: Building Your Production Pipeline with Apache Spark and Tens...
 
Ugif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutesUgif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutes
 
ER/Studio 2016: Build a Business-Driven Data Architecture
ER/Studio 2016: Build a Business-Driven Data ArchitectureER/Studio 2016: Build a Business-Driven Data Architecture
ER/Studio 2016: Build a Business-Driven Data Architecture
 
Querix 4 gl app analyzer 2016 journey to the center of your 4gl application
Querix 4 gl app analyzer 2016 journey to the center of your 4gl applicationQuerix 4 gl app analyzer 2016 journey to the center of your 4gl application
Querix 4 gl app analyzer 2016 journey to the center of your 4gl application
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
 
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
Deep Learning for Natural Language Processing Using Apache Spark and TensorFl...
 
Monitoring AI with AI
Monitoring AI with AIMonitoring AI with AI
Monitoring AI with AI
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 
Serverless machine learning operations
Serverless machine learning operationsServerless machine learning operations
Serverless machine learning operations
 
ML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talkML Infra for Netflix Recommendations - AI NEXTCon talk
ML Infra for Netflix Recommendations - AI NEXTCon talk
 
Managing and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in PythonManaging and Versioning Machine Learning Models in Python
Managing and Versioning Machine Learning Models in Python
 
IBM Rhapsody and MATLAB/Simulink
IBM Rhapsody and MATLAB/SimulinkIBM Rhapsody and MATLAB/Simulink
IBM Rhapsody and MATLAB/Simulink
 
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)Simpler and Safer Java Types (via the Vavr and Lambda Libraries)
Simpler and Safer Java Types (via the Vavr and Lambda Libraries)
 
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
AISF19 - Building Scalable, Kubernetes-Native ML/AI Pipelines with TFX, KubeF...
 
Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)Building A Machine Learning Platform At Quora (1)
Building A Machine Learning Platform At Quora (1)
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
DIY Analytics with Apache Spark
DIY Analytics with Apache SparkDIY Analytics with Apache Spark
DIY Analytics with Apache Spark
 

Similar a Open, Secure & Transparent AI Pipelines

Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-finalsupportlogic
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionFlorian Wilhelm
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Databricks
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningProvectus
 
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719BingWang77
 
Building a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlowBuilding a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlowGoDataDriven
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning ModelsTash Bickley
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Nisha Talagala
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016MLconf
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessInside Analysis
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
How AI and ML Can Accelerate and Optimize Software Development and Testing
How AI and ML Can Accelerate and Optimize Software Development and TestingHow AI and ML Can Accelerate and Optimize Software Development and Testing
How AI and ML Can Accelerate and Optimize Software Development and TestingAggregage
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Knoldus Inc.
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018Adam Gibson
 
GenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptxGenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptxAllen Chan
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveWalid Shaari
 
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...DataScienceConferenc1
 
DBCS Office Hours - Modernization through Migration
DBCS Office Hours - Modernization through MigrationDBCS Office Hours - Modernization through Migration
DBCS Office Hours - Modernization through MigrationTammy Bednar
 

Similar a Open, Secure & Transparent AI Pipelines (20)

Ideas spracklen-final
Ideas spracklen-finalIdeas spracklen-final
Ideas spracklen-final
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719
Apex Enterprise Patterns Galore - Boston, MA dev group meeting 062719
 
Building a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlowBuilding a Scalable and reliable open source ML Platform with MLFlow
Building a Scalable and reliable open source ML Platform with MLFlow
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning Models
 
Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017Strata parallel m-ml-ops_sept_2017
Strata parallel m-ml-ops_sept_2017
 
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
How AI and ML Can Accelerate and Optimize Software Development and Testing
How AI and ML Can Accelerate and Optimize Software Development and TestingHow AI and ML Can Accelerate and Optimize Software Development and Testing
How AI and ML Can Accelerate and Optimize Software Development and Testing
 
Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)Databricks for MLOps Presentation (AI/ML)
Databricks for MLOps Presentation (AI/ML)
 
World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018World Artificial Intelligence Conference Shanghai 2018
World Artificial Intelligence Conference Shanghai 2018
 
GenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptxGenerativeAI and Automation - IEEE ACSOS 2023.pptx
GenerativeAI and Automation - IEEE ACSOS 2023.pptx
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 
Network Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspectiveNetwork Automation Journey, A systems engineer NetOps perspective
Network Automation Journey, A systems engineer NetOps perspective
 
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...
[DSC Europe 23] Igor Ilic - Redefining User Experience with Large Language Mo...
 
DBCS Office Hours - Modernization through Migration
DBCS Office Hours - Modernization through MigrationDBCS Office Hours - Modernization through Migration
DBCS Office Hours - Modernization through Migration
 

Más de Nick Pentreath

Scaling up deep learning by scaling down
Scaling up deep learning by scaling downScaling up deep learning by scaling down
Scaling up deep learning by scaling downNick Pentreath
 
IBM Developer Model Asset eXchange - Deep Learning for Everyone
IBM Developer Model Asset eXchange - Deep Learning for EveryoneIBM Developer Model Asset eXchange - Deep Learning for Everyone
IBM Developer Model Asset eXchange - Deep Learning for EveryoneNick Pentreath
 
Search and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinSearch and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinNick Pentreath
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsNick Pentreath
 
Productionizing Spark ML Pipelines with the Portable Format for Analytics
Productionizing Spark ML Pipelines with the Portable Format for AnalyticsProductionizing Spark ML Pipelines with the Portable Format for Analytics
Productionizing Spark ML Pipelines with the Portable Format for AnalyticsNick Pentreath
 
RNNs for Recommendations and Personalization
RNNs for Recommendations and PersonalizationRNNs for Recommendations and Personalization
RNNs for Recommendations and PersonalizationNick Pentreath
 

Más de Nick Pentreath (6)

Scaling up deep learning by scaling down
Scaling up deep learning by scaling downScaling up deep learning by scaling down
Scaling up deep learning by scaling down
 
IBM Developer Model Asset eXchange - Deep Learning for Everyone
IBM Developer Model Asset eXchange - Deep Learning for EveryoneIBM Developer Model Asset eXchange - Deep Learning for Everyone
IBM Developer Model Asset eXchange - Deep Learning for Everyone
 
Search and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same CoinSearch and Recommendations: 3 Sides of the Same Coin
Search and Recommendations: 3 Sides of the Same Coin
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Productionizing Spark ML Pipelines with the Portable Format for Analytics
Productionizing Spark ML Pipelines with the Portable Format for AnalyticsProductionizing Spark ML Pipelines with the Portable Format for Analytics
Productionizing Spark ML Pipelines with the Portable Format for Analytics
 
RNNs for Recommendations and Personalization
RNNs for Recommendations and PersonalizationRNNs for Recommendations and Personalization
RNNs for Recommendations and Personalization
 

Último

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 

Último (20)

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 

Open, Secure & Transparent AI Pipelines

  • 1. Open, Secure & Transparent AI Pipelines Nick Pentreath, Principal Engineer IBM
  • 2. About @MLnick on Twitter & Github Principal Engineer, IBM CODAIT - Center for Open-Source Data & AI Technologies Machine Learning & AI Apache Spark committer & PMC Author of Machine Learning with Spark Various conferences & meetups
  • 3. Center for Open Source Data and AI Technologies Watson West Building 505 Howard St. San Francisco, California CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. Relaunch of the IBM Spark Technology Center (STC) to reflect expanded mission. We contribute to foundational open source software across the enterprise AI lifecycle. 36 open-source developers! Improving Enterprise AI Lifecycle in Open Source CODAIT codait.org
  • 5. What about Machine Learning / Deep Learning? Learn from data to make predictions
  • 6. What about Machine Learning / Deep Learning? Learn from historical data to make predictions about the future
  • 7. Applied Machine Learning Learn from historical data to make predictions about the future, in order to make decisions
  • 8. Intelligent Systems Source: https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/ Automated decision-making Continual learning (new data & feedback) Adapting to environment Memory & generalization
  • 12. In reality the workflow spans teams …
  • 14. … and is a small (but critical!) piece of the puzzle *Source: Hidden Technical Debt in Machine Learning Systems
  • 16. * Logos trademarks of their respective projects
  • 17. 17 Fabric for Deep Learning https://developer.ibm.com/ope n/projects/fabric-for-deep- learning-ffdl/ https://github.com/IBM/FfDL• Fabric for Deep Learning or FfDL (pronounced ‘fiddle’) aims at making Deep Learning easily accessible to Data Scientists and AI developers. • FfDL provides a consistent way to train and visualize Deep Learning jobs across multiple frameworks like TensorFlow, Caffe, PyTorch, Keras etc. FfDL Community Partners FfDL is one of InfoWorld’s 2018 Best of Open Source Software Award winners for machine learning and deep learning!
  • 19. What, Where, How? • What are you deploying? • What is a “model”? • Where are you deploying? • Target environment • Batch, streaming, real-time? • How are you deploying? • “devops” deployment mechanism • Serving framework We will talk mostly about the what
  • 20. What is a “model”?
  • 21. Pipelines, not Models • Deploying just the model part of the workflow is not enough • Entire pipeline must be deployed • Data transform • Feature extraction & pre-processing • ML model itself • Prediction transformation • Technically even ETL is part of the pipeline! • Pipelines in open-source frameworks • scikit-learn • Spark ML pipelines • TensorFlow Transform • pipeliner (R)
  • 22. Challenges • Proliferation of formats • Open source, open standard: PMML, PFA, ONNX • Open-source, non-standard: MLeap, Spark, TensorFlow, Keras, PyTorch, Caffe, … • Proprietary formats: lock-in, not portable • Lack of standardization leads to custom solutions • Where standards exist, limitations lead to custom extensions, eliminating the benefits • Need to manage and bridge many different: • Languages - Python, R, Notebooks, Scala / Java / C • Frameworks – too many to count! • Dependencies • Versions • Performance characteristics can be highly variable across these dimensions • Friction between teams • Data scientists & researchers – latest & greatest • Production – stability, control, minimize changes, performance • Business – metrics, business impact, product must always work!
  • 24. Containers for ML Deployment • But … • What goes in the container is most important • Performance can be highly variable across language, framework, version • Requires devops knowledge, CI / deployment pipelines, good practices • Does not solve the issue of standardization • Formats • APIs exposed • A serving framework is still required on top • Container-based deployment has significant benefits • Repeatability • Ease of configuration • Separation of concerns – focus on what, not how • Allow data scientists & researchers to use their language / framework of choice • Container frameworks take care of (certain) monitoring, fault tolerance, HA, etc.
  • 25. IBM Developer Model Asset eXchange http://ibm.biz/model-exchange Free, open-source deep learning models. Wide variety of domains. Multiple deep learning frameworks. Vetted and tested code and IP. Build and deploy a web service in 30 seconds. Start training on Fabric for Deep Learning (FfDL) or Watson Machine Learning in minutes.
  • 27. Open Standards for Model Deployment
  • 28. Predictive Model Markup Language (PMML) • Shortcomings • Cannot represent arbitrary programs / analytic applications • Flexibility comes from custom plugins => lose benefits of standardization • Data Mining Group (DMG) • Model interchange format in XML with operators • Widely used and supported; open standard • Spark support lacking natively but 3rd party projects available: jpmml-sparkml • Comprehensive support for Spark ML components (perhaps surprisingly!) • Watch SPARK-11237 • Other exporters include scikit-learn, R, XGBoost and LightGBM
  • 29. Portable Format for Analytics • Shortcomings • PFA is still young and needs to gain adoption • Production robustness and performance at scale? • Limitations of PFA – especially for deep learning applications • A standard can move slowly in terms of new features, fixes and enhancements • PFA is being championed by the DMG • PFA consists of: • JSON serialization format • AVRO schemas for data types • Encodes functions (actions) that are applied to inputs to create outputs with a set of built-in functions and language constructs (e.g. control-flow, conditionals) • Essentially a mini functional math language + schema specification • Portability across languages, frameworks, run times and versions https://github.com/CODAIT/aardpfark
  • 30. Open Neural Network Exchange (ONNX) • Shortcomings • Relatively poor support for “traditional” ML or general language constructs (currently) • String / categorical processing • Datetime, collection operators • Intermediate variables • Evolving standard – coverage of operators (e.g. TensorFlow) • Championed by Facebook & Microsoft • Protobuf serialization format • Describes computation graph (including operators) • In this way the serialized graph is “self-describing” similarly to PFA • More focused on Deep Learning / tensor operations • Baked into PyTorch 1.0.0 / Caffe2 as the serialization & interchange format • ONNX-ML covers “traditional” ML operators https://github.com/onnx
  • 31. Monitoring & Feedback DBG / May 10, 2018 / © 2018 IBM Corporation
  • 32. Monitoring Performance Business Monitoring Software Traditional software monitoring Latency, throughput, resource usage, etc Model metrics Traditional ML evaluation measures + bias, robustness, explainability Business metrics Impact of predictions on business outcomes • Additional revenue - e.g. uplift from recommender • Cost savings – e.g. value of fraud prevented • Metrics implicitly influencing these – e.g. user engagement
  • 33. Feedback Adapt An intelligent system must automatically learn from & adapt to the world around it Continual learning Retraining, online learning, reinforcement learning Feedback loops Explicit: models create or directly influence their own training data Implicit: predictions influence behavior in longer-term or indirect ways Humans in the loop Data Transform TrainDeploy Feedback
  • 34. AI Fairness & Transparency 34
  • 35. 35 AI Fairness 360 Toolbox: Fairness metrics (30+) Fairness metric explanations Bias mitigation algorithms (10) AIF360AIF360 toolkit is an open-source library to help detect and remove bias in machine learning models. The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in datasets and models. https://github.com/IBM/AIF360 https://developer.ibm.com/patterns/ensuring- fairness-when-processing-loan-applications/
  • 38. Adversarial Attacks 38Sources: Explaining and Harnessing Adversarial Examples Robust Physical-World Attacks on Deep Learning Visual Classification
  • 39. IBM Adversarial Robustness Toolbox ART ART is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides an implementation for many state-of-the-art methods for attacking and defending classifiers. 39 https://github.com/IBM/adversarial-robustness-toolbox https://developer.ibm.com/patterns/integrate-adversarial- attacks-model-training-pipeline/ Toolbox Evasion attacks (11) Defenses (9) Detection methods for adversarial samples & poisoning attacks Robustness metrics
  • 40. Getting CLEVER 40 Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER): • Published by IBM Research • Attack-agnostic measure • Efficient computation https://medium.com/@MITIBMLab/getting-clever-er-expanding- the-scope-of-a-robustness-metric-for-neural-networks-81c6c6ecb https://arxiv.org/abs/1801.10578 https://github.com/IBM/CLEVER-Robustness-Score
  • 43. CODAIT: End-to-end Enterprise AI in Open Source
  • 44. 44 codait.org twitter.com/MLnick github.com/MLnick developer.ibm.com FfDL Sign up for IBM Cloud and try Watson Studio! https://ibm.biz/Bd2Gb7 MAX