Deploying machine learning models
in the enterprise
Strata Data Conference NYC
Diego Oppenheimer, CEO
diego@algorithmia.com
About Me
Diego Oppenheimer - Founder and CEO - Algorithmia
● Product developer, entrepreneur, extensive background in all things data.
● Microsoft: PowerPivot, PowerBI, Excel and SQL Server.
● Founder of algorithmic trading startup
● BS/MS Carnegie Mellon University
Make state-of-the-art algorithms
discoverable and accessible
to everyone.
Algorithmia.com
AI/ML scalable infrastructure on demand + marketplace
● Function-as-a-service for Machine & Deep Learning
● Discoverable, live inventory of AI
● Monetizable
● Composable
● Every developer on earth can make their app intelligent
“There’s an algorithm for that!”
77K DEVELOPERS 6.4K algorithms, models and functions
What does production mean for us
● ~6,400 algorithms, models, and functions (50K counting separate versions)
● Each model: 1 to 1,000 calls a second, fluctuates, no devops
● ~15ms overhead latency
● Accessible in any of 14 languages (through SDKs)
● Any runtime, any architecture
ALGORITHMIA ENTERPRISE
Algorithmia Enterprise is an organization’s internal inventory of intelligence and an
algorithm-as-a-service platform
Deploy
Write your function or model in any programming language, framework, or infrastructure.
Scale
Expose your model as a highly-reliable versioned REST API that automatically scales from one to hundreds of requests per second.
Discover
Name and describe your model, making it available in a central catalog where your peers can easily discover and reuse it.
Monitor
House thousands of models under one roof with a uniform REST interface and a single cluster monitoring dashboard.
MACHINE LEARNING
!=
PRODUCTION MACHINE LEARNING
What we will cover
● Challenges of Deploying Models in the Enterprise
● Characteristics of AI and Technologies
● Varying languages
● Standardize versioning
● Continuous model deployment
● Managing your portfolio of ML models
● Analytics on how and where models are being used
● Maintaining auditability
● Best practices for your organization
Challenges of deploying models in the enterprise
● Machine learning
○ CPU/GPU/Specialized hardware
○ Multiple frameworks, languages,
dependencies
○ Called from different
devices/architectures
● “Snowflake” environments
○ Unique cloud hardware and services
● Security and Audit
○ Stringent security and access controls
○ “Who called what when” for audit and
compliance
● Uncharted territory
○ Not a lot of literature
○ Deployment for data science teams is a
new problem
○ Many teams have not bought software
or dealt with their own infrastructure
teams.
○ Chargebacks and billing
Characteristics of AI/ML
• Two distinct phases: training and inference
• Lots of processing power
• Heterogeneous hardware (CPUs, GPUs, TPUs, etc.)
• Limited by compute rather than bandwidth
• “Tensorflow is open source, scaling it is not.” - Kenny Daniel
Technologies
Metal or VM, Containers, Kubernetes
INFERENCE
Short compute bursts
Elastic
Stateless
Multiple users
OWNER: DevOps
TRAINING
Long compute cycle
Fixed load (Inelastic)
Stateful
Single user
OWNER: Data Scientists
Two system design paradigms that work well
MICROSERVICES: the design of a system as
independently deployable, loosely coupled
services.
ADVANTAGES
• Maintainability
• Scalability
• Rolling deployments
• Elastic
• Software/Hardware agnostic
SERVERLESS: the encapsulation, starting, and
stopping of singular functions per request, with
a just-in-time-compute model.
ADVANTAGES
• Cost / Efficiency
• Concurrency built-in
• Speed of development
• Improved latency
Think of your technology stack as an OS for running AI in your enterprise
Shell & Services
Discoverability, Authentication, Instrumentation, etc.
Kernel
Runtime Abstraction
Support any programming language or framework, including interoperability between mixed stacks.
Elastic Scale
Prioritize and automatically optimize execution of concurrent short-lived jobs.
Cloud Abstraction
Provide portability to algorithms, including public clouds or private clouds.
Multiple frameworks, languages, dependencies
● Models are rarely developed in the language they will be consumed in.
● In large enterprises there is rarely a standard language for software
development
● The goal should be to make models consumable by any part of the
organization on any platform:
○ To ensure the max value extracted from the model
○ To ensure model re-use
○ To ensure the fastest time from lab to production
● APIs and well developed SDKs are your best friend.
○ APIs allow for easy testing if a model is adequate.
○ APIs allow for easy consumption.
○ Well-developed APIs have built-in versioning, which takes us to the next
step...
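To make the API point concrete, here is a minimal sketch of a language-neutral entry point: any client that can send a path and a JSON body can call the model, whatever language the model was written in. The route layout `/models/<name>/<version>`, the `MODELS` table, and the toy sentiment rule are all assumptions for illustration.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical in-memory table: (model name, version) -> callable model.
MODELS: Dict[Tuple[str, str], Callable[[dict], dict]] = {
    ("sentiment", "1.0.0"): lambda x: {
        "label": "positive" if "good" in x["text"] else "negative"
    },
}


def call_model(path: str, body: str) -> Tuple[int, str]:
    """Dispatch a request such as POST /models/sentiment/1.0.0.

    Returns an (HTTP-style status code, JSON response body) pair.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "models":
        return 404, json.dumps({"error": "unknown route"})
    key = (parts[1], parts[2])
    if key not in MODELS:
        return 404, json.dumps({"error": "unknown model or version"})
    try:
        result = MODELS[key](json.loads(body))
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON or a missing/incorrect field in the payload.
        return 400, json.dumps({"error": f"bad input: {exc}"})
    return 200, json.dumps(result)
```

Putting the version in the path is one common way to get the built-in versioning mentioned above; a header or query parameter would serve equally well.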
Standardize versioning
● Versioning is an extremely important part of deploying models in the enterprise.
● Start thinking about models as you would any other piece of modular software:
○ Must be versioned
○ Versions must be tracked
○ Older versions should be accessible (for rollbacks, acceptance testing, etc.).
For the Data Scientist:
● Ability to compare two different versions of the model is key.
○ Not only at training and verification time but also to understand performance
and SLA changes.
● Model drift.
For the Application Developers:
● Ability to match acceptance testing to predetermined developer cycles.
● Ability to stay on an older version.
● Avoid performance regressions, even when a newer version is more accurate.
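The versioning requirements above can be sketched as a small registry in which every published version stays callable, so an application can pin an older version while others move to the latest. The class name `VersionedModel` and the three-part `major.minor.patch` scheme are assumptions for this example.

```python
from typing import Callable, Dict, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


class VersionedModel:
    """Keeps every published version of one model available."""

    def __init__(self) -> None:
        self._versions: Dict[Version, Callable[[float], float]] = {}

    def publish(self, version: str, fn: Callable[[float], float]) -> None:
        key = parse_version(version)
        if key in self._versions:
            # Published versions are immutable; changes need a new version.
            raise ValueError(f"version {version} already published")
        self._versions[key] = fn

    def resolve(self, pinned: Optional[str] = None) -> Callable[[float], float]:
        """Return the pinned version, or the highest version if none is pinned."""
        if not self._versions:
            raise LookupError("no versions published")
        key = parse_version(pinned) if pinned else max(self._versions)
        return self._versions[key]
```

A rollback is then just a matter of resolving an earlier version string; nothing has to be rebuilt.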
Standardize versioning
● Even better, make your system auto-version.
● Borrow the best practices from the Software
Development world.
● Rolling non-interruptive deployments
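One possible way to auto-version, sketched under the assumption that a model is published as a byte artifact: hash the artifact and assign a new version number only when the content differs from the latest published one. `AutoVersioner` is a hypothetical name, and comparing against only the latest digest is a simplification.

```python
import hashlib
from typing import List, Tuple


class AutoVersioner:
    """Assigns increasing version numbers based on artifact content."""

    def __init__(self) -> None:
        # Ordered history of (version number, SHA-256 digest of the artifact).
        self._history: List[Tuple[int, str]] = []

    def publish(self, artifact: bytes) -> int:
        """Return the version for this artifact, bumping it only on change."""
        digest = hashlib.sha256(artifact).hexdigest()
        if self._history and self._history[-1][1] == digest:
            return self._history[-1][0]  # identical to latest: no new version
        version = len(self._history) + 1
        self._history.append((version, digest))
        return version
```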
Standardize Documentation
● Like any API, models are only as good as their documentation.
● Make the documentation travel and be updated with the model to ensure it's always up to date.
○ Directly in Markdown inside the Git repository that contains the model.
○ As an artifact that travels with the model.
Continuous deployment
● Just like with standardizing versioning, Continuous
Integration and Continuous Deployment for AI/ML
should borrow from best practices of software
development.
● The fastest path is usually the best (git push -> deploy).
○ Git + Docker + API generation makes this really
easy.
● Don’t forget dependency management!
● Some interesting use cases:
○ Continuous training and deployment
○ Human in the loop training and deployment
○ Bespoke training and deployment to central
platform
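A continuous-deployment pipeline needs a gate between "pushed" and "live". The sketch below shows one such gate: a candidate is promoted only if it passes a set of acceptance cases and, for the human-in-the-loop case, an optional approver says yes. `Candidate`, `deploy_if_ready`, and the tolerance value are assumptions for illustration; a real pipeline would also build the container and pin dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    name: str
    predict: Callable[[float], float]


def deploy_if_ready(
    candidate: Candidate,
    acceptance_cases: List[Tuple[float, float]],
    approver: Optional[Callable[[Candidate], bool]] = None,
    tolerance: float = 1e-6,
) -> bool:
    """Return True if the candidate may be promoted to production."""
    # Automated check: every (input, expected output) pair must match.
    for x, expected in acceptance_cases:
        if abs(candidate.predict(x) - expected) > tolerance:
            return False
    # Human in the loop: an approver, if supplied, has the final say.
    if approver is not None and not approver(candidate):
        return False
    return True
```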
Continuous deployment - Human in the Loop
Continuous deployment - Single train platform to Deployment
Continuous deployment - Multiple training frameworks to deployment
Managing a model portfolio
“We have 14 versions of ImageMagick running as services
for image resizing before feeding into a number of different
models.”
Dev Manager Analytics Platform - F500 Media Company
● As with an API strategy, as new models become available you need to
start caring about how to find them, who can use them, and how to bill
for them.
● The concepts of an API gateway and an API registry carry over: the same
paradigms work for model management and distribution.
● A common, centralized registry will offer the ability to find what has
already been created and potentially re-use it.
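A centralized registry can be sketched as a searchable catalog of named, described models. The `CatalogEntry` fields and the simple substring search below are assumptions for this example; they only illustrate how a team could find an existing image-resizing function before building a fifteenth copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    name: str
    description: str
    owner: str
    tags: List[str] = field(default_factory=list)


class ModelCatalog:
    """Central list of models that anyone in the organization can search."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"{entry.name} is already registered")
        self._entries[entry.name] = entry

    def search(self, term: str) -> List[str]:
        """Return sorted names whose name, description, or tags mention term."""
        needle = term.lower()
        hits = []
        for entry in self._entries.values():
            haystack = [entry.name, entry.description, *entry.tags]
            if any(needle in text.lower() for text in haystack):
                hits.append(entry.name)
        return sorted(hits)
```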
Managing a model portfolio
● Centralized
repository/registry of
models that can be accessed
across organization.
● Encourage reuse (many
preprocessing and post
processing functions should
only be built once).
● Finding existing models is
key for experimenting with
different pipelines.
Managing a model portfolio
● Centralization of models allows for understanding business impact and usage across the
organization.
● Gives C-level executives an understanding of whether AI/ML investments are working.
● Finding existing models is key for experimenting with different pipelines and for rapid
application development by disparate teams or external developers.
● Security and Access controls so that only the right people in the organization can access the
models.
○ Who can view how the model works vs. who can call it.
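The distinction between viewing a model and calling it can be expressed as two separate permissions. The sketch below uses Python's `enum.Flag`; the `ACL` table, team names, and model name are made up for illustration.

```python
from enum import Flag, auto
from typing import Dict, Tuple


class Permission(Flag):
    NONE = 0
    CALL = auto()         # may invoke the model through its API
    VIEW_SOURCE = auto()  # may inspect how the model works


# Hypothetical access-control list: (model, team) -> granted permissions.
ACL: Dict[Tuple[str, str], Permission] = {
    ("fraud-score", "checkout-app"): Permission.CALL,
    ("fraud-score", "risk-data-science"): Permission.CALL | Permission.VIEW_SOURCE,
}


def is_allowed(model: str, team: str, needed: Permission) -> bool:
    """True only if the team holds every permission in `needed`."""
    granted = ACL.get((model, team), Permission.NONE)
    return (granted & needed) == needed
```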
Model Analytics
During the training phase, analytics such as accuracy, drift, and error rates are very important. When
deploying models inside an enterprise, a different slice of analytics is required.
What is important during deployment and production:
● Latency
● Resources used (CPU/GPU, I/O)
● System Capacity
● Scale up and Scale down
● Authentication
● API timing metrics and calls
● Error rates
But also…
● What teams are using the models
● What applications are using them
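Several of the production metrics listed above (call counts, latency, error rates, and which team is calling) can be captured by wrapping each model call. `CallMetrics` is a hypothetical in-process collector used only to show the idea; a real deployment would export these numbers to a monitoring system.

```python
import time
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Tuple

Key = Tuple[str, str]  # (model name, calling team)


class CallMetrics:
    """Counts calls, errors, and total latency per (model, team) pair."""

    def __init__(self) -> None:
        self.calls: DefaultDict[Key, int] = defaultdict(int)
        self.errors: DefaultDict[Key, int] = defaultdict(int)
        self.total_seconds: DefaultDict[Key, float] = defaultdict(float)

    def record(self, model: str, team: str, fn: Callable[..., Any], *args: Any) -> Any:
        """Run fn(*args), recording timing and any failure, then pass it on."""
        key = (model, team)
        start = time.perf_counter()
        try:
            return fn(*args)
        except Exception:
            self.errors[key] += 1
            raise
        finally:
            # Runs for both successful and failed calls.
            self.calls[key] += 1
            self.total_seconds[key] += time.perf_counter() - start

    def error_rate(self, model: str, team: str) -> float:
        key = (model, team)
        return self.errors[key] / self.calls[key] if self.calls[key] else 0.0
```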
Model Auditability and Compliance
Your enterprise deployment system should be able to answer:
“Who called what model, when and with what data.”
Why is this important:
● Compliance:
○ Regulated industries need to provide this information to government regulators:
■ Financial Services
■ Life Sciences
■ Federal Government
● Gives C-level executives an understanding of whether AI/ML investments are working.
● Debugging production systems
● Billing
○ Complete understanding of who is using what models allows for chargebacks
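"Who called what model, when and with what data" maps naturally onto an append-only audit record, and the same records can drive chargebacks. In this sketch the input is stored as a SHA-256 fingerprint rather than in full; that is an assumption for brevity, and a regulated environment may need to retain the payload itself. `AuditLog` and its fields are hypothetical names.

```python
import hashlib
import json
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditRecord:
    caller: str        # who
    model: str         # what
    version: str
    called_at: str     # when, as an ISO 8601 UTC timestamp
    input_sha256: str  # with what data (fingerprint of the JSON payload)


class AuditLog:
    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def record(self, caller: str, model: str, version: str, payload: dict) -> AuditRecord:
        # Sort keys so the same data always yields the same fingerprint.
        canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
        entry = AuditRecord(
            caller=caller,
            model=model,
            version=version,
            called_at=datetime.now(timezone.utc).isoformat(),
            input_sha256=hashlib.sha256(canonical).hexdigest(),
        )
        self._records.append(entry)
        return entry

    def calls_per_caller(self) -> Counter:
        """Call counts per caller, a starting point for chargebacks."""
        return Counter(entry.caller for entry in self._records)
```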
Best practices and conclusion
“Expecting your engineering and DevOps teams to deploy ML models well is like showing up to
SeaWorld with a giraffe since they are already handling large mammals.”
● Technology:
○ Borrow from the best practices of software development and deployment of
applications and code at scale.
■ CI/CD, Versioning, API design, etc.
○ The most advanced AI/ML companies in the world are centralizing their deployment
and serving platforms under one roof. This is because the influence of data science
and ML teams across your organization will only grow.
○ Understand that training and serving have very different profiles and different
technology choices will need to be made.
○ Seriously consider using microservices and serverless as design patterns for re-use,
scale and modularity.
Best practices and conclusion
● Organization:
○ Production and serving will usually be owned by DevOps or Enterprise Architecture.
Success in deploying in the enterprise will be dictated by understanding the roles and
responsibilities of these teams.
○ Data science teams tend to be new with limited experience in:
■ Purchasing enterprise software
■ Requirements and considerations for production environments
■ IT requirements around Information Security, Compliance and Support.
It’s crucial for success to educate and guide these teams through the enterprise
requirements.
Best practices and conclusion
● Future proofing:
○ Think re-use: many models will be interesting and usable to multiple parts of the
enterprise. Discoverability and accessibility become key.
○ It is safe to assume that your number of models will only grow over time. How you will
manage them in the future requires having a conversation early in the process.
○ Your AI/ML model portfolio is an extremely important asset: understanding the value
you are getting from it is important, and its measurement and influence should be
tracked.
○ When deciding whether to build or buy, think of your team's capacity to adapt and move at the
pace the AI/ML industry is moving at over time.
MACHINE LEARNING
!=
PRODUCTION MACHINE LEARNING
Diego Oppenheimer
CEO
Thank you!
diego@algorithmia.com
@doppenhe

 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Deploying ML models in the enterprise

  • 1. Deploying machine learning models in the enterprise Strata Data Conference NYC Diego Oppenheimer, CEO diego@algorithmia.com
  • 2. About Me Diego Oppenheimer - Founder and CEO - Algorithmia ● Product developer, entrepreneur, extensive background in all things data. ● Microsoft: PowerPivot, PowerBI, Excel and SQL Server. ● Founder of algorithmic trading startup ● BS/MS Carnegie Mellon University
  • 3. Make state-of-the-art algorithms discoverable and accessible to everyone.
  • 4. Algorithmia.com AI/ML scalable infrastructure on demand + marketplace ● Function-as-a-service for Machine & Deep Learning ● Discoverable, live inventory of AI ● Monetizable ● Composable ● Every developer on earth can make their app intelligent
  • 5. “There’s an algorithm for that!” 77K DEVELOPERS 6.4K algorithms, models and functions
  • 6. What does production mean for us ● ~6,400 algorithms, models, functions (50k w/ different versions) ● Each model: 1 to 1,000 calls a second, fluctuates, no devops ● ~15ms overhead latency ● Accessible in any of 14 languages (through SDKs) ● Any runtime, any architecture
  • 7. ALGORITHMIA ENTERPRISE Algorithmia Enterprise is an organization’s internal inventory of intelligence and an algorithm-as-a-service platform. ● Deploy: Write your function or model in any programming language, framework, or infrastructure. ● Scale: Expose your model as a highly reliable versioned REST API that automatically scales from one to hundreds of requests per second. ● Discover: Name and describe your model, making it available in a central catalog where your peers can easily discover and reuse it. ● Monitor: House thousands of models under one roof with a uniform REST interface and a single cluster monitoring dashboard.
  • 9. What we will cover ● Challenges of Deploying Models in the Enterprise ● Characteristics of AI and Technologies ● Varying languages ● Standardize versioning ● Continuous model deployment ● Managing your portfolio of ML models ● Analytics on how and where models are being used ● Maintaining auditability ● Best practices for your organization
  • 10. Challenges of deploying models in the enterprise ● Machine learning ○ CPU/GPU/Specialized hardware ○ Multiple frameworks, languages, dependencies ○ Called from different devices/architectures ● “Snowflake” environments ○ Unique cloud hardware and services ● Security and Audit ○ Stringent security and access controls ○ “Who called what when” for audit and compliance ● Uncharted territory ○ Not a lot of literature ○ Deployment for data science teams is a new problem ○ Many teams have not bought software or dealt with their own infrastructure teams. ○ Chargebacks and billing
  • 11. Characteristics of AI/ML • Two distinct phases: training and inference • Lots of processing power • Heterogeneous hardware (CPUs, GPUs, TPUs, etc.) • Limited by compute rather than bandwidth • “TensorFlow is open source, scaling it is not.” - Kenny Daniel
  • 12. Technologies ● TRAINING (owner: Data Scientists): long compute cycle, fixed load (inelastic), stateful, single user ● INFERENCE (owner: DevOps): short compute bursts, elastic, stateless, multiple users ● Infrastructure options: metal or VM, containers, Kubernetes
  • 13. Two system design paradigms that work well ● MICROSERVICES: the design of a system as independently deployable, loosely coupled services. ADVANTAGES • Maintainability • Scalability • Rolling deployments • Elastic • Software/Hardware agnostic ● SERVERLESS: the encapsulation, starting, and stopping of singular functions per request, with a just-in-time-compute model. ADVANTAGES • Cost / Efficiency • Concurrency built-in • Speed of development • Improved latency
  • 14. Think of your technology stack as an OS for running AI in your enterprise ● Shell & Services: Discoverability, Authentication, Instrumentation, etc. ● Kernel ○ Runtime Abstraction: Support any programming language or framework, including interoperability between mixed stacks. ○ Elastic Scale: Prioritize and automatically optimize execution of concurrent short-lived jobs. ○ Cloud Abstraction: Provide portability to algorithms, including public clouds or private clouds.
  • 15. Multiple frameworks, languages, dependencies ● Models are rarely developed in the language they will be consumed in. ● In large enterprises there is rarely a standard language for software development. ● The goal should be to make models consumable by any part of the organization on any platform: ○ To ensure the maximum value is extracted from the model ○ To ensure model re-use ○ To ensure the fastest time from lab to production ● APIs and well-developed SDKs are your best friend. ○ APIs make it easy to test whether a model is adequate. ○ APIs allow for easy consumption. ○ Well-developed APIs have built-in versioning. Which takes us to the next step...
  • 16. Standardize versioning ● Versioning is an extremely important part of deploying models in the enterprise. ● Start thinking about models as any other piece of modular software: ○ Must be versioned ○ Versions must be tracked ○ Older versions should be accessible (for rollbacks, acceptance testing, etc.). For the Data Scientist: ● Ability to compare two different versions of the model is key. ○ Not only at training and verification time but also to understand performance and SLA changes. ● Model drift. For the Application Developers: ● Ability to match acceptance testing to predetermined developer cycles. ● Ability to stay behind a version. ● Avoid performance issues even if the model accuracy is better.
  • 17. Standardize versioning ● Even better, make your system auto-version. ● Borrow the best practices from the Software Development world. ● Rolling non-interruptive deployments
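The versioning points on the two slides above (auto-versioning, keeping older versions callable, letting consumers stay behind a version) can be illustrated with a minimal in-memory sketch. The `VersionedModel` class, its method names, and the semantic-versioning-style numbering are assumptions made for illustration; they are not taken from the talk or from any Algorithmia API.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Version = Tuple[int, int, int]
Predictor = Callable[[Any], Any]


class VersionedModel:
    """Hypothetical holder that keeps every published version of one model."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._versions: Dict[Version, Predictor] = {}

    def latest(self) -> Optional[Version]:
        """Return the highest published version, or None if nothing is published."""
        return max(self._versions) if self._versions else None

    def publish(self, predict: Predictor, bump: str = "minor") -> Version:
        """Auto-assign the next version number; publishers never pick one by hand."""
        major, minor, patch = self.latest() or (0, 0, 0)
        if bump == "major":
            new = (major + 1, 0, 0)
        elif bump == "minor":
            new = (major, minor + 1, 0)
        elif bump == "patch":
            new = (major, minor, patch + 1)
        else:
            raise ValueError(f"unknown bump: {bump}")
        self._versions[new] = predict
        return new

    def call(self, payload: Any, version: Optional[Version] = None) -> Any:
        """Call a pinned version, or the latest one when no version is given."""
        chosen = self.latest() if version is None else version
        if chosen is None or chosen not in self._versions:
            raise KeyError(f"{self.name}: no such version {version}")
        return self._versions[chosen](payload)
```

An application team could keep passing `version=(0, 1, 0)` to `call` until its own acceptance testing is done, while other callers pick up the latest version automatically. Older versions are never deleted, so a rollback is just a matter of pinning.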
  • 18. Standardize Documentation ● Just like any API, models are only as good as their documentation. ● Make the documentation travel with the model and be updated with it, to ensure it’s always up to date. ○ Directly in markdown inside the git repository that contains the model. ○ As an artifact that travels with the model.
  • 19. Continuous deployment ● Just like with standardizing versioning, Continuous Integration and Continuous Deployment for AI/ML should borrow from best practices of software development. ● The fastest path is usually the best (git push -> deploy). ○ Git + Docker + API generation makes this really easy. ● Don’t forget dependency management! ● Some interesting use cases: ○ Continuous training and deployment ○ Human in the loop training and deployment ○ Bespoke training and deployment to central platform
  • 20. Continuous deployment - Human in the Loop
  • 21. Continuous deployment - Single train platform to Deployment
  • 22. Continuous deployment - Multiple training frameworks to deployment
  • 23. Managing a model portfolio “We have 14 versions of ImageMagick running as services for image resizing before feeding into a number of different models.” Dev Manager, Analytics Platform - F500 Media Company ● Similar to an API strategy, as new models become available you need to start caring about how to find them, who can use them and how to bill for them. ● Borrowing from the concepts of an API gateway and API registry, the same paradigms work for model management and distribution. ● A common, centralized registry will offer the ability to find what has already been created and potentially re-use it.
  • 24. Managing a model portfolio ● Centralized repository/registry of models that can be accessed across the organization. ● Encourage reuse (many preprocessing and post-processing functions should only be built once). ● Finding existing models is key for experimenting with different pipelines.
  • 25. Managing a model portfolio ● Centralization of models allows for understanding business impact and usage across the organization. ● C-level understanding of whether AI/ML investments are working or not. ● Finding existing models is key for experimenting with different pipelines and for rapid application development by disparate teams or external developers. ● Security and access controls so that only the right people in the organization can access the models. ○ Who can view how the model works vs. who can call it.
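As a rough sketch of the centralized registry idea from the three portfolio slides above, the snippet below keeps a searchable catalog and separates permission to view a model from permission to call it. `ModelEntry`, `ModelRegistry`, and the team names used with them are hypothetical and exist only for illustration; a real registry would sit behind an API gateway and a proper identity system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ModelEntry:
    """Catalog record for one model (all field names are illustrative)."""
    name: str
    description: str
    owner: str
    viewers: Set[str] = field(default_factory=set)  # teams allowed to inspect the model
    callers: Set[str] = field(default_factory=set)  # teams allowed to invoke the model


class ModelRegistry:
    """In-memory stand-in for a central, organization-wide model catalog."""

    def __init__(self) -> None:
        self._entries: Dict[str, ModelEntry] = {}

    def register(self, entry: ModelEntry) -> None:
        """Add a model; duplicate names are rejected so teams find and reuse it instead."""
        if entry.name in self._entries:
            raise ValueError(f"{entry.name} is already registered")
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> List[str]:
        """Case-insensitive keyword match on model name and description."""
        needle = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if needle in e.name.lower() or needle in e.description.lower()
        )

    def can_view(self, team: str, name: str) -> bool:
        entry = self._entries[name]
        return team == entry.owner or team in entry.viewers

    def can_call(self, team: str, name: str) -> bool:
        entry = self._entries[name]
        return team == entry.owner or team in entry.callers
```

Rejecting duplicate names is one simple way to nudge teams toward reuse: a second team that wants an image resizer finds the existing entry through `search` rather than deploying its own copy.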
  • 26. Model Analytics During the training phase, analytics such as accuracy, drift, and error rates are very important. When deploying models inside an enterprise, a different slice of analytics is required. What is important during deployment and production: ● Latency ● Resources used (CPU/GPU, I/O) ● System capacity ● Scale up and scale down ● Authentication ● API timing metrics and calls ● Error rates But also… ● What teams are using the models ● What applications are using them
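A few of the production metrics listed on the slide above (latency, error rate, and which teams are calling) can be captured with a thin wrapper around the prediction call. This is only a sketch: `ModelMetrics` and its `record` method are names invented here, and a production system would normally export these numbers to a monitoring backend rather than hold them in memory.

```python
import time
from collections import defaultdict
from typing import Any, Callable, Dict, List


class ModelMetrics:
    """Hypothetical per-model collector for serving-time analytics."""

    def __init__(self) -> None:
        self.latencies_ms: List[float] = []
        self.errors = 0
        self.calls_by_team: Dict[str, int] = defaultdict(int)

    def record(self, team: str, predict: Callable[[Any], Any], payload: Any) -> Any:
        """Run one prediction while recording caller, latency, and failures."""
        self.calls_by_team[team] += 1
        start = time.perf_counter()
        try:
            return predict(payload)
        except Exception:
            self.errors += 1
            raise  # the caller still sees the original error
        finally:
            # Latency is recorded for failed calls too.
            self.latencies_ms.append((time.perf_counter() - start) * 1000.0)

    def error_rate(self) -> float:
        """Fraction of recorded calls that raised an exception."""
        total = len(self.latencies_ms)
        return self.errors / total if total else 0.0
```

Keying the counters by team rather than by individual user is a design choice that matches the slide's question of which teams and applications use a model; finer-grained attribution belongs in the audit trail.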
  • 27. Model Auditability and Compliance Your enterprise deployment system should be able to answer: “Who called what model, when and with what data.” Why is this important: ● Compliance: ○ Regulated industries need to provide this information to government regulators: ■ Financial Services ■ Life Sciences ■ Federal Government ● C-level understanding of whether AI/ML investments are working or not. ● Debugging production systems ● Billing ○ Complete understanding of who is using what models allows for chargebacks
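The audit question on the slide above, “Who called what model, when and with what data”, maps naturally onto an append-only log written on every call. The sketch below is one possible shape under stated assumptions: `AuditRecord`, `AuditedModel`, and `calls_per_caller` are hypothetical names, the payload is assumed to be JSON-serializable, and a SHA-256 fingerprint is stored in place of the raw input (a choice made here, since storing raw data may itself raise compliance questions).

```python
import hashlib
import json
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class AuditRecord:
    caller: str          # who
    model: str           # what model
    version: str         # which version of it
    called_at: str       # when (ISO 8601, UTC)
    payload_sha256: str  # with what data (fingerprint only)


class AuditedModel:
    """Wraps a predictor so every call leaves an audit record."""

    def __init__(self, name: str, version: str,
                 predict: Callable[[Any], Any], log: List[AuditRecord]) -> None:
        self._name = name
        self._version = version
        self._predict = predict
        self._log = log

    def __call__(self, caller: str, payload: Any) -> Any:
        # Canonical JSON so identical inputs always give the same fingerprint.
        canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
        # Log before predicting so failed calls are audited as well.
        self._log.append(AuditRecord(
            caller=caller,
            model=self._name,
            version=self._version,
            called_at=datetime.now(timezone.utc).isoformat(),
            payload_sha256=hashlib.sha256(canonical).hexdigest(),
        ))
        return self._predict(payload)


def calls_per_caller(log: List[AuditRecord]) -> Dict[str, int]:
    """Aggregate call counts per caller, e.g. as input to chargebacks."""
    return dict(Counter(record.caller for record in log))
```

The same log serves several of the uses named on the slide: regulators get the per-call trail, engineers get a starting point for debugging, and `calls_per_caller` gives a simple basis for billing.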
  • 28. Best practices and conclusion “Expecting your engineering and DevOps teams to deploy ML models well is like showing up to SeaWorld with a giraffe since they are already handling large mammals.” ● Technology: ○ Borrow from the best practices of software development and deployment of applications and code at scale. ■ CI/CD, versioning, API design, etc. ○ The most advanced AI/ML companies in the world are centralizing their deployment and serving platforms under one roof. This is because the influence of data science and ML teams across your organization will only grow. ○ Understand that training and serving have very different profiles and that different technology choices will need to be made. ○ Seriously consider using microservices and serverless as design patterns for re-use, scale and modularity.
  • 29. Best practices and conclusion ● Organization: ○ Production and serving will usually be owned by DevOps or Enterprise Architecture. Success in deployment in the enterprise will be dictated by understanding the roles and responsibilities of these teams. ○ Data science teams tend to be new, with limited experience in: ■ Purchasing enterprise software ■ Requirements and considerations for production environments ■ IT requirements around Information Security, Compliance and Support. It’s crucial for success to educate and guide these teams through the enterprise requirements.
  • 30. Best practices and conclusion ● Future proofing: ○ Think re-use: many models will be interesting and usable to multiple parts of the enterprise. Discoverability and accessibility become key. ○ It is safe to assume that your number of models will only grow over time. How you will manage them in the future requires having a conversation early in the process. ○ Your AI/ML model portfolio is an extremely important asset. Understanding the value you are getting from it is important, so its usage and influence should be measured and tracked. ○ When deciding to build vs. buy, think of your team’s capacity to adapt and keep up with the pace at which the AI/ML industry is moving.