This talk was given at the O'Reilly Strata Data Conference, September 2018, in NYC.
All the conferences and thought leaders have been painting a vision of the businesses of the future being powered by data, but if we’re honest with ourselves, the vast majority of our massive data science investments are being deployed to PowerPoint or maybe a business dashboard. Productionizing your machine learning (ML) portfolio is the next big step on the path to ROI from AI.
You probably started out years ago on a “big data” initiative: You collected and cleaned your data and built data warehouses, and when those filled up you upgraded to data lakes. You hired data engineers and data scientists, and around the organization, everyone brushed up their SQL querying skills and got some licenses to Tableau and PowerBI.
Then you saw what Google, Uber, Facebook, and Amazon were doing with machine learning to automate business processes and customer interactions. To avoid being broadsided, you hired more data scientists and machine learning engineers, put them on your teams, and had them use your big data investments to train models. But what you probably found is that your tech stack and DevOps processes don't fit ML models. Unlike most of your systems, ML models require short spikes of massive compute; they are often written in different languages than your core code; they need different hardware to perform well; one model probably has applications across many teams; and the people making the models often don't have the engineering experience to write production code, yet need to iterate faster than traditional engineers. Expecting your engineering and DevOps teams to deploy ML models well is like showing up to SeaWorld with a giraffe because they are already handling large mammals.
There is a path forward. Almost five years ago Algorithmia launched a marketplace for models, functions, and algorithms. Today 65,000 developers are on the platform deploying 4,500 models—the result has been a layer of tools and best practices to make deploying ML models frictionless, scalable, and low maintenance. The company refers to it as the “AI layer.”
Drawing on this experience, Diego Oppenheimer covers the strategic and technical hurdles each company must overcome and the best practices developed while deploying over 4,000 ML models for 70,000 engineers.
Topics include:
Best practices for your organization
Continuous model deployment
Varying languages (Your code base probably isn’t in Python or R, but your ML models probably are.)
Managing your portfolio of ML models
Standardize versioning
Enabling models across your organization
Analytics on how and where models are being used
Maintaining auditability
Deploying ML models in the enterprise
1. Deploying machine learning models in the enterprise
Strata Data Conference NYC
Diego Oppenheimer, CEO
diego@algorithmia.com
2. About Me
Diego Oppenheimer - Founder and CEO - Algorithmia
● Product developer, entrepreneur, extensive background in all things data.
● Microsoft: PowerPivot, PowerBI, Excel and SQL Server.
● Founder of an algorithmic trading startup
● BS/MS Carnegie Mellon University
4. Algorithmia.com
AI/ML scalable infrastructure on demand + marketplace
● Function-as-a-service for Machine & Deep Learning
● Discoverable, live inventory of AI
● Monetizable
● Composable
● Every developer on earth can make their app intelligent
5. “There’s an algorithm for that!”
77K developers | 6.4K algorithms, models, and functions
6. What does production mean for us
● ~6,400 algorithms, models, and functions (50K counting different versions)
● Each model: 1 to 1,000 calls a second, fluctuating, with no DevOps intervention
● ~15ms overhead latency
● Accessible in any of 14 languages (through SDKs)
● Any runtime, any architecture
7. ALGORITHMIA ENTERPRISE
Algorithmia Enterprise is an organization's internal inventory of intelligence and an algorithm-as-a-service platform.
Deploy
Write your function or model in any programming language, framework, or infrastructure.

Scale
Expose your model as a highly reliable, versioned REST API that automatically scales from one to hundreds of requests per second.

Discover
Name and describe your model, making it available in a central catalog where your peers can easily discover and reuse it.

Monitor
House thousands of models under one roof with a uniform REST interface and a single cluster-monitoring dashboard.
9. What we will cover
● Challenges of deploying models in the enterprise
● Characteristics of AI and technologies
● Varying languages
● Standardize versioning
● Continuous model deployment
● Managing your portfolio of ML models
● Analytics on how and where models are being used
● Maintaining auditability
● Best practices for your organization
10. Challenges of deploying models in the enterprise
● Machine learning
○ CPU/GPU/specialized hardware
○ Multiple frameworks, languages, dependencies
○ Called from different devices/architectures
● "Snowflake" environments
○ Unique cloud hardware and services
● Security and audit
○ Stringent security and access controls
○ "Who called what, when" for audit and compliance
● Uncharted territory
○ Not a lot of literature
○ Deployment for data science teams is a new problem
○ Many teams have not bought software or dealt with their own infrastructure teams
○ Chargebacks and billing
11. Characteristics of AI/ML
• Two distinct phases: training and inference
• Lots of processing power
• Heterogeneous hardware (CPUs, GPUs, TPUs, etc.)
• Limited by compute rather than bandwidth
• "TensorFlow is open source; scaling it is not." - Kenny Daniel
12. Technologies
Infrastructure layers: metal or VMs, containers, Kubernetes

TRAINING (owner: data scientists)
Long compute cycles
Fixed load (inelastic)
Stateful
Single user

INFERENCE (owner: DevOps)
Short compute bursts
Elastic
Stateless
Multiple users
13. Two system design paradigms that work well

MICROSERVICES: the design of a system as independently deployable, loosely coupled services.
ADVANTAGES
• Maintainability
• Scalability
• Rolling deployments
• Elastic
• Software/hardware agnostic

SERVERLESS: the encapsulation, starting, and stopping of singular functions per request, with a just-in-time compute model.
ADVANTAGES
• Cost/efficiency
• Built-in concurrency
• Speed of development
• Improved latency
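As a concrete illustration of the serverless pattern above, here is a minimal Python sketch (not any specific platform's API, and the payload shape is an assumption): the model is loaded just-in-time on the first request, and every subsequent invocation is a stateless function of its input.

```python
def make_handler(load_model):
    """Serverless-style wrapper: load the model once on cold start,
    then serve each request as a stateless function of its input."""
    state = {}

    def handler(payload):
        if "model" not in state:  # cold start: just-in-time model load
            state["model"] = load_model()
        return state["model"](payload)

    return handler


# Hypothetical usage: the "model" here is a stand-in callable.
double = make_handler(lambda: (lambda x: x * 2))
```

The same wrapper shape works whether the loaded object is a scikit-learn pipeline or a deep learning graph; only `load_model` changes.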
14. Think of your technology stack as an OS for running AI in your enterprise

Shell & Services
Discoverability, authentication, instrumentation, etc.

Kernel
Runtime abstraction: support any programming language or framework, including interoperability between mixed stacks.
Elastic scale: prioritize and automatically optimize execution of concurrent short-lived jobs.
Cloud abstraction: provide portability to algorithms across public and private clouds.
15. Multiple frameworks, languages, dependencies
● Models are rarely developed in the language they will be consumed in.
● In large enterprises, there is rarely a standard language for software development.
● The goal should be to make models consumable by any part of the organization on any platform:
○ To ensure the maximum value is extracted from the model
○ To ensure model reuse
○ To ensure the fastest time from lab to production
● APIs and well-developed SDKs are your best friends.
○ APIs allow for easy testing of whether a model is adequate.
○ APIs allow for easy consumption.
○ Well-developed APIs have built-in versioning, which takes us to the next step....
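To make the APIs-and-SDKs point concrete, here is a hedged Python sketch of a thin SDK wrapper around a versioned model REST API. The URL scheme, auth header, and names are illustrative assumptions, not a specific product's API.

```python
import json
from urllib import request


class ModelClient:
    """Minimal sketch of an SDK wrapping a versioned model REST API."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def endpoint(self, org, model, version):
        """Build the versioned endpoint URL (scheme is illustrative)."""
        return f"{self.base_url}/v1/algo/{org}/{model}/{version}"

    def call(self, org, model, version, payload):
        """POST a JSON payload to the model and return the JSON result."""
        req = request.Request(
            self.endpoint(org, model, version),
            data=json.dumps(payload).encode(),
            headers={
                "Authorization": f"Simple {self.api_key}",
                "Content-Type": "application/json",
            },
        )
        with request.urlopen(req) as resp:
            return json.load(resp)
```

Because the version is part of the URL, any consuming team, in any language, can pin exactly the model version it has tested against.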
16. Standardize versioning
● Versioning is an extremely important part of deploying models in the enterprise.
● Start thinking about models as any other piece of modular software:
○ They must be versioned.
○ Versions must be tracked.
○ Older versions should be accessible (for rollbacks, acceptance testing, etc.).
For the data scientist:
● The ability to compare two different versions of a model is key.
○ Not only at training and verification time, but also to understand performance and SLA changes.
● Model drift.
For the application developer:
● The ability to match acceptance testing to predetermined development cycles.
● The ability to stay behind a version.
● Avoiding performance issues even if the newer model's accuracy is better.
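The "stay behind a version" idea can be sketched with a small pin resolver. This is an illustrative Python example; the wildcard pin syntax (`1.2.*`) is an assumption, not a standard.

```python
def resolve_version(available, pin):
    """Resolve a pin like '1.2.0' (exact) or '1.2.*' (highest patch)
    against a list of available 'X.Y.Z' version strings."""

    def key(v):
        return tuple(int(part) for part in v.split("."))

    if "*" in pin:
        prefix = pin.split("*")[0]        # e.g. '1.2.' from '1.2.*'
        matches = [v for v in available if v.startswith(prefix)]
    else:
        matches = [v for v in available if v == pin]
    if not matches:
        raise ValueError(f"no version matches {pin!r}")
    return max(matches, key=key)          # newest version satisfying the pin
```

An application team can pin `1.2.*` to pick up patch releases automatically while staying behind a breaking `2.0.0`.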
17. Standardize versioning
● Even better, make your system auto-version.
● Borrow best practices from the software development world.
● Use rolling, non-interruptive deployments.
18. Standardize documentation
● Just like any API, models are only as good as their documentation.
● Make the documentation travel with the model and be updated alongside it, so it's always up to date:
○ Directly in Markdown inside the Git repository that contains the model, or
○ As an artifact that travels with the model.
19. Continuous deployment
● Just as with standardized versioning, continuous integration and continuous deployment for AI/ML should borrow from software development best practices.
● The fastest path is usually the best (git push -> deploy).
○ Git + Docker + API generation makes this really easy.
● Don't forget dependency management!
● Some interesting use cases:
○ Continuous training and deployment
○ Human-in-the-loop training and deployment
○ Bespoke training and deployment to a central platform
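As a sketch of the dependency-management step, the function below compares a model's pinned requirements (`name==version` lines) against what is actually installed before a deploy proceeds. It is a minimal illustration, not a replacement for a real resolver.

```python
from importlib import metadata


def check_dependencies(requirements):
    """Return the pinned requirements ('name==version') that are missing
    or installed at a different version; empty list means safe to deploy."""
    problems = []
    for req in requirements:
        name, _, version = req.partition("==")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(req)              # not installed at all
            continue
        if version and installed != version:
            problems.append(req)              # wrong version installed
    return problems
```

A CI step can run this check against the model's pinned requirements file and fail the push-to-deploy pipeline before anything ships.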
23. Managing a model portfolio
"We have 14 versions of ImageMagick running as services for image resizing before feeding into a number of different models."
- Dev manager, analytics platform, F500 media company
● Similar to an API strategy: as new models become available, you need to start caring about how to find them, who can use them, and how to bill for them.
● Borrowing from the concepts of an API gateway and API registry, the same paradigms work for model management and distribution.
● A common, centralized registry will offer the ability to find what has already been created and potentially reuse it.
24. Managing a model portfolio
● A centralized repository/registry of models that can be accessed across the organization.
● Encourage reuse (many preprocessing and postprocessing functions should only be built once).
● Finding existing models is key for experimenting with different pipelines.
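A centralized registry can be sketched in a few lines. This in-memory Python version is illustrative only; a production registry would persist metadata, expose a search API, and enforce access controls.

```python
class ModelRegistry:
    """Minimal in-memory sketch of a central model registry."""

    def __init__(self):
        self._models = {}  # (name, version) -> metadata dict

    def register(self, name, version, owner, description):
        """Publish one version of a model with its ownership metadata."""
        self._models[(name, version)] = {
            "owner": owner,
            "description": description,
        }

    def versions(self, name):
        """All registered versions of a model, in sorted order."""
        return sorted(v for (n, v) in self._models if n == name)

    def describe(self, name, version):
        """Metadata for one specific version."""
        return self._models[(name, version)]
```

Even this minimal shape supports the reuse story: a team can list what already exists under a name before building a duplicate.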
25. Managing a model portfolio
● Centralization of models allows for understanding business impact and usage across the organization.
● C-level understanding of whether AI/ML investments are working or not.
● Finding existing models is key for experimenting with different pipelines and for rapid application development by disparate teams or external developers.
● Security and access controls so that only the right people in the organization can access the models.
○ Who can view how the model works vs. who can call it.
26. Model analytics
During the training phase, analytics such as accuracy, drift, and error rates are very important. When deploying models inside an enterprise, a different slice of analytics is required.
What is important during deployment and production:
● Latency
● Resources used (CPU/GPU, I/O)
● System capacity
● Scale-up and scale-down
● Authentication
● API timing metrics and calls
● Error rates
But also…
● Which teams are using the models
● Which applications are using them
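The deployment-side analytics above can be captured with a small wrapper that records latency per model and per calling team. This is an illustrative sketch; the model and team names used are hypothetical.

```python
import time
from collections import defaultdict


class CallMetrics:
    """Sketch of serving-side metrics: latency and call counts,
    tagged by model and by the team making the call."""

    def __init__(self):
        self.latencies = defaultdict(list)  # (model, team) -> [seconds]

    def track(self, model, team, fn, *args, **kwargs):
        """Invoke fn, recording its wall-clock latency under (model, team)."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.latencies[(model, team)].append(time.perf_counter() - start)

    def call_count(self, model):
        """Total calls to a model across all teams."""
        return sum(len(v) for (m, _), v in self.latencies.items() if m == model)
```

Tagging each call with the calling team is what turns raw latency data into the "which teams are using the models" view the slide asks for.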
27. Model auditability and compliance
Your enterprise deployment system should be able to answer:
"Who called what model, when, and with what data?"
Why is this important?
● Compliance:
○ Regulated industries need to provide this information to government regulators:
■ Financial services
■ Life sciences
■ Federal government
● C-level understanding of whether AI/ML investments are working or not.
● Debugging production systems.
● Billing:
○ A complete understanding of who is using which models allows for chargebacks.
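A minimal sketch of such an audit trail: each call appends one JSON line recording who called which model version and when, with the input stored as a digest rather than raw data. The field names are illustrative assumptions.

```python
import datetime
import hashlib
import json


def audit_record(caller, model, version, payload: bytes):
    """One append-only audit-log line answering 'who called what model,
    when, and with what data' (the data is stored as a SHA-256 digest)."""
    return json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "caller": caller,
        "model": model,
        "version": version,
        "input_sha256": hashlib.sha256(payload).hexdigest(),
    })
```

Hashing the payload keeps the log useful for "with what data" questions without duplicating sensitive inputs into the audit store.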
28. Best practices and conclusion
"Expecting your engineering and DevOps teams to deploy ML models well is like showing up to SeaWorld with a giraffe since they are already handling large mammals."
● Technology:
○ Borrow from the best practices of software development and deployment of applications and code at scale:
■ CI/CD, versioning, API design, etc.
○ The most advanced AI/ML companies in the world are centralizing their deployment and serving platforms under one roof, because the influence of data science and ML teams across your organization will only grow.
○ Understand that training and serving have very different profiles, and different technology choices will need to be made.
○ Seriously consider microservices and serverless as design patterns for reuse, scale, and modularity.
29. Best practices and conclusion
● Organization:
○ Production and serving will usually be owned by DevOps or enterprise architecture. Success in enterprise deployment will be dictated by understanding the roles and responsibilities of these teams.
○ Data science teams tend to be new, with limited experience in:
■ Purchasing enterprise software
■ Requirements and considerations for production environments
■ IT requirements around information security, compliance, and support
It's crucial for success to educate and guide these teams through the enterprise requirements.
30. Best practices and conclusion
● Future-proofing:
○ Think reuse: many models will be interesting and usable to multiple parts of the enterprise. Discoverability and accessibility become key.
○ It is safe to assume that your number of models will only grow over time. How you will manage them in the future requires a conversation early in the process.
○ Your AI/ML model portfolio is an extremely important asset: understanding the value you get from each model matters, and its usage and influence should be measured and tracked.
○ When deciding between build and buy, think about your team's capacity to adapt and move at the pace the AI/ML industry will be moving over time.