Machine Learning In Production

Samir Bessalah

Deploying Machine Learning in Prod

Running
Machine Learning Applications
In Production
Sam BESSALAH
@samklr

Might work well for Kaggle!
But Kaggle isn’t real-world machine learning!

In Real Life
- Trade-off: accuracy vs. interpretability vs. speed vs. infrastructure constraints
- Interpretability and speed often beat accuracy
- Most of the time, Kaggle is a feature engineering contest
- Contest-oriented vs. real product impact

But in real life … things are less obvious
[Diagram: Data Engineers (Data Pipeline), Data Scientists / ML Engineers, Application Developers (APP)]
Innovation is often (wrongly?) thought to be here ...

http://www.slideshare.net/jssm1th/an-architecture-for-agile-machine-learning-in-realtime-applications

Production Requirements:
- Flexibility and agility
- Scalability and performance
- Enable real-time decision making, sometimes at very high QPS with sub-second latency.
- Security

Machine Learning as a Software Problem
- Most ML development patterns lead to software design anti-patterns
- Dependencies in code creep in through the models' dependencies on data
- Wasteful use of data: most ML model selection requires multiple versions of the data, hence the instability of the data and of prediction services
- Breaks system isolation, leading to unmaintainable stacks

In Production, Machine Learning is a
Software and System Problem.
Treat it accordingly !!!!

Deployment / Model Serving
The Missing Part in ML

- Model serving is often ignored, or left to back-end engineers to implement as they see fit.
- Most often it involves exposing an API or a service that performs the predict function. But that is often not enough.
- Software scaling can become problematic for the accuracy of the model.
- How many models are you serving?
- Are you running something else?
- Are you updating your model in real time?
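
To make the "API or service that performs the predict function" point concrete, here is a minimal sketch using only the Python standard library. The toy linear model, its weights, the `/predict` path, the JSON field names and the `serve` helper are all assumptions made for illustration; none of them come from the talk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical model: a fixed linear scorer standing in for a real trained model.
WEIGHTS = [0.5, -1.0, 2.0]
BIAS = 0.1
MODEL_VERSION = "demo-1"  # assumed version label, returned with every score


def predict(features):
    """Score one feature vector with the toy linear model."""
    if len(features) != len(WEIGHTS):
        raise ValueError("expected %d features" % len(WEIGHTS))
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def handle_predict(body):
    """Turn a raw JSON request body into (status_code, response_bytes)."""
    try:
        payload = json.loads(body)
        score = predict([float(x) for x in payload["features"]])
    except (ValueError, KeyError, TypeError) as exc:
        return 400, json.dumps({"error": str(exc)}).encode()
    result = {"score": score, "model_version": MODEL_VERSION}
    return 200, json.dumps(result).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, out = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)


def serve(host="127.0.0.1", port=8080):
    """Blocking entry point; call it explicitly to start the service."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Even this small sketch runs into the questions in the list above: it serves exactly one model, it cannot swap that model without a restart, and returning `model_version` with each score is only a first step toward knowing which model produced which prediction.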

Example: Airbnb
- Trained models are stored in PMML files
- They serve their models via Openscoring

PMML?
- Might be the solution for some (most?) cases
- Supports many model types, but lacks support for many others
- Fails to capture the evolution of your modeling process: transformations, re-encoding, etc.
- Better suited for exporting models to other systems than for serving them in user-facing machine learning products.
- And … XML?? Really????

Model Versioning - Packaging
- You usually don’t serve just one model, but many more, especially when running experiments.
- You should strive to package your models in a versioned way.
- Git is awesome, but not appropriate for live model serving
- Build a model repository or a model index
- I usually use a fast KV store or a more advanced data store to save my models
- Build a service to manage your models (a Model Manager), responsible for evaluating and updating your models.
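
The model repository and Model Manager ideas above could be sketched roughly as follows. This is an in-memory stand-in, not a real KV store: the `ModelRegistry` class, the `promote_if_better` helper, and the use of `pickle` for the stored blobs are all assumptions made for illustration (and `pickle` is only safe for blobs you produced yourself).

```python
import pickle
import threading


class ModelRegistry:
    """In-memory stand-in for a fast KV store holding versioned model blobs."""

    def __init__(self):
        self._blobs = {}  # (name, version) -> serialized model bytes
        self._live = {}   # name -> version currently being served
        self._lock = threading.Lock()

    def publish(self, name, model):
        """Store a new immutable version and return its version number."""
        blob = pickle.dumps(model)
        with self._lock:
            latest = max((v for (n, v) in self._blobs if n == name), default=0)
            version = latest + 1
            self._blobs[(name, version)] = blob
        return version

    def promote(self, name, version):
        """Mark an already published version as the one to serve."""
        with self._lock:
            if (name, version) not in self._blobs:
                raise KeyError("unknown model %s v%d" % (name, version))
            self._live[name] = version

    def load(self, name, version=None):
        """Return (version, model); defaults to the live version."""
        with self._lock:
            version = self._live[name] if version is None else version
            blob = self._blobs[(name, version)]
        return version, pickle.loads(blob)


def promote_if_better(registry, name, candidate_version, evaluate):
    """Promote the candidate only if it scores at least as well as the live model."""
    _, candidate = registry.load(name, candidate_version)
    try:
        _, live = registry.load(name)
    except KeyError:
        live = None  # nothing is live yet
    if live is None or evaluate(candidate) >= evaluate(live):
        registry.promote(name, candidate_version)
        return True
    return False
```

Keeping every published version immutable and moving only a "live" pointer is what makes rollbacks and side-by-side experiments cheap; the `evaluate` callback is where an offline metric or a business metric would plug in.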

Serialization
- Remember PMML?
- In Big Data, data has a schema and proper schema evolution.
- Why not models?
- Lots to choose from: Protobuf, Avro
- Use a binary schema format to represent and version your models

Evaluation
- Business metrics often differ from core model metrics: there is a trade-off between long-term and short-term metrics.
- Hyperparameters
- A/B testing and the multi-armed bandit problem
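
As a rough sketch of the multi-armed bandit idea applied to model selection, the following uses an epsilon-greedy policy, which is only one of several bandit strategies and is chosen here for brevity. The arm names, the conversion rates and the `simulate` helper are hypothetical.

```python
import random


class EpsilonGreedy:
    """Route traffic to the best-looking arm, exploring with probability epsilon."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.means = {a: 0.0 for a in self.arms}
        self._rng = random.Random(seed)

    def choose(self):
        """Pick a random arm with probability epsilon, else the best mean so far."""
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.means[a])

    def update(self, arm, reward):
        """Fold one observed reward into the running mean for that arm."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(true_rates, steps=1000, epsilon=0.1, seed=0):
    """Simulate traffic; true_rates maps arm name -> conversion probability."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(true_rates, epsilon=epsilon, seed=seed)
    for _ in range(steps):
        arm = bandit.choose()
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts
```

Calling `simulate({"model_a": 0.05, "model_b": 0.08})` returns how many requests each model received. Unlike a fixed 50/50 A/B split, the allocation shifts toward whichever model currently looks better, at the cost of noisier estimates for the weaker arm.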

A/B Testing - Multi-armed Bandit
Dataiku

Reproducibility
- How do you keep track of the data used for training?
- Are notebooks enough?
- Jupyter Notebooks, Spark Notebooks, Zeppelin, etc.
- There is a need for an end-to-end solution. Not a perfect one, but a workable one.
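
One lightweight way to approach the "which data was this model trained on?" question is to record a fingerprint of the training set next to the model. The sketch below is an assumption-laden illustration, not an end-to-end solution: the manifest fields and the `fingerprint_rows` and `training_manifest` helpers are invented for the example.

```python
import hashlib
import json


def fingerprint_rows(rows):
    """Hash rows (lists or dicts of JSON-friendly values) in order with SHA-256."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def training_manifest(rows, params, code_version):
    """Bundle what is needed to tell two training runs apart."""
    rows = list(rows)
    return {
        "data_sha256": fingerprint_rows(rows),
        "n_rows": len(rows),
        "params": params,
        "code_version": code_version,  # e.g. a commit hash, supplied by the caller
    }
```

Storing such a manifest alongside each model version makes it possible to tell later whether two models saw the same data and hyperparameters, which a notebook alone does not record.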

I forgot many things
- Monitoring
- Pipeline tuning (one model is often fed to another one)
- RPC over REST for fast model serving?
- How to deal with heterogeneous systems?
- Do you really have to distribute your processing?
- Is more data better than smartly tuned algorithms?
