MLOps Virtual Event | Building Machine Learning Platforms for the Full Lifecycle

Successfully building a machine learning model is hard enough. Reproducing your results at scale — enabling others to reproduce pipelines, comparing results from other versions, moving models into production, redeploying and rolling out updated models — is exponentially harder. To address these challenges and accelerate innovation, many companies are building custom “ML platforms” to automate the end-to-end ML lifecycle.

Watch a replay of this MLOps Virtual Event to hear more about the latest developments and best practices for managing the full ML lifecycle on Databricks with MLflow. We covered a checklist of capabilities you’ll need, common pitfalls, technological and organizational challenges, and how to overcome them.

https://www.youtube.com/playlist?list=PLTPXxbhUt-YUFNBwBsSIlknoNbS7GExZw

  1. MLOps Virtual Event: Building Machine Learning Platforms. Matei Zaharia, Chief Technologist, Databricks (@matei_zaharia)
  2. A Common Story
  3. Even After Deploying, Operating ML is Complex! Monitoring performance of the model; data drift; governance and security. Many ML teams spend >50% of their time maintaining existing models.
  4. Why is ML Hard to Operationalize? Dependence on data; multiple, application-specific ways to evaluate performance; many teams and systems involved. Diagram labels: Raw Data, Data Prep, Training, Deployment; Data Engineer, ML Engineer, Application Developer.
  5. Response: ML Platforms. Software platforms to manage ML applications, from development to production. Most companies that use ML at scale are building one. Tech examples: Facebook FBLearner, Google TFX, Uber Michelangelo.
  6. Common Components in an ML Platform. Data management, in development and at scoring time: data transformation, quality monitoring, data versioning; feature stores. Model management: packaging, review, quality assurance, versioning. Code and deployment management: reproducibility, deployment, monitoring, experimentation. Slide label: ModelDB.
  7. Our Approach at Databricks. Every team’s requirements will be different, and will change over time. Provide a general platform that is easy to integrate with diverse tools: open source machine learning platform; transactional, versioned data lake storage; data science & ML workspace.
  8. In This Webinar. How we and other organizations handle the different components of a machine learning platform. Demos and experience from 4 different companies.
  9. End-to-End Data Science and Machine Learning on Databricks. Clemens Mewald, Director of Product Management, Databricks
  10. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  11. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  12. Components. Projects: packaging format for reproducible runs on any compute platform.
  13. Components. Models: general model format that standardizes deployment options. Projects: packaging format for reproducible runs on any compute platform.
  14. Components. Tracking: record and query experiments (code, metrics, parameters, artifacts, models). Models: general model format that standardizes deployment options. Projects: packaging format for reproducible runs on any compute platform.
  15. Components. Tracking: record and query experiments (code, metrics, parameters, artifacts, models). Models: general model format that standardizes deployment options. Model Registry: centralized and collaborative model lifecycle management. Projects: packaging format for reproducible runs on any compute platform.
  16. Model Lifecycle. Models: Flavor 1, Flavor 2, Custom Models.
  17. Model Lifecycle. Tracking: Parameters, Metrics, Artifacts, Metadata, Models. Models: Flavor 1, Flavor 2, Custom Models.
  18. Model Lifecycle. Tracking: Parameters, Metrics, Artifacts, Metadata, Models. Models: Flavor 1, Flavor 2, Custom Models. Model Registry: versions v1, v2, v3; stages Staging, Production, Archived; Data Scientists, Deployment Engineers.
  19. Model Lifecycle. Tracking: Parameters, Metrics, Artifacts, Metadata, Models. Models: Flavor 1, Flavor 2, Custom Models. Model Registry: versions v1, v2, v3; stages Staging, Production, Archived; Data Scientists, Deployment Engineers. Serving: In-Line Code, Containers, Batch & Stream Scoring, Cloud Inference Services, OSS Serving Solutions.
  20. Auto-Logging. Auto-logging for ML frameworks: a single line of code logs parameters and (a time series of) metrics, and artifacts (including the model). mlflow.keras.autolog() # or: mlflow.tensorflow.autolog()
  21. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  22. Databricks Notebooks Provide a collaborative environment for Unified Data Analytics. Multi-Language: Scala, SQL, Python, R, all in one notebook. Collaborative: real-time co-editing and commenting. Reproducible: auto-logged revision history and Git integration for version control. Visualizations: built-in visualizations and support for the most popular visualization libraries (e.g. matplotlib, ggplot). Experiment Tracking: built-in tracking of data science and ML experiments, with metrics, parameters, artifacts, and more. Enterprise Ready: enterprise-grade access controls, identity pass-through, and auditability.
  23. Databricks Notebooks for Collaborative Data Science. Data Engineers, Data Scientists, ML Engineers, and Data Analysts can all collaborate in one shared environment using modern collaboration patterns: Co-Presence / Co-Editing, Commenting, Versioning.
  24. Integration with Databricks Notebooks. Runs Sidebar integrated with MLflow Tracking; track runs, sort by metrics and parameters; linked to revision history of the notebook.
  25. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  26. for Data Science and ML: schema-enforced, high-quality data; optimized performance; full data lineage / governance; reproducibility through time travel. Diagram labels: Your Existing Data Lake, Azure Data Lake Storage, Amazon S3; Ingestion, Streaming, Batch, 3rd Party Data Marketplace, Files; Tables; Data Catalog; Feature Store; ML Runtime.
  27. for Data Science and ML: Ingest data and visualize data distribution
  28. for Data Science and ML: Data versioning and time travel
  29. for Data Science and ML: Data versioning and time travel
  30. Integration with Delta: Auto-Logging for any Spark Datasource
  31. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  32. Machine Learning Runtime. Packages and optimizes most common ML frameworks ...
  33. Machine Learning Runtime. Packages and optimizes most common ML frameworks ... Built-in Optimization for Distributed Deep Learning: distribute and scale any single-machine ML code to 1,000s of machines.
  34. Machine Learning Runtime. Packages and optimizes most common ML frameworks ... Built-in Optimization for Distributed Deep Learning: distribute and scale any single-machine ML code to 1,000s of machines. Built-In AutoML and Experiment Tracking: AutoML and tracking / visualizations with MLflow.
  35. Machine Learning Runtime. Pre-configured Machine Learning Environment: packages and optimizes most common ML frameworks ... Built-in Optimization for Distributed Deep Learning: distribute and scale any single-machine ML code to 1,000s of machines. Built-In AutoML and Experiment Tracking: AutoML and tracking / visualizations with MLflow. Customization: Conda-based, customized environments using Conda (requirements.txt, conda.yaml).
  36. Integration with ML Runtime: Hyperopt autologging to MLflow
  37. End-to-End Data Science and ML on Databricks. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
  38. Model Deployment. Model Registry: versions v1, v2, v3; stages Staging, Production, Archived; Data Scientists, Deployment Engineers. Serving: In-Line Code, Containers, Batch & Stream Scoring, Cloud Inference Services, OSS Serving Solutions.
  39. Model Deployment. model_udf = mlflow.pyfunc.spark_udf(spark, model_uri='models:/forecast/production') Model Registry: versions v1, v2, v3; stages Staging, Production, Archived; Data Scientists, Deployment Engineers. Serving: In-Line Code, Containers, Batch & Stream Scoring, Cloud Inference Services, OSS Serving Solutions.
  40. In summary, Databricks accelerates the full ML Lifecycle. Diagram labels: Data Science Workspace; Prep Data, Build Model, Deploy/Monitor Model; AutoML; End-to-End ML Lifecycle; ML Runtime and Environments; Batch Scoring; Online Serving; open, pluggable architecture.
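
The Model Registry slides (18 and 38) show versions v1 to v3 moving through the Staging, Production, and Archived stages, and slide 39 loads a model through a URI of the form models:/forecast/production. The sketch below is a toy, in-memory illustration of that idea using only the Python standard library. It is not the MLflow API: the `ToyRegistry` and `ModelVersion` names, the rule that promoting a version archives the previous holder of that stage, and the stage-only URI parsing are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

STAGES = ("None", "Staging", "Production", "Archived")


@dataclass
class ModelVersion:
    version: int
    stage: str = "None"


@dataclass
class ToyRegistry:
    """Hypothetical in-memory stand-in for a model registry (not MLflow)."""
    models: Dict[str, List[ModelVersion]] = field(default_factory=dict)

    def register(self, name: str) -> ModelVersion:
        # Each registration under the same name gets the next version number.
        versions = self.models.setdefault(name, [])
        mv = ModelVersion(version=len(versions) + 1)
        versions.append(mv)
        return mv

    def transition(self, name: str, version: int, stage: str) -> None:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        versions = self.models[name]
        target = next((mv for mv in versions if mv.version == version), None)
        if target is None:
            raise LookupError(f"{name} has no version {version}")
        # Assumption: at most one live version per stage, so the previous
        # holder of Staging or Production is archived on promotion.
        if stage in ("Staging", "Production"):
            for mv in versions:
                if mv is not target and mv.stage == stage:
                    mv.stage = "Archived"
        target.stage = stage

    def resolve(self, uri: str) -> ModelVersion:
        """Resolve 'models:/<name>/<stage>' to the newest version in that stage."""
        prefix = "models:/"
        if not uri.startswith(prefix):
            raise ValueError(f"not a models:/ URI: {uri}")
        name, _, stage = uri[len(prefix):].partition("/")
        if not name or not stage:
            raise ValueError(f"expected models:/<name>/<stage>, got: {uri}")
        matches = [mv for mv in self.models.get(name, [])
                   if mv.stage.lower() == stage.lower()]
        if not matches:
            raise LookupError(f"no version of {name} in stage {stage}")
        return max(matches, key=lambda mv: mv.version)


registry = ToyRegistry()
for _ in range(3):
    registry.register("forecast")
registry.transition("forecast", 1, "Production")
registry.transition("forecast", 2, "Production")  # v1 becomes Archived
registry.transition("forecast", 3, "Staging")
print(registry.resolve("models:/forecast/production"))
```

The point of resolving by stage rather than by version number is that scoring code can keep the same URI while deployment engineers promote a newer version behind it.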

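The Tracking slides (14, 17, 20, and 24) describe recording parameters, a time series of metrics, and artifacts for each run, then sorting runs by metric. A minimal standard-library sketch of that data model follows. It is a toy illustration and not MLflow itself: `ToyRun`, `sort_runs`, the artifact path, and the loss values are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyRun:
    """Hypothetical record of one experiment run (not an MLflow object)."""
    name: str
    params: Dict[str, Any] = field(default_factory=dict)
    # Each metric is a time series of (step, value) points.
    metrics: Dict[str, List[Tuple[int, float]]] = field(default_factory=dict)
    artifacts: List[str] = field(default_factory=list)

    def log_param(self, key: str, value: Any) -> None:
        self.params[key] = value

    def log_metric(self, key: str, value: float, step: int) -> None:
        self.metrics.setdefault(key, []).append((step, value))

    def log_artifact(self, path: str) -> None:
        # Only the path is recorded here; no file is read or copied.
        self.artifacts.append(path)

    def final_metric(self, key: str) -> float:
        # Value logged at the highest step for this metric.
        _, value = max(self.metrics[key], key=lambda point: point[0])
        return value


def sort_runs(runs: List[ToyRun], metric: str, ascending: bool = True) -> List[ToyRun]:
    """Order runs by the final value of one metric."""
    return sorted(runs, key=lambda r: r.final_metric(metric), reverse=not ascending)


# Made-up loss curves for two learning rates, purely for illustration.
runs = []
for lr, losses in [(0.1, [0.9, 0.6, 0.5]), (0.01, [0.9, 0.7, 0.4])]:
    run = ToyRun(name=f"lr={lr}")
    run.log_param("learning_rate", lr)
    for epoch, loss in enumerate(losses):
        run.log_metric("loss", loss, step=epoch)
    run.log_artifact("model.pkl")
    runs.append(run)

best = sort_runs(runs, "loss")[0]
print(best.name, best.params)
```

Keeping every metric as a list of (step, value) points, rather than a single number, is what allows both a per-epoch curve and a "sort by final value" view to come from the same record.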