Reproducibility and Versioning
of ML Systems
ŠPELA POKLUKAR | MACHINE LEARNING CONSULTANT
DSC 2022 // © COPYRIGHT 2022 ENDAVA 2
"Špela is an experienced machine learning consultant, with experience mostly
in software engineering services and the energy sector. She has successfully led
projects in various domains such as manufacturing, finance, robotics,
energy, and IT services. She is currently employed as a data discipline lead
at Endava Slovenia and is an active member of innovation and gender
balance communities. Her background is in mathematics, philosophy, and
theology."
Spela.poklukar@endava.com
+386 40 545 898
Špela Poklukar
MACHINE LEARNING CONSULTANT
Agenda
1. MOTIVATION
2. MODULARITY
3. VERSIONING
4. DOCUMENTATION
1
Motivation
WHY WE NEED REPRODUCIBILITY ANYWAY
Reproducibility:
Two Sides of the Same Coin
REPRODUCIBILITY OF
ML Research
Results
REPRODUCIBILITY OF
ML Systems
Reproducibility of ML research results means being able to recreate someone else's ML workflow and reach the same or similar conclusions as the original work.
Reproducibility of an ML system means being able to repeatedly run an ML workflow and reach the same or similar results on each run.
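The second definition can be illustrated with a minimal sketch that uses only the Python standard library. The function `train_toy_model` and its parameters are hypothetical and stand in for a real workflow; the point is only that when all randomness comes from one seeded generator, repeated runs return identical results.

```python
import random


def train_toy_model(seed: int, n_samples: int = 200, epochs: int = 20) -> float:
    """Fit y = w * x with stochastic gradient descent and return w.

    Hypothetical stand-in for a real ML workflow. All randomness
    (data generation, weight initialisation, shuffling) comes from one
    seeded generator, so the run is repeatable.
    """
    rng = random.Random(seed)  # local generator, no hidden global state
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    data = [(x, 3.0 * x + rng.gauss(0.0, 0.1)) for x in xs]
    w = rng.uniform(-0.5, 0.5)  # weights initialisation
    for _ in range(epochs):
        rng.shuffle(data)  # dataset shuffling
        for x, y in data:
            w -= 0.05 * (w * x - y) * x  # gradient step on the squared error
    return w


first = train_toy_model(seed=42)
second = train_toy_model(seed=42)
other = train_toy_model(seed=7)

# Same seed: the two runs can be compared for exact equality.
print("same seed, identical result:", first == second)
# Different seed: the result is expected to be similar, not identical.
print("difference with another seed:", abs(first - other))
```

In a real system the seed is only one of many inputs that would need to be fixed and recorded; library versions and hardware can still change floating-point results.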
Why Reproducibility?
EVIDENCE OF SIGNIFICANCE
To ensure the obtained results are accurate and significant.
ABLATION
To ensure that a claimed gain really comes from the intended change and is not random.
COST ESTIMATION
To inform potential consumers about computational complexity.
SCALING
To be able to scale the machine learning system by replicating its parts.
INFERENCE
To ensure the selected model is the same one used for inference.
FAULT TOLERANCE
To reduce the risk of errors by consistently obtaining the same results.
MODEL ROLLBACK
To allow for model rollback in case the new model is not performing as expected.
TRUST
To build trust in the machine learning product and establish its credibility.
REGULATION
To adhere to increasing regulatory constraints.
2
Modularity
ADOPTION OF PIPELINE MENTALITY
Development Pipeline: Data Preprocessing, Feature Engineering, Model Training, Prediction Service, Model Evaluation
Training Pipeline: Data Preprocessing, Feature Engineering, Model Training
Inference Pipeline: Data Preprocessing, Feature Engineering, Prediction Service
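The pipelines above share the data preprocessing and feature engineering components. A minimal sketch of that idea follows; the component bodies, the toy "model" (a single threshold) and all names are illustrative assumptions, not taken from the talk.

```python
from typing import Callable, List, Optional, Sequence

Row = List[Optional[float]]
Step = Callable[[List[Row]], List[Row]]


def preprocess(rows: List[Row]) -> List[Row]:
    """Data preprocessing: drop rows that contain missing values (None)."""
    return [row for row in rows if all(value is not None for value in row)]


def engineer_features(rows: List[Row]) -> List[Row]:
    """Feature engineering: append the sum of each row as an extra feature."""
    return [row + [sum(row)] for row in rows]


def run_steps(rows: List[Row], steps: Sequence[Step]) -> List[Row]:
    """Apply the pipeline steps in order."""
    for step in steps:
        rows = step(rows)
    return rows


# Both pipelines use the same component functions, so a change to a
# component reaches training and inference together.
SHARED_STEPS = (preprocess, engineer_features)


def training_pipeline(rows: List[Row]) -> float:
    """Data preprocessing, feature engineering, then model training.

    The toy 'model' is a threshold: the mean of the engineered feature.
    """
    features = run_steps(rows, SHARED_STEPS)
    return sum(row[-1] for row in features) / len(features)


def inference_pipeline(rows: List[Row], threshold: float) -> List[int]:
    """Data preprocessing, feature engineering, then prediction."""
    features = run_steps(rows, SHARED_STEPS)
    return [int(row[-1] > threshold) for row in features]


train_rows = [[1.0, 2.0], [3.0, None], [5.0, 6.0]]
model = training_pipeline(train_rows)
predictions = inference_pipeline([[0.5, 0.5], [10.0, 1.0]], model)
print(model, predictions)
```

Keeping the shared steps in one place is the design choice that matters here: if training and inference each had their own copy of the feature code, the two could drift apart without anyone noticing.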
3
Versioning
TRACKING THE CHANGES IN ML SYSTEM
Reproducibility can be achieved by tracking and versioning every change in the ML system.
Changes to Track
Data – Dataset
‣ Dataset version
‣ Data availability timestamp
‣ Dataset split
‣ Dataset shuffling
Data – Preprocessing
‣ Preprocessing parameters
‣ Target variable transformation
Data – Features
‣ Feature computation parameters
‣ Feature selection
Model – Model Parameters
‣ Model type
‣ Model hyperparameters
‣ Weights initialization
‣ Evaluation parameters
‣ Dropout
System – Source Code
‣ Components source code
‣ Pipeline definition
System – Environment
‣ Dependencies
‣ Environment variables
‣ Infrastructure
‣ Floating point calculation
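Two of the data items listed above, the dataset split and dataset shuffling, can be made repeatable by deriving them from recorded values instead of from unseeded randomness. The following is a minimal sketch, assuming each record has a stable string ID; the names `assign_split`, `salt` and `test_fraction` are illustrative and not from the talk.

```python
import hashlib


def assign_split(record_id: str, salt: str = "split-v1", test_fraction: float = 0.2) -> str:
    """Assign a record to 'train' or 'test' from a hash of its ID.

    The assignment depends only on (record_id, salt, test_fraction), so it
    is the same on every run and every machine, and existing records keep
    their assignment when new records are added. Recording salt and
    test_fraction is then enough to version the split.
    """
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


record_ids = [f"customer-{i}" for i in range(1000)]  # hypothetical IDs
split = {rid: assign_split(rid) for rid in record_ids}

test_share = sum(1 for s in split.values() if s == "test") / len(split)
print(f"test share: {test_share:.2f}")
```

Changing the salt produces a different but equally repeatable split, which is why the salt belongs among the tracked parameters.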
Experiment Tracking
Dataset Versioning
Feature Versioning – Feature Store
The feature store is a central location where features are stored and organized for the explicit purpose of being used either to train models or to make predictions. Features are computed when new data becomes available and are stored in the feature store, as opposed to being computed on the fly by training and serving services.
A feature store should provide:
‣ Updated list of feature consumers
‣ Point-in-time lookup
Benefits of using a feature store:
‣ Consistent feature engineering for model development, training and serving
‣ Bridging the gap between data scientists and data & ML engineers
‣ Discovery and reuse of available feature sets, avoiding similar features with different definitions
‣ Point-in-time lookup to prevent data leakage
‣ Accelerated ML innovation
‣ Reproducibility of ML experiments
‣ Empowering legal and compliance teams to ensure compliant use of data
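Point-in-time lookup, mentioned in the feature store description, means returning the feature value that was known at a given moment, so that a training example never sees data from its own future. A minimal in-memory sketch follows; the class `ToyFeatureStore`, its methods and the integer timestamps are assumptions made for illustration and do not describe any particular feature store product.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class ToyFeatureStore:
    """In-memory sketch of a feature store with point-in-time lookup."""

    def __init__(self) -> None:
        # (entity_id, feature) -> list of (timestamp, value), sorted by timestamp
        self._history: Dict[Tuple[str, str], List[Tuple[int, float]]] = defaultdict(list)

    def write(self, entity_id: str, feature: str, timestamp: int, value: float) -> None:
        """Store a feature value computed when new data became available."""
        history = self._history[(entity_id, feature)]
        history.append((timestamp, value))
        history.sort(key=lambda item: item[0])

    def lookup(self, entity_id: str, feature: str, as_of: int) -> Optional[float]:
        """Return the latest value with timestamp <= as_of, or None if there is none."""
        history = self._history.get((entity_id, feature), [])
        timestamps = [ts for ts, _ in history]
        index = bisect_right(timestamps, as_of)
        return history[index - 1][1] if index else None


store = ToyFeatureStore()
store.write("user-1", "orders_30d", timestamp=100, value=2.0)
store.write("user-1", "orders_30d", timestamp=200, value=5.0)

# A training label observed at time 150 may only use the value known at 150.
print(store.lookup("user-1", "orders_30d", as_of=150))
# Serving at time 250 sees the most recent value.
print(store.lookup("user-1", "orders_30d", as_of=250))
# Before the first write there is nothing to return.
print(store.lookup("user-1", "orders_30d", as_of=50))
```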
Model Versioning – Model Registry
A model registry is a service that manages multiple model artifacts and tracks and governs models at different stages of the ML lifecycle.
The model registry provides:
‣ Centralized storage for all types of models
‣ Collaborative unit for model lifecycle management
‣ Basis for assessing model risks and model governance
‣ Fast and seamless model roll-out and roll-back
The model registry should keep track of:
‣ Model name
‣ Model architecture
‣ Model hyperparameters
‣ Trained model/model weights
‣ Model metrics
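The tracked fields and the roll-out and roll-back capability described for the model registry can be sketched in a few lines. This is a toy in-memory version under stated assumptions: the class and method names, the storage URIs and the metric values are all made up for illustration and do not correspond to a specific registry product.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    architecture: str
    hyperparameters: Dict[str, Any]
    weights_uri: str  # where the trained model/weights are stored
    metrics: Dict[str, float]


class ToyModelRegistry:
    """In-memory sketch of a model registry with roll-out and roll-back."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}
        self._production: Dict[str, List[int]] = {}  # roll-out history per model

    def register(self, name: str, architecture: str, hyperparameters: Dict[str, Any],
                 weights_uri: str, metrics: Dict[str, float]) -> ModelVersion:
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, architecture,
                             dict(hyperparameters), weights_uri, dict(metrics))
        versions.append(entry)
        return entry

    def get(self, name: str, version: int) -> ModelVersion:
        for entry in self._versions.get(name, []):
            if entry.version == version:
                return entry
        raise KeyError(f"{name} v{version} is not registered")

    def roll_out(self, name: str, version: int) -> None:
        self.get(name, version)  # raises KeyError for unknown versions
        self._production.setdefault(name, []).append(version)

    def roll_back(self, name: str) -> int:
        history = self._production.get(name, [])
        if len(history) < 2:
            raise ValueError("no earlier production version to roll back to")
        history.pop()
        return history[-1]

    def production(self, name: str) -> Optional[ModelVersion]:
        history = self._production.get(name)
        return self.get(name, history[-1]) if history else None


registry = ToyModelRegistry()
registry.register("churn", "logistic_regression", {"C": 1.0},
                  "s3://example-bucket/churn/1", {"f1": 0.71})  # made-up values
registry.register("churn", "gradient_boosting", {"n_estimators": 200},
                  "s3://example-bucket/churn/2", {"f1": 0.74})  # made-up values
registry.roll_out("churn", 1)
registry.roll_out("churn", 2)
restored = registry.roll_back("churn")  # the new model is not performing as expected
print(restored, registry.production("churn").architecture)
```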
Environment Versioning – Container Registry
Pipeline Versioning – Workflow Orchestration
Infrastructure Versioning – IaC
Infrastructure as code (IaC) means provisioning, configuring and managing infrastructure with machine-readable definition files.
Benefits:
‣ Ensures infrastructure consistency and eliminates configuration drift.
‣ Reduces cost.
‣ Increases the speed of deployments.
‣ Improves scalability and availability.
‣ Fosters collaboration.
‣ Standardizes the deployment workflow.
‣ Reduces the risk of errors.
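Configuration drift, named in the benefits above, is the gap between what the definition files declare and what is actually running. The sketch below shows the comparison in plain Python; the resource names, settings and the `find_drift` helper are hypothetical, and real IaC tools perform this check against live infrastructure rather than against a dictionary.

```python
from typing import Any, Dict, List

Definition = Dict[str, Dict[str, Any]]

# Hypothetical machine-readable definition, kept under version control.
DECLARED: Definition = {
    "training_cluster": {"instance_type": "gpu-small", "node_count": 2},
    "prediction_service": {"replicas": 3, "image": "model-server:1.4.0"},
}


def find_drift(declared: Definition, actual: Definition) -> List[str]:
    """List the differences between the declared definition and the actual state."""
    findings: List[str] = []
    for resource, settings in declared.items():
        live = actual.get(resource)
        if live is None:
            findings.append(f"{resource}: missing")
            continue
        for key, expected in settings.items():
            if live.get(key) != expected:
                findings.append(
                    f"{resource}.{key}: expected {expected!r}, found {live.get(key)!r}"
                )
    # Resources that exist but were never declared are drift as well.
    for resource in sorted(actual.keys() - declared.keys()):
        findings.append(f"{resource}: not declared")
    return findings


# Hypothetical live state: someone scaled the service by hand.
ACTUAL: Definition = {
    "training_cluster": {"instance_type": "gpu-small", "node_count": 2},
    "prediction_service": {"replicas": 5, "image": "model-server:1.4.0"},
}

for finding in find_drift(DECLARED, ACTUAL):
    print(finding)
```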
Version Versioning – Metadata Store
A metadata store is a central place that holds and connects all parameters of the ML system.
It may hold, for example:
‣ Data version: Reference to the dataset, MD5 hash, dataset sample, to know which data was used to train the model
‣ Environment configuration: Docker image ID, requirements.txt, conda.yml, Dockerfile, Makefile, to know how to recreate the environment where the model was trained
‣ Code version: Git SHA of a commit or an actual snapshot of the code, to know what code was used to build a model
‣ Model version: Model ID, configuration of the feature preprocessing steps of the pipeline, model training, and inference, to reproduce the process if needed
‣ Model performance metrics: Experiment ID, F1, accuracy, ROC on the test and validation sets, to know how your model performs
‣ Hardware metrics: CPU, GPU, TPU, memory, to see how much your model consumes during training/inference
‣ Performance visualizations: ROC curve, confusion matrix, PR curve, to understand the errors deeply
‣ Model predictions: to see the actual predictions and understand model performance beyond metrics
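A few of the metadata store entries listed above can be assembled into one record per training run. The sketch below uses only the standard library; the `RunMetadata` fields, the placeholder Git SHA, the model ID and the metric value are illustrative assumptions, and a real store would persist the record in a database instead of a dictionary.

```python
import hashlib
import json
import platform
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class RunMetadata:
    data_md5: str                 # data version
    code_version: str             # Git SHA of the commit
    environment: Dict[str, str]   # environment configuration
    model_id: str                 # model version
    metrics: Dict[str, float]     # model performance metrics


def md5_of_bytes(payload: bytes) -> str:
    """MD5 hash used as a fingerprint of the training data."""
    return hashlib.md5(payload).hexdigest()


# Stand-in for the real training data file contents.
dataset_bytes = b"id,feature,label\n1,0.5,0\n2,0.9,1\n"

record = RunMetadata(
    data_md5=md5_of_bytes(dataset_bytes),
    code_version="0000000",  # placeholder, not a real commit
    environment={"python": platform.python_version()},
    model_id="churn-v2",     # hypothetical model ID
    metrics={"f1": 0.74},    # made-up value
)

# Toy metadata store: model ID -> serialized record.
metadata_store: Dict[str, str] = {}
metadata_store[record.model_id] = json.dumps(asdict(record), sort_keys=True)

# Later: check whether a dataset is the one the model was trained on.
stored = json.loads(metadata_store["churn-v2"])
print(stored["data_md5"] == md5_of_bytes(dataset_bytes))
print(stored["data_md5"] == md5_of_bytes(dataset_bytes + b"3,0.1,0\n"))
```

Because the record links the data hash, code version, environment and metrics under one model ID, a single lookup is enough to answer which data and code produced a given model.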
[Diagram: ML system overview. Stores: Source Code, Experiment Tracking, Feature Store, Model Registry, Metadata Store. Stages: Data Engineering; Experimenting and Model Development; ML Pipeline CI/CD (Build, Test, Package, Deploy); Continuous Model Training; Model CD; Prediction Service; Continuous Monitoring.]
4
Documentation
THE ONLY DIFFERENCE BETWEEN SCIENCE AND FOOLING AROUND IS WRITING IT DOWN
Document as you go.
Start from day 1.
Thank You!
Q&A

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 

Último (20)

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 

[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar

  • 1. Reproducibility and Versioning of ML Systems. ŠPELA POKLUKAR | MACHINE LEARNING CONSULTANT
  • 2. DSC 2022 // © COPYRIGHT 2022 ENDAVA. "Špela is an experienced machine learning consultant with experience mostly in SW engineering services and the energy sector. She has successfully led projects in various domains such as manufacturing, finance, robotics, energy, and IT services. She is currently employed as a data discipline lead at Endava Slovenia and is an active member of innovation and gender balance communities. Her background is in mathematics, philosophy, and theology." Spela.poklukar@endava.com, +386 40 545 898. Špela Poklukar, MACHINE LEARNING CONSULTANT
  • 3. Agenda: 1. MOTIVATION 2. MODULARITY 3. VERSIONING 4. DOCUMENTATION
  • 4. 1 Motivation: WHY WE NEED REPRODUCIBILITY ANYWAY
  • 5. Reproducibility: Two Sides of the Same Coin
      ‣ REPRODUCIBILITY OF ML RESEARCH RESULTS: Being able to recreate someone else's ML workflow and reach the same or similar conclusions as the original work.
      ‣ REPRODUCIBILITY OF ML SYSTEMS: Being able to run an ML workflow repeatedly and reach the same or similar results on each run.
  • 6. Why Reproducibility?
      ‣ EVIDENCE OF SIGNIFICANCE: To ensure the obtained results are accurate and significant.
      ‣ ABLATION: To ensure that the claimed gain really comes from the intended change and is not random.
      ‣ COST ESTIMATION: To inform potential consumers about computational complexity.
  • 7. Why Reproducibility?
      ‣ SCALING: To be able to scale the machine learning system by replicating its parts.
      ‣ INFERENCE: To ensure the selected model is the same one used for inference.
      ‣ FAULT TOLERANCE: To reduce the risk of errors by consistently obtaining the same results.
      ‣ MODEL ROLLBACK: To allow for model rollback in case the new model is not performing as expected.
      ‣ TRUST: To build trust in, and credibility for, the machine learning product.
      ‣ REGULATION: To adhere to increasing regulatory constraints.
  • 8. 2 Modularity: ADOPTION OF PIPELINE MENTALITY
  • 9. Pipelines
      ‣ Development Pipeline: Feature Engineering, Data Preprocessing, Model Training, Prediction Service, Model Evaluation
      ‣ Training Pipeline: Feature Engineering, Data Preprocessing, Model Training
      ‣ Inference Pipeline: Feature Engineering, Data Preprocessing, Prediction Service
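The pipelines on slide 9 share their data preparation components. A minimal Python sketch of that idea follows; it is an illustration, not the speaker's implementation. All function names, the toy rules inside each step, and the order in which the steps run are assumptions made for the example.

```python
from typing import Callable, Sequence

Step = Callable[[list], list]


def preprocess(rows: list) -> list:
    """Data preprocessing: drop missing values (toy rule, assumed)."""
    return [r for r in rows if r is not None]


def engineer_features(rows: list) -> list:
    """Feature engineering: pair each value with its square (toy rule, assumed)."""
    return [(x, x * x) for x in rows]


def train(features: list) -> list:
    """Stand-in for model training: the 'model' is the mean of the squares."""
    mean_sq = sum(sq for _, sq in features) / len(features)
    return [mean_sq]


def predict(features: list) -> list:
    """Stand-in for a prediction service: flag rows whose square exceeds 4."""
    return [sq > 4 for _, sq in features]


def run_pipeline(steps: Sequence[Step], data: list) -> list:
    """Apply each step to the output of the previous one."""
    for step in steps:
        data = step(data)
    return data


# Both pipelines reuse the same two components, so training and
# inference prepare data in exactly the same way.
SHARED_STEPS = [preprocess, engineer_features]
training_pipeline = SHARED_STEPS + [train]
inference_pipeline = SHARED_STEPS + [predict]
```

Because each pipeline is just a list of steps, changing a shared component changes it for every pipeline at once, which is the property that makes versioning the components worthwhile.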
  • 10. 3 Versioning: TRACKING THE CHANGES IN ML SYSTEM
  • 11. Reproducibility can be achieved by tracking and versioning every change in the ML system.
  • 12. Changes to Track
      ‣ Data
          Dataset: dataset version, data availability timestamp, dataset split, dataset shuffling
          Preprocessing: preprocessing parameters, target variable transformation
          Features: feature computation parameters, feature selection
      ‣ Model
          Model Parameters: model type, model hyperparameters, weights initialization, evaluation parameters, dropout
      ‣ System
          Source Code: components source code, pipeline definition
          Environment: dependencies, environment variables, infrastructure, floating point calculation
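Two of the items on slide 12, dataset split and dataset shuffling, become reproducible once the random seed is treated as a tracked parameter. The sketch below uses only the Python standard library; the function name, its parameters, and the default values are hypothetical and chosen for illustration.

```python
import random


def split_dataset(records, test_fraction=0.2, seed=42):
    """Shuffle and split records deterministically for a given seed.

    Returns (train, test, config), where config holds the parameters
    that would need to be versioned to repeat the split.
    """
    rng = random.Random(seed)   # local generator, no global state touched
    shuffled = list(records)    # copy so the caller's data is left unchanged
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    config = {"seed": seed, "test_fraction": test_fraction}
    return shuffled[n_test:], shuffled[:n_test], config
```

Calling `split_dataset` twice with the same records and the same `config` values yields identical train and test sets, so storing `config` alongside the dataset version is enough to recreate the split.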
  • 14. Experiment Tracking
  • 15. Dataset Versioning
  • 16. Feature Versioning – Feature Store
      The feature store is a central location where features are stored and organized for the explicit purpose of being used either to train models or to make predictions. Features are computed when new data becomes available and are stored in the feature store, as opposed to being computed on the fly by training and serving services.
      A feature store should provide:
      ‣ Updated list of feature consumers
      ‣ Point-in-time lookup
      Benefits of using a feature store:
      ‣ Consistent feature engineering for model development, training, and serving
      ‣ Bridging the gap between data scientists and data & ML engineers
      ‣ Discovering and reusing available feature sets, avoiding similar features with different definitions
      ‣ Point-in-time lookup to prevent data leakage
      ‣ Accelerating ML innovation
      ‣ Reproducibility of ML experiments
      ‣ Empowering legal and compliance teams to ensure compliant use of data
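Point-in-time lookup means returning the feature value that was known at a given moment, never a later one, which is how it prevents data leakage into training sets. The following is a minimal in-memory sketch of that behaviour for a single feature; the class and method names are hypothetical and do not correspond to any particular feature store product.

```python
from bisect import bisect_right


class FeatureHistory:
    """Time-ordered values of one feature for one entity (illustrative only)."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def write(self, timestamp, value):
        """Store a value together with the time it became available."""
        idx = bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(idx, timestamp)
        self._values.insert(idx, value)

    def as_of(self, timestamp):
        """Return the latest value available at `timestamp`, or None."""
        idx = bisect_right(self._timestamps, timestamp)
        if idx == 0:
            return None  # nothing was known yet at that time
        return self._values[idx - 1]
```

A training example labelled at time 25 would call `as_of(25)` and receive the value written at or before that time, even if newer values have since been stored.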
  • 17. Model Versioning – Model Registry
      A model registry is a service that manages multiple model artifacts and tracks and governs models at different stages of the ML lifecycle.
      The model registry provides:
      ‣ Centralized storage for all types of models
      ‣ A collaborative unit for model lifecycle management
      ‣ A basis for assessing model risks and for model governance
      ‣ Fast and seamless model roll-out and roll-back
      The model registry should keep track of:
      ‣ Model name
      ‣ Model architecture
      ‣ Model hyperparameters
      ‣ Trained model/model weights
      ‣ Model metrics
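The fields listed on slide 17 and the roll-out and roll-back behaviour can be sketched as a small in-memory registry. This is a toy illustration under stated assumptions: the class names, method names, and the `weights_uri` field are hypothetical, and a real registry would persist its state rather than keep it in dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    architecture: str
    hyperparameters: dict
    weights_uri: str  # assumed: a pointer to where the trained weights live
    metrics: dict


class ModelRegistry:
    def __init__(self):
        self._versions: Dict[str, List[ModelVersion]] = {}
        self._rollouts: Dict[str, List[int]] = {}  # history of promoted versions

    def register(self, name, architecture, hyperparameters, weights_uri, metrics):
        """Store a new version; version numbers start at 1 per model name."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, architecture,
                             dict(hyperparameters), weights_uri, dict(metrics))
        versions.append(entry)
        return entry

    def roll_out(self, name, version):
        """Mark a registered version as the one currently served."""
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise ValueError(f"unknown version {version} of {name!r}")
        self._rollouts.setdefault(name, []).append(version)

    def roll_back(self, name) -> ModelVersion:
        """Return to the previously served version."""
        history = self._rollouts.get(name, [])
        if len(history) < 2:
            raise ValueError(f"no earlier roll-out of {name!r} to return to")
        history.pop()
        return self.current(name)

    def current(self, name) -> Optional[ModelVersion]:
        history = self._rollouts.get(name, [])
        if not history:
            return None
        return self._versions[name][history[-1] - 1]
```

Keeping the roll-out history as a list, rather than a single "current" pointer, is what makes the roll-back a one-step operation.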
  • 18. Environment Versioning – Container Registry
  • 19. Pipeline Versioning – Workflow Orchestration
  • 20. Infrastructure Versioning – IaC
      Provisioning, configuring, and managing infrastructure with machine-readable definition files.
      Benefits:
      ‣ Ensures infrastructure consistency and eliminates configuration drift.
      ‣ Cost reduction.
      ‣ Increase in speed of deployments.
      ‣ Scalability and availability.
      ‣ Fosters collaboration.
      ‣ Standardizes deployment workflow.
      ‣ Error risk reduction.
  • 21. Version Versioning – Metadata Store
      The metadata store is a central place that holds and connects all parameters of the ML system. It may hold, for example:
      ‣ Data version: reference to the dataset, MD5 hash, or dataset sample, to know which data was used to train the model
      ‣ Environment configuration: Docker image ID, requirements.txt, conda.yml, Dockerfile, Makefile, to know how to recreate the environment where the model was trained
      ‣ Code version: Git SHA of a commit or an actual snapshot of the code, to know what code was used to build the model
      ‣ Model version: model ID and configuration of the feature preprocessing steps of the pipeline, model training, and inference, to reproduce the process if needed
      ‣ Model performance metrics: experiment ID, F1, accuracy, ROC on the test and validation sets, to know how the model performs
      ‣ Hardware metrics: CPU, GPU, TPU, memory, to see how much the model consumes during training and inference
      ‣ Performance visualizations: ROC curve, confusion matrix, PR curve, to understand the errors in depth
      ‣ Model predictions: to see the actual predictions and understand model performance beyond metrics
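A single metadata record tying several of these items together might look like the sketch below. It uses only the Python standard library; the function names, the record layout, and the key names are assumptions made for illustration, and the Git SHA and Docker image ID are passed in as plain strings rather than looked up.

```python
import hashlib
import json


def md5_of_bytes(data: bytes) -> str:
    """MD5 hex digest used as a fingerprint of the dataset contents."""
    return hashlib.md5(data).hexdigest()


def build_run_record(dataset_bytes, git_sha, docker_image_id, model_id, metrics):
    """Collect the identifiers needed to trace one training run."""
    return {
        "data_version": {"md5": md5_of_bytes(dataset_bytes)},
        "environment": {"docker_image_id": docker_image_id},
        "code_version": {"git_sha": git_sha},
        "model_version": {"model_id": model_id},
        "metrics": dict(metrics),
    }


def serialize(record: dict) -> str:
    """Stable JSON form, so identical records produce identical text."""
    return json.dumps(record, sort_keys=True, indent=2)
```

Sorting the keys on serialization means two records with the same content can be compared as text, which helps when checking whether a rerun really used the same data, code, and environment.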
  • 22. EXPERIMENT TRACKING; SOURCE CODE; FEATURE STORE; MODEL REGISTRY; METADATA STORE; EXPERIMENTING AND MODEL DEVELOPMENT; ML PIPELINE CI/CD: BUILD, TEST, PACKAGE, DEPLOY; DATA ENGINEERING; CONTINUOUS MODEL TRAINING; MODEL CD; PREDICTION SERVICE; CONTINUOUS MONITORING; DATA ENGINEERING
  • 23. 4 Documentation: THE ONLY DIFFERENCE BETWEEN SCIENCE AND FOOLING AROUND IS WRITING IT DOWN
  • 25. Document as you go. Start from day 1.
  • 26. MLOps – New Kid on the Block - Thank You!