MLOps and the Feature Store
with Hopsworks
Jim Dowling
CEO, Hopsworks
DC Data Science Meetup,
Sep 14th 2021
We all take different Journeys to arrive at the Feature Store
Data Engineer: “Gotta feed those data ‘scientists’ with data”
Data Scientist: “Hello!?! Hello!?! Is there any data out there?”
ML Engineer: And then she said “productionize this notebook”
Feature Store
We all take different Journeys to arrive at MLOps
Data Engineer: Orchestrated Pipelines, baby!
Data Scientist: Notebooks as Jobs, yay!
ML Engineer: Containerize, kubernetize, observerize!
Feature Store triggers them
Feature Store
Feature Engineering
Model Training
Model Serving
Model Monitoring
Validate & Test
Input Data
MLOps with a Feature Store
SQL or Python or Spark for Feature Engineering?
SQL Features (Table)
DB
DB
Python Features (Dataframe)
Msg Bus
Files
Extract, Aggregate, Transform
Spark
DBT
Extract, Aggregate, Transform
What Feature Engineering do we typically perform where?
Aggregations, Data Validation
Training Data
Serving
Raw Data
Feature Store
Model Repo
Transformations
Input Data
Need to ensure no skew between training and serving transformations
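One way to avoid such skew can be sketched as follows, assuming transformations are plain Python functions. The names min_max_scale, TRANSFORMS and apply_transforms are illustrative and not part of HSFS. The idea is to define each transformation once and call the same code from both the training path and the serving path.

```python
# Illustrative sketch: one shared registry of transformations used by
# both the training path and the serving path, so they cannot diverge.

def min_max_scale(value, lo, hi):
    """Scale a raw value into [0, 1] given statistics from the training data."""
    return (value - lo) / (hi - lo)

# Hypothetical registry: feature name -> (function, fixed parameters).
TRANSFORMS = {
    "transaction_amount": (min_max_scale, {"lo": 0.0, "hi": 1000.0}),
}

def apply_transforms(row):
    """Apply the registered transformation to every feature that has one."""
    out = dict(row)
    for name, (fn, params) in TRANSFORMS.items():
        if name in out:
            out[name] = fn(out[name], **params)
    return out

# Training path: transform every historical row.
training_rows = [apply_transforms(r) for r in [{"transaction_amount": 250.0}]]

# Serving path: transform a single request with the very same code.
serving_row = apply_transforms({"transaction_amount": 250.0})
```

Because both paths go through apply_transforms, a change to a transformation or its parameters takes effect in training and serving at the same time.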
Feature Group
Primary Key   Feature 1   Feature M
0             ...         ...
1             ...         ...
2             ...         ...
...           ...         ...
N             ...         ...
import hsfs

connection = hsfs.connection(...)
fs = connection.get_feature_store()
# define the feature group's metadata; the primary key identifies a row
fg_meta = fs.create_feature_group(name="sales_fg",
    version=1,
    primary_key=['store', 'date', 'dept'],
    event_time="ts",
    description="customer features",
    online_enabled=True)
HSFS API - Create Feature Groups
sales_fg = fs.get_feature_group("sales_fg", version=1)
df = ...  # featurize some data to ingest into the feature store
sales_fg.insert(df)
Batch insert/backfilling features into the Feature Store
Spark Streaming insertion of features into the Feature Store
sales_fg = fs.get_feature_group("sales_fg", version=1)
streaming_df = ...  # get streaming dataframe to ingest into the feature store
sales_fg.insert_stream(streaming_df)
Data Validation for Feature Groups (using Deequ)
from hsfs.rule import Rule

expectation_sales = fs.create_expectation(...,
    rules=[Rule(name="HAS_MIN", level="WARNING", min=0),
           Rule(name="HAS_MAX", level="ERROR", max=1000000)])
sales_fg = fs.get_feature_group("sales_fg", version=1)
sales_fg.attach_expectation(expectation_sales)
df = ...  # get some dataframe to ingest into the feature store
# Run Data Validation Rules when data is written
sales_fg.insert(df)
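What the two rules check can be sketched in plain Python. This is an illustrative stand-in, not Deequ or the HSFS implementation: SimpleRule and validate are hypothetical names, the data is made up, and the sketch assumes HAS_MIN means "the column minimum must not fall below min" and HAS_MAX means "the column maximum must not exceed max".

```python
# Illustrative stand-in for rule-based validation; this is not Deequ or HSFS.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimpleRule:
    name: str                      # "HAS_MIN" or "HAS_MAX"
    level: str                     # "WARNING" or "ERROR"
    min: Optional[float] = None
    max: Optional[float] = None

def validate(values, rules):
    """Return (rule name, level) for every rule the column violates."""
    violations = []
    for rule in rules:
        if rule.name == "HAS_MIN" and min(values) < rule.min:
            violations.append((rule.name, rule.level))
        elif rule.name == "HAS_MAX" and max(values) > rule.max:
            violations.append((rule.name, rule.level))
    return violations

rules = [SimpleRule("HAS_MIN", "WARNING", min=0),
         SimpleRule("HAS_MAX", "ERROR", max=1000000)]

weekly_sales = [120.0, 950.5, 2500000.0]   # made-up data
violations = validate(weekly_sales, rules)
# Treat ERROR-level violations as a reason to reject the write.
reject_write = any(level == "ERROR" for _, level in violations)
```

Separating the level from the check lets a pipeline log WARNING violations while blocking writes only on ERROR violations.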
On-Demand Feature Groups (External Tables)
snowflake_conn = fs.get_storage_connector("telco_snowflake_cluster")
telco_on_dmd = fs.create_on_demand_feature_group(name="telco_snowflake",
    version=latest_version,
    query="select * from telco",
    description="On-demand FG",
    storage_connector=snowflake_conn,
    statistics_config=True)
telco_on_dmd.save()
You can also use connectors to any JDBC source or S3 source or ADLS on Azure
JOIN, Transform, Filter Features to create Training Datasets
Training Dataset (Transform, Filter)
Primary Key   Feature 1   Feature J   LABEL (CHURN_weekly)
0             ...         ...         1
1             ...         ...         0
2             ...         ...         0
...           ...         ...         ...
N             ...         ...         1
Feature Group A
Primary Key   Feature 1   Feature M
0             ...         ...
1             ...         ...
2             ...         ...
...           ...         ...
N             ...         ...
Feature Group B
Primary Key   Feature 1   Feature J
0             ...         ...
1             ...         ...
2             ...         ...
...           ...         ...
N             ...         ...
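The join, filter and label split on this slide can be sketched in plain Python. The rows are made up, feature groups are modelled as dicts keyed by primary key, and build_training_dataset is a hypothetical helper rather than an HSFS call.

```python
# Illustrative sketch with made-up rows; feature groups are modelled as
# dicts keyed by primary key.
feature_group_a = {0: {"f1": 0.3, "churn_weekly": 1},
                   1: {"f1": 0.7, "churn_weekly": 0},
                   2: {"f1": 0.1, "churn_weekly": 0}}
feature_group_b = {0: {"g1": 12}, 1: {"g1": 40}}   # key 2 is missing

def build_training_dataset(fg_a, fg_b, label, keep=lambda row: True):
    """Inner-join two feature groups on primary key, filter, split off label."""
    features, labels = [], []
    for pk in sorted(fg_a.keys() & fg_b.keys()):
        row = {**fg_a[pk], **fg_b[pk]}
        if keep(row):
            labels.append(row.pop(label))
            features.append(row)
    return features, labels

features, labels = build_training_dataset(
    feature_group_a, feature_group_b, label="churn_weekly",
    keep=lambda row: row["g1"] > 10)
```

Primary key 2 drops out because it has no match in the second feature group, which is the behaviour of an inner join.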
HSFS API - Transformation Functions
# Store in a Python module. More than 1 transformation fn per file is allowed.
from datetime import datetime

def date_string_to_timestamp(date):
    date_format = "%Y%m%d%H%M%S"
    # parse the string, then convert seconds since the epoch to milliseconds
    return int(float(datetime.strptime(date, date_format).timestamp()) * 1000)
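A short usage sketch of the transformation function follows; the input string is made up. One detail worth noting: strptime returns a naive datetime here, and timestamp() interprets a naive datetime in the local timezone of the machine running the code, so the resulting milliseconds value depends on where the function runs.

```python
from datetime import datetime

def date_string_to_timestamp(date):
    date_format = "%Y%m%d%H%M%S"
    return int(float(datetime.strptime(date, date_format).timestamp()) * 1000)

millis = date_string_to_timestamp("20210601210400")
# A naive datetime is interpreted in the local timezone, so convert back
# with the local timezone as well to check the round trip.
round_trip = datetime.fromtimestamp(millis / 1000)
```

If training and serving run in different timezones, making the parsed datetime timezone-aware before calling timestamp() would keep the two paths consistent.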
HSFS API - Create Training Datasets with Transformations
date_string_2_ts = fs.create_transformation_function(
    transformation_function=python_file.date_string_to_timestamp,
    output_type="long", version=1)
# JOIN the features together and keep only the DC rows
query = (sales_fg.select_all()
         .join(exogeneous_fg.select(['fuel_price', 'cpi']))
         .filter(sales_fg.state == "DC"))
td = fs.create_training_dataset(name="sales_dc_td",
    description="Dataset to train the Sales model for DC",
    data_format="tfrecord",
    transformation_functions={"sale_ts": date_string_2_ts},
    version=1,
    label=["label_col"])
td.save(query)
Analytical Models: Feature Store, Batch Inference, Report. High throughput important, latency not critical.
Operational Models: Feature Store, Feature Vectors, Model Serving. Latency and availability are critical for user experience.
Models retrieve pre-computed features (Feature Vectors) from the Feature Store
Primary Key   Feature 1   Feature N   CHURN_weekly
ID            ...         ...         N/A
From App      From Feature Store      No Label - Predict it
Lookup Features from Feature Store using “ID”
Note: these are the same features as in the Training Dataset, minus the label
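The lookup can be sketched in plain Python. The online store is modelled as a dict keyed by primary key, and ONLINE_STORE, FEATURE_ORDER and get_feature_vector are hypothetical names. The values are made up; the feature names follow the fraud example later in the deck.

```python
# Illustrative sketch: the online store is a dict keyed by primary key,
# and FEATURE_ORDER is the (assumed) column order used at training time.
ONLINE_STORE = {
    42: {"user_nationality": "SE", "user_gender": "F"},
}
FEATURE_ORDER = ["transaction_type", "transaction_amount",
                 "user_nationality", "user_gender"]

def get_feature_vector(request):
    """Merge features sent by the app with features looked up by ID."""
    stored = ONLINE_STORE[request["user_id"]]      # from the feature store
    merged = {**request, **stored}                 # from the app + stored
    # The label is absent: it is what the model is asked to predict.
    return [merged[name] for name in FEATURE_ORDER]

vector = get_feature_vector(
    {"user_id": 42, "transaction_type": "transfer", "transaction_amount": 99.0})
```

Emitting the values in the training-time column order matters because most models consume a positional array rather than named fields.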
HSFS API - Serving
td = fs.get_training_dataset("sales_dc_td", version=1)
td.init_prepared_statement()
# online transformation functions are transparently applied before returning
prediction_array = td.get_serving_vector({"date": "2021-06-01 21:04:00"})
# call model with 'prediction_array' as input
transaction_type
transaction_amount
user_id
user_nationality
user_gender
transactions_fg
users_fg
Feature Groups
Training Datasets
pk join
fraud_td
Descriptive Statistics, Feature Correlations, Histograms, ...
Use for Drift Detection
fraud_classifier
Models
Raw Data, Features, Training Data, Models
From Raw Data to Production Models in Hopsworks
Provenance Graph of Dependencies
Feature Groups Models
Training Datasets
Changes in upstream entities trigger actions that can cause downstream computations to run
Upstream Downstream
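How a change in an upstream entity identifies the downstream computations to run can be sketched as a graph walk. The DOWNSTREAM mapping and affected_by function are hypothetical; the entity names follow the fraud example on the slides.

```python
from collections import deque

# Illustrative provenance graph: each entity maps to its direct downstream
# dependents. Entity names follow the fraud example on the slides.
DOWNSTREAM = {
    "transactions_fg": ["fraud_td"],
    "users_fg": ["fraud_td"],
    "fraud_td": ["fraud_classifier"],
    "fraud_classifier": [],
}

def affected_by(changed):
    """Breadth-first walk returning every entity downstream of `changed`."""
    seen, order, queue = {changed}, [], deque([changed])
    while queue:
        for child in DOWNSTREAM[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

to_recompute = affected_by("users_fg")
```

In this sketch a change to users_fg marks the training dataset for recreation first and the model for retraining second.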
MLOps is Feature Pipelines, Training Pipelines, and Model Monitoring
transaction_type
transaction_amount
user_id
user_nationality
user_gender
transactions_fg
users_fg
Feature Groups
Training Datasets
pk join
fraud_td
Descriptive Statistics, Feature Correlations, Histograms, ...
Use for Drift Detection
fraud_classifier
Models
Feature Pipeline, Training Pipeline, Model Monitoring
Feature Store
Feature Engineering
Model Training
Model Serving
Model Monitoring
ML Engineers
Data Scientists
Model Testing
Data Engineers
Architects (Governance)
Roles and Responsibilities in a ML Pipeline
CI/CD Triggers and Orchestration of Pipelines in MLOps
Enterprise Data
Model Registry
Feature Pipeline
Model Serving
Training Pipeline
Feature Store
Orchestrator: Airflow, GitHub Actions, Jenkins
CI/CD Triggers: Code commit, New data, time trigger (e.g., daily)
Model Monitoring
Orchestrate Feature and Training Pipelines with Airflow in Hopsworks
Feature Engineering Notebook/Job
Validate on Data Slices & Deploy Model
Run Experiment to Train Model
Select Features, File Format and Create Training Data
FEATURE STORE
Data Science
Data Engineering
Compliance & Regulatory
Feature Store
Teams use the tools of their choice, integrated with the Hopsworks Feature Store
Model Serving
Hopsworks is an Open, Modular Feature Store that can Plug into ML Pipelines
Feature Pipeline
Feature Store
Batch or Streaming
Feature Pipeline
Enterprise Datastores
Aggregations
Data Validation
Training Pipeline
Model architecture
Select target, features
Find best HParams
Train model (distributed)
Validate Model
Deploy Model
Feature Store
Maggy - Experiments, Distributed ML, and write-once training logic
https://www.youtube.com/watch?v=1SHOwl37I5c
KubeFlow Model Serving (KFServing), the Feature Store, and Logging to Kafka
Local, Remote
AI-Enabled Application
KFServing, Feature Store
1. Prediction Request
2. Request Features
3. Return Enriched Feature Vector
4. Predict, Log, & Return Result
class Transformer:
    def __init__(self):
        self.fs = ...  # connect to feature store
        self.td = self.fs.get_training_dataset("sales_dc_td")

    def preprocess(self, inputs):
        # look up the pre-computed feature vector for the request's key
        return self.td.get_serving_vector(inputs["some-key"])
2. Request Features from inside the KFServing Transformer
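The four numbered steps can be simulated end to end without any serving infrastructure. Everything in this sketch is a stand-in: FakeFeatureStore, fake_model and prediction_log are hypothetical and replace the Feature Store, the deployed model and the Kafka topic respectively. None of it is KFServing or HSFS API.

```python
# Illustrative simulation of the four steps; FakeFeatureStore, fake_model
# and prediction_log are stand-ins, not KFServing or HSFS APIs.
class FakeFeatureStore:
    def __init__(self, rows):
        self.rows = rows

    def get_serving_vector(self, key):
        return self.rows[key]

prediction_log = []          # stands in for a Kafka topic

def fake_model(vector):
    return 1 if sum(vector) > 1.0 else 0

class Transformer:
    def __init__(self, feature_store):
        self.fs = feature_store

    def preprocess(self, inputs):
        # Steps 2 and 3: request features and return the enriched vector.
        return self.fs.get_serving_vector(inputs["some-key"])

def handle_request(inputs, transformer):
    vector = transformer.preprocess(inputs)            # steps 2-3
    prediction = fake_model(vector)                    # step 4: predict
    prediction_log.append({"inputs": inputs, "features": vector,
                           "prediction": prediction})  # step 4: log
    return prediction                                  # step 4: return

transformer = Transformer(FakeFeatureStore({"cust-1": [0.9, 0.4]}))
result = handle_request({"some-key": "cust-1"}, transformer)   # step 1
```

Logging the enriched feature vector alongside the prediction is what makes the logs usable later for monitoring and for building new training data.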
Kafka
Model Monitoring from KFServing Logs
Usage example
Windowed Outliers Pipe
Windowed Drift Pipe
Stats Outliers Pipe
Stats Drift Pipe
Outliers Pipe
Drift Pipe
Monitor pipe, Window pipe, Stats pipe, Sink Pipe
Alerts, Reports, Insights
Prediction Requests
Kafka
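The window, stats, outlier and drift pipes can be sketched with the standard library. This is a simplified illustration and not the monitoring framework shown on the slide: the function names, the baseline statistics, the logged values and the three-standard-deviation threshold are all assumptions.

```python
from statistics import mean, pstdev

def windows(values, size):
    """Window pipe: split the logged values into fixed-size windows."""
    return [values[i:i + size] for i in range(0, len(values) - size + 1, size)]

def stats(window):
    """Stats pipe: descriptive statistics for one window."""
    return {"mean": mean(window), "std": pstdev(window)}

def drifted(window_stats, baseline, threshold=3.0):
    """Drift pipe: flag a window whose mean is far from the training mean."""
    return abs(window_stats["mean"] - baseline["mean"]) > threshold * baseline["std"]

def outliers(window, baseline, threshold=3.0):
    """Outliers pipe: values more than `threshold` std devs from the training mean."""
    return [v for v in window
            if abs(v - baseline["mean"]) > threshold * baseline["std"]]

baseline = {"mean": 100.0, "std": 10.0}            # made-up training statistics
logged = [98.0, 103.0, 101.0, 99.0, 180.0, 175.0, 190.0, 185.0]
alerts = [drifted(stats(w), baseline) for w in windows(logged, 4)]
```

The baseline here plays the role of the training dataset statistics kept in the feature store, which is why those statistics are described earlier as useful for drift detection.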
New Training Data from Prediction Logs and the Evaluation Store
Prediction Requests
● Interactive Queries to debug the Model
● Interactive Queries to debug Inference Data
● Inspect Model KPIs Charts
● Inspect Model Serving Performance Charts
● Identify Model/Data Drift
● Interactive Queries to Audit Logs
Evaluation Store
Feature Store
ML Engineer
Data Scientist
● Understand Live Model Performance
● Use new Training Data
Kafka
End-to-End Example -
Anti-Money Laundering
https://github.com/logicalclocks/AMLend2end
CUSTOMER CASE STUDY SWEDBANK - ANTI-MONEY LAUNDERING (AML) WITH HOPSWORKS
THE CHALLENGE
Increase detection rate and reduce false positives and costs for AML.
GANs with a ~40 TB transaction dataset
Spark for Feature Engineering (including graph embeddings)
TensorFlow/GPUs to train a GAN
Features, Scale-out training, models, model serving
Webinar, Thursday 16th, 9am PT:
https://info.nvidia.com/accelerate-financial-fraud-detection-webinar.html?ncid=so-link-610204-vt09&linkId=100000063386013
With Hopsworks, Swedbank reduced its false positives by 99% compared to its previous rule-based system.
RULE-BASED AML vs DEEP LEARNING AML
CUSTOMER CASE STUDY SWEDBANK - ANTI-MONEY LAUNDERING (AML) WITH HOPSWORKS
Kafka
Teradata
Cloudera
AML Application
Retrieve Features (<10 ms)
Real-Time Financial Features
Customer Credit Score / KYC
Historical Financial Transactions
Is this Money Transfer Suspicious?
Model
Train (40 TB)
Hopsworks Feature Store is the central location where all the data (features) are stored and manipulated for use by the AML application.
Hopsworks Feature Store
Anti-Money Laundering End-to-End Example
transactions alert_transactions
party
trans_embeddings alert_trans_embeddings
training_data
user_id is the join key for party and (alert_)transactions
test_data
trans_id is the join key for (alert_)transactions and (alert_)trans_embeddings
user_id
trans_id trans_id
MLOps Lifecycle with Hopsworks
Enterprise Data
Model Registry
Feature Engineering
Model Serving
Model Training
Model Deploy
Model Monitoring
Log Predictions Statistics
CDC
Experiments
Feature Statistics
A/B Test
Model Metadata
Serving Statistics
Free-text Search, Provenance API
RonDB
Feature Store
Elasticsearch
RonDB
Metastore
Feature Vectors
Demo
Anti-Money Laundering
https://github.com/logicalclocks/AMLend2end
Training
Development
Model Repo
Model Serving
Output
Feature Store
Feature Engineering
Sources
Feature Store
Database
Application/ERP
Logs
3rd Party APIs
Object and File Storage
• • •
Dashboards
Batch Applications
Augmented Analytics
Applications
Microservices
• • •
Hopsworks - Design and Operate AI Applications
Python
Spark/SQL
Spark Streaming
Flink
Any Python Library
HopsFS (S3 / Azure Blob Storage)
RonDB
www.hopsworks.ai
@hopsworks
github.com/logicalclocks/hopsworks
@logicalclocks
www.logicalclocks.com
Feature serving both online and batch
Offline Feature Store
OnlineFS-ClusterJ
OnlineFS-ClusterJ
HSFS
FG 2
FG 1
OnlineFS-ClusterJ
Meta Data (Avro Schema)
Online Feature Store
Scalable stateless Online FS upsert ingestion service
Kafka topic per Online Feature Group
FG 1 FG 2 FG 3
Meta Data
Meta Data (Avro Schema)
Upsert based on Primary Key
Consume and decode
Encode and produce
Upsert
User/Application
fg.insert(df)
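The ingestion path on this slide (encode and produce, consume and decode, upsert based on primary key) can be sketched in plain Python. This is an illustration only: a list per feature group stands in for a Kafka topic, JSON stands in for Avro encoding, a dict stands in for the online store, and produce and consume_and_upsert are hypothetical names.

```python
import json

online_store = {}     # (feature group, primary key values) -> latest row
topics = {}           # one list per feature group, standing in for Kafka topics

def produce(feature_group, rows):
    """Encode and produce: serialise each row onto the feature group's topic."""
    topic = topics.setdefault(feature_group, [])
    topic.extend(json.dumps(row).encode("utf-8") for row in rows)

def consume_and_upsert(feature_group, primary_key):
    """Consume and decode, then upsert each row based on its primary key."""
    for message in topics.get(feature_group, []):
        row = json.loads(message.decode("utf-8"))
        key = (feature_group, tuple(row[k] for k in primary_key))
        online_store[key] = row          # insert, or overwrite an older row
    topics[feature_group] = []

produce("sales_fg", [{"store": 1, "dept": 7, "weekly_sales": 100.0},
                     {"store": 1, "dept": 7, "weekly_sales": 120.0}])
consume_and_upsert("sales_fg", primary_key=["store", "dept"])
```

Both rows share a primary key, so the online store ends up holding only the most recent one, which is the upsert behaviour the slide describes.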
RonDB powers the Hopsworks Platform
RonDB makes Hopsworks the only LATS Feature Store
< 1ms KV lookup
>10M KV Lookups/sec
>99.999% availability

Ml ops and the feature store with hopsworks, DC Data Science Meetup

  • 1. MLOps and the Feature Store with Hopsworks Jim Dowling CEO, Hopsworks DC Data Science Meetup, Sep 14th 2021
  • 2. We all take different Journeys to arrive at the Feature Store Data Engineer “Gotta feed those data ‘scientists’ with data” Data Scientist “Hello!?! Hello!?! Is there any data out there?” ML Engineer And then she said “productionize this notebook” Feature Store
  • 3. We all take different Journeys to arrive at MLOps Data Engineer Orchestrated Pipelines, baby! Data Scientist Notebooks as Jobs, yay! ML Engineer Containerize, kubernetize, observerize! Feature Store triggers them
  • 5. SQL or Python or Spark for Feature Engineering? SQL Features (Table) DB DB Python Features (Dataframe) Msg Bus Files Extract, Aggregate, Transform Spark DBT Extract, Aggregate, Transform
  • 6. What Feature Engineering do we typically perform where? Aggregations, Data Validation Training Data Serving Raw Data Feature Store Model Repo Transformations Input Data Need to ensure no skew between training and serving transformations
  • 7. Feature Group Feature 1 Feature M Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ...
  • 8. import hsfs connection = hsfs.connection(...) fs = connection.get_feature_store() fg_meta = fs.create_feature_group(name="sales_fg", version=1, primary_key=['store', 'date', 'dept'], event_time="ts", description="customer features", online_enabled=True) HSFS API - Create Feature Groups
  • 9. sales_fg = fs.get_feature_group("sales_fg", version=1) df = # featurize some data to ingest into the feature store sales_fg.insert(df) Batch insert/backfilling features into the Feature Store
  • 10. Spark Streaming insertion of features into the Feature Store sales_fg = fs.get_feature_group("sales_fg", version=1) streaming_df = # get streaming dataframe to ingest into the feature store sales_fg.insert_stream(streaming_df)
  • 11. Data Validation for Feature Groups (using Deequ) expectation_sales = fs.create_expectation(.., rules=[Rule(name="HAS_MIN", level="WARNING", min=0), Rule(name="HAS_MAX", level="ERROR", max=1000000)]) sales_fg = fs.get_feature_group("sales_fg", version=1) sales_fg.attach_expectation(expectation_sales) df = # get some dataframe to ingest into the feature store # Run Data Validation Rules when data is written sales_fg.insert(df)
  • 12. On-Demand Feature Groups (External Tables) snowflake_conn = fs.get_storage_connector("telco_snowflake_cluster") telco_on_dmd = fs.create_on_demand_feature_group(name="telco_snowflake", version=latest_version, query="select * from telco", description="On-demand FG", storage_connector=snowflake_conn, statistics_config=True) telco_on_dmd.save() You can also use connectors to any JDBC source or S3 source or ADLS on Azure
  • 13. JOIN, Transform, Filter Features to create Training Datasets Feature 1 LABEL (CHURN_weekly) Feature J Primary Key 0 ... ... 1 1 ... ... 0 2 ... ... 0 ... ... ... ... N ... ... 1 Feature 1 Feature M Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ... Feature 1 Feature J Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ... Feature Group A Feature Group B Training Dataset Transform, Filter
  • 14. HSFS API - Transformation Functions # Store in a Python module. More than 1 transformation fn per file is allowed. from datetime import datetime def date_string_to_timestamp(date): date_format = "%Y%m%d%H%M%S" return int(float(datetime.strptime(date, date_format).timestamp()) * 1000)
  • 15. HSFS API - Create Training Datasets with Transformations date_string_2_ts = fs.create_transformation_function( transformation_function=python_file.date_string_to_timestamp, output_type="long", version=1) # JOIN the features together query = sales_fg.select_all().join(exogeneous_fg.select(['fuel_price', 'cpi'])) td = fs.create_training_dataset(name="sales_dc_td", description="Dataset to train the Sales model for DC", data_format="tfrecord", transformation_functions={"sale_ts":date_string_2_ts}, version=1, label=["label_col"]) .filter(state="DC") td.save(query)
  • 16. Feature Store Batch Inference Report Model Serving Feature Store Latency and availability are critical for user experience High throughput important, latency not critical Analytical Models Operational Models Feature Vectors
  • 17. Models retrieve pre-computed features (Feature Vectors) from the Feature Store Feature 1 CHURN_weekly Feature N Primary Key ID ... ... N/A From App From Feature Store No Label - Predict it Lookup Features from Feature Store using "ID" Note: these are the same features as in the Training Dataset, minus the label
  • 18. HSFS API - Serving td = fs.get_training_dataset("sales_dc_td", version=1) td.init_prepared_statement() # online transformation functions are transparently applied before returning prediction_array = td.get_serving_vector({"date": "2021-06-01 21:04:00"}) # call model with 'prediction_array' as input
  • 19. transaction_type transaction_amount user_id user_nationality user_gender transactions_fg users_fg Feature Groups Training Datasets pk join fraud_td Descriptive Statistics, Feature Correlations, Histograms ... Use for Drift Detection fraud_classifier Models Training Data Features Models Raw Data From Raw Data to Production Models in Hopsworks
  • 20. Provenance Graph of Dependencies Feature Groups Models Training Datasets Changes in upstream entities trigger actions that can cause downstream computations to run Upstream Downstream
  • 21. MLOps is Feature Pipelines, Training Pipelines, and Model Monitoring transaction_type transaction_amount user_id user_nationality user_gender transactions_fg users_fg Feature Groups Training Datasets pk join fraud_td Descriptive Statistics, Feature Correlations, Histograms ... Use for Drift Detection fraud_classifier Models Feature Pipeline Training Pipeline Model Monitoring
  • 23. CI/CD Triggers and Orchestration of Pipelines in MLOps Enterprise Data Model Registry Feature Pipeline Model Serving Training Pipeline Feature Store Orchestrator: Airflow, Github Actions, Jenkins CI/CD Triggers: Code commit, New data, time trigger (e.g., daily) Model Monitoring
  • 24. Orchestrate Feature and Training Pipelines with Airflow in Hopsworks Feature Engineering Notebook/Job Validate on Data Slices & Deploy Model Run Experiment to Train Model Select Features, File Format and Create Training Data FEATURE STORE
  • 25. Data Science Data Engineering Compliance & Regulatory Feature Store Teams use the tools of their choice, integrated with the Hopsworks Feature Store Model Serving Hopsworks is an Open, Modular Feature Store that can Plug into ML Pipelines
  • 26. Feature Pipeline Feature Store Batch or Streaming Feature Pipeline Enterprise Datastores Aggregations Data Validation
  • 27. Training Pipeline Model architecture Select target, features Find best HParams Train model (distributed) Validate Model Deploy Model Feature Store Maggy - Experiments, Distributed ML, and write-once training logic https://www.youtube.com/watch?v=1SHOwl37I5c
  • 28. KubeFlow Model Serving (KFServing), the Feature Store, and Logging to Kafka Local Remote AI-Enabled Application KFServing Feature Store 1. 2. 3. 4. 1. Prediction Request 2. Request Features 3. Return Enriched Feature Vector 4. Predict, Log, & Return Result class Transformer: def __init__(self): self.fs = #connect to feature store self.td = self.fs.get_training_dataset("sales_dc_td") def preprocess(self, inputs): return self.td.get_serving_vector(inputs["some-key"]) 2. Request Features from inside the KFServing Transformer Kafka 4.
  • 29. Model Monitoring from KFServing Logs Usage example Windowed Outliers Pipe Windowed Drift Pipe Stats Outliers Pipe Stats Drift Pipe Outliers Pipe Drift Pipe Monitor pipe Window pipe Stats pipe Sink Pipe Alerts Reports Insights Prediction Requests Kafka
  • 30. New Training Data from Prediction Logs and the Evaluation Store Prediction Requests ● Interactive Queries to debug the Model ● Interactive Queries to debug Inference Data ● Inspect Model KPIs Charts ● Inspect Model Serving Performance Charts ● Identify Model/Data Drift ● Interactive Queries to Audit Logs Evaluation Store Feature Store ML Engineer Data Scientist ● Understand Live Model Performance ● Use new Training Data Kafka
  • 31. End-to-End Example - Anti-Money Laundering https://github.com/logicalclocks/AMLend2end
  • 32. CUSTOMER CASE STUDY SWEDBANK - ANTI-MONEY LAUNDERING (AML) WITH HOPSWORKS THE CHALLENGE Increase detection rate and reduce false positives and costs for AML. GANs with a ~40 TB transaction dataset Spark for Feature Engineering (including graph embeddings) TensorFlow/GPUs to train a GAN Features, Scale-out training, models, model serving Webinar, Thursday 16th, 9am PT: https://info.nvidia.com/accelerate-financial-fraud-detection-webinar.html?ncid=so-link-610204-vt09&linkId=100000063386013 With Hopsworks, Swedbank reduced false positives by 99% compared to their previous rule-based system.
  • 33. RULES-BASED AML vs DEEP LEARNING AML
  • 34. CUSTOMER CASE STUDY SWEDBANK - ANTI-MONEY LAUNDERING (AML) WITH HOPSWORKS Kafka Teradata Cloudera AML Application Retrieve Features (<10 ms) Real-Time Financial Features Customer Credit Score / KYC Historical Financial Transactions Is this Money Transfer Suspicious? Model Train (40 TB) Hopsworks Feature Store is the central location where all the data (features) are stored and manipulated to be used for the AML application. Hopsworks Feature Store
  • 35. Anti-Money Laundering End-to-End Example transactions alert_transactions party trans_embeddings alert_trans_embeddings training_data user_id is the join key for party and (alert_)transactions test_data trans_id is the join key for (alert_)transactions and (alert_)trans_embeddings user_id trans_id trans_id
  • 36. MLOps Lifecycle with Hopsworks Enterprise Data Model Registry Feature Engineering Model Serving Model Training Model Deploy Model Monitoring Log Predictions Statistics CDC Experiments Feature Statistics A/B Test Model Metadata Serving Statistics Free-text Search, Provenance API RonDB Feature Store Elasticsearch RonDB Metastore Feature Vectors
  • 38. Training Development Model Repo Model Serving Output Feature Store Feature Engineering Sources Feature Store Database Application/ERP Logs 3rd Party APIs Object and File Storage • • • Dashboards Batch Applications Augmented Analytics Applications Microservices • • • Hopsworks - Design and Operate AI Applications Python Spark/SQL Spark Streaming Flink Any Python Library HopsFS (S3 / Azure Blob Storage) RonDB
  • 41. Feature serving both online and batch Offline Feature Store OnlineFS-ClusterJ OnlineFS-ClusterJ HSFS FG 2 FG 1 OnlineFS-ClusterJ Meta Data (Avro Schema) Online Feature Store Scalable stateless Online FS upsert ingestion service Kafka topic per Online Feature Group FG 1 FG 2 FG 3 Meta Data Meta Data (Avro Schema) Upsert based on Primary Key Consume and decode Encode and produce Upsert User/Application fg.insert(df)
  • 42. RonDB powers the Hopsworks Platform RonDB makes Hopsworks the only LATS Feature Store < 1ms KV lookup >10M KV Lookups/sec >99.999% availability
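
The transformation function from slide 14 can be tried on its own, without a feature store connection. The sketch below is a standalone version using only the Python standard library. The `DATE_FORMAT` constant and the sample input string are illustrative assumptions, not part of the slides. Because the parsed datetime is naive, the result depends on the local time zone of the machine running it.

```python
from datetime import datetime

# Format taken from slide 14: YYYYMMDDHHMMSS, no separators.
DATE_FORMAT = "%Y%m%d%H%M%S"


def date_string_to_timestamp(date: str) -> int:
    """Convert a 'YYYYMMDDHHMMSS' string to epoch milliseconds.

    The string is parsed as a naive datetime, so .timestamp()
    interprets it in the local time zone.
    """
    return int(datetime.strptime(date, DATE_FORMAT).timestamp() * 1000)


# Hypothetical sample value, chosen to match the date on slide 18.
ts_ms = date_string_to_timestamp("20210601210400")
print(ts_ms)
```

Keeping the function in a plain Python module, as slide 14 suggests, means the same code can be registered for training dataset creation and reused when serving vectors are built, which is how the slides propose avoiding skew between training and serving transformations.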