KFServing and Feast
Animesh Singh
The InferenceService architecture consists of a static graph of components which coordinate
requests for a single model. Advanced features such as ensembling, A/B testing, and multi-armed
bandits should compose InferenceServices together.
Inference Service Control Plane
Inference Service with Transformer
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving
spec:
  default:
    transformer:
      custom:
        container:
          image: bert-transformer:v1
          env:
            - name: STORAGE_URI
              value: s3://examples/bert_transformer
    predictor:
      pytorch:
        storageUri: s3://examples/bert
        runtimeVersion: v0.3.0-gpu
        resources:
          limits:
            nvidia.com/gpu: 1
Predictor: PyTorch Model Server
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving-onnx
spec:
  default:
    transformer:
      custom:
        container:
          image: bert-transformer:v1
          env:
            - name: STORAGE_URI
              value: s3://examples/bert_transformer
    predictor:
      onnx:
        storageUri: s3://examples/bert
        runtimeVersion: 0.5.1
Transformer: Pre/Post Processing
Predictor: ONNX Runtime Server
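As a rough illustration of how a client might reach either service above: the KFServing V1 data plane exposes `POST /v1/models/<name>:predict` with a JSON body of the form `{"instances": [...]}`. The sketch below only builds such a request and does not send it. The host name and the instance payload are assumptions, since the slides do not say what input the BERT transformer expects.

```python
import json
from urllib.request import Request


def build_predict_request(host: str, model_name: str, instances: list) -> Request:
    """Build (but do not send) a V1-protocol predict request."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and payload, for illustration only.
req = build_predict_request(
    "bert-serving.default.example.com",
    "bert-serving",
    ["What is a feature store?"],
)
print(req.get_method(), req.full_url)
```

The transformer receives this request first, so the predictor only ever sees the preprocessed form.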
Feature
● What is a feature?
/feature/
A feature is a measurable property of the object you’re trying to analyze.
Features are the basic building blocks of models. The
quality of the features in your dataset has a major impact
on the quality of the insights you will gain when you use
that dataset for machine learning.
Importance of Features
Hidden Technical Debt in Machine Learning Systems
“Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine
learning’ is basically feature engineering.”
- Andrew Ng, Founder of deeplearning.ai
“...some machine learning projects succeed and some fail. What makes the difference? Easily the
most important factor is the features used.”
- Pedro Domingos, author of ‘The Master Algorithm’
“The algorithms we used are very standard for Kagglers. […] We spent most of our efforts in
feature engineering. [...]”
- Xavier Conort, Chief Data Scientist, DataRobot
Feature Engineering is essential, difficult, and costly
The Feature problem
● Different ML models typically use some common set of features
● Examples of common features:
○ Average Loan default rate by zip code: Used by models which predict who should be targeted for a marketing offer, models which predict who should be offered a loan, etc.
○ Average property prices in an area
○ Credit history of customers: Used by models which predict anything that is related to clients.
○ Average traffic in an area: Used by models which deal with finding best route
● Finding the right features for an ML model requires:
○ Thinking of which features will be relevant for building the model
○ Identifying the right data from the data catalog/data lake for building the feature
○ Feature engineering to get the feature in the right format from the source data
○ This is repeated across teams by data scientists!
Feature Management is a Huge Pain Point
● Spend more time on data prep
● Lack of data consistency between training and serving
● Duplicate work because they do not know a feature already exists
● Manage fragmented data infrastructure
● Deal with more requests as the data science team scales
● Hard to get features into production
Affected roles: Data Scientists, Data Engineers
● Poor feature management leads to:
○ Long development time
○ Poor data quality
○ Difficulty in production
The feature store is the central
place to store curated features
for machine learning pipelines.
Feature Store
Feast is a Feature Store Catalog that
attempts to solve the key data
challenges with production machine
learning
FEAST
Feature stores are a critical piece of ML infra
’17 Uber Michelangelo (Proprietary, original feature store)
’18 Feast (Open source)
’18 Logical Clocks (Open source, ML platform)
’19 Airbnb’s Zipline (Closed source)
’19 Spotify’s Feature Store on Kubeflow (Closed source)
’20 Pinterest (Closed source)
’20 Twitter Feature Store (Closed source, library based)
’20 Tecton Feature Store (Closed source)
What is a Feature Catalog?
● Feature catalog can be thought of as “Master Data” which is used for building and serving Machine learning models
● It stores different features which can be used across different teams for building ML models
● It is not just a feature repository, but also includes two serving mechanisms for:
○ Batch access
○ Real time access
● Feature update: feature values get updated over time
○ Some are updated in real time, e.g., average traffic in an area
○ Some are updated infrequently, e.g., the credit rating of a customer
○ Features need to be synced between the repositories used for batch access and real-time access.
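The two serving mechanisms and the sync requirement can be sketched with a toy in-memory catalog. This is not how Feast or any real feature store is implemented. The class and method names are hypothetical, and the point is only that one write path feeds both a full-history batch store and a latest-value real-time store.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class FeatureCatalog:
    """Toy catalog: one write path feeding a batch store and a real-time store."""

    def __init__(self) -> None:
        # Batch store: full history per (feature, entity), used to build training sets.
        self._offline: Dict[Tuple[str, str], List[Tuple[int, Any]]] = defaultdict(list)
        # Real-time store: only the latest value, used for low-latency lookups.
        self._online: Dict[Tuple[str, str], Tuple[int, Any]] = {}

    def update(self, feature: str, entity: str, timestamp: int, value: Any) -> None:
        key = (feature, entity)
        self._offline[key].append((timestamp, value))
        # Keep both stores in sync: the online value is the newest one seen so far.
        current = self._online.get(key)
        if current is None or timestamp >= current[0]:
            self._online[key] = (timestamp, value)

    def batch_read(self, feature: str, entity: str) -> List[Tuple[int, Any]]:
        return sorted(self._offline[(feature, entity)], key=lambda row: row[0])

    def realtime_read(self, feature: str, entity: str) -> Any:
        return self._online[(feature, entity)][1]


catalog = FeatureCatalog()
catalog.update("avg_traffic", "area_42", timestamp=1, value=120)    # updated often
catalog.update("avg_traffic", "area_42", timestamp=2, value=95)
catalog.update("credit_rating", "customer_7", timestamp=1, value="A")  # updated rarely

print(catalog.realtime_read("avg_traffic", "area_42"))  # latest value only
print(catalog.batch_read("avg_traffic", "area_42"))     # full history
```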
What does Feast provide?
Registry: A common catalog with which to explore, develop, collaborate on, and publish new feature definitions within
and across teams.
Ingestion: A means for continually ingesting batch and streaming data and storing consistent copies in both an offline
and online store
Serving: A feature-retrieval interface which provides a temporally consistent view of features for both training and online
serving.
Monitoring: Tools that allow operational teams to monitor and act on the quality and accuracy of data reaching models.
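One common reading of a "temporally consistent view" is a point-in-time lookup: a training row with event time t should only see the latest feature value recorded at or before t, never a later one. The sketch below illustrates that idea in plain Python. It is an assumption about the intended meaning, not Feast's implementation, and the temperature data is made up.

```python
from bisect import bisect_right
from typing import Any, List, Optional, Tuple


def value_as_of(history: List[Tuple[int, Any]], event_time: int) -> Optional[Any]:
    """Return the latest feature value with timestamp <= event_time, or None.

    `history` must be sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical feature history: (timestamp, temperature).
temperature_history = [(10, 21.5), (20, 23.0), (30, 19.0)]
# Event times of the training labels we want to join features onto.
label_events = [5, 20, 27]

training_rows = [(t, value_as_of(temperature_history, t)) for t in label_events]
print(training_rows)  # [(5, None), (20, 23.0), (27, 23.0)]
```

The event at time 27 gets the value written at 20, not the one written at 30, which is what prevents future data from leaking into training.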
Feast on K8s
[Architecture diagram. Components: Feature Repo, feast apply, registry (GCS/S3), Ingestion API, Kafka, Spark on K8s, Offline Store (BQ/S3/GCS/Other), Redis, Serving API. Legend: Exists / Planning phase.]
feast apply configures infrastructure based on feature definitions and the “provider”.
TBD what the scope of apply would be for a K8s provider. It may be that it only spins up jobs and updates stores.
Feast - KFS
Feast online Serving
• gRPC server
• Serves 2 Methods
• GetFeastServingInfo
• GetOnlineFeaturesV2
• How to call the gRPC server-side methods?
• Short answer: Feast SDK
• The Feast Python SDK (also available in Java and Go) wraps the gRPC client libs.
• The gRPC client libs call gRPC server-side methods like local methods.
• The gRPC client libs are generated from the protobuf definition.
• How are the Python gRPC client libs (modules) generated?
• They are generated from *.proto. Example:
    python -m grpc_tools.protoc -I. \
        --grpc_python_out=../sdk/python/feast/protos/ \
        --python_out=../sdk/python/feast/protos/ \
        --mypy_out=../sdk/python/feast/protos/ \
        feast/serving/ServingService.proto
• Generates
ServingService_pb2_grpc.py: client stub; a wrapper around the message classes in ServingService_pb2.py
ServingService_pb2.py: generated protobuf message classes used by ServingService_pb2_grpc.py
ServingService_pb2.pyi: type stub (interface) file for ServingService_pb2.py
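The naming convention behind those generated files can be sketched as a small helper. It runs no code generation. It only mirrors the convention that `protoc` writes `<name>_pb2.py`, the gRPC plugin writes `<name>_pb2_grpc.py`, and the mypy-protobuf plugin writes `<name>_pb2.pyi`, each under the output directory at the proto's path relative to the `-I` root. The function name is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict


def generated_modules(proto_path: str, out_dir: str) -> Dict[str, str]:
    """Map a .proto path (relative to the -I root) to the files protoc would emit."""
    proto = PurePosixPath(proto_path)
    target = PurePosixPath(out_dir) / proto.parent
    stem = proto.stem
    return {
        "messages": str(target / f"{stem}_pb2.py"),        # from --python_out
        "grpc_stub": str(target / f"{stem}_pb2_grpc.py"),  # from --grpc_python_out
        "type_stub": str(target / f"{stem}_pb2.pyi"),      # from --mypy_out
    }


files = generated_modules(
    "feast/serving/ServingService.proto",
    "../sdk/python/feast/protos/",
)
for role, path in files.items():
    print(f"{role}: {path}")
```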
Extend the Transformer
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: bert-serving
spec:
  default:
    transformer:
      feast:
        feastUrl: "http://feast-serving.default.svc"
        dataType: TensorProto
        entityIds:
          - source
        featureIds:
          - weather:1:temp
          - weather:1:clouds
          - weather:1:humidity
        numFeatureValues: 5
    predictor:
      pytorch:
        storageUri: s3://examples/bert
        runtimeVersion: v0.3.0-gpu
        resources:
          limits:
            nvidia.com/gpu: 1
Transformer: Feast Transformer
Predictor: PyTorch Model Server
KFServing with Feast
Feast transformer as a new type of transformer for preprocessing
○ Has a custom container image with a generic implementation to interact with Feast online serving
○ Properties: entity IDs, feature refs, project, Feast serving URL…
○ Specify IDs in the inference service YAML
■ Entity IDs → FeatureStore.get_online_features(entity_rows…)
■ Feature refs → FeatureStore.get_online_features(feature_refs…)
○ The initial request will be augmented with features from the Feast online store and sent to the predictor as the final input
○ Postprocess is a pass-through, not implemented in this transformer
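The preprocess step described above can be sketched without depending on KFServing or Feast. The class below is a hypothetical stand-in for a transformer: it takes a feature-fetching callable (which in a real deployment would wrap a call such as `FeatureStore.get_online_features`), augments each request instance with the returned features, and leaves postprocess as a pass-through. The class name, the fetcher signature, and the `driver_id` / `conv_rate` / `acc_rate` names are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: (entity_rows, feature_refs) -> one feature dict per entity row.
FeatureFetcher = Callable[[List[Dict[str, Any]], List[str]], List[Dict[str, Any]]]


class FeastTransformerSketch:
    def __init__(self, entity_ids: List[str], feature_refs: List[str], fetch: FeatureFetcher) -> None:
        self.entity_ids = entity_ids
        self.feature_refs = feature_refs
        self.fetch = fetch

    def preprocess(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        instances = inputs["instances"]
        # Build one entity row per instance from the configured entity ID fields.
        entity_rows = [{eid: inst[eid] for eid in self.entity_ids} for inst in instances]
        features = self.fetch(entity_rows, self.feature_refs)
        # Augment each instance with its features; the result is the predictor's input.
        return {"instances": [{**inst, **feats} for inst, feats in zip(instances, features)]}

    def postprocess(self, outputs: Dict[str, Any]) -> Dict[str, Any]:
        return outputs  # pass-through, as on the slide


def fake_fetch(entity_rows: List[Dict[str, Any]], feature_refs: List[str]) -> List[Dict[str, Any]]:
    """Stand-in for the online store; returns made-up values."""
    store = {
        1001: {"conv_rate": 0.8, "acc_rate": 0.5},
        1002: {"conv_rate": 0.3, "acc_rate": 0.9},
    }
    return [{ref: store[row["driver_id"]][ref] for ref in feature_refs} for row in entity_rows]


transformer = FeastTransformerSketch(["driver_id"], ["conv_rate", "acc_rate"], fake_fetch)
request = {"instances": [{"driver_id": 1001}, {"driver_id": 1002}]}
augmented = transformer.preprocess(request)
print(augmented)
```

Injecting the fetcher keeps the sketch testable and makes it clear that only the lookup call, not the request handling, depends on Feast.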
[Diagram: request flow. Model Serving Request (predict or explain) → Transformer (Preprocess) → Predictor (Predict) or Explainer (Explain) → Transformer (Postprocess) → Model Serving Response (predict or explain). Python dicts are passed between the steps. The transformer calls Feast Online Serving (Python, gRPC), which is backed by the Online Store (Redis) and the Registry.]
KFServing with Feast – Phased Approach
Phase 1: Provide a sample Feast transformer
○ Illustrate how online features in Feast feature stores can be retrieved and used for model serving
○ As a sample in KFServing docs folder
○ Use the driver ranking data and model from the Feast tutorial, https://github.com/feast-dev/feast-driver-ranking-tutorial
○ Use a custom container image
○ Interact with Feast online serving via the Python API
Phase 2: Provide a generic Feast transformer
○ Support a variety of Feast feature stores in preprocessing and model serving
○ As a general transformer in KFServing python folder
○ Include test, instructions, and examples
○ Provide a common Feast base image
○ Interact with Feast online serving via gRPC API
Where can it go?
Better precision
Without a feature store:
○ Data Asset 1 → Model 1 (poor quality)
○ Data Asset 2 → Model 2 (poor quality)
○ Data Asset 3 → Model 3 (poor quality)
Difficult to identify the features that lead to poor quality models
With a feature store:
○ Feature 1 (poor quality feature)
○ Feature 2 (moderate quality)
○ Models: Model 1 (poor quality), Model 2 (poor quality), Model 3 (poor quality), Model 4 (good quality)
• Feature 1: used in 3 models, all of which have poor quality
• Feature 2: used in 2 models, which have good + poor quality
• Feature 3: used in 1 model with good quality
Easy to identify feature quality
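The idea can be sketched as a simple aggregation: once the feature store records which models use which features, a feature can be scored by the share of good-quality models among its users. The scoring rule and the exact model-to-feature mapping below are assumptions chosen to match the counts on the slide (Feature 1 in three poor models, Feature 2 in one good and one poor, Feature 3 in one good model).

```python
from typing import Dict, List


def feature_quality(
    model_features: Dict[str, List[str]],
    model_is_good: Dict[str, bool],
) -> Dict[str, float]:
    """Share of good-quality models among the models that use each feature."""
    used_by: Dict[str, List[str]] = {}
    for model, features in model_features.items():
        for feature in features:
            used_by.setdefault(feature, []).append(model)
    return {
        feature: sum(model_is_good[m] for m in models) / len(models)
        for feature, models in used_by.items()
    }


# Hypothetical lineage consistent with the slide's counts.
model_features = {
    "Model 1": ["Feature 1"],
    "Model 2": ["Feature 1"],
    "Model 3": ["Feature 1", "Feature 2"],
    "Model 4": ["Feature 2", "Feature 3"],
}
model_is_good = {"Model 1": False, "Model 2": False, "Model 3": False, "Model 4": True}

scores = feature_quality(model_features, model_is_good)
print(scores)  # {'Feature 1': 0.0, 'Feature 2': 0.5, 'Feature 3': 1.0}
```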

Más contenido relacionado

La actualidad más candente

Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
Trino at linkedIn - 2021
Trino at linkedIn - 2021Trino at linkedIn - 2021
Trino at linkedIn - 2021Akshay Rai
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxHong Ong
 
ETL and Event Sourcing
ETL and Event SourcingETL and Event Sourcing
ETL and Event SourcingMarc Siegel
 
Pythonsevilla2019 - Introduction to MLFlow
Pythonsevilla2019 - Introduction to MLFlowPythonsevilla2019 - Introduction to MLFlow
Pythonsevilla2019 - Introduction to MLFlowFernando Ortega Gallego
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricksLiangjun Jiang
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumChengKuan Gan
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...DataWorks Summit
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeMartin Schütte
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPDatabricks
 
DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!Adrien Blind
 
Near Real-Time Data Warehousing with Apache Spark and Delta Lake
Near Real-Time Data Warehousing with Apache Spark and Delta LakeNear Real-Time Data Warehousing with Apache Spark and Delta Lake
Near Real-Time Data Warehousing with Apache Spark and Delta LakeDatabricks
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtJon Su
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...Databricks
 
KFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingKFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingAnimesh Singh
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureDmitry Anoshin
 

La actualidad más candente (20)

Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
Trino at linkedIn - 2021
Trino at linkedIn - 2021Trino at linkedIn - 2021
Trino at linkedIn - 2021
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
 
ETL and Event Sourcing
ETL and Event SourcingETL and Event Sourcing
ETL and Event Sourcing
 
Pythonsevilla2019 - Introduction to MLFlow
Pythonsevilla2019 - Introduction to MLFlowPythonsevilla2019 - Introduction to MLFlow
Pythonsevilla2019 - Introduction to MLFlow
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricks
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
 
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
Unify Stream and Batch Processing using Dataflow, a Portable Programmable Mod...
 
Terraform -- Infrastructure as Code
Terraform -- Infrastructure as CodeTerraform -- Infrastructure as Code
Terraform -- Infrastructure as Code
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
 
DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!DataOps introduction : DataOps is not only DevOps applied to data!
DataOps introduction : DataOps is not only DevOps applied to data!
 
Kubeflow
KubeflowKubeflow
Kubeflow
 
Near Real-Time Data Warehousing with Apache Spark and Delta Lake
Near Real-Time Data Warehousing with Apache Spark and Delta LakeNear Real-Time Data Warehousing with Apache Spark and Delta Lake
Near Real-Time Data Warehousing with Apache Spark and Delta Lake
 
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbtSiligong.Data - May 2021 - Transforming your analytics workflow with dbt
Siligong.Data - May 2021 - Transforming your analytics workflow with dbt
 
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa...
 
KFServing - Serverless Model Inferencing
KFServing - Serverless Model InferencingKFServing - Serverless Model Inferencing
KFServing - Serverless Model Inferencing
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
 

Similar a KFServing and Feast

Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAnimesh Singh
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Iulian Pintoiu
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureFei Chen
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...Databricks
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopBob Killen
 
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...Databricks
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerAndrew Phillips
 
API workshop by AWS and 3scale
API workshop by AWS and 3scaleAPI workshop by AWS and 3scale
API workshop by AWS and 3scale3scale
 
Data collection in AWS at Schibsted
Data collection in AWS at SchibstedData collection in AWS at Schibsted
Data collection in AWS at SchibstedLars Marius Garshol
 
Seattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APISeattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APIshareddatamsft
 
Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kuberneteskloia
 
from ai.backend import python @ pycontw2018
from ai.backend import python @ pycontw2018from ai.backend import python @ pycontw2018
from ai.backend import python @ pycontw2018Chun-Yu Tseng
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupJim Dowling
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpJosé Román Martín Gil
 
AI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with KnativeAI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with KnativeAnimesh Singh
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsAmbassador Labs
 
Event-Based API Patterns and Practices
Event-Based API Patterns and PracticesEvent-Based API Patterns and Practices
Event-Based API Patterns and PracticesLaunchAny
 

Similar a KFServing and Feast (20)

Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and IstioAdvanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
 
Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019Productionizing Machine Learning - Bigdata meetup 5-06-2019
Productionizing Machine Learning - Bigdata meetup 5-06-2019
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ... MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
MLflow: Infrastructure for a Complete Machine Learning Life Cycle with Mani ...
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 
Introduction to Kubernetes Workshop
Introduction to Kubernetes WorkshopIntroduction to Kubernetes Workshop
Introduction to Kubernetes Workshop
 
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
 
API workshop by AWS and 3scale
API workshop by AWS and 3scaleAPI workshop by AWS and 3scale
API workshop by AWS and 3scale
 
Data collection in AWS at Schibsted
Data collection in AWS at SchibstedData collection in AWS at Schibsted
Data collection in AWS at Schibsted
 
Seattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APISeattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp API
 
Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kubernetes
 
from ai.backend import python @ pycontw2018
from ai.backend import python @ pycontw2018from ai.backend import python @ pycontw2018
from ai.backend import python @ pycontw2018
 
Ml ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science MeetupMl ops and the feature store with hopsworks, DC Data Science Meetup
Ml ops and the feature store with hopsworks, DC Data Science Meetup
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
AI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with KnativeAI & Machine Learning Pipelines with Knative
AI & Machine Learning Pipelines with Knative
 
Serving models using KFServing
Serving models using KFServingServing models using KFServing
Serving models using KFServing
 
Mulesoft lisbon_meetup_asyncapis
Mulesoft lisbon_meetup_asyncapisMulesoft lisbon_meetup_asyncapis
Mulesoft lisbon_meetup_asyncapis
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
 
Event-Based API Patterns and Practices
Event-Based API Patterns and PracticesEvent-Based API Patterns and Practices
Event-Based API Patterns and Practices
 

Más de Animesh Singh

Machine Learning Exchange (MLX)
Machine Learning Exchange (MLX)Machine Learning Exchange (MLX)
Machine Learning Exchange (MLX)Animesh Singh
 
KFServing Payload Logging for Trusted AI
KFServing Payload Logging for Trusted AIKFServing Payload Logging for Trusted AI
KFServing Payload Logging for Trusted AIAnimesh Singh
 
KFServing and Kubeflow Pipelines
KFServing and Kubeflow PipelinesKFServing and Kubeflow Pipelines
KFServing and Kubeflow PipelinesAnimesh Singh
 
Kubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPOKubeflow Distributed Training and HPO
Kubeflow Distributed Training and HPOAnimesh Singh
 
Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Animesh Singh
 
End to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and ManageEnd to end Machine Learning using Kubeflow - Build, Train, Deploy and Manage
End to end Machine Learning using Kubeflow - Build, Train, Deploy and ManageAnimesh Singh
 
Defend against adversarial AI using Adversarial Robustness Toolbox
Defend against adversarial AI using Adversarial Robustness Toolbox Defend against adversarial AI using Adversarial Robustness Toolbox
Defend against adversarial AI using Adversarial Robustness Toolbox Animesh Singh
 
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]
Hybrid Cloud, Kubeflow and Tensorflow Extended [TFX]Animesh Singh
 
Trusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceTrusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceAnimesh Singh
 
AIF360 - Trusted and Fair AI
AIF360 - Trusted and Fair AIAIF360 - Trusted and Fair AI
AIF360 - Trusted and Fair AIAnimesh Singh
 
Fabric for Deep Learning
Fabric for Deep LearningFabric for Deep Learning
Fabric for Deep LearningAnimesh Singh
 
Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Microservices, Kubernetes and Istio - A Great Fit!
Microservices, Kubernetes and Istio - A Great Fit!Animesh Singh
 
How to build a Distributed Serverless Polyglot Microservices IoT Platform us...
How to build a Distributed Serverless Polyglot Microservices IoT Platform us...How to build a Distributed Serverless Polyglot Microservices IoT Platform us...
How to build a Distributed Serverless Polyglot Microservices IoT Platform us...Animesh Singh
 
How to build an event-driven, polyglot serverless microservices framework on ...
How to build an event-driven, polyglot serverless microservices framework on ...How to build an event-driven, polyglot serverless microservices framework on ...
How to build an event-driven, polyglot serverless microservices framework on ...Animesh Singh
 
As a Service: Cloud Foundry on OpenStack - Lessons Learnt
As a Service: Cloud Foundry on OpenStack - Lessons LearntAs a Service: Cloud Foundry on OpenStack - Lessons Learnt
As a Service: Cloud Foundry on OpenStack - Lessons LearntAnimesh Singh
 
Introducing Cloud Native, Event Driven, Serverless, Micrsoservices Framework ...
Introducing Cloud Native, Event Driven, Serverless, Micrsoservices Framework ...Introducing Cloud Native, Event Driven, Serverless, Micrsoservices Framework ...
Introducing Cloud Native, Event Driven, Serverless, Micrsoservices Framework ...Animesh Singh
 
Finding and-organizing Great Cloud Foundry User Groups
Finding and-organizing Great Cloud Foundry User GroupsFinding and-organizing Great Cloud Foundry User Groups
Finding and-organizing Great Cloud Foundry User GroupsAnimesh Singh
 
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...
CAPS: What's best for deploying and managing OpenStack? Chef vs. Ansible vs. ...Animesh Singh
 
Building a PaaS Platform like Bluemix on OpenStack
Building a PaaS Platform like Bluemix on OpenStackBuilding a PaaS Platform like Bluemix on OpenStack
Building a PaaS Platform like Bluemix on OpenStackAnimesh Singh
 
Cloud foundry Docker Openstack - Leading Open Source Triumvirate
Cloud foundry Docker Openstack - Leading Open Source TriumvirateCloud foundry Docker Openstack - Leading Open Source Triumvirate
Cloud foundry Docker Openstack - Leading Open Source TriumvirateAnimesh Singh
 




KFServing and Feast

  • 2. Inference Service Control Plane. The InferenceService architecture consists of a static graph of components which coordinate requests for a single model. Advanced features such as ensembling, A/B testing, and multi-armed bandits should compose InferenceServices together.
  • 3. Inference Service with Transformer. Two manifests, each pairing a custom pre/post-processing transformer with a predictor.
      PyTorch Model Server with pre/post processing:
          apiVersion: serving.kubeflow.org/v1alpha2
          kind: InferenceService
          metadata:
            name: bert-serving
          spec:
            default:
              transformer:
                custom:
                  container:
                    image: bert-transformer:v1
                    env:
                      - name: STORAGE_URI
                        value: s3://examples/bert_transformer
              predictor:
                pytorch:
                  storageUri: s3://examples/bert
                  runtimeVersion: v0.3.0-gpu
                  resources:
                    limits:
                      nvidia.com/gpu: 1
      ONNX Runtime Server with pre/post processing:
          apiVersion: serving.kubeflow.org/v1alpha2
          kind: InferenceService
          metadata:
            name: bert-serving-onnx
          spec:
            default:
              transformer:
                custom:
                  container:
                    image: bert-transformer:v1
                    env:
                      - name: STORAGE_URI
                        value: s3://examples/bert_transformer
              predictor:
                onnx:
                  storageUri: s3://examples/bert
                  runtimeVersion: 0.5.1
  • 4. Importance of Features. What is a feature? A feature is a measurable property of the object you’re trying to analyze. Features are the basic building blocks of models. The quality of the features in your dataset has a major impact on the quality of the insights you will gain when you use that dataset for machine learning.
  • 5. Feature Engineering is essential, difficult, and costly (Hidden Technical Debt in Machine Learning Systems)
      ○ "Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." - Andrew Ng, Founder of deeplearning.ai
      ○ "...some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used." - Pedro Domingos, author of ‘The Master Algorithm’
      ○ "[The] algorithms we used are very standard for Kagglers. […] We spent most of our efforts in feature engineering. [...]" - Xavier Conort, Chief Data Scientist, DataRobot
  • 6. The Feature problem
      ● Different ML models typically use some common set of features
      ● Examples of common features:
          ○ Average loan default rate by zip code: used by models which predict who should be targeted for a marketing offer, models which predict who should be offered a loan, etc.
          ○ Average property prices in an area
          ○ Credit history of customers: used by models which predict anything that is related to clients
          ○ Average traffic in an area: used by models which deal with finding the best route
      ● Finding the right features for an ML model requires:
          ○ Thinking of which features will be relevant for building the model
          ○ Identifying the right data from the data catalog/data lake for building the feature
          ○ Feature engineering to get the feature in the right format from the source data
          ○ This is repeated across teams by data scientists!
  • 7. Feature Management is a Huge Pain Point (for Data Scientists and Data Engineers)
      ○ Spend more time on data prep
      ○ Lack of data consistency between training and serving
      ○ Duplicate work because they do not know it exists
      ○ Manage fragmented data infrastructure
      ○ Deal with more requests as the data science team scales
      ○ Hard to get features into production
  • 8. Poor Feature Management Leads to:
      ○ Long Development Time
      ○ Poor Data Quality
      ○ Difficulty in Production
  • 9. Feature Store. The feature store is the central place to store curated features for machine learning pipelines.
  • 10. Feast. Feast is a Feature Store Catalog that attempts to solve the key data challenges with production machine learning.
  • 11. Feature stores are a critical piece of ML infra
      ○ ‘17 Uber Michelangelo (Proprietary, original feature store)
      ○ ‘18 Feast (Open source)
      ○ ‘18 Logical Clocks (Open source, ML platform)
      ○ ‘19 Airbnb’s Zipline (Closed source)
      ○ ‘19 Spotify’s Feature Store on Kubeflow (Closed source)
      ○ ‘20 Pinterest (Closed source)
      ○ ‘20 Twitter Feature Store (Closed source, library based)
      ○ ‘20 Tecton Feature Store (Closed source)
  • 12. What is a Feature Catalog?
      ● A feature catalog can be thought of as the “Master Data” used for building and serving machine learning models
      ● It stores features which can be shared across teams for building ML models
      ● It is not just a feature repository, but also includes two serving mechanisms:
          ○ Batch access
          ○ Real-time access
      ● Feature update: feature values get updated over time
          ○ Some are updated in real time, e.g., average traffic in an area
          ○ Some are updated infrequently, e.g., credit rating of a customer
          ○ Features need to be synced between the repositories used for batch access and real-time access
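The two access paths and the sync requirement on this slide can be illustrated with a small sketch. It assumes an in-memory store: an append-only history serves batch access, a latest-value map serves real-time access, and a single `update` call writes to both so they cannot drift apart. `ToyFeatureCatalog` and its method names are hypothetical and are not part of Feast.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (feature name, entity id)


class ToyFeatureCatalog:
    """Hypothetical in-memory catalog; illustrative only, not a Feast API."""

    def __init__(self) -> None:
        # Batch access: full history of (timestamp, value) per key.
        self._history: Dict[Key, List[Tuple[int, Any]]] = defaultdict(list)
        # Real-time access: only the most recent (timestamp, value) per key.
        self._latest: Dict[Key, Tuple[int, Any]] = {}

    def update(self, feature: str, entity: str, ts: int, value: Any) -> None:
        """Write one feature value to both stores in a single call."""
        key = (feature, entity)
        self._history[key].append((ts, value))
        current = self._latest.get(key)
        # Late-arriving records must not overwrite a fresher online value.
        if current is None or ts >= current[0]:
            self._latest[key] = (ts, value)

    def batch_read(self, feature: str, entity: str) -> List[Tuple[int, Any]]:
        """Return the full history ordered by timestamp."""
        return sorted(self._history.get((feature, entity), []), key=lambda r: r[0])

    def online_read(self, feature: str, entity: str) -> Optional[Any]:
        """Return the latest value, or None if the key was never written."""
        current = self._latest.get((feature, entity))
        return None if current is None else current[1]


catalog = ToyFeatureCatalog()
catalog.update("avg_traffic", "area_1", ts=1, value=40)
catalog.update("avg_traffic", "area_1", ts=3, value=55)
catalog.update("avg_traffic", "area_1", ts=2, value=47)  # late-arriving record
catalog.update("credit_rating", "customer_9", ts=1, value=710)

history = catalog.batch_read("avg_traffic", "area_1")
latest = catalog.online_read("avg_traffic", "area_1")
```

A production feature store would typically back the two paths with different systems (for example a warehouse and a key-value store); the single write path here is only the simplest way to show the sync requirement.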
  • 13. What does Feast provide?
      ○ Registry: a common catalog with which to explore, develop, collaborate on, and publish new feature definitions within and across teams.
      ○ Ingestion: a means for continually ingesting batch and streaming data and storing consistent copies in both an offline and online store.
      ○ Serving: a feature-retrieval interface which provides a temporally consistent view of features for both training and online serving.
      ○ Monitoring: tools that allow operational teams to monitor and act on the quality and accuracy of data reaching models.
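One common reading of a "temporally consistent view" is a point-in-time lookup: each training row only sees the feature value that was known at the time of its label, never a later one. The sketch below shows that idea with the standard library only. It is not Feast's implementation, and `value_as_of`, `training_rows`, and the sample data are hypothetical.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional, Tuple

History = List[Tuple[int, Any]]  # (event_time, value), sorted by event_time


def value_as_of(history: History, as_of: int) -> Optional[Any]:
    """Return the latest value with event_time <= as_of, or None if none exists."""
    times = [ts for ts, _ in history]
    idx = bisect_right(times, as_of)
    return history[idx - 1][1] if idx else None


def training_rows(labels: List[Tuple[int, int]],
                  histories: Dict[str, History]) -> List[Dict[str, Any]]:
    """Join each (label_time, label) with feature values known at label_time."""
    rows = []
    for label_time, label in labels:
        row = {name: value_as_of(h, label_time) for name, h in histories.items()}
        row["label"] = label
        rows.append(row)
    return rows


# Illustrative data: a credit rating that changes at times 1, 5 and 9.
histories = {"credit_rating": [(1, 600), (5, 650), (9, 700)]}
labels = [(0, 0), (5, 1), (7, 0), (10, 1)]
rows = training_rows(labels, histories)
```

The same `value_as_of` call with the current time gives the value an online request would see, which is what keeps training and serving consistent in this toy setup.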
  • 14. Feast on K8s (architecture diagram). Components shown: Feature Repo, feast apply, GCS/S3 registry, Ingestion API, Kafka, Spark on K8s (two jobs), Offline Store (BQ/S3/GCS/Other), Redis, Serving API. feast apply configures infrastructure based on feature definitions and the “provider”. Legend: Exists; Planning phase. Note: TBD what the scope of apply would be for a K8s provider. It may be that it only spins up jobs and updates stores.
  • 16. Feast online serving
      • gRPC server
      • Serves 2 methods
          • GetFeastServingInfo
          • GetOnlineFeaturesV2
      • How to call the gRPC server-side methods?
          • Short answer: the Feast SDK
          • The Feast Python SDK (also available in Java and Go) wraps the gRPC client libraries.
          • The gRPC client libraries call server-side methods as if they were local methods.
          • The gRPC client libraries are generated from the protobuf definitions.
      • How are the Python gRPC client libraries (modules) generated?
          • They are generated from *.proto files. Example:
              python -m grpc_tools.protoc -I. --grpc_python_out=../sdk/python/feast/protos/ --python_out=../sdk/python/feast/protos/ --mypy_out=../sdk/python/feast/protos/ feast/serving/ServingService.proto
          • Generated files:
              ServingService_pb2_grpc.py: the client stub, a wrapper around ServingService_pb2.py
              ServingService_pb2.py: the generated protobuf message classes that the stub relies on
              ServingService_pb2.pyi: the type stub (interface) file for ServingService_pb2.py
  • 17. Extend the Transformer. A Feast transformer placed in front of the PyTorch Model Server:
          apiVersion: serving.kubeflow.org/v1alpha2
          kind: InferenceService
          metadata:
            name: bert-serving
          spec:
            default:
              transformer:
                feast:
                  feastUrl: "http://feast-serving.default.svc"
                  dataType: TensorProto
                  entityIds:
                    - source
                  featureIds:
                    - weather:1:temp
                    - weather:1:clouds
                    - weather:1:humidity
                  numFeatureValues: 5
              predictor:
                pytorch:
                  storageUri: s3://examples/bert
                  runtimeVersion: v0.3.0-gpu
                  resources:
                    limits:
                      nvidia.com/gpu: 1
  • 18. KFServing with Feast. Feast transformer as a new type of transformer for preprocessing:
      ○ Has a custom container image with a generic implementation to interact with Feast online serving
      ○ Properties: entity IDs, feature refs, project, Feast serving URL…
      ○ Specify IDs in the InferenceService YAML
          ■ Entity IDs → FeatureStore.get_online_features(entity_rows…)
          ■ Feature refs → FeatureStore.get_online_features(feature_refs…)
      ○ The initial request is augmented with features from the Feast online store and sent to the predictor as the final input
      ○ Postprocess is a pass-through, not implemented in this transformer
      Diagram labels: Model Serving Request (predict or explain); Preprocess, Predict, Postprocess, Explain; Transformer, Predictor, Explainer; Python dict; Feast Online Serving; Online Store (Redis); Registry; Python, gRPC; Model Serving Response (predict or explain).
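The preprocess and postprocess behaviour described on this slide can be sketched without any KFServing or Feast dependency. In the sketch below, `ToyOnlineStore` stands in for Feast online serving and `ToyFeastTransformer` for the transformer; both classes, the `lookup` method, the `station_7` entity, and the feature values are hypothetical. A real transformer would subclass the KFServing model class and call the Feast SDK instead.

```python
from typing import Any, Dict, List


class ToyOnlineStore:
    """Hypothetical stand-in for Feast online serving; not a Feast API."""

    def __init__(self, rows: Dict[str, Dict[str, Any]]) -> None:
        self._rows = rows  # entity id -> {feature ref: value}

    def lookup(self, entity_id: str, feature_refs: List[str]) -> Dict[str, Any]:
        """Return the requested features for one entity (None when missing)."""
        row = self._rows.get(entity_id, {})
        return {ref: row.get(ref) for ref in feature_refs}


class ToyFeastTransformer:
    """Augments each request instance with online features before prediction."""

    def __init__(self, store: ToyOnlineStore, entity_key: str,
                 feature_refs: List[str]) -> None:
        self.store = store
        self.entity_key = entity_key
        self.feature_refs = feature_refs

    def preprocess(self, request: Dict[str, Any]) -> Dict[str, Any]:
        augmented = []
        for instance in request["instances"]:
            features = self.store.lookup(instance[self.entity_key], self.feature_refs)
            # Keep the original fields and add the retrieved feature values.
            augmented.append({**instance, **features})
        return {"instances": augmented}

    def postprocess(self, response: Dict[str, Any]) -> Dict[str, Any]:
        # Pass-through, as described on the slide.
        return response


store = ToyOnlineStore(
    {"station_7": {"weather:1:temp": 21.5, "weather:1:humidity": 0.4}}
)
transformer = ToyFeastTransformer(
    store, entity_key="source",
    feature_refs=["weather:1:temp", "weather:1:humidity"],
)
model_input = transformer.preprocess({"instances": [{"source": "station_7"}]})
```

The request and response use the `{"instances": [...]}` shape so that the augmented dictionary can be handed to a predictor unchanged.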
  • 19. KFServing with Feast: Phased Approach
      Phase 1: Provide a sample Feast transformer
          ○ Illustrate how online features in Feast feature stores can be retrieved and used for model serving
          ○ As a sample in the KFServing docs folder
          ○ Use the driver ranking data and model from the Feast tutorial, https://github.com/feast-dev/feast-driver-ranking-tutorial
          ○ Use a custom container image
          ○ Interact with Feast online serving via the Python API
      Phase 2: Provide a generic Feast transformer
          ○ Support a variety of Feast feature stores in preprocessing and model serving
          ○ As a general transformer in the KFServing python folder
          ○ Include tests, instructions, and examples
          ○ Provide a common Feast base image
          ○ Interact with Feast online serving via the gRPC API
  • 21. Better precision (diagram). Data Asset 1: Model 1 (poor quality). Data Asset 2: Model 2 (poor quality). Data Asset 3: Model 3 (poor quality). Difficult to identify the features that lead to poor quality models.
  • 22. Better precision (diagram, with a Feature Store). Models: Model 1 (poor quality), Model 2 (poor quality), Model 3 (poor quality), Model 4 (good quality). Feature 1: poor quality. Feature 2: moderate quality.
      • Feature 1: used in 3 models, all of which have poor quality
      • Feature 2: used in 2 models, which have good + poor quality
      • Feature 3: used in 1 model with good quality
      Easy to identify feature quality
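The reasoning on this slide, rating a feature by the quality of the models that consume it, can be sketched as a small lookup. The feature-to-model mapping below is an assumption: the slide gives only the counts, so which poor-quality model uses Feature 2 is chosen arbitrarily. The function name and the three labels are likewise illustrative.

```python
from typing import Dict, List

# Model quality as shown on the slide.
model_quality: Dict[str, str] = {
    "model_1": "poor",
    "model_2": "poor",
    "model_3": "poor",
    "model_4": "good",
}

# Assumed lineage: the slide states how many models use each feature,
# not which ones, so this particular assignment is hypothetical.
feature_usage: Dict[str, List[str]] = {
    "feature_1": ["model_1", "model_2", "model_3"],
    "feature_2": ["model_3", "model_4"],
    "feature_3": ["model_4"],
}


def summarize_feature_quality(usage: Dict[str, List[str]],
                              quality: Dict[str, str]) -> Dict[str, str]:
    """Label each feature by the quality of the models that use it."""
    summary = {}
    for feature, models in usage.items():
        if not models:
            summary[feature] = "unknown"  # no consumers, nothing to infer
            continue
        good = sum(1 for m in models if quality[m] == "good")
        if good == len(models):
            summary[feature] = "good"
        elif good == 0:
            summary[feature] = "poor"
        else:
            summary[feature] = "moderate"
    return summary


feature_quality = summarize_feature_quality(feature_usage, model_quality)
```

With a feature store recording which models use which features, this kind of lineage query becomes possible; without it, only the per-model quality is visible.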