2. ➤ 10x AWS Certifications, including SA Pro, DevOps Pro and the Machine Learning Specialty.
➤ Visionary in MLOps: has delivered production ML workloads at scale, including 1,500 inferences per minute with active monitoring and alerting
➤ Contributes to the AWS Community by speaking at several summits,
community days and meet-ups.
➤ Regular blogger, open-source contributor, and SME on Machine
Learning, MLOps, DevOps, Containers and Serverless.
➤ Experienced principal solutions architect and lead developer with over 6 years of AWS experience. He has been responsible for running production workloads at anywhere from 200 to over 18,000 requests per second
WHO AM I?
Phil Basford
phil@inawisdom.com
@philipbasford
Phil B#4237
3. Inference types
ML OPS – INFERENCE TYPES
Real Time
➤ Business critical; common uses are chat bots, classifiers, recommenders or linear regressors, e.g. credit risk, journey times, etc.
➤ Hundreds or thousands of individual predictions per second
➤ API Driven with Low Latency, typically
below 135ms at the 90th percentile.
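A latency target like that 90th-percentile figure can be checked with a small helper. This is a minimal sketch, assuming `predict` stands in for whatever endpoint call is being measured; the name `p90_latency_ms` is illustrative, not from the talk:

```python
import statistics
import time

def p90_latency_ms(predict, payloads):
    """Invoke `predict` once per payload and return the
    90th-percentile latency in milliseconds."""
    latencies = []
    for payload in payloads:
        start = time.perf_counter()
        predict(payload)
        latencies.append((time.perf_counter() - start) * 1000.0)
    # quantiles(..., n=10) yields 9 cut points; the last one is p90.
    return statistics.quantiles(latencies, n=10)[-1]
```

In practice the same percentile is usually read off CloudWatch metrics rather than measured client-side, but the helper is handy in load-test scripts.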
Near Real Time
➤ Commonly used for image classification or
file analysis
➤ Hundreds of individual predictions per minute, with processing completed within seconds
➤ Event or Message Queue based,
predictions are sent back or stored
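The queue-based pattern above can be sketched as a worker that drains messages, scores each one, and hands the result to a store. This is a hedged sketch: the `{'id', 'payload'}` message shape and the `drain_and_predict` name are illustrative, and in production the queue would be SQS or similar rather than an in-process queue:

```python
import queue

def drain_and_predict(q, predict, store):
    """Drain the message queue, score each message, and persist the
    prediction via `store`. Returns the number of messages handled."""
    handled = 0
    while True:
        try:
            msg = q.get_nowait()
        except queue.Empty:
            return handled
        store(msg["id"], predict(msg["payload"]))
        handled += 1
```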
Occasional
➤ Examples are simple classifiers like Tax
codes
➤ Only a few predictions a month, and processing needs to be completed within minutes
➤ API, Event or Message Queue based, predictions sent back or stored
Batch
➤ End of month reporting, invoice
generation, warranty plan management
➤ Runs at Daily / Monthly / Set Times
➤ The data set is typically millions or tens of
millions of rows at once
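For data sets of that size, the batch job is usually streamed through the model in chunks rather than loaded whole. A minimal sketch, assuming a CSV input and a `predict` callable that scores a list of rows (the `score_file` name and 10,000-row default are illustrative):

```python
import csv

def score_file(in_path, out_path, predict, chunk_size=10_000):
    """Stream a large CSV through the model in fixed-size chunks so a
    multi-million-row run never holds the whole file in memory."""
    with open(in_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader, writer = csv.reader(src), csv.writer(dst)
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == chunk_size:
                writer.writerows(predict(chunk))
                chunk = []
        if chunk:  # score the final partial chunk
            writer.writerows(predict(chunk))
```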
Micro Batch
➤ Anomaly detection, invoice
approval and Image processing
➤ Executed regularly: every X minutes or every Y events; triggered by file upload or data ingestion
➤ The data set is typically hundreds
or thousands of rows at once
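The "every X minutes or Y events" trigger above amounts to a buffer that flushes on either count or age. A minimal sketch, with `MicroBatcher` and its defaults invented for illustration:

```python
import time

class MicroBatcher:
    """Buffer incoming events and flush them to the model when the
    buffer reaches `max_events` records or `max_age_s` seconds."""

    def __init__(self, predict, max_events=500, max_age_s=60.0):
        self.predict = predict
        self.max_events = max_events
        self.max_age_s = max_age_s
        self.buffer = []
        self.opened = None  # time the current buffer was started

    def add(self, event):
        """Add one event; returns predictions on flush, else None."""
        if not self.buffer:
            self.opened = time.monotonic()
        self.buffer.append(event)
        if (len(self.buffer) >= self.max_events
                or time.monotonic() - self.opened >= self.max_age_s):
            return self.flush()
        return None

    def flush(self):
        batch, self.buffer = self.buffer, []
        return self.predict(batch)
```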
Edge
➤ Used for Computer Vision, Fault Detection
in Manufacturing
➤ Runs in mobile phone apps and on low power devices. Uses sensors (e.g. video, location, or heat)
➤ Model output is normally sent back to the
Cloud at regular intervals for analysis.
4. Fargate
OPTION 1
➤ Supports Batch and Realtime
➤ Low Latency (<100ms)
➤ Supports CPU only, not GPU (can step down to full ECS)
➤ Pay Per Hour
➤ Application Auto Scaling
➤ Runs Docker and full native support
➤ Not integrated with notebooks or the SageMaker SDK.
➤ No Model Monitor support (which records predictions)
➤ Requires you to build your own images or
a deep learning container
➤ Memory and GPU Limits (can step it down
to full ECS)
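Since Fargate makes you build your own serving image, the container entry point can be very small. A stdlib-only sketch, assuming a JSON-over-HTTP contract; the `predict` stub and port are placeholders, not the speaker's actual service:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in model: a real image would load a trained artifact here.
    return {"score": sum(features)}

class InferenceHandler(BaseHTTPRequestHandler):
    """Minimal JSON prediction handler for a custom container image."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        out = json.dumps(predict(json.loads(body)["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)

    def log_message(self, *args):
        pass  # keep container logs quiet in this sketch

def serve(port=8080):
    """Blocking entry point, used as the container's CMD."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

A production image would normally use a proper WSGI/ASGI server instead, but the request/response shape is the same.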
5. SageMaker : Endpoints and Batch Transforms
➤ Supports Batch and Realtime
➤ Built-in algorithms, framework and BYOM (bring your own model) support
➤ Low Latency (<100ms)
➤ Supports CPU and GPU
➤ Pay Per Hour (only recently added to Savings Plans)
➤ Application Auto Scaling
➤ Runs Docker and full native support
➤ One click Deployment: Integration with
SageMaker Studio and Notebook support
via SDK.
➤ Model Monitor support (records
predictions)
➤ No resource limits
OPTION 2
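Calling a deployed SageMaker endpoint from application code is a single boto3 call. A hedged sketch: the client is injected so it can be stubbed in tests, the endpoint name is hypothetical, and the `{"instances": [...]}` payload shape depends on your serving container:

```python
import json

def invoke(runtime, endpoint_name, features):
    """Score one record against a SageMaker real-time endpoint.
    `runtime` is a boto3 'sagemaker-runtime' client."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"instances": [features]}),
    )
    # The response Body is a stream; read and decode the JSON result.
    return json.loads(response["Body"].read())
```

With a real client this would be `invoke(boto3.client("sagemaker-runtime"), "my-endpoint", [1.0, 2.0])`.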
6. Lambda
OPTION 3
➤ Simple
➤ Supports only Realtime, or micro batch (15-minute execution limit)
➤ Low Latency (<100ms)
➤ Supports CPU only, not GPU
➤ Pay Per Request
➤ Scales on concurrency
➤ Savings Plans
➤ *Custom image: runs Docker with full native support
➤ Not integrated with notebooks or the SageMaker SDK.
➤ No Model Monitor support (which records predictions)
➤ Memory and GPU Limits
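A Lambda inference function typically loads the model at init time so warm invocations skip that cost. A minimal sketch, assuming an API Gateway proxy event; `MODEL` is a stand-in for deserialising an artifact bundled into the deployment package or container image:

```python
import json

# Loaded once per execution environment (cold start), reused while warm.
MODEL = lambda features: {"score": sum(features)}

def handler(event, context):
    """Lambda entry point for a real-time prediction request."""
    features = json.loads(event["body"])["features"]
    return {"statusCode": 200, "body": json.dumps(MODEL(features))}
```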