© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Driving Machine Learning and Analytics Use Cases with AWS Storage
STG302
Rob Krugman
Chief Digital Officer
Broadridge Financial Systems
Mahendra Bairagi
AI/ML Specialist Solutions Architect
AWS
Agenda
• Typical AI development life cycle
• Overview of AWS machine learning portfolio
• Data needs for AI/ML workloads
• Storage options for AI/ML workloads
• Best practices - storage options for AI
• Broadridge solution
Related breakouts
Thursday November 29th
Breaking the Ice: Transform Cold Archival Data into Fresh Insights
4 – 5PM | Aria East, Level 1, Joshua 3
The Machine Learning Process
1. Business Problem
2. ML Problem Framing
3. Data Collection
4. Data Integration
5. Data Preparation
6. Data Visualization & Analysis
7. Feature Engineering
8. Model Training & Parameter Tuning
9. Model Evaluation
10. Are business goals met?
    • No: return to Data Augmentation or Feature Augmentation
    • Yes: continue to Model Deployment
11. Model Deployment
12. Predictions
13. Monitoring & Debugging
Integration: The data architecture
[Diagram: The Machine Learning Process]
Build the data platform:
• Amazon S3
• Amazon Athena
• Amazon EMR
• Amazon Redshift Spectrum
• AWS Glue
Model training: Undifferentiated heavy lifting
[Diagram: The Machine Learning Process]
• Set up and manage notebook environments
• Set up and manage training clusters
• Write data connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithm to multiple machines
• Secure model artifacts
DevOps: Undifferentiated heavy lifting
[Diagram: The Machine Learning Process]
• Set up and manage model inference clusters
• Manage and scale model inference APIs
• Monitor and debug model predictions
• Model versioning and performance tracking
• Automate new model version promotion to production (A/B testing)
Simplify ML Processing
Collect → Store/data lake → Build & train model → Inference
Latency | Throughput | Cost
The Amazon Machine Learning Stack
• Application services: Amazon Rekognition, Amazon Transcribe, Amazon Translate, Amazon Polly, Amazon Comprehend, Amazon Lex
• Platform services: Amazon SageMaker
• Frameworks & interfaces: AWS Deep Learning AMIs (Caffe2, CNTK, Apache MXNet, PyTorch, TensorFlow, Chainer, Keras, Gluon)
• Education: AWS DeepLens
AI application needs
1. Compute
2. Storage
Type of Data
COLLECT (data transport & logging)
• STREAMS: devices, sensors, and IoT platforms (AWS IoT); IoT events and data streams
• FILES: migration and import/export (Snowball); logging (Amazon CloudWatch, AWS CloudTrail); files and objects such as log files and media files
• RECORDS: mobile apps, web apps, and data centers (AWS Direct Connect); applications; transactions, data structures, and database records
What Is the Temperature of Your ML Data?

              Hot data    Warm data   Cold data
Volume        MB–GB       GB–TB       PB–EB
Item size     B–KB        KB–MB       KB–TB
Latency       µs, ms      ms, sec     min, hrs
Durability    Low–high    High        Very high
Request rate  Very high   High        Low
Cost/GB       $$–$        $–¢¢        ¢
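The latency row of the temperature table can be turned into a rough classification rule. The sketch below is illustrative only: the function name and the numeric cut-offs (100 ms and 60 s) are assumptions chosen to mirror the "µs, ms", "ms, sec", and "min, hrs" bands, not values stated in the deck.

```python
def data_temperature(max_latency_seconds: float) -> str:
    """Classify data as 'hot', 'warm', or 'cold' by acceptable access latency.

    The thresholds are assumptions for illustration: sub-100 ms is treated
    as hot, up to one minute as warm, and anything slower as cold.
    """
    if max_latency_seconds < 0:
        raise ValueError("latency must be non-negative")
    if max_latency_seconds < 0.1:
        return "hot"
    if max_latency_seconds < 60:
        return "warm"
    return "cold"
```

In practice the other rows (volume, item size, request rate, cost per GB) would also feed into the decision; latency is used here only because it separates the three bands most cleanly.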
Storage options for ML and Analytics workload
[Diagram: storage and processing options arranged by data temperature (hot to cold) and processing speed (fast to slow)]
• Storage: Amazon Kinesis (streaming), Amazon DynamoDB/RDS, Amazon S3
• Processing technology: Spark Streaming, AWS Lambda, KCL apps, native apps, Amazon Redshift, Hive, Spark, Presto, Amazon Athena
Best practices - storage options for AI
Best Practice 1: Make sure the data lake is Well-Architected!
5 Pillars of Well-Architected systems
• Operational excellence
• Security
• Reliability
• Performance efficiency
• Cost optimization
Best Practice 2: Use the right tool for the job
Data Tier
• Relational: Referential integrity with strong consistency, transactions, and hardened scale. Complex query support via SQL.
• Key-value: Low-latency, key-based queries with high throughput and fast data ingestion. Simple query methods with filters.
• Document: Indexing and storing of documents with support for query on any property. Simple query with filters, projections, and aggregates.
• In-memory: Microsecond latency, key-based queries, specialized data structures. Simple query methods with filters.
• Graph: Creating and navigating relations between data easily and quickly. Easily express queries in terms of relations.
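The data-tier guidance above amounts to a lookup from workload need to database category. A minimal sketch follows; the dictionary keys are hypothetical labels paraphrased from the slide, and the function name is an assumption, not part of any AWS API.

```python
# Hypothetical requirement labels mapped to the data tiers named on the slide.
DATA_TIER_BY_NEED = {
    "referential_integrity": "relational",
    "complex_sql_queries": "relational",
    "high_throughput_key_lookup": "key-value",
    "query_on_any_property": "document",
    "microsecond_latency": "in-memory",
    "relationship_traversal": "graph",
}


def suggest_data_tiers(needs):
    """Return the distinct data tiers matching the given needs, in first-seen order."""
    tiers = []
    for need in needs:
        tier = DATA_TIER_BY_NEED.get(need)
        if tier is None:
            raise ValueError(f"unknown need: {need!r}")
        if tier not in tiers:
            tiers.append(tier)
    return tiers
```

A workload that returns more than one tier is a hint that it may be better served by several purpose-built stores than by a single database.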
Best Practice 3: Right tool for right AI stack
Amazon SageMaker storage options:
• Use Amazon S3 pipe mode for Amazon SageMaker where applicable
• Amazon EFS for Amazon SageMaker notebook external storage
Deep Learning AMIs storage options:
• Amazon Elastic Block Store (Amazon EBS), Amazon S3, and Amazon Elastic File System (Amazon EFS)
AI application services:
• Amazon Rekognition, Amazon Polly, Amazon Comprehend, etc.
• Amazon S3
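For the pipe mode recommendation, the relevant setting is the `TrainingInputMode` field of the SageMaker `CreateTrainingJob` request, which accepts `"File"` or `"Pipe"`. The sketch below only builds the request dictionary and does not call AWS; the helper name, instance type, volume size, and runtime limit are assumptions, and the chosen algorithm container must itself support pipe mode.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri):
    """Build a CreateTrainingJob request that streams training data from S3.

    All argument values are supplied by the caller; the resource settings
    below are placeholder assumptions for illustration.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            # "Pipe" streams objects from S3 instead of copying them to disk first.
            "TrainingInputMode": "Pipe",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

With credentials configured, the resulting dictionary could be passed as keyword arguments to `boto3.client("sagemaker").create_training_job(**request)`.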
Best Practice 4: Storage options for ML @Edge
From Edge (IoT devices):
• For data: MQTT, Amazon Kinesis Data Streams
• For video: Amazon Kinesis Video Streams
• AWS SDK for Python (e.g., Amazon S3 through Boto libraries)
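As one way to picture the "Amazon S3 through Boto libraries" option, the sketch below batches edge readings into newline-delimited JSON and writes them with an S3 client. The key layout, function names, and bucket are hypothetical; the client is passed in so the code runs without AWS access, and it only relies on the `put_object(Bucket=..., Key=..., Body=...)` call that a `boto3.client("s3")` object provides.

```python
import json
from datetime import datetime, timezone


def build_object_key(device_id: str, captured_at: datetime) -> str:
    """Build a date-partitioned object key (the layout is an assumption)."""
    return (
        f"edge/{device_id}/"
        f"{captured_at:%Y/%m/%d}/"
        f"{captured_at:%H%M%S}.jsonl"
    )


def upload_readings(s3_client, bucket, device_id, readings, captured_at=None):
    """Serialize readings as newline-delimited JSON and store them in S3.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns the object key that was written.
    """
    captured_at = captured_at or datetime.now(timezone.utc)
    key = build_object_key(device_id, captured_at)
    body = "\n".join(json.dumps(r, sort_keys=True) for r in readings).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```

Batching several readings per object keeps the request count low, which suits devices with intermittent connectivity; continuous, low-latency telemetry is the case the slide assigns to MQTT or Amazon Kinesis Data Streams instead.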
Why do organizations typically
store and archive data?
REGULATION
Unfortunately, today's solutions reinforce
regulation as the primary driver
ar·chi·val: Where content and data
go to be forgotten
Yet, for 70% of organizations, the monolithic model characteristic of historic information management has been replaced by a desire to consume content capabilities as needed*
*"Digitalizing" Core Business Processes, AIIM International, 2018
A challenge: Last January we challenged a group of AWS and Broadridge resources to answer a question:
Can we create services that change the perception of what it means to store and archive information, in an effort to make it a value driver to an enterprise?
What we recognized in today’s solutions
• Solutions are cost prohibitive
    • Typically in house, leveraging expensive hardware and software
• Information is captive
    • Minimal interfaces available
    • Typically customized to support point solutions
• Information is standardized
    • Data and content must adhere to a structure to be stored and leveraged
• Business use is minimal
    • Access via website
    • Support e-discovery capabilities
Where we focused (Personas)
1. Operations
• Eliminate data silos
• Support migration, interoperability, or conversion
• Complete view
• Learn
2. Legal & compliance
• GDPR & California privacy
• Regulatory overlap
• Flexible taxonomies
• Anomaly detection
3. Customer service representatives
• Automation
• Experience
• Chatbots
• Self service
Our Solution: Intelligent Information Management
A PaaS that turns archive data into live,
useable, actionable information
Reimagining customer service
through information
management, AI, and chatbots
IIM – A Platform Focused on Information Assets
1. Interrogate stored information
2. Match data point with current trade information
3. Calculate differences
4. Send as text to user
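The four steps above can be sketched as a small function. This is not Broadridge's implementation: the record fields (`account_id`, `symbol`, `price`, `as_of`), the in-memory archive, and the message wording are all hypothetical, chosen only to show the interrogate, match, calculate, and reply sequence.

```python
from decimal import Decimal


def find_archived_record(archive, account_id, symbol):
    """Step 1: interrogate stored information for the matching record."""
    for record in archive:
        if record["account_id"] == account_id and record["symbol"] == symbol:
            return record
    return None


def describe_change(archive, current_trades, account_id, symbol):
    """Steps 2-4: match against current trade data, compute the difference,
    and return a text reply for the user."""
    archived = find_archived_record(archive, account_id, symbol)
    current = current_trades.get((account_id, symbol))  # Step 2: match
    if archived is None or current is None:
        return f"No comparable data found for {symbol}."
    # Step 3: calculate the difference using exact decimal arithmetic.
    difference = Decimal(current["price"]) - Decimal(archived["price"])
    if difference == 0:
        return f"{symbol} is unchanged since {archived['as_of']}."
    direction = "up" if difference > 0 else "down"
    # Step 4: format the reply a chatbot would send as text.
    return f"{symbol} is {direction} {abs(difference)} since {archived['as_of']}."
```

Prices are passed as strings and parsed with `Decimal` so that the reported difference is not affected by binary floating-point rounding.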
Intelligent Information Management (IIM)
[Diagram: IIM architecture]
• Consumers: Client customers | Clients | Partners | Internal customers. Users and systems will interface with the Broadridge API, UX, or chatbot to access information.
• Presentment tier (UX, API, chatbot): Broadridge presentment tier including chatbots (voice and text), mobile-first UX, and microservices APIs.
• Abstraction tier (PaaS): Broadridge PaaS, a reusable, purpose-driven services tier. Information services (Basic, Premium).
• Commodities layer: third-party services, cloud services, data lakes, Amazon Aurora, Amazon DynamoDB. Architected as a commodities layer to avoid vendor lock-in.
The abstraction layer
Abstraction tier (PaaS)
• E-Discovery
• Traditional Search
• Web Presentment
• Compliant Archival
• Actionable Insights
• NLP Search
• Risk & Fraud
• Image Search
• Recommendations
• Regulatory Reporting
• Pattern Recognition
• Chat Bot
Legend: Broadridge Services | AWS Services
Thank you!
Rob Krugman
Chief Digital Officer
Broadridge Financial Systems
Mahendra Bairagi
AI/ML Specialist Solutions Architect
AWS

Driving Machine Learning and Analytics Use Cases with AWS Storage (STG302) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Driving Machine Learning and Analytics Use Cases with AWS Storage Rob Krugman Chief Digital Officer Broadridge Financial Systems S T G 3 0 2 Mahendra Bairagi AI/ML Specialist Solutions Architect AWS
  • 3. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agenda • Typical AI development life cycle • Overview of AWS machine learning portfolio • Data needs for AI ML workload • Storage options for AI ML workload • Best practices - storage options for AI • Broadridge solution
  • 4. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Related breakouts Thursday November 29th Breaking the Ice: Transform Cold Archival Data into Fresh Insights 4 – 5PM | Aria East, Level 1, Joshua 3
  • 5. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 6. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Machine Learning Process Monitoring & Debugging Predictions Yes Model Deployment Data Augmentation No Are business goals met? Model Evaluation Model Training & Parameter Tuning Feature Engineering Data Visualization & Analysis Data Preparation Data Integration Data Collection ML Problem Framing Business Problem Feature Augmentation
  • 7. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Integration: The data architecture Model Evaluation Model Deployment Business Problem Are business goals met? ML Problem Framing YesNo Feature Augmentation Data Augmentation Predictions Build the data platform: Amazon S3 Amazon Athena Amazon EMR Amazon Redshift Spectrum AWS Glue Monitoring & Debugging Data Collection Data Integration Data Preparation Feature Engineering Model Training & Parameter Tuning Data Visualization & Analysis The Machine Learning Process
  • 8. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. The model training: Undifferentiated heavy lifting Setup and manage Notebook environments Training clusters Write data connectors Scale ML algorithms to large datasets Distribute ML training algorithm to multiple machines Secure model artifacts Model Deployment Business Problem Are business goals met? ML Problem Framing YesNo Feature Augmentation Data Augmentation Predictions Feature Engineering Model Training & Parameter Tuning Model Evaluation Data Visualization & Analysis Data Collection Data Integration Data Preparation Monitoring & Debugging The Machine Learning Process
  • 9. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. DevOps: Undifferentiated heavy lifting Setup and manage Model Inference Clusters Manage and scale model inference APIs Monitor and debug model predictions Models versioning and performance tracking Automate new model version promotion to production (A/B testing) Data Collection Data Integration Data Preparation Data Visualization & Analysis Feature Engineering Model Training & Parameter Tuning Model Evaluation Business Problem Are business goals met? ML Problem Framing YesNo Feature Augmentation Data Augmentation Predictions Model Deployment Monitoring & Debugging The Machine Learning Process
  • 10. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Collect Store/data lake Build & train model Inference Latency Throughput Cost Simplify ML Processing
  • 11. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Platform services Application services Frameworks & interfaces Caffe2 CNTK Apache MXNet PyTorch TensorFlow Chainer Keras Gluon AWS Deep Learning AMIs Amazon SageMaker Amazon Rekognition Amazon Transcribe Amazon Translate Amazon Polly Amazon Comprehend Amazon Lex AWS DeepLens Education The Amazon Machine Learning Stack
  • 12. AI application needs 1. Compute 2. Storage
  • 13. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. COLLECT Devices Sensors IoT platforms AWS IoT STREAMS IoT EventsData streams Migration Snowball Logging Amazon CloudWatch AWS CloudTrail FILES DataTransport&Logging Import/expo rt Files / Objects Log files Media files Mobile apps Web apps Data centers AWS Direct Connect RECORDS Applications Transactions Data structures Database records Type of Data
  • 14. What Is the Temperature of Your ML Data?
                      Hot data     Warm data    Cold data
      Volume          MB–GB        GB–TB        PB–EB
      Item size       B–KB         KB–MB        KB–TB
      Latency         µs, ms       ms, sec      min, hrs
      Durability      Low–high     High         Very high
      Request rate    Very high    High         Low
      Cost/GB         $$-$         $-¢¢         ¢
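As a rough illustration of the table above, the sketch below picks a temperature tier from the latency a workload can tolerate. The function name and the cut-off values (100 ms and 60 s) are assumptions chosen to sit between the latency ranges in the table; they are not figures from the talk.

```python
# Minimal sketch: classify data as hot, warm, or cold from the latency
# a workload can tolerate. The thresholds are illustrative assumptions
# placed between the latency ranges on the slide (µs/ms, ms/sec, min/hrs).

HOT_MAX_SECONDS = 0.1    # assumption: sub-100 ms access means hot data
WARM_MAX_SECONDS = 60.0  # assumption: up to one minute means warm data


def classify_temperature(max_latency_seconds: float) -> str:
    """Return 'hot', 'warm', or 'cold' for a tolerated access latency."""
    if max_latency_seconds <= 0:
        raise ValueError("latency must be positive")
    if max_latency_seconds < HOT_MAX_SECONDS:
        return "hot"
    if max_latency_seconds < WARM_MAX_SECONDS:
        return "warm"
    return "cold"


if __name__ == "__main__":
    for latency in (0.005, 2.0, 3600.0):
        print(latency, "->", classify_temperature(latency))
```

In practice the other rows (request rate, item size, cost per GB) would also feed into the decision; latency alone is used here to keep the sketch short.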
  • 15. Storage options for ML and Analytics workload
      (Diagram: storage options Amazon Kinesis (streaming), Amazon DynamoDB/RDS, and Amazon S3, arranged from hot to cold data, plotted against processing technologies arranged from fast to slow: Spark Streaming, AWS Lambda, KCL apps, native apps, Amazon Redshift, Amazon Athena, Presto, Spark, Hive)
  • 16. Best practices - storage options for AI
  • 17. Best Practice 1: Make sure your data lake is Well-Architected! 5 Pillars of Well-Architected systems • Operational excellence • Security • Reliability • Performance efficiency • Cost optimization
  • 18. Best Practice 2: Use the right tool for the job (Data Tier)
      • Relational: Referential integrity with strong consistency, transactions, and hardened scale. Queries: complex query support via SQL
      • Key-value: Low-latency, key-based queries with high throughput and fast data ingestion. Queries: simple query methods with filters
      • Document: Indexing and storing of documents with support for query on any property. Queries: simple query with filters, projections and aggregates
      • In-memory: Microsecond latency, key-based queries, specialized data structures. Queries: simple query methods with filters
      • Graph: Creating and navigating relations between data easily and quickly. Queries: easily express queries in terms of relations
  • 19. Best Practice 3: Right tool for the right AI stack
      Amazon SageMaker storage options:
      • Use Amazon S3 pipe mode for Amazon SageMaker where applicable
      • Amazon EFS for Amazon SageMaker notebook external storage
      Deep Learning AMIs storage options:
      • Amazon Elastic Block Store (Amazon EBS), Amazon S3, and Amazon Elastic File System (Amazon EFS)
      AI application services:
      • Amazon Rekognition, Amazon Polly, Amazon Comprehend, etc.
      • Amazon S3
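To make the pipe mode recommendation concrete, here is a minimal sketch of a training job request that streams input from Amazon S3 instead of downloading it first. The field names follow the SageMaker `CreateTrainingJob` API; the helper name, channel name, instance type, volume size, and time limit are placeholder assumptions, and the training image must itself support pipe mode.

```python
# Minimal sketch: build a CreateTrainingJob request that uses Pipe input mode.
# The helper name and all default values are illustrative assumptions.
# The dict could be passed to boto3.client("sagemaker").create_training_job(**request).


def build_pipe_mode_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri):
    """Return keyword arguments for a SageMaker training job in Pipe mode."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            # "Pipe" streams data from S3; the default "File" copies it to local storage first.
            "TrainingInputMode": "Pipe",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "train",  # assumed channel name
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # assumed instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

Because the function only builds a dictionary, it can be inspected or unit tested without AWS credentials before it is handed to a real client.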
  • 20. Best Practice 4: Storage options for ML @Edge
      From Edge (IoT devices):
      • For data: MQTT, Amazon Kinesis Data Streams
      • For video: Amazon Kinesis Video Streams
      • AWS SDK for Python (e.g. Amazon S3 through Boto libraries)
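As a small illustration of the "for data" path, the sketch below prepares a sensor reading as a record for Amazon Kinesis Data Streams. The keyword names match the `put_record` call in the AWS SDK for Python; the helper name, stream name, device ID, and payload fields are hypothetical.

```python
# Minimal sketch: package an edge-device reading for Kinesis Data Streams.
# The helper name and payload layout are illustrative assumptions.
# The result could be sent with boto3.client("kinesis").put_record(**record).
import json


def build_kinesis_record(stream_name, device_id, reading, timestamp):
    """Return keyword arguments for a Kinesis put_record call."""
    payload = {"device_id": device_id, "timestamp": timestamp, "reading": reading}
    return {
        "StreamName": stream_name,
        # Kinesis expects the data blob as bytes.
        "Data": json.dumps(payload, sort_keys=True).encode("utf-8"),
        # Using the device ID as the partition key keeps one device's
        # records on the same shard, which preserves their order.
        "PartitionKey": device_id,
    }


if __name__ == "__main__":
    record = build_kinesis_record(
        "sensor-stream", "device-42", {"temp_c": 21.5}, "2018-11-29T16:00:00Z"
    )
    print(record["PartitionKey"], record["Data"])
```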
  • 22. Why do organizations typically store and archive data? REGULATION
  • 23. Unfortunately, today's solutions reinforce regulation as the primary driver
  • 24. ar·chi·val: Where content and data go to be forgotten
  • 25. Yet, for 70% of organizations, the monolithic model characteristic of historic information management has been replaced by a desire to consume content capabilities as needed* (*"Digitalizing" Core Business Processes, AIIM International, 2018)
  • 26. A challenge: Last January we challenged a group of AWS and Broadridge resources to answer a question …
  • 27. … Can we create services that change the perception of what it means to store and archive information, in an effort to make it a value driver to an enterprise?
  • 28. What we recognized in today's solutions
      • Solutions are cost prohibitive
          • Typically in house, leveraging expensive hardware and software
      • Information is captive
          • Minimal interfaces available
          • Typically customized to support point solutions
      • Information is standardized
          • Data and content must adhere to a structure to be stored and leveraged
      • Business use is minimal
          • Access via website
          • Support e-discovery capabilities
  • 29. Where we focused (Personas): 1. Operations 2. Legal & compliance 3. Customer service representatives • Eliminate data silos • Support migration, interoperability, or conversion • Complete view • Learn • GDPR & California privacy • Regulatory overlap • Flexible taxonomies • Anomaly detection • Automation • Experience • Chatbots • Self service
  • 30. Our Solution: Intelligent Information Management. A PaaS that turns archive data into live, useable, actionable information
  • 31. Reimagining customer service through information management, AI, and chatbots
  • 32. IIM – A Platform Focused on Information Assets
      • Interrogate stored information
      • Match data point with current trade information
      • Calculate differences
      • Send as text to user
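The four steps on this slide can be sketched in a few lines of plain Python. This is not Broadridge's implementation: the record layout, function names, and sample values are hypothetical, and in-memory lists and dictionaries stand in for the archive and the live trade feed.

```python
# Minimal sketch of the slide's flow: interrogate the archive, match against
# current trade data, calculate the difference, and return text for the user.
# All names and data shapes here are hypothetical.
from typing import Optional


def interrogate_archive(archive: list, account: str, symbol: str) -> Optional[dict]:
    """Step 1: find the archived trade record for an account and symbol."""
    for record in archive:
        if record["account"] == account and record["symbol"] == symbol:
            return record
    return None


def calculate_difference(record: dict, current_price: float) -> tuple:
    """Step 3: absolute and percentage change since the archived trade."""
    change = (current_price - record["price"]) * record["quantity"]
    percent = (current_price - record["price"]) / record["price"] * 100
    return change, percent


def build_text_reply(archive: list, quotes: dict, account: str, symbol: str) -> str:
    """Run all four steps and return the message a chatbot would send."""
    record = interrogate_archive(archive, account, symbol)
    if record is None or symbol not in quotes:
        return f"No data found for {symbol}."
    current_price = quotes[symbol]  # Step 2: match with current trade information
    change, percent = calculate_difference(record, current_price)
    # Step 4: format the result as text for the user.
    return (
        f"{symbol}: bought {record['quantity']} @ {record['price']:.2f}, "
        f"now {current_price:.2f}, change {change:+.2f} ({percent:+.1f}%)"
    )


if __name__ == "__main__":
    archive = [{"account": "A1", "symbol": "XYZ", "quantity": 10, "price": 50.0}]
    print(build_text_reply(archive, {"XYZ": 55.0}, "A1", "XYZ"))
```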
  • 33. Intelligent Information Management (IIM)
  • 34. (Architecture diagram)
      • Users: Client customers | Clients | Partners | Internal customers. Users and systems will interface with the Broadridge API, UX, or chatbot to access information
      • Presentment tier (UX, API, chatbot): Broadridge presentment tier including chatbots (voice and text), mobile-first UX, and microservices APIs
      • Abstraction tier (PaaS): Broadridge PaaS, a reusable, purpose-driven services tier of information services (labeled BASIC and PREMIUM)
      • 3rd-party services and cloud services: data lakes, Amazon Aurora, Amazon DynamoDB. Architected as a commodities layer to avoid vendor lock-in
  • 35. The abstraction layer
      Abstraction tier (PaaS): • E-Discovery • Traditional Search • Web Presentment • Compliant Archival • Actionable Insights • NLP Search • Risk & Fraud • Image Search • Recommendations • Regulatory Reporting • Pattern Recognition • Chat Bot
      Underlying layers: Broadridge Services, AWS Services
  • 36. Thank you! Rob Krugman, Chief Digital Officer, Broadridge Financial Systems. Mahendra Bairagi, AI/ML Specialist Solutions Architect, AWS