© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Fanatics ingests streaming data
to a data lake on AWS
July 12, 2018 | 10 AM-11 AM PDT
Today’s presenters
Paul Sears, Partner Solutions Architect, Amazon Web Services
Jordan Martz, Director of Technology Solutions, Attunity
Alan Chang, Senior Product Manager, Fanatics
Today’s agenda
• Driving innovation with AWS data lake solutions
• Moving data in real time with Attunity Replicate
• How Fanatics leverages data for customer insights
• Q&A/Discussion
Learning objectives
• How to deploy a data lake on Amazon S3
• How to ingest real-time data to a data lake with minimal operational impact
• How to use AWS, Attunity, and Kafka to get more value from your data
The data lake and AWS
Driving business value with disparate types of data
Legacy data warehouses and RDBMS
• Complex to set up and manage
• Do not scale
• Take months to add new data sources
• Queries take too long
• Cost $MM upfront
Should I build a data lake?
Starting by amassing "all your data" and dumping it into a large repository for the data gurus to start finding "insights" is like trying to win the lottery by buying all the tickets.
Rethink how to become a data-driven business
• Business outcomes: start with the insights and actions you want to drive, then work backwards to a streamlined design
• Experimentation: start small, test many ideas, keep the good ones and scale those up, paying only for what you consume
• Agile and timely: deploy data processing infrastructure in minutes, not months. Take advantage of a rich platform of services to respond quickly to changing business needs
Business case determines platform design
[Diagram: data moves through four stages (ingest/collect, store, process/analyze, consume/visualize) to produce answers and insights. Callout: start here, with a business case.]
Experiment and scale based on your business needs
[Diagram: match available data (metrics and monitoring, workflow logs, ERP transactions) to the four pipeline stages (ingest/collect, store, process/analyze, consume/visualize) to produce answers and insights.]
Business outcomes on a modern data architecture
Outcome 1: Modernize and consolidate
• Insights to enhance business applications and create new digital services
Outcome 2: Innovate for new revenues
• Personalization, demand forecasting, risk analysis
Outcome 3: Real-time engagement
• Interactive customer experience, event-driven automation, fraud detection
Outcome 4: Automate for expansive reach
• Automation of business processes and physical infrastructure
Data lake on AWS
[Diagram: Amazon S3 at the center, surrounded by AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, Amazon Kinesis Video Streams, and AI Services.]
• Relational and non-relational data
• Schema defined during analysis
• Unmatched durability and availability at EB scale
• Best security, compliance, and audit capabilities
• Run any analytics on the same data without movement
• Scale storage and compute independently
• Store data at $0.023/GB-month; query for $0.05/GB scanned
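As a rough illustration of the storage and query prices quoted on this slide, the sketch below estimates a monthly bill. It assumes the storage figure is per GB-month and takes the $0.05/GB query price from the slide as-is; the function name `monthly_cost` and the workload volumes are hypothetical, and actual AWS pricing varies by region, tier, and date.

```python
# Prices as quoted on the slide (assumption: storage is per GB-month).
STORAGE_USD_PER_GB_MONTH = 0.023
QUERY_USD_PER_GB_SCANNED = 0.05


def monthly_cost(stored_gb: float, scanned_gb: float) -> float:
    """Estimate one month's bill: storage plus data scanned by queries."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    queries = scanned_gb * QUERY_USD_PER_GB_SCANNED
    return round(storage + queries, 2)


if __name__ == "__main__":
    # Hypothetical workload: 10 TB stored, 500 GB scanned per month.
    print(monthly_cost(stored_gb=10_000, scanned_gb=500))
```

Because storage and queries are billed separately, the two terms can grow independently, which is the cost-side view of scaling storage and compute independently.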
Why Amazon S3 for modern data architecture?
Durable
• Designed for 11 9s of durability
Available
• Designed for 99.99% availability
High performance
• Multipart upload
• Range GET
Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments
Integrated
• Amazon EMR
• Amazon Redshift Spectrum
• Amazon DynamoDB
• Amazon Athena
• AWS Glue
• Amazon Kinesis
• Amazon SageMaker
Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
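Both high-performance features on this slide, uploading an object in parts and Range GET, work on byte ranges of an object. The sketch below only does the range arithmetic and makes no AWS calls; `part_ranges` and `range_header` are hypothetical helper names. The limits used (5 MiB minimum part size except for the last part, at most 10,000 parts) are the documented S3 multipart limits.

```python
from typing import Iterator, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum part size (last part may be smaller)
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def part_ranges(object_size: int, part_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets for each part of an object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    num_parts = -(-object_size // part_size)  # ceiling division
    if num_parts > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    for start in range(0, object_size, part_size):
        yield start, min(start + part_size, object_size) - 1


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range header value for a ranged GET."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    size = 20 * 1024 * 1024 + 1  # hypothetical object, just over 20 MiB
    for start, end in part_ranges(size, 8 * 1024 * 1024):
        print(range_header(start, end))
```

Each range can be uploaded or fetched by a separate worker, which is where the throughput gain comes from.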
Decouple storage and compute
• Legacy design was large databases or data warehouses with integrated hardware
• Big data architectures often benefit from decoupling storage and compute
Attunity and AWS
Working together to help customers ingest real-time
data into the cloud
AWS Partner Network (APN) Advanced Technology Partner
APN Migration and Big Data Competency Partner
5-star rating on AWS Marketplace
© 2017 Attunity
Automate your data lake pipeline with Attunity
• Automate to reduce cost of traditional EDW process
• Adapt data processes and technologies to changing business needs
• Provide near real-time updates of analytics-ready data sets
• Deliver timely transactional data for insights
• Ensure quality and governance
Data lake capabilities of Attunity Replicate
Requirement: Attunity Replicate capabilities
• Low latency: Propagate data and schema changes end-to-end with near-zero latency to enable analytics-ready data sets
• Scale: Load from hundreds of sources into a data lake or Hadoop, with no agents; efficiently control large-scale environments
• Flexibility: Universal, optimized platform integration for future flexibility; full load and CDC to big data components
• Time to value: Automate to eliminate manual scripting, enabling non-programmers to create analytics-ready ODS/HDS
• Performance: Protect production with low-touch, agentless processing of incremental updates
• Efficiency: Low-impact CDC eliminates disruptive full loads and re-loads
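To make the "CDC instead of full loads" point concrete, here is a minimal, generic sketch of applying change events to a replica. It is not Attunity Replicate's implementation or API; the event shape `(operation, key, row)` and the name `apply_changes` are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Dict[str, object]
# A change event: (operation, primary key, row image or None for deletes).
Change = Tuple[str, int, Optional[Row]]


def apply_changes(target: Dict[int, Row], changes: Iterable[Change]) -> int:
    """Apply insert/update/delete events in order; return rows touched."""
    touched = 0
    for op, key, row in changes:
        if op in ("insert", "update"):
            if row is None:
                raise ValueError(f"{op} event for key {key} has no row image")
            target[key] = dict(row)
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        touched += 1
    return touched


if __name__ == "__main__":
    # Hypothetical replica of a three-row source table.
    replica = {1: {"status": "new"}, 2: {"status": "new"}, 3: {"status": "new"}}
    log = [
        ("update", 2, {"status": "shipped"}),
        ("insert", 4, {"status": "new"}),
        ("delete", 1, None),
    ]
    print(apply_changes(replica, log), "rows touched instead of a full reload")
    print(replica)
```

Only the changed rows are read and written, so the cost tracks the change rate rather than the table size.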
Attunity differentiators – data lake
• Continuous data ingestion with our CDC technology
• Scales for hundreds of heterogeneous sources
• Pipeline automation – ingestion and Hive merging
ATTUNITY REPLICATE ARCHITECTURE
[Diagram labels. Sources: Hadoop, RDBMS, data warehouse, files, mainframe. Capture: batch, CDC. Processing: transfer, filter, transform; in-memory and persistent store. Delivery: incremental, batch. Targets: Hadoop, RDBMS, data warehouse, streaming, files.]
UNIVERSAL PLATFORM COVERAGE – SOURCES
Sources from which Attunity Replicate moves data
• Database: Oracle, SQL Server, DB2 iSeries, DB2 z/OS, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix
• EDW: Exadata, Teradata, Netezza, Vertica, Pivotal
• Mainframe: DB2 for z/OS, IMS/DB, VSAM
• SAP: ECC on Oracle, ECC on SQL, ECC on DB2, ECC on HANA, S4 HANA
• Cloud: AWS RDS, Amazon Aurora, Amazon Redshift
• Other legacy: SQL/MP, Enscribe, RMS
• Flat files: Delimited (e.g., CSV, TSV)
UNIVERSAL PLATFORM COVERAGE – TARGETS
Targets where Attunity Replicate moves data
• Database: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix, MemSQL
• EDW: Exadata, Teradata, Netezza, Vertica, Sybase IQ, Amazon Redshift, Actian Vector, SAP HANA
• Cloud: Amazon RDS, Amazon Redshift, Amazon EMR, Amazon S3, Amazon Aurora, Snowflake
• Data lake: Hortonworks, Cloudera, MapR, Amazon EMR, HDInsight
• Streaming: MapR-ES, Kafka
• SAP: HANA
• Flat files: Delimited (e.g., CSV, TSV)
Fanatics
Using big data to identify customer needs
• Fan gear and replica jersey rights to meet today’s on-demand culture
• Licensing rights across all major leagues and numerous NCAA partner programs
• Over $2B in sales through a multichannel approach for more than 300 global partners
• Event retail and in-venue retail rights with top leagues, teams, and global events
Multi-faceted capabilities
• E-commerce
• Tech, data, and mobile
• Hot market
• In-house manufacturing / On-demand
• In-venue event and retail
• Memorabilia and game-used
More than 300 partners worldwide
Diverse destinations
More than just a website
Fanatics delivers a comprehensive, multi-channel technology and data platform. If you are a sports fan, you have likely had a Fanatics Experience.
Fanatics: mobile-first, omni-channel company emphasizing data and technology
• Data is important to execute upon this vision
• Make data ubiquitous in all aspects of user experience & business operations
Why is data so important at Fanatics?
• eCommerce + offline venues
• Business ecosystem: dynamic nature of sports business
• Sampling of data use cases
  – BI insights, including real-time analytics
  – User experiences (search, personalization, shipping & delivery, experimentation)
  – Paid marketing optimization
  – Pricing and promotions
  – Merchandising optimization
  – Planning and optimization (fulfillment, customer service, manufacturing)
Our challenge: a low-overhead, scalable data ingestion solution was needed
• Variety of apps: homegrown site transactional, ERP, manufacturing
• Support for DBs: SQL Server, Oracle, Postgres, MySQL
• Volume: 1000+ tables; 100+ TB over time
• Key decisions:
  – Leverage cloud (elasticity, agility, new systems)
  – Leverage Amazon S3
    • Permanent data repository (data lake)
    • Primary data exchange mechanism across applications, regions, partners
    • Amazon S3 gives us the flexibility to enable multiple data processing and querying technologies
• Continuous data ingestion from on-premises applications to Amazon S3
• Flexibility: low overhead to add new tables, sources
Our solution: AWS and Attunity
Sources
• More than five major internal systems
• 100+ SQL Server tables (current)
Micro-batches (15 min intervals)
• Normalize transaction logs
• Parquet output serialization
• Detect DDL changes
Near real time batches (30-60 min intervals)
• Create table snapshot from most recent data
Near future
• Install Attunity on AWS
• Take a look at Kafka connector
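The micro-batch steps on this slide (15-minute windows, DDL change detection, and a table snapshot built from the most recent data) can be sketched in plain Python. This is a simplified illustration, not Fanatics' pipeline: the record layout, the `seq` ordering column, and all function names are hypothetical, and schema detection from a single record is deliberately naive.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, Set, Tuple

Record = Dict[str, Any]
BATCH_INTERVAL = timedelta(minutes=15)  # micro-batch interval from the slide


def batch_start(ts: datetime, interval: timedelta = BATCH_INTERVAL) -> datetime:
    """Round a timestamp down to the start of its micro-batch window."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // interval) * interval


def detect_ddl_change(known_columns: Set[str], record: Record) -> Tuple[Set[str], Set[str]]:
    """Compare a record's columns with the known schema: (added, dropped)."""
    columns = set(record)
    return columns - known_columns, known_columns - columns


def latest_snapshot(records: Iterable[Record], key: str, order_by: str) -> Dict[Any, Record]:
    """Build a table snapshot by keeping the most recent record per key."""
    snapshot: Dict[Any, Record] = {}
    for rec in records:
        current = snapshot.get(rec[key])
        if current is None or rec[order_by] >= current[order_by]:
            snapshot[rec[key]] = rec
    return snapshot


if __name__ == "__main__":
    print(batch_start(datetime(2018, 7, 12, 10, 37, 5)))
    print(detect_ddl_change({"id", "status"}, {"id": 1, "status": "new", "carrier": "UPS"}))
    changes = [
        {"id": 1, "seq": 1, "status": "new"},
        {"id": 1, "seq": 3, "status": "shipped"},
        {"id": 1, "seq": 2, "status": "paid"},
    ]
    print(latest_snapshot(changes, key="id", order_by="seq"))
```

Running the snapshot step less often than the micro-batches, as the slide describes, keeps the frequent path cheap and defers the per-key merge to the 30 to 60 minute cadence.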
Transitioning Attunity with AWS
Future state
Real-time ingestion in AWS
• Kafka as target for Attunity
Real-time integration and reporting
• Apache Spark/Flink applications enrich and integrate incoming data in real time
Best Practices and Lessons Learned
• Attunity installation and data consumption owned by the same team
• Correctly understand the data frequency requirements
• Evaluate tradeoff between file size and file IO operations
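The file size versus file I/O tradeoff in the last bullet comes down to simple arithmetic: shorter batch intervals mean fresher data but more, smaller files. The sketch below assumes one output file per table per batch and a hypothetical daily volume of 100 GB; only the 1000-table count and the 15 and 60 minute intervals come from the slides.

```python
def files_per_day(tables: int, interval_minutes: int) -> int:
    """Files written per day, assuming one file per table per batch."""
    batches_per_day = (24 * 60) // interval_minutes
    return tables * batches_per_day


def avg_file_mb(daily_volume_gb: float, tables: int, interval_minutes: int) -> float:
    """Average file size in MB if the daily volume is spread evenly."""
    return daily_volume_gb * 1024 / files_per_day(tables, interval_minutes)


if __name__ == "__main__":
    # Hypothetical daily volume; table count and intervals from the slides.
    for minutes in (15, 60):
        count = files_per_day(tables=1000, interval_minutes=minutes)
        size = avg_file_mb(daily_volume_gb=100, tables=1000, interval_minutes=minutes)
        print(f"{minutes} min batches: {count} files/day, about {size:.2f} MB each")
```

Many small files increase the number of object requests and per-file overhead for query engines, while larger files delay data availability, so the interval should follow the actual data frequency requirement.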
The takeaway
Amazon Web Services
Fanatics selected AWS to take advantage of multiple data platforms
Attunity
Fanatics used Attunity for data integration into multiple open source and AWS platforms
Q&A
Next steps and further information
• Get a free trial of Attunity solutions:
  – http://amzn.to/2kBDgtF
• Learn more about Attunity solutions on AWS:
  – http://bit.ly/2kAU4kJ
• Learn more about Fanatics:
  – https://www.fanatics.com/
Fanatics Ingests Streaming Data to a Data Lake on AWS

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Fanatics ingests streaming data to a data lake on AWS July 12, 2018| 10AM-11AM PDT
  • 2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Today’s presenters Paul Sears, Partner Solutions Architect, Amazon Web Services Jordan Martz, Director of Technology Solutions, Attunity Alan Chang, Senior Product Manager, Fanatics
  • 3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Today’s agenda • Driving innovation with AWS data lake solutions • Moving data in real time with Attunity Replicate • How Fanatics leverages data for customer insights • Q&A/Discussion
  • 4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Learning objectives • How to deploy a data lake on Amazon S3 • How to ingest real-time data to a data lake with minimal operational impact • How to use AWS, Attunity, and Kafka to get more value from your data
  • 5. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The data lake and AWS Driving business value with disparate types of data
  • 6. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1 Legacy data warehouses and RDBMS • Complex to set up and manage • Do not scale • Take months to add new data sources • Queries take too long • Cost $MM upfront
  • 7. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1 Should I build a data lake? Starting by amassing "all your data" and dumping into a large repository for the data gurus to start finding "insights" is like trying to win the lottery by buying all the tickets
  • 8. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1 Rethink how to become a data-driven business • Business outcomes - start with the insights and actions you want to drive, then work backwards to a streamlined design • Experimentation - start small, test many ideas, keep the good ones and scale those up, paying only for what you consume • Agile and timely - deploy data processing infrastructure in minutes, not months. Take advantage of a rich platform of services to respond quickly to changing business needs
  • 9. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. 20170717-v1 Business case determines platform design Ingest/ collect Consume/ visualize Store Process/ analyze Data 1 4 0 9 5 Answers and insights START HERE WITH A BUSINESS CASE
  • 10. Experiment and scale based on your business needs [Diagram: match available data (metrics and monitoring, workflow logs, ERP transactions) to the pipeline: Data → Ingest/collect → Store → Process/analyze → Consume/visualize → Answers and insights]
  • 11. Business outcomes on a modern data architecture • Outcome 1: Modernize and consolidate (insights to enhance business applications and create new digital services) • Outcome 2: Innovate for new revenues (personalization, demand forecasting, risk analysis) • Outcome 3: Real-time engagement (interactive customer experience, event-driven automation, fraud detection) • Outcome 4: Automate for expansive reach (automation of business processes and physical infrastructure)
  • 12. Data lake on AWS [Diagram: AWS Snowball, AWS Snowmobile, Amazon Kinesis Data Firehose, and Amazon Kinesis Data Streams feed S3; Amazon Redshift, Amazon EMR, Amazon Athena, Amazon Kinesis, Amazon Elasticsearch Service, Amazon Kinesis Video Streams, and AI Services work on the data in S3] • Relational and non-relational data • Schema defined during analysis • Unmatched durability and availability at EB scale • Best security, compliance, and audit capabilities • Run any analytics on the same data without movement • Scale storage and compute independently • Store data at $0.023/GB-month; query for $0.05/GB scanned
  • 13. Why Amazon S3 for modern data architecture? • Durable: designed for 11 9s of durability • Available: designed for 99.99% availability • High performance: Multiple upload; Range GET • Scalable: store as much as you need; scale storage and compute independently; no minimum usage commitments • Integrated: Amazon EMR; Amazon Redshift Spectrum; Amazon DynamoDB; Amazon Athena; AWS Glue; Amazon Kinesis; Amazon SageMaker • Easy to use: simple REST API; AWS SDKs; read-after-create consistency; event notification; lifecycle policies
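Slide 13 lists Range GET among the S3 performance features. As a rough illustration of the idea, the sketch below splits an object of known size into HTTP `Range` header values so that parts can be fetched separately. The function name `byte_ranges` and the part size are assumptions made for this example, not anything shown in the deck, and no AWS call is made here.

```python
def byte_ranges(object_size, part_size):
    """Yield HTTP Range header values that together cover object_size bytes."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    for start in range(0, object_size, part_size):
        # HTTP byte ranges are inclusive on both ends.
        end = min(start + part_size, object_size) - 1
        yield f"bytes={start}-{end}"


# Hypothetical 20-byte object read in 8-byte parts.
ranges = list(byte_ranges(20, 8))
print(ranges)
```

Each value could then be supplied as the `Range` argument of an S3 GetObject request, for example through an AWS SDK, so that several parts are read in parallel.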
  • 14. Decouple storage and compute • Legacy design was large databases or data warehouses with integrated hardware • Big data architectures often benefit from decoupling storage and compute
  • 15. Attunity and AWS: Working together to help customers ingest real-time data into the cloud
  • 16. AWS Partner Network (APN) Advanced Technology Partner • APN Migration and Big Data Competency Partner • 5-star rating on AWS Marketplace
  • 17. Automate your data lake pipeline with Attunity • Automate to reduce cost of traditional EDW process • Adapt data processes and technologies to changing business needs • Provide near real-time updates of analytics-ready data sets • Deliver timely transactional data for insights • Ensure quality and governance (© 2017 Attunity)
  • 18. Data lake capabilities of Attunity Replicate (requirement: capability) • Low latency: propagate data and schema changes end-to-end with near-zero latency to enable analytics-ready data sets • Scale: load from 100s of sources into data lake or Hadoop with no agents; efficiently control large-scale environments • Flexibility: universal, optimized platform integration for future flexibility; full load and CDC to big data components • Time to value: automate to eliminate manual scripting, enabling non-programmers to create analytics-ready ODS/HDS • Performance: protect production with low-touch, agentless processing of incremental updates • Efficiency: low-impact CDC eliminates disruptive full loads and re-loads
  • 19. Attunity differentiators for the data lake • Continuous data ingestion with our CDC technology • Scales for hundreds of heterogeneous sources • Pipeline automation: ingestion and Hive merging
  • 20. Attunity Replicate architecture [Diagram: sources (Hadoop, RDBMS, data warehouse, files, mainframe); capture (batch, CDC); processing (transfer, transform, filter; in-memory, persistent store); delivery (batch, incremental); targets (Hadoop, RDBMS, data warehouse, streaming, files)]
  • 21. Universal platform coverage: sources from which Attunity Replicate moves data • Database: Oracle, SQL Server, DB2 iSeries, DB2 z/OS, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix • EDW: Exadata, Teradata, Netezza, Vertica, Pivotal • Mainframe: DB2 for z/OS, IMS/DB, VSAM • SAP: ECC on Oracle, ECC on SQL, ECC on DB2, ECC on HANA, S4 HANA • Cloud: AWS RDS, Amazon Aurora, Amazon Redshift • Other legacy: SQL/MP, Enscribe, RMS • Flat files: delimited (e.g., CSV, TSV)
  • 22. Universal platform coverage: targets where Attunity Replicate moves data • Database: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix, MemSQL • EDW: Exadata, Teradata, Netezza, Vertica, Sybase IQ, Amazon Redshift, Actian Vector, SAP HANA • Cloud: Amazon RDS, Amazon Redshift, Amazon EMR, Amazon S3, Amazon Aurora, Snowflake • Data lake: Hortonworks, Cloudera, MapR, Amazon EMR, HDInsight • Streaming: MapR-ES, Kafka • SAP: HANA • Flat files: delimited (e.g., CSV, TSV)
  • 23. Fanatics: Using big data to identify customer needs
  • 24. • Fan Gear And Replica Jersey Rights To Meet Today’s On-Demand Culture • Licensing Rights Across All Major Leagues And Numerous NCAA Partners Programs • Over $2B In Sales Through A Multichannel Approach For More Than 300 Global Partners • Event Retail And In-Venue Retail Rights With Top Leagues, Teams And Global Events
  • 25. Multi-faceted capabilities • E-commerce • Tech, data, and mobile • Hot market • In-house manufacturing / On-demand • In-venue event and retail • Memorabilia and game-used
  • 26. More than 300 partners worldwide
  • 27. Diverse destinations: More than just a website. Fanatics delivers a comprehensive, multi-channel technology and data platform. If you are a sports fan, you have likely had a Fanatics Experience.
  • 28. Fanatics: mobile-first, omni-channel company emphasizing data and technology • Data is important to execute upon this vision • Make data ubiquitous in all aspects of user experience & business operations. Why is data so important at Fanatics? • eCommerce + offline venues • Business ecosystem: dynamic nature of sports business • Sampling of data use cases: BI insights, including real-time analytics; user experiences (search, personalization, shipping & delivery, experimentation); paid marketing optimization; pricing and promotions; merchandising optimization; planning and optimization (fulfillment, customer service, manufacturing)
  • 29. Our challenge: low overhead and scalable data ingestion solution needed • Variety of apps: homegrown site transactional, ERP, manufacturing • Support for DBs: SQL Server, Oracle, Postgres, MySQL • Volume: 1000+ tables; 100 TB+ over time • Key decisions: leverage cloud (elasticity, agility, new systems); leverage Amazon S3 • Permanent data repository (data lake) • Primary data exchange mechanism across applications, regions, partners • Amazon S3 allows us the flexibility to enable multiple data processing and querying technologies • Continuous data ingestion from on-premises applications to Amazon S3 • Flexibility: low overhead to add new tables, sources
  • 30. Our solution: AWS and Attunity. Sources • More than five major internal systems • 100+ SQL Server tables (current). Micro-batches (15 min intervals) • Normalize transaction logs • Parquet output serialization • Detect DDL changes. Near-real-time batches (30-60 min intervals) • Create table snapshot from most recent data. Near future • Install Attunity on AWS • Take a look at Kafka connector
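Slide 30 mentions detecting DDL changes in each micro-batch and creating a table snapshot from the most recent data. The deck does not show how Fanatics implemented either step, so the following is only a minimal sketch of the general idea. The record layout (`seq`, `op`, `row`), the key column `id`, and both function names are hypothetical and do not reflect Attunity Replicate's actual output format.

```python
def build_snapshot(changes, key="id"):
    """Replay change records in sequence order and keep the latest row per key."""
    snapshot = {}
    for change in sorted(changes, key=lambda c: c["seq"]):
        row_key = change["row"][key]
        if change["op"] == "delete":
            snapshot.pop(row_key, None)        # a deleted row leaves the snapshot
        else:
            snapshot[row_key] = change["row"]  # insert or update: last write wins
    return snapshot


def schema_diff(old_columns, new_columns):
    """Return (added, removed) column names between two batches."""
    old, new = set(old_columns), set(new_columns)
    return sorted(new - old), sorted(old - new)


# Hypothetical change records from one or more micro-batches (arrival order may vary).
changes = [
    {"seq": 1, "op": "insert", "row": {"id": 1, "status": "placed"}},
    {"seq": 2, "op": "insert", "row": {"id": 2, "status": "placed"}},
    {"seq": 4, "op": "delete", "row": {"id": 2, "status": "placed"}},
    {"seq": 3, "op": "update", "row": {"id": 1, "status": "shipped"}},
]
snapshot = build_snapshot(changes)
added, removed = schema_diff(["id", "status"], ["id", "status", "carrier"])
print(snapshot, added, removed)
```

In a pipeline like the one on the slide, the same replay would typically run over Parquet files with a distributed engine rather than over in-memory dictionaries; the dictionaries here only keep the example self-contained.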
  • 31. Transitioning Attunity with AWS
  • 32. Future state. Real-time ingestion in AWS • Kafka as target for Attunity. Real-time integration and reporting • Apache Spark/Flink applications enrich and integrate incoming data in real time
  • 33. Best Practices and Lessons Learned • Attunity installation and data consumption owned by the same team • Correctly understand the data frequency requirements • Evaluate tradeoff between file size and file IO operations
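The last lesson on slide 33, the tradeoff between file size and file IO operations, can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes one output file per table per batch and a daily change volume of 100 GB; both are illustrative assumptions, not figures from the deck. The table count and batch intervals are taken from slides 29 and 30.

```python
def files_per_day(tables, interval_minutes):
    """Files written per day, assuming one file per table per batch."""
    batches_per_day = (24 * 60) // interval_minutes
    return tables * batches_per_day


def average_file_mb(daily_volume_gb, tables, interval_minutes):
    """Average file size in MB if the daily volume is spread evenly over all files."""
    return daily_volume_gb * 1024 / files_per_day(tables, interval_minutes)


TABLES = 1000            # "1000+ tables" from slide 29
DAILY_VOLUME_GB = 100    # hypothetical daily change volume

for interval in (15, 60):
    count = files_per_day(TABLES, interval)
    size = average_file_mb(DAILY_VOLUME_GB, TABLES, interval)
    print(f"{interval} min batches: {count} files/day, about {size:.2f} MB each")
```

Under these assumptions, moving from 15-minute to 60-minute batches cuts the number of files by a factor of four and makes each file four times larger, at the cost of fresher data. That is the tradeoff to weigh against the data frequency requirements named on the same slide.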
  • 34. The takeaway • Amazon Web Services: Fanatics selected AWS to take advantage of multiple data platforms • Attunity: Fanatics used Attunity for data integration into multiple open source and AWS platforms
  • 35. Q&A
  • 36. Next steps and further information • Get a free trial of Attunity solutions: http://amzn.to/2kBDgtF • Learn more about Attunity solutions on AWS: http://bit.ly/2kAU4kJ • Learn more about Fanatics: https://www.fanatics.com/