AWS S3 + Alluxio + Presto = ❤
The Ryte Use Case
● What is Ryte: a platform to optimize your online marketing
● Requirements for Ryte Search Success
● The way to Presto on AWS EMR with S3
● Problems that came up
● How we solved them with Alluxio
Welcome to the age of strategic platforms
● Customer Relationship Management
● Productivity Application
● Website Quality Management
Welcome the Ryte Suite with three essential tools!
● Website Success: Make sure the technical elements of your website are flawless.
● Content Success: Create engaging content that users will love.
● Search Success: Score top search rankings with the help of real Google data.
Making sure the digital evolution won’t hurt your business
● Monitor: Monitor vital elements of your website’s success.
● Analyze: Get the best possible data to make informed decisions.
● Optimize: Hands-on advice to make your website the best it can be.
Let’s have a look at Website Success
Make sure the technical elements of your website are flawless.
Let’s have a look at Content Success
Create engaging content that users will love.
Let’s have a look at Search Success
Score top search rankings with the help of real Google data.
Our key asset: real Google data
● Ryte is 100% Google compliant
● Ryte does not use scraped data, only Google Search Console data
● So monitor your important keywords based on 100% real Google data and therefore real search queries
Requirements for Ryte Search Success
From Product Objectives to the technical implementation
Product Objectives to choose our technical solution:
● Daily import of multiple GB of JSON data
● Mainly analytics-based queries
● Our product features require queries on raw data
● Prefer usage of AWS high-level services
● Development-team experience
HTTP JSON API
Daily Data Import
Ryte Data-Backend
Ryte Web-Frontend
Ryte Data-Backend: first Edition
AWS Elasticsearch Service
Import Application on AWS EC2
REST API on AWS EC2
● Simple data transformation: JSON to JSON
● Knowledge exists in the teams
● Performance
● Analytics Queries
Elasticsearch as Data-Storage: Downsides
● Costs scale with data, not with usage
● High-performance setups are difficult
Ryte Data-Backend: second Edition
Parquet on AWS S3
Import Application on AWS ECS
AWS EMR (Hadoop) with Presto as distributed SQL engine
REST API on AWS ECS
● Fully decoupled read / write engine
● Highly reliable, low-cost storage with AWS S3 (99.999999999% durability)
● Cost-intensive scaling is usage-based
Ryte Backend Flow
AWS ECS Container
AWS ECS Container
AWS S3 (unlimited persistent & reliable object store)
AWS EMR Task Node
REST API
AWS EMR Master Instance
Presto Task Node
Presto Master Node
AWS EMR/Presto with S3 as Data-Backend: Downsides
RANDOM HIGH PEAKS ON S3 REQUESTS RESULT IN TIMEOUTS!
● 1 slow S3 request in 100 kills the whole query
● S3 latency has a direct impact on the user
● AWS tried its best to find a solution, but the issue stayed stuck for days & weeks
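Why a single slow request matters so much can be sketched with a little arithmetic. The snippet below is a rough illustration, not a measurement from the Ryte system: it assumes each S3 request is independently slow with probability 1 in 100 (the figure from the slide) and that a query has to wait for all of its requests. The request counts per query are hypothetical.

```python
def p_query_hits_slow_request(n_requests: int, p_slow: float = 0.01) -> float:
    """Probability that at least one of n independent requests is slow.

    Assumption: requests are independent and each is slow with
    probability p_slow (1 in 100, taken from the slide).
    """
    return 1.0 - (1.0 - p_slow) ** n_requests


# Hypothetical numbers of S3 requests issued by one query.
for n in (1, 10, 100, 500):
    share = p_query_hits_slow_request(n)
    print(f"{n:>4} requests per query: {share:.1%} of queries hit a slow request")
```

Under these assumptions, a query that fans out into about a hundred S3 reads is more likely than not to hit at least one slow request, which is why the latency peaks reach the user so often.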
API responses took up to 20 s instead of 3 s
Decoupling of our Storage Layer, or: How Alluxio solves all Problems
Ryte Data-Backend: third Edition
Parquet stored on AWS S3
Import on AWS ECS
REST API on AWS ECS
Presto on AWS EMR Cluster
Alluxio cache on AWS EMR Cluster
● Currently no extra hardware costs
● Alluxio cache can be “warmed up”
● Cache costs scale with usage
● Fits perfectly between Presto and S3
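The idea of a cache that sits between the query engine and the object store, and that can be warmed up ahead of time, can be sketched in a few lines. This is a toy read-through cache for illustration only. It is not Alluxio's API or implementation; `ReadThroughCache`, `fake_s3_get`, and the object keys are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List


class ReadThroughCache:
    """Toy in-memory read-through cache in front of a slow backing store."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch              # stand-in for a read from S3
        self._data: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def warm_up(self, keys: Iterable[str]) -> None:
        """Load keys ahead of time so user-facing reads skip the store."""
        for key in keys:
            if key not in self._data:
                self._data[key] = self._fetch(key)

    def read(self, key: str) -> bytes:
        """Serve from memory if present, otherwise fetch once and keep it."""
        if key in self._data:
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = self._fetch(key)
        self._data[key] = value
        return value


backend_calls: List[str] = []


def fake_s3_get(key: str) -> bytes:
    """Hypothetical backing store; records every call it receives."""
    backend_calls.append(key)
    return f"parquet-bytes-for-{key}".encode()


cache = ReadThroughCache(fake_s3_get)
cache.warm_up(["day=1/part.parquet"])   # pre-load before users query
cache.read("day=1/part.parquet")        # hit: served from memory
cache.read("day=2/part.parquet")        # miss: fetched once from the store
cache.read("day=2/part.parquet")        # hit
print(cache.hits, cache.misses, len(backend_calls))
```

In this sketch only the warm-up and the first read of an uncached key touch the backing store, so a latency spike there affects at most those calls rather than every user request.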
Ryte Backend Flow
AWS ECS Container
AWS ECS Container
AWS S3 (unlimited persistent & reliable object store)
AWS EMR Task Node
REST API
AWS EMR Master Instance
Alluxio Worker
Alluxio Master
RAM Cache
Presto Task Node
Alluxio Client
Presto Master Node
Performance Push 😱😍
Query time reduced by 72% on average!
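To put the percentage in perspective, a 72% reduction means a query takes 28% of its previous time. The small sketch below shows that arithmetic. The 10-second starting value is a hypothetical example, and an average reduction does not necessarily apply to any single query.

```python
def time_after_reduction(before_seconds: float, reduction: float = 0.72) -> float:
    """Remaining query time after cutting it by the given fraction."""
    return before_seconds * (1.0 - reduction)


def speedup_factor(reduction: float = 0.72) -> float:
    """How many times faster a query is after the reduction."""
    return 1.0 / (1.0 - reduction)


example_before = 10.0  # hypothetical query time in seconds, not from the slides
print(f"{example_before:.1f} s -> {time_after_reduction(example_before):.1f} s")
print(f"speedup: {speedup_factor():.2f}x")
```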
Summary
● Alluxio helps us decouple S3 latency spikes from user requests
● No need for additional hardware so far
● Easy integration between Presto on Hadoop & S3
● Hardware requirements still scale with the business 👍
Danny Linden @ Ryte
Chapter Lead Engineering
E-Mail: d.linden@ryte.com
linkedin.com/in/danny-linden/
Twitter: @CodingDanny
Questions?
WE ARE HIRING IN MUNICH: jobs.ryte.com
