Alluxio Use Cases and Future Directions

DATA ORCHESTRATION SUMMIT
Alluxio Use Cases and Future Directions
Bin Fan - Founding Engineer, VP of Open Source @ Alluxio
Calvin Jia - Founding Engineer @ Alluxio

Data Orchestration for
Analytics & AI in the Cloud
A DATA ORCHESTRATION APPROACH
Available:

Agenda
• Alluxio Use Cases
• Future Directions
• Community Collaborations

DATA ORCHESTRATION SUMMIT 2020
Alluxio Use Cases

Companies Using Alluxio

Single Cloud & On-Prem Use Cases
USE CASE 01: CLOUD
Consistent SLAs, Performance, and Cost Savings on cloud storage
[Diagram: TensorFlow, Alluxio; public cloud]
USE CASE 02: ON PREM
Speed-up analytics on on-prem object stores
[Diagram: Spark, Alluxio; on premise]

CHALLENGES WITH CLOUD STORAGE
USE CASE 01: CLOUD
Inefficient access to cloud storage
• Performance is variable and consistent SLAs are hard to achieve
• Metadata operations are expensive and slow down workloads
• Embedded caching solutions are ineffective for ephemeral workloads & clusters
[Diagram: TensorFlow, Alluxio]

SOLUTION
USE CASE 01: CLOUD
Consistent SLAs, Performance & Cost Savings on cloud storage
• 40%+ reduction in AI training time & cost
• 2-8x performance with analytics engines
• Eliminate storage access cost to cut total cost by up to 50%
• Reduce latency spikes by up to 6x using data pre-loading & consistent performance guarantees
• Optional off-cluster caching for ephemeral workloads
[Diagram: TensorFlow, Alluxio]
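As a rough illustration of the data pre-loading and caching idea on this slide, the minimal Python sketch below puts a toy LRU read-through cache in front of a dictionary that stands in for cloud storage. The names `ReadThroughCache`, `preload`, and `read` are hypothetical and are not Alluxio APIs; the sketch only shows why warming a cache before a workload starts removes cold reads while it runs.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy LRU read-through cache in front of a slow backing store."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # stands in for cloud object storage
        self.capacity = capacity            # maximum number of cached objects
        self.cache = OrderedDict()
        self.remote_reads = 0               # reads that had to go to the store

    def _fetch(self, key):
        self.remote_reads += 1
        return self.backing_store[key]

    def _put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def preload(self, keys):
        """Warm the cache before the workload starts."""
        for key in keys:
            if key not in self.cache:
                self._put(key, self._fetch(key))

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            return self.cache[key]
        value = self._fetch(key)
        self._put(key, value)
        return value


# Hypothetical dataset of four objects in "cloud storage".
store = {f"part-{i}": b"x" * 10 for i in range(4)}
cache = ReadThroughCache(store, capacity=4)

cache.preload(store.keys())
reads_after_preload = cache.remote_reads

# Three passes over the dataset, e.g. three training epochs.
for _ in range(3):
    for key in store:
        cache.read(key)

reads_after_workload = cache.remote_reads
```

In this toy setup the workload itself triggers no further reads against the backing store, because every object was fetched once during pre-loading and the cache is large enough to hold the whole dataset.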

CHALLENGES WITH ON-PREM OBJECT STORES
USE CASE 02: ON PREM
Slow transition to object storage
• Performance for analytics & AI workloads can be very poor
• No native support for popular frameworks
• Expensive metadata operations further reduce performance
[Diagram: Spark, Alluxio]

SOLUTION
USE CASE 02: ON PREM
Speed-up analytics & AI on on-prem object stores
• Improved performance over co-located HDFS with the flexibility of segregated storage
• Support for multiple APIs
• No changes to the end-user experience
• Enable cheap storage at a fraction of the cost
[Diagram: Spark, Alluxio; same region]

Hybrid Cloud & Multi-Datacenter
USE CASE 03: HYBRID
Burst compute to a public cloud and gradually migrate
[Diagram: Hive, Alluxio; public cloud, on premise]
USE CASE 04: HYBRID
Hybrid Cloud Gateway to utilize on-prem compute for data in the cloud
[Diagram: Alluxio, PyTorch; public cloud, on premise]
USE CASE 05: MULTI-DATACENTER
Cross Datacenter Access without changing Ingest Pipeline across regions
[Diagram: Presto, Alluxio; datacenter 1, datacenter 2, ingestion]

CHALLENGES WITH HYBRID CLOUD BURSTING
USE CASE 03: HYBRID
Migrating Analytics or AI to the Cloud is Hard
• Repeated data access across the corporate network to a public cloud is not feasible
• Copying data to cloud storage is time consuming and complex
• Using a cloud storage system like S3 means expensive application changes and low performance
[Diagram: Hive, Alluxio]

SOLUTION
USE CASE 03: HYBRID
Burst Compute to a Public Cloud and Gradually Migrate
• Performance as if data is on the cloud compute cluster
• 100% of I/O is offloaded from on-premises
• No changes to end-user experience and security model
• Common data fabric with only logical data copies
• Utilization of elastic cloud compute for up to 4x cost savings
[Diagram: Hive, Alluxio; same region]

Alluxio @ Walmart
• Zero-Copy
○ No new copies of data in the cloud
• High Performance
○ Data caching accelerates queries
• Lower Costs
○ One source of truth for data avoids additional storage

CHALLENGES WITH HYBRID CLOUD STORAGE
USE CASE 04: HYBRID
Accessing Cloud Storage from a Private Datacenter
• No unified view for cloud and on-prem storage
• Prohibitively high network egress costs
• Inability to utilize compute on-premises for data generated in the cloud
• Inadequate performance for analytics and AI
[Diagram: PyTorch; on premise, public cloud]

SOLUTION
USE CASE 04: HYBRID
Hybrid Cloud Storage Gateway for data in the cloud
• Performance as if data is on the on-prem compute cluster
• Intelligent distributed caching for reads & writes
• Network cost savings of up to 80% by eliminating replication
• No changes to the end-user experience with flexible APIs and security model on cloud storage
[Diagram: Alluxio, PyTorch; on premise, public cloud]
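One way to picture a unified view over on-prem and cloud storage is a mount table that maps logical path prefixes to storage locations. The Python sketch below is a minimal illustration under that assumption. `MountTable`, its methods, and both storage URIs are hypothetical names chosen for the example; this is not Alluxio's implementation or API.

```python
class MountTable:
    """Toy mount table: maps logical path prefixes to storage URIs."""

    def __init__(self):
        self.mounts = {}

    def mount(self, logical_prefix, storage_uri):
        # Store both sides without trailing slashes so joins stay predictable.
        self.mounts[logical_prefix.rstrip("/")] = storage_uri.rstrip("/")

    def resolve(self, logical_path):
        # The longest matching prefix wins, so nested mounts resolve correctly.
        best = None
        for prefix in self.mounts:
            if logical_path == prefix or logical_path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(f"no mount covers {logical_path}")
        return self.mounts[best] + logical_path[len(best):]


table = MountTable()
table.mount("/onprem", "hdfs://namenode:8020/warehouse")  # hypothetical URI
table.mount("/cloud", "s3://example-bucket/datasets")     # hypothetical URI

# Applications use one logical namespace regardless of where the data lives.
cloud_location = table.resolve("/cloud/images/cat.jpg")
onprem_location = table.resolve("/onprem/sales/2020.parquet")
```

The application only ever sees paths such as `/cloud/images/cat.jpg`; which storage system backs a path is a property of the mount table, not of the application code.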

CHALLENGES WITH SUPPORTING SATELLITE CLUSTERS ACROSS DATA CENTERS
USE CASE 05: MULTI DATACENTER
Utilization of compute resources across datacenters
• Orchestrating data to compute clusters in another data center is manual and time consuming
• Storing and managing multiple copies of the data is expensive with unnecessary network traffic for replication
• Running replication frameworks on an overloaded storage cluster dramatically impacts performance of existing workloads
[Diagram: Presto, Alluxio, Hive; datacenter 1, datacenter 2]

SOLUTION
USE CASE 05: MULTI DATACENTER
Cross Datacenter Access without changing Ingest Pipeline
• No redundant data copies across datacenters
• Elimination of complex data synchronization
• 3-6x performance compared to remote data access across regions
• Self-service data infrastructure across business units
[Diagram: Presto, Alluxio, Hive; datacenter 1, datacenter 2]

Alluxio @ Adobe
PROBLEM
Primary DC with a large Hadoop cluster is out of space; ad hoc SQL workloads are growing exponentially as analyst headcount has reached 1,800 people.
SOLUTION
Leverage compute resources outside of the primary on-prem DC for multiple analytical frameworks.
RESULTS
● 80% less network usage
● More stable infrastructure
● Lower costs
● Results come in faster
● Easier to scale
● Ability to handle new analysts with no impact and improve response times
● Self-service for end-users
[Diagram: remote data]

Alluxio & Data Analytics
• Data Analytics runs on Data Lakes
• Data Lakes are designed for data storage, not access
• Alluxio is the Data Orchestration layer which bridges the
compute and data layers
○ If the Data Lake is remote
○ If the Data Lake is overloaded
○ If the Data Lake has variable latency
○ If the Data Lake has low performance
○ If the Data Lake doesn’t support the same semantics
○ ...

Growing Workloads

Alluxio & AI w/ K8s
• Machine Learning & AI runs on Data Lakes
• Compared to Data Analytics, AI workloads have different characteristics, but a similar mismatch between compute and storage

Alluxio & AI - Better Together
• Access Pattern - Repeated access on a dataset
• Dataset - Many small files
• Preferred API - POSIX filesystem
• Workload Regularity - Predictable, bulk access
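The access pattern listed above (repeated passes over many small files through a POSIX filesystem API) can be sketched with ordinary Python file I/O. In the sketch below a temporary directory stands in for a filesystem mount point, which is an assumption made so the example runs anywhere; `read_dataset` and the file names are hypothetical. With a FUSE-style mount, the same code would simply be pointed at the mount path.

```python
import os
import tempfile


def read_dataset(root):
    """Read every file under root with ordinary POSIX-style file I/O."""
    samples = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                samples[os.path.relpath(path, root)] = f.read()
    return samples


# A temporary directory stands in for the mount point (assumption).
with tempfile.TemporaryDirectory() as mount_point:
    # Create many small files, as in a typical training dataset.
    for i in range(100):
        with open(os.path.join(mount_point, f"sample-{i:03d}.bin"), "wb") as f:
            f.write(bytes([i]))

    # Two epochs: the same dataset is read in bulk, repeatedly.
    epochs = [read_dataset(mount_point) for _ in range(2)]

num_files = len(epochs[0])
```

Because the training code uses only `open` and `os.walk`, it needs no storage-specific client library; the predictable, repeated bulk reads are what make this pattern a good fit for caching.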

Powered by the Community
• Future directions and growing workloads for Alluxio are
greatly influenced by our community! Thank you!

Community Collaborations

Alluxio Open Source Project Stats
Latest stable release: 2.4.1
Total number of contributors: 1092
+1013 more commits since v2.1.0 (Nov 2019, 1st Summit)
5100+ Slack users (alluxio.io/slack)

Fast Growing User Slack Channel
alluxio.io/slack

Production Deployments at Scale
● Top-tier cell phone provider
○ 3000+ Alluxio servers in a single cluster
● Top-tier social network company
○ 10,000+ concurrent Alluxio clients
○ 10+PB data managed

Special Interest Groups in Ecosystem
● SIG in Machine Learning/K8s on Alluxio
■ Regular Community R&D meetings
■ Re-implemented JNI-based FUSE integration
■ Performance optimizations for small files, RPCs
● A new SIG kicked off in Presto on Alluxio

Experimental Two-week Release Cycle
● Previous release cadence: quarterly
● New experimental release schedule:
○ every two weeks
○ starting early December!
● What does it bring to the Alluxio community?
○ deliver features and bug fixes faster

Welcome to Join the Alluxio Community!
alluxio.io/slack Alluxio-Global-Online-Meetup/
