Data Orchestration for AI, Big Data, and Cloud
1. Data Orchestration for AI, Big Data, and Cloud
Haoyuan (HY) Li | Founder, Chairman, CTO | Alluxio
haoyuan@alluxio.com | @haoyuan
2019-09-09 @ IFA+ Summit 2019
2. Realities: A Fragmented Data World
Data Silos are Inevitable
More data generated every day
Data Scientists and Analysts need access to this data
New compute technologies in the cloud
5. Data silos cross data centers, regions, clouds
Data in disparate storage systems: HDFS, NFS, Object Store, S3
Compute spread across many different frameworks: Hive, Spark, Presto, TensorFlow
Other diagram labels: WAN, Azure
6. Abstract & orchestrate data across data silos
Data in disparate storage systems: HDFS, NFS, S3
Compute spread across many different frameworks: Hive, Spark, Presto, TensorFlow
Other diagram labels: Data Orchestration (repeated across the diagram), Any Data App
7. Data Orchestration for the cloud
Data Locality, Accessibility & Elasticity for AI & Data Analytics
• Faster Data to Insights to Innovation
• Elastic Compute Resource in the Cloud
• Cost Savings on both Machines & People
9. Java File API, HDFS Interface, S3 Interface, REST API, FUSE Interface
HDFS Driver, Swift Driver, S3 Driver, NFS Driver
Data Orchestration for AI, Big Data, and Cloud
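
The layering on this slide (application-facing interfaces on top, per-storage drivers underneath, an orchestration layer in between) can be illustrated with a small sketch. This is a toy model, not Alluxio's API: `StorageDriver`, `InMemoryDriver`, `Orchestrator`, the mount prefixes, and the file paths are all hypothetical names chosen for illustration, and the in-memory driver merely stands in for real HDFS, S3, or NFS drivers.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical driver interface: one implementation per storage system."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class InMemoryDriver(StorageDriver):
    """Stand-in for a real HDFS/S3/NFS driver, backed by a plain dict."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)

    def read(self, path):
        return self.files[path]


class Orchestrator:
    """Toy unified namespace that routes paths to the mounted driver."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, driver):
        # Store the prefix without a trailing slash for uniform matching.
        self.mounts[prefix.rstrip("/")] = driver

    def resolve(self, path):
        # Longest-prefix match, so nested mounts take precedence.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self.mounts[prefix], path[len(prefix):] or "/"
        raise FileNotFoundError(path)

    def read(self, path):
        driver, relative_path = self.resolve(path)
        return driver.read(relative_path)


# Applications see a single namespace regardless of where the data lives.
orch = Orchestrator()
orch.mount("/hdfs", InMemoryDriver("hdfs", {"/logs/a.txt": b"from hdfs"}))
orch.mount("/s3", InMemoryDriver("s3", {"/bucket/b.txt": b"from s3"}))
print(orch.read("/hdfs/logs/a.txt"))
print(orch.read("/s3/bucket/b.txt"))
```

In this sketch the application only calls `orch.read(...)`; adding another storage system would mean writing one more driver and mounting it, without changing application code.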
12. An Open Source Implementation of Data Orchestration
Started From UC Berkeley AMPLab
1000+ contributors & growing
4000+ Git Stars
Apache 2.0 Licensed
GitHub’s Top 100 Most Valuable Repositories Out of 96 Million
Join the conversation on Slack: slackin.alluxio.io