SlideShare una empresa de Scribd logo
1 de 22
To The Cloud and Back:
A Look at Hybrid Analytics
Keith Manthey
© Dell EMC 2016 - All rights reserved2
ITIL-based
IT processes
Client-server
scale-up apps
Infrastructure
resiliency
Traditional
CIO challenge: next-generation infrastructures are needed
© Dell EMC 2016 - All rights reserved3
ITIL-based
IT processes
Client-server
scale-up apps
Infrastructure
resiliency
Cloud-native
Coexisting IT paradigms
Traditional
DevOps based
IT processes
Distributed
scale-out apps
Application
resiliency
CIO challenge: next-generation infrastructures are needed
© Dell EMC 2016 - All rights reserved4
Traditional Cloud-native
Modern
IT
Off-Premises
On-Premises
Available both on and off premises
CLOUD ENABLED STORAGE
DATALAKE
DATACENTER
CLOUD ENABLED STORAGE
DATALAKE
DATACENTER
Through 2020, IDC predicts 4.5x faster spending on cloud
based big data analytics then on-premise solutions
By 2020, IDC predicts that usage of big data analytics for
actionable intelligence will double over today
On premise and Cloud use of Hybrid Analytics will become a
dominant use of the technology
More specifically – Moving data from On-Premise to the Cloud
and Back
The journey to here
© Dell EMC 2016 - All rights reserved8
Data
Management
DATA LAKE SOLUTION FOR EDW MODERNISATION
Clickstream
Web & Social
Geolocation
Sensor & Machine
Server Logs
EXISTINGSOURCES
ERP
CRM
Commodity Compute
DATA
SERVICES
OPERATIONAL
SERVICES
HORTONWORKS
DATA PLATFORM
HADOOP CORE
Business
Analytics
Visualization
& Dashboards
IT
Applications
NEWSOURCES
2
3
1
ETL/ELT OFFLOAD
ACTIVE ARCHIVE
ENRICH WITH NEW DATA
TYPES
MULTI-PROTOCOL
ACCESS
ENTERPRISE-GRADE
DATA MANAGEMENT
5
NFS, SMB,
HTTP, Swift
1
2
3
4
5
4
New Data Flow
Current Data Flow
Legend
OFFLOAD
Isilon
© Dell EMC 2016 - All rights reserved9
1. Active Archive
– Optimise Enterprise Data Warehouse storage by archiving cold data and still analyse it as
needed
2. ETL Offload
– Improve EDW performance by offloading ETL processing to Hadoop
3. Semi/Unstructured Data Analytics
– Increase confidence in business decisions with new data sources
4. Multi-protocol Access
– Enable applications to access/update Hadoop data using NFS, SMB, HTTP, Swift and
other file/object based access methods
5. Data Management
– Enterprise-grade data management at Hadoop economics
DATA LAKE BENEFITS
Unique to
Isilon
© Dell EMC 2016 - All rights reserved10
ISILON MOMENTUM
2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
• >12,000 Clusters in the Field
• Approaching 1PB Average Cluster
Capacity
• 1.3EB of Hadoop Usage
• 7,000+ Customers World Wide
• 1,300 New Customers in 2015
• #1 in Data Lakes
>1,200 HDFS Customers
Scale-Out NAS Leader
A look forward
Ability to:
• Enact Policy Based Data Movement
• Migrate data from On-Prem to Cloud
and vice versa
• Leverage Hybrid Analytics to greatest
Effect
To The Cloud and Back
Limits:
100 Gb/Sec
1 Petabyte (1 million gigabytes)
~1 Day to transfer one way
To The Cloud and Back
Data Silos
OnPremOnlyfiles
CloudOnlyfiles
Strong Bandwidth
Typically Hand Scripted
Expensive Extraction
Limit outbound data movement
Our Research
Problem Domain
1. File movement automation in place today is labor intensive and fraught with
peril
2. Non-prescriptive file movement out of the cloud will be extremely expensive
with limited value in return.
3. Most file movements might not return to exactly the same target location. For
example:
Location 1,
File 1 Location 2,
File 1 –
begets File 2
Location 3,
File 2
What if….
1. There was a way to move files by policy (i.e., rules based) to various
locations, including the cloud and back
1. For Example – only target net new files created in the cloud in certain
directories for movement.
2. The rules based file movement could allow for multiple targets and / or
destinations for files.
1. For Example – move net new files in one directory to a single argetand
net new files in a second directory to three targets.
OneFS in the Cloud
Our Journey to Here
1. OneFS already is a software defined asset.
1. SDEdge is a software defined storage offering
2. CloudPools is a software feature in SDEdge that uses the Public Cloud
(Azure, AWS, Google) as a back up target
2. OneFS has been loaded and run in the public cloud as noted in the previous
slide.
3. Dell EMC has a long history of data mobility across our product suite
(replication, backups, etc)
4. Dell EMC has a long history of policy based file movement features.
Futures
1. We are exploring what OneFS with all its features and abilities to move data
up to the cloud and back would look like.
2. What about a cloud environment that contained a OneFS daemon that
allowed policy based file movement.?
3. The cloud environment file movement (up and back) could be controlled by a
Isilon cluster
On-Prem Files
Movement policies
Future State?
Cloud Files
OneFS Daemon
To The Cloud and Back: A Look At Hybrid Analytics

Más contenido relacionado

La actualidad más candente

Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
How the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownHow the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownDataWorks Summit
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateDataWorks Summit
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalDataWorks Summit
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...DataWorks Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit
 
Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep DiveHortonworks
 
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonImproving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonDataWorks Summit/Hadoop Summit
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobileDataWorks Summit
 
2013 July 23 Toronto Hadoop User Group Hive Tuning
2013 July 23 Toronto Hadoop User Group Hive Tuning2013 July 23 Toronto Hadoop User Group Hive Tuning
2013 July 23 Toronto Hadoop User Group Hive TuningAdam Muise
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDataWorks Summit
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...DataWorks Summit/Hadoop Summit
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 

La actualidad más candente (20)

Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
 
How the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside DownHow the Internet of Things are Turning the Internet Upside Down
How the Internet of Things are Turning the Internet Upside Down
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community Update
 
HPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposalHPE Hadoop Solutions - From use cases to proposal
HPE Hadoop Solutions - From use cases to proposal
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFS
 
IoT:what about data storage?
IoT:what about data storage?IoT:what about data storage?
IoT:what about data storage?
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 
Stinger Initiative - Deep Dive
Stinger Initiative - Deep DiveStinger Initiative - Deep Dive
Stinger Initiative - Deep Dive
 
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC IsilonImproving Hadoop Resiliency and Operational Efficiency with EMC Isilon
Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon
 
Practice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China MobilePractice of large Hadoop cluster in China Mobile
Practice of large Hadoop cluster in China Mobile
 
2013 July 23 Toronto Hadoop User Group Hive Tuning
2013 July 23 Toronto Hadoop User Group Hive Tuning2013 July 23 Toronto Hadoop User Group Hive Tuning
2013 July 23 Toronto Hadoop User Group Hive Tuning
 
Achieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on TezAchieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on Tez
 
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive WarehouseDisaster Recovery and Cloud Migration for your Apache Hive Warehouse
Disaster Recovery and Cloud Migration for your Apache Hive Warehouse
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Evolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage SubsystemEvolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage Subsystem
 

Destacado

Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...DataWorks Summit/Hadoop Summit
 
Path to 400M Members: LinkedIn’s Data Powered Journey
Path to 400M Members: LinkedIn’s Data Powered JourneyPath to 400M Members: LinkedIn’s Data Powered Journey
Path to 400M Members: LinkedIn’s Data Powered JourneyDataWorks Summit/Hadoop Summit
 
Hivemall: Scalable machine learning library for Apache Hive/Spark/Pig
Hivemall: Scalable machine learning library for Apache Hive/Spark/PigHivemall: Scalable machine learning library for Apache Hive/Spark/Pig
Hivemall: Scalable machine learning library for Apache Hive/Spark/PigDataWorks Summit/Hadoop Summit
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...DataWorks Summit/Hadoop Summit
 
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...DataWorks Summit/Hadoop Summit
 
Enabling Apache Zeppelin and Spark for Data Science in the Enterprise
Enabling Apache Zeppelin and Spark for Data Science in the EnterpriseEnabling Apache Zeppelin and Spark for Data Science in the Enterprise
Enabling Apache Zeppelin and Spark for Data Science in the EnterpriseDataWorks Summit/Hadoop Summit
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...DataWorks Summit/Hadoop Summit
 
Real-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and ChallengesReal-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and ChallengesDataWorks Summit/Hadoop Summit
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersDataWorks Summit/Hadoop Summit
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceDataWorks Summit/Hadoop Summit
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...DataWorks Summit/Hadoop Summit
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseDataWorks Summit/Hadoop Summit
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANDataWorks Summit/Hadoop Summit
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataDataWorks Summit/Hadoop Summit
 

Destacado (20)

Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
 
Path to 400M Members: LinkedIn’s Data Powered Journey
Path to 400M Members: LinkedIn’s Data Powered JourneyPath to 400M Members: LinkedIn’s Data Powered Journey
Path to 400M Members: LinkedIn’s Data Powered Journey
 
Hivemall: Scalable machine learning library for Apache Hive/Spark/Pig
Hivemall: Scalable machine learning library for Apache Hive/Spark/PigHivemall: Scalable machine learning library for Apache Hive/Spark/Pig
Hivemall: Scalable machine learning library for Apache Hive/Spark/Pig
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
 
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
 
Enabling Apache Zeppelin and Spark for Data Science in the Enterprise
Enabling Apache Zeppelin and Spark for Data Science in the EnterpriseEnabling Apache Zeppelin and Spark for Data Science in the Enterprise
Enabling Apache Zeppelin and Spark for Data Science in the Enterprise
 
The truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on HadoopThe truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on Hadoop
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
 
Real-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and ChallengesReal-time Analytics in Financial: Use Case, Architecture and Challenges
Real-time Analytics in Financial: Use Case, Architecture and Challenges
 
Comparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBaseComparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBase
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Protecting Enterprise Data In Apache Hadoop
Protecting Enterprise Data In Apache HadoopProtecting Enterprise Data In Apache Hadoop
Protecting Enterprise Data In Apache Hadoop
 
Apache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in NutshellApache NiFi 1.0 in Nutshell
Apache NiFi 1.0 in Nutshell
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
 
Rebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for ScaleRebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for Scale
 

Similar a To The Cloud and Back: A Look At Hybrid Analytics

Simplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data VirtualizationSimplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data VirtualizationDenodo
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...Alluxio, Inc.
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesDATAVERSITY
 
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...Alluxio, Inc.
 
Data Orchestration for the Hybrid Cloud Era
Data Orchestration for the Hybrid Cloud EraData Orchestration for the Hybrid Cloud Era
Data Orchestration for the Hybrid Cloud EraAlluxio, Inc.
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Alluxio, Inc.
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
Emc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshopEmc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshopsolarisyougood
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAlluxio, Inc.
 
ECS/Cloud Object Storage - DevOps Day
ECS/Cloud Object Storage - DevOps DayECS/Cloud Object Storage - DevOps Day
ECS/Cloud Object Storage - DevOps DayBob Sokol
 
Harvard university i tv3.2
Harvard university i tv3.2Harvard university i tv3.2
Harvard university i tv3.2kevin_donovan
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Denodo
 
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsArchitecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsAlluxio, Inc.
 
Disaggregated Hadoop Stacks
Disaggregated Hadoop StacksDisaggregated Hadoop Stacks
Disaggregated Hadoop StacksDataWorks Summit
 
Big data presentation (2014)
Big data presentation (2014)Big data presentation (2014)
Big data presentation (2014)Xavier Constant
 

Similar a To The Cloud and Back: A Look At Hybrid Analytics (20)

EMC config Hadoop
EMC config HadoopEMC config Hadoop
EMC config Hadoop
 
Simplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data VirtualizationSimplifying Cloud Architectures with Data Virtualization
Simplifying Cloud Architectures with Data Virtualization
 
EMC EC Overview
EMC EC OverviewEMC EC Overview
EMC EC Overview
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data Lakes
 
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...
Hybrid Data Lake Architecture with Presto & Spark in the cloud accessing on-p...
 
Data Orchestration for the Hybrid Cloud Era
Data Orchestration for the Hybrid Cloud EraData Orchestration for the Hybrid Cloud Era
Data Orchestration for the Hybrid Cloud Era
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS Enabling big data & AI workloads on the object store at DBS
Enabling big data & AI workloads on the object store at DBS
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Emc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshopEmc ecs 2 technical deep dive workshop
Emc ecs 2 technical deep dive workshop
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud World
 
ECS/Cloud Object Storage - DevOps Day
ECS/Cloud Object Storage - DevOps DayECS/Cloud Object Storage - DevOps Day
ECS/Cloud Object Storage - DevOps Day
 
Harvard university i tv3.2
Harvard university i tv3.2Harvard university i tv3.2
Harvard university i tv3.2
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsArchitecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
 
Disaggregated Hadoop Stacks
Disaggregated Hadoop StacksDisaggregated Hadoop Stacks
Disaggregated Hadoop Stacks
 
Big data presentation (2014)
Big data presentation (2014)Big data presentation (2014)
Big data presentation (2014)
 

Más de DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

Más de DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Último

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Último (20)

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

To The Cloud and Back: A Look At Hybrid Analytics

  • 1. To The Cloud and Back: A Look at Hybrid Analytics Keith Manthey
  • 2. © Dell EMC 2016 - All rights reserved2 ITIL-based IT processes Client-server scale-up apps Infrastructure resiliency Traditional CIO challenge: next-generation infrastructures are needed
  • 3. © Dell EMC 2016 - All rights reserved3 ITIL-based IT processes Client-server scale-up apps Infrastructure resiliency Cloud-native Coexisting IT paradigms Traditional DevOps based IT processes Distributed scale-out apps Application resiliency CIO challenge: next-generation infrastructures are needed
  • 4. © Dell EMC 2016 - All rights reserved4 Traditional Cloud-native Modern IT Off-Premises On-Premises Available both on and off premises
  • 6. CLOUD ENABLED STORAGE DATALAKE DATACENTER Through 2020, IDC predicts 4.5x faster spending on cloud based big data analytics then on-premise solutions By 2020, IDC predicts that usage of big data analytics for actionable intelligence will double over today On premise and Cloud use of Hybrid Analytics will become a dominant use of the technology More specifically – Moving data from On-Premise to the Cloud and Back
  • 8. © Dell EMC 2016 - All rights reserved8 Data Management DATA LAKE SOLUTION FOR EDW MODERNISATION Clickstream Web & Social Geolocation Sensor & Machine Server Logs EXISTINGSOURCES ERP CRM Commodity Compute DATA SERVICES OPERATIONAL SERVICES HORTONWORKS DATA PLATFORM HADOOP CORE Business Analytics Visualization & Dashboards IT Applications NEWSOURCES 2 3 1 ETL/ELT OFFLOAD ACTIVE ARCHIVE ENRICH WITH NEW DATA TYPES MULTI-PROTOCOL ACCESS ENTERPRISE-GRADE DATA MANAGEMENT 5 NFS, SMB, HTTP, Swift 1 2 3 4 5 4 New Data Flow Current Data Flow Legend OFFLOAD Isilon
  • 9. © Dell EMC 2016 - All rights reserved9 1. Active Archive – Optimise Enterprise Data Warehouse storage by archiving cold data and still analyse it as needed 2. ETL Offload – Improve EDW performance by offloading ETL processing to Hadoop 3. Semi/Unstructured Data Analytics – Increase confidence in business decisions with new data sources 4. Multi-protocol Access – Enable applications to access/update Hadoop data using NFS, SMB, HTTP, Swift and other file/object based access methods 5. Data Management – Enterprise-grade data management at Hadoop economics DATA LAKE BENEFITS Unique to Isilon
  • 10. © Dell EMC 2016 - All rights reserved10 ISILON MOMENTUM 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 • >12,000 Clusters in the Field • Approaching 1PB Average Cluster Capacity • 1.3EB of Hadoop Usage • 7,000+ Customers World Wide • 1,300 New Customers in 2015 • #1 in Data Lakes >1,200 HDFS Customers Scale-Out NAS Leader
  • 12. Ability to: • Enact Policy Based Data Movement • Migrate data from On-Prem to Cloud and vice versa • Leverage Hybrid Analytics to greatest Effect To The Cloud and Back
  • 13. Limits: 100 Gb/Sec 1 Petabyte (1 million gigabytes) ~1 Day to transfer one way To The Cloud and Back
  • 14. Data Silos OnPremOnlyfiles CloudOnlyfiles Strong Bandwidth Typically Hand Scripted Expensive Extraction Limit outbound data movement
  • 16. Problem Domain 1. File movement automation in place today is labor intensive and fraught with peril 2. Non-prescriptive file movement out of the cloud will be extremely expensive with limited value in return. 3. Most file movements might not return to exactly the same target location. For example: Location 1, File 1 Location 2, File 1 – begets File 2 Location 3, File 2
  • 17. What if…. 1. There was a way to move files by policy (i.e., rules based) to various locations, including the cloud and back 1. For Example – only target net new files created in the cloud in certain directories for movement. 2. The rules based file movement could allow for multiple targets and / or destinations for files. 1. For Example – move net new files in one directory to a single argetand net new files in a second directory to three targets.
  • 18. OneFS in the Cloud
  • 19. Our Journey to Here 1. OneFS already is a software defined asset. 1. SDEdge is a software defined storage offering 2. CloudPools is a software feature in SDEdge that uses the Public Cloud (Azure, AWS, Google) as a back up target 2. OneFS has been loaded and run in the public cloud as noted in the previous slide. 3. Dell EMC has a long history of data mobility across our product suite (replication, backups, etc) 4. Dell EMC has a long history of policy based file movement features.
  • 20. Futures 1. We are exploring what OneFS with all its features and abilities to move data up to the cloud and back would look like. 2. What about a cloud environment that contained a OneFS daemon that allowed policy based file movement.? 3. The cloud environment file movement (up and back) could be controlled by a Isilon cluster
  • 21. On-Prem Files Movement policies Future State? Cloud Files OneFS Daemon

Notas del editor

  1. The challenge is: These applications require two distinct architectural paradigms. Traditional applications tend to be built in a vertically scaling manner. As more resources are required the systems on which they run must grow to support the scale-up of the individual app. Because of this design, application ecosystems tend to be contained within the boundary of a single datacenter. This has resulted in a model where resiliency and availability was delivered by the infrastructure itself. Additionally, in order to ensure system wide stability a rich set of checks and balances and change control methodologies emerged and were codified into a set of IT processes known as ITIL. CLICK
  2. Cloud native Applications, on the other hand, are built on a set of micro-services that scale horizontally rather than vertically. As more resources are required, systems don’t get bigger, but rather 10’s, 100’s or even 1000’s of additional application instances are deployed and these instances can run almost anywhere. Because these applications can run across a variety of datacenters, both on and off prem, organizations are not able to guarantee stability of the infrastructure and therefor the burden of reliability falls to the application itself. As a result new Dev/Ops processes designed to support rapid software iteration and delivery have been developed and are being adopted. CLICK
  3. So, each of these models are critical to the modern IT ecosystem. And thanks to the ubiquity of highly reliable and cost effective networks, these infrastructure services can be delivered almost anywhere. CLICK
  4. We’ve extended the data lake to the enterprise edge to address the storage needs of edge and remote locations We’ve also tightly integrated the data lake to public, private and hosted clouds to provide virtually limitless storage capacity to future proof your storage. the data lake now extends beyond the data center to your choice of cloud options
  5. We’ve extended the data lake to the enterprise edge to address the storage needs of edge and remote locations We’ve also tightly integrated the data lake to public, private and hosted clouds to provide virtually limitless storage capacity to future proof your storage. the data lake now extends beyond the data center to your choice of cloud options