Hybrid is the New Normal
Junaid Rao
Senior Cloud SE – APAC
2 © Cloudera, Inc. All rights reserved.
WHY HYBRID?
Three Types of Workload Lifecycles
[Figure: three workload lifecycles: Persistent, Transient, and Elastic. Diagram labels: 24/7; spin up, 1hr, spin down.]
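One way to read the three lifecycles is as different node-hour footprints. The following is a minimal Python sketch, not something from the slides: the class names, cluster sizes, and durations are all hypothetical and chosen only to make the contrast visible.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Persistent:
    nodes: int  # fixed size, runs 24/7

    def node_hours_per_day(self) -> float:
        return self.nodes * 24.0


@dataclass
class Transient:
    nodes: int
    run_hours: float   # time between spin up and spin down
    runs_per_day: int

    def node_hours_per_day(self) -> float:
        return self.nodes * self.run_hours * self.runs_per_day


@dataclass
class Elastic:
    hourly_nodes: List[int]  # node count for each hour of the day

    def node_hours_per_day(self) -> float:
        return float(sum(self.hourly_nodes))


# Hypothetical sizes, for illustration only.
persistent = Persistent(nodes=10)
transient = Transient(nodes=10, run_hours=1.0, runs_per_day=2)
elastic = Elastic(hourly_nodes=[2] * 8 + [10] * 8 + [4] * 8)

for workload in (persistent, transient, elastic):
    print(type(workload).__name__, workload.node_hours_per_day())
```

With these assumed numbers the persistent cluster consumes far more node-hours than the transient one, with the elastic cluster in between.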
HOW CLOUDERA HELPS WITH HYBRID?
• The modern platform for machine learning and analytics
• with multiple deployment options
• and one shared data experience
CLOUDERA VALUE PROPOSITION
MODERN PLATFORM: Data Engineering, Data Warehouse, Data Science, Operational Database
DEPLOYMENT OPTIONS
• Location: Data Center
• Storage: Amazon S3, Microsoft ADLS, HDFS, Kudu
• Manageability: Self Managed, Managed Service
SHARED DATA EXPERIENCE: Security, Governance, Lifecycle Management, Data Catalog
Big Data Infrastructure Evolution
Traditional Infrastructure
Each cluster is self-contained with compute, data context, and data.
Data context = HMS, Sentry, Navigator
[Diagram: three clusters, each containing its own Compute, Context, and Data]
• Compute, context, and data together. Designed for best performance.
• Highly available, mission critical
• Multi-tenant, secure, and governed
• Fixed size, provisioned for peak capacity
• Low utilization rates if only transient workloads
• Not easy to scale
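The low-utilization point can be made concrete with a little arithmetic. This is a minimal Python sketch under stated assumptions: the function name `utilization` and the demand profile (one 20-node batch job running two hours a day) are hypothetical, not figures from the talk.

```python
from typing import Sequence


def utilization(hourly_demand: Sequence[int], provisioned_nodes: int) -> float:
    """Fraction of provisioned node-hours that are actually used."""
    if provisioned_nodes <= 0 or len(hourly_demand) == 0:
        raise ValueError("need a positive cluster size and a non-empty profile")
    # Demand above the cluster size cannot be served, so cap it.
    used = sum(min(d, provisioned_nodes) for d in hourly_demand)
    return used / (provisioned_nodes * len(hourly_demand))


# Hypothetical day: a 20-node batch job runs for 2 hours, nothing else.
demand = [0] * 22 + [20] * 2
peak = max(demand)  # fixed-size cluster provisioned for peak capacity

print(f"{utilization(demand, peak):.1%}")
```

Under these assumptions the cluster is busy for only 40 of its 480 provisioned node-hours.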
Big Data Infrastructure Evolution
Decoupled Infrastructure
Data is separate from compute (e.g., in ADLS/S3), but context needs to be managed redundantly in each cluster.
[Diagram: three clusters, each with its own Compute and Context, sharing one Data store]
• Decoupled data and compute
• Use object store
• Scale easily
• Workload-specific infrastructure
• Supports transient and persistent workloads
• Compute and context are still together
• Maintain schemas in the application. You lose context when a transient workload completes.
• No way to troubleshoot. You lose workload information (logs and statistics) when the cluster goes away.
Big Data Infrastructure Evolution
Modern Infrastructure with Cloudera
Compute clusters are launched as needed.
Data and data context are stored externally and are long-running.
Workload Analytics stores your logs and job statistics.
[Diagram: three Compute clusters sharing one Data store and Cloudera SDX/WXM]
• Decoupled data and compute
• Use object store
• Scale easily
• Workload-specific infrastructure
• Supports transient and persistent workloads
• Persistent context and Workload Analytics
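The difference between context that lives inside a cluster and context that is stored externally can be sketched with a toy model. This is an illustration only: `SharedContext` and `TransientCluster` are hypothetical Python classes, not Cloudera APIs, and the in-memory dictionaries merely stand in for a long-running store of schemas and job history.

```python
from typing import Dict, List, Optional


class SharedContext:
    """Stand-in for a long-running context store (schemas, job history)."""

    def __init__(self) -> None:
        self.schemas: Dict[str, List[str]] = {}
        self.job_logs: List[str] = []


class TransientCluster:
    """Compute that is launched as needed and terminated afterwards."""

    def __init__(self, name: str, context: Optional[SharedContext] = None) -> None:
        self.name = name
        # Without an external store, context lives and dies with the cluster.
        self.context = context if context is not None else SharedContext()

    def run_job(self, table: str, columns: List[str]) -> None:
        self.context.schemas[table] = columns
        self.context.job_logs.append(f"{self.name}: wrote {table}")


shared = SharedContext()
first = TransientCluster("etl-1", context=shared)
first.run_job("events", ["id", "ts", "payload"])
del first  # the cluster goes away; the shared context does not

second = TransientCluster("etl-2", context=shared)
print(second.context.schemas["events"])
print(shared.job_logs)
```

A cluster created without the `context` argument would model the decoupled case, where the schema and logs disappear together with the cluster.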
CLOUDERA ALTUS
Flexible cloud deployment options, including workload-optimized Managed Services with Cloudera Shared Data Experience (SDX)
[Diagram labels: Data Engineering, Data Warehouse, Multi Function; Cloud Storage (Microsoft ADLS, AWS S3); Data Catalog, Governance, Security, Lifecycle Management; Control Plane; Director]
Cloudera Altus Architecture
[Diagram labels: Customer Cloud (Compute, Storage); CLI, Web, SDK; Altus Data Warehouse; Altus Data Engineering; Altus Control Plane]
On-Premises Cloud Bursting
[Diagram labels]
• Data Engineering Workflow: batch ML, simulations, etc.
• Transient Cluster(s), Altus managed: 1+ jobs (e.g. Spark)
• On-Premises Cluster (bare metal/private cloud): 1+ jobs (e.g. Spark); HDFS; Data Science, Data Eng, Data Warehouse
• Continuous or on-demand
• Persistent SDX: Schema (HMS), Security (Sentry), Lineage/Metadata (Navigator)
• Altus Control Plane
• Object Store (ADLS, S3)
• Elastic Cluster(s), Altus managed: Data Warehouse (Impala)
• BI Tools (e.g. Tableau), SQL Editor (e.g. Hue)
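The bursting idea, running jobs on-premises while capacity allows and sending the overflow to transient cloud clusters, can be sketched as a simple placement rule. This is a hedged illustration: the `Job` type, the `place_jobs` function, the job names, and the greedy policy are all assumptions made for the example and do not describe how Altus actually schedules work.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Job:
    name: str
    nodes: int  # nodes the job needs while it runs


def place_jobs(jobs: List[Job], on_prem_free_nodes: int) -> List[Tuple[str, str]]:
    """Greedy placement: fill on-premises capacity first, burst the rest."""
    placements = []
    free = on_prem_free_nodes
    for job in jobs:
        if job.nodes <= free:
            free -= job.nodes
            placements.append((job.name, "on-premises"))
        else:
            # Not enough local capacity: run on a transient cloud cluster.
            placements.append((job.name, "transient-cloud"))
    return placements


# Hypothetical workload, for illustration only.
jobs = [Job("nightly-etl", 8), Job("batch-ml", 12), Job("simulation", 4)]
for name, target in place_jobs(jobs, on_prem_free_nodes=14):
    print(name, "->", target)
```

In the architecture on the slide, both targets would rely on the same persistent SDX context, so a job's schema and security settings would not depend on where it lands.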
Altus Demo
[Diagram: the hybrid architecture from the On-Premises Cloud Bursting slide, repeated for the demo]
THANK YOU
Altus - Demo
