SlideShare a Scribd company logo
1 of 26
Download to read offline
Big Data Extensions: Advanced Features and
Customer Case Study
Jayanth Gummaraju, VMware
Sasha Kipervarg, Identified, Inc.
VAPP5484
#VAPP5484
2
Data Is Exploding & Hadoop Is Driving Growth
Unstructured data driving growth Hadoop adoption is ramping
2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
Structured Unstructured
Complex unstructured data
forecastedtooutpace structured
relationaldata by 10x by 2020
Evaluating
53%
In-
production
23%
Piloting
18%
Testing
2%
Don't know
2%
Other
2%
Source: Forrester Survey of60 CIOs, September 2011
• Unstructured data explosion and Hadoop capabilities causing CIOs to
reconsider Enterprise data strategy
• Hadoop’s ability to process raw data at cost presents intriguing value
proposition
3
Agenda
 Big Data Extensions Overview
 Virtualized Hadoop at Identified Inc.
 Advanced Features
4
Questions for Audience
 Familiarity with Hadoop
1. New to Hadoop
2. Reasonably familiar
3. Expert
 Hadoop cluster sizes
1. < 10 nodes
2. 10-50 nodes
3. > 50 nodes
 Virtualizing Hadoop
1. Never virtualized
2. Actively exploring virtualization
3. Running virtualized Hadoop in test-dev/production
5
Big Data on vSphere: Value Proposition
 Basic Features
• Fast provisioning
• Minutes/hours instead of days
• Workload Consolidation
• Multiple virtual clusters co-exist on same physical hardware
• High Availability
• Not limited to NameNode, JobTracker
 Advanced Features
• Auto-elasticity
• High Resource Utilization
• True multi-tenancy
• VM-grade security, performance, and configuration isolation
6
Serengeti
vSphere
Resource
Management
Hadoop
Virtualization
Extensions
vSphere Big Data Extensions: Program Highlights
 Open source project
 Tool to simplify virtualized
Hadoop deployment &
operations
Serengeti
 Virtualization changes
for core Hadoop
 Contributed back to
Apache Hadoop
 Advanced resource
management on vSphere
 Big Data applications-specific
extension to DRS
7
What is Hadoop?
Distributed processing of large data sets across clusters of computers
Source: http://architects.dzone.com/articles/how-hadoop-mapreduce-works
map()
reduce()
InputData
OutputData
Split
[k1, v1]
Sort
by k1
Merge
[k1, [v1, v2, v3,…]]
map()
map()
reduce()
8
Slave Node 1 Slave Node 2 Slave Node 3
Input File
Tasks Are Scheduled Where Data Resides
JobTrackerJob
DataNode
TaskTracker
Split 1 – 64MB
Task - 1
Split 2 – 64MB
Split 3 – 64MB
TaskTracker TaskTracker
DataNode DataNode
Block 1 – 64MB Block 2 – 64MB Block 3 – 64MB
Task - 2 Task - 3
NameNode
9
Myth: Virtual Performance Is Sub-optimal
[http://www.vmware.com/resources/techresources/10360, Jeff Buell, Apr 2013]
(lower is better)
32 hosts/3.6GHz 8 cores/15K RPM 146GB SAS disks/10GbE/72-96GB RAM
10
Agenda
 Big Data Extensions Overview
 Virtualized Hadoop at Identified Inc.
 Advanced Features
11
Agenda
 Big Data Extensions Overview
 Virtualized Hadoop at Identified Inc.
 Advanced Features
12
Compute-Data Separation
Combined
Storage/
Compute
VM
Hadoop in VM
• VM lifecycle
determined
by Datanode
• Limited elasticity
• Limited to Hadoop
Multi-Tenancy
Storage
Compute
VM
VM
Separate Storage
• Separate compute
from data
• Elastic compute
• Enable shared
workloads
• Raise utilization
Storage
T1 T2
VM
VM
VM
Separate Compute Tenants
• Separate virtual clusters
per tenant
• Stronger VM-grade security
and resource isolation
• Enable deployment of
multiple Hadoop runtime
versions
Slave Node
13
Dataflow with Separated Compute/Data
Virtual
Hadoop
Node
Virtual
Hadoop
Node
ESX Host
Virtual
Hadoop
Node
VMDK
DataNode
Virtual
Hadoop
Node
TaskTracker
Slot
Slot
Virtual Switch
Virtual NIC Virtual NIC
NIC Drivers
14
Elastic Scalability & Multi-Tenancy
 Deploy separate compute clusters for different tenants sharing HDFS.
 According to priority and available resources, power-on/off compute VMs
ExperimentationDynamic resourcepool
Data layer
Production
recommendation engine
Compute layer Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Compute
VM
Experimentation Production
Compute
VM
Job
Tracker
Job
Tracker
VMware vSphere + Big Data Extensions
15
Auto-elastic Hadoop in Action
ESX ESX ESX
J
T
DATA VM DATA VM DATA VM
Local Disks
SAN/NAS Non-Hadoop VMs
Hadoop Compute VMs
JT: JobTracker
TT: TaskTracker
NN: NameNode
VHM: Virtual Hadoop Manager
N
N
T
T
T
T
T
T
VirtualCenter Management Server
DRS DRS DRSDRS DRS
VHM
Hadoop HDFS VMs
T
T
T
T
T
T
J
T
16
Advanced Resource Management using Virtual Hadoop Manager
State, stats
(Slots used,
Pending work)
Commands
(Decommission,
Recommission)
Stats and VM
configuration
Serengeti Job
Tracker
vCenter DB
Manual/Auto
Power on/off
Virtual Hadoop Manager (VHM)
Job
Tracker
Task
Tracker
Task
Tracker
Task
Tracker
vCenter Server
VC
actions
Hadoop
actions
Serengeti
Configuration
VC
state and stats
Hadoop
state and stats
Auto-Scaling Algorithms
Cluster
Configuration
17
Auto-Scaling Algorithms: 5 Key Insights
① Expand or Shrink clusters based on ambient data
• Expand when there is work and no imminent contention
• Shrink when there is contention
• Predictable scaling for matching customer expectation, ease of testing, etc.
② Use contention detection as an input to scaling response
• Contention reflects user's resource control settings and workload demands
③ Act as an extension to DRS for distributed applications spanning multiple VMs
• A glue between DRS and Application-scheduler
• Penalize few VMs heavily rather than all VMs lightly/uniformly
④ React only if there is true contention and in a timely manner
• Actively used resources are deprived
• Do not react to transients
⑤ Use Hysteresis and Control Theory concepts to guide decisions
• E.g., transient windows and thresholds, feedback from previous actions, etc.
18
Shrinking-related Metrics
 CPU is being deprived
• VC metric: CPU Ready
• Time that vCPU is ready to run, but cannot be scheduled on a pCPU
 Memory is being deprived
• VC metrics:
• Usage: Active Memory, Granted Memory
• Reclamation: Memory Ballooning, Host Swap
• Typically starts with ballooning then leads to host swapping
 TaskTracker is dead or faulty
• Hadoop metrics: Alive Nodes and Task Failures
19
Expansion-related Metrics
 Jobs are present
• Hadoop metrics: jobs_preparing, jobs_running
 High slot usage
• Hadoop metrics: map_slots_used, max_map_slots, reduce_slots_used,
max_reduce_slots
 High task throughput
• Hadoop metrics: maps_completed, reduces_completed
 No imminent contention
• VC metrics: CPU Ready, Memory Ballooning
20
Auto-elasticity Demo
21
What’s Next?
 Resource management enhancements
• Algorithmic optimizations
• Contention metrics related to Disk/Network IO
 Auto-elasticity support for YARN and HBase
• YARN – Hadoop 2.x
• HBase – Hadoop database
 Serengeti enhancements
• Support for additional Hadoop distros
 Hadoop extensions
• Dynamic resource configuration
22
Main Takeaways
 Value proposition
• Fast provisioning
• Workload consolidation
• Elasticity  better resource utilization
• Multi-tenancy using VMs  differentiated service
 Key technologies
• Serengeti
• Advanced Resource Management
• Hadoop Virtual Extensions
Host Host Host
vSphere Platform
Make vSphere the platform of choice for running Big Data
23
Questions?
 Contact information
• Jayanth Gummaraju jgummaraju@vmware.com
• Sasha Kipervarg sasha@identified.com
 Other related sessions
• Breakout session (VAPP5402, VAPP5762)
• Big Data Panel (VAPP5626)
• Hands-on lab (HOL-SDC-1309)
 For more information (including download information)
• vSphere Big Data Extensions http://www.vmware.com/hadoop
• Project Serengeti http://www.projectserengeti.org
THANK YOU
Big Data Extensions: Advanced Features and
Customer Case Study
Jayanth Gummaraju, VMware
Sasha Kipervarg, Identified, Inc.
VAPP5484
#VAPP5484

More Related Content

What's hot

20100806 cloudera 10 hadoopable problems webinar
20100806 cloudera 10 hadoopable problems webinar20100806 cloudera 10 hadoopable problems webinar
20100806 cloudera 10 hadoopable problems webinarCloudera, Inc.
 
Hadoop and big data
Hadoop and big dataHadoop and big data
Hadoop and big dataYukti Kaura
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - IntroductionTomy Rhymond
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousingDataWorks Summit
 
Big Data Real Time Applications
Big Data Real Time ApplicationsBig Data Real Time Applications
Big Data Real Time ApplicationsDataWorks Summit
 
Big data technologies and Hadoop infrastructure
Big data technologies and Hadoop infrastructureBig data technologies and Hadoop infrastructure
Big data technologies and Hadoop infrastructureRoman Nikitchenko
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesDataWorks Summit
 
Hadoop: An Industry Perspective
Hadoop: An Industry PerspectiveHadoop: An Industry Perspective
Hadoop: An Industry PerspectiveCloudera, Inc.
 
Hadoop core concepts
Hadoop core conceptsHadoop core concepts
Hadoop core conceptsMaryan Faryna
 
Introduction to Apache Hadoop Eco-System
Introduction to Apache Hadoop Eco-SystemIntroduction to Apache Hadoop Eco-System
Introduction to Apache Hadoop Eco-SystemMd. Hasan Basri (Angel)
 
Big data analytics with hadoop volume 2
Big data analytics with hadoop volume 2Big data analytics with hadoop volume 2
Big data analytics with hadoop volume 2Imviplav
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Mahantesh Angadi
 
Hadoop vs. RDBMS for Advanced Analytics
Hadoop vs. RDBMS for Advanced AnalyticsHadoop vs. RDBMS for Advanced Analytics
Hadoop vs. RDBMS for Advanced Analyticsjoshwills
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystemnallagangus
 
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerMark Kromer
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and HadoopFebiyan Rachman
 

What's hot (20)

20100806 cloudera 10 hadoopable problems webinar
20100806 cloudera 10 hadoopable problems webinar20100806 cloudera 10 hadoopable problems webinar
20100806 cloudera 10 hadoopable problems webinar
 
Hadoop and big data
Hadoop and big dataHadoop and big data
Hadoop and big data
 
Big data with Hadoop - Introduction
Big data with Hadoop - IntroductionBig data with Hadoop - Introduction
Big data with Hadoop - Introduction
 
What is hadoop
What is hadoopWhat is hadoop
What is hadoop
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousing
 
Big Data Real Time Applications
Big Data Real Time ApplicationsBig Data Real Time Applications
Big Data Real Time Applications
 
Big data technologies and Hadoop infrastructure
Big data technologies and Hadoop infrastructureBig data technologies and Hadoop infrastructure
Big data technologies and Hadoop infrastructure
 
Big data analytics - hadoop
Big data analytics - hadoopBig data analytics - hadoop
Big data analytics - hadoop
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! Perspectives
 
Hadoop: An Industry Perspective
Hadoop: An Industry PerspectiveHadoop: An Industry Perspective
Hadoop: An Industry Perspective
 
Hadoop core concepts
Hadoop core conceptsHadoop core concepts
Hadoop core concepts
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Introduction to Apache Hadoop Eco-System
Introduction to Apache Hadoop Eco-SystemIntroduction to Apache Hadoop Eco-System
Introduction to Apache Hadoop Eco-System
 
Big data analytics with hadoop volume 2
Big data analytics with hadoop volume 2Big data analytics with hadoop volume 2
Big data analytics with hadoop volume 2
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
 
Hadoop vs. RDBMS for Advanced Analytics
Hadoop vs. RDBMS for Advanced AnalyticsHadoop vs. RDBMS for Advanced Analytics
Hadoop vs. RDBMS for Advanced Analytics
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystem
 
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL Server
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
 
Big data concepts
Big data conceptsBig data concepts
Big data concepts
 

Viewers also liked

VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld 2013: Extreme Performance Series: Monster Virtual Machines VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld 2013: Extreme Performance Series: Monster Virtual Machines VMworld
 
VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead VMworld
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoopChiou-Nan Chen
 
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015Rajit Saha
 
WBDB 2014 Benchmarking Virtualized Hadoop Clusters
WBDB 2014 Benchmarking Virtualized Hadoop ClustersWBDB 2014 Benchmarking Virtualized Hadoop Clusters
WBDB 2014 Benchmarking Virtualized Hadoop Clusterst_ivanov
 
Soyez Big Data ready avec Isilon
Soyez Big Data ready avec IsilonSoyez Big Data ready avec Isilon
Soyez Big Data ready avec IsilonRSD
 
7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoopTaldor Group
 
Emerging Big Data & Analytics Trends with Hadoop
Emerging Big Data & Analytics Trends with Hadoop Emerging Big Data & Analytics Trends with Hadoop
Emerging Big Data & Analytics Trends with Hadoop InnoTech
 
EMC Hadoop Starter Kit
EMC Hadoop Starter KitEMC Hadoop Starter Kit
EMC Hadoop Starter KitEMC
 
Big data on virtualized infrastucture
Big data on virtualized infrastuctureBig data on virtualized infrastucture
Big data on virtualized infrastuctureDataWorks Summit
 
Gartner IT Symposium 2014 - VMware Cloud Services
Gartner IT Symposium 2014 - VMware Cloud ServicesGartner IT Symposium 2014 - VMware Cloud Services
Gartner IT Symposium 2014 - VMware Cloud ServicesPhilip Say
 
VMworld - vSphere Distributed Switch 6.0 Technical Deep Dive
VMworld - vSphere Distributed Switch 6.0 Technical Deep DiveVMworld - vSphere Distributed Switch 6.0 Technical Deep Dive
VMworld - vSphere Distributed Switch 6.0 Technical Deep DiveChris Wahl
 
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...Nati Shalom
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...EMC
 
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...EMC
 

Viewers also liked (17)

VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld 2013: Extreme Performance Series: Monster Virtual Machines VMworld 2013: Extreme Performance Series: Monster Virtual Machines
VMworld 2013: Extreme Performance Series: Monster Virtual Machines
 
VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead VMworld 2013: Extreme Performance Series: Network Speed Ahead
VMworld 2013: Extreme Performance Series: Network Speed Ahead
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoop
 
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
Virtualized Big Data Platform at VMware Corp IT @ VMWorld 2015
 
WBDB 2014 Benchmarking Virtualized Hadoop Clusters
WBDB 2014 Benchmarking Virtualized Hadoop ClustersWBDB 2014 Benchmarking Virtualized Hadoop Clusters
WBDB 2014 Benchmarking Virtualized Hadoop Clusters
 
Soyez Big Data ready avec Isilon
Soyez Big Data ready avec IsilonSoyez Big Data ready avec Isilon
Soyez Big Data ready avec Isilon
 
7. emc isilon hdfs enterprise storage for hadoop
7. emc isilon hdfs   enterprise storage for hadoop7. emc isilon hdfs   enterprise storage for hadoop
7. emc isilon hdfs enterprise storage for hadoop
 
Emerging Big Data & Analytics Trends with Hadoop
Emerging Big Data & Analytics Trends with Hadoop Emerging Big Data & Analytics Trends with Hadoop
Emerging Big Data & Analytics Trends with Hadoop
 
EMC Hadoop Starter Kit
EMC Hadoop Starter KitEMC Hadoop Starter Kit
EMC Hadoop Starter Kit
 
EMC config Hadoop
EMC config HadoopEMC config Hadoop
EMC config Hadoop
 
Big data on virtualized infrastucture
Big data on virtualized infrastuctureBig data on virtualized infrastucture
Big data on virtualized infrastucture
 
Gartner IT Symposium 2014 - VMware Cloud Services
Gartner IT Symposium 2014 - VMware Cloud ServicesGartner IT Symposium 2014 - VMware Cloud Services
Gartner IT Symposium 2014 - VMware Cloud Services
 
VMworld - vSphere Distributed Switch 6.0 Technical Deep Dive
VMworld - vSphere Distributed Switch 6.0 Technical Deep DiveVMworld - vSphere Distributed Switch 6.0 Technical Deep Dive
VMworld - vSphere Distributed Switch 6.0 Technical Deep Dive
 
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...
Real World Application Orchestration Made Easy on VMware vCloud Air, vSphere ...
 
Cloud Management with vRealize Operations
Cloud Management with vRealize OperationsCloud Management with vRealize Operations
Cloud Management with vRealize Operations
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
 
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
 

Similar to VMworld 2013: Big Data Extensions: Advanced Features and Customer Case Study

VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld
 
Accelerating workloads and bursting data with Google Dataproc & Alluxio
Accelerating workloads and bursting data with Google Dataproc & AlluxioAccelerating workloads and bursting data with Google Dataproc & Alluxio
Accelerating workloads and bursting data with Google Dataproc & AlluxioAlluxio, Inc.
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanJim Kaskade
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the OrganizationSeeling Cheung
 
Big Data at Pinterest - Presented by Qubole
Big Data at Pinterest - Presented by QuboleBig Data at Pinterest - Presented by Qubole
Big Data at Pinterest - Presented by QuboleQubole
 
SQL PASS Taiwan 七月份聚會-1
SQL PASS Taiwan 七月份聚會-1SQL PASS Taiwan 七月份聚會-1
SQL PASS Taiwan 七月份聚會-1SQLPASSTW
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Hortonworks
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...DataWorks Summit
 
Sql server consolidation and virtualization
Sql server consolidation and virtualizationSql server consolidation and virtualization
Sql server consolidation and virtualizationIvan Donev
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseRizaldy Ignacio
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2Aswini Ashu
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2aswini pilli
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld
 
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Subbu Rama
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHortonworks
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Big SQL 3.0 - Fast and easy SQL on Hadoop
Big SQL 3.0 - Fast and easy SQL on HadoopBig SQL 3.0 - Fast and easy SQL on Hadoop
Big SQL 3.0 - Fast and easy SQL on HadoopWilfried Hoge
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 

Similar to VMworld 2013: Big Data Extensions: Advanced Features and Customer Case Study (20)

VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
VMworld 2013: Big Data Platform Building Blocks: Serengeti, Resource Manageme...
 
Accelerating workloads and bursting data with Google Dataproc & Alluxio
Accelerating workloads and bursting data with Google Dataproc & AlluxioAccelerating workloads and bursting data with Google Dataproc & Alluxio
Accelerating workloads and bursting data with Google Dataproc & Alluxio
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 
Big Data at Pinterest - Presented by Qubole
Big Data at Pinterest - Presented by QuboleBig Data at Pinterest - Presented by Qubole
Big Data at Pinterest - Presented by Qubole
 
SQL PASS Taiwan 七月份聚會-1
SQL PASS Taiwan 七月份聚會-1SQL PASS Taiwan 七月份聚會-1
SQL PASS Taiwan 七月份聚會-1
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
 
Sql server consolidation and virtualization
Sql server consolidation and virtualizationSql server consolidation and virtualization
Sql server consolidation and virtualization
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2
 
project--2 nd review_2
project--2 nd review_2project--2 nd review_2
project--2 nd review_2
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
 
Skilwise Big data
Skilwise Big dataSkilwise Big data
Skilwise Big data
 
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Big SQL 3.0 - Fast and easy SQL on Hadoop
Big SQL 3.0 - Fast and easy SQL on HadoopBig SQL 3.0 - Fast and easy SQL on Hadoop
Big SQL 3.0 - Fast and easy SQL on Hadoop
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 

More from VMworld

VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld
 
VMworld 2016: Troubleshooting 101 for Horizon
VMworld 2016: Troubleshooting 101 for HorizonVMworld 2016: Troubleshooting 101 for Horizon
VMworld 2016: Troubleshooting 101 for HorizonVMworld
 
VMworld 2016: Advanced Network Services with NSX
VMworld 2016: Advanced Network Services with NSXVMworld 2016: Advanced Network Services with NSX
VMworld 2016: Advanced Network Services with NSXVMworld
 
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld
 
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI Automation
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI AutomationVMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI Automation
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI AutomationVMworld
 
VMworld 2016: What's New with Horizon 7
VMworld 2016: What's New with Horizon 7VMworld 2016: What's New with Horizon 7
VMworld 2016: What's New with Horizon 7VMworld
 
VMworld 2016: Virtual Volumes Technical Deep Dive
VMworld 2016: Virtual Volumes Technical Deep DiveVMworld 2016: Virtual Volumes Technical Deep Dive
VMworld 2016: Virtual Volumes Technical Deep DiveVMworld
 
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...VMworld
 
VMworld 2016: The KISS of vRealize Operations!
VMworld 2016: The KISS of vRealize Operations! VMworld 2016: The KISS of vRealize Operations!
VMworld 2016: The KISS of vRealize Operations! VMworld
 
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...VMworld
 
VMworld 2016: Ask the vCenter Server Exerts Panel
VMworld 2016: Ask the vCenter Server Exerts PanelVMworld 2016: Ask the vCenter Server Exerts Panel
VMworld 2016: Ask the vCenter Server Exerts PanelVMworld
 
VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld
 
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...VMworld
 
VMworld 2015: Troubleshooting for vSphere 6
VMworld 2015: Troubleshooting for vSphere 6VMworld 2015: Troubleshooting for vSphere 6
VMworld 2015: Troubleshooting for vSphere 6VMworld
 
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...VMworld
 
VMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld
 
VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld
 
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld
 
VMworld 2015: Building a Business Case for Virtual SAN
VMworld 2015: Building a Business Case for Virtual SANVMworld 2015: Building a Business Case for Virtual SAN
VMworld 2015: Building a Business Case for Virtual SANVMworld
 
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld
 

More from VMworld (20)

VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep Dive
 
VMworld 2016: Troubleshooting 101 for Horizon
VMworld 2016: Troubleshooting 101 for HorizonVMworld 2016: Troubleshooting 101 for Horizon
VMworld 2016: Troubleshooting 101 for Horizon
 
VMworld 2016: Advanced Network Services with NSX
VMworld 2016: Advanced Network Services with NSXVMworld 2016: Advanced Network Services with NSX
VMworld 2016: Advanced Network Services with NSX
 
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
 
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI Automation
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI AutomationVMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI Automation
VMworld 2016: Enforcing a vSphere Cluster Design with PowerCLI Automation
 
VMworld 2016: What's New with Horizon 7
VMworld 2016: What's New with Horizon 7VMworld 2016: What's New with Horizon 7
VMworld 2016: What's New with Horizon 7
 
VMworld 2016: Virtual Volumes Technical Deep Dive
VMworld 2016: Virtual Volumes Technical Deep DiveVMworld 2016: Virtual Volumes Technical Deep Dive
VMworld 2016: Virtual Volumes Technical Deep Dive
 
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...
VMworld 2016: Advances in Remote Display Protocol Technology with VMware Blas...
 
VMworld 2016: The KISS of vRealize Operations!
VMworld 2016: The KISS of vRealize Operations! VMworld 2016: The KISS of vRealize Operations!
VMworld 2016: The KISS of vRealize Operations!
 
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...
VMworld 2016: Getting Started with PowerShell and PowerCLI for Your VMware En...
 
VMworld 2016: Ask the vCenter Server Exerts Panel
VMworld 2016: Ask the vCenter Server Exerts PanelVMworld 2016: Ask the vCenter Server Exerts Panel
VMworld 2016: Ask the vCenter Server Exerts Panel
 
VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way! VMworld 2016: Virtualize Active Directory, the Right Way!
VMworld 2016: Virtualize Active Directory, the Right Way!
 
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...
VMworld 2016: Migrating from a hardware based firewall to NSX to improve perf...
 
VMworld 2015: Troubleshooting for vSphere 6
VMworld 2015: Troubleshooting for vSphere 6VMworld 2015: Troubleshooting for vSphere 6
VMworld 2015: Troubleshooting for vSphere 6
 
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...
VMworld 2015: Monitoring and Managing Applications with vRealize Operations 6...
 
VMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphereVMworld 2015: Advanced SQL Server on vSphere
VMworld 2015: Advanced SQL Server on vSphere
 
VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!VMworld 2015: Virtualize Active Directory, the Right Way!
VMworld 2015: Virtualize Active Directory, the Right Way!
 
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
VMworld 2015: Site Recovery Manager and Policy Based DR Deep Dive with Engine...
 
VMworld 2015: Building a Business Case for Virtual SAN
VMworld 2015: Building a Business Case for Virtual SANVMworld 2015: Building a Business Case for Virtual SAN
VMworld 2015: Building a Business Case for Virtual SAN
 
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes ConfigurationsVMworld 2015: Explaining Advanced Virtual Volumes Configurations
VMworld 2015: Explaining Advanced Virtual Volumes Configurations
 

Recently uploaded

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Recently uploaded (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

VMworld 2013: Big Data Extensions: Advanced Features and Customer Case Study

  • 1. Big Data Extensions: Advanced Features and Customer Case Study Jayanth Gummaraju, VMware Sasha Kipervarg, Identified, Inc. VAPP5484 #VAPP5484
  • 2. 2 Data Is Exploding & Hadoop Is Driving Growth Unstructured data driving growth Hadoop adoption is ramping 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 Structured Unstructured Complex unstructured data forecastedtooutpace structured relationaldata by 10x by 2020 Evaluating 53% In- production 23% Piloting 18% Testing 2% Don't know 2% Other 2% Source: Forrester Survey of60 CIOs, September 2011 • Unstructured data explosion and Hadoop capabilities causing CIOs to reconsider Enterprise data strategy • Hadoop’s ability to process raw data at cost presents intriguing value proposition
  • 3. 3 Agenda  Big Data Extensions Overview  Virtualized Hadoop at Identified Inc.  Advanced Features
  • 4. 4 Questions for Audience  Familiarity with Hadoop 1. New to Hadoop 2. Reasonably familiar 3. Expert  Hadoop cluster sizes 1. < 10 nodes 2. 10-50 nodes 3. > 50 nodes  Virtualizing Hadoop 1. Never virtualized 2. Actively exploring virtualization 3. Running virtualized Hadoop in test-dev/production
  • 5. 5 Big Data on vSphere: Value Proposition  Basic Features • Fast provisioning • Minutes/hours instead of days • Workload Consolidation • Multiple virtual clusters co-exist on same physical hardware • High Availability • Not limited to NameNode, JobTracker  Advanced Features • Auto-elasticity • High Resource Utilization • True multi-tenancy • VM-grade security, performance, and configuration isolation
  • 6. 6 Serengeti vSphere Resource Management Hadoop Virtualization Extensions vSphere Big Data Extensions: Program Highlights  Open source project  Tool to simplify virtualized Hadoop deployment & operations Serengeti  Virtualization changes for core Hadoop  Contributed back to Apache Hadoop  Advanced resource management on vSphere  Big Data applications-specific extension to DRS
  • 7. 7 What is Hadoop? Distributed processing of large data sets across clusters of computers Source: http://architects.dzone.com/articles/how-hadoop-mapreduce-works map() reduce() InputData OutputData Split [k1, v1] Sort by k1 Merge [k1, [v1, v2, v3,…]] map() map() reduce()
  • 8. 8 Slave Node 1 Slave Node 2 Slave Node 3 Input File Tasks Are Scheduled Where Data Resides JobTrackerJob DataNode TaskTracker Split 1 – 64MB Task - 1 Split 2 – 64MB Split 3 – 64MB TaskTracker TaskTracker DataNode DataNode Block 1 – 64MB Block 2 – 64MB Block 3 – 64MB Task - 2 Task - 3 NameNode
  • 9. 9 Myth: Virtual Performance Is Sub-optimal [http://www.vmware.com/resources/techresources/10360, Jeff Buell, Apr 2013] (lower is better) 32 hosts/3.6GHz 8 cores/15K RPM 146GB SAS disks/10GbE/72-96GB RAM
  • 10. 10 Agenda  Big Data Extensions Overview  Virtualized Hadoop at Identified Inc.  Advanced Features
  • 11. 11 Agenda  Big Data Extensions Overview  Virtualized Hadoop at Identified Inc.  Advanced Features
  • 12. 12 Compute-Data Separation Combined Storage/ Compute VM Hadoop in VM • VM lifecycle determined by Datanode • Limited elasticity • Limited to Hadoop Multi-Tenancy Storage Compute VM VM Separate Storage • Separate compute from data • Elastic compute • Enable shared workloads • Raise utilization Storage T1 T2 VM VM VM Separate Compute Tenants • Separate virtual clusters per tenant • Stronger VM-grade security and resource isolation • Enable deployment of multiple Hadoop runtime versions Slave Node
  • 13. 13 Dataflow with Separated Compute/Data Virtual Hadoop Node Virtual Hadoop Node ESX Host Virtual Hadoop Node VMDK DataNode Virtual Hadoop Node TaskTracker Slot Slot Virtual Switch Virtual NIC Virtual NIC NIC Drivers
  • 14. 14 Elastic Scalability & Multi-Tenancy  Deploy separate compute clusters for different tenants sharing HDFS.  According to priority and available resources, power-on/off compute VMs ExperimentationDynamic resourcepool Data layer Production recommendation engine Compute layer Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Compute VM Experimentation Production Compute VM Job Tracker Job Tracker VMware vSphere + Big Data Extensions
  • 15. 15 Auto-elastic Hadoop in Action ESX ESX ESX J T DATA VM DATA VM DATA VM Local Disks SAN/NAS Non-Hadoop VMs Hadoop Compute VMs JT: JobTracker TT: TaskTracker NN: NameNode VHM: Virtual Hadoop Manager N N T T T T T T VirtualCenter Management Server DRS DRS DRSDRS DRS VHM Hadoop HDFS VMs T T T T T T J T
  • 16. 16 Advanced Resource Management using Virtual Hadoop Manager State, stats (Slots used, Pending work) Commands (Decommission, Recommission) Stats and VM configuration Serengeti Job Tracker vCenter DB Manual/Auto Power on/off Virtual Hadoop Manager (VHM) Job Tracker Task Tracker Task Tracker Task Tracker vCenter Server VC actions Hadoop actions Serengeti Configuration VC state and stats Hadoop state and stats Auto-Scaling Algorithms Cluster Configuration
  • 17. 17 Auto-Scaling Algorithms: 5 Key Insights ① Expand or Shrink clusters based on ambient data • Expand when there is work and no imminent contention • Shrink when there is contention • Predictable scaling for matching customer expectation, ease of testing, etc. ② Use contention detection as an input to scaling response • Contention reflects user's resource control settings and workload demands ③ Act as an extension to DRS for distributed applications spanning multiple VMs • A glue between DRS and Application-scheduler • Penalize few VMs heavily rather than all VMs lightly/uniformly ④ React only if there is true contention and in a timely manner • Actively used resources are deprived • Do not react to transients ⑤ Use Hysteresis and Control Theory concepts to guide decisions • E.g., transient windows and thresholds, feedback from previous actions, etc.
  • 18. 18 Shrinking-related Metrics  CPU is being deprived • VC metric: CPU Ready • Time that vCPU is ready to run, but cannot be scheduled on a pCPU  Memory is being deprived • VC metrics: • Usage: Active Memory, Granted Memory • Reclamation: Memory Ballooning, Host Swap • Typically starts with ballooning then leads to host swapping  TaskTracker is dead or faulty • Hadoop metrics: Alive Nodes and Task Failures
  • 19. 19 Expansion-related Metrics  Jobs are present • Hadoop metrics: jobs_preparing, jobs_running  High slot usage • Hadoop metrics: map_slots_used, max_map_slots, reduce_slots_used, max_reduce_slots  High task throughput • Hadoop metrics: maps_completed, reduces_completed  No imminent contention • VC metrics: CPU Ready, Memory Ballooning
  • 21. 21 What’s Next?  Resource management enhancements • Algorithmic optimizations • Contention metrics related to Disk/Network IO  Auto-elasticity support for YARN and HBase • YARN – Hadoop 2.x • HBase – Hadoop database  Serengeti enhancements • Support for additional Hadoop distros  Hadoop extensions • Dynamic resource configuration
  • 22. 22 Main Takeaways  Value proposition • Fast provisioning • Workload consolidation • Elasticity  better resource utilization • Multi-tenancy using VMs  differentiated service  Key technologies • Serengeti • Advanced Resource Management • Hadoop Virtual Extensions Host Host Host vSphere Platform Make vSphere the platform of choice for running Big Data
  • 23. 23 Questions?  Contact information • Jayanth Gummaraju jgummaraju@vmware.com • Sasha Kipervarg sasha@identified.com  Other related sessions • Breakout session (VAPP5402, VAPP5762) • Big Data Panel (VAPP5626) • Hands-on lab (HOL-SDC-1309)  For more information (including download information) • vSphere Big Data Extensions http://www.vmware.com/hadoop • Project Serengeti http://www.projectserengeti.org
  • 25.
  • 26. Big Data Extensions: Advanced Features and Customer Case Study Jayanth Gummaraju, VMware Sasha Kipervarg, Identified, Inc. VAPP5484 #VAPP5484