© Hortonworks Inc. 2013
Hadoop meet OpenStack
Himanshu Bari, Hortonworks
Ilya Elterman, Mirantis
John Speidel, Hortonworks
June 26th, 2013

Disclaimer
• This document may contain product features and technology directions
that are under development or may be under development in the future.
• Technical feasibility, market demand, user feedback, and the Apache
Software Foundation community development process can all affect
timing and final delivery.
• This document’s description of these features and technology
directions does not represent a contractual commitment from
Hortonworks to deliver these features in any generally available
product.
• Product features and technology directions are subject to change, and
must not be included in contracts, purchase orders, or sales
agreements of any kind.

Agenda
• Why Hadoop on OpenStack
• Savanna controller deep dive
• Hortonworks OpenStack plugin
• Demo

Why Hadoop & OpenStack?
Hadoop provides a greenfield use case
• Net new workload
• Needs scale-out infrastructure
• Shared platform
OpenStack provides the perfect cloud platform
• Operational agility
• Supports scale-out architecture
• Deployment choice across public, private, and hybrid clouds
Marries two of the largest open source movements
1. Open source communities provide the fastest path to innovation
2. Open source is changing the game as economics and accessibility serve to accelerate cloud & big data market trends
3. Both are attracting major ecosystem players: IBM, RHT, HP, RAX, etc.

Project Savanna to accelerate integration
Project Savanna: automate deployment of Apache Hadoop on OpenStack
[Diagram: OpenStack Infrastructure containing Savanna (Elastic Hadoop Controller), Swift storage, Ambari (Hadoop management), and a Hadoop cluster of nodes with add (+) and remove (-) controls; callouts 1 to 3 are listed below]
1. Cluster templates: deploy preconfigured Hadoop clusters in seconds from Horizon or Ambari
2. HDFS-Swift connectors: move data between HDFS and Swift object storage
3. Simplified elasticity

Provisioning
Benefits/Use cases
- Frequent dev/test/staging cluster provision requests
- Migrations from staging to prod and vice versa
- Reduce operator error in cluster provisioning
- Migrate away from Amazon EMR for ad hoc analytics requests for experimentation
Phase-1 features
- Cluster and node level templates for self-provisioning
- Template operations like save/import
- Move data between HDFS & Swift object store
Phase-2 features
- Job flow based cluster provisioning

Elasticity
Benefits/Use cases
- Commission a new node or decommission a node for maintenance
- For dev/test/staging clusters: automatically vary cluster data & compute capacity based on tenant, workload, time of day, resource utilization, etc.
- Automatically vary compute capacity only for production clusters
Phase-1 features
- Hadoop cluster node add/remove from OpenStack
- Cluster operations like destroy cluster fired from OpenStack
Phase-2 features
- Rule based cluster node elasticity

Multi-tenancy
Benefits/Use cases
- Common infrastructure for Hadoop and non-Hadoop workloads
- Simplify maintenance through version isolation
- Resource isolation to support varying SLAs based on tenant and workload
- Simplify chargeback/showback
Phase-1 features
- Hadoop virtualization extensions support
- Ability to pin VMs to group of physical hosts
- Keystone integration with Ambari
- One Ambari instance per tenant
Phase-2 features
- Keystone enhancements to support job flow to tenant mapping

Agenda
• Why Hadoop on OpenStack
• Savanna controller deep dive
• Hortonworks OpenStack plugin
• Demo

OpenStack - cloud management platform
• Apache License
• Integrated
• Multi-hypervisor & guest OS support
Components
• Horizon
• Nova
• Neutron
• Glance: Image Service
• Keystone: Identity Service
• Cinder: Block Store
• Swift: Object Store
• Ceilometer: Metering
• Heat: Orchestration
• Savanna: Hadoop

Project Savanna logical architecture
[Diagram components]
• Horizon + Savanna UI
• API
• Savanna Controller: Hadoop provisioning, configuration templates, configuration, elasticity, orchestration, on-demand jobs execution
• Plugin manager
• Hortonworks OpenStack plugin
• Hadoop Cluster: Ambari + API
• OpenStack Infrastructure: network, storage, security, compute

Savanna Architecture
[Diagram components]
• Horizon with Savanna pages
• Savanna Python client
• Savanna: REST API, Auth, Cluster Configuration Manager, DAL, Provisioning Plugin, VM Manager, Image Registry
• OpenStack services: Keystone, Nova, Glance, Swift
• Hadoop VMs

Savanna key features
• Node group and cluster templates
• Cluster scaling (add/remove nodes)
• Hadoop cluster topology configuration parameters
–Data node anti-affinity
–HDFS location
–Swift integration
• Plugin mechanism for integration with different Hadoop distributions
• Plugin implementations
–Hortonworks Data Platform OpenStack plugin (uses Apache Ambari)
–Vanilla Apache Hadoop (no Apache Ambari): reference implementation with prebuilt image

HDFS reliability on VMs
[Diagram: Compute hosts, DataNode (DN) VMs, and a Data Block]

Data node anti-affinity
[Diagram: Cluster A and Cluster B spread across Compute hosts running DataNode (DN) and TaskTracker (TT) VMs]
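Anti-affinity here means that DataNode VMs of one cluster are kept off the same compute host, so that losing a single host does not take several HDFS replicas with it. A minimal sketch of that placement rule follows. The helper name, host names, and VM names are hypothetical, and the one-to-one pairing is an assumed strategy for illustration, not Savanna's actual scheduling logic.

```python
from typing import Dict, List


def place_datanodes(datanodes: List[str], hosts: List[str]) -> Dict[str, str]:
    """Assign each DataNode VM to a distinct compute host.

    Hypothetical helper for illustration: raises ValueError when there are
    more DataNodes than hosts, because anti-affinity cannot be satisfied.
    """
    if len(datanodes) > len(hosts):
        raise ValueError("anti-affinity needs at least one host per DataNode")
    # zip pairs the i-th DataNode with the i-th host, so no host is reused.
    return dict(zip(datanodes, hosts))


placement = place_datanodes(
    ["dn-1", "dn-2", "dn-3"],
    ["compute-1", "compute-2", "compute-3", "compute-4"],
)
```

Failing loudly when hosts run out is a deliberate choice in this sketch: silently doubling up DataNodes on one host would defeat the purpose of the rule.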

HADOOP-8545: Swift for Hadoop
[Diagram: Hadoop Job #1 through Hadoop Job #N, Swift, and local HDFS]

HDFS placement options
• Ephemeral drive
/var/lib/nova/instances/instance-xxx/disk -> /mnt/ephemeral
• Block storage volume
Cinder Volume -> /mnt/volume
• Bare drive support
/dev/sdb -> /mnt/sdb

Savanna key features
• API to execute MapReduce jobs without exposing details of underlying infrastructure (similar to AWS EMR)
• User-friendly UI for ad hoc analytics queries based on Hive or Pig
• Network configuration support, integration with Neutron (OpenStack Networking, earlier Quantum)

Agenda
• Why Hadoop on OpenStack
• Savanna controller deep dive
• Hortonworks OpenStack plugin
• Demo

Hortonworks Data Platform OpenStack plugin
• Provision HDP cluster using Ambari
• Supports generic or pre-packaged VM images
• Supports standard Savanna configuration templates
• Supports Ambari templates (aka blueprints) configuration
–https://issues.apache.org/jira/browse/AMBARI-1783

HDP OpenStack plugin and Ambari
• Ambari services installed on cluster hosts
–Ambari Server and Ambari Agent
• HDP plugin uses Ambari REST API
–Define cluster topology
–Configure Hadoop services
–Install Hadoop services on all VMs
–Start Hadoop Services
• Monitor and Manage cluster with Ambari
–Ambari UI
–Ambari REST API

Red Hat RDO
RDO is a freely available, community-supported distribution of OpenStack, packaged and integrated for Red Hat Enterprise Linux and its clones, and for Fedora
http://openstack.redhat.com

Apache Ambari templates (aka blueprints)
Preconfigured information across all clusters using this template
• HDP Stack Information
- Services & Components & Packages
- Description
- Package Dependencies
• Hadoop Topology
- Component / Host Group Mapping
• Hadoop Configuration
- All Hadoop configuration for the cluster (hundreds of parameters and their values)
Per cluster pluggable data
- User names
- Passwords
- Host names
- Host VM flavors (CPU/Mem)
- Node count per host group
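The split between shared template data and per-cluster pluggable data can be sketched as two Python dictionaries that are combined at provisioning time. Every key and value below is an illustrative assumption: the blueprint format was still being worked out under AMBARI-1783, so this is not the Ambari schema.

```python
import copy

# Shared across all clusters created from the template (illustrative keys).
template = {
    "stack": {"name": "HDP", "services": ["HDFS", "MAPREDUCE"]},
    "host_groups": {
        "master": ["NAMENODE", "JOBTRACKER"],
        "slave": ["DATANODE", "TASKTRACKER"],
    },
    "configuration": {"dfs.replication": "3"},
}

# Supplied per cluster (hypothetical values).
per_cluster = {
    "flavors": {"master": "m1.large", "slave": "m1.medium"},
    "node_counts": {"master": 1, "slave": 2},
}


def instantiate(template: dict, per_cluster: dict) -> dict:
    """Combine a shared template with per-cluster data into one cluster spec."""
    spec = copy.deepcopy(template)  # the shared template is left untouched
    spec["host_groups"] = {
        group: {
            "components": list(components),  # copy so specs do not share lists
            "flavor": per_cluster["flavors"][group],
            "count": per_cluster["node_counts"][group],
        }
        for group, components in template["host_groups"].items()
    }
    return spec


cluster_spec = instantiate(template, per_cluster)
```

Keeping the template immutable is what lets many clusters reuse it while differing only in flavors and node counts.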

Demo
• Provision a Hadoop cluster on OpenStack
–Savanna with OpenStack UI extensions
–HDP Plugin
–Ambari Templates
–HDP stack including metrics and alerts
• Monitor cluster using Ambari UI

Specify topology and configure
• Node Group templates
–Specify host/component mappings
–Specify VM flavor
–Specify node scoped configurations
• Cluster templates
–Specify node groups
–Specify VM image
–Specify cluster scoped configurations
• Upload templates (aka Ambari blueprints)
–Specifies topology and configuration
–Used to create Cluster Template

Savanna Controller: Provision VMs
• Savanna provisions OpenStack VMs based on configured Node Groups
[Diagram: Savanna and OpenStack; node groups Master: 1 and Slave: 2 result in Master VM, Slave VM 1, and Slave VM 2]
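The expansion from node groups to individual VMs can be sketched as follows. The function and the VM naming scheme are hypothetical; a real controller would issue a compute request for each entry instead of just returning names.

```python
from typing import Dict, List


def expand_node_groups(cluster: str, node_groups: Dict[str, int]) -> List[str]:
    """Return one VM name per requested instance in each node group."""
    vms = []
    for group, count in node_groups.items():
        for index in range(1, count + 1):
            vms.append(f"{cluster}-{group}-{index}")
    return vms


# Node groups from the slide: one master and two slaves.
vm_names = expand_node_groups("demo", {"master": 1, "slave": 2})
print(vm_names)  # ['demo-master-1', 'demo-slave-1', 'demo-slave-2']
```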

HDP OpenStack Plugin: Install Ambari
• HDP plugin remotely installs Ambari services
• Ambari is installed from public/private repo
• Ambari Agents register with Ambari Server
[Diagram: Savanna, HDP Plugin, and OpenStack; Master VM with Ambari Server, Ambari DB, and an Ambari Agent; Slave VM 1 and Slave VM 2 each with an Ambari Agent]

HDP OpenStack Plugin: Define topology and configure
• HDP plugin specifies cluster topology and configuration via Ambari REST API
• Ambari stores topology and configuration in its DB
• Ambari is installed from public/private repo
• Ambari Agents register with Ambari Server
[Diagram: Savanna, HDP Plugin, and OpenStack; the plugin calls the REST API of the Ambari Server; Master VM with Ambari Server, Ambari DB, and an Ambari Agent; Slave VM 1 and Slave VM 2 each with an Ambari Agent]

HDP OpenStack Plugin: Install Hadoop Services
• HDP plugin sets state of all services to INSTALLED via Ambari REST API
• Ambari installs services on each host from public/private HDP repos
• Ambari pushes configurations to each host
• Service installation is asynchronous
• HDP Plugin polls install status via Ambari REST API
[Diagram: Savanna, HDP Plugin, and OpenStack; Master VM with Ambari Server, Ambari DB, REST API, an Ambari Agent, NameNode (NN), and JobTracker (JT); Slave VM 1 and Slave VM 2 each with an Ambari Agent, DataNode (DN), and TaskTracker (TT)]

HDP OpenStack Plugin: Start Services
• HDP plugin sets state of all services to STARTED via Ambari REST API
• Ambari starts all services on all hosts
• Service start is asynchronous
• HDP Plugin polls start status via Ambari REST API
[Diagram: Savanna, HDP Plugin, and OpenStack; Master VM with Ambari Server, Ambari DB, REST API, an Ambari Agent, NameNode (NN), and JobTracker (JT); Slave VM 1 and Slave VM 2 each with an Ambari Agent, DataNode (DN), and TaskTracker (TT)]
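The install and start steps share one pattern: request a target state for all services, then poll because the operation is asynchronous. A minimal sketch with the Python standard library follows. The host and cluster names are placeholders, and the request path, payload shape, and terminal status names are assumptions to check against the Ambari REST API documentation; this is not the HDP plugin's code.

```python
import json
import time
import urllib.request
from typing import Callable


def build_state_request(base_url: str, cluster: str, state: str) -> urllib.request.Request:
    """Build (but do not send) a PUT asking Ambari to move all services to `state`."""
    # Assumed payload shape; verify against the Ambari API docs before use.
    body = json.dumps({"ServiceInfo": {"state": state}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/clusters/{cluster}/services",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def poll_until_done(fetch_status: Callable[[], str], interval: float = 5.0,
                    max_polls: int = 120) -> str:
    """Call fetch_status repeatedly until it reports a terminal status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in ("COMPLETED", "FAILED"):  # assumed terminal status names
            return status
        time.sleep(interval)
    raise TimeoutError("operation did not finish in time")


install = build_state_request("http://ambarihost:8080", "demo", "INSTALLED")
start = build_state_request("http://ambarihost:8080", "demo", "STARTED")
```

Passing the status lookup in as a callable keeps the polling loop independent of how the status is fetched, which also makes it easy to exercise without a running Ambari Server.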

Apache Ambari: Monitor Cluster
• Use Ambari UI to monitor the cluster
–Use hostname where Ambari Server is running
–Default port is 8080
• Use Ambari REST API to monitor the cluster
–Use hostname where Ambari Server is running
–Default port is 8080
–‘clusters’ is root resource
–Example URL
– http://ambarihost:8080/api/v1/clusters
–REST API Documentation
– https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
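A minimal sketch of reading the 'clusters' root resource with the Python standard library. The host name is taken from the example URL on this slide; the admin/admin credentials are a placeholder assumption, and actually sending the request requires a reachable Ambari Server.

```python
import base64
import json
import urllib.request


def clusters_request(host: str, user: str, password: str,
                     port: int = 8080) -> urllib.request.Request:
    """Build a GET request for the Ambari 'clusters' root resource with basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"http://{host}:{port}/api/v1/clusters",
        headers={"Authorization": f"Basic {token}"},
    )


def list_clusters(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = clusters_request("ambarihost", "admin", "admin")
```

Calling `list_clusters(request)` against a live server would return the parsed JSON document for the cluster list.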

Summary
• OpenStack provides operational agility and deployment choice
• Hadoop is a net new workload and a perfect app for OpenStack
• Integration marries two of the largest open source movements
– Community-driven innovation outpaces any single vendor
– Both are attracting major ecosystem players: IBM, RHT, HP, RAX, etc.
Project Savanna: automate deployment of Apache Hadoop on OpenStack

Learn More & Get Involved!
Download Hortonworks Data Platform
www.hortonworks.com/download
Follow
@hortonworks
Project Savanna:
https://launchpad.net/Savanna
https://wiki.openstack.org/wiki/Savanna/HowToParticipate
Email questions to:
hbari@hortonworks.com
ielterman@mirantis.com

 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

Hello OpenStack, Meet Hadoop

  • 1. © Hortonworks Inc. 2013 Hadoop meet OpenStack Himanshu Bari, Hortonworks Ilya Elterman, Mirantis John Spiedel, Hortonworks June 26th, 2013
  • 2. Disclaimer • This document may contain product features and technology directions that are under development or may be under development in the future. • Technical feasibility, market demand, user feedback, and the Apache Software Foundation community development process can all affect timing and final delivery. • This document’s description of these features and technology directions does not represent a contractual commitment from Hortonworks to deliver these features in any generally available product. • Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
  • 3. Agenda Why Hadoop on OpenStack Savanna controller deep dive Hortonworks OpenStack plugin DEMO
  • 4. Why Hadoop & OpenStack? Hadoop provides a greenfield use case • Net new workload • Needs scale out infrastructure • Shared platform OpenStack provides the perfect cloud platform • Operational agility • Supports scale out architecture • Deployment choice across public, private, and hybrid clouds 1. Open source communities provide the fastest path to innovation 2. Open source is changing the game as economics and accessibility serve to accelerate cloud & big data market trends 3. Both are attracting major ecosystem players: IBM, RHT, HP, RAX, etc… Marries two of the largest open source movements
  • 5. Project Savanna to accelerate integration [Diagram labels: OpenStack Infrastructure; Savanna, Elastic Hadoop Controller; Swift storage; Hadoop Cluster; Ambari, Hadoop management] 1. Cluster templates: deploy preconfigured Hadoop clusters in seconds from Horizon or Ambari 2. HDFS-Swift connectors: move data between HDFS and Swift object storage 3. Simplified elasticity Project Savanna: Automate deployment of Apache Hadoop on OpenStack
  • 6. Provisioning Phase-1 features - Frequent dev/test/staging cluster provision requests - Migrations from staging to prod and vice versa - Reduce operator error in cluster provisioning - Migrate away from Amazon EMR for Ad hoc analytics requests for experimentation - Cluster and node level templates for self-provisioning - Template operations like save/import - Move data between HDFS & Swift object store Job flow based cluster provisioning Phase-2 features Benefits/Use cases
  • 7. Elasticity Phase-1 features - Commission a new node or decommission a node for maintenance - For dev/test/staging clusters: automatically vary cluster data & compute capacity based on tenant, workload, time of day, resource utilization etc. - Automatically vary compute capacity only for production clusters - Hadoop cluster node add/remove from OpenStack - Cluster operations like destroy cluster fired from OpenStack Rule based cluster node elasticity Phase-2 features Benefits/Use cases
  • 8. Multi-tenancy Phase-1 features - Common infrastructure for Hadoop and non Hadoop workloads - Simplify maintenance through version isolation - Resource isolation to support varying SLAs based on tenant and workload - Simplify chargeback/showback - Hadoop virtualization extensions support - Ability to pin VMs to group of physical hosts - Keystone integration with Ambari - One Ambari instance per tenant - Keystone enhancements to support Job flow to tenant mapping Phase-2 features Benefits/Use cases
  • 9. Agenda Why Hadoop on OpenStack Savanna controller deep dive Hortonworks OpenStack plugin DEMO
  • 10. OpenStack - cloud management platform Glance Image Service Keystone Identity Service Horizon Neutron Nova Cinder Block Store Swift Object Store (Apache License) Ceilometer Metering Heat Orchestration Integrated Multi-hypervisor & guest OS support Savanna Hadoop
  • 11. Project Savanna logical architecture OpenStack Infrastructure Network Storage Security Compute Savanna Controller Hortonworks OpenStack plugin API Hadoop Provisioning Configuration Templates Horizon + Savanna UI API Configuration Elasticity Orchestration On-demand jobs execution Hadoop Cluster Ambari + API Plugin manager
  • 12. Savanna Architecture Savanna Python Client REST API Cluster Configuration Manager Horizon Keystone Auth DAL Nova Glance Swift Savanna Pages Hadoop VM Provisioning Plugin Hadoop VM Hadoop VM Hadoop VM VM Manager Image Registry
  • 13. Savanna key features • Node group and cluster templates • Cluster scaling (add/remove nodes) • Hadoop cluster topology configuration parameters –Data node anti-affinity –HDFS location –Swift integration • Plugin mechanism for integration with different Hadoop distributions • Plugin implementations –Hortonworks Data Platform OpenStack plugin (uses Apache Ambari) –Vanilla Apache Hadoop (no Apache Ambari) – reference implementation with pre-built image
  • 14. HDFS reliability on VMs [Diagram labels: Compute, DN, Data Block]
  • 15. Data node anti-affinity [Diagram labels: DN, TT, Compute, Cluster A, Cluster B]
  • 16. HADOOP-8545: Swift for Hadoop Swift Hadoop Job #1 Local HDFS Hadoop Job #2 ... Hadoop Job #N
  • 17. HDFS placement options • Ephemeral drive /var/lib/nova/instances/instance-xxx/disk -> /mnt/ephemeral • Block storage volume Cinder Volume -> /mnt/volume • Bare drive support /dev/sdb -> /mnt/sdb
  • 18. Savanna key features • API to execute Map/Reduce jobs without exposing details of underlying infrastructure (similar to AWS EMR) • User-friendly UI for ad-hoc analytics queries based on Hive or Pig • Network configuration support, integration with Neutron (OpenStack Networking, earlier Quantum)
  • 19. Agenda Why Hadoop on OpenStack Savanna Controller deep dive Hortonworks OpenStack plugin DEMO
  • 20. Hortonworks Data Platform OpenStack plugin • Provision HDP cluster using Ambari • Supports generic or pre-packaged VM images • Supports standard Savanna configuration templates • Supports Ambari templates (aka blueprints) configuration –https://issues.apache.org/jira/browse/AMBARI-1783
  • 21. HDP OpenStack plugin and Ambari • Ambari services installed on cluster hosts –Ambari Server and Ambari Agent • HDP plugin uses Ambari REST API –Define cluster topology –Configure Hadoop services –Install Hadoop services on all VMs –Start Hadoop Services • Monitor and Manage cluster with Ambari –Ambari UI –Ambari REST API
  • 22. Red Hat RDO RDO is a freely-available, community supported distribution of OpenStack, packaged and integrated for Red Hat Enterprise Linux and its clones, and for Fedora http://openstack.redhat.com
  • 23. Apache Ambari templates (aka blueprints) Preconfigured information across all clusters using this template HDP Stack Information - Services & Components & Packages - Description - Package Dependencies Hadoop Topology Component / Host Group Mapping Hadoop Configuration All Hadoop Configuration for the Cluster (hundreds of parameters and their values) Per cluster pluggable data - User names - Passwords - Host names - Host VM flavors (CPU/Mem) - Node count per host group
  • 24. Demo • Provision a Hadoop cluster on OpenStack –Savanna with OpenStack UI extensions –HDP Plugin –Ambari Templates –HDP stack including metrics and alerts • Monitor cluster using Ambari UI
  • 25. Specify topology and configure • Node Group templates –Specify host/component mappings –Specify VM flavor –Specify node scoped configurations • Cluster templates –Specify node groups –Specify VM image –Specify cluster scoped configurations • Upload templates (aka Ambari blueprints) –Specifies topology and configuration –Used to create Cluster Template
  • 26. Savanna Controller: Provision VMs Master VM Slave VM 1 Node Groups Master: 1 Slave: 2 Slave VM 2 Savanna OpenStack • Savanna provisions OpenStack VMs based on configured Node Groups
  • 27. HDP OpenStack Plugin: Install Ambari Slave VM 1 Slave VM 2 Savanna OpenStack HDP Plugin Ambari Server Ambari Agent Ambari Agent • HDP plugin remotely installs Ambari services • Ambari is installed from public/private repo • Ambari Agents register with Ambari Server Ambari DB Master VM Ambari Agent
  • 28. HDP OpenStack Plugin: Define topology and configure Savanna HDP Plugin • HDP plugin specifies cluster topology and configuration via Ambari REST API • Ambari stores topology and configuration in its DB • Ambari is installed from public/private repo • Ambari Agents register with Ambari Server Slave VM 1 Slave VM 2 OpenStack Ambari Agent Ambari Agent Ambari Server Ambari DB Master VM Ambari Agent REST API
  • 29. HDP OpenStack Plugin: Install Hadoop Services Savanna HDP Plugin • HDP plugin sets state of all services to INSTALLED via Ambari REST API • Ambari installs services on each host from public/private HDP repos • Ambari pushes configurations to each host • Service installation is asynchronous • HDP Plugin polls install status via Ambari REST API Slave VM 1 Slave VM 2 OpenStack Ambari Server Ambari Agent Ambari Agent Ambari DB REST API DN TT DN TT N N JT Master VM Ambari Agent
  • 30. HDP OpenStack Plugin: Start Services Savanna HDP Plugin • HDP plugin sets state of all services to STARTED via Ambari REST API • Ambari starts all services on all hosts • Service start is asynchronous • HDP Plugin polls start status via Ambari REST API Slave VM 1 Slave VM 2 OpenStack Ambari Server Ambari Agent Ambari Agent Ambari DB REST API Master VM Ambari Agent N N JT DN TT DN TT
  • 31. Apache Ambari: Monitor Cluster • Use Ambari UI to monitor the cluster –Use hostname where Ambari Server is running –Default port is 8080 • Use Ambari REST API to monitor the cluster –Use hostname where Ambari Server is running –Default port is 8080 –‘clusters’ is root resource –Example URL – http://ambarihost:8080/api/v1/clusters –REST API Documentation – https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
  • 32. • OpenStack provides operational agility and deployment choice • Hadoop is a net new workload and a perfect app for OpenStack • Integration marries two of the Largest Open Source Movements – Community-driven innovation outpaces any single vendor – Both are attracting major ecosystem players: IBM, RHT, HP, RAX, etc… Summary Project Savanna Automate deployment of Apache Hadoop on OpenStack
  • 33. Learn More & Get Involved! Download Hortonworks Data Platform www.hortonworks.com/download Follow… @hortonworks Email questions to: Project Savanna: https://launchpad.net/Savanna https://wiki.openstack.org/wiki/Savanna/HowToParticipate hbari@hortonworks.com ielterman@mirantis.com
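
Slides 29 to 31 describe a request-then-poll pattern: the HDP plugin asks Ambari over REST to move services to INSTALLED or STARTED, and because the work is asynchronous it then polls for status. The Python sketch below illustrates that pattern only; it is not the plugin's code. The host name, port and `clusters` root resource come from slide 31. The JSON body shape, the terminal status names and the `fetch_status` callable are assumptions made for illustration and should be checked against the Ambari API documentation.

```python
import json
import time
from typing import Callable

AMBARI_PORT = 8080  # default port stated on slide 31


def clusters_url(host: str, port: int = AMBARI_PORT) -> str:
    """Return the root 'clusters' resource of the Ambari REST API (slide 31)."""
    return f"http://{host}:{port}/api/v1/clusters"


def service_state_body(state: str) -> str:
    """Build a JSON body asking for services to reach a target state.

    ASSUMPTION: the {"ServiceInfo": {"state": ...}} shape is illustrative;
    verify it against the Ambari API documentation before relying on it.
    """
    if state not in ("INSTALLED", "STARTED"):
        raise ValueError(f"unsupported target state: {state}")
    return json.dumps({"ServiceInfo": {"state": state}})


def wait_until_done(fetch_status: Callable[[], str],
                    interval_s: float = 5.0,
                    max_polls: int = 120,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll an asynchronous operation until it reports a terminal status.

    fetch_status is a hypothetical callable returning the current status
    string; a real caller would issue a GET against the Ambari REST API.
    The terminal status names are placeholders, not Ambari's contract.
    """
    terminal = {"COMPLETED", "FAILED"}
    for _ in range(max_polls):
        status = fetch_status()
        if status in terminal:
            return status
        sleep(interval_s)  # the operation is asynchronous, so wait and retry
    raise TimeoutError("operation did not finish within the polling budget")
```

Passing the status lookup in as a callable keeps the polling loop independent of any HTTP library and makes it easy to exercise with canned responses, for example `wait_until_done(lambda: "COMPLETED")`.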

Editor's notes

  1. Mirantis
  2. Note: Not recommended: one cluster having multiple data nodes on the same hypervisor node. Allowed: multiple clusters each having a data node on the same hypervisor node. Allowed: one data node and multiple compute nodes per hypervisor.
  3. Object Store (codenamed "Swift") provides object storage. It allows you to store or retrieve files (but not mount directories like a fileserver). Several companies provide commercial storage services based on Swift. These include KT, Rackspace (from which Swift originated) and Internap. Swift is also used internally at many large companies to store their data.
     Image (codenamed "Glance") provides a catalog and repository for virtual disk images. These disk images are most commonly used in OpenStack Compute. While this service is technically optional, any cloud of size will require it.
     Compute (codenamed "Nova") provides virtual servers upon demand. Rackspace and HP provide commercial compute services built on Nova and it is used internally at companies like Mercado Libre and NASA (where it originated).
     Dashboard (codenamed "Horizon") provides a modular web-based user interface for all the OpenStack services. With this web GUI, you can perform most operations on your cloud like launching an instance, assigning IP addresses and setting access controls.
     Identity (codenamed "Keystone") provides authentication and authorization for all the OpenStack services. It also provides a service catalog of services within a particular OpenStack cloud.
     Network (codenamed "Quantum") provides "network connectivity as a service" between interface devices managed by other OpenStack services (most likely Nova). The service works by allowing users to create their own networks and then attach interfaces to them. Quantum has a pluggable architecture to support many popular networking vendors and technologies.
     Block Storage (codenamed "Cinder") provides persistent block storage to guest VMs. This project was born from code originally in Nova (the nova-volume service described below). In the Folsom release, both the nova-volume service and the separate volume service are available.
     File storage (NAS): no support. Currently, OpenStack Compute does not have any native support for this type of file storage inside of an instance. However, there is a Gluster storage connector for OpenStack that enables the use of the GlusterFS file system as a back-end for the Image service.
  4. 1. What is RDO?
     • Distribution of OpenStack
       - The OpenStack project produces code. Packaging, integration, installation and support is left to distributors and partners
       - In its current form, OpenStack is a toolbox for creating an IaaS cloud; RDO allows you to get started quickly
     • For RHEL, CentOS, Scientific Linux and other RHEL clones, and for Fedora
       - There is a demand for being able to try out OpenStack on the industry's most successful enterprise Linux platform
       - We welcome users and experiences from the Red Hat Enterprise Linux ecosystem, which includes CentOS and Scientific Linux
       - We also want to make it easy for users of Fedora to try the version of OpenStack they are interested in without necessarily upgrading their entire operating system
     • Community-driven
       - The RDO community site is a wiki, and a forum. We welcome the participation of community members sharing knowledge, helping each other
       - Support offered with RDO is of a standard which can be expected from a community supported project; we encourage anyone who is looking for enterprise level support to upgrade to Red Hat OpenStack
  5. Every conversation with customers about Hadoop deployment models ends with one word: ‘flexibility’. Customers want to be able to deploy Hadoop on-premises, on physical or virtual infrastructure, and in the cloud. In the cloud, OpenStack has emerged as the dominant open source cloud management platform.
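
Editor's note 2 states the data node anti-affinity rule from slide 15: a single cluster should not have more than one data node on the same hypervisor, while data nodes of different clusters may share one. A minimal Python sketch of a check for that rule follows. The `DataNode` type, the function name and the sample placement are hypothetical and are not part of Savanna or OpenStack.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class DataNode(NamedTuple):
    cluster: str     # Hadoop cluster the data node belongs to
    hypervisor: str  # physical host running the data node's VM


def anti_affinity_violations(nodes: Iterable[DataNode]) -> List[Tuple[str, str]]:
    """Return (cluster, hypervisor) pairs that host more than one data node.

    Several data nodes of one cluster on one hypervisor is the case the
    note marks as not recommended; data nodes of different clusters
    sharing a hypervisor are allowed and therefore not reported.
    """
    counts = Counter((n.cluster, n.hypervisor) for n in nodes)
    return sorted(pair for pair, count in counts.items() if count > 1)


# Hypothetical placement: cluster B has two data nodes on hypervisor h1.
placement = [
    DataNode("A", "h1"),
    DataNode("A", "h2"),
    DataNode("B", "h1"),
    DataNode("B", "h1"),
]
violations = anti_affinity_violations(placement)
```

The check matters for HDFS reliability because replicas placed on data nodes that share one physical host would all be lost together if that host failed.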