How Ambari manifest files are used by System Center and Windows Azure
Brian Swan
Program Manager, HDInsight Team
Microsoft
Manifest Files - Overview
PackageDefinition.json
A representation of the software packages to be installed on a cluster
(typically Hadoop, but also any custom packages, such as Java or
Python). This representation captures all the invariants, such as the
services, components, and properties associated with a specific package.
Authored by the package distributor.
HostComponentMapping.json
A mapping between a package component and one or more logical
host groups defined in the host manifest.
Authored by the Hadoop admin.
HostManifest.json
Contains a list of logical host definitions, system-level resources, and
(optionally) the actual hosts that fall into each host definition category.
When actual hosts are not listed, the manifest instead includes references
that are realized by on-demand services (such as a cloud provider). A logical
group may contain one or more hosts.
Authored by the system admin.
PackageConfiguration.json
Captures the specific configuration for a deployment at the cluster
level, as well as overrides at the service and component levels.
Authored by the Hadoop admin.
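To make the relationship between the files concrete, here is a minimal sketch of how a consumer might resolve components to hosts and layer configuration overrides. The slides do not show the JSON schemas, so every key name below (`host_groups`, `cluster`, `services`, `components`), the host names, and the property names are assumptions made for illustration. Only the group names MASTER_HOSTS and SLAVE_HOSTS appear in the deck, and which VM belongs to which group is also assumed.

```python
# Hypothetical manifest contents; the real schemas are not shown in the slides.
host_manifest = {
    "host_groups": {
        "MASTER_HOSTS": ["vm1", "vm2"],
        "SLAVE_HOSTS": ["vm3", "vm4"],
    }
}

# Component -> logical host groups (the role of HostComponentMapping.json).
host_component_mapping = {
    "NameNode": ["MASTER_HOSTS"],
    "JobTracker": ["MASTER_HOSTS"],
    "DataNode": ["SLAVE_HOSTS"],
    "TaskTracker": ["SLAVE_HOSTS"],
}

# Cluster-level settings plus service- and component-level overrides
# (the role of PackageConfiguration.json). Property names are invented.
package_configuration = {
    "cluster": {"heap_mb": 1024, "log_level": "INFO"},
    "services": {"HDFS": {"heap_mb": 2048}},
    "components": {"NameNode": {"log_level": "DEBUG"}},
}


def hosts_for_component(component, mapping, manifest):
    """Return the sorted, de-duplicated hosts a component is mapped to."""
    hosts = set()
    for group in mapping[component]:
        hosts.update(manifest["host_groups"][group])
    return sorted(hosts)


def effective_config(service, component, config):
    """Apply overrides in order: cluster, then service, then component."""
    merged = dict(config["cluster"])
    merged.update(config["services"].get(service, {}))
    merged.update(config["components"].get(component, {}))
    return merged


print(hosts_for_component("DataNode", host_component_mapping, host_manifest))
print(effective_config("HDFS", "NameNode", package_configuration))
```

Keeping the mapping in terms of logical groups rather than host names is what lets the same mapping be reused whether the hosts are listed explicitly or provisioned on demand.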
Deployment using System Center
Note: The tools described here for deploying Hadoop clusters using System
Center are prototype tools used internally at Microsoft. The intent here is to
demonstrate one consumer of cluster manifest files.
System Center – Prerequisites
• System Center 2013
• VM running Virtual Machine Manager (VMM) with…
• Hadoop Service Template (HadoopServiceTemplate.xml)
• Windows Server VHD (Win.vhd)
• HDInsight Deployment Tool (HDInsightDeployment.exe)
• Deployment Database (SQL Server)
Phase 1: Parse, Validate, Populate DB
• Copy manifest files to the Deployment Tool directory.
• Update HDInsightDeployment.exe.config (the Deployment Tool configuration file).
• Start deployment with HDInsightDeployment.exe.
• Deployment Tool reads and validates manifest files.
• Schema validation.
• Dependency validation.
• Deployment DB is populated with steps for creating system
resources on hosts (e.g. users, groups, firewall rules).
• Deployment DB is populated with ordered steps for installing
Hadoop (and other packages).
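The validation and ordering steps above can be sketched roughly as follows. This is not the Deployment Tool's actual logic, which the slides do not show. The required keys, the package names, and the dependency graph are hypothetical, and Python's standard `graphlib` module stands in for whatever ordering mechanism the tool uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical required keys; the real PackageDefinition schema is not shown.
REQUIRED_KEYS = {"name", "version", "components"}


def validate_schema(package):
    """Schema validation: reject a package definition with missing keys."""
    missing = REQUIRED_KEYS - package.keys()
    if missing:
        raise ValueError(f"package definition is missing keys: {sorted(missing)}")


def ordered_install_steps(dependencies):
    """Dependency validation plus ordering.

    `dependencies` maps each package to the packages it depends on.
    Returns the packages in an order where dependencies come first.
    """
    known = set(dependencies)
    for package, needs in dependencies.items():
        unknown = set(needs) - known
        if unknown:
            raise ValueError(f"{package} depends on undefined packages: {sorted(unknown)}")
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


# Assumed example graph: HDFS needs Java, MapReduce needs HDFS.
validate_schema({"name": "HDFS", "version": "1.0", "components": ["NameNode"]})
print(ordered_install_steps({"Java": [], "HDFS": ["Java"], "MapReduce": ["HDFS"]}))
```

Computing the order once, before anything is provisioned, is what allows the ordered steps to be written to the Deployment DB and replayed later by the agents.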
Phase 2: Download Packages
• Deployment Tool downloads/copies packages to VMM based on
information in PackageDefinition.json.
Phase 3: Provision VMs, Install Packages
• VMM provisions VMs based on the HostManifest.json file.
• The Hadoop Service Template (a VMM template) specifies which
system components to install (e.g. Deployment Agent).
• Starts the Deployment Agent on each VM.
• Deployment Agents pull packages from SCVMM.
[Diagram: VM1 through VM4, grouped into MASTER_HOSTS and SLAVE_HOSTS, each running a Deployment Agent.]
Phase 4: Create System Resources, Install Packages
• Deployment Agents create system resources
(users, groups, firewall rules, etc.) from steps in the
Deployment DB.
• Deployment Agents work through the steps for
installing Hadoop (and other packages).
• Packages contain scripts that are invoked
to install custom components (e.g. Java, Python).
• Deployment Agents store the state of each step so that it can be
retried after a failure.
[Diagram: the agents on VM1 through VM4 create accounts such as hdfs_user, mapred_user, and hadoop_admin, then install NameNode (HDFS), JobTracker (MapReduce), and DataNode plus TaskTracker (HDFS, MapReduce).]
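The idea of storing step state so that a failed deployment can be retried might look like the sketch below. The slides do not say where or how the agents persist state, so the local JSON file, the function name `run_steps`, and the step names are all assumptions for illustration.

```python
import json
from pathlib import Path


def run_steps(steps, state_file):
    """Run (name, action) pairs in order, skipping steps already recorded as done.

    State is kept in a local JSON file here purely for illustration.
    """
    state_path = Path(state_file)
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    for name, action in steps:
        if name in done:
            continue  # completed on a previous attempt
        action()  # an exception here leaves the step unrecorded, so a retry reruns it
        done.add(name)
        state_path.write_text(json.dumps(sorted(done)))
    return sorted(done)
```

Because completion is recorded only after a step's action returns, a step that fails partway through runs again on the next attempt, so step actions in this sketch would need to be safe to repeat.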
Deployment in Windows Azure
Phase 1: Submit request, generate manifest files
• Cluster creation request submitted via Windows Azure Portal.
• Deployment Service generates and validates manifest files.
• DA stores manifest files in Blob Storage.
• (Hadoop package files are already in Blob Storage.)
Phase 2: Generate/submit deployment files
• Deployment Service generates Cloud Service deployment files.
• .cspkg: contains the Deployment Agent.
• .cscfg: contains instance counts for VMs and the location of the
generated manifest files.
• Cloud Service deployment files are submitted to Windows Azure
Fabric.
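As a rough illustration of what a generated .cscfg could carry, the sketch below builds a Cloud Service configuration document with per-role instance counts and a setting that points at the manifest location. The element names follow the public Cloud Services service configuration schema. The role names, the setting name `ManifestLocation`, and the URL are hypothetical, and the files the Deployment Service actually emits are not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Namespace of the Cloud Services service configuration schema.
NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"


def build_cscfg(service_name, role_counts, manifest_url):
    """Return .cscfg-style XML with instance counts and a manifest location.

    The setting name "ManifestLocation" is invented for this sketch.
    """
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}ServiceConfiguration", {"serviceName": service_name})
    for role_name, count in role_counts.items():
        role = ET.SubElement(root, f"{{{NS}}}Role", {"name": role_name})
        ET.SubElement(role, f"{{{NS}}}Instances", {"count": str(count)})
        settings = ET.SubElement(role, f"{{{NS}}}ConfigurationSettings")
        ET.SubElement(
            settings,
            f"{{{NS}}}Setting",
            {"name": "ManifestLocation", "value": manifest_url},
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical role names and blob URL.
print(build_cscfg("hadoop-cluster", {"WebRole": 2, "WorkerRole": 2},
                  "https://example.invalid/manifests/"))
```

Putting the manifest location in a configuration setting means each Deployment Agent can find the manifests at startup without the location being baked into the .cspkg.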
Phase 3: Provision VMs, Deployment Agent
• Windows Azure Fabric provisions VMs and deploys the Deployment
Agent on each VM.
[Diagram: VM1 through VM4, grouped into WEB_ROLES and WORKER_ROLES, each running a Deployment Agent.]
Phase 4: Get manifest files, install components
• Deployment Agent determines environment and VM type.
• Deployment Agent gets manifest files based on the location in the .cscfg
file.
• Deployment Agent generates an in-memory list of activities for
installing components.
• Deployment Agent retrieves packages (based on the repo location in the
PackageDefinition file).
• Deployment Agent installs components.
[Diagram: each VM's agent works through its own activity list; the VMs end up running NameNode, JobTracker, and DataNode plus TaskTracker.]
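The in-memory activity list described above could be built along these lines. This is only a sketch: the slides show VMs grouped into WEB_ROLES and WORKER_ROLES but not which components each role receives, so the role-to-component mapping, the package file naming, and the function name `plan_activities` are all assumptions.

```python
# Hypothetical role-to-component mapping; the slides only show that VMs are
# grouped into WEB_ROLES and WORKER_ROLES.
ROLE_COMPONENTS = {
    "WEB_ROLES": ["NameNode", "JobTracker"],
    "WORKER_ROLES": ["DataNode", "TaskTracker"],
}


def plan_activities(vm_role, role_components, repo_location):
    """Build the ordered, in-memory activity list for one VM."""
    activities = []
    for component in role_components.get(vm_role, []):
        # Package file naming is an assumption made for this sketch.
        activities.append(("retrieve", f"{repo_location}/{component}.zip"))
        activities.append(("install", component))
    return activities


for activity in plan_activities("WORKER_ROLES", ROLE_COMPONENTS, "repo"):
    print(activity)
```

Unlike the System Center flow, where ordered steps live in the Deployment DB, here each agent derives its own list from the manifests, so no shared database is needed.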

Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013

TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Último

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Último (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013

  • 1. How Ambari manifest files are used by System Center and Windows Azure. Brian Swan, Program Manager, HDInsight Team, Microsoft.
  • 2. Manifest Files - Overview
      • PackageDefinition.json: A representation of the software packages to be installed on a cluster (typically Hadoop, but also any custom packages, such as Java or Python). This representation captures all the invariants, such as the services, components, and properties associated with a specific package. Authored by the package distributor.
      • HostComponentMapping.json: A mapping between a package component and one or more logical host groups defined in the host manifest. Authored by the Hadoop Admin.
      • HostManifest.json: Contains a list of logical host definitions, system-level resources, and (optionally) the actual hosts that fall into the host definition categories. When actual hosts are not described, references that are realized by on-demand services (such as a cloud provider) are included. A logical group may contain one or more hosts. Authored by the System Admin.
      • PackageConfiguration.json: Captures the specific configuration for a deployment at the cluster level, as well as overrides at the service and component levels. Authored by the Hadoop Admin.
  • 7. Deployment using System Center. Note: The tools described here for deploying Hadoop clusters using System Center are prototype tools used internally at Microsoft. The intent here is to demonstrate one consumer of cluster manifest files.
  • 8. System Center – Prerequisites
      • System Center 2013
      • VM running Virtual Machine Manager (VMM) with:
          • Hadoop Service Template (HadoopServiceTemplate.xml)
          • Windows Server VHD (Win.vhd)
          • HDInsight Deployment Tool (HDInsightDeployment.exe)
          • Deployment Database (SQL Server)
  • 9-12. Phase 1: Parse, Validate, Populate DB
      • Copy the manifest files to the Deployment Tool directory.
      • Update the Deployment Tool configuration file (HDInsightDeployment.exe.config).
      • Start the deployment with HDInsightDeployment.exe.
      • The Deployment Tool reads and validates the manifest files:
          • Schema validation.
          • Dependency validation.
      • The Deployment DB is populated with steps for creating system resources on hosts (e.g. users, groups, firewall rules).
      • The Deployment DB is populated with ordered steps for installing Hadoop (and other packages).
  • 13. Phase 2: Download Packages
      • The Deployment Tool downloads/copies packages to VMM based on information in PackageDefinition.json.
  • 14-18. Phase 3: Provision VMs, Install Packages
      • VMM provisions VMs based on the HostManifest.json file.
      • The Hadoop Service Template (a VMM template) specifies which system components to install (e.g. the Deployment Agent).
      • The Deployment Agent is started on each VM.
      • Deployment Agents pull packages from SCVMM.
      • [Diagram: VM1 to VM4, provisioned into the MASTER_HOSTS and SLAVE_HOSTS host groups, each running a Deployment Agent.]
  • 19-21. Phase 4: Create System Resources, Install Packages
      • Deployment Agents create system resources (users, groups, firewall rules, etc.) from the steps in the Deployment DB.
      • Deployment Agents work through the steps for installing Hadoop (and other packages).
      • Packages contain scripts that are invoked to install custom components (e.g. Java, Python).
      • Deployment Agents store the state of each step so that steps can be retried after a failure.
      • [Diagram: the agents create users such as hdfs_user, mapred_user, and hadoop_admin on the VMs, then install the HDFS NameNode, the MapReduce JobTracker, and, on two VMs, an HDFS DataNode with a MapReduce TaskTracker.]
  • 23. Phase 1: Submit request, generate manifest files
      • A cluster creation request is submitted via the Windows Azure Portal.
      • The Deployment Service generates and validates the manifest files.
      • DA stores the manifest files in Blob Storage.
      • (The Hadoop package files are already in Blob Storage.)
  • 24. Phase 2: Generate/submit deployment files
      • The Deployment Service generates the Cloud Service deployment files:
          • .cspkg: contains the Deployment Agent.
          • .cscfg: contains the instance counts for VMs and the location of the generated manifest files.
      • The Cloud Service deployment files are submitted to Windows Azure Fabric.
  • 25-26. Phase 3: Provision VMs, Deployment Agent
      • Windows Azure Fabric provisions the VMs and deploys the Deployment Agent on them.
      • [Diagram: VM1 to VM4, in the WEB_ROLES and WORKER_ROLES groups, each running a Deployment Agent.]
  • 27-29. Phase 4: Get manifest files, install components
      • The Deployment Agent determines the environment and VM type.
      • The Deployment Agent gets the manifest files based on the location in the .cscfg file.
      • The Deployment Agent generates an in-memory list of activities for installing components.
      • The Deployment Agent retrieves packages (based on the repo location in the PackageDefinition file).
      • The Deployment Agent installs the components.
      • [Diagram: VM1 to VM4, each running a Deployment Agent with its own activity list; the installed components are NameNode, JobTracker, and, on two VMs, DataNode with TaskTracker.]
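
The dependency validation step of Phase 1 can be illustrated with a small Python sketch. The slides do not show the manifest schemas, so every field name here (`host_groups`, `components`, `depends_on`, `mapping`) and the dependency edges between components are hypothetical stand-ins chosen for illustration, not the actual format read by HDInsightDeployment.exe. The sketch checks the kinds of conditions the presentation mentions: that the mapping refers to a matching package definition, that host groups are consistent between files, and that a selected component's dependencies are also selected.

```python
# Hypothetical, simplified manifests. The real JSON schemas are not shown
# in the slides; these dictionaries only mimic their described roles.
host_manifest = {"host_groups": {"MASTER_HOSTS": 2, "SLAVE_HOSTS": 2}}

package_definition = {
    "package": "hadoop",
    "components": {
        # The dependency edges are assumptions made for this example.
        "NameNode": {"depends_on": []},
        "DataNode": {"depends_on": ["NameNode"]},
        "JobTracker": {"depends_on": ["NameNode"]},
        "TaskTracker": {"depends_on": ["JobTracker"]},
    },
}

host_component_mapping = {
    "package": "hadoop",
    "mapping": {
        "NameNode": ["MASTER_HOSTS"],
        "JobTracker": ["MASTER_HOSTS"],
        "DataNode": ["SLAVE_HOSTS"],
        "TaskTracker": ["SLAVE_HOSTS"],
    },
}


def validate(host_manifest, package_definition, mapping):
    """Return a list of problems; an empty list means the files agree."""
    errors = []

    # Does a package definition match the package named in the mapping?
    if mapping["package"] != package_definition["package"]:
        errors.append(f"no package definition for {mapping['package']}")

    known_groups = set(host_manifest["host_groups"])
    components = package_definition["components"]

    for component, groups in mapping["mapping"].items():
        if component not in components:
            errors.append(f"unknown component: {component}")
            continue
        # Are host groups consistent between the mapping and the host manifest?
        for group in groups:
            if group not in known_groups:
                errors.append(f"{component}: unknown host group {group}")
        # Are the dependencies of a selected component also selected?
        for dep in components[component]["depends_on"]:
            if dep not in mapping["mapping"]:
                errors.append(
                    f"{component}: dependency {dep} is not mapped to any host group"
                )
    return errors
```

Calling `validate(host_manifest, package_definition, host_component_mapping)` on the consistent example above returns an empty list. Collecting all problems instead of stopping at the first one is a design choice of this sketch; it lets an administrator fix every inconsistency before retrying the deployment.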

Editor's notes

  1. Dependency validation is validation to make sure the cluster can run once it is deployed. In Azure, the Deployment DB is replaced by in-memory storage of this information. In Azure and VMM, the HostManifest only specifies the number of instances in each logical host group. The host groups are defined in the template (in VMM) or by Azure. PackageDefinition: specifies settings for components selected in the Host-Component-Mapping file.
  2. Note that SQL Authentication is shown in the sqlConnectionString. In a production environment, Integrated Authentication is (or should be) used.
  3. Examples of dependency validation include: Is there a Package Definition that matches the package specified in the Host-Component-Mapping? Are host groups consistent across the Host-Component-Mapping and Host Manifest files? If Hive is selected to install, are its dependencies selected and available?
  4. The Deployment DB is populated with ordered steps for installing Hadoop (and other packages). For example: install the HDFS service before MapReduce; install the NameNode component before the DataNode component.
  5. Deployment Agents store the state of each step so that steps can be retried after a failure. For example, if the NameNode install fails, it will be retried, and the DataNode install will not proceed. Once the issue is resolved, the Deployment Agent picks up from the last successful step.
  6. The Deployment Service is transparent to users. It is a Cloud Service running in Windows Azure. Currently, the manifest files are mostly static, and the HostManifest file isn't used at all; VM information is handled by Azure Fabric. We have flexibility going forward to incorporate user input (e.g. configuration overrides). Manifest files are stored in the user's storage account. HDP and other packages are in the HDInsight blob storage account.
  7. Web/Worker Roles are logical host groups in Windows Azure (the types of VMs). VM sizes are fixed (for now).
  8. The Deployment Agent is the same code that is used in the System Center scenario. Logic is forked based on environment.
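
The ordering and retry behaviour described in notes 4 and 5 can be sketched in Python. This is an illustration under stated assumptions, not the Deployment Agent's actual code: the step names and the `state` dictionary (standing in for the Deployment DB or the in-memory store used in Azure) are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) is used here only as one convenient way to derive an install order from prerequisites.

```python
from graphlib import TopologicalSorter

# Hypothetical step names. Each step maps to the steps that must finish
# first, e.g. NameNode before DataNode and HDFS before MapReduce.
PREREQUISITES = {
    "install_hdfs_namenode": set(),
    "install_hdfs_datanode": {"install_hdfs_namenode"},
    "install_mapreduce_jobtracker": {"install_hdfs_datanode"},
    "install_mapreduce_tasktracker": {"install_mapreduce_jobtracker"},
}


def ordered_steps(prerequisites):
    """Return the steps in an order that respects every prerequisite."""
    return list(TopologicalSorter(prerequisites).static_order())


def run_steps(steps, state, execute):
    """Run steps in order, skipping those already recorded as done.

    `state` stands in for the persisted step state. Execution stops at the
    first failure, so later steps never run before their prerequisites.
    Returns the name of the failed step, or None if every step is done.
    """
    for step in steps:
        if state.get(step) == "done":
            continue  # resume: completed steps are not repeated
        try:
            execute(step)
        except Exception:
            state[step] = "failed"
            return step
        state[step] = "done"
    return None
```

Calling `run_steps` again with the same `state` after a failure resumes at the failed step rather than starting over, which mirrors the "pick up from the last successful step" behaviour in note 5.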