How Ambari manifest files
are used by System Center
and Windows Azure
Brian Swan
Program Manager, HDInsight Team
Microsoft
Manifest Files - Overview
PackageDefinition.json
A representation of the software packages to be installed on a cluster
(typically Hadoop, but also any custom packages, such as Java or
Python). This representation captures all the invariants, such as the
services, components, and properties associated with a specific package.
Authored by the package distributor.
HostComponentMapping.json
A mapping between a package component and one or more logical
host groups defined in the host manifest.
Authored by the Hadoop Admin.
HostManifest.json
Contains a list of logical host definitions, system-level resources, and
(optionally) the actual hosts that fall into the host definition categories.
When actual hosts are not described, references that are realized by
on-demand services (such as a cloud provider) are included. A logical
group may contain one or more hosts.
Authored by the System Admin.
PackageConfiguration.json
Captures the specific configuration for a deployment at the cluster
level, as well as overrides at the service and component levels.
Authored by the Hadoop Admin.
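To make the relationship between the manifest files concrete, here is a minimal sketch in Python. The slides do not show the real JSON schemas, so every key name, host name, and setting below is an assumption chosen for illustration only. It shows two ideas from the overview: a component is resolved to concrete hosts through its logical host groups, and configuration is applied at the cluster level first, then overridden at the service and component levels.

```python
# Hypothetical, heavily simplified manifest contents. The real schemas are not
# shown in the slides, so all key names and values here are assumptions.
host_manifest = {
    "host_groups": {
        "MASTER_HOSTS": ["VM1", "VM2"],
        "SLAVE_HOSTS": ["VM3", "VM4"],
    }
}

host_component_mapping = {
    "NameNode": ["MASTER_HOSTS"],
    "JobTracker": ["MASTER_HOSTS"],
    "DataNode": ["SLAVE_HOSTS"],
    "TaskTracker": ["SLAVE_HOSTS"],
}

package_configuration = {
    "cluster": {"log_level": "INFO", "heap_mb": 1024},
    "services": {"HDFS": {"heap_mb": 2048}},
    "components": {"NameNode": {"heap_mb": 4096}},
}


def hosts_for(component):
    """Resolve a component to concrete hosts via its logical host groups."""
    hosts = []
    for group in host_component_mapping[component]:
        for host in host_manifest["host_groups"][group]:
            if host not in hosts:
                hosts.append(host)
    return hosts


def effective_config(service, component):
    """Apply settings in order: cluster, then service, then component."""
    config = dict(package_configuration["cluster"])
    config.update(package_configuration["services"].get(service, {}))
    config.update(package_configuration["components"].get(component, {}))
    return config
```

Keeping the mapping in terms of logical groups rather than host names is what lets the same mapping file be reused when the hosts are supplied later by an on-demand service.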
Deployment using System Center
Note: The tools described here for deploying Hadoop clusters using System
Center are prototype tools used internally at Microsoft. The intent here is to
demonstrate one consumer of cluster manifest files.
System Center – Prerequisites
• System Center 2013
• VM running Virtual Machine Manager (VMM) with…
  • Hadoop Service Template (HadoopServiceTemplate.xml)
  • Windows Server VHD (Win.vhd)
  • HDInsight Deployment Tool (HDInsightDeployment.exe)
• Deployment Database (SQL Server)
Phase 1: Parse, Validate, Populate DB
• Copy manifest files to Deployment Tool directory.
• Update the Deployment Tool configuration file (HDInsightDeployment.exe.config).
• Start deployment with HDInsightDeployment.exe.
• Deployment tool reads and validates manifest files.
  • Schema validation.
  • Dependency validation.
• Deployment DB is populated with steps for creating system
resources on hosts (e.g. Users/Groups/Firewall Rules/etc.)
• Deployment DB is populated with ordered steps for installing
Hadoop (and other packages).
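The validation and ordering steps above can be sketched as follows. This is not the Deployment Tool's code, which the slides do not show. The required keys (`name`, `version`, `depends_on`) are an assumed schema, and the ordering uses a topological sort from the Python standard library as one plausible way to produce "ordered steps".

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed schema: the real PackageDefinition.json keys are not in the slides.
REQUIRED_KEYS = {"name", "version", "depends_on"}


def validate_schema(packages):
    """Return error strings for packages that lack a required key."""
    errors = []
    for index, package in enumerate(packages):
        missing = REQUIRED_KEYS - package.keys()
        if missing:
            errors.append(f"package #{index}: missing {sorted(missing)}")
    return errors


def validate_dependencies(packages):
    """Return error strings for dependencies that no package provides.

    Assumes the packages have already passed validate_schema.
    """
    known = {package["name"] for package in packages}
    errors = []
    for package in packages:
        for dependency in package["depends_on"]:
            if dependency not in known:
                errors.append(f"{package['name']}: unknown dependency {dependency}")
    return errors


def ordered_install_steps(packages):
    """Order package names so that each one comes after its dependencies."""
    graph = {package["name"]: set(package["depends_on"]) for package in packages}
    return list(TopologicalSorter(graph).static_order())
```

Running schema validation first means the dependency check and the ordering can rely on `depends_on` being present. `TopologicalSorter` raises `CycleError` if the dependencies are circular, which is a reasonable point to reject the manifest.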
Phase 2: Download Packages
• Deployment tool downloads/copies packages to VMM based on
information in PackageDefinition.json.
Phase 3: Provision VMs, Install Packages
• VMM does VM provisioning based on HostManifest.json file.
• Hadoop Service Template (a VMM template) specifies which
system components to install (e.g. Deployment Agent)
• Starts Deployment Agent
• Deployment Agents pull packages from SCVMM
[Diagram: VM1, VM2, VM3, and VM4, grouped into MASTER_HOSTS and SLAVE_HOSTS, each running a Deployment Agent]
Phase 4: Create System Resources, Install Packages
• Deployment Agents create system resources
(Users/Groups/Firewall Rules/etc.) from steps in the
Deployment DB (e.g. the users hdfs_user, mapred_user, and hadoop_admin).
• Deployment Agents work through steps for
installing Hadoop (and other packages).
  • Packages contain scripts that will be invoked
for installing custom components (e.g. Java,
Python, etc.)
• Deployment Agents store the state of each step so that
steps can be retried after a failure.
[Diagram: four VMs, each with a Deployment Agent; one runs the HDFS NameNode, one the MapReduce JobTracker, and two run an HDFS DataNode and a MapReduce TaskTracker]
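The idea of storing step state for retries can be illustrated with a small sketch. The slides do not say how the Deployment Agent persists state, so a local JSON file stands in for that store here, and the function name, the `(name, callable)` step format, and the `max_attempts` parameter are all hypothetical.

```python
import json
from pathlib import Path


def run_steps(steps, state_path, max_attempts=3):
    """Run (name, callable) steps in order, skipping steps already done.

    State is written after every attempt, so an agent that is restarted
    after a failure resumes from the first step that is not yet done.
    """
    state_file = Path(state_path)
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    for name, action in steps:
        if state.get(name) == "done":
            continue  # completed in an earlier run
        for attempt in range(1, max_attempts + 1):
            try:
                action()
                state[name] = "done"
            except Exception as error:
                state[name] = f"failed (attempt {attempt}): {error}"
            state_file.write_text(json.dumps(state, indent=2))
            if state[name] == "done":
                break
        else:
            # Every attempt failed; stop so later steps do not run out of order.
            raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts")
    return state
```

Recording state per step, rather than per deployment, is what makes a retry cheap: steps such as creating users are not repeated when a later installation step fails.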
Deployment in Windows Azure
Phase 1: Submit request, generate manifest files
• Cluster creation request submitted via Windows Azure Portal.
• Deployment Service generates and validates manifest files.
• DA stores manifest files in Blob Storage.
• (Hadoop package files are already in Blob Storage.)
Phase 2: Generate/submit deployment files
• Deployment Service generates Cloud Service deployment files.
  • .cspkg: contains Deployment Agent
  • .cscfg: contains instance counts for VMs and location of
generated manifest files.
• Cloud Service deployment files are submitted to Windows Azure
Fabric.
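As an illustration of what the .cscfg carries, the sketch below builds a minimal Cloud Service configuration document with per-role instance counts and a setting that points at the manifest files. The element names and namespace follow the public Cloud Service configuration schema. The setting name `ManifestLocation`, the role names, and the function itself are assumptions, since the slides do not show the generated file.

```python
import xml.etree.ElementTree as ET

# Namespace of the Azure Cloud Service configuration (.cscfg) schema.
CSCFG_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"


def build_cscfg(service_name, instance_counts, manifest_location):
    """Return .cscfg-style XML with instance counts and a manifest location.

    "ManifestLocation" is a hypothetical setting name used for illustration.
    """
    ET.register_namespace("", CSCFG_NS)
    root = ET.Element(f"{{{CSCFG_NS}}}ServiceConfiguration",
                      {"serviceName": service_name})
    for role_name, count in instance_counts.items():
        role = ET.SubElement(root, f"{{{CSCFG_NS}}}Role", {"name": role_name})
        ET.SubElement(role, f"{{{CSCFG_NS}}}Instances", {"count": str(count)})
        settings = ET.SubElement(role, f"{{{CSCFG_NS}}}ConfigurationSettings")
        ET.SubElement(settings, f"{{{CSCFG_NS}}}Setting",
                      {"name": "ManifestLocation", "value": manifest_location})
    return ET.tostring(root, encoding="unicode")
```

For example, `build_cscfg("HadoopCluster", {"WebRole": 1, "WorkerRole": 3}, "https://example.invalid/manifests/")` yields one `Role` element per role, each with its own `Instances` count and the shared manifest location. The URL here is a placeholder, not a real storage account.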
Phase 3: Provision VMs, Deployment Agent
• Windows Azure Fabric provisions VMs and deploys Deployment
Agent on VMs
[Diagram: VM1, VM2, VM3, and VM4, grouped into WEB_ROLES and WORKER_ROLES, each running a Deployment Agent]
Phase 4: Get manifest files, install components
• Deployment Agent determines environment and VM type.
• Deployment Agent gets manifest files based on location in .cscfg
file.
• Deployment Agent generates in-memory list of activities for
installing components.
• Deployment Agent retrieves packages (based on repo location in
PackageDefinition file).
• Deployment Agent installs components.
[Diagram: four VMs in WEB_ROLES and WORKER_ROLES, each running a Deployment Agent with its own activity list; after installation they host NameNode, JobTracker, and two DataNode/TaskTracker pairs]
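A rough sketch of how an agent might turn its VM type into an in-memory activity list is shown below. The slides do not state how Azure roles relate to the logical host groups, so the role-to-group table is purely an assumption, as are the mapping contents, the install order, and the function name.

```python
# Assumption: how Azure roles map to logical host groups is not stated in
# the slides, so this table is illustrative only.
ROLE_TO_HOST_GROUP = {"WEB_ROLES": "MASTER_HOSTS", "WORKER_ROLES": "SLAVE_HOSTS"}

# Hypothetical contents of a host-component mapping manifest.
HOST_COMPONENT_MAPPING = {
    "NameNode": ["MASTER_HOSTS"],
    "JobTracker": ["MASTER_HOSTS"],
    "DataNode": ["SLAVE_HOSTS"],
    "TaskTracker": ["SLAVE_HOSTS"],
}

# Hypothetical cluster-wide install order.
INSTALL_ORDER = ["NameNode", "DataNode", "JobTracker", "TaskTracker"]


def build_activities(vm_role):
    """List the install activities for one VM, in cluster install order."""
    host_group = ROLE_TO_HOST_GROUP[vm_role]
    return [
        ("install", component)
        for component in INSTALL_ORDER
        if host_group in HOST_COMPONENT_MAPPING[component]
    ]
```

Because each agent derives its list from the shared manifest files and its own role, no central coordinator has to hand out per-VM instructions in this sketch.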
Source: Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Último

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Último (20)

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Apache Ambari BOF - Blueprints + Azure - Hadoop Summit 2013

  • 1. How Ambari manifest files are used by System Center and Windows Azure. Brian Swan, Program Manager, HDInsight Team, Microsoft
  • 2. Manifest Files - Overview
      • PackageDefinition.json: A representation of the software packages to be installed on a cluster (typically Hadoop, but also any custom packages, such as Java or Python). This representation captures all the invariants, such as the services, components, and properties associated with a specific package. Authored by the package distributor.
      • HostComponentMapping.json: A mapping between a package component and one or more logical host groups defined in the host manifest. Authored by the Hadoop admin.
      • HostManifest.json: Contains a list of logical host definitions, system-level resources, and (optionally) the actual hosts that fall into the host definition categories. When actual hosts are not described, references that are realized by on-demand services (such as a cloud provider) are included. A logical group may contain one or more hosts. Authored by the system admin.
      • PackageConfiguration.json: Captures the specific configuration for a deployment at the cluster level, as well as overrides at the service and component levels. Authored by the Hadoop admin.
  • 7. Deployment using System Center. Note: The tools described here for deploying Hadoop clusters using System Center are prototype tools used internally at Microsoft. The intent here is to demonstrate one consumer of cluster manifest files.
  • 8. System Center – Prerequisites
      • System Center 2013
      • VM running Virtual Machine Manager (VMM) with:
          • Hadoop Service Template (HadoopServiceTemplate.xml)
          • Windows Server VHD (Win.vhd)
          • HDInsight Deployment Tool (HDInsightDeployment.exe)
          • Deployment Database (SQL Server)
  • 9–12. Phase 1: Parse, Validate, Populate DB
      • Copy manifest files to the Deployment Tool directory.
      • Update the Deployment Tool configuration file (HDInsightDeployment.exe.config).
      • Start deployment with HDInsightDeployment.exe.
      • Deployment tool reads and validates manifest files:
          • Schema validation.
          • Dependency validation.
      • Deployment DB is populated with steps for creating system resources on hosts (e.g. users, groups, firewall rules).
      • Deployment DB is populated with ordered steps for installing Hadoop (and other packages).
  • 13. Phase 2: Download Packages
      • Deployment tool downloads/copies packages to VMM based on information in PackageDefinition.json.
  • 14–18. Phase 3: Provision VMs, Install Packages
      • VMM does VM provisioning based on the HostManifest.json file. In the diagram, VM1 through VM4 are provisioned across the MASTER_HOSTS and SLAVE_HOSTS logical host groups.
      • The Hadoop Service Template (a VMM template) specifies which system components to install (e.g. Deployment Agent).
      • The Deployment Agent is started on each VM.
      • Deployment Agents pull packages from SCVMM.
  • 19–21. Phase 4: Create System Resources, Install Packages
      • Deployment Agents create system resources (users, groups, firewall rules, etc.) from steps in the Deployment DB. Accounts shown in the diagram: hdfs_user, mapred_user, hadoop_admin.
      • Deployment Agents work through the steps for installing Hadoop (and other packages). In the diagram, VM1 runs the HDFS NameNode, VM2 runs the MapReduce JobTracker, and VM3 and VM4 each run an HDFS DataNode and a MapReduce TaskTracker.
      • Packages contain scripts that will be invoked for installing custom components (e.g. Java, Python, etc.).
      • Deployment Agents store the state of each step so that it can be retried after a failure.
  • 23. Windows Azure, Phase 1: Submit request, generate manifest files
      • Cluster creation request submitted via Windows Azure Portal.
      • Deployment Service generates and validates manifest files.
      • DA stores manifest files in Blob Storage.
      • (Hadoop package files are already in Blob Storage.)
  • 24. Windows Azure, Phase 2: Generate/submit deployment files
      • Deployment Service generates Cloud Service deployment files:
          • .cspkg: contains the Deployment Agent.
          • .cscfg: contains instance counts for VMs and the location of the generated manifest files.
      • Cloud Service deployment files are submitted to Windows Azure Fabric.
  • 25–26. Windows Azure, Phase 3: Provision VMs, Deployment Agent
      • Windows Azure Fabric provisions VMs and deploys the Deployment Agent on each VM. In the diagram, VM1 through VM4 are provisioned as WEB_ROLES and WORKER_ROLES, each running a Deployment Agent.
  • 27–29. Windows Azure, Phase 4: Get manifest files, install components
      • Deployment Agent determines environment and VM type.
      • Deployment Agent gets manifest files based on the location in the .cscfg file.
      • Deployment Agent generates an in-memory list of activities for installing components.
      • Deployment Agent retrieves packages (based on the repo location in the PackageDefinition file).
      • Deployment Agent installs components. In the diagram, VM1 runs the NameNode, VM2 runs the JobTracker, and VM3 and VM4 each run a DataNode and a TaskTracker.
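The cross-file checks that Phase 1 calls "dependency validation" could be sketched roughly as follows. This is only an illustration in Python: the slides do not show the manifest schemas, so every key name (`packages`, `components`, `host_groups`, `mappings`), the dependency lists, and the `validate_manifests` helper are assumptions, not the actual format or code of the HDInsight Deployment Tool.

```python
# Hypothetical, heavily simplified stand-ins for three of the manifest files.
# All key names and dependency lists are assumptions made for illustration.
package_definition = {
    "packages": {
        "hadoop": {
            # component -> components it requires (assumed relationships)
            "components": {
                "NameNode": [],
                "DataNode": ["NameNode"],
                "JobTracker": ["NameNode"],
                "TaskTracker": ["JobTracker"],
            }
        }
    }
}

# Logical host group -> number of instances (group names taken from the slides).
host_manifest = {"host_groups": {"MASTER_HOSTS": 2, "SLAVE_HOSTS": 2}}

host_component_mapping = {
    "mappings": [
        {"package": "hadoop", "component": "NameNode", "host_groups": ["MASTER_HOSTS"]},
        {"package": "hadoop", "component": "JobTracker", "host_groups": ["MASTER_HOSTS"]},
        {"package": "hadoop", "component": "DataNode", "host_groups": ["SLAVE_HOSTS"]},
        {"package": "hadoop", "component": "TaskTracker", "host_groups": ["SLAVE_HOSTS"]},
    ]
}


def validate_manifests(pkg_def, host_man, mapping):
    """Return a list of problems; an empty list means the files are consistent."""
    problems = []
    selected = set()
    for entry in mapping["mappings"]:
        pkg = pkg_def["packages"].get(entry["package"])
        if pkg is None:
            problems.append(f"no package definition for {entry['package']}")
            continue
        if entry["component"] not in pkg["components"]:
            problems.append(f"unknown component {entry['component']}")
            continue
        selected.add((entry["package"], entry["component"]))
        # Host groups must be consistent between the mapping and the host manifest.
        for group in entry["host_groups"]:
            if group not in host_man["host_groups"]:
                problems.append(f"unknown host group {group}")
    # Every selected component needs its required components selected as well.
    for pkg_name, comp in sorted(selected):
        for dep in pkg_def["packages"][pkg_name]["components"][comp]:
            if (pkg_name, dep) not in selected:
                problems.append(f"{comp} requires {dep}, which is not mapped to any host group")
    return problems


print(validate_manifests(package_definition, host_manifest, host_component_mapping))
```

Returning a list of problems rather than raising on the first one is a design choice of this sketch; it lets an admin fix all inconsistencies in one pass before any VM is provisioned.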

Editor's notes

  1. Dependency validation checks that the cluster can run once it is deployed. In Azure, the Deployment DB is replaced by in-memory storage of the same information. In Azure and VMM, the host manifest only specifies the number of instances in each logical host group; the host groups themselves are defined in the template (in VMM) or by Azure. PackageDefinition specifies settings for the components selected in the Host-Component-Mapping file.
  2. Dependency validation checks that the cluster can run once it is deployed. In Azure, the Deployment DB is replaced by in-memory storage of the same information. In Azure and VMM, the host manifest only specifies the number of instances in each logical host group; the host groups themselves are defined in the template (in VMM) or by Azure. PackageDefinition specifies settings for the components selected in the Host-Component-Mapping file. Note that SQL Authentication is shown in the sqlConnectionString. In a production environment, Integrated Authentication should be used.
  3. Dependency validation checks that the cluster can run once it is deployed. Examples include:
      • Is there a Package Definition that matches the package specified in the Host-Component-Mapping?
      • Are host groups consistent across the Host-Component-Mapping and Host Manifest files?
      • If Hive is selected to install, are its dependencies selected and available?
     In Azure, the Deployment DB is replaced by in-memory storage of the same information. In Azure and VMM, the host manifest only specifies the number of instances in each logical host group; the host groups themselves are defined in the template (in VMM) or by Azure. PackageDefinition specifies settings for the components selected in the Host-Component-Mapping file.
  4. Deployment DB is populated with ordered steps for installing Hadoop (and other packages). For example: install the HDFS service before MapReduce; install the NameNode component before the DataNode component.
  5. Deployment Agents store the state of each step so that it can be retried after a failure. For example, if the NameNode install fails, it will be retried, and the DataNode install will not proceed until it succeeds. Once the issue is resolved, the Deployment Agent picks up from the last successful step.
  6. Deployment Service is transparent to users. Deployment Service is a Cloud Service running in Windows Azure. Currently, the manifest files are mostly static. The HostManifest file isn't used at all; VM information is handled by Azure Fabric. We have flexibility going forward to incorporate user input (e.g. configuration overrides). Manifest files are stored in the user's storage account. HDP and other packages are in the HDInsight blob storage account.
  7. Web/Worker Roles are logical host groups in Windows Azure (the types of VMs). VM sizes are fixed (for now).
  8. Deployment Agent is the same code that is used in System Center scenario. Logic is forked based on environment.
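Notes 4 and 5 describe ordered install steps, retries, and resuming from the last successful step. A minimal sketch of that behavior, assuming hypothetical names throughout (`INSTALL_AFTER`, `ordered_steps`, `run_steps`, and the retry count are illustrative and not taken from the Deployment Agent), might look like this in Python. The ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+), which is a technique swapped in here for illustration; the slides do not say how the real tool orders its steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical "install only after" relationships, loosely based on note 4
# (NameNode before DataNode, HDFS before MapReduce).
INSTALL_AFTER = {
    "NameNode": set(),
    "DataNode": {"NameNode"},
    "JobTracker": {"NameNode"},
    "TaskTracker": {"JobTracker", "DataNode"},
}


def ordered_steps(install_after):
    """Return the steps in an order that respects every prerequisite."""
    return list(TopologicalSorter(install_after).static_order())


def run_steps(steps, install, completed, max_attempts=3):
    """Run steps in order and return the name of the failing step, or None.

    `completed` stands in for the persisted step state: steps already in it
    are skipped, so a later call resumes after the last successful step.
    """
    for step in steps:
        if step in completed:
            continue
        for _ in range(max_attempts):
            if install(step):
                completed.add(step)
                break
        else:
            # All attempts failed: stop so that later steps do not proceed.
            return step
    return None


if __name__ == "__main__":
    done = set()
    steps = ordered_steps(INSTALL_AFTER)
    # First run: pretend the NameNode install keeps failing.
    print(run_steps(steps, lambda step: step != "NameNode", done))
    # Second run, after the issue is resolved: resumes and completes.
    print(run_steps(steps, lambda step: True, done))
```

In the System Center scenario the `completed` set would correspond to state kept in the Deployment DB, and in Azure to the in-memory list of activities; here it is just a Python set passed in by the caller.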