SlideShare una empresa de Scribd logo
1 de 13
Apache Ambari
Hadoop Cluster Manifest/Blueprint
Sumit Mohanty
Member of Technical Staff
Hortonworks
Agenda
• Cluster Manifest
• Scenarios
• Cluster Blueprint
• Using Cluster Manifest
• What’s next?
Hadoop Cluster Manifest
• Declarative representation of a Hadoop Cluster
– Stack Definition
– Configuration
– Host Details
– Component Mapping
• A common spec. across tools/services
• Targets
– Package Author, Hadoop Admins, and System Admins
Cluster Manifest:
Package Definition
• Package metadata
• Repository details
• Constituent services and their components
• Service specific metadata
• Configurable parameters
Cluster Manifest:
Package Definition
{
"schemaVersion:" : "1”,
"version" : "1.3.0”,
"author" : "Hortonworks”,
"created" : "03-31-2013”,
"manifestId" : "GUID",
"stackVersion" : "1.3.0”, "stackName" : "HDP",
"context" : […],
"packages" : {
"type" : "rpm",
"osSpecificPackages" : […]
},
"services" : [
{
"name" : "HDFS",
"components" : [
{
"name" : "NAMENODE",
"category" : "MASTER",
…
},
{
"name" : "DATANODE", …
],
"configurations" : [
{
"type":"core-site.xml",
"properties" : [
{
"propertyName" : "fs.trash.interval",
"defaultValue" : "360",
"propertyDescription" : "..."
},
…
],
"isManageable": "true",
"isRequired": "true",
"packages": […],
"serviceContext" : […]
}
}
Cluster Manifest:
Package Configuration
• Configurable parameters and values
– Non-default
– Organization, environment, instance specific
• Service or component specific values
{
"schemaVersion:" : "1", …
"context" : [
{ "name" : "targetStackVersion", "value" : "1.3.0" },
],
"deployedServices" : ["HDFS”, … ],
"configuration" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "300" },
...
]
},
…
"configOverrides" : [
/* delta changes on the top level changes */
{
"type" : "SERVICE”, "name" : "HDFS",
"configuration" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "480" },
...
},
{
"type" : "COMPONENT"
"name" : "JOBTRACKER",
...
}
Cluster Manifest:
Host List
• List of hosts
– Can be fully specified
– Or, can be a set of requirements
– Or, can even be non-existent
{
"schemaVersion:" : "1", …
"context" : […],
"hostGroups" : [
{
"name" : "masterHosts",
"members" : {
"count" : "1",
"hosts" : [
{ "FQDN" : "host1.domain1.com", "ip" : "" }
]
},
"properties" : […]
},
{
"name" : ”slaveHosts",
"members" : {…},
"properties" : […]
},
{
"name" : "clientHosts",
"members" : {…},
"properties" : [
{ "name" : "host_type", "value" : "High-CPU Medium" }
]
},
...
]
}
Cluster Manifest:
Host Component Mapping
• A mapping of components to hosts
– Simple component mapping to named hosts
– A set of constraints that can be used to find best
match (e.g. evaluate against host properties)
• System resources
– users, groups, ports, etc.
• Host specific configuration
– Non-homogeneous cluster
Cluster Manifest:
Host Component Mapping
{
"schemaVersion:" : "1”, …
"context" : […],
"hostResourceMapping" : [
{
"hosts" : [
{
"predicate" : "name=*"
}
],
"systemResources" : {
"hadoopGroup" : "hadoop",
"groups" : [
{
"name" : "hadoop",
...
],
"users" : [
{
"groups" : [
"hadoop"
],
"name" : "hdfs",
"type" : "LOCAL”
],
"ports" : […]
...
],
"hostComponentMapping" : [
{
"hosts" : [
{
"predicate" : "name=masterhosts1"
"configOverrides" : [
{
"type":"core-site.xml",
"properties" : [
{ "name" : "fs.trash.interval", "value" : "480" },
...
],
"components" : [
"NAMENODE",
"JOBTRACKER",
...
]
},
...
}
Scenarios
• Define cluster templates
– and, host specific templates
• On demand cluster creation
– Cluster extension (e.g. add Datanodes)
• Export cluster manifest
• A uniform “language” across cluster managers
and environments
Cluster Blueprint
• Blueprint is manifest with “holes”
– Typically
• Hostnames
• Config parameters that use hostname
– But, any config params that a Hadoop admin
deems necessary to be parameterized
• Blueprint = Manifest + Parameter Values
Using Cluster Manifest
What’s Next?
• Apache Ambari JIRA 1783, is tracking this
project
– https://issues.apache.org/jira/browse/AMBARI-
1783
– Comments and suggestions, welcome
• In next releases, we will enhance Ambari to
add support for manifest and blueprints

Más contenido relacionado

Más de Hortonworks

Más de Hortonworks (20)

Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data4 Essential Steps for Managing Sensitive Data
4 Essential Steps for Managing Sensitive Data
 
5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data5 Steps to Create a Company Culture that Embraces the Power of Data
5 Steps to Create a Company Culture that Embraces the Power of Data
 
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake DebateExploring the Heated-and Completely Unnecessary- Data Lake Debate
Exploring the Heated-and Completely Unnecessary- Data Lake Debate
 

Último

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Último (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Apache Ambari BOF - Blueprint - Hadoop Summit 2013

  • 1. Apache Ambari Hadoop Cluster Manifest/Blueprint Sumit Mohanty Member of Technical Staff Hortonworks
  • 2. Agenda • Cluster Manifest • Scenarios • Cluster Blueprint • Using Cluster Manifest • What’s next?
  • 3. Hadoop Cluster Manifest • Declarative representation of a Hadoop Cluster – Stack Definition – Configuration – Host Details – Component Mapping • A common spec. across tools/services • Targets – Package Author, Hadoop Admins, and System Admins
  • 4. Cluster Manifest: Package Definition • Package metadata • Repository details • Constituent services and their components • Service specific metadata • Configurable parameters
  • 5. Cluster Manifest: Package Definition { "schemaVersion:" : "1”, "version" : "1.3.0”, "author" : "Hortonworks”, "created" : "03-31-2013”, "manifestId" : "GUID", "stackVersion" : "1.3.0”, "stackName" : "HDP", "context" : […], "packages" : { "type" : "rpm", "osSpecificPackages" : […] }, "services" : [ { "name" : "HDFS", "components" : [ { "name" : "NAMENODE", "category" : "MASTER", … }, { "name" : "DATANODE", … ], "configurations" : [ { "type":"core-site.xml", "properties" : [ { "propertyName" : "fs.trash.interval", "defaultValue" : "360", "propertyDescription" : "..." }, … ], "isManageable": "true", "isRequired": "true", "packages": […], "serviceContext" : […] } }
  • 6. Cluster Manifest: Package Configuration • Configurable parameters and values – Non-default – Organization, environment, instance specific • Service or component specific values { "schemaVersion:" : "1", … "context" : [ { "name" : "targetStackVersion", "value" : "1.3.0" }, ], "deployedServices" : ["HDFS”, … ], "configuration" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "300" }, ... ] }, … "configOverrides" : [ /* delta changes on the top level changes */ { "type" : "SERVICE”, "name" : "HDFS", "configuration" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "480" }, ... }, { "type" : "COMPONENT" "name" : "JOBTRACKER", ... }
  • 7. Cluster Manifest: Host List • List of hosts – Can be fully specified – Or, can be a set of requirements – Or, can even be non-existent { "schemaVersion:" : "1", … "context" : […], "hostGroups" : [ { "name" : "masterHosts", "members" : { "count" : "1", "hosts" : [ { "FQDN" : "host1.domain1.com", "ip" : "" } ] }, "properties" : […] }, { "name" : ”slaveHosts", "members" : {…}, "properties" : […] }, { "name" : "clientHosts", "members" : {…}, "properties" : [ { "name" : "host_type", "value" : "High-CPU Medium" } ] }, ... ] }
  • 8. Cluster Manifest: Host Component Mapping • A mapping of components to hosts – Simple component mapping to named hosts – A set of constraints that can be used to find best match (e.g. evaluate against host properties) • System resources – users, groups, ports, etc. • Host specific configuration – Non-homogeneous cluster
  • 9. Cluster Manifest: Host Component Mapping { "schemaVersion:" : "1”, … "context" : […], "hostResourceMapping" : [ { "hosts" : [ { "predicate" : "name=*" } ], "systemResources" : { "hadoopGroup" : "hadoop", "groups" : [ { "name" : "hadoop", ... ], "users" : [ { "groups" : [ "hadoop" ], "name" : "hdfs", "type" : "LOCAL” ], "ports" : […] ... ], "hostComponentMapping" : [ { "hosts" : [ { "predicate" : "name=masterhosts1" "configOverrides" : [ { "type":"core-site.xml", "properties" : [ { "name" : "fs.trash.interval", "value" : "480" }, ... ], "components" : [ "NAMENODE", "JOBTRACKER", ... ] }, ... }
  • 10. Scenarios • Define cluster templates – and, host specific templates • On demand cluster creation – Cluster extension (e.g. add Datanodes) • Export cluster manifest • A uniform “language” across cluster managers and environments
  • 11. Cluster Blueprint • Blueprint is manifest with “holes” – Typically • Hostnames • Config parameters that use hostname – But, any config params that a Hadoop admin deems necessary to be parameterized • Blueprint = Manifest + Parameter Values
  • 13. What’s Next? • Apache Ambari JIRA 1783, is tracking this project – https://issues.apache.org/jira/browse/AMBARI- 1783 – Comments and suggestions, welcome • In next releases, we will enhance Ambari to add support for manifest and blueprints