SlideShare a Scribd company logo
1 of 21
1
OpensourcerecipesforChefdeploymentsofHadoop
/
Open source recipes for Chef
deployments of Hadoop
Clay Baenziger
(cbaenziger@bloomberg.net)
Hadoop Infrastructure
Bloomberg
http://bloomberg.github.io/chef-bcpc/hadoop
2
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Why Open Source?
• Bloomberg’s Culture
– Bringing transparency to our Hadoop
– Reference implementation
– Third party integration
– Hiring
• Everyone Here!
3
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Agenda
• What is Provided?
• Installation and Configuration Management
• Customization
• High Availability
• Integration with the Cloud
4
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
What Is Provided?
• Open source Chef recipes – on GitHub
• Installation for Bigtop based distributions
– Supporting infrastructure too
• Same deployment on
– VirtualBox
– OpenStack
– Bare Metal
• PXE to application deployment
• Compose-able components
5
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Motivation at Bloomberg
Multiple Clusters
• Development
• Data Lake
• Low Latency
• Data Sensitivity
6
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Deployment Consistency
• Developers
• Vendors
• Application Teams
• Bare Metal
– Vagrant VM for bootstrap node
7
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Install & Config Management
• Hadoop Ecosystem 20+ Apache projects!
• Bloomberg Chef-BCPC:
– 2,500 lines of XML configurations
• Everything–site.xml
– 1,000 lines of Hadoop setup Ruby
• 10+ Apache Hadoop components setup for you
• Core Hadoop, HBase, Hive, Hcatalog, Kafka, Mahout, Oozie, Pig,
Sqoop, Zookeeper and more
– 1,600 lines of base setup Ruby
• Networking, MySQL Galera, Graphite, Zabbix, …
– 550 lines of shell scripts
8
OpensourcerecipesforChefdeploymentsofHadoop
/
HBase
HDFS YARN
Hive
DependsOn
Oozie
Map
Reduce
Hive
Metastore
Hive
Server2
Datanodes
Region
Servers
Master(s)
Zookeeper
Resource
Manager(s)
Node
Managers
Depends On
History
Server
Depends On
Depends
On
Hcatalog
WebHCa
t
Depends On
Depends On
Not Shown:
Operating System
Relational Database
Authentication/Security
Load Balancers
Monitoring …
JournalNodes
Namenode(s)
WebHDFS
HTTPFS
Hadoop Complexity
ZKFC
9
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Simple Chef
• Deployment and configuration management
• Ruby code to describes deployment process
• Similar to Puppet, Ansible, Salt, CFEngine, etc.
• More broad (& raw) than Ambari -- today
• Leverages open source Chef Server
10
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
More Chef
• Mature Ecosystem:
– Community cookbooks
– Ruby Gems
• Cluster Features:
– Node attributes
– Searching by roles, recipes, …
• Procedural setup via high-level actions
• Provides error handling
11
OpensourcerecipesforChefdeploymentsofHadoop
/
Chef Idioms – Node Definitions{ “name“ : “cluster1-r5n16.example.com”,
“chef_environment“ : “cluster1”,
“normal“ : {
“bcpc“ : {
“management“ : {
“ip“ : “10.0.1.23”,
“interface“ : “eth1”,
“netmask“ : “255.255.240.0”,
“cidr“ : “10.0.1.0/20”,
“gateway“ : “10.0.1.1” },
“floating“ : {
“ip“ : “10.0.0.23”,
. . .
“gateway“ : “10.0.0.1” },
“hadoop”: { “disks” : [ “sdb”, “sdc”, “sdd”, “sde“ ] }
}
},
“run_list“ : [ “role[Namenode]”, “role[HBaseMaster]” ]
}
12
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Chef Idioms – Nodes By Recipe
def get_nodes_for(recipe, cookbook="bcpc")
results = 
search(:node,"recipes:#{cookbook}::#{recipe} AND" +
"chef_environment:#{node.chef_environment}")
results.map!{|x| x['hostname'] == node[:hostname] ? node : x}
return results.sort
end
zk_hosts = get_nodes_for(“zookeeper_server“,”bcpc-hadoop”)
Implementation:
Usage:
13
OpensourcerecipesforChefdeploymentsofHadoop
/
Chef Idioms – Service Verificationruby_block "Chef Oozie Up"
do
iter = 0; block do
status=`oozie admin -status 2>&1`
while not /NORMAL/ =~ status and $?.to_i
status=`oozie admin -status 2>&1`
if $?.to_i == 0
Chef::Log.debug("Oozie status is not failing - #{status}")
elseif $?.to_i != 0 and iter < 10
sleep(0.5); iter += 1
Chef::Log.debug("Oozie is down - #{status}")
else
raise Chef::Application.fatal! "Oozie is reported as down" +
"for more than 5 seconds -- #{status}"
end
end
Chef::Log.debug("Oozie is up - #{status}")
end
not_if "oozie admin -status"
end
PollLoop
API Verification
Guard Attribute
14
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Chef Idioms – Service Verification
* ruby_block[Check Oozie Up] action run
[2014-05-27T16:20:31-04:00] INFO: Processing ruby_block[Check Oozie Up] action run
(bcpc-hadoop::oozie line 132)
[2014-05-27T16:20:31-04:00] DEBUG: Platform ubuntu version 12.04 found
[2014-05-27T16:20:32-04:00] DEBUG: Oozie is down - Error: IO_ERROR :
java.net.ConnectException: Connection refused
[2014-05-27T16:20:36-04:00] DEBUG: Oozie status is not failing - System mode: NORMAL
[2014-05-27T16:20:36-04:00] DEBUG: Oozie is up - System mode: NORMAL
[2014-05-27T16:20:36-04:00] INFO: ruby_block[Check Oozie Up] called - execute the
ruby block Oozie Down
Output:
PollLoopGuard
Verification
Verification
15
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Maintenance
• Full developer model
• Git
– Can use all standard developer tools
– App teams can merge in their deployment code
• Jenkins for static analysis
• Can move relatively stateless Chef server
– IP Migration
– We build on Vagrant boxes – simply move the VM
16
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Customization
• Ad-Hoc Cluster Work
– Roll-out xfs_recover steps on machines with administrative damage
– sudo rules for debugging
• Pick-and-choose application versions
– HBase 0.98 on a vendor cluster
• Application group deployments
– HDFS Artifacts
– Kafka Topics
– HBase Tables
17
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
High Availability Control Points
• Networking
– NIC Bonding
– Virtual IPs
• MySQL
– Galera
• HDFS
– Journal Nodes & ZKFC
• MapReduce
• HBase
• Oozie
• Hive
18
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
HDFS HA Steps
1. Have a Zookeeper Cluster
2. Stop HDFS Services
– Namenode
– Datanodes
– Anything using HDFS
3. Change Configuration Files
– Hdfs-site.xml (on all nodes)
4. Start journal Nodes
5. Format Zookeeper
6. Initialize Shared Edits
7. Start Primary Namenode
8. Bootstrap Standby Namenode
9. Start Standby Namenode
10. Start Zookeeper Fencing Controller
11. Start Datanodes
19
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
Cloud Integration
• Multi-Tenancy
– Virtual Machines Provide Isolation
– Hadoop Ecosystem Provides Multi-Tenancy
– Why Run Hadoop on-top of VMs?
• Distributed Filesystem
– Provides VM Resiliancy
– Provides Hadoop Resiliancy
• Scheduler
– Picks Optimal Hypervisor per Policy
– Picks Optimal Node per Policy
20
OpensourcerecipesforChefdeploymentsofHadoop
/
http://bloomberg.github.io/chef-bcpc/hadoop Clay Baenziger
We Are Hiring
21
OpensourcerecipesforChefdeploymentsofHadoop
/
Questions/Comments?
http://bloomberg.github.io/chef-bcpc/hadoop
Clay Baenziger
(cbaenziger@bloomberg.net)

More Related Content

What's hot

Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...
Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...
Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...Chef Software, Inc.
 
Chef Fundamentals Training Series Module 2: Workstation Setup
Chef Fundamentals Training Series Module 2: Workstation SetupChef Fundamentals Training Series Module 2: Workstation Setup
Chef Fundamentals Training Series Module 2: Workstation SetupChef Software, Inc.
 
Environments - Fundamentals Webinar Series Week 5
Environments - Fundamentals Webinar Series Week 5Environments - Fundamentals Webinar Series Week 5
Environments - Fundamentals Webinar Series Week 5Chef
 
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2Node setup, resource, and recipes - Fundamentals Webinar Series Part 2
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2Chef
 
Opscode Webinar: Managing Your VMware Infrastructure with Chef
Opscode Webinar: Managing Your VMware Infrastructure with ChefOpscode Webinar: Managing Your VMware Infrastructure with Chef
Opscode Webinar: Managing Your VMware Infrastructure with ChefChef Software, Inc.
 
Common configuration with Data Bags - Fundamentals Webinar Series Part 4
Common configuration with Data Bags - Fundamentals Webinar Series Part 4Common configuration with Data Bags - Fundamentals Webinar Series Part 4
Common configuration with Data Bags - Fundamentals Webinar Series Part 4Chef
 
Automated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAutomated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAlberto Molina Coballes
 
Node object and roles - Fundamentals Webinar Series Part 3
Node object and roles - Fundamentals Webinar Series Part 3Node object and roles - Fundamentals Webinar Series Part 3
Node object and roles - Fundamentals Webinar Series Part 3Chef
 
Chef for OpenStack: Grizzly Roadmap
Chef for OpenStack: Grizzly RoadmapChef for OpenStack: Grizzly Roadmap
Chef for OpenStack: Grizzly RoadmapMatt Ray
 
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...Chef Software, Inc.
 
Boston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack DaysBoston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack DaysMatt Ray
 
Chef for OpenStack December 2012
Chef for OpenStack December 2012Chef for OpenStack December 2012
Chef for OpenStack December 2012Matt Ray
 
Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013Matt Ray
 
Atlanta OpenStack 2014 Chef for OpenStack Deployment Workshop
Atlanta OpenStack 2014 Chef for OpenStack Deployment WorkshopAtlanta OpenStack 2014 Chef for OpenStack Deployment Workshop
Atlanta OpenStack 2014 Chef for OpenStack Deployment WorkshopMatt Ray
 
Building a PaaS using Chef
Building a PaaS using ChefBuilding a PaaS using Chef
Building a PaaS using ChefShaun Domingo
 
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...Chef Software, Inc.
 
Introduction to Chef
Introduction to ChefIntroduction to Chef
Introduction to ChefKnoldus Inc.
 
Automating your infrastructure with Chef
Automating your infrastructure with ChefAutomating your infrastructure with Chef
Automating your infrastructure with ChefJohn Ewart
 
Chef-Zero & Local Mode
Chef-Zero & Local ModeChef-Zero & Local Mode
Chef-Zero & Local ModeMichael Goetz
 

What's hot (20)

Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...
Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...
Chef Fundamentals Training Series Module 4: The Chef Client Run and Expanding...
 
Chef Fundamentals Training Series Module 2: Workstation Setup
Chef Fundamentals Training Series Module 2: Workstation SetupChef Fundamentals Training Series Module 2: Workstation Setup
Chef Fundamentals Training Series Module 2: Workstation Setup
 
Environments - Fundamentals Webinar Series Week 5
Environments - Fundamentals Webinar Series Week 5Environments - Fundamentals Webinar Series Week 5
Environments - Fundamentals Webinar Series Week 5
 
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2Node setup, resource, and recipes - Fundamentals Webinar Series Part 2
Node setup, resource, and recipes - Fundamentals Webinar Series Part 2
 
The unintended benefits of Chef
The unintended benefits of ChefThe unintended benefits of Chef
The unintended benefits of Chef
 
Opscode Webinar: Managing Your VMware Infrastructure with Chef
Opscode Webinar: Managing Your VMware Infrastructure with ChefOpscode Webinar: Managing Your VMware Infrastructure with Chef
Opscode Webinar: Managing Your VMware Infrastructure with Chef
 
Common configuration with Data Bags - Fundamentals Webinar Series Part 4
Common configuration with Data Bags - Fundamentals Webinar Series Part 4Common configuration with Data Bags - Fundamentals Webinar Series Part 4
Common configuration with Data Bags - Fundamentals Webinar Series Part 4
 
Automated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. AnsibleAutomated Deployment and Configuration Engines. Ansible
Automated Deployment and Configuration Engines. Ansible
 
Node object and roles - Fundamentals Webinar Series Part 3
Node object and roles - Fundamentals Webinar Series Part 3Node object and roles - Fundamentals Webinar Series Part 3
Node object and roles - Fundamentals Webinar Series Part 3
 
Chef for OpenStack: Grizzly Roadmap
Chef for OpenStack: Grizzly RoadmapChef for OpenStack: Grizzly Roadmap
Chef for OpenStack: Grizzly Roadmap
 
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...
Chef Fundamentals Training Series Module 3: Setting up Nodes and Cookbook Aut...
 
Boston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack DaysBoston/NYC Chef for OpenStack Hack Days
Boston/NYC Chef for OpenStack Hack Days
 
Chef for OpenStack December 2012
Chef for OpenStack December 2012Chef for OpenStack December 2012
Chef for OpenStack December 2012
 
Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013Chef for OpenStack: OpenStack Spring Summit 2013
Chef for OpenStack: OpenStack Spring Summit 2013
 
Atlanta OpenStack 2014 Chef for OpenStack Deployment Workshop
Atlanta OpenStack 2014 Chef for OpenStack Deployment WorkshopAtlanta OpenStack 2014 Chef for OpenStack Deployment Workshop
Atlanta OpenStack 2014 Chef for OpenStack Deployment Workshop
 
Building a PaaS using Chef
Building a PaaS using ChefBuilding a PaaS using Chef
Building a PaaS using Chef
 
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...
Chef Fundamentals Training Series Module 6: Roles, Environments, Community Co...
 
Introduction to Chef
Introduction to ChefIntroduction to Chef
Introduction to Chef
 
Automating your infrastructure with Chef
Automating your infrastructure with ChefAutomating your infrastructure with Chef
Automating your infrastructure with Chef
 
Chef-Zero & Local Mode
Chef-Zero & Local ModeChef-Zero & Local Mode
Chef-Zero & Local Mode
 

Similar to Open Source Recipes for Chef Deployments of Hadoop

Practical introduction to dev ops with chef
Practical introduction to dev ops with chefPractical introduction to dev ops with chef
Practical introduction to dev ops with chefLeanDog
 
Introduction to Chef
Introduction to ChefIntroduction to Chef
Introduction to Chefkevsmith
 
Overview of Chef - Fundamentals Webinar Series Part 1
Overview of Chef - Fundamentals Webinar Series Part 1Overview of Chef - Fundamentals Webinar Series Part 1
Overview of Chef - Fundamentals Webinar Series Part 1Chef
 
Introduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefIntroduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefAll Things Open
 
Testable Infrastructure with Chef, Test Kitchen, and Docker
Testable Infrastructure with Chef, Test Kitchen, and DockerTestable Infrastructure with Chef, Test Kitchen, and Docker
Testable Infrastructure with Chef, Test Kitchen, and DockerMandi Walls
 
Atmosphere 2014: Really large scale systems configuration - Phil Dibowitz
Atmosphere 2014: Really large scale systems configuration - Phil DibowitzAtmosphere 2014: Really large scale systems configuration - Phil Dibowitz
Atmosphere 2014: Really large scale systems configuration - Phil DibowitzPROIDEA
 
Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleBiju Nair
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...Evans Ye
 
Introduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefIntroduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefNathen Harvey
 
Using Nagios with Chef
Using Nagios with ChefUsing Nagios with Chef
Using Nagios with ChefBryan McLellan
 
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012Patrick McDonnell
 
Cocoapods in action
Cocoapods in actionCocoapods in action
Cocoapods in actionHan Qin
 
TXLF: Chef- Software Defined Infrastructure Today & Tomorrow
TXLF: Chef- Software Defined Infrastructure Today & TomorrowTXLF: Chef- Software Defined Infrastructure Today & Tomorrow
TXLF: Chef- Software Defined Infrastructure Today & TomorrowMatt Ray
 
Introduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitIntroduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitJennifer Davis
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...Evans Ye
 
Chef - Infrastructure Automation for the Masses
Chef - Infrastructure Automation for the Masses�Chef - Infrastructure Automation for the Masses�
Chef - Infrastructure Automation for the MassesSai Perchard
 
Azure handsonlab
Azure handsonlabAzure handsonlab
Azure handsonlabChef
 
SCALE12X: Chef for OpenStack
SCALE12X: Chef for OpenStackSCALE12X: Chef for OpenStack
SCALE12X: Chef for OpenStackMatt Ray
 
Automate Your Automation | DrupalCon Vienna
Automate Your Automation | DrupalCon ViennaAutomate Your Automation | DrupalCon Vienna
Automate Your Automation | DrupalCon ViennaPantheon
 

Similar to Open Source Recipes for Chef Deployments of Hadoop (20)

Practical introduction to dev ops with chef
Practical introduction to dev ops with chefPractical introduction to dev ops with chef
Practical introduction to dev ops with chef
 
Introduction to Chef
Introduction to ChefIntroduction to Chef
Introduction to Chef
 
Overview of Chef - Fundamentals Webinar Series Part 1
Overview of Chef - Fundamentals Webinar Series Part 1Overview of Chef - Fundamentals Webinar Series Part 1
Overview of Chef - Fundamentals Webinar Series Part 1
 
Introduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefIntroduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to Chef
 
Testable Infrastructure with Chef, Test Kitchen, and Docker
Testable Infrastructure with Chef, Test Kitchen, and DockerTestable Infrastructure with Chef, Test Kitchen, and Docker
Testable Infrastructure with Chef, Test Kitchen, and Docker
 
Atmosphere 2014: Really large scale systems configuration - Phil Dibowitz
Atmosphere 2014: Really large scale systems configuration - Phil DibowitzAtmosphere 2014: Really large scale systems configuration - Phil Dibowitz
Atmosphere 2014: Really large scale systems configuration - Phil Dibowitz
 
Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scale
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
Introduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to ChefIntroduction to Infrastructure as Code & Automation / Introduction to Chef
Introduction to Infrastructure as Code & Automation / Introduction to Chef
 
Using Nagios with Chef
Using Nagios with ChefUsing Nagios with Chef
Using Nagios with Chef
 
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012
Lessons from Etsy: Avoiding Kitchen Nightmares - #ChefConf 2012
 
Cocoapods in action
Cocoapods in actionCocoapods in action
Cocoapods in action
 
TXLF: Chef- Software Defined Infrastructure Today & Tomorrow
TXLF: Chef- Software Defined Infrastructure Today & TomorrowTXLF: Chef- Software Defined Infrastructure Today & Tomorrow
TXLF: Chef- Software Defined Infrastructure Today & Tomorrow
 
Introduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen SummitIntroduction to Chef - Techsuperwomen Summit
Introduction to Chef - Techsuperwomen Summit
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
Chef - Infrastructure Automation for the Masses
Chef - Infrastructure Automation for the Masses�Chef - Infrastructure Automation for the Masses�
Chef - Infrastructure Automation for the Masses
 
Azure handsonlab
Azure handsonlabAzure handsonlab
Azure handsonlab
 
SCALE12X: Chef for OpenStack
SCALE12X: Chef for OpenStackSCALE12X: Chef for OpenStack
SCALE12X: Chef for OpenStack
 
What's new in chef 12
What's new in chef 12 What's new in chef 12
What's new in chef 12
 
Automate Your Automation | DrupalCon Vienna
Automate Your Automation | DrupalCon ViennaAutomate Your Automation | DrupalCon Vienna
Automate Your Automation | DrupalCon Vienna
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Open Source Recipes for Chef Deployments of Hadoop

Editor's Notes

  1. What is Chef? Will get to that Chef recipes – infrastructure-as-code (you get what Bloomberg runs internally) BCPC – Bloomberg Clustered Private Cloud – run what we run! Big Top distributions: Apache Hortonworks CDH Compose-able in that one can build from just HDFS to HBase or full-stack system with ease
  2. Developers spend time on airplanes; coffeeshops; who knows where – this allows one to take our stack on a laptop or deploy in Jenkins Vendors need to know how to integrate Application teams always want to tinker and see if they could do better Bare Metal (the PXE to Application stack means performance testing & entire system testing is much faster end-to-end) Gold master can be a Vagrant VM!
  3. Complexity much? Maybe you wanted to then actually deploy an application too? Note this is only six Apache Hadoop components (and there are more than 20!)
  4. Ambari supports more components indeed; the Chef model allows us today to deploy for applications and re-tune very quickly across the entire cluster stack (not solely the Hadoop ecosystem)
  5. Thanks to the question mark author: http://commons.wikimedia.org/wiki/File:Circle-question-blue.svg