Jean-Pierre König, MeMo News AG



  OPENING THE TOOL BOX
  DEVELOPMENT, TESTING AND DEPLOYMENT IN THE HADOOP
  ECOSYSTEM

  14.05.12

http://www.flickr.com/photos/theaucitron/5810163712/sizes/l/in/photostream/
Development

 THE APPLICATION


http://www.flickr.com/photos/oskay/2523189273/sizes/l/in/photostream/
Development

The application is a ...
  • Distributed newsagent
  • GUI-less Java Application
  • Spring-based 2-layer architecture
     • Services and data access objects
  • Client of Hadoop
     • Dependencies to Zookeeper and HBase
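
The two layers named above (services on top of data access objects) can be sketched as follows. This is only an illustrative sketch: `SubscriberDao`, `InMemorySubscriberDao` and `SubscriberService` are hypothetical names, the in-memory DAO stands in for the real HBase/Zookeeper access, and the Spring wiring is replaced by plain constructor injection so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Data access layer: hides where subscribers are stored.
// Hypothetical interface; the real DAOs would talk to HBase or Zookeeper.
interface SubscriberDao {
    void add(String subscriber);
    List<String> findAll();
}

// In-memory stand-in (an assumption) so the sketch runs without a cluster.
final class InMemorySubscriberDao implements SubscriberDao {
    private final List<String> subscribers = new ArrayList<String>();

    @Override
    public void add(final String subscriber) {
        subscribers.add(subscriber);
    }

    @Override
    public List<String> findAll() {
        return Collections.unmodifiableList(subscribers);
    }
}

// Service layer: business rules only, storage is delegated to the DAO.
final class SubscriberService {
    private final SubscriberDao dao;

    // With Spring, the container would supply this constructor argument.
    SubscriberService(final SubscriberDao dao) {
        this.dao = dao;
    }

    // Registers a subscriber unless the name is blank or already known.
    boolean subscribe(final String subscriber) {
        if (subscriber == null || subscriber.trim().isEmpty()
                || dao.findAll().contains(subscriber)) {
            return false;
        }
        dao.add(subscriber);
        return true;
    }

    int numberOfSubscribers() {
        return dao.findAll().size();
    }
}
```

Because the service only sees the `SubscriberDao` interface, a test can hand it the in-memory DAO while production code passes a cluster-backed one.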




Development (2)

We use Maven 3 for
  • Project structure: corporate POM & modules
  • Dependency Management
  • Build the artifact

  [Diagram: Corporate POM with the modules global, newsagent, tools and mapred. Components shown: Model, Utils, Infrastructure, Loader (Client), Services, Data Access Objects]

Development

 MAPREDUCE JOBS


http://www.flickr.com/photos/elasticsoul/61062372/sizes/l/in/photostream/
MapReduce


    • Java MR jobs for business processes
      • Input and output paths either HDFS or HBase
      • MR job chaining by Azkaban
    • Pig and Hive for ad-hoc queries
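
Job chaining boils down to running each job only after the jobs it depends on have finished. The sketch below illustrates that ordering in plain Java. It is not Azkaban's implementation or API; `JobChain` and the job names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Orders jobs so that every job comes after the jobs it depends on.
// Illustrative only; a scheduler such as Azkaban does this for real jobs.
final class JobChain {
    // job name -> names of the jobs that must finish first
    private final Map<String, List<String>> dependencies =
            new LinkedHashMap<String, List<String>>();

    // Declares a job and the jobs it depends on.
    JobChain job(final String name, final String... dependsOn) {
        dependencies.put(name, new ArrayList<String>(Arrays.asList(dependsOn)));
        return this;
    }

    // Depth-first topological sort; throws on cycles or undeclared jobs.
    List<String> executionOrder() {
        final List<String> order = new ArrayList<String>();
        final Set<String> done = new HashSet<String>();
        final Set<String> inProgress = new HashSet<String>();
        for (final String job : dependencies.keySet()) {
            visit(job, done, inProgress, order);
        }
        return order;
    }

    private void visit(final String job, final Set<String> done,
            final Set<String> inProgress, final List<String> order) {
        if (done.contains(job)) {
            return;
        }
        if (!dependencies.containsKey(job)) {
            throw new IllegalArgumentException("Unknown job: " + job);
        }
        if (!inProgress.add(job)) {
            throw new IllegalStateException("Cyclic dependency at: " + job);
        }
        for (final String dependency : dependencies.get(job)) {
            visit(dependency, done, inProgress, order);
        }
        inProgress.remove(job);
        done.add(job);
        order.add(job);
    }
}
```

For example, `new JobChain().job("report", "aggregate").job("aggregate", "import").job("import").executionOrder()` yields the jobs in the order import, aggregate, report.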




Development

 HBASE


http://www.flickr.com/photos/isherwoodchris/6902155937/sizes/l/in/photostream/
HBase

• HBase Schema Manager
  • github.com/jkoenig/hbase-schema-manager
• Utilities to copy/move/rename column families
  and copy complete tables with their data
  • github.com/memonews/hbase-utils
• Stargate REST API without compression
  • github.com/memonews/hbase-stargate



Hadoop, HBase, Zookeeper

 TESTING


http://www.flickr.com/photos/42106306@N00/4380803535/sizes/m/in/photostream/
HBase

• We use the Apache HBaseTestingUtility
• It's an in-memory, complete Hadoop instance
  with DFS, Zookeeper and HBase
• It's very slow: consider these long-running integration tests (IT)
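import static org.junit.Assert.fail;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HBaseTestingUtility;

// HBaseTestingUtilityFactory is a project helper that is not shown on the slide.
public class ConfigurableHBaseClient {
    protected static HBaseTestingUtility TEST_UTIL;

    // Start the mini cluster once per JVM, because startup is slow.
    static {
        final Configuration conf = HBaseConfiguration.create();
        conf.addResource("hbase-default-test.xml");
        try {
            TEST_UTIL = HBaseTestingUtilityFactory.getMiniCluster(1, conf);
        } catch (final Exception e) {
            fail("Could not start hadoop mini cluster.");
        }
    }
}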

MapReduce

• Since business logic is involved, we use hadoop-mrunit
  for testing Map/Reduce jobs
• It’s in-memory testing
    • Parameterized Mapper/Reducer with a driver


@Test
public void reduceShouldWriteExactlyOneLinePerMap() throws IOException {
    final List<DoubleWritable> values = new ArrayList<DoubleWritable>();
    values.add(new DoubleWritable(399287729));
    this.driver.withInput(new Text("de.t-online/nachrichten"), values);
    this.driver.run();
    assertEquals(1, this.driver.getCounters().findCounter(
            MeMoCounters.SIGNALS_WRITTEN).getValue());
}
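
The driver idea (hand a reducer its input in memory, run it, then inspect the output and counters) can be illustrated without any Hadoop dependency. The sketch below is a simplified, hypothetical analogue and not the MRUnit API: `SimpleReducer`, `ReduceContext`, `SimpleReduceDriver` and `SumReducer` are invented for illustration, and the counter is a plain string rather than an enum.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reducer contract; deliberately not the Hadoop Reducer API.
interface SimpleReducer {
    void reduce(String key, List<Double> values, ReduceContext context);
}

// Collects the lines a reducer writes and the counters it increments.
final class ReduceContext {
    final List<String> output = new ArrayList<String>();
    private final Map<String, Long> counters = new HashMap<String, Long>();

    void write(final String line) {
        output.add(line);
    }

    void increment(final String counter) {
        final Long old = counters.get(counter);
        counters.put(counter, old == null ? 1L : old + 1L);
    }

    // Returns 0 for counters that were never incremented.
    long counter(final String name) {
        final Long value = counters.get(name);
        return value == null ? 0L : value;
    }
}

// Driver: holds one reducer plus its input and runs it entirely in memory.
final class SimpleReduceDriver {
    private final SimpleReducer reducer;
    private String key;
    private List<Double> values = new ArrayList<Double>();

    SimpleReduceDriver(final SimpleReducer reducer) {
        this.reducer = reducer;
    }

    SimpleReduceDriver withInput(final String key, final List<Double> values) {
        this.key = key;
        this.values = values;
        return this;
    }

    ReduceContext run() {
        final ReduceContext context = new ReduceContext();
        reducer.reduce(key, values, context);
        return context;
    }
}

// Example reducer under test: writes one line per key with the sum of its values.
final class SumReducer implements SimpleReducer {
    @Override
    public void reduce(final String key, final List<Double> values,
            final ReduceContext context) {
        double sum = 0;
        for (final double value : values) {
            sum += value;
        }
        context.write(key + "\t" + sum);
        context.increment("SIGNALS_WRITTEN");
    }
}
```

A test then builds `new SimpleReduceDriver(new SumReducer())`, calls `withInput(...)` and `run()`, and asserts on `counter("SIGNALS_WRITTEN")` and on `output`, much like the slide's test asserts on the driver's counters.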

Zookeeper

• We use the Apache Zookeeper ClientBase
• It's not in-memory; tests run against the staging
  cluster
    • Prefix paths, e.g. /test/memo/subscribers



@Test
public void getNumberOfSubscribersShouldSetWatchFlag()
        throws KeeperException, InterruptedException {
    final SubscriberDaoImpl subscriberDao =
            new SubscriberDaoImpl(zookeeperDao, DIR, null);
    subscriberDao.getNumberOfSubscribers(listener);
    verify(this.zookeeper, times(1)).getChildren(eq(DIR), eq(subscriberDao));
}
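
When tests share a real cluster, prefixing every znode path keeps test data away from production paths. A minimal sketch of such a prefixing helper follows. `ZkPaths` and its methods are hypothetical and not part of the Zookeeper client; how the talk's code derives `DIR` is not shown on the slides.

```java
// Builds znode paths under a configurable prefix so that tests running
// against a shared cluster never touch production paths.
// Class and method names are hypothetical.
final class ZkPaths {
    private final String prefix;

    // prefix is e.g. "/test" for test runs, or "" for no prefix.
    ZkPaths(final String prefix) {
        this.prefix = normalize(prefix);
    }

    // Returns the prefixed absolute path for a logical path
    // such as "/memo/subscribers".
    String resolve(final String path) {
        final String full = prefix + normalize(path);
        return full.isEmpty() ? "/" : full;
    }

    // Adds a leading slash if missing and strips trailing slashes;
    // null, "" and "/" all become "".
    private static String normalize(final String raw) {
        String p = raw == null ? "" : raw.trim();
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        if (!p.isEmpty() && !p.startsWith("/")) {
            p = "/" + p;
        }
        return p;
    }
}
```

With this helper, `new ZkPaths("/test").resolve("/memo/subscribers")` gives `/test/memo/subscribers`, the kind of path shown on the slide, while an empty prefix leaves the logical path unchanged.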

Deployment

 THE APPLICATION


http://www.flickr.com/photos/navalsurfaceforces/5553412190/sizes/l/in/photostream/
The Application

• Automated build and restart via Capistrano
• Build on every machine
    • There is a .m2 repository everywhere

set :deploy_to, "/usr/share/memo-newsagent"
set :keep_releases, 1

after "deploy:setup" do
  run "mkdir -p /var/run/memo #{shared_path}/logs /var/log/memo/"
  ...
end

after "deploy:update_code" do
  run "cd #{current_release} && mvn install -Pfast > #{shared_path}/logs/build.log"
end

after "deploy", "rowlog:stop", "newsagent:restart", "rowlog:start"

Deployment

 MAPREDUCE JOBS


http://www.flickr.com/photos/navalsurfaceforces/6257239933/sizes/l/in/photostream/
Map Reduce Jobs

• We use a Maven Hadoop plugin
    • hadoop:pack, à la mvn package
    • hadoop:deploy, to HDFS and the target folder
• All dependencies are packed in. Careful: huge
  JARs without dependency management



see github.com/memonews/maven-hadoop

DevOps

 OTHER TOOLS IN USE


http://www.flickr.com/photos/damongman/4979871047/sizes/l/in/photostream/
Other Tools

• Staging environment in-house, a 1:1 copy
  of production (virtualized)
• Azkaban for MR job scheduling
• Jenkins for (integration) tests and metrics
• Git
• Icinga for Monitoring & Alerting
• Ganglia / Graphite for Hadoop Metrics
• Fliwi for automated cluster provisioning

jean-pierre.koenig@menonews.com

THANKS!