SlideShare una empresa de Scribd logo
1 de 15
1
Gerrit + Jenkins =
Continuous Delivery for Big Data
Mountain View, CA, November 2015
Stefano Galarraga
GerritForge
stefano@gerritforge.com
http://www.gerritforge.com
Real-life case study and future developments
The Team
2
Luca Milanesio
• Co-founder and Director of GerritForge
• over 20 years in Agile Development and ALM
• OpenSource contributor to many projects
(BigData, Continuous Integration, Git/Gerrit)
Antonios Chalkiopulos
• Author of Programming MapReduce with Scalding
• Open source contributor to many BigData projects
• Working on the "land-of-Hadoop' (landoop.com)
Tiago Palma
• Data Warehouse & Big Data Development
• Senior Data Modeler
• Big Data infrastructure specialist
Stefano Galarraga
• 20 years of Agile Development
• Middleware, Big Data, Reactive Distributed Systems.
• Open Source contributor to BigData projects.
Agenda
• What’s special in Big Data
– General lack of support for Unite/Integration testing
– Testing the "real thing" (aka the Cluster)
• Why Gerrit for continuous deployment on BigData?
• Our Development Lifecycle ingredients
– Gerrit, Jenkins, Mesos, Marathon, CDH / Spark
• Gerrit Role and Components
– What did we use, why, what we would like to have
• New developments
– Usint Topics with microservices for “atomic” multi-service changes
• Live (minimised) Demo
• Open points and discussion
3
WHY Gerrit?
• Fast Paced
• Distributed team
• Relatively a “niche” technology
– A lot of “junior” developers
– Need for strong ownership
– Validation rules
– CD => We need to be have green build and consistent code
quality
4
Code-Review Lifecycle
• GIT used by distributed teams (UK, Israel, India)
• Topics and Code Review
• Jenkins build on every patch-set
• Commits reviewed / approved via Gerrit Submit
• Submitting a Topic automatically does:
– all patch-sets merged (semi-atomically)
– trigger a longer chain of CI steps
– automatically promote a RC if everything passes
• Jenkins automation via Gerrit Trigger Plugin
5
Build Steps and Solutions
• Unit tests abstracting from dependencies
• Integration Tests:
– Using Docker to run dependencies on the CI
• “Micro” Hadoop cluster or other dependencies (DBs,
messaging) => Jenkins docker plugin
• When possible “dockerizing” just the required
components and driving them from the test framework
• Performance/Acceptance required a real cluster
6
Fitting CDH Into this Picture
• Acceptance / performance test with short-lived CDHs
• Solution: Mesos, Marathon and Docker:
– Ephemeral clusters with defined capacity
– Automatic cluster-config
– All controlled via Docker/Mesos
• This was quite a long process!!
– mostly because of CDH cluster configuration
7
Mesos + Marathon
8
• Apache Mesos
– Abstracts CPU, memory, storage, other compute
resources away from machines
• Marathon Framework
– Runs on top of Mesos
– Guarantees that long-running applications never
stop
– REST API for managing and scaling services
CDH Components
• CDH 5.4.1 distribution
– Apache Spark
– Hadoop HDFS
– YARN
9
Slave Host
Integration/Performance Test Flow on
CDH Cluster
10
Jenkins
Master
Mesos
Master
Marathon Private
Docker Registry
Mesos
Slave
Docker
POST to Marathon REST
API to start 1 docker
container with Cloudera
Manager and N docker
containers with cloudera
agents
Marathon Framework
receives resource
offers from Mesos
Master and submits
the tasks
The task is sent to the
Mesos Slave
Mesos slave starts
the docker container
Docker image is fetched
from Docker registry if not
present in Slave host
WaitingforDockers
DockersUP
Install Cloudera packages via Cloudera Manager API using Python
Deploy the ETL, run the ETL and the Acceptance Tests
Unit and Integration Tests sample
• Test project:
– Test Spark project
– ETL from Oracle to HDFS
• Unit-test directly on Spark logic
• Integration tests for every patch-set:
– VERY small dataset just for this demo
– CDH and Oracle Docker Images
11
O
Unit and Integration Tests
12
Hadoop Pseudo-
distributed mode
Spark Standalone
Jenkins
Build Job
init
Submit job
Init/read HDFS
DEMO
13
Open Point and Discussion
• Topic based build of multiple artifacts
– Demo implementation is naïve and difficult to maintain
– Race conditions on build of dependent artifacts
• Need more advanced triggering system (zuul might fit)
– Race condition on submit of topic
• Stream event: “topic-submitted” instead/in addition of
many “patch-submitted” event
• Gerrit Trigger plugin should listen to this event to
coordinate
14
Questions?

Más contenido relacionado

La actualidad más candente

The Road to Continuous Delivery: Evolution Not Revolution 
The Road to Continuous Delivery: Evolution Not Revolution The Road to Continuous Delivery: Evolution Not Revolution 
The Road to Continuous Delivery: Evolution Not Revolution Perforce
 
Intro to Git: a hands-on workshop
Intro to Git: a hands-on workshopIntro to Git: a hands-on workshop
Intro to Git: a hands-on workshopCisco DevNet
 
CloudFest 2018 Hackathon Project Results Presentation - CFHack18
CloudFest 2018 Hackathon Project Results Presentation - CFHack18CloudFest 2018 Hackathon Project Results Presentation - CFHack18
CloudFest 2018 Hackathon Project Results Presentation - CFHack18Jeffrey J. Hardy
 
OpenShift In a Nutshell - Episode 02 - Architecture
OpenShift In a Nutshell - Episode 02 - ArchitectureOpenShift In a Nutshell - Episode 02 - Architecture
OpenShift In a Nutshell - Episode 02 - ArchitectureBehnam Loghmani
 
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)Roberto Pérez Alcolea
 
Introduction To Flink
Introduction To FlinkIntroduction To Flink
Introduction To FlinkKnoldus Inc.
 
Jenkins2 - Coding Continuous Delivery Pipelines
Jenkins2 - Coding Continuous Delivery PipelinesJenkins2 - Coding Continuous Delivery Pipelines
Jenkins2 - Coding Continuous Delivery PipelinesBrent Laster
 
Devops CI-CD pipeline with Containers
Devops CI-CD pipeline with ContainersDevops CI-CD pipeline with Containers
Devops CI-CD pipeline with ContainersNuSpace
 
GitLab for CI/CD process
GitLab for CI/CD processGitLab for CI/CD process
GitLab for CI/CD processHYS Enterprise
 
Advanced GitHub Enterprise Administration
Advanced GitHub Enterprise AdministrationAdvanced GitHub Enterprise Administration
Advanced GitHub Enterprise AdministrationLars Schneider
 
Build your android app with gradle
Build your android app with gradleBuild your android app with gradle
Build your android app with gradleSwain Loda
 
Java and DevOps: Supercharge Your Delivery Pipeline with Containers
Java and DevOps: Supercharge Your Delivery Pipeline with ContainersJava and DevOps: Supercharge Your Delivery Pipeline with Containers
Java and DevOps: Supercharge Your Delivery Pipeline with ContainersRed Hat Developers
 
Introduction to GitHub Actions - How to easily automate and integrate with Gi...
Introduction to GitHub Actions - How to easily automate and integrate with Gi...Introduction to GitHub Actions - How to easily automate and integrate with Gi...
Introduction to GitHub Actions - How to easily automate and integrate with Gi...All Things Open
 
Selecting a Container Image Registry for Production - Microservices Meetup Fe...
Selecting a Container Image Registry for Production - Microservices Meetup Fe...Selecting a Container Image Registry for Production - Microservices Meetup Fe...
Selecting a Container Image Registry for Production - Microservices Meetup Fe...Ritesh Patel
 
Introduction to Git for developers
Introduction to Git for developersIntroduction to Git for developers
Introduction to Git for developersDmitry Guyvoronsky
 
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureKubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureBob Killen
 
Collaborating on GitHub for Open Source Documentation
Collaborating on GitHub for Open Source DocumentationCollaborating on GitHub for Open Source Documentation
Collaborating on GitHub for Open Source DocumentationAnne Gentle
 

La actualidad más candente (20)

The Road to Continuous Delivery: Evolution Not Revolution 
The Road to Continuous Delivery: Evolution Not Revolution The Road to Continuous Delivery: Evolution Not Revolution 
The Road to Continuous Delivery: Evolution Not Revolution 
 
Intro to Git: a hands-on workshop
Intro to Git: a hands-on workshopIntro to Git: a hands-on workshop
Intro to Git: a hands-on workshop
 
CloudFest 2018 Hackathon Project Results Presentation - CFHack18
CloudFest 2018 Hackathon Project Results Presentation - CFHack18CloudFest 2018 Hackathon Project Results Presentation - CFHack18
CloudFest 2018 Hackathon Project Results Presentation - CFHack18
 
OpenShift In a Nutshell - Episode 02 - Architecture
OpenShift In a Nutshell - Episode 02 - ArchitectureOpenShift In a Nutshell - Episode 02 - Architecture
OpenShift In a Nutshell - Episode 02 - Architecture
 
GitHub for partners
GitHub for partnersGitHub for partners
GitHub for partners
 
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
Leveraging Gradle @ Netflix (Madrid GUG Feb 2, 2021)
 
Introduction To Flink
Introduction To FlinkIntroduction To Flink
Introduction To Flink
 
Jenkins2 - Coding Continuous Delivery Pipelines
Jenkins2 - Coding Continuous Delivery PipelinesJenkins2 - Coding Continuous Delivery Pipelines
Jenkins2 - Coding Continuous Delivery Pipelines
 
Devops CI-CD pipeline with Containers
Devops CI-CD pipeline with ContainersDevops CI-CD pipeline with Containers
Devops CI-CD pipeline with Containers
 
GitLab for CI/CD process
GitLab for CI/CD processGitLab for CI/CD process
GitLab for CI/CD process
 
Advanced GitHub Enterprise Administration
Advanced GitHub Enterprise AdministrationAdvanced GitHub Enterprise Administration
Advanced GitHub Enterprise Administration
 
Build your android app with gradle
Build your android app with gradleBuild your android app with gradle
Build your android app with gradle
 
Java and DevOps: Supercharge Your Delivery Pipeline with Containers
Java and DevOps: Supercharge Your Delivery Pipeline with ContainersJava and DevOps: Supercharge Your Delivery Pipeline with Containers
Java and DevOps: Supercharge Your Delivery Pipeline with Containers
 
Introduction to GitHub Actions - How to easily automate and integrate with Gi...
Introduction to GitHub Actions - How to easily automate and integrate with Gi...Introduction to GitHub Actions - How to easily automate and integrate with Gi...
Introduction to GitHub Actions - How to easily automate and integrate with Gi...
 
Selecting a Container Image Registry for Production - Microservices Meetup Fe...
Selecting a Container Image Registry for Production - Microservices Meetup Fe...Selecting a Container Image Registry for Production - Microservices Meetup Fe...
Selecting a Container Image Registry for Production - Microservices Meetup Fe...
 
Introduction to Git for developers
Introduction to Git for developersIntroduction to Git for developers
Introduction to Git for developers
 
Building a universal search interface for the Cloud
Building a universal search interface for the CloudBuilding a universal search interface for the Cloud
Building a universal search interface for the Cloud
 
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community CultureKubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
KubeCon EU 2021 Keynote: Shaping Kubernetes Community Culture
 
The Automated Monolith
The Automated MonolithThe Automated Monolith
The Automated Monolith
 
Collaborating on GitHub for Open Source Documentation
Collaborating on GitHub for Open Source DocumentationCollaborating on GitHub for Open Source Documentation
Collaborating on GitHub for Open Source Documentation
 

Similar a Gerrit + Jenkins = Continuous Delivery For Big Data

JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data Projects
JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data ProjectsJUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data Projects
JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data ProjectsCloudBees
 
Gerrit jenkins-big data-continuous-delivery
Gerrit jenkins-big data-continuous-deliveryGerrit jenkins-big data-continuous-delivery
Gerrit jenkins-big data-continuous-deliveryLuca Milanesio
 
DockerCon 15 Keynote - Day 2
DockerCon 15 Keynote - Day 2DockerCon 15 Keynote - Day 2
DockerCon 15 Keynote - Day 2Docker, Inc.
 
Lessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersLessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersBlueData, Inc.
 
DockerPenang Meetup#1
DockerPenang Meetup#1DockerPenang Meetup#1
DockerPenang Meetup#1Sujay Pillai
 
Tips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTaro L. Saito
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningEvans Ye
 
Leveraging Docker for Hadoop build automation and Big Data stack provisioning
Leveraging Docker for Hadoop build automation and Big Data stack provisioningLeveraging Docker for Hadoop build automation and Big Data stack provisioning
Leveraging Docker for Hadoop build automation and Big Data stack provisioningDataWorks Summit
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...Evans Ye
 
Docker at and with SignalFx
Docker at and with SignalFxDocker at and with SignalFx
Docker at and with SignalFxSignalFx
 
DevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceDevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceGrid Dynamics
 
Containers and Microservices for Realists
Containers and Microservices for RealistsContainers and Microservices for Realists
Containers and Microservices for RealistsOracle Developers
 
Containers and microservices for realists
Containers and microservices for realistsContainers and microservices for realists
Containers and microservices for realistsKarthik Gaekwad
 
Docker for the enterprise
Docker for the enterpriseDocker for the enterprise
Docker for the enterpriseBert Poller
 
DEVNET-1169 CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...
DEVNET-1169	CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...DEVNET-1169	CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...
DEVNET-1169 CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...Cisco DevNet
 
10 tips for Cloud Native Security
10 tips for Cloud Native Security10 tips for Cloud Native Security
10 tips for Cloud Native SecurityKarthik Gaekwad
 
Devoxx 2016 - Docker Nuts and Bolts
Devoxx 2016 - Docker Nuts and BoltsDevoxx 2016 - Docker Nuts and Bolts
Devoxx 2016 - Docker Nuts and BoltsPatrick Chanezon
 
DevSecOps in a cloudnative world
DevSecOps in a cloudnative worldDevSecOps in a cloudnative world
DevSecOps in a cloudnative worldKarthik Gaekwad
 
Modern Web-site Development Pipeline
Modern Web-site Development PipelineModern Web-site Development Pipeline
Modern Web-site Development PipelineGlobalLogic Ukraine
 

Similar a Gerrit + Jenkins = Continuous Delivery For Big Data (20)

JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data Projects
JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data ProjectsJUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data Projects
JUC Europe 2015: Jenkins Pipeline for Continuous Delivery of Big Data Projects
 
Gerrit jenkins-big data-continuous-delivery
Gerrit jenkins-big data-continuous-deliveryGerrit jenkins-big data-continuous-delivery
Gerrit jenkins-big data-continuous-delivery
 
DockerCon 15 Keynote - Day 2
DockerCon 15 Keynote - Day 2DockerCon 15 Keynote - Day 2
DockerCon 15 Keynote - Day 2
 
Lessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker ContainersLessons Learned Running Hadoop and Spark in Docker Containers
Lessons Learned Running Hadoop and Spark in Docker Containers
 
DockerPenang Meetup#1
DockerPenang Meetup#1DockerPenang Meetup#1
DockerPenang Meetup#1
 
Tips For Maintaining OSS Projects
Tips For Maintaining OSS ProjectsTips For Maintaining OSS Projects
Tips For Maintaining OSS Projects
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioning
 
Leveraging Docker for Hadoop build automation and Big Data stack provisioning
Leveraging Docker for Hadoop build automation and Big Data stack provisioningLeveraging Docker for Hadoop build automation and Big Data stack provisioning
Leveraging Docker for Hadoop build automation and Big Data stack provisioning
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
Docker {at,with} SignalFx
Docker {at,with} SignalFxDocker {at,with} SignalFx
Docker {at,with} SignalFx
 
Docker at and with SignalFx
Docker at and with SignalFxDocker at and with SignalFx
Docker at and with SignalFx
 
DevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceDevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 Conference
 
Containers and Microservices for Realists
Containers and Microservices for RealistsContainers and Microservices for Realists
Containers and Microservices for Realists
 
Containers and microservices for realists
Containers and microservices for realistsContainers and microservices for realists
Containers and microservices for realists
 
Docker for the enterprise
Docker for the enterpriseDocker for the enterprise
Docker for the enterprise
 
DEVNET-1169 CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...
DEVNET-1169	CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...DEVNET-1169	CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...
DEVNET-1169 CI/CT/CD on a Micro Services Applications using Docker, Salt & Ni...
 
10 tips for Cloud Native Security
10 tips for Cloud Native Security10 tips for Cloud Native Security
10 tips for Cloud Native Security
 
Devoxx 2016 - Docker Nuts and Bolts
Devoxx 2016 - Docker Nuts and BoltsDevoxx 2016 - Docker Nuts and Bolts
Devoxx 2016 - Docker Nuts and Bolts
 
DevSecOps in a cloudnative world
DevSecOps in a cloudnative worldDevSecOps in a cloudnative world
DevSecOps in a cloudnative world
 
Modern Web-site Development Pipeline
Modern Web-site Development PipelineModern Web-site Development Pipeline
Modern Web-site Development Pipeline
 

Último

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 

Último (20)

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 

Gerrit + Jenkins = Continuous Delivery For Big Data

  • 1. 1 Gerrit + Jenkins = Continuous Delivery for Big Data Mountain View, CA, November 2015 Stefano Galarraga GerritForge stefano@gerritforge.com http://www.gerritforge.com Real-life case study and future developments
  • 2. The Team 2 Luca Milanesio • Co-founder and Director of GerritForge • over 20 years in Agile Development and ALM • OpenSource contributor to many projects (BigData, Continuous Integration, Git/Gerrit) Antonios Chalkiopulos • Author of Programming MapReduce with Scalding • Open source contributor to many BigData projects • Working on the "land-of-Hadoop' (landoop.com) Tiago Palma • Data Warehouse & Big Data Development • Senior Data Modeler • Big Data infrastructure specialist Stefano Galarraga • 20 years of Agile Development • Middleware, Big Data, Reactive Distributed Systems. • Open Source contributor to BigData projects.
  • 3. Agenda • What’s special in Big Data – General lack of support for Unite/Integration testing – Testing the "real thing" (aka the Cluster) • Why Gerrit for continuous deployment on BigData? • Our Development Lifecycle ingredients – Gerrit, Jenkins, Mesos, Marathon, CDH / Spark • Gerrit Role and Components – What did we use, why, what we would like to have • New developments – Usint Topics with microservices for “atomic” multi-service changes • Live (minimised) Demo • Open points and discussion 3
  • 4. WHY Gerrit? • Fast Paced • Distributed team • Relatively a “niche” technology – A lot of “junior” developers – Need for strong ownership – Validation rules – CD => We need to be have green build and consistent code quality 4
  • 5. Code-Review Lifecycle • GIT used by distributed teams (UK, Israel, India) • Topics and Code Review • Jenkins build on every patch-set • Commits reviewed / approved via Gerrit Submit • Submitting a Topic automatically does: – all patch-sets merged (semi-atomically) – trigger a longer chain of CI steps – automatically promote a RC if everything passes • Jenkins automation via Gerrit Trigger Plugin 5
  • 6. Build Steps and Solutions • Unit tests abstracting from dependencies • Integration Tests: – Using Docker to run dependencies on the CI • “Micro” Hadoop cluster or other dependencies (DBs, messaging) => Jenkins docker plugin • When possible “dockerizing” just the required components and driving them from the test framework • Performance/Acceptance required a real cluster 6
  • 7. Fitting CDH Into this Picture • Acceptance / performance test with short-lived CDHs • Solution: Mesos, Marathon and Docker: – Ephemeral clusters with defined capacity – Automatic cluster-config – All controlled via Docker/Mesos • This was quite a long process!! – mostly because of CDH cluster configuration 7
  • 8. Mesos + Marathon 8 • Apache Mesos – Abstracts CPU, memory, storage, other compute resources away from machines • Marathon Framework – Runs on top of Mesos – Guarantees that long-running applications never stop – REST API for managing and scaling services
  • 9. CDH Components • CDH 5.4.1 distribution – Apache Spark – Hadoop HDFS – YARN 9
  • 10. Slave Host Integration/Performance Test Flow on CDH Cluster 10 Jenkins Master Mesos Master Marathon Private Docker Registry Mesos Slave Docker POST to Marathon REST API to start 1 docker container with Cloudera Manager and N docker containers with cloudera agents Marathon Framework receives resource offers from Mesos Master and submits the tasks The task is sent to the Mesos Slave Mesos slave starts the docker container Docker image is fetched from Docker registry if not present in Slave host WaitingforDockers DockersUP Install Cloudera packages via Cloudera Manager API using Python Deploy the ETL, run the ETL and the Acceptance Tests
  • 11. Unit and Integration Tests sample • Test project: – Test Spark project – ETL from Oracle to HDFS • Unit-test directly on Spark logic • Integration tests for every patch-set: – VERY small dataset just for this demo – CDH and Oracle Docker Images 11
  • 12. O Unit and Integration Tests 12 Hadoop Pseudo- distributed mode Spark Standalone Jenkins Build Job init Submit job Init/read HDFS
  • 14. Open Point and Discussion • Topic based build of multiple artifacts – Demo implementation is naïve and difficult to maintain – Race conditions on build of dependent artifacts • Need more advanced triggering system (zuul might fit) – Race condition on submit of topic • Stream event: “topic-submitted” instead/in addition of many “patch-submitted” event • Gerrit Trigger plugin should listen to this event to coordinate 14