Building trust within the
organization, first steps towards
            DevOps
         Guido Serra, txtr
What’s the role of a DevOp(s)?
•   Deliver
•   Be a bridge of trust between DEVs and SysOPs
•   Stop the “throw the ball over the fence” game
•   Mediate
•   Drive non-functional requirements

           … DevOp or DevOps, when talking of one person?
Introduce a DevOp(s)
• At txtr, starting as a QA Manager specialised
  in backend systems seems to have worked
• Other organizations tend to call it Site
  Reliability Engineer / Site Reliability Operation
• But… QA != Testing, not strictly at least
  – Testing should be only a subset of QA, but that is
    not how it is normally perceived
  – Non-functional requirements did not seem to fit in
Non-functional requirements?
• Functional requirements == features
• Non-functional requirements == everything that
  OPS would need to run the service, or even
  things that Product Owners would want but have
  not thought of at design time
  – Logging
     • What kind of information?
     • How?
  – Health checks / Load Balancer required URL
  – Live sales report / Dashboard / Charting
Steps that worked so far
• Listen …to OPS, to PMs, to QA, to R&D
• See how people have solved their specific
  needs when trying to gather information
• Match all the tools that have been built
• Try to gather the essence of those tools, and
  come up with non-functional requirements
• Discuss those with the R&D organization and
  push them at Product level to be prioritized
  over features
TRUST
Means…
• Not having to duplicate work
  – wrongly testing the backend to see if it is
    answering
  – or testing to measure the response times
  – or creating tests again, when there are plenty of
    them that are simply not shared and/or broadly
    understood
The answer is 42?
…no, the answer is DATA!
• By creating a single point of data collection and
  graphing, people are gaining trust in the
  backend
• Logs need to be shared too
• Tests need to be commonly understood
Logging

SHARE LOGS WITH EVERYONE
Tools
• Logging
  – Slf4j > Log4j / JUL > GELF > GrayLog2
     • Logging to syslog from a Java-based backend is pretty
       bad. The stacktrace becomes very hard to fetch
       and report in a ticket. Instead, one link and a
       screenshot, or a cut&paste of a complete stacktrace
       from a web interface, is much easier to digest
     • GELF is a notification format, encapsulating the full
       stacktrace as a message
     • GrayLog2 is a ruby/MongoDB FIFO queue with a nice
       web interface, and an email alerting system
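To make the "full stacktrace as a message" point concrete, here is a minimal sketch of what a GELF payload can look like. The backend in this talk is Java; Python is used here only to show the wire format. The field names follow my reading of the GELF 1.1 payload spec. The `checkout` function, the `backend-01` host and the `api` extra field are made up for illustration.

```python
import json
import socket
import time
import traceback
import zlib


def build_gelf(short_message, full_message, host=None, level=3, **extra):
    """Build a GELF-style payload as a dict; extra fields get the '_' prefix."""
    payload = {
        "version": "1.1",
        "host": host or socket.gethostname(),
        "short_message": short_message,
        "full_message": full_message,  # the complete stack trace, newlines included
        "timestamp": time.time(),
        "level": level,                # syslog severity, 3 = error
    }
    for key, value in extra.items():
        payload["_" + key] = value
    return payload


def encode_gelf(payload):
    """Serialize to JSON and zlib-compress, as GELF over UDP allows."""
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def send_gelf(payload, server, port=12201):
    """Send one datagram (not called here; large messages would need chunking)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(encode_gelf(payload), (server, port))
    finally:
        sock.close()


def checkout():
    raise ValueError("basket is empty")  # hypothetical failing backend call


try:
    checkout()
except ValueError as exc:
    message = build_gelf(str(exc), traceback.format_exc(),
                         host="backend-01", api="checkout")
```

The whole trace travels in `full_message`, so whoever opens the ticket can copy it from the web interface in one piece.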
Why?
• Slf4j
   – It is an abstraction layer on logging facilities
          • I’ll not explain why an “abstraction layer” is good
• Log4j or JUL, at your choice
   – They are the most commonly used
          • Means: their code is maintained
• GELF
   – It keeps a full stacktrace in a single message. There is no need to
     reconstruct it from syslog, where it is spread over multiple lines
     with additional garbage/timestamps
• GrayLog2
   – We have an in-house developer, and it is working pretty well
   – Has threshold-based alerting per stream of events (regexp)
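The idea of threshold-based alerting on a regexp-defined stream can be sketched as below. This is not GrayLog2's implementation, only an illustration of the concept. The class name, the pattern and the numbers are assumptions.

```python
import re
from collections import deque


class StreamAlert:
    """Fire when more than `threshold` matching messages arrive within `window` seconds."""

    def __init__(self, pattern, threshold, window):
        self.pattern = re.compile(pattern)
        self.threshold = threshold
        self.window = window
        self.hits = deque()  # timestamps of matching messages, oldest first

    def observe(self, timestamp, message):
        """Feed one log message; return True if the alert should fire."""
        if self.pattern.search(message):
            self.hits.append(timestamp)
        # forget matches that fell out of the time window
        while self.hits and self.hits[0] <= timestamp - self.window:
            self.hits.popleft()
        return len(self.hits) > self.threshold


alert = StreamAlert(r"PaymentException", threshold=2, window=60)
events = [
    (0, "PaymentException: card declined"),
    (10, "user logged in"),
    (20, "PaymentException: gateway timeout"),
    (30, "PaymentException: gateway timeout"),
    (200, "PaymentException: card declined"),
]
fired = [alert.observe(ts, msg) for ts, msg in events]
```

Only the fourth event trips the alert: it is the third match inside one minute. The last match arrives alone, long after the others have expired.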
Results seen so far
• 1st level support team is gaining trust in the
  application.
  – Logs are getting more and more readable
  – Events can be correlated much more easily
• 2nd level support (OPS) can set alert
  thresholds and react promptly, having alerts tied
  to real traffic data and not “one time probes”
• I have a better feel for the trend of issues in
  production, and I don’t have to dig for logs
Instrumented metrics

PRODUCTION PERFORMANCE
Tools
• Instrumented metrics
  – JMX > Jolokia > JSON > Graphite
     • MIN / MAX / AVG response time of each API
     • Worst response times with related API parameters
     • Success / failure counters
     • All the above aggregated over the last 5 / 15 minutes, 1
       hour, 24 hours
     • Plus all the standard information exposed via JConsole / JMX
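The aggregation listed above (MIN / MAX / AVG, worst call with its parameters, success / failure counters over a time window) could be sketched as follows. In the real setup these numbers live in the Java backend and are exposed via JMX; this Python version only illustrates the bookkeeping. `ApiStats` and the sample data are hypothetical, and samples are assumed to be recorded in time order.

```python
from collections import deque


class ApiStats:
    """Rolling response-time statistics for one API over a time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.samples = deque()  # (timestamp, duration_ms, ok, params), oldest first

    def record(self, timestamp, duration_ms, ok, params=None):
        self.samples.append((timestamp, duration_ms, ok, params))

    def snapshot(self, now):
        """Return min/max/avg, counters and the worst call within the window."""
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        if not self.samples:
            return {"count": 0}
        durations = [s[1] for s in self.samples]
        worst = max(self.samples, key=lambda s: s[1])
        return {
            "count": len(durations),
            "min": min(durations),
            "max": max(durations),
            "avg": sum(durations) / len(durations),
            "success": sum(1 for s in self.samples if s[2]),
            "failure": sum(1 for s in self.samples if not s[2]),
            "worst_params": worst[3],
        }


stats = ApiStats(window_seconds=300)          # "last 5 minutes"
stats.record(0, 120, True, {"book": "a"})     # too old for the snapshot below
stats.record(100, 40, True, {"book": "b"})
stats.record(200, 900, False, {"book": "c"})
stats.record(350, 80, True, {"book": "d"})
last_5_min = stats.snapshot(now=360)
```

One such object per API and per window (5 / 15 minutes, 1 hour, 24 hours) would cover the list above. Keeping the parameters of the worst call is what makes a slow request reproducible later.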
Why?
• JMX
  – It is built into Java, and it is non-invasive
     • R&D loves it, because it does not need an invasive agent, unlike
       many profiling agents that are normally used in such cases.
       Standard profiling agents tend to interfere with the
       application and decrease the overall performance.
  – It is a standard, so there are many tools that plug into
    it natively
• Jolokia
  – It is a standard tool that plugs into JMX and exposes it
    in JSON-encoded format
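The last hop of the JMX > Jolokia > JSON > Graphite chain can be sketched like this: take the JSON that a Jolokia read request returns and turn it into lines in Graphite's plaintext format (`path value timestamp`). The sample response is hand-written in the shape Jolokia uses for `java.lang:type=Memory` / `HeapMemoryUsage`. The metric prefix `backend01.jvm.heap` and the values are made up.

```python
import json


def flatten(prefix, value):
    """Yield (metric_path, number) pairs from a nested Jolokia 'value'."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(prefix + "." + key, child)
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        yield prefix, value


def to_graphite_lines(prefix, jolokia_response):
    """Turn one Jolokia read response into Graphite plaintext lines."""
    body = json.loads(jolokia_response)
    if body.get("status") != 200:
        return []  # skip failed reads instead of graphing garbage
    timestamp = body["timestamp"]  # seconds since the epoch
    return ["%s %s %d" % (path, number, timestamp)
            for path, number in flatten(prefix, body["value"])]


sample = json.dumps({
    "request": {"mbean": "java.lang:type=Memory",
                "attribute": "HeapMemoryUsage", "type": "read"},
    "value": {"init": 134217728, "committed": 532676608,
              "max": 1908932608, "used": 101421696},
    "timestamp": 1317657600,
    "status": 200,
})
lines = to_graphite_lines("backend01.jvm.heap", sample)
```

Each line, terminated by a newline, can then be written to the Graphite (carbon) plaintext listener, which usually sits on TCP port 2003.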
Why?
• Graphite
  – It can correlate data from many sources
  – Gives me the freedom of structuring graphs as I
    want, directly from the web interface
     • This is a definite WIN over Munin or Cacti
  – It lets me select specific timeframes
     • Useful for outage investigation, which is not
       possible with Munin
  – Can create dashboards
Data are in transactions “per 5 minutes” in this graph
…you can see this specific service is currently being used
100 transactions per second
uhmm… at 7a.m., ok 11a.m. in India
someone is testing…
Results seen so far
• No need for load and performance testing
  – Apart from specific cases, to try to reproduce an
    issue so that DEVs can work on it.
  – Producing a proper load test is problematic, and
    can lead to false assumptions about the product.
    Being able to watch what the business
    logic is doing in production is the best load test.
• DEVs are proactively watching and fixing
  performance issues on their own. The overall
  product gets better and better.
Testing

SHARE TESTS AND RESULTS
Tools
• Testing
  – BDD / Cucumber-Nagios executed by Jenkins
     • Cover all the fast HTTP actions via Watir
     • API calls via JsonRPC or Soap4r
     • Javascript based UI via Selenium / Capybara
• These tests are actually very valuable at
  deployment time, since there is no need for
  manual testing. Everything is in the hands of whoever
  follows the deployment.
Why?
• BDD
  – Not everyone wants to read your code
  – Not everyone is a coder
  – You don’t want to have to explain your test again and
    again and again, and you hate documenting
• Cucumber-Nagios / Ruby
  – It is off-the-shelf, it works.
  – It generates a standard JUnit XML report
     • Means: it directly integrates with Jenkins (formerly Hudson)
  – It generates an awesome HTML report
  – It can be extended pretty easily
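For readers who have never opened one, a JUnit-style XML report is roughly the structure below. This is a sketch of the commonly used `testsuite` / `testcase` / `failure` layout, not the exact output of Cucumber-Nagios. The suite and scenario names are invented.

```python
import xml.etree.ElementTree as ET


def junit_report(suite_name, results):
    """Build a JUnit-style XML report from (name, seconds, failure_or_None) tuples."""
    failures = sum(1 for _, _, failure in results if failure)
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for name, seconds, failure in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name,
                             name=name, time="%.3f" % seconds)
        if failure:
            # a failed scenario carries a <failure> child with the reason
            ET.SubElement(case, "failure", message=failure).text = failure
    return ET.tostring(suite, encoding="unicode")


report = junit_report("Checkout API", [
    ("Health URL answers 200", 0.120, None),
    ("Purchase with empty basket is rejected", 0.480, "expected 400, got 500"),
])
```

Jenkins only needs to be pointed at files of this shape to show pass/fail trends per scenario, which is why the format matters more than the tool that wrote it.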
Why?
• Watir
  – It is the default HTTP client in Cucumber-Nagios
     • BUT: it has tons of bugs… I have a long backlog to fix
  – It is fast
• Soap4r
  – Pretty easy SOAP ruby gem/library
• JsonRPC
  – Very simple and basic JSON RPC gem/library
     • BUT: it does not support proxy settings
Why?
• Selenium
  – Because it is the only one?
  – It supports Javascript
  – It supports clustering of testing nodes
  – It is supposed to be easy to integrate with
    Cucumber (it is NOT …I’m working on it)
Upcoming…
• Health checks (normally used for load
  balancing purposes) are based on business
  logic historical data from within the
  instrumented metrics
• Continuous integration
  – Configuration management
• Data mining
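One possible reading of the health-check item above is a load balancer URL that answers from the recent instrumented metrics instead of a one-off probe. This is a minimal sketch of such a decision function. The thresholds, the function name and the choice of 503 for "unhealthy" are all assumptions, not something the slides specify.

```python
def health_status(success, failure, avg_ms,
                  max_failure_ratio=0.05, max_avg_ms=500):
    """Return (http_status, reason) from the counters of the last time window."""
    total = success + failure
    if total == 0:
        # no traffic means nothing to judge; stay in the pool
        return 200, "no traffic in window"
    ratio = failure / total
    if ratio > max_failure_ratio:
        return 503, "failure ratio %.2f above %.2f" % (ratio, max_failure_ratio)
    if avg_ms > max_avg_ms:
        return 503, "average response time %d ms above %d ms" % (avg_ms, max_avg_ms)
    return 200, "ok"


healthy = health_status(success=99, failure=1, avg_ms=120)
failing = health_status(success=80, failure=20, avg_ms=120)
slow = health_status(success=100, failure=0, avg_ms=900)
idle = health_status(success=0, failure=0, avg_ms=0)
```

The inputs would come from the same success / failure counters and average response times that are already collected for graphing, so the load balancer and the dashboards would see the same data.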
guido.serra@txtr.com

QUESTIONS?

                       http://slidesha.re/rVzd8F
