Automate Application Quality Detection. Use key application quality metrics (# of SQL, memory allocated, CPU & GC times, ...) captured during automated test executions. Let these metrics act as quality gates – this leads to better-quality software reaching the end of the pipeline.
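To make the idea concrete, here is a minimal sketch of what such a quality gate could look like as a plain JUnit test; the TestMetrics holder, the stubbed collection method and the threshold values are hypothetical placeholders for whatever your monitoring tool actually provides:

```java
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Minimal sketch of a metrics-based quality gate.
// Names and thresholds are hypothetical, not a specific vendor API.
public class PurchaseQualityGateTest {

    // Hypothetical holder for metrics captured during an automated test run,
    // e.g. from an APM agent or an instrumented test harness.
    static class TestMetrics {
        int sqlCount;          // # of SQL statements executed
        long bytesAllocated;   // memory allocated during the test
        long cpuTimeMs;        // CPU time consumed
        long gcTimeMs;         // time spent in garbage collection
    }

    // Stub: in a real setup these values would come from your monitoring tool.
    private TestMetrics runPurchaseAndCollectMetrics() {
        TestMetrics m = new TestMetrics();
        m.sqlCount = 12;
        m.bytesAllocated = 4_500_000;
        m.cpuTimeMs = 120;
        m.gcTimeMs = 8;
        return m;
    }

    @Test
    public void purchaseStaysWithinQualityGate() {
        TestMetrics m = runPurchaseAndCollectMetrics();

        // The gate: fail the build if key metrics exceed agreed thresholds.
        assertTrue("too many SQL statements: " + m.sqlCount, m.sqlCount <= 15);
        assertTrue("too much memory allocated", m.bytesAllocated <= 10_000_000);
        assertTrue("CPU time regression", m.cpuTimeMs <= 200);
        assertTrue("GC time regression", m.gcTimeMs <= 50);
    }
}
```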
2. Example of a “Bad” Web Deployment
- 282! objects on that page
- 9.68MB page size
- 8.8s page load time
- Most objects are images delivered from your main domain
- Very long connect time (1.8s) to your CDN
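Metrics like object count and page size can be checked long before production. Here is a rough sketch using jsoup (an assumption – any HTML parser works) that counts referenced objects and fails when a page grows beyond an illustrative threshold; note it only weighs the HTML document itself, not every downloaded resource:

```java
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Rough sketch: count page objects and estimate page weight as a WPO gate.
// URL and thresholds are illustrative only.
public class WpoCheck {
    public static void main(String[] args) throws IOException {
        String url = "https://example.com"; // hypothetical page under test
        Document doc = Jsoup.connect(url).get();

        // Count referenced objects: images, scripts and stylesheets.
        int objects = doc.select("img[src]").size()
                    + doc.select("script[src]").size()
                    + doc.select("link[rel=stylesheet]").size();

        // Only the HTML payload; full page weight needs fetching each resource.
        int htmlBytes = doc.outerHtml().getBytes().length;

        System.out.printf("objects=%d, htmlBytes=%d%n", objects, htmlBytes);

        // Gate: the "bad" deployment above had 282 objects and 9.68MB.
        if (objects > 100) {
            throw new AssertionError("Too many objects on page: " + objects);
        }
    }
}
```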
3. Example of a Bad Java/Tomcat Deployment
- 526s to render a financial transaction report
- 1 SQL running 210s!
- DEBUG logging with log4j on an outdated log4j library (sync issue)
5. Deployment frequencies at the “Unicorns”:
- 700 deployments / year
- 50-60 deployments / day
- 10+ deployments / day
- One deployment every 11.6 seconds
32. Example of a Bad (Micro)Services Migration
- 26.7s execution time
- 33! calls to the same web service
- 171! SQL queries through LINQ issued by this web service – requesting similar data on each call
- Architecture violation: direct access to the DB from the frontend logic instead of through the service
36. Distance calculation issues
- 480km biking in 1 hour!
- Solution: a unit test in the live app reports geo calc problems
- Finding: only happens on certain Android versions
49. Extend your Continuous Integration
Build #    Test Case     Status   # SQL   # Excep   CPU
Build 17   testPurchase  OK       12      0         120ms
Build 17   testSearch    OK       3       1         68ms
Build 18   testPurchase  FAILED   12      5         60ms
Build 18   testSearch    OK       3       1         68ms
Build 19   testPurchase  OK       75      0         230ms
Build 19   testSearch    OK       3       1         68ms
Build 20   testPurchase  OK       12      0         120ms
Build 20   testSearch    OK       3       1         68ms

(Test Case and Status are the test & monitoring framework results; # SQL, # Excep and CPU are architectural data.)

Build 17: our baseline – all tests pass.
Build 18: we identified a regression – the exceptions are probably the reason for the failed test.
Build 19: problem fixed, but now we have an architectural regression: testPurchase suddenly executes 75 SQL statements.
Build 20: problem solved – now we have both functional and architectural confidence.

Let's look behind the scenes.
50. #1: Analyzing each Test
#2: Metrics for each Test
#3: Detecting Regressions based on Measures
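A minimal sketch of step #3, assuming per-test metrics are available as simple name/value pairs (the data structures and the 20% tolerance are illustrative): compare each measure against the last known-good build and flag anything that grew beyond tolerance, even when the functional result is still OK:

```java
import java.util.Map;

// Sketch: detect a regression by comparing a test's metrics against the
// last known-good build. Structures and thresholds are hypothetical.
public class RegressionDetector {

    // Flag a regression if a metric grows beyond the allowed tolerance.
    static boolean isRegression(long baseline, long current, double tolerance) {
        return current > baseline * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Baseline from Build 17 vs. measurements from Build 19 (see table above).
        Map<String, Long> baseline = Map.of("sqlCount", 12L, "cpuMs", 120L);
        Map<String, Long> current  = Map.of("sqlCount", 75L, "cpuMs", 230L);

        for (String metric : baseline.keySet()) {
            if (isRegression(baseline.get(metric), current.get(metric), 0.2)) {
                System.out.println("Architectural regression in " + metric
                        + ": " + baseline.get(metric) + " -> " + current.get(metric));
            }
        }
    }
}
```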
54. #1: Pick your App Metrics
- # of Service Calls
- Bytes Sent & Received
- # of Worker Threads
- # of SQL Calls, # of Same SQLs
- # of DB Connections
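As an illustration of how a metric like “# of SQL Calls / # of Same SQLs” could be captured without a full APM tool, here is a hypothetical counter that you would feed from instrumented code or a P6Spy-style JDBC proxy:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: count SQL calls and duplicate statements during a test run.
// In practice an APM agent or JDBC proxy would feed this counter.
public class SqlCallCounter {
    private static final Map<String, AtomicInteger> CALLS = new ConcurrentHashMap<>();

    public static void record(String sql) {
        CALLS.computeIfAbsent(sql, s -> new AtomicInteger()).incrementAndGet();
    }

    // "# of SQL Calls": total statements executed in this test run.
    public static int totalCalls() {
        return CALLS.values().stream().mapToInt(AtomicInteger::get).sum();
    }

    // "# of Same SQLs": how often the most frequent statement was executed.
    public static int maxSameSql() {
        return CALLS.values().stream().mapToInt(AtomicInteger::get).max().orElse(0);
    }

    public static void main(String[] args) {
        record("SELECT * FROM clubs WHERE region = ?");
        record("SELECT * FROM clubs WHERE region = ?");
        record("SELECT name FROM users WHERE id = ?");
        System.out.println("total=" + totalCalls() + ", maxSame=" + maxSameSql());
    }
}
```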
Get Dynatrace Free Trial at http://bit.ly/dttrial
Video Tutorials on YouTube Channel: http://bit.ly/dttutorials
Online Webinars every other week: http://bit.ly/onlineperfclinic
Share Your PurePath with me: http://bit.ly/sharepurepath
This is a deployment that shouldn't have made it to production. These are two WPO (Web Performance Optimization) metrics that should have been caught in dev & test before releasing this to prod.
Another deployment that didn't go that well: bad SQL when specifying a too-long time range for that report, DEBUG logging turned on, and an outdated, buggy log4j library!
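For the DEBUG-logging part of that story, the classic mitigation in log4j 1.x code looks like this sketch (class and method names are made up): guard expensive debug statements so that neither message construction nor the synchronized appender write happens when DEBUG is off:

```java
import org.apache.log4j.Logger;

// Sketch of the classic log4j 1.x guard: avoid building expensive debug
// messages (and taking the appender lock) when DEBUG is disabled.
public class ReportRenderer {
    private static final Logger LOG = Logger.getLogger(ReportRenderer.class);

    void renderTransactionReport(String range) {
        // Guard: without this check the message string is built on every call,
        // and with DEBUG enabled every call funnels through the appender lock.
        if (LOG.isDebugEnabled()) {
            LOG.debug("Rendering transaction report for range " + range);
        }
        // ... actual rendering ...
    }
}
```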
How can we avoid this? Let's just do it like the “Unicorns” in that space – such as Etsy, Google or Facebook?
Several companies changed the way they develop and deploy software over the years. Here are some examples (numbers from 2011 – 2014):
Cars: from 2 deployments to 700 per year
Flickr: 10+ per day
Etsy: lets every new employee make a code change on their first day and push it through the pipeline into production – THAT'S the right approach to the required culture change
Amazon: every 11.6s
Remember: these are very small changes – which is also a key goal of continuous delivery. The smaller the change, the easier it is to deploy, the less risk it carries, the easier it is to test, and the easier it is to take back out in case it causes a problem.
The problem is, though: when you blindly copy what you read, you may end up with a very ugly copy of a Unicorn. It's not about copying everything or thinking that you have to release as frequently as the Unicorns. It is about adopting many of their best practices, but doing it in a way that makes sense for you. For you it might be enough to release once a month or once a week.
So – our goal is to deploy new features faster, to get them in front of our paying end users or employees.
For many companies that tried this, it also meant failing faster.
The app that you are responsible for crashes …
The FIFA World Cup app, one week before the World Cup: it crashed for the majority of Android users when refreshing the news section, caused by a memory leak introduced by an outdated library they used.
I love metrics – and I think we should make deployment decisions based on key metrics. But we should also monitor deployments in production to learn whether the deployment was really good.
Synthetic Availability Monitoring -> Clearly something went wrong
Even if the deployment seemed good because all features work and response time is the same as before: if your resource consumption goes up like this, the deployment is NOT GOOD, because you are now paying a lot of money for that extra compute power.
Got a marketing campaign? If you roll it out, do it smart: start with a small number, monitor user behavior, fix errors if there are any, and only then roll out the rest of the campaign.
A lot of people don't look at these metrics and just add new code to an ever-growing pile of technical debt.
Based on a recent study:
80% of overall dev team time is spent on bug fixing instead of building cool new features.
$60B in annual costs of bad software – money that could instead be invested in new features to stay ahead of the competition.
Yes – we are focusing on quality TOO LATE.
When it's too late, we end up here.
We need to leave that status quo behind. And there are two numbers that tell us it is not as hard as it may seem.
Based on my experience, 80% of the problems are caused by only 20% of the problem patterns. Focusing on the 20% of potential problems that cause 80% of the pain is a very good starting point.
Sounds super nice on paper – so how do we get there?
Marketing had a great idea: a 20x20 grid showing the last 400 selfie uploads. The implementation was pushed through quickly, resulting in an overloaded page that causes both performance and usability issues on the mobile device as well as on the servers and CDNs.
This was a monolithic app for searching sports club websites. The executed sample search returned 33 sports clubs. Before this app was “migrated” to microservices, everything ran in a single monolith taking about 1s to execute. After the “migration” to (micro)services, the same call takes 26.7s, including 33 calls to the new microservice and 171 roundtrips to the database.
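In code, that “migration” result is the classic N+1 pattern. The sketch below (all names hypothetical) shows the per-result service call that produced the 33 calls and 171 queries, next to a batched variant that avoids it:

```java
import java.util.List;

// Sketch of the N+1 pattern behind the 26.7s example (hypothetical names):
// one service call per search result, each triggering several SQL queries.
public class ClubSearch {

    interface ClubService {
        ClubDetails loadDetails(int clubId);                  // 1 remote call + ~5 SQLs
        List<ClubDetails> loadDetails(List<Integer> clubIds); // batched variant
    }

    record ClubDetails(int id, String name) {}

    // Anti-pattern: 33 results -> 33 service calls -> 171 SQL queries.
    List<ClubDetails> slowSearch(ClubService svc, List<Integer> resultIds) {
        return resultIds.stream().map(svc::loadDetails).toList();
    }

    // Fix: one batched call that lets the service fetch all rows at once.
    List<ClubDetails> fastSearch(ClubService svc, List<Integer> resultIds) {
        return svc.loadDetails(resultIds);
    }
}
```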
Pushing the unit test to the mobile device to test for GPS-specific calculation issues, then identifying which devices have a GPS calculation bug!
Monitoring impact of 3rd party APIs such as Facebook
An overloaded Kia website went down during the Super Bowl.
Kia is doing something different now: they have a special “bare minimum”, static, optimized website for the spike period -> that's smart.
So – we have seen a lot of metrics. The goal now is that you start with one metric. Pick a single metric and take it back to your engineering team (Dev, Test, Ops and Business). Sit down and agree on what this metric means for everyone, how to measure it, and how to report it.
Also remember that for most of the use cases discussed, and the metrics derived from them, we only need a single-user test. Even though we can identify performance, scalability and architectural issues, in most cases we don't need a load test. Single-user tests or unit tests are good enough.
Here is how we do this: in addition to looking at functional and unit test results, which only tell us whether the functionality works, we also look at these backend metrics for every test. With that we can immediately identify whether code changes result in any performance, scalability or architectural regressions. Knowing this allows us to stop a bad build early.
This is how it can look in a real-life example: analyzing key performance, scalability and architectural metrics for every single test.