SlideShare una empresa de Scribd logo
1 de 26
Bad Metrics and What You
Can Do About It
Paul Holland
Test Consultant and Teacher
at Testing Thoughts
My Background
• Independent S/W Testing consultant since Apr 2012
• 16+ years testing telecommunications equipment
and reworking test methodologies at Alcatel-Lucent
• 10+ years as a test manager
• Presenter at STAREast and CAST
• Keynote at KWSQA conference in 2012
• Facilitator at 25+ peer conferences and workshops
• Teacher of S/W testing for the past 5 years
• Teacher of Rapid Software Testing
– through Satisfice (James Bach): www.satisfice.com

• Military Helicopter pilot – Canadian Sea Kings
April, 2013

©2013 Testing Thoughts

2
Attributions
• Over the past 10 years I have spoken with many
people regarding metrics. I cannot directly
attribute any specific aspects of this talk to any
individual but all of these people (and more)
have influenced my opinions and thoughts on
metrics:
– Cem Kaner, James Bach, Michael Bolton, Ross
Collard, Doug Hoffman, Scott Barber, John
Hazel, Eric Proegler, Dan Downing, Greg
McNelly, Ben Yaroch
April, 2013

©2013 Testing Thoughts

3
Definitions of METRIC
(from http://www.merriam-webster.com, April 2012)

• 1 plural : a part of prosody that deals with metrical structure
• 2 : a standard of measurement <no metric exists that can
be applied directly to happiness — Scientific Monthly>
• 3 : a mathematical function that associates a real
nonnegative number analogous to distance with each pair of
elements in a set such that the number is zero only if the
two elements are identical, the number is the same
regardless of the order in which the two elements are
taken, and the number associated with one pair of elements
plus that associated with one member of the pair and a third
element is equal to or greater than the number associated
with the other member of the pair and the third element
April, 2013

©2013 Testing Thoughts

4
Sample Metrics
• Number of Test Cases Planned (per release or

feature)
• Number of Test Cases Executed vs. Plan
• Number of Bugs Found per Tester

• Number of Bugs Found per Feature
• Number of Bugs Found in the Field
• Number of Open Bugs
• Lab Equipment Usage
April, 2013

©2013 Testing Thoughts

5
Sample Metrics
• Hours between crashes in the Field

• Percentage Behind Plan
• Percentage of Automated Test Cases
• Percentage of Tests Passed vs. Failed (pass rate)
• Number of Test Steps
• Code Coverage / Path Coverage

• Requirements Coverage
April, 2013

©2013 Testing Thoughts

6
Goodhart‘s Law
• In 1975, Charles Goodhart, a former advisor to
the Bank of England and Emeritus Professor at
the London School of Economics stated:

Any observed statistical regularity will
tend to collapse once pressure is
placed upon it for control purposes
Goodhart, C.A.E. (1975a) ‗Problems of Monetary Management: The UK Experience‘ in
Papers in Monetary Economics, Volume I, Reserve Bank of Australia, 1975

April, 2013

©2013 Testing Thoughts

7
Goodhart‘s Law
• Professor Marilyn Strathern FBA has re-stated
Goodhart's Law more succinctly and more
generally:

`When a measure becomes a
target, it ceases to be a good
measure.'

April, 2013

©2013 Testing Thoughts

8
Elements of Bad Metrics
• Measure and/or compare elements that are
inconsistent in size or composition
– Impossible to effectively use for comparison
– How many containers do you need for your
possessions?
– Test Cases and Test Steps
• Greatly vary in time required and complexity

– Bugs
• Can be different severity, likelihood - i.e.: risk
April, 2013

©2013 Testing Thoughts

9
Elements of Bad Metrics
• Create competition between individuals
and/or teams
– They typically do not result in friendly competition
– Inhibits sharing of information and teamwork
– Especially damaging if compensation is impacted
– Number of xxxx per tester
– Number of xxxx per feature

April, 2013

©2013 Testing Thoughts

10
Elements of Bad Metrics
• Easy to ―game‖ or circumvent the desired
intention
– Easy to be improved by undesirable behaviour

– Pass rate (percentage): Execute more simple
tests that will pass or break up a long test case
into many smaller ones
– Number of bugs raised: Raising two similar bug
reports instead of combining them
April, 2013

©2013 Testing Thoughts

11
Elements of Bad Metrics
• Contain misleading information or gives a
false sense of completeness
– Summarizing a large amount of information
into one or two numbers out of context

– Coverage (Code, Path)
• Misleading information based on touching the code
once

– Pass rate and number of test cases
April, 2013

©2013 Testing Thoughts

12
Impact of Using Bad
Metrics
– Promotes bad behaviour:
• Testers may create more smaller test cases instead of
creating test cases that make sense
• Execution of ineffective testing to meet requirements
• Artificially creating higher numbers instead of doing
what makes sense
• Creation of tools that will mask inefficiencies (e.g.: lab
equipment usage)
• Time wasted improving the ―numbers‖ instead of
improving the testing
April, 2013

©2013 Testing Thoughts

13
Impact of Using Bad
Metrics
• Gives Executives a false sense of test coverage
– All they see is numbers out of context
– The larger the numbers the better the testing
– The difficulty of good testing is hidden by large ―fake‖
numbers

• Dangerous message to Executives
– Our pass rate is at 96% so our product is in good
shape
– Code coverage is at 100% - our code is completely
tested
– Feature specification coverage is at 100% - Ship it!!!

• What could possibly go wrong?
April, 2013

©2013 Testing Thoughts

14
Sample Metrics
• Number of Test Cases Planned (per release or

feature)
• Number of Test Cases Executed vs. Plan
• Number of Bugs Found per Tester

• Number of Bugs Found per Feature
• Number of Bugs Found in the Field – A list of Bugs
• Number of Open Bugs – A list of Open Bugs
• Lab Equipment Usage
April, 2013

©2013 Testing Thoughts

15
Sample Metrics
• Hours between crashes in the Field

• Percentage Behind Plan – depends if plan is flexible
• Percentage of Automated Test Cases
• Percentage of Tests Passed vs. Failed (pass rate)
• Number of Test Steps
• Code Coverage / Path Coverage – depends on usage
• Requirements Coverage – depends on usage

April, 2013

©2013 Testing Thoughts

16
So … Now what?
• I have to stop counting everything. I feel
naked and exposed.
• Track expected effort instead of tracking
test cases using:
– Whiteboard
– Excel spreadsheet

April, 2013

©2013 Testing Thoughts

17
Whiteboard
• Used for planning and tracking of test
execution
• Suitable for use in waterfall or agile (as long
as you have control over your own team‘s
process)
• Use colours to track:
– Features, or
– Main Areas, or
– Test styles (performance, robustness, system)
April, 2013

©2013 Testing Thoughts

18
Whiteboard
• Divide the board into four areas:
–
–
–
–

Work to be done
Work in Progress
Cancelled or Work not being done
Completed work

• Red stickies indicate issues (not just bugs)
• Create a sticky note for each half day of work (or
mark # of half days expected on the sticky note)
• Prioritize stickies daily (or at least twice/wk)
• Finish ―on-time‖ with low priority work incomplete
April, 2013

©2013 Testing Thoughts

19
Sticky Notes
• All of these items are optional – add your own
elements
Use what makes sense to your situation
–
–
–
–
–
–

Charter Title (or Test Case Title)
Estimated Effort
Feature area
Tester name
Date complete
Effort (# of sessions or half days of work)
• Initially, estimated -> replace with actual

April, 2013

©2013 Testing Thoughts

20
Actual Sample Sticky
Charter Title

Tester

Area

Effort
April, 2013

©2013 Testing Thoughts

21
Whiteboard Example

April, 2013

©2013 Testing Thoughts

22
Reporting
• An Excel Spreadsheet with:
–
–
–
–
–
–
–
–
–
–

List of Charters
Area
Estimated Effort
Expended Effort
Remaining Effort
Tester(s)
Start Date
Completed Date
Issues
Comments

• Does NOT include pass/fail percentage or number of
test cases
April, 2013

©2013 Testing Thoughts

23
Sample Report
Charter

Area

Estimated Expended Remaining
Effort
Effort
Effort
Tester

Date
Issues
Date Started Completed Found

Comments

Lots of investigation. Problem was on 2-3 out of
ALU01617 48 ports which just happened to be 2 of the 6
01/14/2012 032
ports I tested.

Investigation for high QLN spikes on EVLT

H/W
Performance

0

20

0

acode

ARQ Verification under different RA Modes

ARQ

2

2

0

ncowan 12/14/2011 12/15/2011

POTS interference

ARQ

2

0

0

---

01/08/2012 01/08/2012

Expected throughput testing

ARQ

5

5

0

acode

01/10/2012 01/14/2012

INP vs. SHINE

ARQ

6

6

0

ncowan 12/01/2011 12/04/2011

INP vs. REIN

ARQ

6

INP vs. REIN + SHINE

ARQ

12

Traffic delay and jitter from RTX

ARQ

2

Attainable Throughput

April, 2013

ARQ

1

7

5

jbright

12/10/2011

01/06/2012 01/10/2012

Decided not to test as the H/W team already
tested this functionality and time was tight.

To translate the files properly, had to install
Python solution from Antwerp. Some overhead
to begin testing (installation, config test) but was
fairly quick to execute afterwards

12
2

4

0

0

ncowan 12/05/2011 12/05/2011

jbright

01/05/2012 01/08/2012

©2013 Testing Thoughts

Took longer because was not behaving as
expected and I had to make sure I was testing
correctly. My expectations were wrong based
on virtual noise not being exact.

24
Sample Report
"Awesome Product" Test Progress as of 02/01/2012

Effort (person half days)

90
80
Original Planned Effort
70
Expended Effort
60
Total Expected Effort

50
40
30
20
10
0
ARQ

SRA

Vectoring

Regression

H/W Performance

Feature
April, 2013

©2013 Testing Thoughts

25
April, 2013

©2013 Testing Thoughts

26

Más contenido relacionado

Destacado

Destacado (17)

Acceptance Test-driven Development: Mastering Agile Testing
Acceptance Test-driven Development: Mastering Agile TestingAcceptance Test-driven Development: Mastering Agile Testing
Acceptance Test-driven Development: Mastering Agile Testing
 
Emotional Intelligence in Software Testing
Emotional Intelligence in Software TestingEmotional Intelligence in Software Testing
Emotional Intelligence in Software Testing
 
Find Requirements Defects to Build Better Software
Find Requirements Defects to Build Better SoftwareFind Requirements Defects to Build Better Software
Find Requirements Defects to Build Better Software
 
Agile Project Failures: Root Causes and Corrective Actions
Agile Project Failures: Root Causes and Corrective ActionsAgile Project Failures: Root Causes and Corrective Actions
Agile Project Failures: Root Causes and Corrective Actions
 
Keynote: Lightning Strikes the Keynotes
Keynote: Lightning Strikes the KeynotesKeynote: Lightning Strikes the Keynotes
Keynote: Lightning Strikes the Keynotes
 
Usability Testing: Personas, Scenarios, Use Cases, and Test Cases
Usability Testing: Personas, Scenarios, Use Cases, and Test CasesUsability Testing: Personas, Scenarios, Use Cases, and Test Cases
Usability Testing: Personas, Scenarios, Use Cases, and Test Cases
 
Continuous Delivery at Ancestry.com
Continuous Delivery at Ancestry.comContinuous Delivery at Ancestry.com
Continuous Delivery at Ancestry.com
 
Automated Performance Profiling with Continuous Integration
Automated Performance Profiling with Continuous IntegrationAutomated Performance Profiling with Continuous Integration
Automated Performance Profiling with Continuous Integration
 
Agile Estimation and Planning: Scrum, Kanban, and Beyond
Agile Estimation and Planning: Scrum, Kanban, and BeyondAgile Estimation and Planning: Scrum, Kanban, and Beyond
Agile Estimation and Planning: Scrum, Kanban, and Beyond
 
Agile Release Planning, Metrics, and Retrospectives
Agile Release Planning, Metrics, and RetrospectivesAgile Release Planning, Metrics, and Retrospectives
Agile Release Planning, Metrics, and Retrospectives
 
Usability Testing in a Nutshell
Usability Testing in a NutshellUsability Testing in a Nutshell
Usability Testing in a Nutshell
 
Sprint Reviews that Attract, Engage, and Enlighten Stakeholders
Sprint Reviews that Attract, Engage, and Enlighten StakeholdersSprint Reviews that Attract, Engage, and Enlighten Stakeholders
Sprint Reviews that Attract, Engage, and Enlighten Stakeholders
 
Presenting Test Results with Clarity and Confidence
Presenting Test Results with Clarity and ConfidencePresenting Test Results with Clarity and Confidence
Presenting Test Results with Clarity and Confidence
 
Building Customer Feedback Loops: Learn Quicker, Design Smarter
Building Customer Feedback Loops: Learn Quicker, Design SmarterBuilding Customer Feedback Loops: Learn Quicker, Design Smarter
Building Customer Feedback Loops: Learn Quicker, Design Smarter
 
Building an Enterprise Performance and Load Testing Infrastructure
Building an Enterprise Performance and Load Testing InfrastructureBuilding an Enterprise Performance and Load Testing Infrastructure
Building an Enterprise Performance and Load Testing Infrastructure
 
Leading Change―Even If You’re Not in Charge
Leading Change―Even If You’re Not in ChargeLeading Change―Even If You’re Not in Charge
Leading Change―Even If You’re Not in Charge
 
Requirements Engineering: A Practicum
Requirements Engineering: A PracticumRequirements Engineering: A Practicum
Requirements Engineering: A Practicum
 

Similar a Bad Testing Metrics—and What To Do About Them

Testing Metrics and Tools, Analyse de tests
Testing Metrics and Tools, Analyse de testsTesting Metrics and Tools, Analyse de tests
Testing Metrics and Tools, Analyse de tests
HervKoya
 
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
madhurpatidar2
 
software testing metrics do's - don'ts-XBOSoft-QAI Webinar
software testing metrics do's - don'ts-XBOSoft-QAI Webinarsoftware testing metrics do's - don'ts-XBOSoft-QAI Webinar
software testing metrics do's - don'ts-XBOSoft-QAI Webinar
XBOSoft
 

Similar a Bad Testing Metrics—and What To Do About Them (20)

[Paul Holland] Bad Metrics and What You Can Do About It
[Paul Holland] Bad Metrics and What You Can Do About It[Paul Holland] Bad Metrics and What You Can Do About It
[Paul Holland] Bad Metrics and What You Can Do About It
 
Agile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile ProjectAgile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile Project
 
Anton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQBAnton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQB
 
How much testing is enough
How much testing is enoughHow much testing is enough
How much testing is enough
 
STARCANADA 2013 Keynote: Lightning Strikes the Keynotes
STARCANADA 2013 Keynote: Lightning Strikes the KeynotesSTARCANADA 2013 Keynote: Lightning Strikes the Keynotes
STARCANADA 2013 Keynote: Lightning Strikes the Keynotes
 
Testing Metrics
Testing MetricsTesting Metrics
Testing Metrics
 
1)Testing-Fundamentals_L_D.pptx
1)Testing-Fundamentals_L_D.pptx1)Testing-Fundamentals_L_D.pptx
1)Testing-Fundamentals_L_D.pptx
 
The Test Coverage Outline: Your Testing Road Map
The Test Coverage Outline: Your Testing Road MapThe Test Coverage Outline: Your Testing Road Map
The Test Coverage Outline: Your Testing Road Map
 
Testing Metrics and Tools, Analyse de tests
Testing Metrics and Tools, Analyse de testsTesting Metrics and Tools, Analyse de tests
Testing Metrics and Tools, Analyse de tests
 
Software Testing Metrics
Software Testing MetricsSoftware Testing Metrics
Software Testing Metrics
 
PAC 2019 virtual Joerek Van Gaalen
PAC 2019 virtual Joerek Van GaalenPAC 2019 virtual Joerek Van Gaalen
PAC 2019 virtual Joerek Van Gaalen
 
Test case design techniques
Test case design techniquesTest case design techniques
Test case design techniques
 
Test case design techniques
Test case design techniquesTest case design techniques
Test case design techniques
 
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
WINSEM2021-22_ITE2004_ETH_VL2021220500452_Reference_Material_I_21-04-2022_TES...
 
rryghg.ppt
rryghg.pptrryghg.ppt
rryghg.ppt
 
Fundamentals of testing
Fundamentals of testingFundamentals of testing
Fundamentals of testing
 
Continuous testing in agile projects 2015
Continuous testing in agile projects 2015Continuous testing in agile projects 2015
Continuous testing in agile projects 2015
 
Software Testing Presentation in Cegonsoft Pvt Ltd...
Software Testing Presentation in Cegonsoft Pvt Ltd...Software Testing Presentation in Cegonsoft Pvt Ltd...
Software Testing Presentation in Cegonsoft Pvt Ltd...
 
software testing metrics do's - don'ts-XBOSoft-QAI Webinar
software testing metrics do's - don'ts-XBOSoft-QAI Webinarsoftware testing metrics do's - don'ts-XBOSoft-QAI Webinar
software testing metrics do's - don'ts-XBOSoft-QAI Webinar
 
Software Quality Metrics Do's and Don'ts - XBOSoft-QAI Webinar
Software Quality Metrics Do's and Don'ts - XBOSoft-QAI WebinarSoftware Quality Metrics Do's and Don'ts - XBOSoft-QAI Webinar
Software Quality Metrics Do's and Don'ts - XBOSoft-QAI Webinar
 

Más de TechWell

Más de TechWell (20)

Failing and Recovering
Failing and RecoveringFailing and Recovering
Failing and Recovering
 
Instill a DevOps Testing Culture in Your Team and Organization
Instill a DevOps Testing Culture in Your Team and Organization Instill a DevOps Testing Culture in Your Team and Organization
Instill a DevOps Testing Culture in Your Team and Organization
 
Test Design for Fully Automated Build Architecture
Test Design for Fully Automated Build ArchitectureTest Design for Fully Automated Build Architecture
Test Design for Fully Automated Build Architecture
 
System-Level Test Automation: Ensuring a Good Start
System-Level Test Automation: Ensuring a Good StartSystem-Level Test Automation: Ensuring a Good Start
System-Level Test Automation: Ensuring a Good Start
 
Build Your Mobile App Quality and Test Strategy
Build Your Mobile App Quality and Test StrategyBuild Your Mobile App Quality and Test Strategy
Build Your Mobile App Quality and Test Strategy
 
Testing Transformation: The Art and Science for Success
Testing Transformation: The Art and Science for SuccessTesting Transformation: The Art and Science for Success
Testing Transformation: The Art and Science for Success
 
Implement BDD with Cucumber and SpecFlow
Implement BDD with Cucumber and SpecFlowImplement BDD with Cucumber and SpecFlow
Implement BDD with Cucumber and SpecFlow
 
Develop WebDriver Automated Tests—and Keep Your Sanity
Develop WebDriver Automated Tests—and Keep Your SanityDevelop WebDriver Automated Tests—and Keep Your Sanity
Develop WebDriver Automated Tests—and Keep Your Sanity
 
Ma 15
Ma 15Ma 15
Ma 15
 
Eliminate Cloud Waste with a Holistic DevOps Strategy
Eliminate Cloud Waste with a Holistic DevOps StrategyEliminate Cloud Waste with a Holistic DevOps Strategy
Eliminate Cloud Waste with a Holistic DevOps Strategy
 
Transform Test Organizations for the New World of DevOps
Transform Test Organizations for the New World of DevOpsTransform Test Organizations for the New World of DevOps
Transform Test Organizations for the New World of DevOps
 
The Fourth Constraint in Project Delivery—Leadership
The Fourth Constraint in Project Delivery—LeadershipThe Fourth Constraint in Project Delivery—Leadership
The Fourth Constraint in Project Delivery—Leadership
 
Resolve the Contradiction of Specialists within Agile Teams
Resolve the Contradiction of Specialists within Agile TeamsResolve the Contradiction of Specialists within Agile Teams
Resolve the Contradiction of Specialists within Agile Teams
 
Pin the Tail on the Metric: A Field-Tested Agile Game
Pin the Tail on the Metric: A Field-Tested Agile GamePin the Tail on the Metric: A Field-Tested Agile Game
Pin the Tail on the Metric: A Field-Tested Agile Game
 
Agile Performance Holarchy (APH)—A Model for Scaling Agile Teams
Agile Performance Holarchy (APH)—A Model for Scaling Agile TeamsAgile Performance Holarchy (APH)—A Model for Scaling Agile Teams
Agile Performance Holarchy (APH)—A Model for Scaling Agile Teams
 
A Business-First Approach to DevOps Implementation
A Business-First Approach to DevOps ImplementationA Business-First Approach to DevOps Implementation
A Business-First Approach to DevOps Implementation
 
Databases in a Continuous Integration/Delivery Process
Databases in a Continuous Integration/Delivery ProcessDatabases in a Continuous Integration/Delivery Process
Databases in a Continuous Integration/Delivery Process
 
Mobile Testing: What—and What Not—to Automate
Mobile Testing: What—and What Not—to AutomateMobile Testing: What—and What Not—to Automate
Mobile Testing: What—and What Not—to Automate
 
Cultural Intelligence: A Key Skill for Success
Cultural Intelligence: A Key Skill for SuccessCultural Intelligence: A Key Skill for Success
Cultural Intelligence: A Key Skill for Success
 
Turn the Lights On: A Power Utility Company's Agile Transformation
Turn the Lights On: A Power Utility Company's Agile TransformationTurn the Lights On: A Power Utility Company's Agile Transformation
Turn the Lights On: A Power Utility Company's Agile Transformation
 

Último

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Bad Testing Metrics—and What To Do About Them

  • 1. Bad Metrics and What You Can Do About It Paul Holland Test Consultant and Teacher at Testing Thoughts
  • 2. My Background • Independent S/W Testing consultant since Apr 2012 • 16+ years testing telecommunications equipment and reworking test methodologies at Alcatel-Lucent • 10+ years as a test manager • Presenter at STAREast and CAST • Keynote at KWSQA conference in 2012 • Facilitator at 25+ peer conferences and workshops • Teacher of S/W testing for the past 5 years • Teacher of Rapid Software Testing – through Satisfice (James Bach): www.satisfice.com • Military Helicopter pilot – Canadian Sea Kings April, 2013 ©2013 Testing Thoughts 2
  • 3. Attributions • Over the past 10 years I have spoken with many people regarding metrics. I cannot directly attribute any specific aspects of this talk to any individual but all of these people (and more) have influenced my opinions and thoughts on metrics: – Cem Kaner, James Bach, Michael Bolton, Ross Collard, Doug Hoffman, Scott Barber, John Hazel, Eric Proegler, Dan Downing, Greg McNelly, Ben Yaroch April, 2013 ©2013 Testing Thoughts 3
  • 4. Definitions of METRIC (from http://www.merriam-webster.com, April 2012) • 1 plural : a part of prosody that deals with metrical structure • 2 : a standard of measurement <no metric exists that can be applied directly to happiness — Scientific Monthly> • 3 : a mathematical function that associates a real nonnegative number analogous to distance with each pair of elements in a set such that the number is zero only if the two elements are identical, the number is the same regardless of the order in which the two elements are taken, and the number associated with one pair of elements plus that associated with one member of the pair and a third element is equal to or greater than the number associated with the other member of the pair and the third element April, 2013 ©2013 Testing Thoughts 4
  • 5. Sample Metrics • Number of Test Cases Planned (per release or feature) • Number of Test Cases Executed vs. Plan • Number of Bugs Found per Tester • Number of Bugs Found per Feature • Number of Bugs Found in the Field • Number of Open Bugs • Lab Equipment Usage April, 2013 ©2013 Testing Thoughts 5
  • 6. Sample Metrics • Hours between crashes in the Field • Percentage Behind Plan • Percentage of Automated Test Cases • Percentage of Tests Passed vs. Failed (pass rate) • Number of Test Steps • Code Coverage / Path Coverage • Requirements Coverage April, 2013 ©2013 Testing Thoughts 6
  • 7. Goodhart‘s Law • In 1975, Charles Goodhart, a former advisor to the Bank of England and Emeritus Professor at the London School of Economics stated: Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes Goodhart, C.A.E. (1975a) ‗Problems of Monetary Management: The UK Experience‘ in Papers in Monetary Economics, Volume I, Reserve Bank of Australia, 1975 April, 2013 ©2013 Testing Thoughts 7
  • 8. Goodhart‘s Law • Professor Marilyn Strathern FBA has re-stated Goodhart's Law more succinctly and more generally: `When a measure becomes a target, it ceases to be a good measure.' April, 2013 ©2013 Testing Thoughts 8
  • 9. Elements of Bad Metrics • Measure and/or compare elements that are inconsistent in size or composition – Impossible to effectively use for comparison – How many containers do you need for your possessions? – Test Cases and Test Steps • Greatly vary in time required and complexity – Bugs • Can be different severity, likelihood - i.e.: risk April, 2013 ©2013 Testing Thoughts 9
  • 10. Elements of Bad Metrics • Create competition between individuals and/or teams – They typically do not result in friendly competition – Inhibits sharing of information and teamwork – Especially damaging if compensation is impacted – Number of xxxx per tester – Number of xxxx per feature April, 2013 ©2013 Testing Thoughts 10
  • 11. Elements of Bad Metrics • Easy to ―game‖ or circumvent the desired intention – Easy to be improved by undesirable behaviour – Pass rate (percentage): Execute more simple tests that will pass or break up a long test case into many smaller ones – Number of bugs raised: Raising two similar bug reports instead of combining them April, 2013 ©2013 Testing Thoughts 11
  • 12. Elements of Bad Metrics • Contain misleading information or gives a false sense of completeness – Summarizing a large amount of information into one or two numbers out of context – Coverage (Code, Path) • Misleading information based on touching the code once – Pass rate and number of test cases April, 2013 ©2013 Testing Thoughts 12
  • 13. Impact of Using Bad Metrics – Promotes bad behaviour: • Testers may create more smaller test cases instead of creating test cases that make sense • Execution of ineffective testing to meet requirements • Artificially creating higher numbers instead of doing what makes sense • Creation of tools that will mask inefficiencies (e.g.: lab equipment usage) • Time wasted improving the ―numbers‖ instead of improving the testing April, 2013 ©2013 Testing Thoughts 13
  • 14. Impact of Using Bad Metrics • Gives Executives a false sense of test coverage – All they see is numbers out of context – The larger the numbers the better the testing – The difficulty of good testing is hidden by large ―fake‖ numbers • Dangerous message to Executives – Our pass rate is at 96% so our product is in good shape – Code coverage is at 100% - our code is completely tested – Feature specification coverage is at 100% - Ship it!!! • What could possibly go wrong? April, 2013 ©2013 Testing Thoughts 14
  • 15. Sample Metrics • Number of Test Cases Planned (per release or feature) • Number of Test Cases Executed vs. Plan • Number of Bugs Found per Tester • Number of Bugs Found per Feature • Number of Bugs Found in the Field – A list of Bugs • Number of Open Bugs – A list of Open Bugs • Lab Equipment Usage April, 2013 ©2013 Testing Thoughts 15
  • 16. Sample Metrics • Hours between crashes in the Field • Percentage Behind Plan – depends if plan is flexible • Percentage of Automated Test Cases • Percentage of Tests Passed vs. Failed (pass rate) • Number of Test Steps • Code Coverage / Path Coverage – depends on usage • Requirements Coverage – depends on usage April, 2013 ©2013 Testing Thoughts 16
  • 17. So … Now what? • I have to stop counting everything. I feel naked and exposed. • Track expected effort instead of tracking test cases using: – Whiteboard – Excel spreadsheet April, 2013 ©2013 Testing Thoughts 17
  • 18. Whiteboard • Used for planning and tracking of test execution • Suitable for use in waterfall or agile (as long as you have control over your own team‘s process) • Use colours to track: – Features, or – Main Areas, or – Test styles (performance, robustness, system) April, 2013 ©2013 Testing Thoughts 18
  • 19. Whiteboard • Divide the board into four areas: – – – – Work to be done Work in Progress Cancelled or Work not being done Completed work • Red stickies indicate issues (not just bugs) • Create a sticky note for each half day of work (or mark # of half days expected on the sticky note) • Prioritize stickies daily (or at least twice/wk) • Finish ―on-time‖ with low priority work incomplete April, 2013 ©2013 Testing Thoughts 19
  • 20. Sticky Notes • All of these items are optional – add your own elements Use what makes sense to your situation – – – – – – Charter Title (or Test Case Title) Estimated Effort Feature area Tester name Date complete Effort (# of sessions or half days of work) • Initially, estimated -> replace with actual April, 2013 ©2013 Testing Thoughts 20
  • 21. Actual Sample Sticky Charter Title Tester Area Effort April, 2013 ©2013 Testing Thoughts 21
  • 23. Reporting • An Excel Spreadsheet with: – – – – – – – – – – List of Charters Area Estimated Effort Expended Effort Remaining Effort Tester(s) Start Date Completed Date Issues Comments • Does NOT include pass/fail percentage or number of test cases April, 2013 ©2013 Testing Thoughts 23
  • 24. Sample Report Charter Area Estimated Expended Remaining Effort Effort Effort Tester Date Issues Date Started Completed Found Comments Lots of investigation. Problem was on 2-3 out of ALU01617 48 ports which just happened to be 2 of the 6 01/14/2012 032 ports I tested. Investigation for high QLN spikes on EVLT H/W Performance 0 20 0 acode ARQ Verification under different RA Modes ARQ 2 2 0 ncowan 12/14/2011 12/15/2011 POTS interference ARQ 2 0 0 --- 01/08/2012 01/08/2012 Expected throughput testing ARQ 5 5 0 acode 01/10/2012 01/14/2012 INP vs. SHINE ARQ 6 6 0 ncowan 12/01/2011 12/04/2011 INP vs. REIN ARQ 6 INP vs. REIN + SHINE ARQ 12 Traffic delay and jitter from RTX ARQ 2 Attainable Throughput April, 2013 ARQ 1 7 5 jbright 12/10/2011 01/06/2012 01/10/2012 Decided not to test as the H/W team already tested this functionality and time was tight. To translate the files properly, had to install Python solution from Antwerp. Some overhead to begin testing (installation, config test) but was fairly quick to execute afterwards 12 2 4 0 0 ncowan 12/05/2011 12/05/2011 jbright 01/05/2012 01/08/2012 ©2013 Testing Thoughts Took longer because was not behaving as expected and I had to make sure I was testing correctly. My expectations were wrong based on virtual noise not being exact. 24
  • 25. Sample Report "Awesome Product" Test Progress as of 02/01/2012 Effort (person half days) 90 80 Original Planned Effort 70 Expended Effort 60 Total Expected Effort 50 40 30 20 10 0 ARQ SRA Vectoring Regression H/W Performance Feature April, 2013 ©2013 Testing Thoughts 25