I LOVE the smell of data in
the morning
Getting started with data science
@t_magennis | troy.magennis@focusedobjective.com
All slides and spreadsheets: Bit.ly/SimResources
Agile alliance Code of Conduct: bit.ly/Agile2017COC
Slides, spreadsheets, and other stuff
Bit.ly/SimResources
Everything you see is freely available
Questions…
• How many defects? (delivery readiness)
• How are we doing? (process opportunities)
• How big? (size of features)
• How long? (expected time to deliver from “here”)
• Where is the constraint? (sensitivity analysis)
Q: How Many Defects?
Understanding whether more testing is worth it, or whether we can (should) release yet
How Many Fish?
(assume you can’t see them, and you shouldn’t kill them)
We Go Fishing, Tag & Return
We got FIVE fish.
5 fish caught
We Go Fishing AGAIN
Ratio of Tagged vs Untagged Answers Our Question
(knowing what we know, 50/50 is the expected result)
4 fish caught
2 x tagged
2 x un-tagged
Total fish = (Total Caught First Time x Total Caught Second Time) / Total Caught Both Times (tagged fish)
           = (5 fish x 4 fish) / 2 tagged fish
           = 20 / 2
           = 10 fish total
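The tag-and-recapture arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the logic of the spreadsheets at Bit.ly/SimResources: the function names and the defect IDs are hypothetical. The second function applies the same ratio to two groups that test separately, treating defects found by both groups as the "tagged fish".

```python
def estimate_total(first_catch: int, second_catch: int, caught_both_times: int) -> float:
    """Capture-recapture estimate: (first x second) / caught both times."""
    if caught_both_times <= 0:
        raise ValueError("need at least one item caught both times to estimate a total")
    return first_catch * second_catch / caught_both_times


def estimate_total_defects(found_by_group_a: set, found_by_group_b: set) -> float:
    """Same estimate, where each group is an independent 'fishing trip'."""
    overlap = found_by_group_a & found_by_group_b  # defects both groups found
    return estimate_total(len(found_by_group_a), len(found_by_group_b), len(overlap))


# The fish example from the slide: 5 caught, then 4 caught, 2 of them tagged.
print(estimate_total(5, 4, 2))  # 10.0

# Hypothetical defect IDs from two separate bug bash groups.
group_a = {"D1", "D2", "D3", "D4", "D5"}
group_b = {"D4", "D5", "D6", "D7"}
print(estimate_total_defects(group_a, group_b))
```

If the two groups find almost the same defects, the estimate is close to what has already been found; if they barely overlap, there is likely a lot more to find.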
LOTS MORE
TO FIND
MORE TO
FIND
MOST
FOUND
Latent Defect
Estimation
• Bug Bash Days (two
groups to test separately)
• Beta test program
(last-names A-K, L-Z)
http://Bit.Ly/SimResources
The properties of the “few,”
Help understand the properties of the “all.”
Likely low and upper ranges of values Likely future values
Common versus un-common
Defining “Few” (How few? = how certain?)
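[Figure: a number line from the actual lowest to the actual highest value. After 2 samples, the next value has a 33% chance of falling in each of the three regions (below, between, above). After 7 samples there are eight regions of 12.5% each, so there is a 75% chance the next value falls within the range already seen.]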
Observations    Below lowest    Within range    Above highest
so far (n)      seen so far     seen so far     seen so far
1               -               -               -
2               33.3%           33.3%           33.3%
3               25%             50%             25%
4               20%             60%             20%
5               16.7%           66.7%           16.7%
6               14.3%           71.4%           14.3%
7               12.5%           75%             12.5%
8               11.1%           77.8%           11.1%
9               10%             80%             10%
10              9.1%            81.8%           9.1%
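The values in this table are consistent with a simple rule: n observations split the number line into n + 1 regions, and the next value is assumed equally likely to land in any of them. The sketch below reproduces the table under that assumption; the function name is hypothetical, and the rule only holds as far as the assumptions listed next (random samples, similar chance for all values) hold.

```python
def range_odds(n: int) -> tuple:
    """Chance the next value is below, within, or above the range seen so far."""
    if n < 2:
        raise ValueError("need at least two observations to have a range")
    below = 1 / (n + 1)         # one region below the lowest value seen
    within = (n - 1) / (n + 1)  # n - 1 regions between the observed values
    above = 1 / (n + 1)         # one region above the highest value seen
    return below, within, above


for n in range(2, 11):
    below, within, above = range_odds(n)
    print(f"{n:>2}  {below:6.1%}  {within:6.1%}  {above:6.1%}")
```

The gain per extra sample shrinks quickly, which is why a handful of samples already says a lot about the likely range.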
Assumptions (rarely perfectly true)
• The samples are taken at random.
Convenient isn’t random!
• The distribution is relatively uniform;
All values have similar chance.
• The probability is on average.
Types of Analytics
• Descriptive Analytics “What has happened?”
• Data aggregation and data mining to provide
insight into the past
• Predictive Analytics “What could happen?”
• Statistical models and forecasting techniques to
understand potential futures
• Prescriptive Analytics “What should we do?”
• Optimization and simulation algorithms to give
advice based on possible outcomes
Descriptive
(charts)
Predictive
(estimates,
forecasts)
Prescriptive
(sensitivity
simulation)
Q: How Are We Doing?
Understanding process improvement
Descriptive Analytics AKA Charts and Stats
But How Do I Get the Data?
18 Charts
Team Dashboard.xlsx spreadsheet
Bit.Ly/SimResources
Getting started with Descriptive Analytics
• Start capturing Start Date, Complete Date and Work Type Data
• Start making this data available for analysis
• Start talking about this data in retrospectives and other meetings
• Confirm process improvements have desired impact and don’t have
un-intended consequences elsewhere
Team Dashboard.xlsx spreadsheet at
http://Bit.Ly/SimResources
Predictive Analytics
Q: How Big?
Understanding the size of a feature or project with less effort
Feature 1
15 stories
Feature 2
3 stories
Feature 3
7-15 stories
Feature 4
?
Feature 1
15 stories
Feature 2
3 stories
Feature 3
7-15 stories
Feature 4
10-15 stories
Step 1
Step 2
Step 3
Known as Reference Class Forecasting
Forecasting Total Story Count
• Question: How can I estimate the size of a feature or project without
analyzing every piece of work?
• Theory: The “size” patterns of randomly sampled epics will persist
through all other epics. Analyze a few and compute for the many…
http://bit.ly/StoryCountForecaster
Sampling based Monte Carlo story count forecasting Excel spreadsheet
Process to estimate total size
1. Pick 5-10 features at random
2. Build sets of 15 re-samples (say 1000 times)
3. The number of sets that reach certain story count levels gives the probability
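The re-sampling process above can be sketched in Python. This is a minimal illustration under stated assumptions, not the logic of the Story Count Forecaster spreadsheet: the story counts in `sampled_story_counts` are made up, the function names are hypothetical, and the set size of 15 and the 1000 repetitions are taken from the slide.

```python
import random


def resample_totals(sampled_story_counts, set_size=15, trials=1000, seed=1):
    """Build `trials` sets of `set_size` re-samples and return each set's total."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        # Re-sample with replacement from the few features that were analyzed.
        one_set = [rng.choice(sampled_story_counts) for _ in range(set_size)]
        totals.append(sum(one_set))
    return sorted(totals)


def likelihood_at_or_below(totals, story_count_level):
    """Share of simulated sets whose total stays at or below the given level."""
    return sum(1 for total in totals if total <= story_count_level) / len(totals)


# Hypothetical story counts for 7 randomly picked features.
sampled_story_counts = [3, 15, 7, 11, 5, 9, 12]
totals = resample_totals(sampled_story_counts)
for level in (120, 135, 150):
    print(level, f"{likelihood_at_or_below(totals, level):.0%}")
```

Reading the output: a level that 85% of the simulated sets stay under is an "85% likelihood" total story count.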
[Chart: forecast story count; 50% = 72, 90% = 81]
Total for 100 features, by number of samples used

Samples used    Total count (85% likelihood)
36 samples      506
10 samples      494
3 samples       504
Average Error calculation –
1. Split the samples into 2 groups
2. Calculate the average of both groups
3. Compare the difference as a % of range
error % = |average of group 1 - average of group 2| / (max - min)
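The average error calculation above can be sketched as follows. The random split is an assumption (the slide only says to split the samples into 2 groups), and the function name and sample values are hypothetical.

```python
import random


def average_error_pct(samples, seed=1):
    """Split samples into two groups and compare their averages as a share of the range."""
    if len(samples) < 2 or max(samples) == min(samples):
        raise ValueError("need at least two samples with different values")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    group_1, group_2 = shuffled[:half], shuffled[half:]
    average_1 = sum(group_1) / len(group_1)
    average_2 = sum(group_2) / len(group_2)
    return abs(average_1 - average_2) / (max(samples) - min(samples))


# Hypothetical story counts for 10 sampled features.
story_counts = [3, 15, 7, 11, 5, 9, 12, 6, 8, 10]
print(f"{average_error_pct(story_counts):.1%}")
```

A small percentage suggests the two halves tell the same story; trying several seeds shows how much the answer depends on which samples ended up in which group.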
Why should I believe this forecast anyway?
1. Sample Count: Keep cutting data and compare the result
2. Random groups: Split data into random groups and compare
1. Multiple Options – NOT one…
2. Duration not ETA until commitment…
3. Continuously updated once started…
Contrast Google Maps to Software Estimates
Current Way
• Give one forecast even though
multiple approaches considered
• Give a calendar date for
undefined “complete” & “start”
• If the original date is in doubt
we find out near the end
• Appear on-time until we are not.
Measure progress from start.
Better Way
• Give multiple options of
investment and implementation
• Give a duration and define what
started & complete means
• If the original date is in doubt,
know earlier and react faster
• Report remaining time to deliver
not time since started
“Remember that all models are
wrong; the practical question is how
wrong do they have to be to not be
useful.”
Statistician,
George Box
A. Better than intuition
Not perfect. Not exact. Not always right.
Just better than what you do now, or even equal (just less expensive)
http://ritholtz.com/2016/09/cognitive-bias-codex/
Model
Forecast
Compare
Reality
Models are always wrong.
It’s all about understanding why.
The
Future
The
Past
1. Model Baseline
using historically
known truths
(train)
2. Test Model
against historically
known truths
(test)
3. Forecast
Back-testing: Using the data you have, predict something known
ONLY believe THIS
IF you see THIS
Simple Regression Line Forecasting
• Use the most recent 5-11 samples
• Less than that, too little data to make sense
• More than that, too exposed to context changes
• Test backwards, before forecasting forward
• Remove recent 1/3 data and see if first 2/3 would have predicted it
• Understand the limitations in your context
• Organizational disruption > local variability in many cases
• It is the simplest model that MAY work
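The "test backwards, before forecasting forward" step can be sketched in plain Python: fit a least-squares line to the first 2/3 of the samples, then see how well it would have predicted the most recent 1/3. The function names and the weekly completion numbers are hypothetical; this is one possible way to run the check, not the method behind the charts in this deck.

```python
def fit_line(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def backtest(values):
    """Hold out the most recent 1/3, fit on the first 2/3, return (predicted, actual)."""
    if len(values) < 3:
        raise ValueError("need at least three samples to hold out a third")
    cut = len(values) - len(values) // 3
    slope, intercept = fit_line(values[:cut])
    predicted = [intercept + slope * x for x in range(cut, len(values))]
    return predicted, values[cut:]


# Hypothetical cumulative completed items for the 9 most recent weeks.
cumulative_done = [12, 25, 33, 47, 58, 66, 81, 90, 104]
predicted, actual = backtest(cumulative_done)
for p, a in zip(predicted, actual):
    print(f"predicted {p:6.1f}  actual {a:4d}  error {p - a:+6.1f}")
```

If the held-out weeks are badly predicted, a forward forecast from the same line deserves the same distrust.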
[Chart: Cumulative Completion (all work) for ~100 Teams; weekly from W2-2012 to W21-2014; y-axis 0 to 120,000]
[Chart: Cumulative Completion (all work) for ~100 Teams; weekly from W2-2012 to W21-2014; y-axis -20,000 to 140,000]
[Chart: Throughput per week for 100 teams; W2-2012 to W21-2014; y-axis 0 to 1,600; series “All”; two anomalies annotated “WTF 1?” and “WTF 2?”]
1. Average pace forecast (simple regression)
2. Pace estimate as a range (probabilistic forecast)
3. Pace Mathematical Distribution (probabilistic forecast)
4. Pace Historical Data Distribution (probabilistic forecast)
5. System simulation probabilistic forecast
Effort AND how likely model represents actual outcome
Typical Agile Here
Low effort, Low chance High effort, High chance
Forecasting “How Long” Models
This session gets you here
Q: How Long?
Forecasting duration if nothing else was done…
Forecasting Duration (and delivery date)
• Question: How can I estimate the amount of time it will take to
deliver a feature or project?
• Theory: Using a range estimate or actual team delivery rate data,
calculate how many of those time periods it takes to complete delivery
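The theory above can be sketched as a small Monte Carlo simulation that re-samples historical weekly throughput until the remaining work is done. This is a minimal illustration, not the logic of the Throughput Forecast spreadsheet: the function names, the throughput samples, and the backlog size are all hypothetical.

```python
import math
import random


def simulate_weeks(remaining_items, weekly_throughput_samples, trials=1000, seed=1):
    """Return the sorted number of weeks needed in each simulated trial."""
    if max(weekly_throughput_samples) <= 0:
        raise ValueError("at least one throughput sample must be above zero")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outcomes = []
    for _ in range(trials):
        done, weeks = 0, 0
        while done < remaining_items:
            done += rng.choice(weekly_throughput_samples)  # one simulated week
            weeks += 1
        outcomes.append(weeks)
    return sorted(outcomes)


def weeks_at_likelihood(sorted_outcomes, likelihood):
    """Smallest duration that at least `likelihood` of the trials finished within."""
    index = max(0, math.ceil(likelihood * len(sorted_outcomes)) - 1)
    return sorted_outcomes[index]


# Hypothetical data: items completed in each of the last 7 weeks, 60 items left.
throughput_last_7_weeks = [4, 7, 5, 9, 3, 6, 8]
outcomes = simulate_weeks(60, throughput_last_7_weeks)
print("50% likely within", weeks_at_likelihood(outcomes, 0.50), "weeks")
print("85% likely within", weeks_at_likelihood(outcomes, 0.85), "weeks")
```

The result is a duration, not a date; adding it to a start date only makes sense once "started" is defined.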
http://bit.ly/ThroughputForecast
Estimate or Sampling based Monte Carlo
duration and date forecasting Excel spreadsheet
[Chart: forecast result distribution, with axes labeled More Luck / Less Luck and More Likely / Less Likely; the 85% point is marked]
Q: Where is our constraint?
Understanding the process in detail
Key Takeaways
• Start collecting started, finished and type of work data
• Start using data during team decision meetings
• You need much less data than you think to be better than “intuition”
• Use the values of a “few” to forecast the “many” – 7 to 11 samples
• Beware the limitation of simple regression forecasting
• Start using probabilistic forecasting tools
• Mine are free: Bit.ly/SimResources
Get everything here: Slides and tools:
Bit.ly/SimResources
Me on Twitter
@t_magennis
About me…
• What I do
• Teach how to use data for forecasting
• Teach simple math to executives, especially “demand > supply”
• Teach how to know (earlier) that you are on the wrong side of an expectation
• What I did
• Started in software in 1986. I actually liked Assembler & Cobol
• Have worked at senior exec level, and now beside them for major corporations
so I have some insight into what passes their decision filters
• How to reach me
• Twitter: @t_magennis or email: troy.magennis@focusedobjective.com
• Lots of free spreadsheets and stuff at FocusedObjective.com
Training
Set
Holdout
Set
We pretend we don’t know something we do know. We predict and compare.
Simple Decision Tree
Passenger
• Sex: Female
  • 73% survived in group, 36% of the passengers. Guess: Survive
• Sex: Male, then split by Age
  • Older than 9.5 years
    • 17% survived in group, 61% of the passengers. Guess: Perish
  • Younger than 9.5 years, then split by Family Group Size
    • Family Group Size > 2.5 people
      • 5% survived in group, 2% of the passengers. Guess: Perish
    • Family Group Size < 2.5 people
      • 89% survived in group, 2% of the passengers. Guess: Survive
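A decision tree this small is just a few nested questions, so it can be written directly as code. The sketch below is one reading of the diagram: the pairing of each branch with its survival rate is an assumption, since the slide's layout is flattened in this transcript, and the function and parameter names are hypothetical.

```python
def guess_survival(sex: str, age_years: float, family_group_size: float) -> str:
    """Walk the tree: Sex first, then Age, then Family Group Size."""
    if sex == "female":
        return "Survive"   # 73% survived in this group
    if age_years > 9.5:
        return "Perish"    # older males: 17% survived in this group
    if family_group_size > 2.5:
        return "Perish"    # young males in larger family groups: 5% survived
    return "Survive"       # young males in small family groups: 89% survived


print(guess_survival("female", 30, 1))
print(guess_survival("male", 40, 1))
print(guess_survival("male", 5, 4))
print(guess_survival("male", 5, 1))
```

Each leaf is only a guess: the percentages say how often that guess was right for the passengers in that group.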
What if you could predict…
• What test cases NEED to be run
• What test cases are consistently FALSE positives
• Which code areas, when touched, cause the most future defect reports
• How risky is releasing a hotfix for an area of code
http://taoxie.cs.illinois.edu/publications/fse16industry-learning.pdf
http://cs.brown.edu/~alexta/Doc/pubs/ICST2011_CRANE.pdf
Top Three Forecasting Fail Reasons
Reasons you shouldn’t have hired me five years ago
Forecasting How Big How Long How Much Fails Q and A
Fail 1: Start Date On-Paper != Reality
• The assumed Start Date is often ONLY on paper
• Define what start means
• Team is dedicated and in-place
• They are trained and know how to do their work
• They know and understand what work they need to deliver
• Nothing inhibits them doing or delivering that work
• Team is never fully available on day one!
Start Date of Feature B
is the finish date of
Feature A
What is the team doing now?
Fail 2: Backlog Rate versus Delivery Rate
[Diagram: three features (1, 2, 3) are estimated as six stories (1a, 1b, 2, 3a, 3b, 3c), which split further into implemented stories and defects. Actual Backlog Rate Delivered = 6; Measured Throughput = 12.]
Fail 2: Backlog Rate versus Delivery Rate
• If we forecast the backlog using the measured “Completion rate”, we may under-forecast
• Backlog is Miles per Hour, Completion rate is Kilometers per Hour
• Normal split rates are between 1 and 3 times (most common seen)
• This means
• If you don’t account for it, you will UNDER-FORECAST by 1 to 3 times!
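The unit mismatch above can be shown with a little arithmetic: convert the backlog into the same unit as the measured throughput before dividing. This is a minimal sketch; the function names are hypothetical, and the numbers reuse the 6 backlog items and measured throughput of 12 from the previous slide, assuming the throughput is per week.

```python
def naive_weeks(backlog_items: float, weekly_throughput: float) -> float:
    """Divides backlog stories by delivered items: mixes two different units."""
    return backlog_items / weekly_throughput


def split_adjusted_weeks(backlog_items: float, split_rate: float,
                         weekly_throughput: float) -> float:
    """Scale the backlog by the split rate first, so both sides count delivered items."""
    expected_delivered_items = backlog_items * split_rate
    return expected_delivered_items / weekly_throughput


backlog_items = 6        # stories as estimated in the backlog
measured_throughput = 12  # stories and defects actually completed
split_rate = measured_throughput / backlog_items  # 2.0 delivered items per backlog story

print(naive_weeks(backlog_items, measured_throughput))                        # 0.5
print(split_adjusted_weeks(backlog_items, split_rate, measured_throughput))   # 1.0
```

With a split rate of 2, the naive forecast is half the adjusted one, which is the under-forecast the slide warns about.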
5 X 1.6 = 8
3 X 1.3 ≈ 4
8 X 1.25 = 10
Fail 3: Ignoring Risks
• Risk = Work that “might” need to be done but we don’t know yet
• Some samples
• Fails on Internet Explorer 6, or now Safari on phones
• Fails performance testing under load, or uses too much memory
• CSS alignment issues with German text translations, things wrap
• Production network security blocks traffic, awaiting vendor to fix
• Fails on real customer data (we designed for 50 items, they have 500)
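One way to stop ignoring risks is to give each one a chance of happening and an amount of extra work, then include them in the simulation. The sketch below is an assumption-heavy illustration: the probabilities and extra item counts are invented, the risk names only echo the samples above, and the function name is hypothetical.

```python
import random


def simulate_total_work(base_items, risks, trials=1000, seed=1):
    """Return sorted total work per trial; each risk adds its work only if it occurs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        total = base_items
        for _name, probability, extra_items in risks:
            if rng.random() < probability:  # did this risk occur in this trial?
                total += extra_items
        totals.append(total)
    return sorted(totals)


# Hypothetical risks: (name, chance of occurring, extra items if it occurs).
risks = [
    ("fails performance testing under load", 0.30, 8),
    ("German text wraps, CSS alignment rework", 0.50, 3),
    ("fails on real customer data", 0.20, 10),
]
totals = simulate_total_work(base_items=60, risks=risks)
print("lowest:", totals[0], "highest:", totals[-1])
print("85th percentile:", totals[int(0.85 * len(totals)) - 1])
```

Feeding these totals, instead of the bare backlog, into a duration forecast is what moves the date when risks are included.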
Forecasts shown at 85th Percentile
WITHOUT RISKS INCLUDED: 27th May (highest late June)
WITH RISKS INCLUDED: 24th June (highest early August)
Bonus fail: High System Utilization
Can’t forecast high utilization systems using item size…
[Chart: Throughput per week for 100 teams; W2-2012 to W21-2014; y-axis 0 to 1,600; series “All” and “Bugs”; two anomalies annotated “WTF 1?” and “WTF 2?”]
Key Take-aways and Resources
• Forecasting requires a system view
• Three samples will outperform intuition (use most recent 7 samples)
• Give multiple options, not just one
• Forecast duration NOT date until “Start Conditions” are defined
• Track actual progress versus planned, and update the model continuously
• Get everything here: Slides and tools:
Bit.ly/SimResources

Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 

Último (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 

I love the smell of data in the morning (getting started with data science) troy magennis

Editor's notes

  1. Welcome to the Forecasting using Data session. I’m Troy Magennis, and I’ve got 40 minutes to convince you that forecasting is both easier than you think and likely to outperform your intuition.
  2. Let’s start by looking at how to answer the first question, How Big.
  3. From https://halobi.com/2016/07/descriptive-predictive-and-prescriptive-analytics-explained/
  4. Let’s start by looking at how to answer the first question, How Big.
  5. Forecasting is about setting better expectations about how reality will eventually unfold. Expectations are subject to all manner of biases, wishful thinking and incomplete logic. Our job in forecasting is to narrow that gap and better align the forecast to reality. It’s about demonstrating that we understand the holistic delivery system well enough to make better, more informed decisions earlier.
  6. Let’s start by looking at how to answer the first question, How Big.
  7. Keeping a history of features and how many stories or story points it took to deliver them is an excellent way to maintain knowledge and lessons over time. Here is a quick and practical way. Whenever you deliver a feature, tally up the story count, add a short description, and place them on a wall somewhere, sorted from lowest to highest. When you are asked how big Feature 4 might be, walk over to the wall with the stakeholder and a few team members and pick where it fits. Done. Teams doing this well can estimate the size of 30 features an hour. Better still, the stakeholders quickly start to get a feel for relative feature size and only ask the team on rare occasions. It has another benefit: it avoids all manner of cognitive biases and wishful thinking. By recording the eventual story count, it also captures that some simple-sounding work turned out to be hard. This is an easy win for any organization.
  8. As fast as reference count forecasting is, sometimes you don’t have relevant historical data and you want to dive a little deeper with the team on size and complexity. My call to action here is to take a sampling approach and to
  9. Google Maps emerged about 10 years ago. It’s now pretty ubiquitous: when we want to travel in unfamiliar areas, we use our phones to navigate. It has saved my marriage, and many others, I suspect. Being around for 10 years means it has undergone a lot of refinement, and it is a pretty good reference for learning how to present forecasts. There are some subtleties that we can easily use in forecasting software and IT projects. First, it doesn’t show just one option, it shows many. Here, two driving alternatives and a public transport alternative. It keeps the human in the decision loop. It presents options, even highlighting the fastest one, but it clearly shows the others in case there is some compelling reason they are useful. Second, it doesn’t give an arrival time yet. It gives a duration. Obviously, it doesn’t yet know when we are leaving home. Third, once we do commit to one of the options, it gives us continuously updated arrival time information. It incorporates the latest traffic issues based on traffic flow ahead of us and even suggests alternative routes if its original route turns out to be less than ideal.
  10. So, let’s incorporate that into our world. Let’s NOT give a single answer; let’s show the options we have considered in case they offer some advantage we haven’t considered. Let’s NOT give a calendar delivery date until the project has started; let’s make decisions using duration in weeks or sprints as a way to decide which option to pick. Let’s NOT perform heroics at the very end of a project; let’s continuously track our progress against the forecast and adapt and react quickly, earlier. I’m going to show you how I incorporate these aspects into my forecasting engagements.
  11. Let’s start by looking at how to answer the first question, How Big.
  12. Forecasting revolves around answering questions. Mostly those questions are about an uncertain future, so we can’t be certain. It’s up to us to be very clear about how certain we are. The answer is going to be used for a decision, and uncertainty is an important part of that equation. And we need to know how to do it fast. We can improve any forecast by spending more time analyzing, but that’s not the goal. The goal is to be better than what is currently done. That’s a pretty low bar! I want to emphasize the “right question” part of this definition. “When will it be done?” is a terrible question. “When do we need it?” is a better question, followed by “How can we achieve that?” Avoid “When will it be done?” at all costs.
  13. Estimating size seems to be an emotional trigger for many people. I’m going to show you a couple of ways to answer the size question rapidly. Neither of these techniques uses story point estimation or planning poker. Not that there is anything wrong with those techniques; they are just relatively slow and stressful to produce for the level of detail we need. The first call to action is to avoid quantitatively answering this question prematurely. Determine what your answer will cause. Often, the stakeholder is asking to feel out viability. Asking “When is it needed?” achieves an almost instant answer, and everyone can move on.