SlideShare una empresa de Scribd logo
1 de 120
Climbing out of a crisis-loop
at the BBC
Katherine Kirk
Raf Gemmail
QCon London 2013
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/bbc-agile-case-study
Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Session code: 7531
Introduction: the comfort page
• Katherine Kirk, Independent
– Was PM on this project
• Background
– Contracting for over 10 years
» Investment banks, Media companies, Trading companies…
mostly large corporations
» Previously:
• Rally Coach – John Deere, Philips, Continental, Petris etc
• BBC - R&D, iPlayer, Core services
– MSc Software Engineering, Oxford
• Raf Gemmail
– Was Dev on this project
• One scenario
• Two perspectives
Disclaimer
• This is the view of the presenters NOT the BBC
– The current team is working well
Keeping buzz words to a minimum
... swimlanes, policies, WIP limits, empowerment,
cooperation, etc etc ...
• Instead:
– Case study + plain language
• Why?
– At the end of the day: its about getting stuff done
This pres is about
• Working past the industry sell
– Do Scrum or Kanban ‘right’
• What happens if you can’t do Scrum or
Kanban ‘properly’?
• Can you still be Agile/Lean
• Can you get out of a pretty bad crisis?
• We think we did
Format
• What was the crisis?
• What Scrum and Kanban we did ‘roughly’?
• What did we differently?
• Why did the crisis loop stop?
• Not a typical agile team scenario
– Purely back end team
– Not cross-functional
– All Perl/Java devs doing same thing
– No front end
– No vertical slicing
In 3 months
• Calmed the crisis-to-crisis cycle that had been
running for nearly 2 years
• Began building new solution
• Kept things running AND improved the process at
the same time
• Turned around stakeholder relationships
• Despite
– People leaving and a restructure
But we did everything ‘incorrectly’
Kanban-ish
Scrum-ish
So what did we do differently?
And were we still Agile/Lean if we didn’t follow
the ‘rule book’?
Key factor in our ‘success’
• Agile/Lean are principles NOT methods
• This means you can use your brain to solve
stuff, as long as it aligns with the principles(!)
• Hmmm....
THE CASE STUDY: CONTEXT
Team
• Specialist, metadata delivery back end team
• Create feeds to display content
– Main ‘client’: iPlayer
– Daily traffic peak of between 200 and 500
requests/second (Not including cached responses)
– Over 700 playback formats
– Servicing hundreds of devices
• Mobile, IPTV, PC, tablets (in all variants and models)
Put into perspective
• “... 30m requests for iPlayer content via mobile or
tablet in July [2012] alone
• [represents only] 20% of all requests for iPlayer
programmes across all platforms... “
• Approx 150 million requests per month
• No metadata feed = no content display
– Front end teams are dependent
• cannot display content without getting feed
• cannot change or edit a feed – needs specialist expertise
http://www.bbc.co.uk/blogs/internet/posts/iplayer_mobile_downloads
Front end
teams
Core/Back
end teams
Integration &
Operations
Operations
Tstng
Fierce backlog competition
iPlayer Radio & Music Externals...
mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc
Meta
data
Logic Search etc etc etc
Testing &
Integration
Front end
teams
Core/Back
end teams
Integration &
Operations
Operations
Tstng
Integration & Test= 4 weeks min
iPlayer Radio & Music Externals...
mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc
Meta
data
Logic Search etc etc etc
Testing &
Integration
* Extra workload on top of planned items (a sprint never ends...)
*
Operations: One big bottleneck
Front end
teams
Core/Back
end teams
Integration &
Operations
iPlayer Radio & Music Externals...
mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc mobile
IPTV
PC
etc
etc
etc
etc
Meta
daata
Logic Search etc etc etc
Operations
Testing &
Integration
Divisional
General
Manager
Heads of
Delivery
/Product
managers
Project
Manager
Official Communication
3 main issues for back end specialist team:
• Division heads do not necessarily have the expertise
• Prioritisation via Chinese whispers
• Time delay for decision making
So... if it’s urgent?
The crisis-loop
• Desperately holding on to Scrum
• Stakeholders have lost trust
• Technical debt increasing
• Work not done until urgent
• Silo expertise
• Management by manouvre
In summary
• Awesome team
• Running hard to stand still
• A ‘victim’ to its environment and corporate
structure
APPROACH
How to go about this?
• Others had gone through same thing and left
• Pressure
– Make change NOW
– Look like the expert
– Save the day!
• Highly specialised area: how could I know
what was wrong?
– Decided to observe first
Observation time
• I looked like an idiot
Observations after 3 weeks
• They were making all their commitments last
minute BUT
– “Reliance on 'hero' effort is the norm!
– Team is EXHAUSTED
– WHY?????
Causes
• Over 60% of team sprint activity = live and unexpected
issues
• Actual time on planned work is at 10% of management
expectation
• Struggling with stakeholder liaison - no visibility of progress
• Bugs taking 70 days turnover
• Acceptance Criteria non existent
• Already 6 month plus backlog
• Reviewing 20 more additional requests of work per week
• Capacity falling (ppl leaving)
• Difficulty hiring: specialist knowledge
New culture:
Under promise / Over deliver
Ask the EXPERTS what to do
Hand the problem over to the REAL
problem solvers: those doing the work!
–THE ENGINEERS!!!!!
(Warning to Managers: most engineers are more qualified at
solving problems than you are)
Solve problems collaboratively
Action
PM gathers/
collates info
Presents to
dev team
• (group or
individual)
Brainstorming
Reach general
consensus
• Time box
‘experiment’
Change through collaborative
experimentation
• Define agreed timeframe
• Action
• Review
• Keep/try something else
THE USUAL EXPLANATION:
SCRUM & KANBAN
Kept some Scrum
• Kept Scrum just for 40% workload (planned
delivery)
– Matching the rest of the org
• Kept meeting templates
– But didn’t always use them ‘in the right way’
Did ‘minimal’ Kanban
• Observed
• Visualised
• Incremental improvement after observations
of patterns
• No ‘proper’ measures
• No fancy graphs or charts
The original ‘day board’
To do Doing Done
Bugs?
Most requested:
What state is the work actually in?
To do doing done
Blocked Query Backlog Ready Doing For
review
Merged Ready
for test
Doing Done
Onto the day board....
Blocked Query Backlog Ready Doing For
review
Merged Ready
for test
Doing Done
What are we working on?
Sprint backlog urgent requests
Type Response needed
Bugs Days
Planned work Every two weeks ideally against
a 6 month plan
Performance & Optimisation Indefinite
Technical Debt Indefinite
Operations development Kneejerk (hourly?)
Ring fenced reality
Bugs (1-3 days)
Ops-Dev (now)
Performance
& Optimisation
Slate: Planned
Delivery
Slate: New
solution?
Response team
Delivery team
60%
40%
Actual
capacity
Type of work
• And then, incrementally improved
• 40% = Delivery team = Scrum-style
• 60% = Response team = Kanban style
USUALLY PRESENTATION ENDS
HERE....
In 3 months
 Results
 Live issues down (60% to 10-20%)
 Met delivery schedule thus far
 Most viewed program on iPlayer = no blip
 Improved stakeholder liaison
▪ From Red to Amber forTest and iPlayer (day to day operations, not
slate)
▪ Online and physical visibility of progress
▪ Bugs from 70 days to less than a sprint turnover
AND THAT’S IT????
• REALLY???
• Was that all it took?
• A bit of methodology?
HELL NO! Don’t be fooled
• Its not about the methods, its about people
– (and if you don’t believe me, read everything from
Alistair Cockburn, twice)
• For example
– Boards/Visualisations etc represent human
interactions
– Meetings / gatherings in Scrum are people
collaboration ‘tools’
WHAT WE DID ‘BEHIND THE
SCENES’
Collaboration
• We concentrated very hard on working together
openly and truthfully
• It was HARD work
• It was counter intuitive
• It didn’t feel comfortable
• Some people really struggled with it at the start
Examples
• Quirky stuff we did together
– Resulted from collaborating
– Rather than following methodology instructions
Workstream
Methodology Mix-n-Match
ResponseDelivery
Planned
delivery
(2 weeks)
Bugs
(1-3 days)
Ops Dev
(NOW)
P&O
(continuous)
Tech Debt
[Scrum –style planning to match stakeholder demand]
[Daily planning with Product Owner and Stakeholders]
[Hourly response and review]
[Both planned and responsive]
Devs rotate through
workstreams every 2 weeks
[Both planned and responsive]
Benefits
• Fairness
• Removing ‘single points of failure’
• Distributing knowledge throughout the team
– Holidays
– Sickness
– Mentoring
• Understanding of impact of coding practices
Changed the way we communicated:
Expand/Contract*
?
Discover
Focus
Discover
Focus
Discover
Focus
Discover
Focus
Main
issues
Key
Causes
Best
solution
Effective
action
Issues Causes Solutions Actions
•Expand: what’s
wrong?
•Contract: what’s the
main issues?
•Expand: what’s
possible causes?
•Contract: what’s
the main
causes?
•Expand: how could
we fix this?
•Contract: what
would make the
most effect?
•Expand: how should
we go about this?
•Contract: what,
timeframe, how,
who?
*Rachel Davies knows a lot about this
In everything we did
• Conversations
• Reviews
• Retrospectives
• Speculations
Issues – Causes – Solutions - Actions
Examine the ‘truth’ openly
Stakeholder
Upper
Mgmt
Bugs
Support requests
Features
Per sprint
Adhoc requests
Delivery
PM
Devs
Emails
Conversations
Meetings
Ops-dev
Dev team
Jira
Collaborative discussions result
Stakeholder liaison: new set up
Dynamite
Inbox
In JIRA
Review:
PO / PM / Dev
and Test
Stakeholder
Upper
Mgmt
Slate
Per sprint
Per 24 hours
Feature
Champions
assigned here
Short tasks
needing quick
response
Dev Tester
Triage point
BONUS – solving issue
by collaborating means
we already have buyin
Overcame: Expertise silos
Backend /
Core team
devs
Front end
devs
Operations
devs
? ?
3 main issues for all:
• Integration
• Writing requirements/requests
• Understanding (what each other has done/why)
Champions
• Strategic, ‘inner’ PO role
– NOT a ‘dogs-body’
– Keeps the overview
– Responsible for a feature or area of the app
• Inception > live > maintenance and documentation
– QUALITY: What / how / when to code
– Direct liaise with stakeholder devs
– Breaks down work for backlog if required with PO
– Reports on progress
– Involved spearheading realistic estimation
Stakeholder
team
Stakeholder
team
Stakeholder
team
Champions:
REAL product ownership
PO dev
dev
dev
Stakeholder
team
dev
Backlog,
priority,
strategy
Performance
and
optimisation
Bugs/
Technical Debt
Initiated Team Peer Sessions
2 wks 2 wks 2 wks 2 wks 2 wks
* Optional: Estimating / review / info sharing
Sprints
Standups
Peer Sessions*
Planning
Retrospective
Standups – Kanban style
• Issues only
• Info sessions after, if required
• Blocked / hold resolution ASAP
• right to left
Peer Sessions (optional)
• Information transfer
• Feature champion led
• All on same page
• Data to the team (engagement)
• Strategy / plan comms
• Estimation of large features
• Reviewing effectiveness/ capacity
Planning
• Assign support team
• Rotate duties
• Estimation of support work
•Review/resolve operations issues
Defined ideal in REAL words
Ideal Example of measure of ideal
Increased quality no hemorrhaging bugs, last minute surprises and live issues;
significant reduction of usage of dev for the ‘bugs’ role per sprint
Significant reduction of
technical debt and it’s effects
Time for refactoring is valued and provided
Refactoring has clearly been done
No ‘cowboy’ workaround pressure from Product Managers or upper
management
Significant reduction to backlog
of planned work
work only on what is required
Jira backlog only contains relevant and organized tickets
Good tracking of current and
upcoming workload
no sudden surprises – e.g. B2B
Increased adaptability we can bend and flex with demand: technical solution, devs, testers
and process
Increased predictability on time delivery for committed items
Commitment process is realistic no promising by upstream of what we are not likely to deliver on
time – consultation with team/PMs BEFORE commitment
Realistic input and direction
from upstream management
discussing not just what to do, but also HOW – incorporating
capacity limitations
Trusted PM/Dev/Tester/upper
management relationship
request from upper management or stakeholder is translated
effectively, and efficiently flows through the system with a quality
output
More transparent upper
management activities
what’s coming up is clear to the team and stakeholders
Happy stakeholders effective stakeholder expectation management: bravery to
communicate capacity limitations and other commitments
good communication of process, progress on items and outward
documentation - example: business friendly release notes
Engaged and empowered devs all devs currently in position are retained, and scores of ‘job
satisfaction’ is around 7-8 out of 10, with 85% of all devs indicating
• Simple solutions
• Effective – for our context
• Not in the rulebook
• But in line with the principles of Agile/Lean
REAL RESULT
As we said before: In 3 months
 Results
 Live issues down (60% to 10-20%)
 Met delivery schedule thus far
 Most viewed program on iPlayer = no blip
 Improved stakeholder liaison
▪ From Red to Amber forTest and iPlayer (day to day operations, not
slate)
▪ Online and physical visibility of progress
▪ Bugs from 70 days to less than a sprint turnover
BUT: for the next 3 months
• WITHOUT a manager or coach
• Team self managed
– Kept improving
– Didn’t fall back into crisis
– Kept good stakeholder relationships
18 months later
• From all reports, the team is still going strong
– Now have a project manager
– Haven’t fallen back into crisis
Empowerment
People solving problems together
=
Learning
=
Can solve problems on their own
=
Less handholding/time wasting/cost!
REFLECTION
Summary
• Although we did
– Scrum-ish
– Kanban-ish
• Why did it work?
• Here is a hint....
– Individuals and interactions (over processes and tools)
– Customer collaboration (over customer negotiation)
– Responding to change (over following a plan)
– Etc..
Agile/Lean is not a method
• Kanban and Scrum are Agile/Lean
– But Agile/Lean are not necessarily Kanban or Scrum
• The principles can save ‘difficult’ projects
– Even when methods can’t
• Use principles as your guide
• Reality as your driver
• And methods as your tools
In a crisis loop
• Suggestion
– If you have to choose between a process (e.g.
Scrum or Kanban) and adhering to Agile/Lean
Principles....
– Choose the principles!
(err... that’d be this one: individuals and interactions over
processes and tools)
;-)
RAF GEMMAIL
A Dev's Eye View
We practiced Scrum:
 Sprints
 Pointing
 Planning poker
 XP
But during the Sprint:
 URGENT issues
 Out of remit features
But during the Sprint:
 URGENT issues
 Out of remit features
 Failure to learn from history
Planned work compromised by
unplanned work
The climate
 Code decay
The climate
 Code decay
 Reviews blocking features
The climate
 Code decay
 Reviews blocking features
 Devs and PM's leaving
The climate
 Code decay
 Reviews blocking features
 Devs and PM's leaving
 No time to improve dev process
The climate
 Code decay
 Reviews blocking features
 Devs and PM's leaving
 No time to improve dev process
09:30 Almost done
10:00 Stand up “I just have to merge
it.”
Merge
Test
11:00 Done
09:30 Nearly done
10:00 Stand up
Merge
Test Failed
Code
Test
Push
11:30 Done
09:30 Nearly done
10:00 Stand up
Merge
Test Failed
Bug: “Urgent! Who is
available?”
Code
Test
Push
14:00 Done
09:30 Nearly done
10:00 Stand up
Merge
Test Failed
Bug: “Stake holder complained..”
Code
Production Issue
Test
Push
18:00 Done
•90 mins work
== 8h day
Katherine Kirk on the Bridge
 You guys are AMAZING
 But Stakeholders are scared
Katherine Kirk on the Bridge
 You guys are AMAZING
 But Stakeholders are scared
 What do you think we should do?
Katherine Kirk on the Bridge
 You guys are AMAZING
 But Stakeholders are scared
 What do you think we should do?
Did she just ask us to fix the PM
function??
Are the stake holders letting her?
Nemawashi (根回し)
Review
Change
Review
Change
D
e
v
P
r
o
c
e
s
s
1
D
e
v
P
r
o
c
e
s
s
2
D
e
v
P
r
o
c
e
s
s
3
Improve without compromising
current workload
The 'normal' Retrospective noise
Ownership
Review: Expand/Contract
?
Discover
Focus
Discover
Focus
Discover
Focus
Discover
Focus
Main
issues
Key
Causes
Best
solution
Effective
action
Issues Causes Solutions Action
Issues Dump
Example
Issues Dump Grouping
Example
Who knows what?
Cant keep up
Decaying Code
Issues Dump Grouping Cause?
Example
Single points
of failure
Too
reactionary
(accepting
too much)
Tech debt
Who knows what?
Cant keep up
Decaying Code
Action
Single points
of failure
Too
reactionary
(accepting
too much)
Tech debt
Cause
Action
Single points
of failure
Too
reactionary
(accepting
too much)
Tech debt
Cause Solution options
Action
Single points
of failure
Too
reactionary
(accepting
too much)
Tech debt
Rotate devs
through
workstreams
Prioritisation
and triage
New
workstream
on board
Cause Solution options Will try
No more heros
 A reactive Pull-based ResponseTeam
 Feature Champions to PO critical features
 An EmpoweredTeam!
Response team:
 Bugs
 Ops
 Performance and
optimisation
'Everyone-is-a-Hero' Rotation
Ops Bugs
'Everyone-is-a-Hero' Rotation
Ops
 Release Process
 Technical Debt
 Process automation
 Stability
Bugs
'Everyone-is-a-Hero' Rotation
Ops
 Release Process
 Technical Debt
 Process automation
 Stability
Bugs
Burdensome
Needs to be done
Often user error
'Everyone-is-a-Hero' Rotation
Ops
 Release Process
 Technical Debt
 Process automation
 Stability
Bugs
Burdensome
Needs to be done
Often user error
Shared Knowledge
Planned work: A new day!
 9am:Work on feature – include someTD
 Stand up
 CODE (Review / Have code review)
 TEST (Test – Merge –Test – Push)
 1800: HOME
1 days work
== 1 day uninterrupted work!!!!!
 9:30am Check Splunk Alerts
 10am Stand up
 10:15am Pull P&O card
 12pm Discuss optimisation
with Recommendations team
 2pm Pair with TL on incident
 3pm Review Related Code
and raise ticket
 4pm Refactor and speed up
some feed •P&O Officer's Log
1 days work == whatever needed!!!!!
Response work: A new way!
Visualisations provided a more
granular understanding
 Dev = {analysis, dev, review, testing, merge}
Visualisations provided a more
granular understanding
 Dev = {analysis, dev, review, testing, merge}
 “I'm nearly done” → “He's in review”
Visualisations provided a more
granular understanding
 Dev = {analysis, dev, review, testing, merge}
 “I'm nearly done” → “He's in review”
 “I'm merging” → “Dev's still got tests to run”
Visualisations provided a more
granular understanding
 Dev = {analysis, dev, review, testing, merge}
 “I'm nearly done” → “He's in review”
 “I'm merging” → “The dev's still got tests to run”
 Test Column → Test Board
Visualisations provided a more
granular understanding
 Dev = {analysis, dev, review, testing, merge}
 “I'm nearly done” → “He's in review”
 “I'm merging” → “The dev's still got tests to run”
 Test Column → Test Board
Self Management
 Continued to improve “established”
process
 Experiments with pointing
 Moves towards pure TDD
 New PM → went to Scrumban
Communication & Collaboration
Over
Process
On Reflection
Consider
 If we'd tried
Scrum-right
Kanban-right
 Not so Agile/Lean?
 Results as quick?
 As Sustainable?
 Self-managing?
Principles
Lean
 Eliminate waste
 Amplify learning
 Decide as late as
possible
 Deliver as fast as
possible
 Empower the team
 Build integrity in
 See the whole
Agile Manifesto
Individuals and
interactions over
processes and tools
Working software over
comprehensive
documentation
Customer collaboration
over contract
negotiation
Responding to change
over following a plan
Nemawashi (根回し)

Más contenido relacionado

Más de C4Media

Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsC4Media
 

Más de C4Media (20)

Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 

Último

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Climbing Out of a Crisis Loop: How a Critical BBC Back-end Team Reigned in a Workflow Crisis-to-crisis Cycle

  • 1. Climbing out of a crisis-loop at the BBC Katherine Kirk Raf Gemmail QCon London 2013
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /bbc-agile-case-study
  • 3. Presented at QCon London www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 5. Introduction: the comfort page • Katherine Kirk, Independent – Was PM on this project • Background – Contracting for over 10 years » Investment banks, Media companies, Trading companies… mostly large corporations » Previously: • Rally Coach – John Deere, Philips, Continental, Petris etc • BBC - R&D, iPlayer, Core services – MSc Software Engineering, Oxford • Raf Gemmail – Was Dev on this project
  • 6. • One scenario • Two perspectives
  • 7. Disclaimer • This is the view of the presenters NOT the BBC – The current team is working well
  • 8. Keeping buzz words to a minimum ... swimlanes, policies, WIP limits, empowerment, cooperation, etc etc ... • Instead: – Case study + plain language • Why? – At the end of the day: its about getting stuff done
  • 9. This pres is about • Working past the industry sell – Do Scrum or Kanban ‘right’ • What happens if you can’t do Scrum or Kanban ‘properly’? • Can you still be Agile/Lean • Can you get out of a pretty bad crisis? • We think we did
  • 10. Format • What was the crisis? • What Scrum and Kanban we did ‘roughly’? • What did we differently? • Why did the crisis loop stop?
  • 11. • Not a typical agile team scenario – Purely back end team – Not cross-functional – All Perl/Java devs doing same thing – No front end – No vertical slicing
  • 12. In 3 months • Calmed the crisis-to-crisis cycle that had been running for nearly 2 years • Began building new solution • Kept things running AND improved the process at the same time • Turned around stakeholder relationships • Despite – People leaving and a restructure
  • 13. But we did everything ‘incorrectly’ Kanban-ish Scrum-ish So what did we do differently? And were we still Agile/Lean if we didn’t follow the ‘rule book’?
  • 14. Key factor in our ‘success’ • Agile/Lean are principles NOT methods • This means you can use your brain to solve stuff, as long as it aligns with the principles(!) • Hmmm....
  • 15. THE CASE STUDY: CONTEXT
  • 16. Team • Specialist, metadata delivery back end team • Create feeds to display content – Main ‘client’: iPlayer – Daily traffic peak of between 200 and 500 requests/second (Not including cached responses) – Over 700 playback formats – Servicing hundreds of devices • Mobile, IPTV, PC, tablets (in all variants and models)
  • 17. Put into perspective • “... 30m requests for iPlayer content via mobile or tablet in July [2012] alone • [represents only] 20% of all requests for iPlayer programmes across all platforms... “ • Approx 150 million requests per month • No metadata feed = no content display – Front end teams are dependent • cannot display content without getting feed • cannot change or edit a feed – needs specialist expertise http://www.bbc.co.uk/blogs/internet/posts/iplayer_mobile_downloads
  • 18. Front end teams Core/Back end teams Integration & Operations Operations Tstng Fierce backlog competition iPlayer Radio & Music Externals... mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc Meta data Logic Search etc etc etc Testing & Integration
  • 19. Front end teams Core/Back end teams Integration & Operations Operations Tstng Integration & Test= 4 weeks min iPlayer Radio & Music Externals... mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc Meta data Logic Search etc etc etc Testing & Integration * Extra workload on top of planned items (a sprint never ends...) *
  • 20. Operations: One big bottleneck Front end teams Core/Back end teams Integration & Operations iPlayer Radio & Music Externals... mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc mobile IPTV PC etc etc etc etc Meta daata Logic Search etc etc etc Operations Testing & Integration
  • 21. Divisional General Manager Heads of Delivery /Product managers Project Manager Official Communication 3 main issues for back end specialist team: • Division heads do not necessarily have the expertise • Prioritisation via Chinese whispers • Time delay for decision making
  • 22. So... if it’s urgent?
  • 23. The crisis-loop • Desperately holding on to Scrum • Stakeholders have lost trust • Technical debt increasing • Work not done until urgent • Silo expertise • Management by manouvre
  • 24. In summary • Awesome team • Running hard to stand still • A ‘victim’ to its environment and corporate structure
  • 26. How to go about this? • Others had gone through same thing and left • Pressure – Make change NOW – Look like the expert – Save the day! • Highly specialised area: how could I know what was wrong? – Decided to observe first
  • 27. Observation time • I looked like an idiot
  • 28. Observations after 3 weeks • They were making all their commitments last minute BUT – “Reliance on 'hero' effort is the norm! – Team is EXHAUSTED – WHY?????
  • 29. Causes • Over 60% of team sprint activity = live and unexpected issues • Actual time on planned work is at 10% of management expectation • Struggling with stakeholder liaison - no visibility of progress • Bugs taking 70 days turnover • Acceptance Criteria non existent • Already 6 month plus backlog • Reviewing 20 more additional requests of work per week • Capacity falling (ppl leaving) • Difficulty hiring: specialist knowledge
  • 30. New culture: Under promise / Over deliver
  • 31. Ask the EXPERTS what to do Hand the problem over to the REAL problem solvers: those doing the work! –THE ENGINEERS!!!!! (Warning to Managers: most engineers are more qualified at solving problems than you are)
  • 32. Solve problems collaboratively Action PM gathers/ collates info Presents to dev team • (group or individual) Brainstorming Reach general consensus • Time box ‘experiment’
  • 33. Change through collaborative experimentation • Define agreed timeframe • Action • Review • Keep/try something else
  • 35. Kept some Scrum • Kept Scrum just for 40% workload (planned delivery) – Matching the rest of the org • Kept meeting templates – But didn’t always use them ‘in the right way’
  • 36. Did ‘minimal’ Kanban • Observed • Visualised • Incremental improvement after observations of patterns • No ‘proper’ measures • No fancy graphs or charts
  • 37. The original ‘day board’ To do Doing Done Bugs?
  • 38. Most requested: What state is the work actually in? To do doing done Blocked Query Backlog Ready Doing For review Merged Ready for test Doing Done
  • 39. Onto the day board.... Blocked Query Backlog Ready Doing For review Merged Ready for test Doing Done
  • 40. What are we working on? Sprint backlog urgent requests Type Response needed Bugs Days Planned work Every two weeks ideally against a 6 month plan Performance & Optimisation Indefinite Technical Debt Indefinite Operations development Kneejerk (hourly?)
  • 41. Ring fenced reality Bugs (1-3 days) Ops-Dev (now) Performance & Optimisation Slate: Planned Delivery Slate: New solution? Response team Delivery team 60% 40% Actual capacity Type of work
  • 42. • And then, incrementally improved • 40% = Delivery team = Scrum-style • 60% = Response team = Kanban style
  • 44. In 3 months  Results  Live issues down (60% to 10-20%)  Met delivery schedule thus far  Most viewed program on iPlayer = no blip  Improved stakeholder liaison ▪ From Red to Amber forTest and iPlayer (day to day operations, not slate) ▪ Online and physical visibility of progress ▪ Bugs from 70 days to less than a sprint turnover
  • 46. • REALLY??? • Was that all it took? • A bit of methodology?
  • 47. HELL NO! Don’t be fooled • Its not about the methods, its about people – (and if you don’t believe me, read everything from Alistair Cockburn, twice) • For example – Boards/Visualisations etc represent human interactions – Meetings / gatherings in Scrum are people collaboration ‘tools’
  • 48. WHAT WE DID ‘BEHIND THE SCENES’
  • 49. Collaboration • We concentrated very hard on working together openly and truthfully • It was HARD work • It was counter intuitive • It didn’t feel comfortable • Some people really struggled with it at the start
  • 50. Examples • Quirky stuff we did together – Resulted from collaborating – Rather than following methodology instructions
  • 51. Workstream Methodology Mix-n-Match ResponseDelivery Planned delivery (2 weeks) Bugs (1-3 days) Ops Dev (NOW) P&O (continuous) Tech Debt [Scrum –style planning to match stakeholder demand] [Daily planning with Product Owner and Stakeholders] [Hourly response and review] [Both planned and responsive] Devs rotate through workstreams every 2 weeks [Both planned and responsive]
  • 52. Benefits • Fairness • Removing ‘single points of failure’ • Distributing knowledge throughout the team – Holidays – Sickness – Mentoring • Understanding of impact of coding practices
  • 53. Changed the way we communicated: Expand/Contract* ? Discover Focus Discover Focus Discover Focus Discover Focus Main issues Key Causes Best solution Effective action Issues Causes Solutions Actions •Expand: what’s wrong? •Contract: what’s the main issues? •Expand: what’s possible causes? •Contract: what’s the main causes? •Expand: how could we fix this? •Contract: what would make the most effect? •Expand: how should we go about this? •Contract: what, timeframe, how, who? *Rachel Davies knows a lot about this
  • 54. In everything we did • Conversations • Reviews • Retrospectives • Speculations Issues – Causes – Solutions - Actions
  • 55. Examine the ‘truth’ openly Stakeholder Upper Mgmt Bugs Support requests Features Per sprint Adhoc requests Delivery PM Devs Emails Conversations Meetings Ops-dev Dev team Jira
  • 57. Stakeholder liaison: new set up Dynamite Inbox In JIRA Review: PO / PM / Dev and Test Stakeholder Upper Mgmt Slate Per sprint Per 24 hours Feature Champions assigned here Short tasks needing quick response Dev Tester Triage point BONUS – solving issue by collaborating means we already have buyin
  • 58. Overcame: Expertise silos Backend / Core team devs Front end devs Operations devs ? ? 3 main issues for all: • Integration • Writing requirements/requests • Understanding (what each other has done/why)
  • 59. Champions • Strategic, ‘inner’ PO role – NOT a ‘dogs-body’ – Keeps the overview – Responsible for a feature or area of the app • Inception > live > maintenance and documentation – QUALITY: What / how / when to code – Direct liaise with stakeholder devs – Breaks down work for backlog if required with PO – Reports on progress – Involved spearheading realistic estimation
  • 60. Stakeholder team Stakeholder team Stakeholder team Champions: REAL product ownership PO dev dev dev Stakeholder team dev Backlog, priority, strategy Performance and optimisation Bugs/ Technical Debt
  • 61. Initiated Team Peer Sessions 2 wks 2 wks 2 wks 2 wks 2 wks * Optional: Estimating / review / info sharing Sprints Standups Peer Sessions* Planning Retrospective Standups – Kanban style • Issues only • Info sessions after, if required • Blocked / hold resolution ASAP • right to left Peer Sessions (optional) • Information transfer • Feature champion led • All on same page • Data to the team (engagement) • Strategy / plan comms • Estimation of large features • Reviewing effectiveness/ capacity Planning • Assign support team • Rotate duties • Estimation of support work •Review/resolve operations issues
  • 62. Defined ideal in REAL words Ideal Example of measure of ideal Increased quality no hemorrhaging bugs, last minute surprises and live issues; significant reduction of usage of dev for the ‘bugs’ role per sprint Significant reduction of technical debt and it’s effects Time for refactoring is valued and provided Refactoring has clearly been done No ‘cowboy’ workaround pressure from Product Managers or upper management Significant reduction to backlog of planned work work only on what is required Jira backlog only contains relevant and organized tickets Good tracking of current and upcoming workload no sudden surprises – e.g. B2B Increased adaptability we can bend and flex with demand: technical solution, devs, testers and process Increased predictability on time delivery for committed items Commitment process is realistic no promising by upstream of what we are not likely to deliver on time – consultation with team/PMs BEFORE commitment Realistic input and direction from upstream management discussing not just what to do, but also HOW – incorporating capacity limitations Trusted PM/Dev/Tester/upper management relationship request from upper management or stakeholder is translated effectively, and efficiently flows through the system with a quality output More transparent upper management activities what’s coming up is clear to the team and stakeholders Happy stakeholders effective stakeholder expectation management: bravery to communicate capacity limitations and other commitments good communication of process, progress on items and outward documentation - example: business friendly release notes Engaged and empowered devs all devs currently in position are retained, and scores of ‘job satisfaction’ is around 7-8 out of 10, with 85% of all devs indicating
  • 63. • Simple solutions • Effective – for our context • Not in the rulebook • But in line with the principles of Agile/Lean
  • 65. As we said before: In 3 months  Results  Live issues down (60% to 10-20%)  Met delivery schedule thus far  Most viewed program on iPlayer = no blip  Improved stakeholder liaison ▪ From Red to Amber forTest and iPlayer (day to day operations, not slate) ▪ Online and physical visibility of progress ▪ Bugs from 70 days to less than a sprint turnover
  • 66. BUT: for the next 3 months • WITHOUT a manager or coach • Team self managed – Kept improving – Didn’t fall back into crisis – Kept good stakeholder relationships
  • 67. 18 months later • From all reports, the team is still going strong – Now have a project manager – Haven’t fallen back into crisis
  • 68. Empowerment People solving problems together = Learning = Can solve problems on their own = Less handholding/time wasting/cost!
  • 70. Summary • Although we did – Scrum-ish – Kanban-ish • Why did it work? • Here is a hint.... – Individuals and interactions (over processes and tools) – Customer collaboration (over customer negotiation) – Responding to change (over following a plan) – Etc..
  • 71. Agile/Lean is not a method • Kanban and Scrum are Agile/Lean – But Agile/Lean are not necessarily Kanban or Scrum • The principles can save ‘difficult’ projects – Even when methods can’t • Use principles as your guide • Reality as your driver • And methods as your tools
  • 72. In a crisis loop • Suggestion – If you have to choose between a process (e.g. Scrum or Kanban) and adhering to Agile/Lean Principles.... – Choose the principles! (err... that’d be this one: individuals and interactions over processes and tools) ;-)
  • 74.
  • 75. A Dev's Eye View
  • 76. We practiced Scrum:  Sprints  Pointing  Planning poker  XP
  • 77. But during the Sprint:  URGENT issues  Out of remit features
  • 78. But during the Sprint:  URGENT issues  Out of remit features  Failure to learn from history
  • 79. Planned work compromised by unplanned work
  • 81. The climate  Code decay  Reviews blocking features
  • 82. The climate  Code decay  Reviews blocking features  Devs and PM's leaving
  • 83. The climate  Code decay  Reviews blocking features  Devs and PM's leaving  No time to improve dev process
  • 84. The climate  Code decay  Reviews blocking features  Devs and PM's leaving  No time to improve dev process
  • 85. 09:30 Almost done 10:00 Stand up “I just have to merge it.” Merge Test 11:00 Done
  • 86. 09:30 Nearly done 10:00 Stand up Merge Test Failed Code Test Push 11:30 Done
  • 87. 09:30 Nearly done 10:00 Stand up Merge Test Failed Bug: “Urgent! Who is available?” Code Test Push 14:00 Done
  • 88. 09:30 Nearly done 10:00 Stand up Merge Test Failed Bug: “Stake holder complained..” Code Production Issue Test Push 18:00 Done •90 mins work == 8h day
  • 89. Katherine Kirk on the Bridge  You guys are AMAZING  But Stakeholders are scared
  • 90. Katherine Kirk on the Bridge  You guys are AMAZING  But Stakeholders are scared  What do you think we should do?
  • 91. Katherine Kirk on the Bridge  You guys are AMAZING  But Stakeholders are scared  What do you think we should do? Did she just ask us to fix the PM function?? Are the stake holders letting her?
  • 94. The 'normal' Retrospective noise Ownership
  • 97. Issues Dump Grouping Example Who knows what? Cant keep up Decaying Code
  • 98. Issues Dump Grouping Cause? Example Single points of failure Too reactionary (accepting too much) Tech debt Who knows what? Cant keep up Decaying Code
  • 100. Action Single points of failure Too reactionary (accepting too much) Tech debt Cause Solution options
  • 101. Action Single points of failure Too reactionary (accepting too much) Tech debt Rotate devs through workstreams Prioritisation and triage New workstream on board Cause Solution options Will try
  • 102. No more heros  A reactive Pull-based ResponseTeam  Feature Champions to PO critical features  An EmpoweredTeam!
  • 103. Response team:  Bugs  Ops  Performance and optimisation
  • 105. 'Everyone-is-a-Hero' Rotation Ops  Release Process  Technical Debt  Process automation  Stability Bugs
  • 106. 'Everyone-is-a-Hero' Rotation Ops  Release Process  Technical Debt  Process automation  Stability Bugs Burdensome Needs to be done Often user error
  • 107. 'Everyone-is-a-Hero' Rotation Ops  Release Process  Technical Debt  Process automation  Stability Bugs Burdensome Needs to be done Often user error Shared Knowledge
  • 108. Planned work: A new day!  9am:Work on feature – include someTD  Stand up  CODE (Review / Have code review)  TEST (Test – Merge –Test – Push)  1800: HOME 1 days work == 1 day uninterrupted work!!!!!
  • 109.  9:30am Check Splunk Alerts  10am Stand up  10:15am Pull P&O card  12pm Discuss optimisation with Recommendations team  2pm Pair with TL on incident  3pm Review Related Code and raise ticket  4pm Refactor and speed up some feed •P&O Officer's Log 1 days work == whatever needed!!!!! Response work: A new way!
  • 110. Visualisations provided a more granular understanding  Dev = {analysis, dev, review, testing, merge}
  • 111. Visualisations provided a more granular understanding  Dev = {analysis, dev, review, testing, merge}  “I'm nearly done” → “He's in review”
  • 112. Visualisations provided a more granular understanding  Dev = {analysis, dev, review, testing, merge}  “I'm nearly done” → “He's in review”  “I'm merging” → “Dev's still got tests to run”
  • 113. Visualisations provided a more granular understanding  Dev = {analysis, dev, review, testing, merge}  “I'm nearly done” → “He's in review”  “I'm merging” → “The dev's still got tests to run”  Test Column → Test Board
  • 114. Visualisations provided a more granular understanding  Dev = {analysis, dev, review, testing, merge}  “I'm nearly done” → “He's in review”  “I'm merging” → “The dev's still got tests to run”  Test Column → Test Board
  • 115. Self Management  Continued to improve “established” process  Experiments with pointing  Moves towards pure TDD  New PM → went to Scrumban
  • 118. Consider  If we'd tried Scrum-right Kanban-right  Not so Agile/Lean?  Results as quick?  As Sustainable?  Self-managing?
  • 119. Principles Lean  Eliminate waste  Amplify learning  Decide as late as possible  Deliver as fast as possible  Empower the team  Build integrity in  See the whole Agile Manifesto Individuals and interactions over processes and tools Working software over comprehensive documentation Customer collaboration over contract negotiation Responding to change over following a plan