Craeg Strong presented on applying agile and Kanban methods in a regulated energy company environment. He described a prior Kanban implementation that captured the existing process unchanged and left lead times long, averaging more than 120 days. Incremental improvements included decomposing requirements into user stories and spikes, and applying the Kanban Method through the STATIK approach to analyze the workflow and design an updated kanban board. Ongoing challenges in this environment include the frequent need for spikes to explore complex designs and stakeholders who are unavailable for extended periods. The lessons were that Kanban is well suited to highly constrained development, highlights sources of delay, and helps break down barriers and encourage collaboration.
2. 2
Agenda
Context
A. Prior Work
B. Selected Engagement Details
1. Agile Requirements Decomposition via User Stories
2. Applying the Kanban Method
C. Observations and Discoveries
1. Ongoing Challenges
2. Lessons Learned and Takeaways
3. 3
Craeg Strong
Software Development since 1988
Large Commercial & Government Projects
Agile Coach / DevOps Engineer
Kanban Trainer / SpecFlow Trainer
Performance & Scalability Architect
Certified Ethical Hacker
New York & Washington DC Area
CTO, Ariel Partners
AKT, KMP, CSM, CSP, CSPO,
ITILv3, PMI-ACP, PMP, LeSS, SAFe
www.arielpartners.com
cstrong@arielpartners.com
@ckstrong1
5. 5
Unique Challenges for Energy Sector
Highly Constrained COTS Product
Few Developers With Deep Domain and Product Knowledge
Lengthy Analysis Required
Customers Often Unavailable
Highly Distributed Organization
7. 7
Initial Kanban Implementation
1. Established LeanKit to track the work
2. Process captured as-is, no changes
3. No changes to existing responsibilities or titles
Consultant helped IT teams implement Kanban in early 2017
Before Kanban
SharePoint
Functional Requirements Document
System Design Document
Detailed Design Document
User & Admin Guides
Analyze Design Build Test Deploy
With Kanban
A) No Changes To
Documentation
B) Requirements
Captured on
Cards
C) Kanban board
lanes reflect
process steps
Minimum
Marketable
Feature
8. 8
Minimum Marketable Features
Minimum: As small as possible. If an MMF can be subdivided into multiple MMFs, it should be, even if those MMFs are to be delivered together.
Marketable: Something that could potentially be deployed on its own (used in production, with live data).
Feature: Something that is perceived, of itself, as value by the user.
Minimum Marketable Feature (MMF): The smallest set of functionality that must be realized in order for the customer to perceive value. An MMF is characterized by three attributes: minimum, marketable, and feature.
10. 10
Development Process
Back Office
Demand
Middle Office
Front Office
Governance
Committee
Analysis
Committed & Sequenced Queue
Uncommitted
Arrivals
Backlog
Analysts Developers Testers
Hypercare
Average 120+ Days (!!!)
Approval
Gate
IT
Need to
Do Better
Analysts Play
Both Roles
Other
11. 11
Kanban Foundational Principles
1. Start with What you do Now
2. Agree to Pursue Incremental, Evolutionary
Change
3. (Initially) Respect the Current Process,
Roles, Responsibilities & Titles
Hmmm, we
got stuck on
this one
14. 14
As Is versus To Be
As Is
Delivered
MMF
Lengthy Delays
Lack of Visible Progress
Kanban Board not adding value
Difficult to Estimate
Difficult to Subdivide
Identified
Minimum
Marketable
Feature
User
Story
User
Story
Spike
Spike
Regular Visible Progress
Kanban Board tells a story
Small, estimable chunks
Independently valuable stories
Can assign work to multiple teams
To Be
Delivered
MMF
User
Story
Idea or
Request
Identified
Minimum
Marketable
Feature
15. 15
Decomposing MMF into User Stories…
MMF versus User Story
Relationship: MMF = Parent; User Story = Child
Effort: MMF = 2 weeks to 4 months of development; User Story = 2 days to 2 weeks
Analysis: MMF = Requirements Document; User Story = As a/I want/So that, possibly a UX mockup
Testing: MMF = Test Plan (up to dozens of tests); User Story = 2-12 Acceptance Criteria
Value to Business: MMF = Immediately recognized as independently valuable; User Story = Has value to the business, but the value may be best realized when combined with other user stories split out from the MMF
16. 16
…and Spikes
User Story versus Spike
Goal: User Story = Realize business value; Spike = Reduce risk by answering a question or gathering information
Effort: User Story = 2 days to 2 weeks; Spike = 2 days to 2 weeks
Completion: User Story = Acceptance Criteria fulfilled; Spike = Hypothesis is proven true or false
Next Steps: User Story = Testing, Documentation, Deployment; Spike = Discarded or recycled
17. 17
Properly Formed User Stories
Splitting User Stories
1. Performance (ways to add caching or improved performance)
2. Simple to complex
3. Interface variations: Excel, grid, graphical, mobile
4. Variations in data: geographic region, or by refinery, or by customer
5. Operations: splitting out CRUD
6. Business rule variations
7. Workflow steps: this then that
8. Breakout a spike
INVEST
1. Independent
2. Negotiable
3. Valuable
4. Estimable
5. Small
6. Testable
User Story Components
1. “As a…” Identify the Stakeholder. More than one? Split!
2. “I want…” Identify the function the way the stakeholder would
3. “So that…” Identify the business value. What is the Stakeholder’s goal?
4. Acceptance Criteria
1. Assumptions
2. Invariants
3. Pre- and Post-Conditions
18. 18
Example #1: EBITDA
Story Compute Operating Income component of EBITDA by company
Story Compute Income Tax component of EBITDA by company
Story Compute Depreciation component of EBITDA by company
Story Compute Interest component of EBITDA by company
Story Compute Amortization component of EBITDA by company
Story Compute EBITDA by company combining component calculations
Story Control Access to EBITDA calculations by company
Story Reconciliation report to compare EBITDA in AO report to BPC P&L statement
Story Schedule report to enable reconciliation during close
MMF: EBITDA Calculations by Company – for Accounting and Reporting Team
Description: Earnings before interest, taxes, depreciation, and amortization
19. 19
Example #2: Trader Checkout Process
Story Basic read-only details view for trader for formula tablet (Similar to existing report, but omits trades already approved or rejected, activated or pending. Most trades have formula tablets.)
Story Trader can approve or reject deal details
Story Trader can only approve/reject their own trades
Story Read-only form for middle office to review rejected details
Story Force modified deals to be re-approved
Story Update historical deal details so I don’t have to approve them
Spike Determine how fixed price tablets will be displayed (Small percentage of trades have fixed price tablets.)
Story Support fixed price tablets in TCM view
MMF: New Trader Checkout Process (TCM) For Audit
Description: To prevent booking errors, have traders explicitly approve or reject trades on a daily basis before they are activated
20. B2. APPLYING THE KANBAN METHOD*
*STATIK: SYSTEMS THINKING APPROACH TO IMPLEMENTING KANBAN
21. 21
1. Leading Questions
2. Motivations for Change
3. Analyze Sources and Nature of Demand
4. Identify and Define Classes of Service
5. Analyze Delivery Capability
6. Model Delivery Workflow
7. Kanban System Design (Lane Policies, WIP Limits, Replenishment
Cycles, etc.)
Applying the Kanban Method
22. 22
Motivations For Change
Internal Dissatisfaction
Work beginning on MMF but card is stuck “waiting for approval”
Significant time spent on documentation (Reqs doc, Func spec, Tech spec)
Not much day-by-day movement on the board
Large projects do not appear on the board
No visibility Into Work Being done for MMF
Difficult to tell what is blocked or high priority
Coordination with Other Groups (e.g. Help Desk) not reflected on board
How Can I tell what has moved since last week?
External Dissatisfaction
Too slow / Long lead time
Unpredictable delivery
Quality issues
23. 23
Updated LeanKit Board Design
Previous Flow: Backlog, Approval, Reqs Analysis, Func Design, Build / Unit Test, System / User Test, Deploy, Hypercare
Updated Flow (MMFs): Backlog, On Deck, Reqs Analysis, Design / Build, System Test, User Test, Deploy, Hypercare
Updated Flow (User Stories & Spikes): Specify, Ready, In Progress, Func Test
Updated Flow (Hypercare Defects): Identify, In Progress, Test, Deploy
25. 25
On Deck Lanes: Two MMFs Per Stakeholder Group
All Five Stakeholder
Groups Can Indicate
their Top Two
Priorities
Top Two Can Change
Anytime Prior To
Commitment
26. 26
Parallel Parent[MMF] and Child[User Story] Work
Child Stories Created
When MMF Analysis
Begins
“Ready” User Stories
Can Be Completed
Anytime
Additional User
Stories Created As
Needed During
MMF Analysis &
Design
Spikes Created As
Needed
Finish-to-Finish Dependency (MMF to Child Story)
27. 27
Hypercare Defects Captured
External Defects
Appear Directly
Under Their Parent
MMF
Separate Card Type
(Hypercare vs
Internal Defect)
Makes Querying
Easy
28. 28
Improving Functional / Technical Team
Collaboration
Functional
Team
Technical
team
Handoff
From This…
Time
Level
Of
Effort
MMF Reqs
Analysis
MMF Design
& Build
… To This!
Functional Technical
Level
Of
Effort
Time
Story
Identification
Story
In Progress
MMF
Design
MMF
Reqs Analysis
Explicit
Entry/Exit
Policies
Explicit
Entry/Exit
Policies
Explicit
Entry/Exit
Policies
30. 30
Lessons Learned for Highly Regulated / COTS / Energy
Challenge: Highly Complex Commercial Software Package
Response: 1. Spikes frequently needed to explore different designs. 2. Estimation process helps team identify areas of significant uncertainty and risk (Extra-Large “T” Shirt)
Challenge: Older Development Platform, Largely Manual Processes
Response: Development takes longer; user stories must be split more fine-grained
Challenge: Highly Specialized Team Members Not Fungible
Response: Benefits from “swarming” and collaboration are limited, often leading to increased Work In Progress
Challenge: Stakeholders Unavailable for Extended Periods of Time
Response: Identify backup stakeholders; more up-front estimation and planning required to avoid blackout times
31. 31
Takeaways
Kanban Well-Suited for “Highly Constrained” SW Development
Low Barrier to Entry
Transparency Promotes Trust
Highlights Sources of Delay
Conducive to Incremental Change
Compatible with SDLC methods (Waterfall, Scrum, SAFe, etc.)
Kanban Method and STATIK
Simple and Powerful Techniques
Helps Break down Barriers and Encourage Collaboration
Limiting the amount of Work in Process reduces Overburdening
Improvement Workshops can be repeated as needed
Good afternoon! My name is Craeg Strong and today I will be talking about using Kanban for software development in the energy sector, which is a highly regulated and complex environment.
You may be asking yourself how Kanban could be used for software development– isn’t it just for infrastructure support?
What I hope to show you is that not only is it great for software development, it is particularly helpful in highly constrained environments with complex domains and significant compliance and regulatory issues.
I will start out providing some context
Then I will describe the situation when we arrived based on some prior work that was done.
I will go into just a few of the improvements we made along the way
And finish up by describing our lessons learned and where we will go from here
I am a software developer and devops engineer, but more importantly I am a certified Kanban trainer and I coach teams on test automation and agile methods
My company is in NY and DC, and we work for a variety of commercial and government customers
the logistics of extracting, storing, transporting, and trading petroleum products are very complex
Energy trades involve multiple components and counterparties and millions of dollars
Our business stakeholders that we get our requirements from are typically located at the refineries.
The only problem is that refineries get partially or completely shut down on a regular basis in order to implement improvements, or for regulatory or compliance reasons.
During those periods, called turnarounds, it is basically impossible to get requirements or feedback from our customers.
Of course, you try to plan around that, but sometimes there is very little advance warning.
There are a small number of big complex commercial software packages that are used by basically everyone in the energy sector.
Because they are so complex, and the domain itself is so hard to grasp, there are just not going to be that many people on the team that are masters of both
This means that complex analysis is required, and many questions need to be asked– but remember that our customers may disappear without warning due to a turnaround.
To make matters worse, the refineries are scattered all over the US, and the company has offices worldwide.
Now that we have set the stage, I’ll describe the situation as we found it when we started back in May of this year.
A consultant had set up an initial kanban board for the company back in early 2017, using the LeanKit software tool
The intent was to capture the existing process exactly as it was, with no changes.
Historically, the work was captured in a set of documents stored in SharePoint. Those documents probably look familiar to many of you: the functional requirements document, the design documents, and so on.
For the new process, the documents were maintained exactly the same way. The only new concept that was introduced was the minimum marketable feature or MMF
I’ll describe what an MMF is in a moment.
By representing each requirement as an MMF card, they were able to capture all of the in-flight requirements together on a kanban board.
So what exactly is an MMF? Well, it has three components.
First– it is as small as possible– which means if it could be subdivided, it would be.
Second, it is marketable, which means it can be deployed on its own and used in production
Third, it is something that is directly perceived by the business as valuable
So here is the original implementation in LeanKit.
Because the lanes are hard to read, I copied them down below
So, taking a look at those, can anyone tell me what SDLC process that resembles?
Let me give you a hint.
Yes, this is waterfall! We are using a kanban board to represent an unchanged waterfall process. The one benefit we really have here is visibility.
Here is how it all worked.
We had five different stakeholder groups. Those were the sources of our demand.
Requirements coming from them are approved by a governance committee, and then go into the team’s backlog.
When an item is removed from the backlog and worked, it starts out with the analysts, then it’s handed to the developers, then the testers, and finally it gets to hypercare.
Hypercare is a two to four week period where the item is in production, but support has not yet been handed off to the operations team.
Sounds good, so what is the problem here? It is cycle time! It takes the team an average of more than 4 months to get an item from backlog to hypercare.
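To make that number concrete, here is a minimal sketch of how an average like this could be computed from card dates. The card names and dates are invented for illustration, and the function names are hypothetical; this is not how LeanKit reports the figure.

```python
from datetime import date
from statistics import mean

# Hypothetical cards: (name, date pulled from backlog, date it reached hypercare).
# These dates are invented for illustration; they are not data from the engagement.
cards = [
    ("MMF-A", date(2017, 1, 9), date(2017, 5, 15)),
    ("MMF-B", date(2017, 2, 1), date(2017, 6, 1)),
    ("MMF-C", date(2017, 3, 6), date(2017, 6, 26)),
]

def elapsed_days(started: date, finished: date) -> int:
    """Elapsed calendar days between two board events."""
    return (finished - started).days

def average_elapsed_days(cards) -> float:
    """Mean elapsed days from backlog to hypercare across all cards."""
    return mean(elapsed_days(start, end) for _, start, end in cards)

print(f"Average: {average_elapsed_days(cards):.0f} days")
```

Tracking this one number over time is what lets the team see whether a change to the board actually shortened delivery.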
So how can Kanban help? Let’s look at the kanban foundational principles.
We definitely started with what we do now. We absolutely respected the current process, roles, responsibilities, and titles, but we seem to be missing something.
We got stuck and did not make any changes!
So that is what we set out to do. Since then we have made a number of incremental improvements, and I’d like to highlight two of them
First, I provided coaching regarding how to decompose requirements
So why do we need to decompose requirements? Why aren’t the MMFs good enough?
The problem is that most of them were quite large. Because of this, they would sit in a lane for weeks or months.
Day by day, the board rarely changed, which unfortunately led to most people simply ignoring it.
The analysts would ask the developers for estimates, which sometimes could be off by an order of magnitude or more.
After it was analyzed, the MMF would basically go into a black hole and would not be heard from for weeks or months until it was somehow delivered.
So I introduced the concept of user stories. There is nothing magic about user stories, and they are not mandated by either kanban or scrum.
However, they are a well-known and well-documented technique, so why reinvent the wheel?
Not only user stories but also spikes. So how do MMFs, user stories, and spikes relate to each other?
The answer is pretty simple. The MMF would be decomposed into one or more child user stories.
We followed a very standard definition of user stories and expected them to take between two days and two weeks.
Although each user story provides value, we would still wait until all child user stories were completed before delivering the MMF to the business.
The team asked– what if we don’t have all the information we need to create a user story up front? How can we move forward?
So we also introduced the notion of a spike, popularized by Extreme Programming back in 1999.
We defined a spike as some work you perform to reduce risk by answering a question or gathering information.
So– for example if a solution could take two weeks or two months depending on whether a suitable API already exists or not, that question needs to be answered.
A spike could be done by an analyst or a developer, or both working together. It could involve in depth analysis of a business problem, or looking at the commercial software or some other tool or technique.
So as soon as we have our answer, we document it as the outcome of our spike and move forward.
In any event a spike is time boxed to take no longer than two weeks.
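As a rough sketch, the three card types and their effort ranges could be written down as follows. The ranges come from the slides; the conversion to working days (5 per week, 20 per month) and the names `CardType`, `EFFORT_RANGE_DAYS`, and `exceeds_range` are assumptions made for illustration.

```python
from enum import Enum

class CardType(Enum):
    MMF = "MMF"
    USER_STORY = "User Story"
    SPIKE = "Spike"

# (lower, upper) effort bounds in working days.
# Assumption: 5 working days per week and 20 per month.
EFFORT_RANGE_DAYS = {
    CardType.MMF: (10, 80),        # 2 weeks to 4 months
    CardType.USER_STORY: (2, 10),  # 2 days to 2 weeks
    CardType.SPIKE: (2, 10),       # time boxed to no more than 2 weeks
}

def exceeds_range(card_type: CardType, estimate_days: int) -> bool:
    """True when an estimate is above the upper bound for its card type."""
    _, upper = EFFORT_RANGE_DAYS[card_type]
    return estimate_days > upper

# A story estimated at three weeks is a candidate for splitting.
print(exceeds_range(CardType.USER_STORY, 15))
```

A story that exceeds its range is a signal to split it further or to break out a spike.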
Now that we had our taxonomy, it was time to learn how to write user stories.
We structured them in the standard way, and learned about characteristics of a good user story.
However, by far the biggest challenge was splitting them to make them very fine grained.
There is a wonderful online resource at the agileForAll website that illustrates a bunch of ways to split a story, and I listed some of them here.
So the teams stopped me at this point and said– this is all well and good, but that stuff just won’t work here.
We can’t possibly split our MMFs to be any finer grained. That only works for those web designers.
The subject matter is too complex, and the commercial software package we are using just doesn’t lend itself to building something up piece by piece.
So there is a quote often misattributed to Mark Twain that goes like this: “It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.”
Let me illustrate with two quick examples
Let’s take EBITDA: earnings before interest taxes depreciation and amortization.
Is there anything there that suggests splitting?
Indeed. When we have a calculation that involves sub-calculations, we can split each of those out into separate stories. No sweat!
For another example, we have the trader checkout process.
It turns out that traders were not double-checking all the details for their trades, which led to some incorrect or incomplete trades needing to get backed out. Yikes!
So we need to introduce a double check process where traders can review the details and approve or reject them individually.
First of all, it turns out that trades are priced out in two different ways: fixed price and via formulas. Formulas were easy and they represent 90% of all trades. Aha- let’s do those first!
After that it is a matter of breaking down the business process into steps, and implementing them one at a time.
First we have the ability to approve or reject, then restrict that to only my own trades. Question– what happens if a detail gets rejected?
The middle office needs to review it and decide what to do. Voila, another user story. And so on.
We also applied the kanban method via STATIK
STATIK is the systems thinking approach to implementing kanban. It provides a simple recipe of useful techniques that can be applied to any workflow process to improve it.
There are 6 or 7 basic steps, depending on how you count them.
We start out understanding the sources of dissatisfaction- you have to understand the problem first!
We analyze where our demand comes from, and what it looks like.
When do our requirements need to be expedited?
Which ones have a fixed deadline, like April 15 for taxes? We call those different classes of service.
Then we need to understand our delivery capability.
Then we can model our workflow. Rather than just looking at development we want to look all the way from early conception thru operations where the value is really felt– or not.
Lastly we can apply policies, limit our work in progress, and decide how often to replenish our backlog.
Most of the motivations for change I have already described.
However, we found that many activities were not reflected on the board at all.
Also we found that significant time was spent waiting for approvals for documents.
All of this led to long cycle times, often lasting more than 4 months.
I will skip ahead to the workflow design.
Here is the previous flow. We wanted to make some relatively modest changes that would yield some big benefits.
First of all, we eliminated the approval step. We realized that the MMFs were already approved by governance by the time we got them.
The documents didn’t really need to be “re-approved” but merely accepted. Most of the time, as long as the document was accepted before the MMF went to production we were fine.
No need to wait!
Secondly, we introduced a parallel flow for user stories and spikes, so we could work on those while the MMF was being analyzed.
Lastly, we observed that defects that occur while an MMF is in hypercare get a very different workflow than the ones that happen in development.
We want to make sure we make production defects highly visible– and how better to do that than to put them right next to the MMF hypercare lanes.
I will first show you the overall board, and then go into each of these in a little more detail.
Here is the updated board– you can see an “On Deck” section here, and a parallel swim lane for user stories and spikes.
We wanted to make it easy for the team to grab the next most important card. There are five stakeholder groups: front office, middle office, back office, technical, and other.
So each stakeholder group gets an “On Deck” area where they can put their two highest priority cards.
Notice how the lane is exactly two cards wide.
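The “On Deck” rule can be sketched in a few lines. The limit of two cards per stakeholder group comes from the slide; the `OnDeck` class and its method names are hypothetical and only illustrate the policy, not how the board tool enforces it.

```python
ON_DECK_LIMIT = 2  # from the slide: two MMFs per stakeholder group

STAKEHOLDER_GROUPS = ["Front Office", "Middle Office", "Back Office", "Technical", "Other"]

class OnDeck:
    """Holds each stakeholder group's top priorities, at most two per group."""

    def __init__(self):
        self.lanes = {group: [] for group in STAKEHOLDER_GROUPS}

    def add(self, group: str, card: str) -> None:
        lane = self.lanes[group]
        if len(lane) >= ON_DECK_LIMIT:
            raise ValueError(f"{group} already has {ON_DECK_LIMIT} cards on deck")
        lane.append(card)

    def swap(self, group: str, old_card: str, new_card: str) -> None:
        """Priorities may change at any time before commitment."""
        lane = self.lanes[group]
        lane[lane.index(old_card)] = new_card

deck = OnDeck()
deck.add("Front Office", "MMF-1")
deck.add("Front Office", "MMF-2")
deck.swap("Front Office", "MMF-2", "MMF-7")
print(deck.lanes["Front Office"])
```

Because the lane cannot hold a third card, a group has to decide which of its priorities to displace, which is the conversation the limit is meant to trigger.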
We created a parallel swim lane where user stories and spikes would be worked on.
The team explained that MMFs could take a long time to analyze and design, but that there were usually some aspects that were fairly well understood up front.
So by creating a parallel structure, we enabled the team to “peel off” spikes and stories as soon as they were understood.
The MMF was finished as soon as the analysts were done, and all the user stories were completed-- a finish-to-finish dependency.
This way maximizes the ability to work in parallel– and makes it clear visually that work is truly happening on BOTH the parent and child cards at the same time.
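The finish-to-finish rule can be expressed as a small check. This is a sketch under assumptions: the class and field names are hypothetical, and requiring at least one child card reflects the earlier statement that an MMF is decomposed into one or more child stories.

```python
from dataclasses import dataclass, field

@dataclass
class ChildCard:
    """A user story or spike peeled off from a parent MMF."""
    title: str
    done: bool = False

@dataclass
class MMF:
    title: str
    analysis_done: bool = False
    children: list = field(default_factory=list)

    def is_finished(self) -> bool:
        # Finish-to-finish: the parent finishes only when its own analysis
        # is done AND every child card is done.
        return (
            self.analysis_done
            and bool(self.children)
            and all(child.done for child in self.children)
        )

mmf = MMF("Trader Checkout", children=[ChildCard("Approve or reject deal details")])
mmf.analysis_done = True
print(mmf.is_finished())  # child still open, so the parent is not finished
```

Note that child cards can be marked done in any order and at any time; only the parent's completion waits on them.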
At the right hand side of the board, we have hypercare– when the MMF makes its way to production but prior to handoff to the operations team.
We put a parallel swim lane just for production defects right under the hypercare lanes, so it is immediately obvious when a hypercare MMF has a bug.
If it’s a high priority bug, the MMF will get a big red stop sign meaning it is blocked, while the defect is worked on and makes its way through change control.
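A minimal sketch of that blocking rule follows. The names are hypothetical, and treating the MMF as unblocked once every high priority defect is resolved is an assumption; the talk only says the MMF is blocked while the defect is worked on.

```python
from dataclasses import dataclass, field

@dataclass
class HypercareDefect:
    title: str
    high_priority: bool
    resolved: bool = False

@dataclass
class HypercareMMF:
    title: str
    defects: list = field(default_factory=list)

    @property
    def blocked(self) -> bool:
        # Assumption: only unresolved high priority defects block the parent.
        return any(d.high_priority and not d.resolved for d in self.defects)

mmf = HypercareMMF("EBITDA by Company")
mmf.defects.append(HypercareDefect("Wrong totals for one company", high_priority=True))
print(mmf.blocked)
```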
We knew we needed to improve team collaboration.
Previously, the functional team would work on their requirements and then just lob them over to the technical team to implement.
By creating this parallel structure, we encouraged the developers and analysts to work together continuously, while recognizing that the analysts may have a bit more work to do at the beginning, and the developers may have a bit more work to do at the end.
We encouraged this visually via the parallel swimlanes, but also by having the team craft explicit entry and exit policies that describe the responsibilities of each team for each step of the way.
How many of you are familiar with Scrum Definition of Done, and definition of ready? They are basically entry and exit policies for work in the system.
Kanban takes that concept and generalizes it– so we capture policies for every single step in the workflow.
All team members can review the policies at any time right from the tool. It’s a powerful way to encourage good collaboration.
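One way to picture explicit policies is as a checklist attached to each lane. The lane names below come from the updated board; the policy wording is invented for illustration, since the talk does not list the policies the team actually wrote, and `unmet` and `can_move` are hypothetical helpers.

```python
# Hypothetical policies per lane; the real wording is not given in the talk.
POLICIES = {
    "Reqs Analysis": {
        "entry": ["approved by governance"],
        "exit": ["acceptance criteria written"],
    },
    "Design / Build": {
        "entry": ["acceptance criteria written"],
        "exit": ["unit tests pass"],
    },
}

def unmet(lane: str, kind: str, satisfied: set) -> list:
    """Policies of the given kind ('entry' or 'exit') not yet satisfied."""
    return [policy for policy in POLICIES[lane][kind] if policy not in satisfied]

def can_move(from_lane: str, to_lane: str, satisfied: set) -> bool:
    """A card moves only when it meets the exit policies of the lane it
    leaves and the entry policies of the lane it enters."""
    return not unmet(from_lane, "exit", satisfied) and not unmet(to_lane, "entry", satisfied)

card_state = {"approved by governance"}
print(can_move("Reqs Analysis", "Design / Build", card_state))
print(unmet("Reqs Analysis", "exit", card_state))
```

Listing the unmet policies, rather than just refusing the move, tells both teams exactly what is still owed at the handoff.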
Based on the experience of exploring and deepening kanban at this company, there are some interesting lessons learned
What is different about using agile methods on a project where you have a complex commercial software package? A couple of things. We tend to see more spikes. Also, even though estimates are not normally part of kanban, the team estimates T-shirt sizes for user stories, primarily to make sure they are granular enough.
The SW platform is older and there are a lot of manual processes. This means development takes longer. What would take 2 days for a modern web application could take 2 weeks. This means we have to split our user stories very finely to get a good level of throughput.
With such a complex business domain, many of the team members are specialists in a very specific area. The result is that we have to constantly guard against increases in work in progress: people are tempted to take on new work rather than help others finish their work because “that is not my area.”
The stakeholders who just gave us a high priority requirement may be in a turnaround and might suddenly not be available. Although we try to get backups, this also means we need to plan further in advance to help avoid these blackout times.
So what are some key takeaways?
First– kanban is well suited for highly constrained software development – legacy-heavy development, or with burdensome compliance and governance processes.
Easy to get started
High degree of transparency
Conducive to incremental change
And you can start with waterfall, scrum SAFe or whatever.
The kanban method gives you a set of simple and powerful techniques that help break down cultural barriers and encourage collaboration,
reducing the amount of work in process to relieve overburdening.
And you can repeat these improvement workshops as needed
We are enjoying our time working with the energy company and are looking forward to finding new ways to improve.
Thank you very much! I think we have about 15 minutes for questions.