These are the slides for my initial dissertation proposal at CMU, which covered the original eMoose project and the idea of combining objective and subjective knowledge.
Because the scope of this work was too large, I ended up focusing my dissertation on the contextual side and abandoning this line of work.
Designing a Prosthetic Memory for Software Developers
1. Designing a Prosthetic Memory for Software Developers
PhD Thesis Proposal
In Software Engineering
Uri Dekel
22 January 2008
2. Preview
• Settings: Software implementation and
maintenance work in the IDE
• Problem: Managing, using, and sharing
knowledge about artifacts AND activities
Reduce errors, disorientation, omissions
• Proposed approach: Memory aid in IDE
Presents “journal” of development episode
Interleaves “Objective” and “Subjective”
Knowledge
Strength in objective-subjective synergy
Offers unified management of knowledge on
artifacts and activities
3. Preview
• Thesis statement:
“A prosthetic memory for software
development…
…inspired by human episodic memory…
…can successfully augment the organic
memories of developers…
…and help them preserve, manage, and share
knowledge”
4. Outline
• Knowledge of activities
Problems, techniques, proposed support
• Knowledge of artifacts
• Traceability of decisions
• Thesis and contributions
• Completed work
• Evaluation plans
• Risks (Backup in response to questions)
6. Running Example
• Wally works on multiplayer cellphone chess
• Wally plans game loop
Each side maintains copy of board
Client gets move from UI, sends to server and
waits for opponent response
Server always relays moves to other side
Client receiving opponent move updates board,
waits for UI
• Subtasks for implementing game loop
Communication infrastructure
Encode moves
Handle end of game
7. Running Example:
Disorientation
• Created Model classes and UI
• Create ChessClient class
Implement onUIMove
Add board update call
• Switch to Board class
Implement board update
Add Communicator calls
• Switch to Move class
Implement move encoding
Implement move decoding
• Create Communicator class
Implement all four methods
Investigate networking library docs and samples
NEED: Don’t wait if end-of-game from own move
NEED: Implement end-of-game check on Board class
NEED: Implement onOpponentMove
NEED: Implement server
NEED: Handle end-of-game
Activities Artifacts Traceability Thesis Completed Plan Risks
8. Running Example:
Disorientation
• Need to backtrack to add check for
end-of-game, then move on to the server
• Potential for navigation disorientation
[de Alwis ‘06]
Too many open windows
Location stack in Eclipse limited
DOI models not specific enough
• Wally closes all windows to “start fresh”
• Potential for task disorientation
What was I doing and what did I accomplish?
What remains to be done?
9. Running Example:
Omissions
• Wally remembers he needs to implement
the server and the client-receiving code
Begins work on server
• Omission of check for end-of-game before
waiting for opponent response
System will get stuck if user’s own move ends
game
Problem will likely surface while testing
something else, making it difficult to debug
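The omission can be made concrete with a minimal sketch of the client's move handler. Every name here (`Board`, `FakeCommunicator`, `on_ui_move`, `is_game_over`) is a hypothetical stand-in rather than code from the project, and the toy board simply ends the game after a fixed number of moves.

```python
class Board:
    """Toy board: the game ends after a fixed number of moves (assumption)."""

    def __init__(self, moves_until_end):
        self.remaining = moves_until_end

    def apply(self, move):
        self.remaining -= 1

    def is_game_over(self):
        return self.remaining <= 0


class FakeCommunicator:
    """Stand-in for the network layer; records calls instead of blocking."""

    def __init__(self):
        self.sent = []
        self.waits = 0

    def send(self, move):
        self.sent.append(move)

    def wait_for_opponent(self):
        self.waits += 1
        return "opponent-move"


def on_ui_move(board, comm, move):
    """Handle a local move; return the opponent's reply, or None if the game ended."""
    board.apply(move)
    comm.send(move)
    # The check that is easy to forget: without it the client waits for a
    # reply that the opponent will never send once the game is over.
    if board.is_game_over():
        return None
    reply = comm.wait_for_opponent()
    board.apply(reply)
    return reply
```

With the check in place, a move that ends the game returns immediately instead of blocking on the communicator.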
10. Keeping Track Of Reminders
• Wally could have used bug DB to
record tasks for reminders
Appropriate for high-level tasks
Self-contained, well defined, preplanned
Less appropriate for personal subtasks
Transient, not well defined, ad hoc
Expensive to use during development
Unwanted public archival [Storey ‘08]
Does not ensure timely awareness
11. Keeping Track of Reminders
• Wally could have used to-do
comments for reminders [Storey ‘08]
Cheap and easy to produce
Proactive investment in phrasing
Potential for placement difficulties
Clutter the code and not private if
committed
Code must be inspected to reveal to-do
No clear order of completion
12. Proposed approach
• We propose building a prosthetic
memory for software development
“Device or software that supplements
human memory, to store, maintain, and
access copies of relevant information”
Many existing prostheses focus on
physical world
Others record knowledge without context
13. Proposed Approach:
Journal of Objective Activities
• A “journal”/“memory” of development work
• Formed around “objective observations”
Activities with external manifestations
Obvious to layperson without interpretation
Yes: Added text X to method Y
No: Made method Y robust against null parameters
Aggregation/abstraction of low level events
e.g., Edited method Y from S1 to S2 between T1 and T2
Automated capture via instrumentation/monitors
Video recording not structured/searchable
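The aggregation step above could be sketched as follows. This is only an illustration under stated assumptions: the `EditEvent` and `Observation` types, the `aggregate` function, and the 30-second gap threshold are all hypothetical and not taken from the prototype.

```python
from dataclasses import dataclass


@dataclass
class EditEvent:
    method: str
    time: int      # seconds on an arbitrary clock
    before: str    # method state before the edit
    after: str     # method state after the edit


@dataclass
class Observation:
    method: str
    start: int
    end: int
    before: str
    after: str


def aggregate(events, max_gap=30):
    """Merge consecutive edits to the same method that are at most max_gap apart."""
    observations = []
    for ev in sorted(events, key=lambda e: e.time):
        last = observations[-1] if observations else None
        if last and last.method == ev.method and ev.time - last.end <= max_gap:
            # Extend the current observation: same method, close in time.
            last.end = ev.time
            last.after = ev.after
        else:
            observations.append(
                Observation(ev.method, ev.time, ev.time, ev.before, ev.after))
    return observations
```

Two quick edits to method Y collapse into a single "edited Y from S0 to S2 between T1 and T2" entry, while a later edit after a long pause starts a new one.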
14. Proposed Approach:
Journal of Objective Activities
• Chronological view
Peripherally shows most recent
Maintain orientation
Use to find artifacts
• Difficult to understand and visually search
15. Proposed approach:
Add Subjective “reminders”
• Developer can provide
“subjective” observations
• Interleaved with objective
• Objective provides context
to subjective
Can infer details from
adjacent objective
Makes subjective cheaper to
produce
• Proposition: reduced costs
lead to more knowledge
preservation
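One way to picture how objective entries give context to subjective ones is the sketch below. The `Journal` and `Entry` names are hypothetical, and the rule "inherit the artifact of the nearest preceding objective entry" is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    kind: str                 # "objective" or "subjective"
    text: str
    artifact: Optional[str]   # e.g. "ChessClient.onUIMove"


class Journal:
    def __init__(self):
        self.entries = []

    def record_objective(self, text, artifact):
        """Store an automatically captured event together with its artifact."""
        self.entries.append(Entry("objective", text, artifact))

    def record_subjective(self, text):
        """Store a developer note; its artifact is inferred, not typed in."""
        artifact = next(
            (e.artifact for e in reversed(self.entries) if e.kind == "objective"),
            None,
        )
        entry = Entry("subjective", text, artifact)
        self.entries.append(entry)
        return entry
```

Because the artifact is inferred, the developer only types the short note itself, which is where the claimed cost reduction would come from.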
16. Proposed approach:
Add Activity Announcements
• Visual partition if developer declares when
starting or switching tasks or subtasks
17. Proposed approach:
Backtracking
• Backtracking simplified by removing the top (most recent) activity
• Can further filter for leftover reminders
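Backtracking over announced tasks behaves much like popping a stack. The sketch below assumes that reading; `ActivityStack` and its methods are hypothetical names, not part of the tool.

```python
class ActivityStack:
    """Tracks nested tasks and the reminders recorded while each was active."""

    def __init__(self):
        self._stack = []  # each item: (task name, {reminder text: done?})

    def start(self, task):
        """Announce a new task or subtask; it becomes the top of the stack."""
        self._stack.append((task, {}))

    def remind(self, text):
        """Attach a reminder to the task currently on top."""
        self._stack[-1][1][text] = False

    def resolve(self, text):
        """Mark a reminder on the current task as handled."""
        self._stack[-1][1][text] = True

    def backtrack(self):
        """Remove the top task; return it with its unresolved reminders."""
        task, reminders = self._stack.pop()
        leftovers = [text for text, done in reminders.items() if not done]
        return task, leftovers

    def current(self):
        """Name of the task now on top, or None if nothing is active."""
        return self._stack[-1][0] if self._stack else None
```

Popping the top task returns the developer to the enclosing task and surfaces any reminders that were left open along the way.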
18. Proposed approach:
Context View
• Memory view shows limited portion
Hides old reminders relevant to current context
• Separate “context view” shows all relevant
subjective observations
For current resource and method
For other visible method definitions
Potentially related methods (e.g., invoked)
• Allows caller to know of remaining to-dos
in callee
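The selection rule for the context view might look like the following sketch. It assumes observations are stored as (method, text) pairs and that a call graph is available as a mapping from a method to the methods it invokes; both representations and the name `context_view` are hypothetical.

```python
def context_view(observations, current_method, call_graph):
    """Return observations for the current method, then for the methods it invokes."""
    related = [current_method] + sorted(call_graph.get(current_method, ()))
    return [
        (method, text)
        for method in related
        for (owner, text) in observations
        if owner == method
    ]


# Toy data for illustration only.
observations = [
    ("Board.update", "TODO: end-of-game check"),
    ("Move.encode", "special moves not handled"),
    ("ChessClient.onUIMove", "don't wait if own move ends game"),
]
call_graph = {"ChessClient.onUIMove": {"Board.update"}}
view = context_view(observations, "ChessClient.onUIMove", call_graph)
```

Here the caller sees the open to-do on `Board.update` even though it was recorded while working in a different method.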
20. Proposition 1
Providing discrete categorized
subjective observations about their
actions… (What we saw so far)
…will help developers avoid
disorientation and omissions…
…and improve their peers’ awareness
of their actions
22. Knowledge About Artifacts
• Developers generate important knowledge
while creating artifacts
Some manifested in the artifact
e.g., external interfaces, obvious implementations
Some only in developer’s mind
Usage protocols
Assumptions and limitations
Differences from expected implementation
Underlying rationale
• Potential for costly errors
e.g., Not complying with usage protocol
23. Running Example:
Need for artifact knowledge
• Asuk has to write the server
• Examines the client sending code
• Wants to use Wally’s Communicator
• What is the usage protocol?
e.g., Has to be initialized somewhere?
e.g., Can the same communicator handle
multiple concurrent connections?
• Without additional information, will
have to study source code
25. Running Example:
Need for artifact knowledge
[Figure: excerpt from CommunicatorImpl]
26. Documentation Types
Aspect | External | Internal
Amount | Elaborate | Limited
Structure | Elaborate | Short text
Multiple artifacts | Yes (including virtual) | Single construct/block
Artifact connection | Explicit (with name or content replication) | Visual to following construct
Presentation | Separate from artifacts | In artifact, may clutter
Privacy | Can be controlled | Public once checked-in
Awareness | Only when browsing docs | Only when reading source
28. Documentation Feasibility
• Optimally: comprehensive up-to-date
readable documentation
• But: Costs are high
Ensuring grammatical/semantic clarity
Making elements self-contained
Delays from concurrent documentation
• And: No guaranteed returns per element
• Result: Limited documentation
Rationale and intent often lost
No exhaustive listing of assumptions/limitations
29. Proposed Approach:
Observations on Artifacts
• A compromise between internal and
external documentation
• User provides discrete subjective
observations instead of comments
• Reduced costs
Context and selection automatically assigned
Can be provided while coding (using voice)
Short with no proactive editing
Allows use of deictic references
• Reduction in clutter and privacy issues
• Presented in related contexts (e.g., callers)
30. Proposed Approach:
Observations on Artifacts
[Figure: creating observations]
32. Proposed Approach:
Observations on Artifacts
[Figure: context view]
33. Proposition 2
Providing discrete categorized
subjective observations about
artifacts… (What we just saw)
…will help developers track and share
knowledge and thus avoid errors
35. Decision Traceability
• Occasional need to understand
“how and why” of current artifact state
Infrequent and after-the-fact but critical
• Requires knowledge about:
Development process
Knowledge from developer’s mind
Passively absorbed information
36. Running Example
• Asuk is trying to understand Wally’s
implementation of Communicator
What is the purpose of Authorization?
Why is the request queue initialized here?
37. Proposed Approach:
Traceability
• Asuk selects authorization line and queries for all
relevant events
One hit: Created and not moved or edited
• Asuk asks for all observations from that time
Reveals sequence of web browsing
Android APIs, various examples
Eclipse activity just before implementing method
Opened sample Twitter client in Eclipse
Copied section from client, pasted in method
Re-edited, then copied additional constructs, including queue
• Asuk realizes that Wally blindly copied example
Uses referenced materials to learn
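The two queries in this scenario can be sketched over a flat list of journal entries. The dictionary layout, the function names, the ten-minute window, and the toy data (including the artifact name `Communicator.authorize`) are all assumptions made for illustration.

```python
def events_for(journal, artifact):
    """All journal entries that touched the given artifact, oldest first."""
    return sorted((e for e in journal if e["artifact"] == artifact),
                  key=lambda e: e["time"])


def observations_around(journal, time, before=600, after=0):
    """Everything recorded in a window that (by default) ends at `time`."""
    return sorted(
        (e for e in journal if time - before <= e["time"] <= time + after),
        key=lambda e: e["time"],
    )


# Toy journal loosely following the slide; times are seconds on an arbitrary clock.
journal = [
    {"time": 100, "artifact": "browser", "text": "Viewed Android API docs"},
    {"time": 400, "artifact": "SampleClient", "text": "Opened sample client in Eclipse"},
    {"time": 450, "artifact": "SampleClient", "text": "Copied section from client"},
    {"time": 460, "artifact": "Communicator.authorize", "text": "Created (pasted) authorization line"},
    {"time": 5000, "artifact": "Board.update", "text": "Edited method"},
]

hits = events_for(journal, "Communicator.authorize")
trail = observations_around(journal, hits[0]["time"])
```

The first query yields the single creation event; the second reconstructs the browsing and copy-paste trail that preceded it.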
38. Running Example
• Asuk still doesn’t understand the
game loop
• Wally did not document his intentions
Well known problem
No relevant knowledge or discrete
subjective observations when work began
Needed knowledge more elaborate than
possible with discrete observations
39. Episodic Memory
• Developers avoid cost/distraction
• How to get developers to provide
knowledge?
• Humans capable of memorizing details and
experiences while working
Turned to literature in psychology
Encountered Tulving’s Episodic Memory model
40. Episodic vs. Semantic Memory
Aspect | Semantic | Episodic
Content and structure | Connected facts w/o learning context | Rich chronological episode recollections
Scale | Limited succinct knowledge | Comprehensive but not accessible
Example | Road routes, vacation dates | Vivid recollection of road trip vacation
Learning process | Multiple exposures | Immediate with exposure
Access | Association | Contextual cues
41. Episodic Memory in
SW Development
• Recollection of development episode in EM
Visual context
Activities
Thoughts during activities, including rationale
• Details and distinguishing factors degrade
• Need to be externalized and persisted
Cannot tap organic memory
• We mimic episodic memory
Collect objective knowledge with context
Combine with record of “thoughts”
42. Obtaining Subjective
Knowledge
• Knowledge workers reason and reinforce
working memory with “inner voice”
• Researchers glimpse into this voice by
asking for continuous verbalization
“Think-aloud”, protocol analysis, etc.
No interference with end product if avoiding
cognitive investment
Deictic references to shared context
Dependencies between utterances, form a
continuous narrative
44. Obtaining Subjective
Knowledge
• Developer “narrates” work for benefit of self
and of future users
Most likely using speech while coding/thinking
• Does not need to be exhaustive
Can apply discretion
But must avoid distracting proactive investment
• Broken into uncategorized subjective
observations
Visually distinct from discrete categories
Does not appear in context view
Interleaved with objective and context
45. Propositions 3+4
3. Developers can provide, with little
effort, a continuous narrative that
captures important traceability
knowledge
4. Other stakeholders can cost-
effectively elicit the narrated
traceability knowledge from the
prosthetic memory
47. Expected Practical Benefits
• Objective record only:
Tracking some recent navigation and changes
Tracing passive exposure to information
• Objective + Discrete subjective:
Recording reminders to reduce omissions
Recording and sharing artifact knowledge
Activity observations partition objective record
and improve orientation and awareness
• Objective + All subjective narrative
Traceability
Search
48. Expected Technical
Contributions
• Objective knowledge collection
Infrastructure for IDE monitoring
Technique for web application monitoring
Framework for representing traces with
context from multiple sources
49. Thesis Statement
• A prosthetic memory for software
development inspired by human
episodic memory can successfully
augment the organic memories of
developers and help them preserve,
manage, and share knowledge
50. Expected Research
Contributions
• Demonstrating that cost reductions alone
yield more knowledge preservation
Positive result: leads to more strategies
Negative result: highlights another factor?
• Demonstrating the power of an episodic
chronological representation for knowledge
about software systems
Existing tools rely on semantic presentation
51. Potential impact
• Step towards addressing the
knowledge problem in SW
development
• Proactive knowledge preservation
reduces need for retroactive means
• Support for distributed organizations
• May apply to other development
phases
• May apply to other knowledge work
53. Completed work
54. Research Evolution
• Early work on interruptions
Means for automatically deferring
interruptions
Gathering and interpreting low-level
telemetry to determine activity
Similar strategy followed in prototype
55. Research Evolution
• Studies of design
Seeking to support distributed design
Studied collocated design
Existing tools focus on supporting drawing
Identified difficulties in managing artifacts
Memory and spatial cues frequently used
Identified importance of knowledge preservation
Passive exposure to artifacts
• Trace changes and impacts
Needed for interpretation
Effort to preserve knowledge limits creativity
56. Research Evolution
• Knowledge preservation in eMoose
Focused on computer-based work
More straightforward to monitor
Tracking passive exposure
Developed technique for web access
Support for heterogeneous records with
contextual details
Abstraction of telemetry
Addition of subjective support
59. Personal experience with tool
• Used for over 6 months
• Typed discrete observations about activities
Effective capturing of reminders and bugs
• Task/subtask switching observations help
avoid disorientation and omissions
• Effective for finding files and members
Recent locations
Searching by task indicator
Search by free text
61. Prototype Deliverables
• Initial prototype ready for use
• Still much work
Robustness
Correctness of objective observations
New features
Context view
• Thesis scope:
Support for Java work within Eclipse
Support for web applications
62. Formative Evaluation
• Goal: General initial evaluation and
preparation for prototype release
• Recruit testers within department
• Train in all facets of eMoose
• Freedom in whether and how to use
• Frequently observe and interview
• Short cycles on requests and bug fixes
• Prepare for outside distribution
63. Phase 1
• Goal: Evaluate ability to manage tasks and
reminders and avoid disorientation (prop. 1)
• Recruit developers from OSS/Industry
Leverage existing contacts (IBM/Accenture, MSE)
Preferably recruit entire project team
• Custom version focused on activity knowledge
• Periodic surveys on satisfaction and impact
• Interviews with some participants
• Collect data from memory view client
• Check bug database for impact
64. Phase 2
• Goal: Knowledge preservation and sharing
with discrete observations (prop. 2)
• Recruit from phase 1 participants
Projects where everyone used eMoose
Add knowledge preservation/sharing features
• Periodic surveys on satisfaction and impact
• Collect exposure data from context view
• Compare amount of documentation
before/after intervention
65. Phase 3
• Goal: Study knowledge preservation and
consumption with continuous narrative
• Recruit users for lab study
• Starting task from scratch
• Asked to narrate for benefit of others
• Qualitative analysis of narrative contents
and impact on productivity
• Select work of certain developers
• Have others perform maintenance and
explore decisions. Analyze process and tool use
66. Phase 4 (optional)
• Goal: Study everything together
• Recruit users for lab study
• Sequence of hand-offs from one user
to another
• Compensation incorporates group
effort and preserved knowledge
• Qualitatively analyze results
67. Timeline
69. Capture difficulties
• Risk: Not all objective knowledge
can be captured
• Existing knowledge sufficient for most
purposes
• May affect rare trace investigations
70. Voice Capture
• Risk: Environment not suited for verbal
narration
• Risk: Not finding appropriate engine
• Continuous narrative likely to be used less
than typed discrete observations
• For our lab studies, we will type in transcript
between users.
• Technologies and availability increasing
• Organizations adjusting environment to
support collaboration (e.g., pair prog)
71. Voice Capture
• Risk: Speech to text inaccurate
• We do not need absolute accuracy
• Knowledge readable even if disrupted
• May apply context to resolve names
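As a rough illustration of using context to resolve names, the sketch below snaps misrecognized words to identifiers known from the current context. Fuzzy matching with Python's standard `difflib` module is an assumption chosen for this example, not the technique proposed in the slides; the function name and the 0.75 cutoff are likewise hypothetical.

```python
import difflib


def resolve_names(transcript, context_names, cutoff=0.75):
    """Replace words that closely resemble a known identifier with that identifier."""
    resolved = []
    for word in transcript.split():
        # get_close_matches returns the best candidates above the similarity cutoff.
        match = difflib.get_close_matches(word, context_names, n=1, cutoff=cutoff)
        resolved.append(match[0] if match else word)
    return " ".join(resolved)


# "comunicator" stands in for a speech-to-text error.
fixed = resolve_names("call comunicator first", ["Communicator", "Board"])
```

Words that resemble nothing in the context are left untouched, so an imperfect transcript stays readable.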
72. Insufficient contributions
• Risk: Users do not provide enough
subjective knowledge?
• Likely to see variations between users
• Returns correlated with investment
• Some organizations mandate
knowledge preservation
73. Users prefer traditional
means?
• Risk: Users prefer traditional means of
preserving knowledge
• If our approach is just incorrect, not much
we can do…
• Developers may not trust tool with storing
knowledge separately
We should offer automated means of adding the
knowledge to the code or to documents
• Have to improve robustness
74. Discrete knowledge
consumption
• Risk: Users do not have space for
memory and context views?
Tool meant primarily for dual-monitor
Eclipse supports quick-view
Markers or overlay in lieu of context view
• Risk: Users find them distracting?
Unlikely
Memory view changes by one line at a time
Context view changes only when switching
members
75. Discrete observations
• Risk: Developers do not associate types
with observations
Types can be associated after-the-fact
Some can be inferred automatically
• Risk: What if an observation is associated
with wrong context?
Observations automatically associated with
current context
User may spot mistake in memory view
We may support a degree-of-relevance context
model
76. Continuous Narrative
• What if it is difficult to elicit
knowledge from the continuous
narrative?
Could explore various techniques
Creating visual playback may offload
some mental load
Collaborative filtering and editing over
time
78. Navigation disorientation
• Questions:
How did we arrive at current position?
How do we get to next position or back?
• Contributing factors:
Complex and unfamiliar code base [de Alwis ‘06]
Branching and delocalization
• Manifestations:
Display thrashing
Excessive navigation
79. Change disorientation
• Questions:
What changes took place?
What to include in a commit?
What to revert when backtracking from a failed
attempt?
• Contributing factors:
Branching and delocalization
Interleaving of concerns within same members
• Manifestations:
Dead and irrelevant code in production
Accidental deletion
80. • Bob left several “surprises”
He didn’t realize there are special moves
in chess with different encodings
Race conditions (assumed that opponent
response takes time, so no need to close
the connection quickly)
Waiting checks a single mailbox in the
communicator
81. Other preserved knowledge
• Knowledge preserved in:
Version control comments
Bug tracking comments
Electronic communications
• Limitations:
Sparse
Often written after-the-fact
Difficult to piece into readable material
82. Others using tools
• No field deployment yet
• Brought one user to lab
Subject showed preference for speech
Difficulty starting and stopping recording
Able to effectively “think-aloud”
Narrative contained rationale
Narrative too comprehensive
Must train for balance
Subject did not explicitly associate types