The Validation Attitude
           Bob Colwell
             April 2010




Attitude
I could talk about techniques, tools, FV,
 environments, algorithms, machinery,
 languages, suites, training

but I think attitude is more important
 than any of those

No Perfect Designs
     Nothing is perfect, everything has bugs
           – Shortcomings, compromises, defects, design errata, gaffes, goofs,
             fumbles, errors, boneheaded mistakes, bobbles, bungles, boo-boos
           – But not all bugs are equal!
     Can’t test to saturation: schedule matters too
     Why is everything always so darned buggy?
           –   Software…need say no more…
           –   Why did Titanic not have waterproof compartments?
           –   Why did Ford Pinto have gas tank in back?
           –   Why did Challenger fly with leaky O-rings?
           –   Why did torpedoes not explode in WWII?



Entropy has a preferred direction
   Only genius could paint Mona Lisa,
      but any small child can destroy it quickly
   1000 ways to do things wrong, 1 or 2 that work
  4/4/07                               Bob Colwell

Prescription: SW visualization, tools to localize
bugs, diagnose problems, and instrument behavior

Accidents Are Inevitable
             – It's the nature of engineering
               to push designs to edge of
               failure (schedule, reliability,
               thermals, materials, tools,
               judgment of unknowns)
             – P(accident) = ε , for ε ≠ 0
             – World rewards this behavior
                    Cool new features + first to
                      market often preferred to
                      dependability
                    Other markets (life-support)
                      make (or should make) this
                      trade-off differently!
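
A small worked sketch of why a nonzero ε matters. It assumes each launch, shipment, or run is an independent trial with the same accident probability ε; that independence is an assumption made here for illustration, not a claim from the slide. Under it, the chance of at least one accident in n trials is 1 − (1 − ε)^n, which creeps toward 1 as n grows. The function name and the sample value of ε are hypothetical.

```python
def p_at_least_one_accident(epsilon: float, trials: int) -> float:
    """Probability of one or more accidents in `trials` independent trials.

    Assumes every trial has the same accident probability `epsilon`.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be a probability in [0, 1]")
    if trials < 0:
        raise ValueError("trials must be non-negative")
    # P(no accident in any trial) = (1 - epsilon) ** trials
    return 1.0 - (1.0 - epsilon) ** trials


if __name__ == "__main__":
    eps = 1e-4  # hypothetical per-trial accident probability
    for n in (1, 1_000, 10_000, 100_000):
        p = p_at_least_one_accident(eps, n)
        print(f"n={n:>7}: P(at least one accident) = {p:.4f}")
```

Only ε = 0 keeps the result at zero for every n, which is the slide's point: with ε ≠ 0, enough exposure makes an accident close to certain.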

Isn’t that just Murphy’s Law?

             Close. But Murphy is not
                quite right.

                  1. #Near-misses >> #disasters
                  2. Competent design/test finds
                     simple errors
                  3. Complex sequences & unlikely
                     event cascades survive to prod’n

Failures Getting Worse
   Mechanical things usually fail predictably due to physics
        – Wings bend, bridges groan, engines rattle, knees ache
        – By contrast, computer-based things fail “all over the place”


Helpful Engineering Attitude:
   1.     Nature does not want your
          engineered system to work; will
          actively work against you
   2.     Your design will do only what
          you’ve constrained it to do, only
          as long as it has to
   3.     Watch out for normalization of deviance
          (Challenger O-rings, Apollo 1 fire)

The Steely-Eyed Missile Validator
 Apollo 12
   2nd try to land on moon, launched 11/14/69
   36 seconds after liftoff, spacecraft struck by lightning => power surge
     –   All telemetry went haywire; book said to abort liftoff
     –   Both spacecraft pilot and mission controller were furiously considering that option
     –   But John Aaron was on shift, and thought he’d seen this malfunction before
 During testing 1 year earlier, Aaron observed test that went off into weeds
     –   Aaron took it on himself to investigate; this led him to obscure SCE subsystem
 In critical “abort or not” few seconds, with lives on line, Aaron made one of
    most famous calls in NASA history
     –   “Flight, try SCE to ‘Aux’”
     –   Neither Flight nor spacecraft pilot Conrad knew what that even meant, but Alan Bean tried it
     –   Telemetry came right back, vaulted Aaron into validation stardom
 He could have blown off earlier test, but he didn’t
 His inner validator wanted to know “what just happened?”

 Isaac Asimov once said 3 most important words in science are “What was THAT?”

Complexity Implies Surprises
 …and surprises are bad
 Chaos effects in complex µP’s
   – Decomposability is a fundamental tenet of
     complex system design
   – Butterfly wings ruin decomposability
   – “Improve design, get slower performance” not
     at all uncommon
 We must stop designing large
  systems as though small ones simply
  scale up
   – lesson from comm engineers: assume errors
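
One way to read "assume errors" in code: the receiver never trusts a frame until its checksum verifies. This is a minimal sketch using `zlib.crc32` from the Python standard library. The frame layout (payload followed by a 4-byte big-endian CRC-32) and the names `make_frame` and `check_frame` are hypothetical, chosen only for illustration.

```python
import zlib
from typing import Optional


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (hypothetical layout)."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + crc.to_bytes(4, "big")


def check_frame(frame: bytes) -> Optional[bytes]:
    """Return the payload if the CRC matches, else None so the caller can retry."""
    if len(frame) < 4:
        return None  # too short to even carry a checksum
    payload = frame[:-4]
    received = int.from_bytes(frame[-4:], "big")
    if zlib.crc32(payload) & 0xFFFFFFFF != received:
        return None  # corruption detected: do not use the data
    return payload


if __name__ == "__main__":
    frame = make_frame(b"SCE to Aux")
    print(check_frame(frame))            # b'SCE to Aux'

    corrupted = bytearray(frame)
    corrupted[0] ^= 0x01                 # flip one bit "in transit"
    print(check_frame(bytes(corrupted))) # None
```

The design choice is that corruption is the expected case and has an explicit code path, rather than being an exception nobody planned for.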
Thinking about validation
Ability to think in analogies is highest
 form of intelligence
  – IQ tests like “a:b :: c:d”
  – Hofstadter's book: numerical sequences
Analogies may illuminate a subject in
 a way that direct introspection cannot
  – They drive our minds to their creative limits



Listen to Your Inner Validator
0, 1, 2, …?

You knew it wouldn’t be 3, didn’t you?
  – You sensed something’s not quite as it seems
Answer: 0, 1, 2, 720!, …
  = 0, 1, 2, 6!!
  = 0, 1!, 2!!, 3!!!, …
                          D. Hofstadter, Fluid Concepts and Creative Analogies
That was the voice of your inner
 validator that you were hearing
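
The sequence can be checked mechanically. In this sketch the n-th term is read as n followed by n factorial signs, with "!!" meaning the factorial applied twice (not the double factorial); that reading is what makes 6!! equal 720! on the slide. The function names are hypothetical.

```python
import math


def iterated_factorial(value: int, times: int) -> int:
    """Apply the factorial function to `value` exactly `times` times."""
    for _ in range(times):
        value = math.factorial(value)
    return value


def term(n: int) -> int:
    """n-th term of the sequence: n followed by n factorial signs."""
    return iterated_factorial(n, n)


if __name__ == "__main__":
    print([term(n) for n in range(3)])     # [0, 1, 2]
    print(term(3) == math.factorial(720))  # True: 3!!! = (3!)!! = 6!! = 720!
    print(len(str(term(3))), "decimal digits in the fourth term")
```

The first three terms look like plain counting only because applying the factorial to 1 or 2 changes nothing; the rule shows itself at the fourth term.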
Lesson: Trust Nothing
             Hyatt Regency
              hotel, Kansas City, Missouri,
              1981
             Catwalks on rods
             40’ threaded rods
              with nuts halfway
             Killed 114,
              injured more than 200

What Happened?
            Spec was marginal
            40’ threaded rods
             “too hard”, changed
             to 2x20’ by contractor
            No simulation, no test

            Who goofed?
            Engineer, contractor,
            inspector…everyone

Therac-25
            Medical particle
             accelerator
            Electron and X-ray
             (photon) beams
            Six overdose accidents,
             several fatal, from poor
             system/SW design
              – And blind naïve
                faith in computers!

Question Everything
Test assumptions as well as design
 – If assumptions are broken, design surely is too
 – Try to “catch the field goals”
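
A small sketch of testing an assumption rather than only the design built on it. Binary search is correct only if its input is sorted; here that assumption is checked explicitly before the search runs. The example and the names `assumptions_hold` and `contains` are hypothetical, picked only to illustrate the point; `bisect_left` is from the Python standard library.

```python
from bisect import bisect_left
from typing import Sequence


def assumptions_hold(values: Sequence[int]) -> bool:
    """The design assumes ascending order; check it rather than trust it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def contains(values: Sequence[int], target: int) -> bool:
    """Binary search that refuses to run on input violating its assumption."""
    if not assumptions_hold(values):
        raise ValueError("input is not sorted; a binary search result would be meaningless")
    i = bisect_left(values, target)
    return i < len(values) and values[i] == target


if __name__ == "__main__":
    print(contains([1, 3, 5], 3))  # True

    unsorted = [5, 1, 3]
    # Without the check, the search quietly gives a wrong answer:
    i = bisect_left(unsorted, 1)
    print(i < len(unsorted) and unsorted[i] == 1)  # False, although 1 is in the list

    try:
        contains(unsorted, 1)
    except ValueError as err:
        print("assumption violated:", err)
```

The check costs a full pass over the data, so in practice it would more likely live at the boundary where the data enters the system, or in the test suite, than inside every call.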




Fight Urge to Relax Requirements
  Challenger
    – Not ok to slip design assumptions (launch temp,
      # of unburnt O-rings) to suit desires
  Airbus
    – Blaming pilot not reasonable explanation; pilot
      is part of system design
  Runway “incursions” up 71% since ‘93
    – Near-misses are trying to tell us something

    Diane Vaughan, The Challenger Launch Decision, University of Chicago Press,
      1996; Nancy Leveson, Safeware, Addison-Wesley, 1995

If You Didn’t Test It, It Doesn’t Work




Mir: fire extinguishers bolted to wall
  – Still had strong metal launch straps
  – Had never been needed before, so never tested
  – Discovered with a roaring fire several feet away

Complexity Makes Everything Worse
   Some things must be complicated to do their job
          – Our brains, for example
   But complex sequences are root of most disasters
          – Challenger, Bhopal, Chernobyl, FDIV, Exxon Valdez
   Where does complexity come from? Why does it
    keep increasing? Where are the limits?
          – Pentium 4
   “In the small” vs “in the large” design (micros vs
    comm systems)
   What to do? Vigilance, testing, awareness…we are
    all validators

What To Do
   Get the spec right
   Design for correctness but…
   design knowing perfection is unattainable
   Users are part of the system
   Formal methods
   Pre-production testing and validation
   Post-production testing and verification
   Education of the public

Roles
Engineers must stand
 their ground
  – There are always doubts,
    incomplete data; don’t let
    ‘em use those against you
       Judgment is crucially
         needed -- YOURS
  – Remember the Challenger
      “My God, Thiokol, when do you want me to launch? Next April?”
  – Be careful with “data”
      “Risk assessment data is like a captured spy; if you torture it long enough, it will tell you
      anything you want to know…” (Wm. Ruckelshaus)
  – Crushing, conflicting demands are norm
       Design must push the envelope w/o ceding responsibility
       Validation establishes whether they've pushed it too far
       Management must beware overriding tech judgment
       Public must understand limits of human design process
  All players must value roles of others!
  [Figure labels: mgt, HR, engineer]

Roles cont.
   Management
      – wants to assume a product is safe
      – knows nothing’s ever perfect,
          comes a time to “shoot the engineers” or they’ll never
            stop tinkering

   Validators
      – want to prove a product is safe
      – assume it is not by default
      – only informed arbiters of when product is ready

don’t fall for “might as well sign, we’re

Future Directions: Public Expectations
Andy Grove’s FDIV epiphany
      Paradoxically, the more high tech, the more public expects of product
      Users caused Chernobyl, TMI by going “off book”, but prevented many
      other disasters with real-time creativity…lessons are subtle


Takes exquisite understanding & judgment to discern
  accidents from reasonable risk-taking and
  bonehead errors or incompetence
      This is what a jury must do.
      How?



Can’t keep trending this way

Future of Validation
Multiple Culture Changes Needed
 Public needs to stop expecting perfection
 Design teams must explicitly limit complexity
     and avoid auto-scale-up assumptions
 Companies must mature past point of viewing
 validation as an unpleasant overhead
      Does your company have “Validation Fellows”?


Validation is a profession of its own.
 Cultivate the Validation Attitude!

The End



