Learning From NASA Mishaps:
What Separates Success From Failure?


 Project Management Challenge 2007
          February 7, 2007


               Faith Chandler
               Office of Safety and Mission Assurance

                                                        1
Discussion Topics

• What is a mishap?

• What is a close call?

• How can they affect your
  program?

• Anatomy of an accident.

• What can you learn from
  others' past failures that
  will make you successful?

• What do you do if your
  program has a mishap?

• How can you prepare your
  program?

Is the purpose of your program only to serve as a warning to others?


                                                                      2
The cause is really
obvious…or is it?

Without understanding
the cause, how can you
fix the problem?

How can you learn
from it?




                         3
What’s A Mishap? What’s A Close Call?

NASA Mishap. An unplanned event that results in at least one of the
following:

    –   Injury to non-NASA personnel, caused by NASA operations.
    –   Damage to public or private property (including foreign property),
        caused by NASA operations or NASA-funded development or
        research projects.
    –   Occupational injury or occupational illness to NASA personnel.
    –   NASA mission failure before the scheduled completion of the
        planned primary mission.
    –   Destruction of, or damage to, NASA property.

New Definition of Close Call. An event in which there is no injury or only
minor injury requiring first aid and/or no equipment/property damage or
minor equipment/property damage (less than $1000), but which
possesses a potential to cause a mishap.

             ALL MISHAPS And CLOSE CALLS ARE INVESTIGATED
                                                                             4
How Are Mishaps Classified?
           Classification based on dollar loss and injury.
            • (Mission failure based on cost of mission).

           Classification determines type of investigation to be
           conducted.

           Mishap classification:
            • Type A mishaps – Type D mishaps
            • Close calls

           Class        Year   Event                                                Loss
           Type A       2003   NOAA N Prime Processing Mishap                       $223 M
           Type B       2006   Remote Manipulator System Damage by Bridge Bucket    $470 K
           Type C       2006   Hubble WSIPE Lift Sling Falls on WSIPE Hardware      $TBD (between $25K-$250K)
           Type D       2006   Hydraulic Pump (HPU-3) Electrical Arc Between        Damage less than $25K
                               Pump & Crane Hook
           Close Call   2005   Premature Shutdown of WSTF Large Altitude
                               Simulation System & Blowback on Test Article
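
The dollar-loss side of the classification can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $1,000, $25K, and $250K boundaries come from the slides, but the Type A floor (`TYPE_A_FLOOR`) is an assumed value that the slides do not state, and the function and enum names are hypothetical. Injury severity and mission cost, which also drive the classification, are deliberately left out.

```python
from enum import Enum


class MishapClass(Enum):
    TYPE_A = "Type A"
    TYPE_B = "Type B"
    TYPE_C = "Type C"
    TYPE_D = "Type D"
    CLOSE_CALL = "Close Call"


# Boundaries taken from the slides.
CLOSE_CALL_LIMIT = 1_000    # close call: damage less than $1000
TYPE_D_LIMIT = 25_000       # Type D: less than $25K
TYPE_C_LIMIT = 250_000      # Type C: between $25K and $250K
# ASSUMPTION: the slides do not give the Type A / Type B boundary.
TYPE_A_FLOOR = 1_000_000


def classify_by_damage(damage_usd: float) -> MishapClass:
    """Classify an event by property damage alone (injury is ignored here)."""
    if damage_usd < 0:
        raise ValueError("damage cannot be negative")
    if damage_usd < CLOSE_CALL_LIMIT:
        # A real close call must also have had the potential to cause a mishap.
        return MishapClass.CLOSE_CALL
    if damage_usd < TYPE_D_LIMIT:
        return MishapClass.TYPE_D
    if damage_usd < TYPE_C_LIMIT:
        return MishapClass.TYPE_C
    if damage_usd < TYPE_A_FLOOR:
        return MishapClass.TYPE_B
    return MishapClass.TYPE_A
```

With these assumed boundaries, the $223 M NOAA N Prime loss falls in Type A and the $470 K Remote Manipulator System damage falls in Type B, matching the examples on the slide.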
                                                                                                                   5
How Can Mishaps And Close Calls Affect
 Your Program?
Mishaps can impact your budget, your schedule,
and your mission success!

What Can Go Wrong?
  Equipment can fail
  Software can contain errors
  Humans can make mistakes or
  deviate from accepted policy
  and practices

What’s The Cost?
  Human life
  One-of-a-kind hardware
  Government equipment &
  facilities
  Scientific knowledge
  Program cancellation
  Public confidence

  [Photo: 40.017 UE Lynch, 2005]


                                                               6
NASA Mishaps And Close Calls
In 2006, NASA had 715 Mishaps and 920 Close Calls reported in the NASA
Incident Reporting and Information System (IRIS).
         6 Type A mishaps
        13 Type B mishaps
        268 Type C mishaps

In the last 10 years (1996-2006), the direct cost of mishaps was more than
$2 Billion.

Additional Costs Include:
 • Workers’ compensation
 • Training and replacement workers
 • Lost productivity
 • Schedule delays
 • Mishap investigation
 • Implementing the Corrective Action Plan (CAP)
 • Record keeping (CAP, workers’ compensation, mishap, etc.)
 • Liability
These indirect costs can amount to more, in fact much more, than the
direct cost of the injury or property damage.
                                                                             7
Types Of NASA Mishaps

  Individual:          CTDs; Slips, Trips, & Falls; Insect & Animal Bites; Automobile
  Industrial:          Cooling Tower Fire; Crane - Pad B
  Processing & Test:   NOAA N Prime; VAB Foam Fire
  Test Flight:         Helios
  Lift Off:            Challenger
  In Space:            Mars Climate Orbiter; DART
  Landing:             Columbia; Genesis

                                                                                          8
Type Of Activity Where Mishaps Occurred

             Percentage of Type A Mishaps
          Occurring During Each Type of Activity
                        1996-2005

             Space                  41%
             Ground Test            27%
             Flight Test            12%
             Earth Flight            8%
             Ground Maintenance      8%
             Ground Processing       4%

                                                            9
Close Calls And Mishaps

                  Mars Exploration Rovers

                  Even programs with great success have
                  significant failures and close calls!
                  •   Cancellation of one rover due to concerns
                      about ability to be ready safely for launch.
                  •   Air bag failure months before launch.
                  •   Parachute failure months before launch.
                  •   Potential cable cutter shorting days before
                      launch.
                  •   Pyrotechnic firing software concern one day
                      before Mars arrival.




                                                                     10
Why Do Some Programs Have Close Calls
And Others Have Mishaps?

  Controls and Barriers Fail or Are Non-Existent

  VAB CLOSE CALL 2006
    2 Men Fall From Ladder
    Sustain Slight Injuries
    Control: Fall Protection Used

  Bldg. M6-794 TYPE A MISHAP 2006
    1 Man Falls From Roof
    Fatality
    Control Failed: Fall Protection NOT Used
    [Photo labels: Location Where Deceased Fell From Roof; First Point of
    Impact of Deceased; Second Point of Impact of Deceased]
                                                                                        11
Anatomy Of An Accident
  Initiating Event → Event → Barrier (Failed) → Control (Failed) → Control (Failed) → Undesired Outcome

  Initiating events:
    •   Hardware
    •   Software
    •   Human
    •   Weather
    •   Natural Phenomenon
    •   External Event

  Barriers:
    •   Guard, Shield, Shroud
    •   Personal Protective Equipment (PPE)
    •   Lockout, Keyed Connector

  Controls:
    •   Design Review
    •   Inspection
    •   Test
    •   Audit
    •   Alarm/Feedback Loop
    •   Risk Assessment/FMEA
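
The chain above can be read as a simple rule: an initiating event turns into the undesired outcome only when every barrier and control in its path fails or is missing. A minimal Python sketch of that reading follows; the `Defense` class and both function names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Defense:
    name: str    # e.g. "Inspection" or "Fall protection"
    kind: str    # "barrier" or "control"
    works: bool  # True if this defense would stop the event


def undesired_outcome(defenses: List[Defense]) -> bool:
    """True when no defense stops the event (all failed, or none exist)."""
    return not any(d.works for d in defenses)


def failed_defenses(defenses: List[Defense]) -> List[str]:
    """Names of the defenses that failed, i.e. where to start asking why."""
    return [d.name for d in defenses if not d.works]


# Mirrors the two 2006 fall events described earlier in the deck.
vab_ladder_fall = [Defense("Fall protection", "control", works=True)]
roof_fall = [Defense("Fall protection", "control", works=False)]

print(undesired_outcome(vab_ladder_fall))  # the control held
print(undesired_outcome(roof_fall))        # the only control failed
```

An empty list of defenses also yields an undesired outcome, which corresponds to the "non-existent" case on the earlier slide.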
                                                                                                     12
How To Make Sure Your Program Doesn’t
Have A Major Mishap
   It is not enough to have layers of defenses… Nearly every
   program at NASA has them.
         • Reviews
         • Inspections
         • Tests
         • Audits
         • Alarms and means to mitigate

   What separates the successful programs from those that have
   mishaps? Their defenses work.

   How can you detect failing or non-existent defenses?

       • Perform Root Cause Analysis (RCA) on problems and close
         calls.

       • Identify systemic problems in your program.

       • Fix failures in defenses early … before they cause a
         mishap.
                                                                   13
NOAA N Prime’s Weak Defenses




   Sequence of events:
     Team Installs Bolts → Other Team Removes Bolts → Engineer Doesn’t
     Follow Procedures To Reinstall 24 Bolts

   Controls that failed:
     •   Team That Removes Bolts Doesn’t Tell Anyone
     •   Lead Technician (PQC) Stamped Procedure Without Inspecting
     •   Quality Assurance Stamped Procedure Without Inspecting
     •   NASA and Contractor Supervisors Did Not Correct Known Problems
         and Routinely Allowed Sign Off Without Verification
                                                                                                      14
What Can Your Program Learn From
NOAA N Prime?


•   Communicate all changes on the floor to all technicians and
    supervisors.
     – Is this really working in your program now?



•   Do not back stamp procedures
     – Are your technicians doing this now?
     – This has been a factor in many mishaps!


•   If an audit or an investigation identifies a problem or a
    nonconformance, fix it!
      – This has been a factor in previous mishaps




                                                                      15
Understanding The Mishap


• Initiating events happen.


• Defenses (Controls and Barriers) fail or do not exist.


                      But why?




                                                           16
Anatomy of an Accident – Asking Why




  Initiating Event → Event → Barrier (Failed) → Control (Failed) → Control (Failed) → Undesired Outcome

  For each event and for each failed barrier or control, ask:
    •   Why? Was There A Condition Present?
    •   Why? Did A Previous Event Occur?
                                                                                                                   17
Investigating Accidents

Often we:
    Identify the part or individual
    that failed.

    Identify the type of failure.

    Identify the immediate cause
    of the failure.

    Stop the analysis.



 Problem with this approach:
 The underlying causes may
 continue to produce similar
 problems or mishaps in the same
 or related areas.

                                      18
Root Cause Analysis

Identify the immediate causes
(proximate causes) and the
organizational causes using
root cause analysis.

              Proximate Cause (Describes What Failed)

              Intermediate Cause

              Root Cause




                                            19
Root Cause Analysis

Event and Causal Factor Tree: Shows all the things that did occur.
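
An event and causal factor tree can be held in a small recursive data structure, with each node's children being the answers to "Why?". The sketch below is an assumption-laden illustration, not the NASA tool or method: the `Cause` class, the `root_causes` helper, and the example tree contents are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    description: str
    # Answers to "Why?": conditions present or previous events.
    why: List["Cause"] = field(default_factory=list)


def root_causes(node: Cause) -> List[str]:
    """Return the leaves of the tree: causes with no further 'why' recorded."""
    if not node.why:
        return [node.description]
    found: List[str] = []
    for child in node.why:
        found.extend(root_causes(child))
    return found


# HYPOTHETICAL example tree, for illustration only.
tree = Cause(
    "Hardware dropped during lift",          # proximate cause: what failed
    why=[
        Cause(
            "Sling not inspected before use",  # intermediate cause
            why=[Cause("Procedure had no inspection step")],
        ),
        Cause("Schedule pressure on the crew"),
    ],
)

for cause in root_causes(tree):
    print(cause)
```

Stopping at the top node reproduces the "stop the analysis" habit described earlier; walking to the leaves is what surfaces the organizational causes.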




                                                                     20
Anhydrous Ammonia




                    21
Will Your Program Implement The
Lessons Learned?




  •   Genesis spacecraft launch, August 8, 2001
  •   Collected solar wind samples for two years
  •   Returned to Earth on September 8, 2004
  •   Most science was recovered

  Parachute failed to deploy

  Some of the Causes
  •   Design error - G-Switch inverted
      (Inappropriate confidence in heritage design)
  •   Drawing incorrect
  •   Drawing error not detected:
        • Reviews not in depth
        • Testing deleted/modified
                                                                                            22
Will Your Program Implement The
Lessons Learned?
Canister Ladder Contacted Canister Rotation Facility Door – 2001
                                           •   Configuration change – added ladder
                                           •   Didn’t review or analyze change.
                                           •   No documentation of clearance height.
                                           •   Procedures did not require a check of
                                               canister stack height and facility
                                               clearance prior to move.
                                           •   No detection during move.




                                                                                       23
Will Your Program Implement The
    Lessons Learned?
                                   •      Helios Test Flight, June 26, 2003.

                                   •      High dynamic pressure reached by the aircraft during an unstable pitch
                                          oscillation leading to failure of the vehicle’s secondary structure

                                   •      “Helios suffered from incorrect assessment of risk as the result of inaccurate
                                          information provided by the analysis methods and schedule pressures and
                                          fiscal constraints resulting from budgetary contraction, constrained test
                                          windows and a terminating program. Though the pressures and constraints
                                          were not considered unusual, it did have some unquantifiable influence on
                                          the decision process.” (MIB report)

Summary of Causes of Mishap
•   Configuration change – 2 fuel cells to 3
    fuel cells
•   Reviews did not identify the problem:
    a change in the vehicle’s weight and
    balance that affected stability
•   Lack of adequate analysis methods
•   Inaccurate risk assessment of the
    effects of configuration changes
•   Didn’t do incremental testing after
    change.
•   This led to an inappropriate decision to
    fly an aircraft configuration highly
    sensitive to disturbances.

                                                                                                                           24
Can You See A Pattern?




  [Photos]
  •   Type B Finger Amputation In Pulley
  •   SLC-2 VAFB, 2006
  •   Astro E2 – Suzaku, 2005
  •   DART, 2005
                                                              25
What Causes Mishaps?

Proximate Cause of Type A Mishaps, 1996-2005
     Hardware   61%
     Human      24%
     Software   15%

NASA
• 57% of Type A mishaps caused by
  human error (1996-2005)
  *Does not include auto accidents or death by natural
  causes

• 78% of the Shuttle ground-support
  operations incidents resulted from
  human error (Perry, 1993).

Outside NASA
• 75% of all US military aircraft losses
  involve sensory or cognitive errors
  (Air Force Safety Center, 2003).

• 83% of 23,338 accidents involving
  boilers and pressure vessels were a
  direct result of human oversight or
  lack of knowledge
  (National Board of Boiler and Pressure Vessel Inspectors, 2005).

• 41% of mishaps at petrochemical
  plants were caused by human error
  (R.E. Butikofer, 1986).



                                                                                                         26
Lessons Learned: What Causes Mishaps?
 Unsafe acts occur in all programs and phases of the system life cycle.
 – Specification development/planning
 – Conceptual design
 – Product design
 – Fabrication/production
 – Operational service
 – Product decommissioning


   70-90% of safety-related decisions in engineering projects are made
   during early concept development.
   Decisions made during the design process account for the greatest
   effect on cost of a product.


  Design process errors are the root cause of many failures.

  Human performance during operations & maintenance is also a major
  contributor to system risk.
                                                                          27
Type A - Payload Mishaps
HESSI (2000)
High Energy Solar Spectroscopic Imager

Subjected to a series of vibration tests as part of its
flight certification program…caused significant
structural damage.
•    Misaligned shaker table
•    No validation test of shaker table
•    HESSI project not aware of risk posed by test
•    Sine-burst frequency not in the test plan
•    Written procedure did not have critical steps

SOHO (1998)
Solar and Heliospheric Observatory

•    Made changes to software and procedures
•    Failed to perform risk analysis of modified
     procedure set
•    Ground errors led to the major loss of attitude
     (Omission in the modified predefined command)
•    Failure to communicate procedure change
•    Incorrect diagnosis

                                                                                                          28
Causes Of Mishaps – Inside NASA

Design

•   Logic design error existed -
    Design errors in the circuitry
    were not identified
•   Drawing incorrect
•   System drawings were incorrect
    because they were not updated
    when system was moved from its
    original location to the Center
•   System labels were incorrect
•   System did not have sensors to
    detect failure
•   Configuration changes driven by
    programmatic and technological
    constraints… reduced design
    robustness and margins of
    safety

Reviews

•   Design was not peer reviewed
•   Systems reviews were not conducted
•   Technical reviews failed to detect error in design
•   Red-Team Reviews failed to identify design errors

Tests

•   Testing only for correct functional behavior … not
    for anomalous behavior, especially during
    initial turn-on and power on reset conditions
•   There was no end-to-end test.
•   Test procedure did not have a step to verify that all
    critical steps were performed
•   Lacked a facility validation test
•   Failed to test as you fly…fly as you test
•   Tests were cut because funding was cut
                                                                                                  29
Causes Of Mishaps – Inside NASA

Operations

•    Team error in analysis due to lack of
     system knowledge. This contributed to
     the team’s lack of understanding of
     essential spacecraft design
•    Incorrect diagnosis of problem because
     the team lacked information about
     changes in the procedures
•    Emergency step/correction maneuver was
     not performed

Paperwork

•    Lacked documentation on system
     characteristics
•    Processing paperwork and discrepancy
     disposition paperwork were ambiguous
•    Electronic paperwork system can be
     edited with no traceability (Info was
     changed and no record of the change was
     recorded)
•    Written procedures generally did not have
     full coverage of the pretest setup and
     post-test teardown phases of the process
•    Did not follow procedures (led to fatality)
•    Procedure did not have mandatory steps

Communication

•    Inadequate communication between shifts
•    Inadequate communications between
     project elements




                                                                                                 30
Causes Of Mishaps – Inside NASA

Supervision

•    “Failure to correct known problems” was
     a supervisory failure to correct similar
     known problems (Hardware)
•    “Supervisory Violation” was committed by
     repeatedly waiving required presence of
     quality assurance and safety and
     bypassing Government Mandatory
     Inspection Points
•    Lacked “organizational processes” to
     effectively monitor, verify, and audit the
     performance and effectiveness of the
     processes and activities

Risk Assessment & Risk Mgmt

•    Did not consider the worst-case effect.
     Lacked systematic analyses of “what
     could go wrong”
•    The perception that operations were routine
     resulted in inadequate attention to risk
     mitigation
•    The project was not fully aware of the risks
     associated with the test
•    Lack of adequate analysis methods led to
     an inaccurate risk assessment of the effects
     of configuration changes

Staffing

•    Inadequate operations team staffing




                                                                                                   31
What happens when a mishap or
close call occurs?




                                32
Immediate Notification Process


Within 1 Hour

  •  Center Safety Office: Notify Headquarters by phone for Type A, Type B,
     high visibility mishap, or high visibility close call.
     (This includes reporting a human test subject injury/fatality.)
       – Duty 202.358.0006
       – Non-duty 866.230.6272
  •  Chief, Safety and Mission Assurance: Notify Administrator and senior
     staff (phone and/or mishap lists email)
  •  Center’s Chief of Aircraft Operations: Notify National Transportation
     Safety Board (NTSB) if applicable

Within 8 Hours

  •  Notify OSHA (if applicable)
     Applicable up to 30 days after mishap when:
       – Death of federal employee
       – Hospitalization of 3 or more if 1 is a federal employee

Within 24 Hours

  •  Center Safety Office: Notify Headquarters electronically with
     additional details
  •  Center Safety Office: Record the occurrence of ALL mishaps & close
     calls in Incident Reporting Information System (IRIS)
  •  Center Director: Notify Administrator by phone when the following occur:
       – Type A
       – Type B
       – Type C (Lost-time injury only)
       – Onsite non-occupational fatality (e.g., heart attack)
       – Fatalities and serious illness off the job (civil servant &
         contractor)
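
Two of the notification rules above are simple enough to write down as predicates. The following Python sketch is an illustration only: the function and parameter names are hypothetical, and reading "if 1 is a Federal Employee" as "at least one" is an assumption.

```python
def phone_call_within_one_hour(mishap_class: str, high_visibility: bool = False) -> bool:
    """Center Safety Office phones Headquarters for Type A, Type B,
    or any high visibility mishap or close call."""
    return mishap_class in {"A", "B"} or high_visibility


def osha_notification_applies(
    federal_deaths: int, hospitalized: int, hospitalized_federal: int
) -> bool:
    """OSHA is notified on the death of a federal employee, or when 3 or more
    people are hospitalized and at least 1 of them is a federal employee.
    ASSUMPTION: 'if 1 is a Federal Employee' means 'at least one'."""
    return federal_deaths > 0 or (hospitalized >= 3 and hospitalized_federal >= 1)


print(phone_call_within_one_hour("B"))
print(phone_call_within_one_hour("D", high_visibility=True))
print(osha_notification_applies(federal_deaths=0, hospitalized=3, hospitalized_federal=1))
```

Recording every mishap and close call in IRIS within 24 hours has no condition attached, so it needs no predicate.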
                                                                                                        33
Mishap Investigation Notional Timeline
   Immediately – 24 hours
     Safe Site, Initiate Mishap Preparedness and Contingency Plans,
     Make Notifications, Classify Mishap
   Within 48 Hours of Mishap
     Appoint Investigating Authority
   Within 75 Workdays of Mishap
     Complete Investigation & Mishap Report
   Within An Additional 30 Workdays
     Review & Endorse Mishap Report
   Within An Additional 5 Workdays
     Approve or Reject Mishap Report
   Within An Additional 10 Workdays
     Authorize Report For Public Release
   Within An Additional 10 Workdays
     Distribute Mishap Report
   Concurrent
     Within 15 Workdays of Being Tasked: Develop Corrective Action Plan
     Within 10 Workdays of Being Tasked: Develop Lessons Learned

   Within 145 days of mishap

   Program Manager’s Responsibilities
     PM Pays For Mishap Investigation Costs Too
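
The workday-based milestones can be turned into calendar dates with a short helper. This is a rough sketch under stated assumptions: workdays are taken to be Monday through Friday with no holiday calendar, and the `MILESTONES` list, `add_workdays`, and `schedule` names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (milestone, additional workdays after the previous milestone), per the timeline.
MILESTONES = [
    ("Complete investigation & mishap report", 75),
    ("Review & endorse mishap report", 30),
    ("Approve or reject mishap report", 5),
    ("Authorize report for public release", 10),
    ("Distribute mishap report", 10),
]


def add_workdays(start: date, workdays: int) -> date:
    """Return the date `workdays` working days after `start`.
    ASSUMPTION: Monday-Friday are workdays; holidays are not skipped."""
    current = start
    remaining = workdays
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def schedule(mishap_date: date) -> List[Tuple[str, date]]:
    """Chain the milestones: each deadline counts from the previous one."""
    due = mishap_date
    result = []
    for name, extra_workdays in MILESTONES:
        due = add_workdays(due, extra_workdays)
        result.append((name, due))
    return result


for name, due in schedule(date(2007, 2, 7)):
    print(f"{due.isoformat()}  {name}")
```

A program with a real holiday calendar would need to pass it into `add_workdays`; the sketch leaves that out to stay short.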



                                                                                                     34
Two Types of Mishap Investigations

 •   Safety Mishap Investigation
     (Per NPR 8621.1: NASA Procedural Requirements for Mishap
     Reporting, Investigating and Recordkeeping)

      –   Describes policy to report, investigate, and document
          mishaps, close calls, and previously unidentified serious
          workplace hazards to prevent recurrence of similar accidents.

 •   Collateral Mishap Investigation
     (Procedures & requirements being developed by the Office of the
     General Counsel).

      –   If it is reasonably suspected that a mishap resulted from
          criminal activity.
      –   If the Agency wants to assess accountability (e.g., determine
          negligence).



                                                                          35
Why Should You Investigate Close Calls?


•   Investigations can identify systemic problems


•   Close calls can help you a lot… They tell you where your problems
    are.


•   Close calls give you the opportunity to influence your
    program/project along the way.


                     Require your teams to
               report and investigate close calls.




                                                                       36
Preparing For Mishaps




Who’s Missing
 Hardware?


                            July 2006
                        LaRC Wind Tunnel

                                           37
Preparing For Mishaps
 • Anticipate the failures

 • Write the failure report first: if we failed, why would we have
   failed? Generate your plans accordingly.

 • A good designer thinks about how someone will use a tool,
   piece of equipment, or procedure (etc.) and how it will be
   misused. Think about it early on! Prepare for the misuse.

 • For critical failures, have a Mishap Preparedness and
   Contingency Plan that covers:
        Manufacturing Mishaps
        Processing Mishaps
        Test Mishaps
        Transportation Mishaps
        Flight Mishaps
        Operations Mishaps

   What does your Mishap Preparedness and Contingency Plan
   look like? Successful programs have complete plans.
                                                                     38
Preparing For Mishaps

•   Center Mishap Preparedness and Contingency Plan Contents
     – Local close call and mishap reporting & investigating procedures
     – Center-specific emergency response
     – Procedures to appoint an Interim Response Team
     – Location of space for impounded objects
     – Mishap process to establish investigating authority and process
       report (Type C mishaps, Type D mishaps, and close calls)

•   Program Mishap Preparedness and Contingency Plan Contents
     – Specific procedures for program emergency response and
       investigating (e.g., safing procedures, toxic commodities, …)
     – Names chair and ex-officio for a Type A board.
     – Procedures to impound data, records, etc… for off-site mishaps
     – Lists national, state, and local organizations and agencies which
       are most likely to take part in debris collection
     – Lists MOUs with international partners and agencies that may
       support investigation

                                                                           39
For More Information

  •   NASA PBMA Mishap Investigation Website
      (https://secureworkgroups.grc.nasa.gov/mi)
       – Includes:
           • Requirements

           • Guides and handbooks
           • Template
           • Tools and methods

           • Hard copies of classroom training
           • Mishap reports
           • Lessons learned

           • Conference presentations



  •   HQ Office of Safety & Mission Assurance
       – Faith.T.Chandler@nasa.gov
       – 202-358-0411
                                                   40
Conclusion

 •   “Lots of times we’re lucky or prepared and we dodge the
     bullet...

 •   But sometimes we endure very public failures, loss of life
     and significant loss of property...

 •   In every case, we work to prevent failures and ensure
     success...

 •   And when failures occur, we try to learn from them.”
     (Tattini… Mars Exploration Rover)

 •   To be successful, we must report and investigate our
     failures and close calls, identify the underlying root causes,
     and generate solutions that prevent these systemic
     problems from creating more failures in our program and in
     others.
                                                                      41
BACK-UP SLIDES



                 42
NASA Mishap And Close Call Classification Levels

Type A
   Property Damage:  Total direct cost of mission failure and property damage is
                     $1,000,000 or more,
                        or
                     Crewed aircraft hull loss has occurred,
                        or
                     Occurrence of an unexpected aircraft departure from controlled
                     flight (except high performance jet/test aircraft such as F-15,
                     F-16, F/A-18, T-38, and T-34, when engaged in flight test
                     activities).
   Injury:           Occupational injury and/or illness that resulted in:
                     A fatality,
                        or
                     A permanent total disability,
                        or
                     The hospitalization for inpatient care of 3 or more people
                     within 30 workdays of the mishap.

Type B
   Property Damage:  Total direct cost of mission failure and property damage of at
                     least $250,000 but less than $1,000,000.
   Injury:           Occupational injury and/or illness has resulted in permanent
                     partial disability,
                        or
                     The hospitalization for inpatient care of 1-2 people within
                     30 workdays of the mishap.

Type C
   Property Damage:  Total direct cost of mission failure and property damage of at
                     least $25,000 but less than $250,000.
   Injury:           Nonfatal occupational injury or illness that caused any workdays
                     away from work, restricted duty, or transfer to another job
                     beyond the workday or shift on which it occurred.

Type D
   Property Damage:  Total direct cost of mission failure and property damage of at
                     least $1,000 but less than $25,000.
   Injury:           Any nonfatal OSHA recordable occupational injury and/or illness
                     that does not meet the definition of a Type C mishap.

Close Call
   Property Damage:  An event in which there is no equipment/property damage or
                     minor equipment/property damage (less than $1,000), but which
                     possesses the potential to cause a mishap.
   Injury:           An event in which there is no injury or only minor injury
                     requiring first aid, but which possesses a potential to cause
                     a mishap.
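
The thresholds above can be read as a simple decision procedure. The sketch below is illustrative only and is not part of NPR 8621.1: all function and parameter names are hypothetical, the hospitalization count is assumed to already be limited to the 30-workday window, the aircraft departure-from-controlled-flight criterion is left out, and taking the more severe of the property and injury results as the overall level is an assumption rather than something the table states.

```python
# Severity order, least to most severe.
LEVELS = ["Close Call", "Type D", "Type C", "Type B", "Type A"]


def classify_property(direct_cost_usd: float, crewed_hull_loss: bool = False) -> str:
    """Classify by total direct cost of mission failure and property damage."""
    if crewed_hull_loss or direct_cost_usd >= 1_000_000:
        return "Type A"
    if direct_cost_usd >= 250_000:
        return "Type B"
    if direct_cost_usd >= 25_000:
        return "Type C"
    if direct_cost_usd >= 1_000:
        return "Type D"
    # Below $1,000: a close call, provided the event could have caused a mishap.
    return "Close Call"


def classify_injury(
    fatality: bool = False,
    permanent_total_disability: bool = False,
    permanent_partial_disability: bool = False,
    hospitalized: int = 0,  # inpatients within 30 workdays (assumed pre-filtered)
    lost_time_or_restricted_duty: bool = False,
    osha_recordable: bool = False,
) -> str:
    """Classify by occupational injury or illness outcome."""
    if fatality or permanent_total_disability or hospitalized >= 3:
        return "Type A"
    if permanent_partial_disability or hospitalized >= 1:
        return "Type B"
    if lost_time_or_restricted_duty:
        return "Type C"
    if osha_recordable:
        return "Type D"
    return "Close Call"


def classify(direct_cost_usd: float = 0.0, crewed_hull_loss: bool = False, **injury) -> str:
    """Overall level: the more severe of the two results (an assumption)."""
    candidates = [
        classify_property(direct_cost_usd, crewed_hull_loss),
        classify_injury(**injury),
    ]
    return max(candidates, key=LEVELS.index)
```

For example, `classify(470_000)` falls in the $250,000 to $1,000,000 band and returns "Type B", while `classify(500, hospitalized=3)` returns "Type A" because the injury column dominates.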
                                                                                                                                            43
Definitions of RCA & Related Terms
Cause (Causal Factor)    An event or condition that results in an effect. Anything that shapes or influences the outcome.


Proximate Cause(s)       The event(s) that occurred, including any condition(s) that existed immediately before the undesired
                         outcome, directly resulted in its occurrence and, if eliminated or modified, would have prevented the
                         undesired outcome. Also known as the direct cause(s).

Root Cause(s)            One of multiple factors (events, conditions or organizational factors) that contributed to or created the
                         proximate cause and subsequent undesired outcome and, if eliminated or modified, would have
                         prevented the undesired outcome. Typically multiple root causes contribute to an undesired outcome.

Root Cause Analysis      A structured evaluation method that identifies the root causes for an undesired outcome and the
(RCA)                    actions adequate to prevent recurrence. Root cause analysis should continue until organizational
                         factors have been identified, or until data are exhausted.
Event                    A real-time occurrence describing one discrete action, typically an error, failure, or malfunction.
                         Examples: pipe broke, power lost, lightning struck, person opened valve, etc…

Condition                Any as-found state, whether or not resulting from an event, that may have safety, health, quality,
                         security, operational, or environmental implications.
Organizational Factors   Any operational or management structural entity that exerts control over the system at any stage in its
                         life cycle, including but not limited to the system’s concept development, design, fabrication, test,
                         maintenance, operation, and disposal.
                         Examples: resource management (budget, staff, training); policy (content, implementation, verification);
                         and management decisions.
Contributing Factor      An event or condition that may have contributed to the occurrence of an undesired outcome but, if
                         eliminated or modified, would not by itself have prevented the occurrence.

Barrier                  A physical device or an administrative control used to reduce risk of the undesired outcome to an
                         acceptable level. Barriers can provide physical intervention (e.g., a guardrail) or procedural separation
                         in time and space (e.g., lock-out-tag-out procedure).
                                                                                                                                     44

Chandler faith

  • 1. Learning From NASA Mishaps: What Separates Success From Failure? Project Management Challenge 2007 February 7, 2007 Faith Chandler Office of Safety and Mission Assurance 1
  • 2. Discussion Topics • What is a mishap? • What is a close call? • How can they affect your program? • Anatomy of an accident. • What can you learn from others’ past failures that will make you successful? • What do you do if your program has a mishap? • How can you prepare your program? [Image caption: Is the purpose of your program only to serve as a warning to others?]
  • 3. The cause is really obvious…or is it? Without understanding the cause, how can you fix the problem? How can you learn from it? 3
  • 4. What’s A Mishap? What’s A Close Call? NASA Mishap. An unplanned event that results in at least one of the following: – Injury to non-NASA personnel, caused by NASA operations. – Damage to public or private property (including foreign property), caused by NASA operations or NASA-funded development or research projects. – Occupational injury or occupational illness to NASA personnel. – NASA mission failure before the scheduled completion of the planned primary mission. – Destruction of, or damage to, NASA property. New Definition of Close Call. An event in which there is no injury or only minor injury requiring first aid and/or no equipment/property damage or minor equipment/property damage (less than $1000), but which possesses a potential to cause a mishap. ALL MISHAPS And CLOSE CALLS ARE INVESTIGATED 4
  • 5. How Are Mishaps Classified? Classification based on dollar loss and injury. • (Mission failure based on cost of mission). Classification determines type of investigation to be conducted. Mishap classification: • Type A mishaps – Type D mishaps • Close calls
      Examples (years shown on slide: 2005, 2003, 2006, 2006, 2006):
      – Type A: NOAA N Prime Processing Mishap, $223 M
      – Type B: Remote Manipulator System Damage by Bridge Bucket, $470 K
      – Type C: Hubble WSIPE Lift Sling Falls on WSIPE Hardware, $TBD (between $25K-$250K)
      – Type D: Hydraulic Pump (HPU-3) Electrical Arc Between Pump & Crane Hook, damage less than $25K
      – Close Call: Premature Shutdown of WSTF Large Altitude Simulation System & Blowback on Test Article
  • 6. How Can Mishaps And Close Calls Affect Your Program? Mishaps can impact your budget, your schedule, and your mission success! What Can Go Wrong? Equipment can fail. Software can contain errors. Humans can make mistakes or deviate from accepted policy and practices. What’s The Cost? Human life; one-of-a-kind hardware; government equipment & facilities; scientific knowledge; program cancellation; public confidence. [Photo: 40.017 UE Lynch, 2005]
  • 7. NASA Mishaps And Close Calls In 2006, NASA had 715 Mishaps and 920 Close Calls reported in the NASA Incident Reporting and Information System (IRIS). 6 Type A mishaps 13 Type B mishaps 268 Type C mishaps In the last 10 years (1996-2006), the direct cost of mishaps was more than $2 Billion. Other Additional Costs Include: • Worker’s compensation • Training and replacement workers • Lost productivity • Schedule delays • Mishap investigation • Implementing the Corrective Action Plan (CAP) • Record keeping (CAP, worker’s compensation, mishap, etc) • Liability These indirect costs can amount to more, in fact much more, than the direct cost of the injury or property damage. 7
  • 8. Types Of NASA Mishaps [Figure: example mishaps grouped by type. Category labels: Industrial; Lift Off; Landing; Test Flight; Processing & Test; In Space; Individual. Examples shown: CTDs; Slips, Trips, & Falls; Insect & Animal Bites; Automobile; Cooling Tower Fire; VAB Foam Fire; Crane - Pad B; NOAA N Prime; Mars Climate Orbiter; Challenger; Columbia; DART; Helios; Genesis.]
  • 9. Type Of Activity Where Mishaps Occurred. Percentage of Type A Mishaps Occurring During Each Type of Activity, 1996-2005: Ground Test 41%; Space 27%; Flight Test 12%; Earth Flight 8%; Ground Maintenance 8%; Ground Processing 4%.
  • 10. Close Calls And Mishaps Mars Exploration Rovers Even programs with great success have significant failures and close calls! • Cancellation of one rover due to concerns about ability to be ready safely for launch. • Air bag failure months before launch. • Parachute failure months before launch. • Potential cable cutter shorting days before launch. • Pyrotechnic firing software concern one day before Mars arrival. 10
  • 11. Why Do Some Programs Have Close Calls And Others Have Mishaps? Controls and Barriers Fail or Are Non-Existent.
      – VAB CLOSE CALL 2006: 2 Men Fall From Ladder, Sustain Slight Injuries. Control: Fall Protection Used.
      – Bldg. M6-794 TYPE A MISHAP 2006: 1 Man Falls From Roof, Fatality. Control Failed: Fall Protection NOT Used.
      [Photo labels: Location Where Deceased Fell From Roof; First Point of Impact of Deceased; Second Point of Impact of Deceased]
  • 12. Anatomy Of An Accident [Diagram: an initiating event passes through failed controls and a failed barrier and ends in an undesired outcome.]
      Initiating events: • Hardware • Software • Human • Weather • Natural Phenomenon • External Event
      Controls: • Design Review • Inspection • Test • Audit • Alarm/Feedback Loop • Risk Assessment/FMEA
      Barriers: • Guard, Shield, Shroud • Personal Protective Equipment (PPE) • Lockout, Keyed Connector
  • 13. How To Make Sure Your Program Doesn’t Have A Major Mishap It is Not enough to have layers of defenses… Nearly every program at NASA has them. • Reviews • Inspections • Tests • Audits • Alarms and means to mitigate What separates the successful programs from those that have mishaps… These defenses work How can you detect failing or non-existent defenses? • Perform Root Cause Analysis (RCA) on problems and close calls. • Identify systemic problems in your program. • Fix failures in defenses early … before they cause a mishap. 13
  • 14. NOAA N Prime’s Weak Defenses
      Events: Team Installs Bolts; Other Team Removes Bolts; Engineer Doesn’t Follow Procedures To Reinstall 24 Bolts
      Failed controls:
      – Team That Removes Bolts Doesn’t Tell Anyone
      – Lead Technician (PQC) Stamped Procedure Without Inspecting
      – Quality Assurance Stamped Procedure Without Inspecting
      – NASA and Contractor Supervisors Did Not Correct Known Problems, Routinely Allowed Sign Off Without Verification
  • 15. What Can Your Program Learn From NOAA N Prime? • Communicate all changes on the floor to all technicians and supervisors. – Is this really working in your program now? • Do not back stamp procedures. – Are your technicians doing this now? – This has been a factor in many mishaps! • If an audit or an investigation identifies a problem, or a non-conformance, fix it! – This has been a factor in previous mishaps.
  • 16. Understanding The Mishap • Initiating events happen. • Defenses (Controls and Barriers) fail or do not exist. But why? 16
  • 17. Anatomy of an Accident – Asking Why [Diagram: the chain from initiating event through failed controls and a failed barrier to the undesired outcome, with a tree of questions under each element.] For each event and each failed control or barrier, ask: Why? Was there a condition present? Did a previous event occur? Then ask the same questions again of each answer.
  • 18. Investigating Accidents Often we: Identify the part or individual that failed. Identify the type of failure. Identify the immediate cause of the failure. Stop the analysis. Problem with this approach: The underlying causes may continue to produce similar problems or mishaps in the same or related areas. 18
  • 19. Root Cause Analysis Identify the immediate causes (proximate causes) and the organizational causes using root cause analysis. Describes What Failed Proximate Cause Intermediate Cause Root Cause 19
  • 20. Root Cause Analysis Event and Causal Factor Tree: Shows all the things that did occur. 20
  • 22. Will Your Program Implement The Lessons Learned? • Genesis spacecraft launch, August 8, 2001 • Collect solar wind samples for two years • Returned to Earth on September 8, 2004 • Parachute failed to deploy • Most science was recovered
      Some of the Causes: • Design error - G-Switch inverted (inappropriate confidence in heritage design) • Drawing incorrect • Drawing error not detected: reviews not in depth; testing deleted/modified
  • 23. Will Your Program Implement The Lessons Learned? Canister Ladder Contacted Canister Rotation Facility Door – 2001 • Configuration change – added ladder • Didn’t review or analyze change. • No documentation of clearance height. • Procedures did not require a check of canister stack height and facility clearance prior to move. • No detection during move. 23
  • 24. Will Your Program Implement The Lessons Learned? • Helios Test Flight, June 23, 2003. • High dynamic pressure reached by the aircraft during an unstable pitch oscillation leading to failure of the vehicle’s secondary structure • “Helios suffered from incorrect assessment of risk as the result of inaccurate information provided by the analysis methods and schedule pressures and fiscal constraints resulting from budgetary contraction, constrained test windows and a terminating program. Though the pressures and constraints were not considered unusual, it did have some unquantifiable influence on the decision process.” (MIB report) Summary of Causes of Mishap • Configuration change – 2 fuel cells to 3 fuel cells • Reviews did not identify problem - change in the vehicle’s weight and balance that stability • Lack of adequate analysis methods • Inaccurate risk assessment of the effects of configuration changes • Didn’t do incremental testing after change. • This led to an inappropriate decision to fly an aircraft configuration highly sensitive to disturbances. 24
  • 25. Can You See A Pattern? [Photos: Type B, SLC-2 VAFB, 2006, Finger Amputation In Pulley; Astro E2 (Suzaku), 2005; DART, 2005]
  • 26. What Causes Mishaps?
      NASA
      – 57% of Type A mishaps caused by human error (1996-2005).
      – 78% of the Shuttle ground-support operations incidents resulted from human error (Perry, 1993).
      Outside NASA
      – 75% of all US military aircraft losses involve sensory or cognitive errors (Air Force Safety Center, 2003).
      – 83% of 23,338 accidents involving boilers and pressure vessels were a direct result of human oversight or lack of knowledge (National Board of Boiler and Pressure Vessel Inspectors, 2005).
      – 41% of mishaps at petrochemical plants were caused by human error (R.E. Butikofer, 1986).
      [Pie chart: Proximate Cause of Type A Mishaps, 1996-2005. Segments: Human, Software, Hardware; values shown: 24%, 15%, 61%. *Does not include auto accidents or death by natural causes.]
  • 27. Lessons Learned: What Causes Mishaps? Unsafe acts occur in all programs and phases of the system life cycle. – Specification development/planning – Conceptual design – Product design – Fabrication/production – Operational service – Product decommissioning 70-90% of safety-related decisions in engineering projects are made during early concept development. Decisions made during the design process account for the greatest effect on cost of a product. Design process errors are the root cause of many failures. Human performance during operations & maintenance is also a major contributor to system risk. 27
  • 28. Type A - Payload Mishaps
      HESSI (2000), High Energy Solar Spectroscopic Imager: Subjected to a series of vibration tests as part of its flight certification program…caused significant structural damage. • Misaligned shaker table • No validation test of shaker table • HESSI project not aware of risk posed by test • Sine-burst frequency not in the test plan • Written procedure did not have critical steps
      SOHO (1998), Solar and Heliospheric Observatory: • Made changes to software and procedures • Failed to perform risk analysis of modified procedure set • Ground errors led to the major loss of attitude (omission in the modified predefined command) • Failure to communicate procedure change • Incorrect diagnosis
  • 29. Causes Of Mishaps – Inside NASA
      Design
      – Logic design error existed: design errors in the circuitry were not identified
      – Drawing incorrect
      – System drawings were incorrect because they were not updated when system was moved from its original location to the Center
      – System labels were incorrect
      – System did not have sensors to detect failure
      – Configuration changes driven by programmatic and technological constraints… reduced design robustness and margins of safety
      Reviews
      – Design was not peer reviewed
      – Systems reviews were not conducted
      – Technical reviews failed to detect error in design
      – Red-Team Reviews failed to identify design errors
      Tests
      – Testing only for correct functional behavior … not for anomalous behavior, especially during initial turn-on and power on reset conditions
      – There was no end-to-end test
      – Test procedure did not have a step to verify that all critical steps were performed
      – Lacked a facility validation test
      – Failed to test as you fly…fly as you test
      – Tests were cut because funding was cut
  • 30. Causes Of Mishaps – Inside NASA
      Operations
      – Team error in analysis due to lack of system knowledge. This contributed to the team’s lack of understanding of essential spacecraft design
      – Incorrect diagnosis of problem because the team lacked information about changes in the procedures
      – Emergency step/correction maneuver was not performed
      – Did not follow procedures (led to fatality)
      Communication
      – Inadequate communication between shifts
      – Inadequate communications between project elements
      Paperwork
      – Lacked documentation on system characteristics
      – Processing paperwork and discrepancy disposition paperwork were ambiguous
      – Electronic paperwork system can be edited with no traceability (info was changed and no record of the change was recorded)
      – Written procedures generally did not have full coverage of the pretest setup and post-test teardown phases of the process
      – Procedure did not have mandatory steps
  • 31. Causes Of Mishaps – Inside NASA
      Supervision
      – “Failure to correct known problems” was a supervisory failure to correct similar known problems (Hardware)
      – “Supervisory Violation” was committed by repeatedly waiving required presence of quality assurance and safety and bypassing Government Mandatory Inspection Points
      – Lacked “organizational processes” to effectively monitor, verify, and audit the performance and effectiveness of the processes and activities
      Risk Assessment & Risk Mgmt
      – Did not consider the worst-case effect. Lacked systematic analyses of “what could go wrong”
      – The perception that operations were routine resulted in inadequate attention to risk mitigation
      – The project was not fully aware of the risks associated with the test
      – Lack of adequate analysis methods led to an inaccurate risk assessment of the effects of configuration changes
      Staffing
      – Inadequate operations team staffing
  • 32. What happens when a mishap or close call occurs? 32
  • 33. Immediate Notification Process
      Within 1 Hour
      – Center Safety Office: Notify Headquarters by phone for Type A, Type B, high visibility mishap, or high visibility close call. (This includes reporting a human test subject injury/fatality.) • Duty 202.358.0006 • Non-duty 866.230.6272
      – Center Director: Notify Administrator by phone when the following occur: • Type A • Type B • Type C (lost-time injury only) • Onsite non-occupational fatality (e.g., heart attack) • Fatalities and serious illness off the job (civil servant & contractor)
      Within 8 Hours
      – Notify OSHA (if applicable). Applicable up to 30 days after mishap when: death of federal employee; hospitalization of 3 or more if 1 is a federal employee
      Within 24 Hours
      – Center Safety Office: Notify Headquarters electronically with additional details
      – Center Safety Office: Record the occurrence of ALL mishaps & close calls in Incident Reporting Information System (IRIS)
      – Chief, Safety and Mission Assurance: Notify Administrator and senior staff (phone and/or mishap lists email)
      – Center’s Chief of Aircraft Operations: Notify National Transportation Safety Board (NTSB) if applicable
  • 34. Mishap Investigation Notional Timeline
      – Immediately to 24 hours: Safe Site, Initiate Mishap Preparedness and Contingency Plans, Make Notifications, Classify Mishap
      – Within 48 Hours of Mishap: Appoint Investigating Authority
      – Within 75 Workdays of Mishap: Complete Investigation & Mishap Report
      – Within An Additional 30 Workdays: Review & Endorse Mishap Report
      – Within An Additional 5 Workdays: Approve or Reject Mishap Report
      – Within An Additional 10 Workdays: Authorize Report For Public Release
      – Within An Additional 10 Workdays: Distribute Mishap Report
      Concurrent
      – Within 15 Workdays of Being Tasked: Develop Corrective Action Plan
      – Within 10 Workdays of Being Tasked: Develop Lessons Learned
      Within 145 days of mishap
      [Side labels: Program Manager’s Responsibilities; PM Pays For Mishap Investigation Costs Too]
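The serial steps on the timeline slide can be totaled to see how the workday limits stack up. The sketch below is only an illustration: the step names and durations come from the slide, while the list name `SERIAL_STEPS` and the function `cumulative_workdays` are hypothetical, and the code assumes each "additional" window starts when the previous one ends.

```python
from itertools import accumulate

# Durations in workdays, as listed on the timeline slide.
# Assumption: each "additional" window begins when the previous step ends.
SERIAL_STEPS = [
    ("Complete investigation and mishap report", 75),
    ("Review and endorse mishap report", 30),
    ("Approve or reject mishap report", 5),
    ("Authorize report for public release", 10),
    ("Distribute mishap report", 10),
]


def cumulative_workdays(steps):
    """Return (step name, latest workday after the mishap) for each step."""
    names = [name for name, _ in steps]
    totals = accumulate(days for _, days in steps)
    return list(zip(names, totals))


if __name__ == "__main__":
    for name, deadline in cumulative_workdays(SERIAL_STEPS):
        print(f"workday {deadline:>3}: {name}")
```

Under that assumption the serial chain ends at workday 130. The slide also shows "Within 145 days of mishap"; 130 plus the 15-workday corrective action plan window gives 145, which may be how that figure arises, but the slide does not say so.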
  • 35. Two Types of Mishap Investigations • Safety Mishap Investigation (Per NPR 8621.1: NASA Procedural Requirements for Mishap Reporting, Investigating and Recordkeeping) – Describes policy to report, investigate, and document mishaps, close calls, and previously unidentified serious workplace hazards to prevent recurrence of similar accidents. • Collateral Mishap Investigation (Procedures & requirements being developed by the Office of the General Counsel). – If it is reasonably suspected that a mishap resulted from criminal activity. – If the Agency wants to assess accountability (e.g., determine negligence).
  • 36. Why Should You Investigate Close Calls? • Investigations can identify systemic problems. • Close calls can help you a lot…They tell you where your problems are. • Close calls give you the opportunity to influence your program/project along the way. Require your teams to report and investigate close calls.
  • 37. Preparing For Mishaps Who’s Missing Hardware? July 2006 LaRC Wind Tunnel 37
  • 38. Preparing For Mishaps • Anticipate the failures. • Write the failure report first: if we failed, why would we have failed? Generate your plans accordingly. • A good designer thinks about how someone will use a tool, piece of equipment, or procedure (etc.) and how it will be misused. Think about it early on! Prepare for the misuse. • For critical failures have a Mishap Preparedness and Contingency Plan that covers: Manufacturing Mishaps; Processing Mishaps; Test Mishaps; Transportation Mishaps; Flight Mishaps; Operations Mishaps. What does your Mishap Preparedness and Contingency Plan look like? Successful programs have complete plans.
  • 39. Preparing For Mishaps • Center Mishap Preparedness and Contingency Plan Contents – Local close call and mishap reporting & investigating procedures – Center-specific emergency response – Procedures to appoint an Interim Response Team – Location of space for impounded objects – Mishap process to establish investigating authority and process report (Type C mishaps, Type D mishaps, and close calls) • Program Mishap Preparedness and Contingency Plan Contents – Specific procedures for program emergency response and investigating (e.g., safing procedures, toxic commodities, …) – Names chair and ex-officio for a Type A board. – Procedures to impound data, records, etc… for off-site mishaps – Lists national, state, and local organizations and agencies which are most likely to take part in debris collection – Lists MOUs with international partners and agencies that may support investigation 39
  • 40. For More Information • NASA PBMA Mishap Investigation Website (https://secureworkgroups.grc.nasa.gov/mi) – Includes: • Requirements • Guides and handbooks • Template • Tools and methods • Hard copies of classroom training • Mishap reports • Lessons learned • Conference presentations • HQ Office of Safety & Mission Assurance – Faith.T.Chandler@nasa.gov – 202-358-0411 40
  • 41. Conclusion • “Lots of times we’re lucky or prepared and we dodge the bullet... • But sometimes we endure very public failures, loss of life and significant loss of property... • In every case, we work to prevent failures and ensure success... • And when failures occur, we try to learn from them.” (Tattini… Mars Exploration Rover) • To be successful, we must report and investigate our failures and close calls, identify the underlying root causes, and generate solutions that prevent these systemic problems from creating more failures in our program and in others. 41
  • 43. NASA Mishap And Close Call Classification Levels
      Type A
      – Property damage: Total direct cost of mission failure and property damage is $1,000,000 or more, or crewed aircraft hull loss has occurred, or occurrence of an unexpected aircraft departure from controlled flight (except high performance jet/test aircraft such as F-15, F-16, F/A-18, T-38, and T-34, when engaged in flight test activities).
      – Injury: Occupational injury and/or illness that resulted in a fatality, or a permanent total disability, or the hospitalization for inpatient care of 3 or more people within 30 workdays of the mishap.
      Type B
      – Property damage: Total direct cost of mission failure and property damage of at least $250,000 but less than $1,000,000.
      – Injury: Occupational injury and/or illness has resulted in permanent partial disability, or the hospitalization for inpatient care of 1-2 people within 30 workdays of the mishap.
      Type C
      – Property damage: Total direct cost of mission failure and property damage of at least $25,000 but less than $250,000.
      – Injury: Nonfatal occupational injury or illness that caused any workdays away from work, restricted duty, or transfer to another job beyond the workday or shift on which it occurred.
      Type D
      – Property damage: Total direct cost of mission failure and property damage of at least $1,000 but less than $25,000.
      – Injury: Any nonfatal OSHA recordable occupational injury and/or illness that does not meet the definition of a Type C mishap.
      Close Call
      – Property damage: An event in which there is no equipment/property damage or minor equipment/property damage (less than $1000), but which possesses the potential to cause a mishap.
      – Injury: An event in which there is no injury or only minor injury requiring first aid, but which possesses a potential to cause a mishap.
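The thresholds in the classification table can be read as a simple decision rule. The following is a minimal sketch, not NASA's procedure: the names `Level`, `level_from_cost`, `level_from_injury`, and `classify` are hypothetical, and it assumes that the overall level is the more severe of the property-damage and injury columns, which the slide implies but does not state. It leaves out the aircraft criteria for Type A and cannot judge whether an event "possesses a potential to cause a mishap".

```python
from enum import IntEnum


class Level(IntEnum):
    """Classification levels, ordered so max() picks the more severe one."""
    CLOSE_CALL = 0
    TYPE_D = 1
    TYPE_C = 2
    TYPE_B = 3
    TYPE_A = 4


def level_from_cost(direct_cost_usd: float) -> Level:
    """Map total direct cost (mission failure plus property damage) to a level."""
    if direct_cost_usd >= 1_000_000:
        return Level.TYPE_A
    if direct_cost_usd >= 250_000:
        return Level.TYPE_B
    if direct_cost_usd >= 25_000:
        return Level.TYPE_C
    if direct_cost_usd >= 1_000:
        return Level.TYPE_D
    # Below $1,000: at most a close call (only if it could have caused a mishap).
    return Level.CLOSE_CALL


def level_from_injury(fatality=False, permanent_total_disability=False,
                      permanent_partial_disability=False, hospitalized=0,
                      lost_time=False, osha_recordable=False) -> Level:
    """Map the injury column of the table to a level."""
    if fatality or permanent_total_disability or hospitalized >= 3:
        return Level.TYPE_A
    if permanent_partial_disability or hospitalized >= 1:
        return Level.TYPE_B
    if lost_time:
        return Level.TYPE_C
    if osha_recordable:
        return Level.TYPE_D
    return Level.CLOSE_CALL


def classify(direct_cost_usd: float, **injury) -> Level:
    # Assumption: the overall level is the more severe of the two columns.
    return max(level_from_cost(direct_cost_usd), level_from_injury(**injury))


if __name__ == "__main__":
    print(classify(223_000_000).name)        # cost figure quoted for NOAA N Prime
    print(classify(470_000).name)            # cost figure quoted for the Type B example
    print(classify(500, hospitalized=1).name)
```

Keeping cost and injury in separate functions mirrors the two columns of the table, so either threshold set can be changed without touching the other.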
  • 44. Definitions of RCA & Related Terms
      – Cause (Causal Factor): An event or condition that results in an effect. Anything that shapes or influences the outcome.
      – Proximate Cause(s): The event(s) that occurred, including any condition(s) that existed immediately before the undesired outcome, directly resulted in its occurrence and, if eliminated or modified, would have prevented the undesired outcome. Also known as the direct cause(s).
      – Root Cause(s): One of multiple factors (events, conditions or organizational factors) that contributed to or created the proximate cause and subsequent undesired outcome and, if eliminated or modified, would have prevented the undesired outcome. Typically multiple root causes contribute to an undesired outcome.
      – Root Cause Analysis (RCA): A structured evaluation method that identifies the root causes for an undesired outcome and the actions adequate to prevent recurrence. Root cause analysis should continue until organizational factors have been identified, or until data are exhausted.
      – Event: A real-time occurrence describing one discrete action, typically an error, failure, or malfunction. Examples: pipe broke, power lost, lightning struck, person opened valve, etc…
      – Condition: Any as-found state, whether or not resulting from an event, that may have safety, health, quality, security, operational, or environmental implications.
      – Organizational Factors: Any operational or management structural entity that exerts control over the system at any stage in its life cycle, including but not limited to the system’s concept development, design, fabrication, test, maintenance, operation, and disposal. Examples: resource management (budget, staff, training); policy (content, implementation, verification); and management decisions.
      – Contributing Factor: An event or condition that may have contributed to the occurrence of an undesired outcome but, if eliminated or modified, would not by itself have prevented the occurrence.
      – Barrier: A physical device or an administrative control used to reduce risk of the undesired outcome to an acceptable level. Barriers can provide physical intervention (e.g., a guardrail) or procedural separation in time and space (e.g., lock-out-tag-out procedure).