Effective Reliability Program
  Traits and Management




        Fred Schenkelberg




         Fred Schenkelberg
     Senior Reliability Consultant
        Ops A La Carte, LLC
     990 Richard Ave, Suite 101
       Santa Clara, CA 95050
          fms@opsalacarte.com




        Schenkelberg: page i
SUMMARY & PURPOSE
  The purpose of this tutorial is to highlight key traits for the effective management of a reliability program.
The basic premise is that no single list of reliability activities will work for every product. Every product
development and production team faces a different history, different constraints, and a different set of
variables and uncertainties, so what worked for the last program may or may not be appropriate for the current
project. There are a handful of key traits that separate the valuable programs from the merely busy
programs. These traits and the underlying structure can provide a framework to create a cost-effective and
efficient reliability program.

                                                           Fred Schenkelberg
  Fred Schenkelberg is a reliability engineering and management consultant with Ops A La Carte, with areas
of focus including reliability engineering management training and accelerated life testing. Previously, he
co-founded and built the HP corporate reliability program, including consulting on a broad range of HP
products. He is a lecturer with the University of Maryland teaching a graduate level course on reliability
engineering management. He earned a master of science degree in statistics at Stanford University in 1996.
He earned his bachelor’s degree in physics at the United States Military Academy in 1983. Fred is an active
volunteer: he is the Executive Producer of the American Society for Quality Reliability Division webinar
program, serves on IEEE reliability standards development teams, and was previously a voting member of the
IEC TAG 56 - Durability. He is a Senior Member of ASQ and IEEE. He is an ASQ Certified Quality and Reliability
Engineer.




                                                             Table of Contents
1.    Introduction
2.    Basic Structure
3.    Reliability Goals
4.    Apportionment
5.    Feedback Mechanisms
6.    Determining Value
7.    Maturity Model
8.    Conclusions
9.    References
10.   Tutorial Visuals




1. INTRODUCTION

  A product’s design, supply chain and assembly process in
large part establish the product’s reliability performance. A
product well suited for the use application will meet or exceed
the customer’s durability expectations. The myriad of
decisions by the entire design and production team creates the
eventual product reliability performance. The structure for
these decisions is the focus of this tutorial.
  Considering that each activity of a design team takes
resources such as time and money to accomplish, focusing the
use of these resources on activities of high value is a common
strategy. Including product reliability in the value proposition
permits the entire team to weigh the importance of product
reliability and the appropriate use of tools to accomplish both
the business and product reliability objectives.
  The basic premise of this tutorial is the underlying concept
that no one set of reliability activities is appropriate for every
product development situation. Selecting and integrating the
best tools permits the execution of an effective and efficient
reliability program.
  The traits of very good reliability programs and examples of
very poor practices in this tutorial serve to illustrate how to
approach establishing an effective reliability program.
Highlighting the basic structure along with guidelines on how
to tailor a reliability program will permit the repeatable
creation of reliable products.

Acronyms and Notation
 ALT Accelerated Life Test
 CAD Computer Aided Design
 FMEA Failure Modes and Effects Analysis
 HALT Highly Accelerated Life Test
 LED Light Emitting Diode
 MTBF Mean Time Between Failures
 PoF Physics of Failure
 SPICE Simulation Program with Integrated Circuit Emphasis

2. BASIC STRUCTURE

  A product reliability program is a process. Like any process
it has inputs and outputs, plus generally some form of an
objective and feedback. Furthermore, the process may or may
not be controlled or even a conscious part of the organization.
Reliability may just happen, good or bad. Results may or may
not be known or understood.
  In some organizations, the reliability program may be
highly structured with required activities at each stage along
the product lifecycle. In other organizations, reliability is
considered as a set of tests (e.g. environmental or safety
compliance). And, in some organizations, reliability is
effectively a part of everyone’s role.
  In each example above, the resulting product reliability may
meet the customer’s expectations or not. There isn’t a single
process that will always work.
  Going back to the basic notion of a simple process, consider
the objective for a moment. For a reliability program one may
desire a specific outcome of a reliable product. The process
then should promote activities leading to the creation of a
reliable product. This brings up the question of what is a
‘reliable product’?
  The objective or goal provides the direction and guidance
for the reliability program. Clearly stating the reliability goal
is a key trait of very effective programs. Leaving the goal
unstated or vaguely understood may lead to one or more of
the following:
     • High field failure rate
     • Product recall
     • Overdesigned and expensive product
     • Design team priority confusion
  Another element of a process is feedback. This occurs
within the process as part of the creation of the output, and it
most certainly exists externally based on the output or process
results.
  The final result for product reliability is the customer
acceptance or rejection of the product. If the product
functions longer than expected, like an HP calculator, the
product is considered a ‘good value’. If the product fails
quickly or often, especially compared to other products
providing the same solution, it is considered of ‘poor value’.
  In some organizations the feedback is non-existent, in others
it is captured within a warranty claims system, in others
within service or repair programs. Customers may complain
directly with returned products and demands for
replacements, or indirectly by simply not purchasing the
product in the future.
  The feedback within the reliability program attempts to
anticipate the customer’s feedback prior to the delivery of the
product to the customer. Depending on the product and the
organization, this feedback may be very formally determined,
highly structured and very accurate. Or, the feedback may be
random, haphazard and inaccurate. Both types of feedback
may be suitable, again depending on the product and
organization.
  Establishing the appropriate set of feedback mechanisms
within a reliability program is done within the context of the
product reliability goals and the value to the organization of
the feedback. The process benefits from feedback that is
timely and accurate enough to make decisions. It is those
decisions that lead to the product’s reliability in the hands of
the customers.
  Therefore the basic structure for any reliability program is
to clearly establish and state the reliability goal. Then
determine the appropriate set of feedback mechanisms that
provide timely information to permit design and production
decisions. The ‘how’ to decide the ‘appropriate set’ is the
subject of this tutorial.

3. RELIABILITY GOALS

  The target, objective, mission or goal is the statement that
provides the design team focus and direction. A well stated
goal will establish the business connection to the technical
decisions related to the product durability expectations. A
well stated goal provides clarity across the organization and
permits a common language for discussing design, supply
chain and manufacturing decisions.
  Let’s explore the definition of a ‘well stated reliability
goal’. First, it is not simply MTBF, “as good as or better
than…”, or ‘a 5 year product’. These are common ‘goals’
found across many industries, yet none permit a clear
technical understanding of the durability expectations for the
product.
  The common definition for reliability is

      Reliability is … the ability or capability of the
    product to perform the specified function in the
    designated environment for a minimum length of
    time or minimum number of cycles or events.
                                 (Ireson, Coombs et al. 1995)

 Note this definition has four elements:
     • Function
     • Environment
     • Duration
     • Probability

3.1 Function
  The function is what the product is to do or perform. For
example, an emergency room ventilator is to provide assisted
breathing for a person. This requires the ventilator to produce
breathable air within a range of pressures within a prescribed
cycle of respiration. It may include requirements for filtering,
temperature, and adjustments to pressure and timing of the
cycle, etc. Often, a product development team either develops
or is given a detailed set of functional requirements.
  Often the functional elements of a product are directly
measurable. And, the quality function of most organizations
verifies the design and production units meet the functional
requirements. When the product does not meet the functional
requirements, it is considered a product failure. Within the
function definition, consider which functions are the most
important, which must not fail, and which, if they fail, may
simply degrade performance, if the customer notices at all.

3.2 Environment
  The environment could be considered the weather around
the product when in use. ‘Weather’ includes factors such as
temperature, humidity and UV radiation intensity. It should also
include environmental factors that provide destructive stresses,
such as vibration, moisture, corrosive gases, voltage transients,
and many more.
  Another element of the environment is the use of the
product. What is the use profile? Is it once a day for a few
minutes, like a remote control for a stereo system? Or is it a
24/7 operation, such as a server system processing
transactions for a major online store? The profile may include
details concerning human interactions, operating modes,
shipping, storage, and installation. The environmental
conditions need to detail how the product responds to or
degrades under the set of stresses the product encounters. The
environmental conditions focus on drivers for the product’s
most likely failure mechanisms.

3.3 Duration
  The duration is the amount of time or number of cycles the
product is expected to function. A computer printer may be
expected to print for five years. A washing machine is
expected to wash clothes for 10 years. An implanted hearing
aid is expected to last the life of the patient; if the patient is a
child this expectation may be more than 70 years.
  The duration expectations may be defined by contract,
market expectations, or by a business decision. The duration
or life expectancy most likely is not the warranty period. For
example, many personal computers have a 3-month or 1-year
warranty period. Yet, the product is expected to last at least
two years with normal use.
  Many products have multiple durations that are of interest.
     • Out of box
     • Warranty
     • Design Life
  The initial, out-of-box, or installation period is that
duration when the customer is first setting up and using the
product. Brand visibility is at its highest and the expectation
that a new product will function as expected is very high. The
types of failures that may occur include installation or
configuration errors, mistaken purchase, shipping or
installation damage, or simply buyer error. All of these
‘failures’ cost the company producing the product resources.
  The warranty period is the duration associated with the
producer’s promise to provide a product free of defects for a
stated period of time. For example, a computer may have a
1-year warranty period. During this one year, if the product
fails (usually limited to normal use and operating
environment) the producer will repair or replace the product.
Naturally this will cost the producer resources.
  The design life is the business or market expected product
duration of functional use. After the warranty period there isn’t
an expectation for the producer to replace or repair the
product, yet the customer may have a reasonable expectation
that the product will function satisfactorily over the design
life duration. For example, many cell phones have a 3-month
warranty, yet as consumers we have an expectation that the
phone will function for two years or more.
  Marketing or senior management may set the design life.
They may want to establish a market position for the product
related to reliability. One way is to design a very robust
product with a long design life duration. HP calculators often
have only a 3-month or 1-year warranty, yet many have lasted
10 or more years. These calculators are known for their
robustness and often cost more to purchase: a reliability
premium.
  Each of the three durations often involves different risks
related to the failure mechanisms. It is rare for bearings to
wear out in the first 30 days, yet wear-out is more likely over a
10-year design life. Establishing three or more durations within
the product reliability goal permits the design team to focus on
and address the full range of product reliability risks.

3.4 Probability
  The probability is the likelihood of the product surviving
over a specified period of time. In the formal reliability
definition above, the phrase ‘ability or capability’ refers to the
probability. This is the statistical part of the reliability goal
and without it the goal is fairly meaningless. Furthermore,
stating a probability without an associated duration and
distribution is also meaningless in most cases.
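
  To see why a probability needs both a duration and a distribution, the following sketch assumes a
constant failure rate (an exponential life distribution). That assumption is made here only for
illustration and is not required by the definition above; the function names are hypothetical. It
converts one probability/duration pair into the exponential parameter θ and then evaluates the survival
probability at other durations.

```python
import math


def theta_from_goal(reliability: float, duration: float) -> float:
    """Return theta such that R(duration) = reliability.

    Assumes a constant failure rate, so R(t) = exp(-t / theta).
    """
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must be strictly between 0 and 1")
    if duration <= 0.0:
        raise ValueError("duration must be positive")
    return -duration / math.log(reliability)


def reliability_at(t: float, theta: float) -> float:
    """Survival probability at time t under the exponential assumption."""
    return math.exp(-t / theta)


# Illustrative goal: 95% of units survive 5 years.
theta = theta_from_goal(0.95, 5.0)  # about 97.5 years
for years in (1, 5, 10):
    print(f"R({years} years) = {reliability_at(years, theta):.4f}")
```

  Under this assumption the same product is about 99% reliable at one year and about 90% reliable at
ten years, so quoting "95%" alone says little. A distribution with wear-out behavior would give
different values at those other durations, which is why the goal should state the duration and the
distribution along with the probability.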



What is the chance that a particular product will function as
expected over the entire expected design life? How many of
the installed units will be functional over the warranty
period? Since each product and the associated environmental
stress vary, the use of statistics is unavoidable in describing
product reliability. Even the definition of a product failure
may vary by customer.
  While there are many common terms to convey the
probability of survival, the use of a percentage surviving is
the most easily understood and most easily applied across an
organization. Stating that 95% of units are expected to
survive over the 5-year design life means 95 out of 100 units
will function properly over the 5-year period. Or, that a single
product has a 19 in 20 chance (95%) of surviving 5
years. A similar statement is that not more than 5% of
products fail over the full five years. Or it may be stated as not
more than a 1% failure rate per year.
  A common probability statement is the inverse of the failure
rate, or MTBF. Assuming a constant failure rate, the 95%
reliability over 5 years (t) becomes approximately 100 years
MTBF (θ). This does not mean the product will last 100 years;
it does mean that 95% of the products are expected to last
5 years.
  Finally, stating a separate failure probability for each
duration of interest provides a set of duration/probability
couplets that permit different focus for early or out-of-box
failure risks versus the longer term failure risks.
  If the product has a specific mission time, say an aircraft
with an expected 12-hour mission over a 20-year serviceable
life period, the probability of success for the 12-hour mission
time may be set relatively high. And, it may have a
conditional probability considering the number of missions
since the last major service. Some products have availability
goals and undergo routine maintenance or repair. These
products and many complex systems require additional
complexity in their goal setting. For the purpose of this
discussion, we are considering simple products that are not
normally repaired, or products where the main interest is in
the time to the first failure.
  The point is that setting the reliability goal for a product is
not as simple as stating a ‘five year life’; it requires a clear
statement with sufficient detail of each of the four elements:
function, environment, duration, and probability. And, it may
and often should include at least three duration/probability
couplets. The goal establishes the direction or target for the
entire design, supply chain and manufacturing team.

4. APPORTIONMENT

  The system or product level reliability goal is not sufficient
by itself. Ideally, every component or assembly step that
has a possible impact on the final product reliability should
have an established reliability goal. Each individual element
should have goals that are tailored to that specific element.
For example, a cooling fan that only operates when the
internal temperature reaches a defined value has a different
use profile than the entire system. The function and
environment are different for the specific fan than for the
system. The computer provides a platform for computer
programs to operate along with a user interface, whereas the
fan provides cooling. Many of the environmental factors for
the computer also impact the fan, yet not everything applies
and some will require modification. Sophisticated models
include apportioned goals addressing many functions,
several use profiles, several environments, different
durations, and conditional probabilities. Simple models work
to get started; as more details become available concerning
the design and use, sophisticated models are increasingly
useful.
  The duration may also require modification. The durations
are most often the same as at the system level and may require
modification if the various components or subsystems are
only employed during specific phases of the product’s use, e.g.
an installation and configuration aid.
  The probability will require modification unless the product
has no component or subsystem elements. This is rare except
for raw materials. Even a simple discrete resistor has multiple
components that may have different failure mechanisms. For
example, the resistive element and the soldering leads have
different functional descriptions, are made of different
materials, and experience different sets of stresses that lead to
failure. The probability of failure is not the same as for the
system.
  Another way to look at the probability differences breaks
down the system probability of success to each element within
the product. Consider a simple system with two primary means
to fail (say the resistor, with the resistive and connection
elements, as an example for discussion) that has a 90%
probability of successfully functioning over 20 years. If both
of the elements also have a 90% probability, and either the
resistive or connection element causes a system failure, then
either the system or subsystem goals are misstated, since
0.90 × 0.90 = 0.81. As you already know, for a simple series
system the probability of success for the subsystems has to be
larger such that when they are multiplied together the result
meets or exceeds the system goal.
  There are excellent references for basic reliability modeling
and many papers and forums to discuss even the most
complex systems. The intention is to apportion the system
reliability goal, especially the probability value, to all major
elements of the product.

4.1 Establishing the probability apportionment
  The time to establish the reliability apportionment is early
in the project. Depending on the project and the known
values from field data, vendors, previous projects, etc., the
apportionment may be well founded on data, or simply a
guess. Both are valuable.
  Consider a simple example of a computer system with five
major subsystems: motherboard, disk drive, monitor, power
supply and keyboard. Of course there are other elements, yet
for this example we are limiting the list to these five.
  Suppose this is our first product and little is known about the
reliability of any of these components (for example, when
designing the first personal computers in the 1980s). Further,
let’s assume the system goal is 95% reliable over a 5-year
period for the design life. Having no other information, a
straight-line apportionment is as good a starting place as any.
Therefore, each of the five subsystems receives an
apportionment goal of 99% reliable over 5 years (the fifth root
of 0.95 is about 0.99). Also, the functional and environmental
elements receive attention to adjust to each subsystem’s
particular requirements.
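
  The series-system arithmetic behind the resistor and five-subsystem examples can be sketched in a
few lines. This is a minimal illustration, assuming independent elements in series where any element
failure fails the system; the function names are hypothetical and not taken from any library.

```python
import math


def series_reliability(element_reliabilities):
    """System reliability of independent elements in series (product rule)."""
    return math.prod(element_reliabilities)


def equal_apportionment(system_goal: float, n_elements: int) -> float:
    """Equal ("straight-line") share: each element gets goal ** (1 / n)."""
    if not 0.0 < system_goal <= 1.0:
        raise ValueError("system_goal must be in (0, 1]")
    if n_elements < 1:
        raise ValueError("n_elements must be at least 1")
    return system_goal ** (1.0 / n_elements)


# Resistor example: two elements at 90% each cannot deliver a 90% system.
resistor = series_reliability([0.90, 0.90])
print(f"two elements at 90%: system = {resistor:.2f}")

# Five-subsystem computer with a 95% goal over the 5-year design life.
subsystems = ["motherboard", "disk drive", "monitor", "power supply", "keyboard"]
share = equal_apportionment(0.95, len(subsystems))
print(f"equal share per subsystem = {share:.4f}")

# Rounding each share up to 99% still meets the system goal.
rounded = series_reliability([0.99] * len(subsystems))
print(f"five subsystems at 99%: system = {rounded:.4f}, meets goal: {rounded >= 0.95}")
```

  The exact equal share is about 98.98%, which rounds to the 99% used above; five subsystems at 99%
give about 95.1%, just above the system goal.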


At first, this simple method provides a starting point for the     The primary intent of using reliability goals and
team’s discussion concerning reliability. It provides the basis    apportionment is to permit meaningful decisions concerning
for product design, part procurement, validation and               reliability along with the ability to consider product cost and
verification testing, and the myriad of cost/benefit tradeoff      other important aspects of the design in a meaningful
decisions required during the product lifecycle.                   manner.
  Over time, years of field data, vendor data and internal
product testing continue to improve the understanding of                           5. FEEDBACK MECHANISMS
each subsystem’s reliability. This understanding becomes the
base for the initial apportionment estimates for a new               There are two basic questions in reliability engineering.
product. Consider a new project for a personal computer            What is going to fail? And, when will the product fail? Both
where only the CPU and associated chipset are new. The             are related to failure mechanisms. The first may require the
overall apportionment model may start with the best available      discovery of the failure mechanism. The second may require
reliability values for all the subsystems and include an           the determination of the expected behavior of the failure
adjustment to the motherboard value considering the                mechanism over time. Both questions have a wide range of
uncertainty or estimated value change regarding the new            tools available to find the answers. It is the selection of the
CPU chipset. The uncertainty is relatively low, so the model       right tools to provide a good enough answer in an effective
can support even subtle design decisions.                          and efficient manner that is the subject of this section.
                                                                     Each engineer tends to design away from failure (Petroski
4.2 Adjusting the probability apportionment                        1994). And, each engineer generally knows about the most
                                                                   likely failure mechanisms related to their section of the
  Going back to the first personal computer design and
                                                                   design, within the realm of their experience. They may gain
simple straight-line apportionment, a little common sense
                                                                   additional experience as their design fails in unexpected (to
and feedback from vendors may provide additional
                                                                   them) ways. Part of the design process is to uncover failures
information. The keyboard is most likely more reliable than
                                                                   and improve the design to avoid or lessen the probability of
the power supply, for example. Adjusting the goal for the
                                                                   the same.
power supply down, say to 98%, then requires an adjustment
                                                                     Tools such as FMEA and HALT permit the design team to
in one or more of the other subsystems such that the product
                                                                   discover failures. Often the FMEA session permits the design
remains at or above the system goal of 95%. The same rule
                                                                   team to share the known or expected failure mechanisms.
applies for any other series system of apportionment.
                                                                   Occasionally, a new possible failure mode appears in these
  Another consideration for the apportionment adjustment is
                                                                   sessions. The real value is in improving the ability of the
the cost/benefit tradeoff. For nearly any development project
                                                                   entire team to identify unknown failures and address the
there is a limit to product cost, therefore simply purchasing
                                                                   effects of the known expected failures. Each person on the
the most expensive components, which may or may not be the
                                                                   FMEA team brings a set of known or expected failures to the
most reliable, is not always an option. Back to the power
                                                                   discussion. The combined set increases the entire team’s
supply example above. Let’s say the vendor of the initially
                                                                   awareness of the larger set of possible issues.
selected power supply considers the use, environment and
                                                                     HALT, in the broadest sense, is started with the first product
functional requirements and states that the power supply will
                                                                   models or bench top testing. Exploring the reaction of the
have a 95% probability of success over 5 years. That is the
                                                                   product to various stimulations is an exploration of where the
same value as the overall system goal, and unless all the other
                                                                   product works by defining where it doesn’t work. The
subsystems are perfect (100% reliable over 5 years) the design
                                                                   intention of HALT is to apply stresses relevant to the
team will not achieved the reliability objective.
                                                                   product’s environment (vibration, voltage, temperature, usage
  A search reveals three alternative power supplies that will
                                                                   rates, etc.) and determine the boundary between functional
meet the functional requirements. One has a 97% reliability
                                                                   and not functional behavior. With careful root cause analysis,
at a cost of $50, the second has a 98% reliability at a cost of
                                                                   then uncover and understand the failures, enabling the design
$100 and the third has a 99% reliability at a cost of $250.
                                                                   to adjust to create a more robust product.
  If product cost is not an issue (rarely the case) spend the
                                                                     Common engineering tools also permit this discovery.
$250 and achieve the apportioned objective. If it is possible to
                                                                   Many CAD programs include basic finite element analysis
improve the reliability of other subsystems, say the monitor,
                                                                   capabilities. Adjusting material properties to reflect the
for less cost, to offset the difference between the 99% goal
                                                                   effects of aging (i.e. oxidation of polymers making them more
and 98% or 97% reliability associated with less expensive
                                                                   brittle) and performing a simple analysis may find aging
power supplies, than that would provide the highest reliability
                                                                   weaknesses in the design. The same applies for SPICE
for the least cost. This is a simple illustration of the
                                                                   models of circuits. Consider the expected drift of capacitor
cost/benefit tradeoff; in practice these may become very
                                                                   values over time and the continued functionality of the
complicated decisions.
                                                                   circuit.
  An advanced practice is to establish reliability goals and
                                                                     If the product is new or contains new technology or
associated apportionment for the various stage gates during
                                                                   assembly processes, the nature of the failures may not be well
the product lifecycle. With each successive round of design,
                                                                   understood. FMEA and HALT and related discovery tools
prototyping, and analysis not only is the product improving,
                                                                   apply. If the project is to refine an existing product and there
but the uncertainty is also diminishing. Using the lower limit
                                                                   is ample internal and field data defining the areas for
for reliability estimates is one way to reflect the range of
                                                                   improvements, then the discovery tools do not add value.
reliability uncertainty.
                                                                     The first question looks for what will fail. If the failures are
                                                                   known or the various tools help determine what will fail, the


product reliability can be improved by addressing those
aspects of the product that lead to the failures. One approach
to product design is: build, test, fix, repeat. That is, find and
fix the first element of a design to fail and the product
improves. Continue to do so until there are no more failures or
the design reaches the limits of the materials (for example,
the first failure occurs as the polymer case melts).
  The primary drawback to this approach is the inability to
quantify product reliability, that is, how many units will last
how long. Understanding what will fail is critical to being
able to answer the second question: when will it fail?
  As the design team addresses the design issues, the second
question enables them to know if they have achieved the
product reliability goal. As with discovery tools, there are
many tools available to determine how long a product will
last. Predictions, accelerated life testing and demonstration
tests are all capable of providing an estimate of how long a
product may last.
  Deterministic models may also provide results. For
example, the polymer diffusion rate permits air to accumulate
within a tube, which at a critical air volume will block fluid
flow. This process can be modeled and the time to failure
calculated for different wall thicknesses and air pressures.
Field data is often the most accurate way to estimate actual
field performance, although it is usually not available for new
products or elements of new products.
  To illustrate how to select the appropriate tools to provide
feedback to the design team, let’s consider a few cases. Keep
in mind that not all tools are appropriate for all situations.

5.1 The existing technology in new environment case

  To illustrate the existing technology in new environment
case, consider the initial design of an LED brake light. This
is new technology with respect to the application of the LED
to the car taillight environment. While LED lighting has been
available in a range of applications for some time, the car
taillight environment is harsher and more demanding than
previous application environments. Consider just the ambient
temperature extremes, from overnight outdoors in Fargo, ND
(-30°F) to direct sunlight exposure within an unventilated
enclosure in the Tucson, AZ summer (180°F). In addition, a new
assembly process to attach dozens of LED elements to a brake
light pattern frame in a high-volume mass-production
assembly line will be required.
  There is no history, no previous products on the market
using LEDs in anything like the brake light environment.
What could possibly go wrong? The design team doesn’t
know what could go wrong. Therefore, the appropriate set of
tools should first discover the most likely failure mechanisms.
FMEA and HALT both apply, for example. Both of these
tools can build on what is already known about LED
operations and known failure mechanisms. The new
environment may accelerate some little known failure
mechanisms, or it may simply accelerate already well known
mechanisms.
  Once the failure mechanisms are known, the requirement for
the new brake lights is to last twice as long as the current
incandescent systems, or a 95% probability of lasting 10
years. Simply finding the failures, surprising failure modes or
not, permits the evaluation of how long they will last in the
expected new environment. Tools such as ALT may apply.
Thermal cycling for the solder joint attachments and high
temperature exposure while illuminated to evaluate the
luminosity degradation are two examples of what could be
usefully tested.
  The results of the discovery evaluations, along with
engineering judgments concerning the uncertainty of failure
mechanism behaviors, will prioritize the list of most likely
failure mechanisms. This list can then be sorted by
appropriate stress to design accelerated life tests. More than
one failure mechanism may be accelerated by high
temperature exposure, for instance.
  The reliability program most likely will have goal setting,
apportionment, initial reliability predictions based on
literature and vendor data, prototype testing of various solder
joint attachment methodologies, and product level accelerated
life testing focused on 3 to 5 different stresses.
  This approach takes advantage of existing knowledge
concerning LED technology and previous explorations of
failure mechanisms within LED technology and solder
attachment methods. The approach also considers whether any
new failure modes may appear in the new, harsher environment.
The approach also considers the relatively low cost of the
individual units and the ability to quickly measure the
product performance by the use of ALT. The initial risk is
high for the new environment, and if the LEDs actually last
longer than twice the expected life of the incandescent
systems the product provides a cost savings to the car
manufacturer and owner.

5.2 The low volume high cost case

  In comparison to the first case, consider a product that has a
very limited production volume, say 50 total units. In addition,
each unit is very expensive, say $1 million. Running 30 units
each in three different ALTs to failure is not viable. Even
getting one full unit for destructive HALT testing is not
likely. Yet all the same unknowns as above, or more, may apply.
  Consider an oil exploration sensor array unit that attaches
to the drill string during drilling and has the function to
monitor and report the presence of specific types of
hydrocarbons. This is a complex system in a very harsh
environment.
  The list of what and how the product could fail is quite
long. Given the constraint of no system level units for product
testing, only a few of the tools from the first case apply: goal
setting, apportionment, prediction and FMEA. The FMEA is
a discovery tool and will not provide the necessary feedback
on the product’s expected durability. Thus the onus is on
performing accurate predictions.
  In this case, the use of Physics of Failure (PoF) modeling
may be the most valuable tool available. Understanding the
relationship between the expected stresses and the component
level responses over time permits the PoF models to predict
the system life. The development of the PoF models related to
the critical component failure mechanisms may take
significant work, yet the option to test multiple units is not
viable. Therefore, the analytical and theoretical work permits
the team to receive feedback on the expected product
weaknesses and expected life limiting failure mechanisms.
Even determining the critical component failure mechanisms
may be difficult.
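  As a rough illustration of how a physics-based relationship can
turn component-level stress data into a life estimate, the sketch
below uses the Arrhenius model for a thermally driven mechanism and
then combines component estimates as a series system. The tutorial
does not prescribe a particular PoF model; the Arrhenius choice, the
function names, the activation energy, the temperatures and the
component reliabilities are all hypothetical and chosen only for
illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant, eV per kelvin


def arrhenius_acceleration(activation_energy_ev, use_temp_c, stress_temp_c):
    """Acceleration factor of a stress temperature relative to a use temperature.

    Assumes a single thermally activated failure mechanism (an assumption
    made for this sketch, not something stated in the tutorial).
    """
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def series_reliability(component_reliabilities):
    """Reliability of a series system: every component must survive."""
    system = 1.0
    for reliability in component_reliabilities:
        system *= reliability
    return system


if __name__ == "__main__":
    # Hypothetical inputs: 0.7 eV mechanism, 55 C in use, 125 C in component test.
    factor = arrhenius_acceleration(0.7, 55.0, 125.0)
    test_hours_to_failure = 2000.0  # hypothetical component test result
    print(f"acceleration factor: {factor:.1f}")
    print(f"projected use life: {factor * test_hours_to_failure:.0f} hours")

    # Hypothetical mission reliabilities for three critical components.
    print(f"system reliability: {series_reliability([0.99, 0.995, 0.98]):.4f}")
```

  Keeping the stress model and the system roll-up as separate
functions mirrors the idea in the text: component-level responses are
modeled individually, then combined to estimate system life.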
  This approach takes advantage of the existing literature
detailing failure mechanisms for a wide range of components,
plus the ability to evaluate individual components at much
less expense than the full system. The approach has more risk
in the identification of the unknown failure mechanisms
related to the full system configuration and use, yet careful
use of tools like FMEA and reliability modeling permits the
team to mitigate this risk to some extent.

5.3 The moderate volume product family variation case

  A common case is the modification of an existing product.
There is field data, the previous product testing information is
available and the list of known failure mechanisms is well
defined. Furthermore, the product functions, intended
environment and use profile remain basically the same.
  In some regards, this is more difficult than the previous two
cases. One approach would be to only test the new product
with respect to the changes, and possibly only evaluate the
individual new components with the justification that nothing
else has changed.
  The second possible approach is to repeat all of the
evaluations and testing as done for the original product. Here
the justification may be that the relatively minor changes may
adversely impact existing elements of the product. Or, worse,
the justification could be a ‘we always do the full set of
testing’ mentality.
  Both approaches have risks and costs that can and should
be mitigated. Using the existing reliability models and best
available data, the design team can isolate the changes and
assign a range of predicted values to the new component
reliability. In conjunction with that they can perform a very
focused Design FMEA on the changes with an emphasis on
how the changes impact any other element within the design.
  At this point, the design team can decide if the uncertainty
concerning either the interaction effects or the expected life
warrants further testing. If no value within the range of
reliability uncertainty would preclude the product launch,
then no further testing is needed and the current prediction
is sufficient. If the low end of the range, on the other hand,
would require further reliability improvements, or if the
impact of the changes on other aspects of the product is
unclear, then further analysis and testing will be needed.
  The appropriate approach considers what is known and
unknown, and the associated risks and decision points. The
intent is to provide both guidance and feedback to the design
team that permits well informed decisions. Using too few or
too many reliability tools may incur undue risks or costs. The
well crafted reliability program carefully considers how each
reliability activity provides feedback toward answering the
two primary questions: what will fail and when will the
product fail. The intent is to add value to the product.

                  6. DETERMINING VALUE

  One way to select reliability tools for improving the
product’s reliability is to consider the return on investment. If
the activity will not reduce risk, increase durability, reduce
engineering time, eliminate failure mechanisms, etc., then
the activity should not occur.
  Consider a simple example. Recall the computer with five
subsystems from the apportionment discussion above. The
initial goal for each subsystem was 99% over 5 years.
  Suppose after the first round of predictions we find the
keyboard has an expected reliability of 99.9% over 5 years.
Furthermore, let’s assume the remaining four subsystems all
meet their goal at 99%, and that it is possible to improve the
reliability of the keyboard to 99.99% by spending $1 more per
keyboard. Let’s also assume it will cost the company $1,000
per field failure for any cause, and that we expect to build
and sell 100,000 computers.
  For the current keyboard, we expect 100,000 - (0.999 *
100,000) = 100 keyboard failures. These will cost the
company 100 * $1,000 = $100,000.
  For the new keyboard the added cost will be 100,000 * $1 =
$100,000. The savings will be due to reducing the field
failures, from 100 failures to 100,000 - (0.9999 * 100,000) =
10 failures. The new keyboard’s 10 failures cost $10,000. This
is down from $100,000, for a savings of $90,000.
  For a savings of $90,000 we spent $100,000, which may make
it difficult to justify the change. The calculations might be
more favorable if, for the same cost of change, a difference in
reliability from 99% to 99.5% could be made in the power
supply. For any proposed change that impacts the reliability
apportionment model, the above calculation quickly
illustrates the value.
  Yet, not all of the reliability tools directly increase or
decrease the expected reliability. In some cases, the tool
might only shorten the time to detect the failure mechanisms.
HALT is an example of this: it often finds most of the
failure mechanisms in a design within a week, which would
normally take months of standards based environmental
testing to uncover. The savings in time to market risk more
than justifies the necessity of making multiple trips to the
HALT chamber.
  Another cost saving is the reduction in uncertainty. By
simply improving the accuracy of reliability predictions the
range of the estimated reliability diminishes. Once the range
no longer crosses a decision threshold to conduct either
further analysis or testing, the project resources can focus
improvement efforts on other high uncertainty or low
reliability elements.

                  7. MATURITY MODEL

  The state of the organization is also important. A design
team that has no experience or expertise in statistical methods
will probably flounder when trying to use an event-conditional
based reliability block diagram that requires advanced
statistical modeling. Getting this team to simply use a Weibull
cumulative distribution plot may be a stretch, yet provide more
value initially.
  Each organization has a set of skills, expectations,
structures, etc. that defines the culture concerning product
reliability. Reliability tools designed and applied to make an
impact within the organization should fit within or be close to
the organization’s current capabilities. The tools will only
have impact and be useful if they are understood and make
the current situation better. For example, a team that is
consistently surprised by field failure modes may immediately
benefit by conducting HALT testing to discover failure
mechanisms before their customers do so for them.
  Phil Crosby in his book Quality is Free (Crosby 1979)
created a maturity matrix focused on quality. With slight
modification, by substituting reliability for quality, the same
basic table is meaningful for the assessment of an
organization’s reliability program.
  The primary difference that separates an effective reliability
program from an ineffective one is the proactive nature of
the program. On one occasion I conducted assessments of two
organizations located in the same building. Both designed
and manufactured telecommunication equipment with similar
complexity and volume. The interview schedule had me
going up and down stairs almost every hour for two days, and
by midday of the first day I enjoyed going upstairs and
dreaded heading down. Despite all the product and business
similarities the two reliability programs were dramatically
different, as different as their reliability results.
  Downstairs the interviews started late and got interrupted by
urgent phone calls or in-person requests: firefighting at its
best. The team employed a wide range of tools, all that were
listed on a checklist, for each project. The reliability goals
were not known to most of the design team, and the few who did
know them also understood that they would not be measured, nor
would a failure to meet those goals impede getting the
product to market. The people I talked to stated reliability
was very important and were very busy fixing field failures or
testing identified issues (just before product launch).
Reliability was done by “the guy that left last year”.
  Upstairs the interviews started on time and proceeded
without interruption. No one remembered the last time there
had been an urgent need to resolve a field issue. The team
employed reliability tools that would benefit the project as
needed. The specific testing that was done was tailored to the
risks identified during the design phase. The goals were
widely known and their current status was also known, both
during development and after product launch. The people I
talked to stated reliability was very important and they knew
what to do to meet their reliability objectives. The team’s
manager taught reliability thinking and skills, and everyone
did reliability.
  For both organizations the basic structure and thought
process to determine which reliability tools to use would
apply, but because of their different stages of development
different tools would be needed. The downstairs, stage II,
organization would require coaching, training, and resources
to break the cycle of letting surprising field failures dominate
the engineering day. The upstairs, stage IV, organization
might be ready for advanced tools related to product
modeling or field data analysis. They would have the time to
learn advanced accelerated degradation testing methods, for
example.

                  8. CONCLUSIONS

  In summary, the traits of effective reliability programs
include the ability to:
      • State clear reliability goals;
      • Enable tradeoff decision-making;
      • Selectively use only value-added reliability
        activities;
      • Promote a proactive reliability culture.
  The basic message is that no one list or standard of tasks
makes an effective reliability program. The selection of
valuable tools and the establishment of a basic structure for
decision-making permit an organization to achieve the
desired reliability objectives.

                  9. REFERENCES

Crosby, P. B. (1979). Quality is Free: The Art of Making
Quality Certain. New York, Signet.

Ireson, W. G., C. F. Coombs, et al. (1995). Handbook of
Reliability Engineering and Management. New York,
McGraw-Hill.

Petroski, H. (1994). Design Paradigms: Case Histories of
Error and Judgment in Engineering. Cambridge, Cambridge
University Press.
                                                    Schenkelberg: page 7

Más contenido relacionado

La actualidad más candente

Ability Maturity Matrix PowerPoint Presentation Slides
Ability Maturity Matrix PowerPoint Presentation SlidesAbility Maturity Matrix PowerPoint Presentation Slides
Ability Maturity Matrix PowerPoint Presentation SlidesSlideTeam
 
Protorative Methodology
Protorative MethodologyProtorative Methodology
Protorative MethodologyYashpal Jain
 
Erp implementation approaches.
Erp implementation approaches.Erp implementation approaches.
Erp implementation approaches.Bondrulz Ubale
 
Presentation by somdatta banerjee
Presentation by somdatta banerjeePresentation by somdatta banerjee
Presentation by somdatta banerjeePMI_IREP_TP
 
Ray Business Technologies Process Methodology
Ray Business Technologies Process MethodologyRay Business Technologies Process Methodology
Ray Business Technologies Process Methodologyray biztech
 
CMMI-DEV 1.3 Tool (checklist)
CMMI-DEV 1.3 Tool (checklist)CMMI-DEV 1.3 Tool (checklist)
CMMI-DEV 1.3 Tool (checklist)Robert Levy
 
Bpm10gperformancetuning 476208
Bpm10gperformancetuning 476208Bpm10gperformancetuning 476208
Bpm10gperformancetuning 476208Vibhor Rastogi
 
St Final Hsiq Questcon Sales Presentation 092006
St Final Hsiq Questcon Sales Presentation 092006St Final Hsiq Questcon Sales Presentation 092006
St Final Hsiq Questcon Sales Presentation 092006anjuabel
 
Agile adoption julen c. mohanty
Agile adoption   julen c. mohantyAgile adoption   julen c. mohanty
Agile adoption julen c. mohantyJulen Mohanty
 
Agile in an ANSI-748-C environment
Agile in an ANSI-748-C environmentAgile in an ANSI-748-C environment
Agile in an ANSI-748-C environmentGlen Alleman
 
Bsa 385 week 5 team assignment smith software testing environment
Bsa 385 week 5 team assignment smith software testing environmentBsa 385 week 5 team assignment smith software testing environment
Bsa 385 week 5 team assignment smith software testing environmentThomas Charles Mack (Leigh)
 
Essential building blocks of a lean and efficient test process
Essential building blocks of a lean and efficient test processEssential building blocks of a lean and efficient test process
Essential building blocks of a lean and efficient test processMaveric Systems
 
Establishing the Performance Measurement Baseline
Establishing the Performance Measurement BaselineEstablishing the Performance Measurement Baseline
Establishing the Performance Measurement BaselineGlen Alleman
 
Productivity measurement of agile teams (IWSM 2015)
Productivity measurement of agile teams (IWSM 2015)Productivity measurement of agile teams (IWSM 2015)
Productivity measurement of agile teams (IWSM 2015)Harold van Heeringen
 
Integrated methodology for testing and quality management.
Integrated methodology for testing and quality management.Integrated methodology for testing and quality management.
Integrated methodology for testing and quality management.Mindtree Ltd.
 
ERP Implementation Life Cycle
ERP Implementation Life CycleERP Implementation Life Cycle
ERP Implementation Life CycleApurv Gourav
 
Outsourcing product development introduction
Outsourcing product development introductionOutsourcing product development introduction
Outsourcing product development introductionsuryauk
 

La actualidad más candente (20)

Ability Maturity Matrix PowerPoint Presentation Slides
Ability Maturity Matrix PowerPoint Presentation SlidesAbility Maturity Matrix PowerPoint Presentation Slides
Ability Maturity Matrix PowerPoint Presentation Slides
 
Ch23 project planning
Ch23 project planningCh23 project planning
Ch23 project planning
 
Protorative Methodology
Protorative MethodologyProtorative Methodology
Protorative Methodology
 
Erp implementation approaches.
Erp implementation approaches.Erp implementation approaches.
Erp implementation approaches.
 
Presentation by somdatta banerjee
Presentation by somdatta banerjeePresentation by somdatta banerjee
Presentation by somdatta banerjee
 
Ray Business Technologies Process Methodology
Ray Business Technologies Process MethodologyRay Business Technologies Process Methodology
Ray Business Technologies Process Methodology
 
CMMI-DEV 1.3 Tool (checklist)
CMMI-DEV 1.3 Tool (checklist)CMMI-DEV 1.3 Tool (checklist)
CMMI-DEV 1.3 Tool (checklist)
 
Right sourcing
Right sourcingRight sourcing
Right sourcing
 
Bpm10gperformancetuning 476208
Bpm10gperformancetuning 476208Bpm10gperformancetuning 476208
Bpm10gperformancetuning 476208
 
St Final Hsiq Questcon Sales Presentation 092006
St Final Hsiq Questcon Sales Presentation 092006St Final Hsiq Questcon Sales Presentation 092006
St Final Hsiq Questcon Sales Presentation 092006
 
Agile adoption julen c. mohanty
Agile adoption   julen c. mohantyAgile adoption   julen c. mohanty
Agile adoption julen c. mohanty
 
Agile in an ANSI-748-C environment
Agile in an ANSI-748-C environmentAgile in an ANSI-748-C environment
Agile in an ANSI-748-C environment
 
Bsa 385 week 5 team assignment smith software testing environment
Bsa 385 week 5 team assignment smith software testing environmentBsa 385 week 5 team assignment smith software testing environment
Bsa 385 week 5 team assignment smith software testing environment
 
Essential building blocks of a lean and efficient test process
Essential building blocks of a lean and efficient test processEssential building blocks of a lean and efficient test process
Essential building blocks of a lean and efficient test process
 
Establishing the Performance Measurement Baseline
Establishing the Performance Measurement BaselineEstablishing the Performance Measurement Baseline
Establishing the Performance Measurement Baseline
 
Productivity measurement of agile teams (IWSM 2015)
Productivity measurement of agile teams (IWSM 2015)Productivity measurement of agile teams (IWSM 2015)
Productivity measurement of agile teams (IWSM 2015)
 
PROACTVE
PROACTVEPROACTVE
PROACTVE
 
Integrated methodology for testing and quality management.
Integrated methodology for testing and quality management.Integrated methodology for testing and quality management.
Integrated methodology for testing and quality management.
 
ERP Implementation Life Cycle
ERP Implementation Life CycleERP Implementation Life Cycle
ERP Implementation Life Cycle
 
Outsourcing product development introduction
Outsourcing product development introductionOutsourcing product development introduction
Outsourcing product development introduction
 

Tutorial on Effective Reliability Program Traits and Management

  • 1. Effective Reliability Program Traits and Management

Fred Schenkelberg
Senior Reliability Consultant
Ops A La Carte, LLC
990 Richard Ave, Suite 101
Santa Clara, CA 95050
fms@opsalacarte.com

Schenkelberg: page i
  • 2. SUMMARY & PURPOSE

The purpose of this tutorial is to highlight key traits for the effective management of a reliability program. The basic premise is that no single list of reliability activities will work for every product. Every product development and production team faces a different history, different constraints, and a different set of variables and uncertainties, so what worked for the last program may or may not be appropriate for the current project. There are a handful of key traits that separate the valuable programs from the merely busy programs. These traits and the underlying structure can provide a framework to create a cost-effective and efficient reliability program.

Fred Schenkelberg

Fred Schenkelberg is a reliability engineering and management consultant with Ops A La Carte, with areas of focus including reliability engineering management training and accelerated life testing. Previously, he co-founded and built the HP corporate reliability program, including consulting on a broad range of HP products. He is a lecturer with the University of Maryland, teaching a graduate-level course on reliability engineering management. He earned a master of science degree in statistics at Stanford University in 1996. He earned his bachelor's degree in Physics at the United States Military Academy in 1983. Fred is an active volunteer: he is the Executive Producer of the American Society for Quality Reliability Division webinar program, serves on IEEE reliability standards development teams, and was previously a voting member of the IEC TAG 56 - Durability. He is a Senior Member of ASQ and IEEE. He is an ASQ Certified Quality and Reliability Engineer.

Table of Contents

1. Introduction
2. Basic Structure
3. Reliability Goals
4. Apportionment
5. Feedback Mechanism
6. Determining Value
7. Maturity Model
8. Conclusions
9. References
10. Tutorial Visuals
  • 3. 1. INTRODUCTION

A product’s design, supply chain and assembly process in large part establish the product’s reliability performance. A product well suited for the use application will meet or exceed the customer’s durability expectations. The myriad of decisions by the entire design and production team creates the eventual product reliability performance. The structure for these decisions is the focus of this tutorial.

Considering that each activity of a design team takes resources such as time and money to accomplish, focusing the use of these resources on activities of high value is a common strategy. Including product reliability in the value proposition permits the entire team to weigh the importance of product reliability and the appropriate use of tools to accomplish both the business and product reliability objectives.

The basic premise of this tutorial is the underlying concept that no one set of reliability activities is appropriate for every product development situation. Selecting and integrating the best tools permits the execution of an effective and efficient reliability program.

The traits of very good reliability programs and examples of very poor practices in this tutorial serve to illustrate how to approach establishing an effective reliability program. Highlighting the basic structure along with guidelines on how to tailor a reliability program will permit the repeatable creation of reliable products.

Acronyms and Notation

ALT     Accelerated Life Test
CAD     Computer Aided Design
FMEA    Failure Modes and Effects Analysis
HALT    Highly Accelerated Life Test
LED     Light Emitting Diode
MTBF    Mean Time Between Failures
PoF     Physics of Failure
SPICE   Simulation Program with Integrated Circuit Emphasis

2. BASIC STRUCTURE

A product reliability program is a process. Like any process it has inputs and outputs, plus generally some form of an objective and feedback. Furthermore, the process may or may not be controlled or even a conscious part of the organization. Reliability may just happen, good or bad. Results may or may not be known or understood.

In some organizations, the reliability program may be highly structured with required activities at each stage along the product lifecycle. In other organizations, reliability is considered as a set of tests (e.g. environmental or safety compliance). And, in some organizations, reliability is effectively a part of everyone’s role.

In each example above, the resulting product reliability may meet the customer’s expectations or not. There isn’t a single process that will always work.

Going back to the basic notion of a simple process, consider the objective for a moment. For a reliability program one may desire a specific outcome of a reliable product. The process then should promote activities leading to the creation of a reliable product. This brings up the question of what is a ‘reliable product’?

The objective or goal provides the direction and guidance for the reliability program. Clearly stating the reliability goal is a key trait of very effective programs. Leaving the goal unstated or vaguely understood may lead to one or more of the following:

• High field failure rate
• Product recall
• Overdesigned and expensive product
• Design team priority confusion

Another element of a process is feedback. This occurs within the process as part of the creation of the output, and it most certainly exists externally based on the output or process results.

The final result for product reliability is the customer acceptance or rejection of the product. If the product functions longer than expected, like an HP calculator, the product is considered a ‘good value’. If the product fails quickly or often, especially compared to other products providing the same solution, it is considered of ‘poor value’.

In some organizations the feedback is non-existent, in others it is captured within a warranty claims system, in others within service or repair programs. Customers may complain directly with returned products and demands for replacements, or indirectly by simply not purchasing the product in the future.

The feedback within the reliability program attempts to anticipate the customer’s feedback prior to the delivery of the product to the customer. Depending on the product and the organization, this feedback may be very formally determined, highly structured and very accurate. Or, the feedback may be random, haphazard and inaccurate. Both types of feedback may be suitable, again depending on the product and organization.

Establishing the appropriate set of feedback mechanisms within a reliability program is done within the context of the product reliability goals and the value to the organization of the feedback. The process benefits from feedback that is timely and accurate enough to make decisions. It is those decisions that lead to the product’s reliability in the hands of the customers.

Therefore the basic structure for any reliability program is to clearly establish and state the reliability goal. Then determine the appropriate set of feedback mechanisms that provide timely information to permit design and production decisions. The ‘how’ to decide the ‘appropriate set’ is the subject of this tutorial.

3. RELIABILITY GOALS

The target, objective, mission or goal is the statement that provides the design team focus and direction. A well stated goal will establish the business connection to the technical decisions related to the product durability expectations. A well stated goal provides clarity across the organization and permits a common language for discussing design, supply chain and manufacturing decisions.

Let’s explore the definition of a ‘well stated reliability goal’. First, it is not simply MTBF, “as good as or better
than…”, or ‘a 5 year product’. These are common ‘goals’ found across many industries, yet none permit a clear technical understanding of the durability expectations for the product.

The common definition for reliability is

    Reliability is … the ability or capability of the product to perform the specified function in the designated environment for a minimum length of time or minimum number of cycles or events. (Ireson, Coombs et al. 1995)

Note this definition has four elements:
• Function
• Environment
• Duration
• Probability

3.1 Function

The function is what the product is to do or perform. For example, an emergency room ventilator is to provide assisted breathing for a person. This requires the ventilator to produce breathable air within a range of pressures within a prescribed cycle of respiration. It may include requirements for filtering, temperature, and adjustments to pressure and timing of the cycle, etc. Often, a product development team either develops or is given a detailed set of functional requirements.

Often the functional elements of a product are directly measurable. And, the quality function of most organizations verifies the design and production units meet the functional requirements. When the product does not meet the functional requirements, it is considered a product failure. Within the function definition, which are the most important functions? Which must not fail? Which functions, if they fail, may simply degrade performance, if noticed by the customer at all?

3.2 Environment

The environment could be considered the weather around the product when in use: ‘weather’ such as temperature, humidity, UV radiation intensity, etc. It should also include environmental factors that provide destructive stresses, such as vibration, moisture, corrosive gases, voltage transients, and many more.

Another element of the environment is the use of the product. What is the use profile? Is it once a day for a few minutes, like a remote control for the stereo system? Or is it a 24/7 operation, such as a server system processing transactions for a major online store? The profile may include details concerning human interactions, operating modes, shipping, storage, and installation. The environmental conditions need to detail how the product responds or degrades to the set of stresses the product encounters. The environmental conditions focus on drivers for the product’s most likely failure mechanisms.

3.3 Duration

The duration is the amount of time or number of cycles the product is expected to function. A computer printer may be expected to print for five years. A washing machine is expected to wash clothes for 10 years. An implanted hearing aid is expected to last the life of the patient; if the patient is a child this expectation may be more than 70 years.

The duration expectations may be defined by contract, market expectations, or by a business decision. The duration or life expectancy most likely is not the warranty period. For example, many personal computers have a 3 month or 1 year warranty period. Yet, the product is expected to last two years or more with normal use.

Many products have multiple durations that are of interest.
• Out of box
• Warranty
• Design Life

The initial, out-of-box, or installation period is that duration when the customer is first setting up and using the product. Brand visibility is at the highest and the expectation that a new product will function as expected is very high. The types of failures that may occur include installation or configuration errors, mistaken purchase, shipping or installation damage, or simply buyer error. All of these ‘failures’ cost the company producing the product resources.

The warranty period is the duration associated with the producer’s promise to provide a product free of defects for a stated period of time. For example a computer may have a 1 year warranty period. During this one year, if the product fails (usually limited to normal use and operating environment) the producer will repair or replace the product. Naturally this will cost the producer resources.

The design life is the business or market expected duration of functional use. After the warranty period there isn’t an expectation for the producer to replace or repair the product, yet the customer may have a reasonable expectation that the product will function satisfactorily over the design life duration. For example, many cell phones have a 3-month warranty, yet as consumers we have an expectation that the phone will function for two years or more.

Marketing or senior management may set the design life. They may want to establish a market position for the product related to reliability. One way is to design a very robust product with a long design life duration. HP calculators often have only a 3-month or 1-year warranty, yet many have lasted 10 or more years. These calculators are known for their robustness and often cost more to purchase: a reliability premium.

Each of the three durations often involves different risks related to the failure mechanisms. It is rare for bearings to wear out in the first 30 days, yet more likely over a 10-year design life. Establishing three or more durations within the product reliability goal permits the design team to focus on and address the full range of product reliability risks.

3.4 Probability

The probability is the likelihood of the product surviving over a specified period of time. In the formal reliability definition above, the phrase ‘ability or capability’ refers to the probability. This is the statistical part of the reliability goal and without it the goal is fairly meaningless. Furthermore, stating a probability without an associated duration and distribution is also meaningless in most cases.
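To illustrate why a probability needs both a duration and a distribution, the following is a minimal sketch that anchors two life models at the same goal, 95% reliability at 5 years, and compares what each implies at 1 year. The exponential (constant failure rate) and Weibull forms are standard, but the Weibull shape value of 3 and all function names are assumptions chosen only for illustration; they do not come from the tutorial.

```python
import math


def mtbf_from_reliability(reliability, duration):
    """MTBF (theta) implied by R(t) = exp(-t / theta), a constant failure rate."""
    return -duration / math.log(reliability)


def exponential_reliability(t, reliability, duration):
    """R(t) for an exponential model anchored at R(duration) = reliability."""
    theta = mtbf_from_reliability(reliability, duration)
    return math.exp(-t / theta)


def weibull_reliability(t, reliability, duration, beta):
    """R(t) for a Weibull model with shape beta, anchored at the same point."""
    # Solve exp(-(duration / eta) ** beta) = reliability for the scale eta.
    eta = duration / (-math.log(reliability)) ** (1.0 / beta)
    return math.exp(-((t / eta) ** beta))


goal_r, goal_years = 0.95, 5.0  # 95% surviving over a 5 year design life
beta_wearout = 3.0              # hypothetical wear-out shape, illustration only

theta = mtbf_from_reliability(goal_r, goal_years)
r1_exp = exponential_reliability(1.0, goal_r, goal_years)
r1_wbl = weibull_reliability(1.0, goal_r, goal_years, beta_wearout)

print(f"Equivalent MTBF under a constant failure rate: {theta:.1f} years")
print(f"Expected failures in year 1, exponential: {(1 - r1_exp) * 100:.2f}%")
print(f"Expected failures in year 1, Weibull:     {(1 - r1_wbl) * 100:.3f}%")
```

Under these assumptions both models meet the same 5 year goal, yet the exponential model implies roughly 1% of units failing in the first year while the wear-out model implies about 0.04%. The same "95%" therefore leads to very different warranty expectations depending on the duration and distribution attached to it.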
What is the chance that a particular product will function as expected over the entire expected design life? How many of the installed units will be functional over the warranty period? Since each product and the associated environmental stress vary, the use of statistics is unavoidable in describing product reliability. Even the definition of a product failure may vary by customer.

While there are many common terms to convey the probability of survival, the use of a percentage surviving is the easiest understood and most easily applied across an organization. Stating that 95% of units are expected to survive over the 5 year design life means 95 out of 100 units will function properly over the 5 year period. Or, that a single product has a 19 in 20 chance (95%) of surviving 5 years. A similar statement is that not more than 5% of products fail over the full five years. Or, it may be stated as not more than a 1% failure rate per year.

A common probability statement is the inverse of the failure rate, or MTBF. Assuming a constant failure rate, the 95% reliability over 5 years (t) becomes approximately 100 years MTBF (θ). This does not mean the product will last 100 years; it does mean that 95% of the products are expected to last 5 years.

Finally, stating a separate failure probability for each duration of interest provides a set of duration/probability couplets that permit different focus for early or out of box failure risks versus the longer term failure risks.

If the product has a specific mission time, say an aircraft with an expected 12-hour mission over a 20-year serviceable life period, the probability of success for the 12-hour mission time may be set relatively high. And, it may have a conditional probability considering the number of missions since the last major service. Some products have availability goals and undergo routine maintenance or repair. These products and many complex systems require additional complexity in their goal setting. For the purpose of this discussion, we are considering simple products that are not normally repaired, or products where the main interest is in the time to the first failure.

The point is that setting the reliability goal for a product is not as simple as stating a ‘five year life’. It requires a clear statement with sufficient detail of each of the four elements: function, environment, duration, and probability. And, it may and often should include at least three duration/probability couplets. The goal establishes the direction or target for the entire design, supply chain and manufacturing team.

4. APPORTIONMENT

The system or product level reliability goal is not sufficient by itself. Ideally, every component or assembly step that has a possible impact on the final product reliability should have an established reliability goal. Each individual element should have goals that are tailored to that specific element. For example a cooling fan that only operates when the internal temperature reaches a defined value has a different use profile than the entire system. The function and environment are different for the specific fan than for the system. The computer provides a platform for computer programs to operate along with a user interface, whereas the fan provides cooling. Many of the environmental factors for the computer also impact the fan, yet not everything applies and some will require modification. Sophisticated models include apportioned goals addressing many functions, several use profiles and several environments, different durations, and conditional probabilities. Simple models work to get started; as more details become available concerning the design and use, sophisticated models are increasingly useful.

The duration may also require modification. The durations are most often the same as the system level and may require modification if the various components or subsystems are only employed during specific phases of the product’s use, e.g. an installation and configuration aid.

The probability will require modification unless the product has no component or subsystem elements. This is rare except for raw materials. Even a simple discrete resistor has multiple components that may have different failure mechanisms. For example, the resistive element and the soldering leads have different functional descriptions, are made of different materials and experience different sets of stresses that lead to failure. The probability of failure is not the same as for the system.

Another way to look at the probability differences breaks down the system probability of success to each element within the product. Consider a simple system with two primary means to fail (say the resistor with the resistive and connection elements as an example for discussion), where the system has a 90% probability of successfully functioning over 20 years. If both of the elements also have a 90% probability, and either the resistive or connection element causes a system failure, then either the system or subsystem goals are misstated. As you already know, for a simple series system the probability of success for the subsystems has to be larger such that when they are multiplied together the result meets or exceeds the system goal.

There are excellent references for basic reliability modeling and many papers and forums to discuss even the most complex systems. The intention is to apportion the system reliability goal, especially the probability value, to all major elements of the product.

4.1 Establishing the probability apportionment

The time to establish the reliability apportionment is early in the project. Depending on the project and the known values from field data, vendors, previous projects, etc. the apportionment may be well founded on data, or simply a guess. Both are valuable.

Consider a simple example of a computer system with five major subsystems: motherboard, disk drive, monitor, power supply and keyboard. Of course there are other elements, yet for this example we are limiting the list to these five.

Suppose this is our first product and little is known about the reliability of any of these components (for example, when designing the first personal computers in the 80’s). Further, let’s assume the system goal is 95% reliable over a 5-year period for the design life. Having no other information, a straight-line apportionment is as good a starting place as any. Therefore, each of the five subsystems receives an apportionment goal of 99% reliable over 5 years. Also, the functional and environmental elements receive attention to adjust to each subsystem’s particular requirements.
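The series-system rule and the straight-line apportionment described above can be sketched as follows. This is a minimal illustration, assuming independent subsystems in series; the function names are hypothetical and the numbers are the ones used in the text.

```python
import math


def series_reliability(subsystem_reliabilities):
    """System reliability when any single subsystem failure fails the system."""
    return math.prod(subsystem_reliabilities)


def equal_apportionment(system_goal, n_subsystems):
    """Smallest equal subsystem reliability that still meets the system goal."""
    return system_goal ** (1.0 / n_subsystems)


subsystems = ["motherboard", "disk drive", "monitor", "power supply", "keyboard"]
system_goal = 0.95  # 95% reliable over the 5-year design life

# Exact equal share of the goal, before rounding to a convenient value.
minimum_each = equal_apportionment(system_goal, len(subsystems))

# Rounding each subsystem goal up to 99% keeps the system at or above 95%.
rounded_goals = {name: 0.99 for name in subsystems}
achieved = series_reliability(rounded_goals.values())

# The two-element resistor example: 90% for each element cannot give 90% overall.
resistor_system = series_reliability([0.90, 0.90])

print(f"Minimum equal subsystem goal: {minimum_each:.4f}")
print(f"System reliability with 99% each: {achieved:.4f}")
print(f"Two 90% elements in series: {resistor_system:.2f}")
```

The exact equal share is slightly under 99%, which is why the rounded 99% goal for each of the five subsystems leaves the product just above the 95% system goal.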
At first, this simple method provides a starting point for the team’s discussion concerning reliability. It provides the basis for product design, part procurement, validation and verification testing, and the myriad of cost/benefit tradeoff decisions required during the product lifecycle.

Over time, years of field data, vendor data and internal product testing continue to improve the understanding of each subsystem’s reliability. This understanding becomes the base for the initial apportionment estimates for a new product. Consider a new project for a personal computer where only the CPU and associated chipset are new. The overall apportionment model may start with the best available reliability values for all the subsystems and include an adjustment to the motherboard value considering the uncertainty or estimated value change regarding the new CPU chipset. The uncertainty is relatively low and the use within subtle design decisions is possible.

4.2 Adjusting the probability apportionment

Going back to the first personal computer design and simple straight-line apportionment, a little common sense and feedback from vendors may provide additional information. The keyboard is most likely more reliable than the power supply, for example. Adjusting the goal for the power supply down, say to 98%, then requires an adjustment in one or more of the other subsystems such that the product remains at or above the system goal of 95%. The same rule applies for any other series system of apportionment.

Another consideration for the apportionment adjustment is the cost/benefit tradeoff. For nearly any development project there is a limit to product cost, therefore simply purchasing the most expensive components, which may or may not be the most reliable, is not always an option. Back to the power supply example above. Let’s say the vendor of the initially selected power supply considers the use, environment and functional requirements and states that the power supply will have a 95% probability of success over 5 years. That is the same value as the overall system goal, and unless all the other subsystems are perfect (100% reliable over 5 years) the design team will not achieve the reliability objective.

A search reveals three alternative power supplies that will meet the functional requirements. One has a 97% reliability at a cost of $50, the second has a 98% reliability at a cost of $100 and the third has a 99% reliability at a cost of $250.

If product cost is not an issue (rarely the case), spend the $250 and achieve the apportioned objective. If it is possible to improve the reliability of other subsystems, say the monitor, for less cost, to offset the difference between the 99% goal and the 98% or 97% reliability associated with less expensive power supplies, then that would provide the highest reliability for the least cost. This is a simple illustration of the cost/benefit tradeoff; in practice these may become very complicated decisions.

An advanced practice is to establish reliability goals and associated apportionment for the various stage gates during the product lifecycle. With each successive round of design, prototyping, and analysis not only is the product improving, but the uncertainty is also diminishing. Using the lower limit for reliability estimates is one way to reflect the range of reliability uncertainty.

The primary intent of using reliability goals and apportionment is to permit meaningful decisions concerning reliability along with the ability to consider product cost and other important aspects of the design in a meaningful manner.

5. FEEDBACK MECHANISMS

There are two basic questions in reliability engineering. What is going to fail? And, when will the product fail? Both are related to failure mechanisms. The first may require the discovery of the failure mechanism. The second may require the determination of the expected behavior of the failure mechanism over time. Both questions have a wide range of tools available to find the answers. It is the selection of the right tools to provide a good enough answer in an effective and efficient manner that is the subject of this section.

Each engineer tends to design away from failure (Petroski 1994). And, each engineer generally knows about the most likely failure mechanisms related to their section of the design, within the realm of their experience. They may gain additional experience as their design fails in unexpected (to them) ways. Part of the design process is to uncover failures and improve the design to avoid or lessen the probability of the same.

Tools such as FMEA and HALT permit the design team to discover failures. Often the FMEA session permits the design team to share the known or expected failure mechanisms. Occasionally, a new possible failure mode appears in these sessions. The real value is in improving the ability of the entire team to identify unknown failures and address the effects of the known expected failures. Each person on the FMEA team brings a set of known or expected failures to the discussion. The combined set increases the entire team’s awareness of the larger set of possible issues.

HALT, in the broadest sense, starts with the first product models or bench top testing. Exploring the reaction of the product to various stimulations is an exploration of where the product works by defining where it doesn’t work. The intention of HALT is to apply stresses relevant to the product’s environment (vibration, voltage, temperature, usage rates, etc.) and determine the boundary between functional and not functional behavior. Careful root cause analysis then uncovers and explains the failures, enabling the design to adjust to create a more robust product.

Common engineering tools also permit this discovery. Many CAD programs include basic finite element analysis capabilities. Adjusting material properties to reflect the effects of aging (e.g. oxidation of polymers making them more brittle) and performing a simple analysis may find aging weaknesses in the design. The same applies for SPICE models of circuits. Consider the expected drift of capacitor values over time and the continued functionality of the circuit.

If the product is new or contains new technology or assembly processes, the nature of the failures may not be well understood. FMEA and HALT and related discovery tools apply. If the project is to refine an existing product and there is ample internal and field data defining the areas for improvements, then the discovery tools do not add value.

The first question looks for what will fail. If the failures are known or the various tools help determine what will fail, the
product reliability can be improved by addressing those aspects of the product that lead to the failures. One approach to product design is: build, test, fix, repeat. That is, find and fix the first element of a design to fail and the product improves. Continue to do so until there are no more failures or the design reaches the design limits of the materials (for example, the first failure occurs as the polymer case melts). The primary drawback to this approach is the inability to quantify the product reliability value concerning how many units will last how long. Understanding what will fail is critical to being able to answer the second question: when will it fail?

As the design team addresses the design issues the second question enables them to know if they have achieved the product reliability goal. As with discovery tools, there are many tools available to determine how long a product will last. Predictions, accelerated life testing, and demonstration tests are all capable of providing an estimate of how long a product may last.

Deterministic models may also provide results. For example, the polymer diffusion rate permits air to accumulate within a tube, which at a critical air volume will block fluid flow. This process can be modeled and the time to failure calculated for different wall thicknesses and air pressures.

Field data is often the most accurate way to estimate actual field performance although it is usually not available for new products or elements of new products.

To illustrate how to select the appropriate tools to provide feedback to the design team, let’s consider a few cases. Keep in mind that not all tools are appropriate for all situations.

5.1 The existing technology in new environment case

To illustrate the existing technology in new environment case consider the initial design of an LED brake light. This is new technology with respect to the application of the LED to the car taillight environment. While LED lighting has been available in a range of applications for some time, the car taillight environment is harsher and more demanding than previous application environments. Consider simply the ambient temperature extremes, from overnight outdoors in Fargo, ND (-30°F) to direct sunlight exposure within an unventilated enclosure in the Tucson, AZ summer (180°F). Also, a new assembly process to attach dozens of LED elements to a brake light pattern frame in a high-volume mass-production assembly line will be required.

There is no history, no previous products on the market using LEDs in anything like the brake light environment. What could possibly go wrong? The design team doesn’t know what could go wrong. Therefore, the appropriate set of tools should first discover the most likely failure mechanisms. FMEA and HALT both apply, for example. Both of these tools can build on what is already known about LED operations and known failure mechanisms. The new environment may accelerate some little known failure mechanisms, or it may simply accelerate already well known mechanisms.

Once the failure mechanisms are known, the requirement for the new brake lights is to last twice as long as the current incandescent systems, or a 95% probability of lasting 10 years. Simply finding the failures, surprising failure modes or not, permits the evaluation of how long they will last in the expected new environment. Tools such as ALT may apply. Thermal cycling for the solder joint attachments and high temperature exposure while illuminated to evaluate the luminosity degradation are two examples of what could be usefully tested.

The results of the discovery evaluations along with engineering judgments concerning the uncertainty of failure mechanism behaviors will prioritize the list of most likely failure mechanisms. This list then can be sorted by appropriate stress to design accelerated life tests. More than one failure mechanism may be accelerated due to the high temperature exposure, for instance.

The reliability program most likely will have goal setting, apportionment, initial reliability predictions based on literature and vendor data, prototype testing of various solder joint attachment methodologies, and product level accelerated life testing focused on 3 to 5 different stresses.

This approach takes advantage of existing knowledge concerning LED technology and previous explorations of failure mechanisms within LED technology and solder attachment methods. The approach also considers if any new failure modes may appear in the new, harsher environment. The approach also considers the relatively low cost of the individual units and the ability to quickly measure the product performance by the use of ALT. The initial risk is high for the new environment, and if the LEDs actually last longer than twice the expected life of the incandescent systems the product provides a cost savings to the car manufacturer and owner.

5.2 The low volume high cost case

In comparison to the first case, consider a product that has a very limited production volume, say 50 total units. Plus, each unit is very expensive, say $1 million. Running 30 units each in three different ALTs to failure is not viable. Even getting one full unit for destructive HALT testing is not likely. Yet all the same unknowns as above or more may apply.

Consider an oil exploration sensor array unit that attaches to the drill string during drilling and has the function to monitor and report the presence of specific types of hydrocarbons. This is a complex system in a very harsh environment.

The list of what and how the product could fail is quite long. Given the constraint of no system level units for testing, only a few of the tools from the first case apply: goal setting, apportionment, prediction and FMEA. The FMEA is a discovery tool and will not provide the necessary feedback on the product’s expected durability. Thus the onus is on performing accurate predictions.

In this case, the use of Physics of Failure (PoF) modeling may be the most valuable tool available. Understanding the relationship between the expected stresses and the component level responses over time permits the PoF models to predict the system life. The development of the PoF models related to the critical component failure mechanisms may take significant work, yet the option to test multiple units is not viable. Therefore, the analytical and theoretical work permits the team to receive feedback on the expected product weaknesses and expected life limiting failure mechanisms. Even determining the critical component failure mechanisms may be difficult.
This approach takes advantage of the existing literature detailing failure mechanisms for a wide range of components, plus the ability to evaluate individual components at much less expense than the full system. The approach has more risk in the identification of the unknown failure mechanisms related to the full system configuration and use, yet careful use of tools like FMEA and reliability modeling permits the team to mitigate this risk to some extent.

5.3 The moderate volume product family variation case

A common case is the modification of an existing product. There is field data, the previous product testing information is available and the list of known failure mechanisms is well defined. Furthermore the product functions, intended environment and use profile remain basically the same.

In some regards, this is more difficult than the previous two cases. One approach would be to only test the new product with respect to the changes, and possibly only evaluate the individual new components with the justification that nothing else has changed.

The second possible approach is to repeat all of the evaluations and testing as done for the original product. Here the justification may be that the relatively minor changes may adversely impact existing elements of the product. Or, worse, the justification could be a ‘we always do the full set of testing’ mentality.

Both approaches have risks and costs that can and should be mitigated. Using the existing reliability models and best available data, the design team can isolate the changes and assign a range of predicted values to the new component reliability. In conjunction with that they can perform a very focused Design FMEA on the changes with an emphasis on how the changes impact any other element within the design.

At this point, the design team can decide if the uncertainty concerning either the interaction effects or the life estimate warrants further testing. If the true value anywhere within the range of reliability uncertainty will not preclude the product launch, then clearly no further testing is needed and the current prediction is sufficient. If the low end of the range, on the other hand, would require further reliability improvements, or if the impact of the changes on other aspects of the product is unclear, then further analysis and testing will be needed.

The appropriate approach considers what is known and unknown, and the associated risks and decision points. The intent is to provide both guidance and feedback to the design team that permits well informed decisions. Using too few or too many reliability tools may incur undue risks or costs. The well crafted reliability program carefully considers how each reliability activity provides feedback toward answering the two primary questions: what will fail and when will the product fail. The intent is to add value to the product.

6. DETERMINING VALUE

One way to select reliability tools for improving the product’s reliability is to consider the return on investment. If the activity will not reduce risk, increase durability, reduce engineering time, or eliminate failure mechanisms, etc. then the activity should not occur.

Consider a simple example. Recall the computer with five subsystems from the apportionment discussion above. The initial goal for each subsystem was 99% over 5 years.

Suppose after the first round of predictions we find the keyboard has an expected reliability of 99.9% over 5 years. Furthermore, let’s assume the remaining four subsystems all meet their goal at 99%. And, it is possible to improve the reliability of the keyboard to 99.99% by spending $1 more per keyboard. And, let’s assume it will cost the company $1,000 per field failure for any cause. And, we expect to build and sell 100,000 computers.

For the current keyboard, we expect 100,000 - (0.999 * 100,000) = 100 keyboard failures. These will cost the company 100 * $1,000 = $100,000.

For the new keyboard the added cost will be 100,000 * $1 = $100,000. The savings will be due to reducing the field failures from 100 failures to 100,000 - (0.9999 * 100,000) = 10 failures. The new keyboard’s ten failures cost $10,000. This is down from $100,000, for a savings of $90,000.

For a savings of approximately $90,000 we spent $100,000, which may make it difficult to justify the change. The calculations might be more favorable if, for the same cost of change, a difference in reliability from 99% to 99.5% could be made in the power supply. For any proposed change that impacts the reliability apportionment model the above calculation quickly illustrates the value.

Yet, not all of the reliability tools directly increase or decrease the expected reliability. In some cases, the tool might only shorten the time to detect the failure mechanisms. HALT is an example of this and it often finds most of the failure mechanisms in a design within a week, which would normally take months of standards based environmental testing to uncover. The savings in time to market risk more than justifies the necessity of making multiple trips to the HALT chamber.

Another cost saving is the reduction in uncertainty. By simply improving the accuracy of reliability predictions the range of the estimated reliability diminishes. Once the range no longer crosses a decision threshold to either conduct further analysis or testing, the project resources can focus improvement efforts on other high uncertainty or low reliability elements.

7. MATURITY MODEL

The state of the organization is also important. A design team that has no experience or expertise in statistical methods will probably flounder when trying to use an event-conditional based reliability block diagram that requires advanced statistical modeling. Getting this team to simply use a Weibull cumulative distribution plot may be a stretch, yet provide more value initially.

Each organization has a set of skills, expectations, structures, etc. that defines the culture concerning product reliability. Reliability tools designed and applied to make an impact within the organization should fit within or be close to the organization’s current capabilities. The tools will only have impact and be useful if they are understood and make the current situation better. For example, a team that is consistently surprised by field failure modes may immediately benefit by conducting HALT testing to discover failure mechanisms before their customers do so for them.

Phil Crosby in his book Quality is Free (Crosby 1979) created a maturity matrix focused on quality. With slight
  • 9. modification, by substituting reliability for quality the same different tools would be needed. The downstairs, stage II, basic table is meaningful for the assessment of an organization would require coaching, training, and resources organization’s reliability program. to break the cycle of letting surprising field failures dominate The primary difference that separates an effective reliability the engineering day. The upstairs, stage IV, organization program from a non-effective one is the proactive nature of might be ready for advanced tools related to product the program. On one occasion I conducted assessments of two modeling or field data analysis. They would have the time to organizations located in the same building. Both designed learn advanced accelerated degradation testing methods, for and manufactured telecommunication equipment with similar example. complexity and volume. The interview schedule had me going up and down stairs almost every hour for two days and 8. CONCLUSIONS by midday of the first day I enjoyed going upstairs and dreaded heading down. Despite all the product and business In summary, the traits of effective reliability programs similarities the two reliability programs were dramatically include the ability to: different; as different as their reliability results. • State clear reliability goals; Downstairs the interviews started late, got interrupted by • Enable tradeoff decision-making; urgent phone calls or in-person requests; firefighting at its • Selectively use only value-added reliability best. The team employed a wide range of tools, all that were activities; listed on a checklist, for each project. The reliability goals • Promote a proactive reliability culture. were not known to the design team and the few that did know The basic message is that no one list or standard of tasks them also understood that they would not be measured nor makes an effective reliability program. 
The selection of would a failure to meet those goals impede getting the valuable tools and the establishment of a basic structure for product to market. The people I talked to stated reliability decision-making permit an organization to achieve the was very important and were very busy fixing field failures or desired reliability objectives. testing (just before product launch) identified issues. Reliability was done by “the guy that left last year”. 9. REFERENCES Upstairs the interview started on time, and proceeded without interruption. No one remembered the last time there Crosby, P. B. (1979). Quality is Free: The Art of Making had been an urgent need to resolve a field issue. The team Quality Certain. New York, Signet. employed reliability tools that would benefit the project as needed. The specific testing that was done was tailored to the Ireson, W. G., C. F. Coombs, et al. (1995). Handbook of risks identified during the design phase. The goals were reliability engineering and management. New York, McGraw widely known and their current status was also known, both Hill. during development and after product launch. The people I talked to stated reliability was very important and they knew Petroski, H. (1994). Design Paradigms: Case historyes of what to do to meet their reliability objectives. The team’s error and judgement in engineering. Cambridge, Cambridge manager taught reliability thinking and skills, and everyone University Press. did reliability. For both organizations the basic structure and thought process to determine which reliability tools to use would apply but because of their different stages of development Schenkelberg: page 7
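
As a minimal sketch of the arithmetic behind the five-subsystem example in Section 6, and of the launch decision described just before that section, the Python below multiplies the subsystem goals and checks the low end of an uncertainty range against a threshold. The series-system assumption (independent subsystems, all of which must survive), the function names, the 0.95 launch threshold, and the two example ranges are illustrative assumptions, not values from this tutorial.

```python
from math import prod


def series_reliability(subsystem_reliabilities):
    """Reliability of a series system, assuming independent subsystems
    that must all survive (an assumption, not stated in the text)."""
    return prod(subsystem_reliabilities)


def further_testing_needed(reliability_range, launch_threshold):
    """Return True when the low end of the uncertainty range falls below
    the threshold, i.e. when the true value could still block the launch."""
    low, _high = reliability_range
    return low < launch_threshold


# Five subsystems, each with a goal of 99% over 5 years (from Section 6).
system_goal = series_reliability([0.99] * 5)
print(f"System reliability over 5 years: {system_goal:.4f}")

# Hypothetical uncertainty ranges checked against a hypothetical threshold.
hypothetical_threshold = 0.95
print(further_testing_needed((0.93, 0.97), hypothetical_threshold))
print(further_testing_needed((0.96, 0.98), hypothetical_threshold))
```

Under these assumptions, five subsystems at 99% give a system value of roughly 95% over 5 years. A range whose low end already clears the threshold needs no further testing for that decision, while a range that straddles it does.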