Consider Reliability Prediction Value
fmsreliability.com http://www.fmsreliability.com/education/consider-reliability-prediction-value/
Reliability Prediction Value
Fred
One of the primary questions we answer as reliability engineers is:
How long will it last?
Reliability prediction is a forecast that attempts to quantify the time to failure,
the expected future failure rate, warranty claims, or required
spare parts.
We need to know as we make decisions today about the design or purchase.
The best prediction is made after everything has failed: ship all your products and
track the actual failures over time. Unfortunately, this only tells us what happened after
it has happened.
We need the information to make decisions so we can change the design, stock more spares, or set customer
expectations appropriately.
Methods of Prediction and Costs
Forecasting the future is difficult. Engineers have taken available information and knowledge along with a range of
techniques to create reliability predictions.
Methods range from a simple guess based on engineering judgement to detailed physics of failure modeling for
specific failure mechanisms and environments.
Simple Guess
Early in the concept phase of any design, we may ask, “Will this last long enough?” Based on engineering judgment
we select basic design elements, materials, and architecture.
Largely based on engineering judgement, a simple guess helps to uncover basic information about the use
environment and customer expectations, along with basic technology capabilities.
A simple guess has a large degree of uncertainty. It is the fastest (just the duration of a considered opinion) and least
expensive prediction to create.
Parts Count Prediction
In the 1970s and 1980s, HP and other organizations would diagnose product failures to the component level, then keep
track of component-specific failure rates and the types of applications or environments.
These databases of failure information provided a viable means to estimate the reliability of future designs.
Large databases such as MIL-HDBK-217 or Bellcore (now Telcordia) collected failure information across broad
groups of products and environments.
Most organizations do not conduct detailed failure analysis, opting instead to quickly replace or repair the product for the
customer, so the source of information for these databases has diminished.
The basic idea, that each component contributes a possible failure and that adding the component failure rates estimates the
product failure rate, is full of inaccuracies and faulty assumptions.
Unless you use an accurate database, the results are not much better than a simple guess, yet they do assist in identifying
potential reliability issues in a design.
To conduct the study, one needs access to an appropriate failure rate database, vendor data to supplement for
missing values, and a little time to tally the failure rates. With a reasonable database it may take one or two days to
create a prediction.
It is a small investment, but it yields little improvement in prediction accuracy.
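As a rough illustration of the tally described above, the following Python sketch sums component failure rates into a product failure rate. The part names, quantities, and failure rates are hypothetical placeholders, not values from any handbook. The sum assumes a series system with constant, independent failure rates, which is the kind of simplifying assumption that limits the accuracy of this method.

```python
import math

# Hypothetical parts list: (name, quantity, failure rate in FIT).
# 1 FIT = 1 failure per 1e9 device-hours. Values are illustrative only.
PARTS = [
    ("capacitor", 40, 2.0),
    ("resistor", 100, 0.5),
    ("connector", 4, 10.0),
    ("microcontroller", 1, 30.0),
]

def product_failure_rate_fit(parts):
    """Sum quantity * failure rate over all parts (series assumption)."""
    return sum(qty * fit for _, qty, fit in parts)

def reliability(failure_rate_fit, hours):
    """Reliability at `hours`, assuming a constant failure rate."""
    failures_per_hour = failure_rate_fit / 1e9
    return math.exp(-failures_per_hour * hours)

total_fit = product_failure_rate_fit(PARTS)
mtbf_hours = 1e9 / total_fit
print(f"Total failure rate: {total_fit:.1f} FIT")
print(f"MTBF: {mtbf_hours:,.0f} hours")
print(f"Reliability at one year (8760 h): {reliability(total_fit, 8760):.4f}")
```

Sorting the parts by their contribution (quantity times failure rate) is a simple way to use the same tally to flag potential reliability issues in a design.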
Similar Product History
As organizations stopped conducting detailed failure analysis on every product failure, they did continue to track
product performance.
Since most products are variations of existing products, the prediction approach of using similar products as a
foundation, then supplementing with other prediction methods for the new elements, is a reasonable approach.
For example, if a new design includes the existing power supply of an existing product, we can use the field failure
rate information for that power supply in the new product.
One must consider the power supply environment and load in the new design, and whether any changes in conditions affect
its reliability performance. It is a good approach to narrow down the areas needing detailed analysis.
Of course, this is only viable if you have information on previous products and sufficient failure rate detail down to
the subsystem level. For completely new designs it is not possible.
This approach is comparable in cost to a parts count prediction and generally has better accuracy. Like a parts count
approach, it may take one or two days to track down the data and tally the prediction.
Weakest Element Estimate
Years ago I learned about a tape backup product. Like any electromechanical system, it has many possible failure
mechanisms.
If it is assembled, transported, and installed correctly, the dominant failure mechanism is read/write head wear due
to tape abrasion. When the head dimension is reduced enough, the device can no longer read or write.
Each foot of tape dragged across the head creates a predictable amount of wear, and the organization accurately
predicted the time to failure based on amount of use (i.e., feet of tape moved across the head).
Field data on similar models verified that the number one reason for product failure was head wear. The design team
optimized the wear and performance, yet head wear remained the primary failure mechanism for the
product.
In this kind of situation, the team has the benefit of being able to focus on one mechanism when creating a product
prediction. The other elements of the system just had to last longer than the head.
This approach to prediction simplifies the work and experimentation and provides an accurate life estimate.
Of course, changes to the materials, tape, speed, tension, and other variables affecting the head wear will change
the relationship. Plus, the team must maintain due diligence with the reliability of other elements to avoid creating a
new weakest link.
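The wear-based estimate can be sketched in a few lines of Python. This is a minimal sketch that assumes wear is linear in feet of tape, as the text describes. The allowable wear, wear rate, and usage figures are hypothetical and are not data from the product discussed.

```python
def head_life_feet(allowable_wear_um, wear_um_per_million_feet):
    """Feet of tape the head can pass before wear reaches the limit."""
    return allowable_wear_um / wear_um_per_million_feet * 1e6

def head_life_years(life_feet, feet_per_day):
    """Convert tape footage to calendar years for a given usage level."""
    return life_feet / feet_per_day / 365.0

# Hypothetical inputs, for illustration only.
life_feet = head_life_feet(allowable_wear_um=20.0, wear_um_per_million_feet=0.5)

for label, feet_per_day in [("light use", 10_000), ("heavy use", 50_000)]:
    years = head_life_years(life_feet, feet_per_day)
    print(f"{label}: about {years:.1f} years to head wear-out")
```

Expressing life in feet of tape rather than calendar time keeps the prediction tied to the variable that actually drives the wear, so it can be converted for any customer usage profile.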
Reliability Block Diagram
Block diagrams are an organizational tool for considering contributions to the system reliability. Appearing like an
organization chart, each block is a subsystem or element of a product.
Let’s say we have a desktop computer as the product. The top block is the system, and below it are five
blocks that represent the power supply, hard drive, motherboard, display, and keyboard.
There are block diagram structures for series, parallel and complex systems. Each block includes the reliability of that
element. And, depending on the structure (reliability-wise) the calculations may differ to determine the system
reliability.
While not a prediction method on its own, once a block diagram is established, it provides the ability to compare design options
(say two hard drives with different expected reliability performance) and their impact on the system reliability.
In one regard it is like the parts count method with the added benefit of being able to account for parallel reliability
structures.
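The series and parallel calculations can be sketched in Python as follows. The block reliabilities are hypothetical one-year values chosen only to illustrate the desktop computer example, and the sketch assumes the blocks fail independently.

```python
from math import prod

def series(reliabilities):
    """All blocks must work: multiply the block reliabilities."""
    return prod(reliabilities)

def parallel(reliabilities):
    """At least one block must work: 1 minus the product of unreliabilities."""
    return 1.0 - prod(1.0 - r for r in reliabilities)

# Hypothetical one-year reliabilities for the desktop computer example.
blocks = {
    "power supply": 0.98,
    "hard drive": 0.95,
    "motherboard": 0.99,
    "display": 0.97,
    "keyboard": 0.995,
}

baseline = series(blocks.values())

# Design option: a more reliable (hypothetical) hard drive.
better_drive = series({**blocks, "hard drive": 0.98}.values())

# Design option: two of the original drives in parallel (mirrored).
mirrored = {**blocks, "hard drive": parallel([0.95, 0.95])}
mirrored_system = series(mirrored.values())

print(f"baseline:       {baseline:.4f}")
print(f"better drive:   {better_drive:.4f}")
print(f"mirrored drive: {mirrored_system:.4f}")
```

Swapping a single block value and recomputing is all it takes to compare design options, which is where the block diagram earns its keep.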
Experimental Results
Life testing takes many forms, and it is beyond the scope of this article to describe them all. You can find books on
accelerated life testing that provide detailed guidance.
As a prediction method, conducting an experiment is expensive and, when done well, accurate. The basic requirement
is to know the failure mechanisms of interest and how the stress applied in the experiment relates to use
conditions.
Errors or poor assumptions can make this method very inaccurate, so take care when designing, conducting and
analyzing reliability experiments or tests.
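To make the link between test and use conditions concrete, here is a minimal Python sketch using the Arrhenius relationship, a commonly used model for temperature-driven failure mechanisms. The article does not name a specific model, so the choice of Arrhenius is an assumption, and the activation energy and temperatures are hypothetical. The model only applies if the failure mechanism of interest is actually driven by temperature in this way.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, test_temp_c):
    """Ratio of life at use temperature to life at test temperature."""
    use_k = use_temp_c + 273.15
    test_k = test_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / use_k - 1.0 / test_k)
    return math.exp(exponent)

# Hypothetical values: 0.7 eV activation energy, 40 C use, 85 C test.
af = arrhenius_acceleration_factor(0.7, use_temp_c=40.0, test_temp_c=85.0)
test_hours = 1000.0
print(f"Acceleration factor: {af:.1f}")
print(f"{test_hours:.0f} test hours represent about {af * test_hours:,.0f} use hours")
```

The result is very sensitive to the assumed activation energy, which is one way a poor assumption can make an experimental prediction inaccurate.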
Physics of Failure Modeling
Research and modeling have enabled the characterization of failure mechanisms in great detail. While they do not
cover every possible failure mechanism, many mechanisms are well known and have a physics of failure (PoF) model.
Simple models include solder joint fatigue formulas based on the Coffin-Manson relationship. Complex models may
require finite element tools to model completely.
Technical literature provides details for PoF models; if a model does not exist, you may need to conduct experiments to fully
characterize the relationship between stress and life performance.
Once you establish a PoF model, you have the ability to consider changes in the environment, stress load, structure,
material set, or dimensions (variables affecting the failure mechanism) to create reliability predictions.
While PoF models take time and are expensive to create, they provide the most flexibility and accuracy compared with the other
methods.
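As an example of a simple PoF model, the Coffin-Manson relationship in its strain-life form relates the plastic strain range to cycles to failure: delta_eps_p / 2 = eps_f' * (2 * N_f) ** c. The Python sketch below solves it for N_f. The coefficient and exponent values are hypothetical, chosen for illustration only. Real values depend on the solder alloy and joint geometry and would come from the literature or from experiments.

```python
def coffin_manson_cycles(plastic_strain_range, ductility_coeff, ductility_exp):
    """Cycles to failure N_f from delta_eps_p / 2 = eps_f' * (2 * N_f) ** c."""
    if ductility_exp >= 0:
        raise ValueError("fatigue ductility exponent c must be negative")
    reversals = (plastic_strain_range / (2.0 * ductility_coeff)) ** (1.0 / ductility_exp)
    return reversals / 2.0

# Hypothetical material parameters, for illustration only.
DUCTILITY_COEFF = 0.325  # eps_f', fatigue ductility coefficient
DUCTILITY_EXP = -0.5     # c, fatigue ductility exponent

for strain_range in (0.0065, 0.013):
    cycles = coffin_manson_cycles(strain_range, DUCTILITY_COEFF, DUCTILITY_EXP)
    print(f"plastic strain range {strain_range}: about {cycles:,.0f} cycles to failure")
```

With an exponent of -0.5, doubling the strain range cuts the predicted life by a factor of four. That is the kind of what-if question a PoF model can answer without running a new test.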
Sources of Value
We create reliability predictions to support decisions. Predictions can help to understand:
Is the product reliable enough?
Is the product going to meet our reliability targets?
How many spares will we need over the next year?
And, many other specific questions when we’re not able to wait for actual results to occur.
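The spares question, for instance, can be roughed out from a predicted failure rate. The Python sketch below assumes a constant failure rate, continuous operation, and one spare consumed per failure, so that annual demand follows a Poisson distribution. The fleet size, failure rate, and confidence level are hypothetical.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

def spares_needed(units, failure_rate_per_hour, hours, confidence=0.95):
    """Expected failures and the smallest stock covering demand at `confidence`."""
    expected_failures = units * failure_rate_per_hour * hours
    spares = 0
    while poisson_cdf(spares, expected_failures) < confidence:
        spares += 1
    return expected_failures, spares

# Hypothetical fleet: 500 units, 2 failures per million hours, one year of use.
expected, spares = spares_needed(units=500, failure_rate_per_hour=2e-6, hours=8760)
print(f"Expected failures in one year: {expected:.2f}")
print(f"Spares to stock for 95% coverage: {spares}")
```

Stocking only the expected number of failures would run out roughly half the time, which is why the sketch sizes the stock to a confidence level instead.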
Making Informed Decisions
Take the simple example of comparing two vendors of hard drives. An accurate reliability prediction for our
application enables selecting the most cost-effective and reliable product to meet our business goals.
Without a prediction, we may select a hard drive that fails too early and limits the product’s reliability. Or we may
select one that is too reliable and expensive for our application.
Identifying and Improving Design Weaknesses
During product or system design, the ability to identify the elements that will lead to failures (weakest links) permits us
to focus resources on those areas for improvement.
Without knowing where to focus we may have more failures than anticipated along with the remorse of “if we only
knew”.
Allocating Resources Appropriately
Beyond focusing efforts for reliability improvements, predictions also divert resources away from elements that are
very reliable already.
Predictions also focus product testing on the failure mechanisms most likely to limit product reliability, and may limit
the costs of prototypes and testing facilities.
Using reliability block diagrams and similar models allows the balancing of reliability against component costs to
optimize reliability at minimum cost.
Setting Expectations
An accurate prediction is useful inside the company to forecast warranty and repair costs. Outside the company,
predictions are useful to customers:
Making purchase decisions
Planning larger systems using the product reliability information for modeling
Estimating total cost of ownership, including spares, downtime and maintenance costs
Everything from the data sheet, to supporting white papers on reliability prediction claims, to warranty policies, to the
customer's perception of the brand promise affects the value of reliability predictions.
Finally, creating a prediction and comparing it to field reliability performance allows the organization to improve next
time, first by using the field data for similar products, and second by refining any models created for reliability predictions.