Probability Plotting

The least mathematically intensive method for parameter estimation is the method of probability plotting. As the term implies, probability plotting involves a physical plot of the data on specially constructed probability plotting paper. This method is easily implemented by hand, given that one can obtain the appropriate probability plotting paper.

The method of probability plotting takes the cdf of the distribution and attempts to linearize it by employing a specially constructed paper. For example, in the case of the two-parameter Weibull distribution, the cdf, which is also the unreliability $Q(T)$, is given by:

$$F(T)=Q(T)=1-e^{-\left(\frac{T}{\eta}\right)^{\beta}}$$

This function can then be linearized (i.e. put in the common form of $y=a+bx$) as follows:

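$$\begin{aligned}
\ln\left(1-Q(T)\right) &= -\left(\frac{T}{\eta}\right)^{\beta} \\
\ln\left(-\ln\left(1-Q(T)\right)\right) &= \beta \ln\left(\frac{T}{\eta}\right) \\
\ln\left(\ln\left(\frac{1}{1-Q(T)}\right)\right) &= \beta \ln(T)-\beta \ln(\eta)
\end{aligned} \tag{15}$$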

Then setting:

$$y=\ln\left(\ln\left(\frac{1}{1-Q(T)}\right)\right)$$

and:

$$x=\ln(T)$$

the equation can be rewritten as,

$$y=\beta x-\beta \ln(\eta)$$

which is now a linear equation with a slope of $\beta $ and an intercept of $-\beta \ln(\eta).$
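As a quick numerical check of this linearization, the following minimal sketch transforms points taken from a Weibull cdf and confirms that they fall on a straight line with slope $\beta$ and intercept $-\beta \ln(\eta)$. The function names and the values $\beta=1.5$, $\eta=100$ are illustrative assumptions, not taken from the text.

```python
import math

def weibull_unreliability(t, beta, eta):
    """Two-parameter Weibull cdf: Q(T) = 1 - exp(-(T/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def linearize(t, q):
    """Map (T, Q(T)) to (x, y) with x = ln(T), y = ln(ln(1/(1-Q)))."""
    return math.log(t), math.log(math.log(1.0 / (1.0 - q)))

beta, eta = 1.5, 100.0  # illustrative values (assumption)
points = [linearize(t, weibull_unreliability(t, beta, eta))
          for t in (20.0, 50.0, 100.0, 200.0)]

# Slope between the first and last points, and the implied intercept.
(x0, y0), (x1, y1) = points[0], points[-1]
slope = (y1 - y0) / (x1 - x0)
intercept = y0 - slope * x0
print(slope, intercept, -beta * math.log(eta))
```

The printed slope should match $\beta$ and the printed intercept should match $-\beta \ln(\eta)$ up to floating-point rounding.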

The next task is to construct a paper with the appropriate $y$- and $x$-axes. The x-axis calculation is easy since it is simply logarithmic. The y-axis, however, has to represent,

$$\ln\left(\ln\left(\frac{1}{1-Q(T)}\right)\right)$$

where $Q(T)$ is the unreliability; in other words, the y-axis is a double log reciprocal scale. Such papers have been created by different vendors and are called probability plotting papers. (Note: You can download different probability plotting papers from www.ChinaRel.Com.)
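To give a feel for how such a y-axis is laid out, the sketch below computes where a few percent-failed labels would sit on a double log reciprocal scale. The helper name `y_axis_position` and the chosen tick values are assumptions for illustration only.

```python
import math

def y_axis_position(percent_failed):
    """Position of an unreliability label (in percent) on the Weibull y-axis."""
    q = percent_failed / 100.0
    return math.log(math.log(1.0 / (1.0 - q)))

# Illustrative tick labels (assumption); 100% is omitted because it maps to infinity.
for pct in (1, 10, 50, 63.2, 90, 99):
    print(f"{pct:>5}% -> y = {y_axis_position(pct):+.3f}")
```

Note that the labels are not evenly spaced, and that the 63.2% label lands almost exactly at $y=0$.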

To illustrate, consider the following probability plot on a Weibull probability paper.

[Figure: Weibull probability plot (ch3StatB__223.gif)]

This paper is constructed based on the mentioned $y$- and $x$-transformations, where the y-axis represents unreliability and the x-axis represents time. Both of these values must be known for each time-to-failure point we want to plot.

Then, given the $y$ and $x$ values for each point, the points can easily be put on the plot. Once the points have been placed on the plot, the best possible straight line is drawn through them. The slope of this line can then be obtained (some probability papers include a slope indicator to simplify this calculation). The slope is the estimate of the parameter $\beta .$

To determine the scale parameter, $\eta $ (also called the characteristic life by some authors), one must simply set $T=\eta $ in the cdf equation. Note that from before:

$$Q(T)=1-e^{-\left(\frac{T}{\eta}\right)^{\beta}}$$

so at $T=\eta :$

$$Q(\eta)=1-e^{-1}=0.632=63.2\%$$

Thus, if we enter the $y$-axis at $Q(T)=63.2\%,$ the corresponding value of $T$ on the fitted line will be equal to $\eta .$ Using this simple but rather time-consuming methodology, the parameters of the Weibull distribution can be estimated.
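The graphical reading of $\beta$ and $\eta$ can be mimicked numerically. The sketch below assumes two $(T, Q)$ pairs have been read off the drawn line; the function names and the example readings are hypothetical. It uses the fact that $Q=63.2\%$ corresponds to $y=0$ on the transformed axis.

```python
import math

def y_position(q):
    """Transformed y-coordinate for an unreliability q (0 < q < 1)."""
    return math.log(math.log(1.0 / (1.0 - q)))

def parameters_from_line(p1, p2):
    """Estimate (beta, eta) from two (T, Q) pairs read off the fitted line."""
    (t1, q1), (t2, q2) = p1, p2
    x1, x2 = math.log(t1), math.log(t2)
    y1, y2 = y_position(q1), y_position(q2)
    beta = (y2 - y1) / (x2 - x1)        # slope of the line
    eta = math.exp(x1 - y1 / beta)      # T at which y = 0, i.e. Q = 63.2%
    return beta, eta

# Hypothetical readings from a drawn line (assumption, not data from the text).
beta_hat, eta_hat = parameters_from_line((10.0, 0.10), (50.0, 0.80))
print(beta_hat, eta_hat)
```

Any two distinct points on the line give the same answer; picking points far apart simply reduces the effect of reading errors.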

Determining the X and Y Position of the Plot Points

The points on the plot represent our data or, more specifically, our times-to-failure data. If, for example, we tested four units that failed at 10, 20, 30 and 40 hours, we would use these times as our $x$ values or time values. Determining the appropriate $y$ plotting positions, or unreliability values, is a little more complex. To determine the $y$ plotting positions, we must first determine a value indicating the corresponding unreliability for each failure. In other words, we need to obtain the cumulative percent failed for each time-to-failure. In this example, the cumulative percent failed is 25% by 10 hours, 50% by 20 hours, and so forth.

This simple method illustrates the idea, but it has a problem: the 100% point is not defined on most probability plots, so an alternative and more robust approach must be used. The most widely used method of determining this value is to obtain the median rank for each failure. This method is discussed next.
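The limitation of the simple cumulative-percent method can be seen in a few lines. This sketch uses the four failure times from the example; the function name `naive_plotting_positions` is an assumption for illustration.

```python
import math

def naive_plotting_positions(times):
    """Return (time, cumulative fraction failed, transformed y) for each failure."""
    ordered = sorted(times)
    n = len(ordered)
    rows = []
    for j, t in enumerate(ordered, start=1):
        q = j / n
        # The double log transform is undefined at q = 1 (the 100% point).
        y = math.log(math.log(1.0 / (1.0 - q))) if q < 1.0 else math.inf
        rows.append((t, q, y))
    return rows

for t, q, y in naive_plotting_positions([10, 20, 30, 40]):
    print(t, q, y)
```

The last failure maps to an infinite $y$ position, so it cannot be placed on the paper.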

Median Ranks

Median ranks are used to obtain an estimate of the unreliability, $Q(T_{j}),$ for each failure. The median rank is the value that the true probability of failure, $Q(T_{j}),$ should have at the $j^{th}$ failure out of a sample of $N$ units at a $50\%$ confidence level. This essentially means that it is our best estimate for the unreliability: half of the time the true value will be greater than the 50% confidence estimate, and the other half of the time it will be less. This estimate is based on a solution of the binomial equation.

The rank can be found for any percentage point, $P$, greater than zero and less than one, by solving the cumulative binomial equation for $Z$, which represents the rank, or unreliability estimate, for the $j^{th}$ failure [15; 16]. The cumulative binomial equation is:

$$P=\sum_{k=j}^{N}\binom{N}{k}Z^{k}\left(1-Z\right)^{N-k}$$

where $N$ is the sample size and $j$ the order number.

The median rank is obtained by solving this equation for $Z$ at $P=0.50,$

$$0.50=\sum_{k=j}^{N}\binom{N}{k}Z^{k}\left(1-Z\right)^{N-k} \tag{16}$$

For example, if $N=4$ and we have four failures, we would solve the median rank equation, Eqn. (16), four times; once for each failure with $j=$ 1, 2, 3 and 4, for the value of $Z.$ This result can then be used as the unreliability estimate for each failure or the $y$ plotting position. (The Weibull distribution chapter presents a step-by-step example for this method.) The solution of Eqn. (16) for $Z$ requires the use of numerical methods.
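One possible numerical approach is bisection, since the cumulative binomial sum increases monotonically with $Z$. The sketch below solves Eqn. (16) for $N=4$; the function names are assumptions, and bisection is just one of several root-finding methods that could be used here.

```python
from math import comb

def cumulative_binomial(z, j, n):
    """P = sum over k = j..n of C(n, k) * z**k * (1 - z)**(n - k)."""
    return sum(comb(n, k) * z**k * (1.0 - z) ** (n - k) for k in range(j, n + 1))

def median_rank(j, n, p=0.50, tol=1e-12):
    """Solve cumulative_binomial(z, j, n) = p for z by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cumulative_binomial(mid, j, n) < p:
            lo = mid   # root lies above mid
        else:
            hi = mid   # root lies at or below mid
    return (lo + hi) / 2.0

for j in range(1, 5):
    print(j, round(median_rank(j, 4), 4))
```

Passing a different `p` (for example 0.05 or 0.95) gives the rank at another percentage point.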

A more straightforward and easier method of estimating median ranks is by applying two transformations to Eqn. (16), first to the beta distribution and then to the F distribution, resulting in [12;13],

$$MR=\frac{1}{1+\frac{N-j+1}{j}F_{0.50;m;n}},\qquad m=2(N-j+1),\quad n=2j$$

$F_{0.50;m;n}$ denotes the F distribution at the 0.50 point, with $m$ and $n$ degrees of freedom, for the $j^{th}$ failure out of $N$ units. Weibull++ uses this formulation when determining the median ranks.
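The two transformations can be outlined as follows. This is a sketch of the standard argument rather than a reproduction of the derivation in [12;13]; the symbols $X$, $G$ and $I_{Z}$ are introduced here for illustration.

```latex
% Step 1: the cumulative binomial sum is a regularized incomplete beta function,
% so the median rank Z is the median of a Beta(j, N-j+1) random variable X.
\sum_{k=j}^{N}\binom{N}{k}Z^{k}(1-Z)^{N-k} = I_{Z}(j,\,N-j+1)

% Step 2: for X ~ Beta(j, N-j+1), the ratio below follows an F distribution.
G = \frac{j\,(1-X)}{(N-j+1)\,X} \sim F_{m;n},
\qquad m = 2(N-j+1),\quad n = 2j

% Step 3: solve for X. The map from G to X is monotone, so medians map to medians.
X = \frac{1}{1+\frac{N-j+1}{j}\,G}
\quad\Longrightarrow\quad
MR = \frac{1}{1+\frac{N-j+1}{j}\,F_{0.50;m;n}}
```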

A quicker, though less accurate, approximation of the median ranks is given by [15]:

$$MR=\frac{j-0.3}{N+0.4} \tag{17}$$

This approximation of the median ranks is also known as Benard's approximation.
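To see how close Benard's approximation is, the sketch below compares it with the exact median ranks for the first and last failures out of $N=4$, where the cumulative binomial equation has a closed-form solution. The function names are assumptions for illustration.

```python
def benard_median_rank(j, n):
    """Benard's approximation: (j - 0.3) / (n + 0.4)."""
    return (j - 0.3) / (n + 0.4)

def exact_first_median_rank(n):
    """Exact median rank for j = 1: solves 1 - (1 - Z)**n = 0.5."""
    return 1.0 - 0.5 ** (1.0 / n)

def exact_last_median_rank(n):
    """Exact median rank for j = n: solves Z**n = 0.5."""
    return 0.5 ** (1.0 / n)

n = 4
print(benard_median_rank(1, n), exact_first_median_rank(n))
print(benard_median_rank(n, n), exact_last_median_rank(n))
```

For this sample size the two values agree to about four decimal places, which is well within the precision of a hand-drawn plot.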

Kaplan-Meier

The Kaplan-Meier estimator is used as an alternative to the median ranks method for calculating the estimates of the unreliability for probability plotting purposes. The equation of the estimator is given by,

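$$\widehat{F}(t_{i})=1-\prod_{j=1}^{i}\frac{n_{j}-r_{j}}{n_{j}},\qquad i=1,...,m \tag{18}$$

where $m$ is the total number of data points, $n$ is the total number of units, and

$$n_{i}=n-\sum_{j=1}^{i-1}s_{j}-\sum_{j=1}^{i-1}r_{j},\qquad i=1,...,m$$

with $r_{j}$ the number of failures in the $j^{th}$ data group and $s_{j}$ the number of surviving (suspended) units in the $j^{th}$ data group.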

Weibull++ provides the option to select whether the median ranks or the Kaplan-Meier estimator is used for the unreliability estimates for probability plotting and regression. By default, the median ranks are used.
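A minimal sketch of the Kaplan-Meier calculation follows. It assumes the data are supplied as time-ordered groups of (time, number of failures, number of suspensions); the function name and the five-unit data set are hypothetical, and this is not a description of how Weibull++ implements the estimator.

```python
def kaplan_meier_unreliability(groups, n_units):
    """Kaplan-Meier unreliability estimates at each failure time.

    groups: list of (time, failures, suspensions), sorted by time.
    n_units: total number of units on test.
    """
    at_risk = n_units      # units still operating just before the current group
    reliability = 1.0      # running product of (at_risk - failures) / at_risk
    estimates = []
    for time, failures, suspensions in groups:
        if failures > 0:
            reliability *= (at_risk - failures) / at_risk
            estimates.append((time, 1.0 - reliability))
        at_risk -= failures + suspensions
    return estimates

# Hypothetical data (assumption): failures at 10, 30 and 40 hours,
# suspensions at 20 and 50 hours, five units in total.
data = [(10, 1, 0), (20, 0, 1), (30, 1, 0), (40, 1, 0), (50, 0, 1)]
for t, f in kaplan_meier_unreliability(data, n_units=5):
    print(t, round(f, 4))
```

Suspended units do not produce plot points of their own, but they reduce the number of units at risk for later failures.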

Probability Plots for Other Distributions

This same methodology can be applied to other distributions which have cdf equations that can be linearized. Different probability papers exist for each distribution, since different distributions have different cdf equations. Weibull++ automatically creates these plots for you when choosing a probability plot for a particular distribution. Special scales on these plots allow the parameter estimates to be derived directly from the plots, similar to the way $\beta $ and $\eta $ were obtained from the Weibull probability plot. These will be discussed in subsequent chapters on the individual distributions.

Some Shortfalls of Manual Probability Plotting

Besides the most obvious drawback to probability plotting, which is the amount of effort required, manual probability plotting is not always consistent in the results. Two people plotting a straight line through a set of points will not always draw this line the same way, and thus will come up with slightly different results. This method was used primarily before the widespread use of computers that could easily perform the calculations for more complicated parameter estimation methods, such as the least squares and maximum likelihood methods.
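To illustrate how a computed fit removes the subjectivity of a hand-drawn line, the sketch below fits the line by ordinary least squares on the transformed coordinates. It is only an illustration under stated assumptions: the function name is hypothetical, the plotting positions use Benard's approximation, the regression minimizes vertical ($y$) deviations, and none of this is claimed to match the Weibull++ implementation.

```python
import math

def fit_weibull_least_squares(times):
    """Least squares fit of y = beta*x - beta*ln(eta) to complete failure data."""
    ts = sorted(times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    # Benard's approximation (j - 0.3)/(n + 0.4) as the unreliability estimate.
    ys = [math.log(math.log(1.0 / (1.0 - (j - 0.3) / (n + 0.4))))
          for j in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    beta = slope
    eta = math.exp(-intercept / slope)   # from intercept = -beta * ln(eta)
    return beta, eta

beta_fit, eta_fit = fit_weibull_least_squares([10, 20, 30, 40])
print(beta_fit, eta_fit)
```

Given the same data, this calculation returns the same parameter estimates every time, regardless of who runs it.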

See Also:
Parameter Estimation



©1996-2005. ReliaSoft Corporation. ALL RIGHTS RESERVED.