Besides the capability of a measurement system, the capability of a process is often of interest or needs to be assessed, e.g. as part of a supplier-customer relationship in industry (citep{Kotz.1998, Mittag.1999}). Process capability indices indicate how much of the tolerance range is used up by the common cause variation of the process under consideration. With these techniques one can state how many units (e.g. products) are expected to fall outside the tolerance range (i.e. be defective with respect to the previously defined requirements) if, for instance, production continues without intervention. They also give insight into where to center the process if shifting it is possible and economically sensible. Three indices are defined in the corresponding standard (cite{ISO21747}).(footnote{citep{ISO21747} is being revised by citep{ISO22514-2}; see also citep{ISO22514-2, ISO22514-3, ISO22514-4}.})
\begin{align}
c_p &= \frac{USL - LSL}{Q_{0.99865} - Q_{0.00135}}\\ \nonumber \\
c_{pkL} &= \frac{Q_{0.5} - LSL}{Q_{0.5} - Q_{0.00135}}\\ \nonumber\\
c_{pkU} &= \frac{USL - Q_{0.5}}{Q_{0.99865} - Q_{0.5}}
\end{align}
Here \(Q_p\) denotes the \(p\)-quantile of the distribution of the characteristic. \(c_p\) is the potential process capability, i.e. the capability that could be achieved if the process were centered within the specification limits (USL: Upper Specification Limit, LSL: Lower Specification Limit). \(c_{pk}\) is the actual process capability, which takes into account the location of the distribution (i.e. its center) of the characteristic within the specification limits. For one-sided specification limits, \(c_{pkL}\) and \(c_{pkU}\) are used, with \(c_{pk}\) equal to the smaller of the two indices. In addition to the location, the shape of the distribution of the characteristic is relevant as well. The fit of a specific distribution to given data can be assessed with probability plots (ppPlot) and quantile-quantile plots (qqPlot), as well as with formal tests such as the Anderson-Darling test (citep{Stephens.2006, DAgostino.1986}).
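As a minimal sketch of the quantile-based definitions, the indices can be computed in base R under the assumption of a normal distribution with known mean and standard deviation. The helper name capability_normal and the parameter values are illustrative assumptions and are not part of qualityTools.

```r
# Quantile-based capability indices under an assumed normal distribution.
# capability_normal is a hypothetical helper, not a qualityTools function.
capability_normal <- function(mu, sigma, lsl, usl) {
  q_low  <- qnorm(0.00135, mu, sigma)   # Q_0.00135
  q_med  <- qnorm(0.5,     mu, sigma)   # Q_0.5 (median)
  q_high <- qnorm(0.99865, mu, sigma)   # Q_0.99865
  cp   <- (usl - lsl)   / (q_high - q_low)
  cpkL <- (q_med - lsl) / (q_med - q_low)
  cpkU <- (usl - q_med) / (q_high - q_med)
  # expected fraction of units outside the tolerance range
  p_out <- pnorm(lsl, mu, sigma) + pnorm(usl, mu, sigma, lower.tail = FALSE)
  c(cp = cp, cpkL = cpkL, cpkU = cpkU, cpk = min(cpkL, cpkU), p_out = p_out)
}

# illustrative values: process centered at 20 with sd 1, limits at 17 and 23
idx <- capability_normal(mu = 20, sigma = 1, lsl = 17, usl = 23)
print(round(idx, 4))
```

For a normal distribution the range between the 0.135% and 99.865% quantiles spans roughly six standard deviations, so with limits three standard deviations on either side of the mean all indices come out close to 1 and about 0.27% of the units are expected outside the tolerance range.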
Process capabilities can be calculated with the pcr function of the qualityTools package. pcr plots a histogram of the data together with the fitted distribution and returns the capability indices along with the estimated parameters of the distribution, an Anderson-Darling test for the specified distribution, and the corresponding QQ plot. The relevant arguments of pcr are the upper and lower specification limits (usl, lsl) and the type of distribution (distribution). The distribution itself is fitted with the fitdistr function of the R package MASS.
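The fitting step can be illustrated on its own. The following is a small sketch that calls fitdistr from MASS directly on simulated data; how pcr passes the data to fitdistr internally is not shown here, and the variable names are illustrative.

```r
library(MASS)  # provides fitdistr

set.seed(1234)
x <- rnorm(20, mean = 20)

# maximum-likelihood fit of a normal distribution
fit <- fitdistr(x, "normal")
print(fit$estimate)   # named vector with elements "mean" and "sd"

# the ML estimate of sd divides by n, whereas sd() divides by n - 1
n <- length(x)
ml_sd <- sd(x) * sqrt((n - 1) / n)
print(c(fitdistr = unname(fit$estimate["sd"]), by_hand = ml_sd))
```

Because the maximum-likelihood estimate of the standard deviation uses the divisor n, it is slightly smaller than the value returned by sd for the same data.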
library(qualityTools)
set.seed(1234)
#normally distributed sample centered at 20
d1 = rnorm(20, mean = 20)
#Weibull distributed sample
d2 = rweibull(20, shape = 2, scale = 8)
par(mfrow = c(1,2))
#process capability of d1 assuming a normal distribution
pcr(d1, "normal", lsl = 17, usl = 23)
Process Capability assuming a normal and a logistic distribution
##
## Anderson Darling Test for normal distribution
##
## data: d1
## A = 0.5722, mean = 19.749, sd = 1.014, p-value = 0.1191
## alternative hypothesis: true distribution is not equal to normal
pcr(d2, "logistic", lsl = 1, usl = 20)
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
## Warning in densfun(x, parm[1], parm[2], ...): NaNs produced
##
## Anderson Darling Test for logistic distribution
##
## data: d2
## A = 0.3795, location = 6.866, scale = 1.428, p-value > 0.25
## alternative hypothesis: true distribution is not equal to logistic
Along with the graphical representation, an Anderson-Darling test for the corresponding distribution is returned.
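To make the test statistic less of a black box, the Anderson-Darling statistic A^2 can be sketched in a few lines of base R for a normal distribution. The helper ad_statistic is hypothetical and is not the routine used by pcr; the p-value, which depends on the distribution and on whether the parameters were estimated, is deliberately left out.

```r
# Anderson-Darling statistic A^2 for a normal distribution with given parameters.
# ad_statistic is a hypothetical helper, not part of qualityTools.
ad_statistic <- function(x, mu, sigma) {
  n <- length(x)
  u <- pnorm(sort(x), mu, sigma)   # fitted CDF at the ordered observations
  i <- seq_len(n)
  # A^2 = -n - (1/n) * sum((2i - 1) * (log(u_i) + log(1 - u_(n+1-i))))
  -n - mean((2 * i - 1) * (log(u) + log(1 - rev(u))))
}

set.seed(1234)
x <- rnorm(20, mean = 20)

a2_fitted  <- ad_statistic(x, mu = mean(x), sigma = sd(x))
# a deliberately wrong location should give a larger statistic
a2_shifted <- ad_statistic(x, mu = mean(x) + 2, sigma = sd(x))
print(c(fitted = a2_fitted, shifted = a2_shifted))
```

Small values of the statistic indicate that the empirical and the hypothesized distribution functions agree closely, with the tails weighted more heavily than the center.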
Omitting the upper or the lower specification limit means that only the lower or the upper process capability index, respectively, is calculated. If neither lsl nor usl is specified, both limits are calculated such that the resulting process capability indices equal 1.
par(mfrow = c(1,2))
pcr(d1, "normal", usl = 23)
Missing lower/upper specification limit
##
## Anderson Darling Test for normal distribution
##
## data: d1
## A = 0.5722, mean = 19.749, sd = 1.014, p-value = 0.1191
## alternative hypothesis: true distribution is not equal to normal
pcr(d1, "normal", lsl = 17)
##
## Anderson Darling Test for normal distribution
##
## data: d1
## A = 0.5722, mean = 19.749, sd = 1.014, p-value = 0.1191
## alternative hypothesis: true distribution is not equal to normal
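The one-sided cases and the default limits can be sketched in base R under a normal assumption. Reading "indices of 1" as placing the limits at the 0.135% and 99.865% quantiles follows from the index definitions; whether pcr computes its default limits in exactly this way is an assumption.

```r
set.seed(1234)
x <- rnorm(20, mean = 20)
m <- mean(x)
s <- sd(x)

q_low  <- qnorm(0.00135, m, s)
q_med  <- qnorm(0.5,     m, s)
q_high <- qnorm(0.99865, m, s)

# only an upper limit given: only the upper index can be computed
usl  <- 23
cpkU <- (usl - q_med) / (q_high - q_med)

# only a lower limit given: only the lower index can be computed
lsl  <- 17
cpkL <- (q_med - lsl) / (q_med - q_low)

# no limits given: limits placed at the extreme quantiles yield an index of 1
lsl0 <- q_low
usl0 <- q_high
cp0  <- (usl0 - lsl0) / (q_high - q_low)

print(c(cpkU = cpkU, cpkL = cpkL, cp0 = cp0))
```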
For distributions other than the normal, a Box-Cox transformation can be used to obtain approximately normally distributed data by setting boxcox = TRUE. In this case the parameter lambda is estimated and the data are transformed accordingly. It is also possible to specify lambda directly via lambda = value.
pcr(d1, boxcox = TRUE, lsl = 17, usl = 23) #arbitrary example!
Box-Cox Transformation
##
## Anderson Darling Test for normal distribution
##
## data: d1
## A = 0.4561, mean = 0.001, sd = 0.000, p-value = 0.2392
## alternative hypothesis: true distribution is not equal to normal
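For orientation, the textbook form of the Box-Cox transformation and a maximum-likelihood estimate of lambda can be sketched in base R. The helpers box_cox and bc_loglik are hypothetical names; pcr may use a different parameterization or estimation routine internally, so the numbers are not meant to reproduce its output.

```r
# Textbook Box-Cox transformation; requires strictly positive data
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for an intercept-only normal model
bc_loglik <- function(lambda, x) {
  y <- box_cox(x, lambda)
  n <- length(x)
  -n / 2 * log(mean((y - mean(y))^2)) + (lambda - 1) * sum(log(x))
}

set.seed(1234)
x <- rweibull(20, shape = 2, scale = 8)   # right-skewed, positive data

# search for the lambda that maximizes the profile log-likelihood
opt <- optimize(bc_loglik, interval = c(-2, 2), x = x, maximum = TRUE)
lambda_hat <- opt$maximum
y <- box_cox(x, lambda_hat)

# specification limits are transformed with the same lambda as the data
lsl_t <- box_cox(1,  lambda_hat)
usl_t <- box_cox(20, lambda_hat)
print(c(lambda = lambda_hat, lsl_t = lsl_t, usl_t = usl_t))
```

Since the transformation is monotonically increasing for positive data, the order of the limits is preserved, and the capability indices can then be computed on the transformed scale.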
Data organized in subgroups can be handled with the grouping argument (only for normally distributed data).
#process capability for a normal distribution and data in subgroups
#some artificial data with shifted means in subgroups
x = c(rnorm(5, mean = 1), rnorm(5, mean = 2), rnorm(5, mean = 0))
group = c(rep(1,5), rep(2,5), rep(3,5))
pcr(x, grouping = group) #compare to sd(x)
Subgrouping for normally distributed data
##
## Anderson Darling Test for normal distribution
##
## data: x
## A = 0.2653, mean = 0.373, sd = 1.028, p-value = 0.6407
## alternative hypothesis: true distribution is not equal to normal
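Why subgroups matter can be seen by comparing the overall standard deviation with a within-subgroup estimate. The sketch below uses the pooled standard deviation, which is one common choice; which estimator pcr applies to grouped data is not stated here, so this is an assumption made for illustration.

```r
set.seed(1234)
# artificial data with shifted means in three subgroups of size 5
x <- c(rnorm(5, mean = 1), rnorm(5, mean = 2), rnorm(5, mean = 0))
group <- rep(1:3, each = 5)

# pooled within-subgroup standard deviation
vars <- tapply(x, group, var)
n_i  <- tapply(x, group, length)
sd_within <- sqrt(sum((n_i - 1) * vars) / sum(n_i - 1))

# overall standard deviation ignores the subgroup structure
sd_overall <- sd(x)
print(c(within = sd_within, overall = sd_overall))
```

Shifts between subgroup means enter the overall standard deviation but not the within-subgroup estimate, so a capability index based on the latter describes the short-term variation only.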
Q-Q plots can be created with the qqPlot function of the qualityTools package.
par(mfrow = c(1,2))
qqPlot(d2, "weibull")
qqPlot(d2, "normal")
QQ-Plots for different distributions
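What a Q-Q plot displays can be sketched with base R alone: the ordered observations are plotted against the quantiles of the hypothesized distribution at suitable plotting positions. The Weibull parameters are taken as known here for simplicity, which is an assumption; qqPlot estimates them from the data, and its plotting positions and axes may differ.

```r
set.seed(1234)
x <- rweibull(20, shape = 2, scale = 8)

p        <- ppoints(length(x))   # plotting positions in (0, 1)
sample_q <- sort(x)              # empirical quantiles

# theoretical quantiles under the two candidate distributions
theo_weibull <- qweibull(p, shape = 2, scale = 8)
theo_normal  <- qnorm(p, mean = mean(x), sd = sd(x))

# points close to the line y = x indicate a good fit
if (interactive()) {
  par(mfrow = c(1, 2))
  plot(theo_weibull, sample_q, main = "Weibull"); abline(0, 1)
  plot(theo_normal,  sample_q, main = "Normal");  abline(0, 1)
}
```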
Probability plots can be created with the ppPlot function of the qualityTools package.
par(mfrow = c(1,2))
ppPlot(d1, "weibull")
## $shape
## [1] 18.16804
##
## $scale
## [1] 20.24174
##
## [1] 0.02712497 0.07448611 0.11716049 0.15597608 0.19115860 0.22274691
## [7] 0.25068393 0.27484618 0.29505297 0.31106578 0.32258091 0.32921508
## [13] 0.33048197 0.32575438 0.31420183 0.29468107 0.26552607 0.22408706
## [19] 0.16546838 0.07703570
ppPlot(d1, "normal")
## $mean
## [1] 19.74934
##
## $sd
## [1] 0.9881374
##
## [1] 0.0591467 0.1432548 0.2083248 0.2608699 0.3035130 0.3376776 0.3642353
## [8] 0.3837475 0.3965759 0.4029386 0.4029386 0.3965759 0.3837475 0.3642353
## [15] 0.3376776 0.3035130 0.2608699 0.2083248 0.1432548 0.0591467
PP-Plots for different distributions
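The idea behind a probability plot can also be sketched in base R: the fitted cumulative distribution function evaluated at the ordered observations is compared with the empirical probabilities. This is a generic illustration under a normal assumption; the axis conventions and plotting positions used by ppPlot may differ.

```r
set.seed(1234)
x <- rnorm(20, mean = 20)

emp_p  <- ppoints(length(x))                          # empirical probabilities
theo_p <- pnorm(sort(x), mean = mean(x), sd = sd(x))  # fitted CDF at ordered data

# largest vertical distance from the line y = x
max_dev <- max(abs(theo_p - emp_p))
print(max_dev)

# points close to the line y = x indicate a good fit
if (interactive()) {
  plot(theo_p, emp_p, xlim = c(0, 1), ylim = c(0, 1),
       xlab = "fitted probability", ylab = "empirical probability")
  abline(0, 1)
}
```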