Collecting data involves the use of measurement systems, often referred to as gages. Before a statement can be made about the quality of a product, i.e. the degree to which a set of inherent characteristics meets requirements (ISO 9000), the capability of the measurement system used needs to be validated. Basically, gages can have two types of impairments:

• a bias (an assumed constant shift of values for measurements of equal magnitude)
• variation
    • introduced by other factors, e.g. operators using these gages (reproducibility)
    • system-immanent variation of the measurement system itself (repeatability)

These impairments lead to varying results for repeated measurements of the same unit (e.g. a product). The amount of tolerable variation depends on the number of distinct categories you need to be able to identify in order to characterize the product. This tolerable amount of variation for a measurement system relates directly to the tolerance range of the characteristics of a product. With regard to these two impairments, measurement systems are characterized by their accuracy and precision.

Accuracy describes how close the observed average of repeated measurements is to a reference value under fixed conditions of measurement, i.e. one operator measures the same part several times with the same measurement device. The term bias is often used instead of accuracy.

Precision is used to describe the expected variation of repeated measurements with fixed conditions of measurement, i. e. one operator measures the same part several times with the same measurement device (repeatability).

The capability of a measurement system is crucial for any conclusion based on data. Measurement systems that are not capable, whether because of an unadjusted bias or because of large system-immanent variation, lead to two serious errors of judgement:

• Accepting items that are actually out of tolerance
• Declining items that are actually within tolerance

Declining items that are actually within tolerance is a type I error (statistically: a true null hypothesis $$H_{0}$$ is incorrectly rejected) and is known as producer’s risk. This is the chance of a false alarm. Accepting items that are actually out of tolerance is a type II error (statistically: a false null hypothesis $$H_{0}$$ is not rejected) and is known as consumer’s risk. This is a missed opportunity to detect a defect. In summary, the capability of a measurement system is directly related to costs (see figure gageCapability).
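The two errors can be made concrete with a small simulation. The following base R sketch is purely illustrative: the process spread, the gage standard deviation and the sample size are assumptions chosen for the example, and only the tolerance limits are borrowed from the gage capability example later in this text.

```r
# Illustrative simulation; all numeric settings below are assumptions.
set.seed(42)
LSL <- 9.903            # lower specification limit
USL <- 10.103           # upper specification limit
n <- 100000             # number of simulated units (assumed)
sigma_process <- 0.04   # assumed spread of the true values
sigma_gage <- 0.01      # assumed repeatability of the gage

# True characteristic of each unit and the value the gage reports
true_value <- rnorm(n, mean = 10.003, sd = sigma_process)
measured <- true_value + rnorm(n, mean = 0, sd = sigma_gage)

in_tolerance <- true_value >= LSL & true_value <= USL
accepted <- measured >= LSL & measured <= USL

# Type I error: unit is within tolerance but gets declined
false_reject <- mean(in_tolerance & !accepted)
# Type II error: unit is out of tolerance but gets accepted
false_accept <- mean(!in_tolerance & accepted)

print(c(false_reject = false_reject, false_accept = false_accept))
```

Increasing `sigma_gage` or adding a constant shift to `measured` raises both proportions, which is the link between gage capability and cost.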

## Measurement Systems Analysis

### Gage Capability – Measurement Systems Analysis - Type I

Suppose an engineer wants to check the capability of a new optical measurement device. A unit with a known characteristic ($$x_m = 10.003mm$$) is measured $$n=25$$ times. From the measurement values the mean $$\overline{x_g}$$ and standard deviation $$s_g$$ can be calculated. Here, $$s_g$$ is an estimate of $$\sigma_g$$, the standard deviation of the gage, which is also referred to as repeatability.

Basically, the calculation of a capability index comprises two steps. First, a fraction of the tolerance width (i.e. $$USL - LSL$$, with $$USL$$ being the Upper Specification Limit and $$LSL$$ being the Lower Specification Limit) is calculated. This fraction is typically $$0.2$$. In a second step this fraction is set in relation to a measure of the spread (i.e. the range in which 95.45% or 99.73% of the values are to be expected). For normally distributed measurement values these ranges are $$\pm 2\sigma_g$$ and $$\pm 3\sigma_g$$ around the mean, i.e. widths of $$4\sigma_g$$ and $$6\sigma_g$$, with $$\sigma_g$$ estimated by $$s_g$$ from the measurement values. For non-normally distributed data the corresponding quantiles can be taken. If there’s no bias, this calculation gives the capability index $$c_g$$, which then reflects the true capability of the measurement device.

\begin{align} c_g &= \frac{0.2\cdot(USL - LSL)}{6\cdot s_g}\\ &= \frac{0.2\cdot(USL - LSL)}{X_{0.99865} - X_{0.00135}} \end{align}

However, if there’s a bias, it is taken into account by subtracting its absolute value in the numerator. In this case $$c_g$$ reflects only the potential capability (i.e. the capability if the bias is corrected) and $$c_{gk}$$ is an estimator of the actual capability. The bias is calculated as the difference between the known characteristic $$x_m$$ and the mean of the measurement values $$\overline{x_g}$$:

$c_{gk} = \frac{0.1\cdot(USL - LSL) - \left| x_m - \overline{x_g} \right|}{3\cdot s_g}$

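Whether the bias is due to chance can be checked with a t-test, which has the general form given below. Here $$s_{Bias}$$ is the standard deviation of the differences between the measurement values and $$x_m$$ and therefore equals $$s_g$$: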

$t = \frac{\text{difference in means}}{\text{standard error of the difference}} = \frac{Bias}{\frac{s_{Bias}}{\sqrt{n}}}$

Besides bias and standard deviation, it is important to check the run chart of the measurement values. With the qualityTools package, all of this is available through the cg method. The output of the cg method is shown in the following figure.

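#load the package that provides cg
library(qualityTools)
#25 repeated measurements of the reference unit
x = c(9.991, 10.013, 10.001, 10.007, 10.010, 10.013, 10.008, 10.017, 10.005,
      10.005, 10.002, 10.017, 10.005, 10.002,  9.996, 10.011, 10.009, 10.006,
      10.008, 10.003, 10.002, 10.006, 10.010,  9.992, 10.013)
#capability study against the reference value and the tolerance limits
cg(x, target = 10.003, tolerance = c(9.903, 10.103))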
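As a cross-check of the formulas above, the indices and the t statistic can also be computed by hand. The following is a minimal base R sketch that does not call qualityTools. The variable names are chosen for illustration, and it assumes the factors 0.2 and 0.1 and the $$6\cdot s_g$$ spread from the formulas in this section. Whether cg uses the same defaults is an assumption.

```r
# Measurements and specification copied from the example above
x <- c(9.991, 10.013, 10.001, 10.007, 10.010, 10.013, 10.008, 10.017, 10.005,
       10.005, 10.002, 10.017, 10.005, 10.002,  9.996, 10.011, 10.009, 10.006,
       10.008, 10.003, 10.002, 10.006, 10.010,  9.992, 10.013)
target <- 10.003   # known characteristic x_m
LSL <- 9.903
USL <- 10.103

x_bar <- mean(x)        # mean of the measurement values
s_g <- sd(x)            # estimated repeatability
bias <- x_bar - target  # observed bias

# Potential capability (bias ignored) and actual capability (bias included)
c_g <- 0.2 * (USL - LSL) / (6 * s_g)
c_gk <- (0.1 * (USL - LSL) - abs(bias)) / (3 * s_g)

# t statistic for H0: bias = 0, and the same test via base R's t.test
t_stat <- bias / (s_g / sqrt(length(x)))
tt <- t.test(x, mu = target)

print(round(c(mean = x_bar, sd = s_g, bias = bias, cg = c_g, cgk = c_gk), 4))
print(c(t = t_stat, p.value = tt$p.value))
```

With these data the sketch gives $$c_g$$ close to 1.00 and $$c_{gk}$$ close to 0.84. The gap between the two is caused by the bias of about 0.003 mm.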

### Gage Repeatability & Reproducibility – Measurement Systems Analysis - Type II

A common procedure applied in industry is to perform a Gage R&R analysis to assess the repeatability and the reproducibility of measurement systems (ISO 22514-1; Dietrich 2007; Dietrich 2008). R&R stands for repeatability and reproducibility. Repeatability refers to the precision of a measurement system (i.e. the standard deviation of subsequent measurements of the same unit). Reproducibility is the part of the overall variance that models the effect of, for example, different operators performing measurements on the same unit, together with a possible interaction between the operators and the parts measured within this Gage R&R. The overall model is given by

$\sigma^2_{total} = \sigma^2_{Parts} +\sigma^2_{Operator} +\sigma^2_{Parts\times Operator} +\sigma^2_{Error}$

where $$\sigma^2_{Parts}$$ models the variation between different units of the same process. $$\sigma^2_{Parts}$$ is thus an estimate of the inherent process variability. Repeatability is modeled by $$\sigma^2_{Error}$$ and reproducibility by $$\sigma^2_{Operator} +\sigma^2_{Parts\times Operator}$$.

Suppose 10 randomly chosen units were measured by 3 randomly chosen operators. Each operator measured each unit two times in a randomly chosen order. The units were presented in a way they could not be distinguished by the operators.

The corresponding Gage R&R design can be created using the gageRRDesign method of the qualityTools package. The measurements are assigned to this design using the response method. Methods for analyzing this design are given by gageRR and plot.

#create a gage RnR design
design = gageRRDesign(Operators=3, Parts=10, Measurements=2, randomize=FALSE)
#set the response
response(design) = c(23,22,22,22,22,25,23,22,23,22,20,22,22,22,24,25,27,28,
23,24,23,24,24,22,22,22,24,23,22,24,20,20,25,24,22,24,21,20,21,22,21,22,21,
21,24,27,25,27,23,22,25,23,23,22,22,23,25,21,24,23)
#perform Gage RnR
gdo = gageRR(design)
##
## AnOVa Table -  crossed Design
##               Df Sum Sq Mean Sq F value   Pr(>F)
## Operator       2  20.63  10.317   8.597  0.00112 **
## Part           9 107.07  11.896   9.914 7.31e-07 ***
## Operator:Part 18  22.03   1.224   1.020  0.46732
## Residuals     30  36.00   1.200
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ----------
## AnOVa Table Without Interaction -  crossed Design
##             Df Sum Sq Mean Sq F value   Pr(>F)
## Operator     2  20.63  10.317   8.533 0.000675 ***
## Part         9 107.07  11.896   9.840 2.39e-08 ***
## Residuals   48  58.03   1.209
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ----------
##
## Gage R&R
##                  VarComp VarCompContrib Stdev StudyVar StudyVarContrib
## totalRR            1.664          0.483 1.290     7.74           0.695
##  repeatability     1.209          0.351 1.100     6.60           0.592
##  reproducibility   0.455          0.132 0.675     4.05           0.364
##    Operator        0.455          0.132 0.675     4.05           0.364
##    Operator:Part   0.000          0.000 0.000     0.00           0.000
## Part to Part       1.781          0.517 1.335     8.01           0.719
## totalVar           3.446          1.000 1.856    11.14           1.000
##
## ---
##  * Contrib equals Contribution in %
##  **Number of Distinct Categories (truncated signal-to-noise-ratio) = 1

The standard graphical output of a Gage R&R is given in the following figure using the plot method on gage design object gdo:

#visualization of Gage R&R
plot(gdo)

The barplot gives a visual representation of the variance components. totalRR depicts the total (R)epeatability and (R)eproducibility. About 48% of the total variance is due to the measurement system: 35% repeatability (i.e. variation from the gage itself) and 13% reproducibility (i.e. the effect of the operator and the interaction between operator and part). The AnOVa table shows no significant interaction between parts and operators (p = 0.467). The remaining 52% (0.517 in column VarCompContrib) of the variation stems from differences between parts taken from the process (i.e. process-inherent variation), which can also be seen in the Measurement by Part plot. The variation of the measurements taken by one operator is roughly equal for all three operators (Measurement by Operator), although operator C seems to produce values that are most of the time larger than the values from the other operators (Interaction Operator:Part).

Besides this interpretation of the results, critical values for totalRR, also referred to as GRR, are used within industry and are shown in the following table. However, a measurement system should never be judged by critical values alone.

| Contribution of total RR | Capability |
|---|---|
| $$\leq 0.1$$ | suitable |
| $$>0.1$$ and $$< 0.3$$ | limited suitability depending upon circumstances |
| $$\geq 0.3$$ | not suitable |
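The critical values in the table can be applied mechanically, for instance as a first screening step. The helper below is a hypothetical base R sketch and not part of qualityTools. The function name and the shortened labels are assumptions, and the contribution is expected as a fraction between 0 and 1.

```r
# Hypothetical helper (not a qualityTools function): maps a totalRR
# contribution, given as a fraction between 0 and 1, to the table's classes.
classify_grr <- function(contrib) {
  stopifnot(is.numeric(contrib), all(contrib >= 0), all(contrib <= 1))
  ifelse(contrib <= 0.1, "suitable",
         ifelse(contrib < 0.3, "limited suitability", "not suitable"))
}

# Example: two made-up values and the totalRR contribution 0.483 from the output
print(classify_grr(c(0.05, 0.2, 0.483)))
```

For the study above this classifies the measurement system as not suitable, which should be read together with the plots rather than on its own.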

Checking for interaction: The interaction plot provides a visual check of possible interactions between Operator and Part. For each Operator the average measurement value is shown as a function of the part number. Crossing lines indicate that operators assign different readings to identical parts depending on the combination of Operator and Part. In the case of an interaction between Operator and Part, different readings means that, on average, smaller or larger values are assigned depending on that combination. In this case the lines practically do not cross, but Operator C seems to systematically assign larger readings to the parts than the other operators.

Operators: To check for an operator-dependent effect, measurements are plotted grouped by operator in the form of boxplots. Boxplots that differ in size or location might indicate, for example, different procedures within the measurement process, which then lead to a systematic difference in the readings. In this case one might discuss a possible effect for operator C, which is also supported by the interaction plot.

Inherent process variation: Within this plot measurements are grouped by part. The repeated measurements by different operators per part give an insight into the process. A line connecting the means of the measurements of each part shows the inherent process variation. Each part is measured (number of operators) times (number of measurements per operator) times, here $$3 \cdot 2 = 6$$ times.

Components of variation: In order to understand the output of a Gage R&R study, the variance model given above needs to be referenced. The variance component totalRR (VarComp column) represents the total (R)epeatability and (R)eproducibility. Since variances are simply added, 1.664 is the sum of 1.209 (repeatability, given by $$\sigma^2_{Error}$$) and 0.455 (reproducibility). Reproducibility itself is the sum of Operator ($$\sigma^2_{Operator}$$) and Operator:Part ($$\sigma^2_{Parts\times Operator}$$). Since there’s no interaction, reproducibility amounts to 0.455. Part to Part amounts to 1.781. Together with the total of repeatability and reproducibility this gives $$\sigma^2_{total} = 3.446$$ (the rounded components add up to 3.445; the difference is due to rounding).
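The VarComp column can be reproduced from the mean squares of the AnOVa table without interaction. The sketch below uses the standard method-of-moments estimators for a crossed design with random factors. Whether gageRR computes the components in exactly this way is an assumption, although the numbers agree with the printed output. The number of distinct categories is computed as the truncated value of $$\sqrt{2}$$ times the ratio of the part and totalRR standard deviations, which is also an assumption about the implementation.

```r
# Mean squares copied from the "AnOVa Table Without Interaction" above
ms_operator <- 10.317
ms_part <- 11.896
ms_residual <- 1.209

# Study layout as described in the text
n_operators <- 3
n_parts <- 10
n_meas <- 2

# Method-of-moments estimates (negative estimates are set to zero)
repeatability <- ms_residual
operator <- max(0, (ms_operator - ms_residual) / (n_parts * n_meas))
part_to_part <- max(0, (ms_part - ms_residual) / (n_operators * n_meas))
reproducibility <- operator  # the Operator:Part component is zero here

total_rr <- repeatability + reproducibility
total_var <- total_rr + part_to_part

var_comp <- c(totalRR = total_rr, repeatability = repeatability,
              reproducibility = reproducibility,
              PartToPart = part_to_part, totalVar = total_var)
contrib <- var_comp / total_var

# Truncated signal-to-noise ratio (number of distinct categories)
ndc <- floor(sqrt(2) * sqrt(part_to_part) / sqrt(total_rr))

print(round(cbind(VarComp = var_comp, VarCompContrib = contrib), 3))
print(ndc)
```

The divisors follow from the layout: each operator mean is based on 10 parts times 2 measurements, and each part mean on 3 operators times 2 measurements.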

### Gage Repeatability & Reproducibility – Measurement Systems Analysis - Type III

Measurement Systems Analysis Type III is a specific form of Measurement Systems Analysis Type II where there is no operator influence present.

### Relation to the Measurement Systems Terminology

The Measurement Systems Analysis Manual (MSA 2010) uses a specific terminology for the terms repeatability, reproducibility, Operator, Part to Part, totalRR and the interaction Operator:Part. The objective of this paragraph is to give a short overview of these terms and how they relate to the terms used in the gageRR methods of the qualityTools package.

EV stands for Equipment Variation, which is the variation due to repeatability.

AV stands for Appraiser Variation, which is the variation due to the operators.

INT stands for the interaction Appraiser:Part, which is the Operator:Part interaction.

GRR stands for Gage Repeatability & Reproducibility and refers to the variation introduced by the measurement system. The equivalent to this term is totalRR, which is the sum of repeatability and reproducibility.

PV stands for Part Variation, which relates to Part to Part.
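Read as standard deviations, these abbreviations map onto the variance model used for the Gage R&R analysis. The following relations are a sketch under that assumption. TV (total variation) is not defined in this text and is introduced here only to name the counterpart of $$\sigma^2_{total}$$.

```latex
\begin{align}
EV^2  &\leftrightarrow \sigma^2_{Error} \\
AV^2  &\leftrightarrow \sigma^2_{Operator} \\
INT^2 &\leftrightarrow \sigma^2_{Parts\times Operator} \\
PV^2  &\leftrightarrow \sigma^2_{Parts} \\
GRR^2 &= EV^2 + AV^2 + INT^2 \\
TV^2  &= GRR^2 + PV^2 \leftrightarrow \sigma^2_{total}
\end{align}
```

In the example study, $$GRR^2$$ corresponds to the totalRR variance component 1.664 and $$PV^2$$ to the Part to Part component 1.781.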