Presence/Absence Compliance Sampling

Background Information

The objective of this design is to demonstrate, with high probability, that a high percentage of the decision area (or population) is acceptable, given that none of the observed samples is unacceptable. The following discussion is presented in terms of sampling conducted within a decision area (such as a room or collection of rooms within a building). However, this methodology is equally applicable to the sampling of any finite population of items, in which case the decision area is analogous to the population of items and the grid cell sampling locations are analogous to the individual items that will be sampled.

The hypergeometric model used in this design requires that each sample result be categorized as a binary outcome, such as 1) the presence or absence of a particular quality, 2) a sample result being acceptable or unacceptable as defined by an action level threshold, or 3) contamination being detected or not detected. The statistical sampling approach employed here is known as Compliance Sampling (Schilling and Neubauer, 2009) or accept-on-zero attribute compliance sampling (AOZ-ACS) (Squeglia, 1994; Bowen and Bennett, 1988).

Compliance sampling requires that all surfaces in the decision area be divided into non-overlapping, equal-size grid cells of a specified size that corresponds to the sampling methodology, e.g., 10 cm x 10 cm. The method may be used outdoors if the decision area can be divided into grid cells. The size of the grid cell should correspond to the footprint of the sampling methodology (i.e., the area sampled by the swab, wipe, or vacuum). If more than one sampling methodology is to be employed in a decision area, the size of the grid cell should be chosen to match the sampling methodology with the smallest footprint. The location of samples that will be taken using methodologies with larger footprints should be assigned in a consistent fashion, e.g., the sample is centered on the smaller grid cell that was assigned by VSP, or the upper-left corner of the larger sample is aligned with the upper-left corner of the grid cell assigned by VSP. While this approach to multiple sampling methodologies is conservative, it ensures that the desired confidence level is preserved.

Compliance sampling is especially suited for use in decision areas where very few of the grid cells are unacceptable. If at any time during the sampling process, one of the samples is found to be unacceptable, the decision area is declared to be unacceptable and no further samples for this design need be taken. If this occurs, it may be desirable to implement a hot spot or geospatial sampling plan to characterize the extent of the unacceptable locations or items.

Definitions

\( Grid \, Unit\)

A grid cell is a unit area of specified size, e.g., a 10 cm by 10 cm square, on a surface in the decision area (e.g., floor, ceiling, or wall). The size of the grid cell might be determined by the type of sampling method used (e.g., a swipe or swab sample) and by how much surface area must be swiped or swabbed to obtain sufficient material to achieve the required detection accuracy.

\(N\)

The total number of grid cells in the decision area (target population)

\(n\)

The number of grid cells that are selected using random sampling that will be measured or inspected. The number n is computed in VSP as described below.

\(X\)

The number of unacceptable grid units observed in the sample. While the value of \(X\) may range from 0 to \(n\), it must be 0 in order to declare the decision area acceptable.

\(D\)

The true (but unknown) number of unacceptable grid cells.

\(1 - P\)

The desired fraction of the decision area that will be acceptable with \( (1 - \alpha) \times 100 \% \) confidence.

\(H_0\)

The null hypothesis being tested. \(H_0\) is assumed to be true unless proved otherwise by the data. For this design, \( H_0 : D > D_0 \) (i.e. the decision area is unacceptable).

\(D_0\)

The largest number of unacceptable grid cells that are tolerated in the population if \(H_0\) is not true (i.e. if the decision area is acceptable). Specifically, \(D_0 = PN\).

\(H_a\)

The alternative hypothesis, the hypothesis we desire to prove. For this design, \( H_a : D \leq D_0 \) (i.e., the decision area is acceptable). \(H_a\) can be accepted as being true with \(( 1 - \alpha ) \times 100\% \) confidence when \(H_0\) is rejected. \(H_0\) is rejected if \(X = 0\).

\(\alpha\)

The desired probability of rejecting \(H_0\) when \(H_0\) is really true (Type I error). The value of \(1 - \alpha\), which is the confidence level, is provided by the VSP user as indicated below.

\(Decision \, Rule\)

If none of the \(n\) grid units are unacceptable (\(X = 0\)), then reject \(H_0\) and conclude with \((1 - \alpha ) \times 100 \% \) confidence that \( D \leq D_0 \). In other words, conclude with \((1 - \alpha ) \times 100 \% \) confidence that at least \((1 - P) \times 100 \% \) of the decision area is acceptable.
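The decision rule, together with the early-stopping behavior described in the background section, can be sketched in a few lines of Python. This is only an illustration: the function name `compliance_decision` and the boolean encoding of results (True meaning an unacceptable sample) are assumptions made for the example and are not part of VSP.

```python
def compliance_decision(sample_is_unacceptable):
    """Apply the accept-on-zero rule to a sequence of sample results.

    sample_is_unacceptable: iterable of booleans, one per inspected grid
    unit, where True means the sample was unacceptable (hypothetical
    encoding chosen for this sketch).

    Returns (reject_H0, number_inspected). H0 is rejected, i.e. the
    decision area is declared acceptable, only if X = 0.
    """
    inspected = 0
    for unacceptable in sample_is_unacceptable:
        inspected += 1
        if unacceptable:
            # One unacceptable result is enough: H0 cannot be rejected,
            # and no further samples are needed for this design.
            return False, inspected
    return True, inspected


if __name__ == "__main__":
    print(compliance_decision([False, False, False]))
    print(compliance_decision([False, True, False]))
```

Returning the number of inspected units makes the early stop visible: in the second call the third sample is never examined.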

The required inputs for this design and their relationship to the preceding definitions are illustrated in Figure 1.

image\Compliance0.gif

Figure 1: Required inputs and their corresponding mathematical symbols for the compliance sampling design.

Assumptions

  1. The size of the grid unit has been determined to be appropriate for the measurement (inspection) method to be performed. For example, an appropriate grid unit size might be a 10 cm by 10 cm surface area.

  2. The total number of grid units in the decision area, \(N\), is known.

  3. All \(N\) grid units are the same size.

  4. \(n\) of the \(N\) grid units are selected using random sampling.

  5. The \(n\) selected grid units are representative of the total population of \(N\) grid units.

  6. Each of the \(n\) grid units is measured or inspected using an approved method.

  7. Each sample is correctly classified as being acceptable or unacceptable (no false positives or false negatives).
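Assumption 4 (random selection of \(n\) of the \(N\) grid units) can be illustrated with a simple random sample drawn without replacement. The sketch below is not how VSP places samples; the function name `select_grid_cells`, the zero-based cell indexing, and the optional seed are assumptions made for the example.

```python
import random


def select_grid_cells(N, n, seed=None):
    """Draw n distinct grid-cell indices from 0..N-1 by simple random sampling.

    The indexing scheme and the seed parameter are hypothetical; a seed is
    used here only so that a sampling plan can be reproduced.
    """
    if not 0 <= n <= N:
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)
    # random.Random.sample draws without replacement, so no cell repeats.
    return sorted(rng.sample(range(N), n))


if __name__ == "__main__":
    print(select_grid_cells(N=1000, n=10, seed=42))
```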

Method used in VSP to compute the sample size, the confidence, and the percentage of acceptable grid cells

The method discussed here is similar to the approach used by Bowen and Bennett (1988). The confidence function and the required sample size can be derived via the expression for the Type I error rate of the test of \( H_0: D > D_0 \). Specifically, we want to identify the smallest value of \(n\) that satisfies:

$$ \alpha \geq P (Reject \, H_0 | H_0 \, is \, true) $$

$$ = P(X=0 | D> D_0 ) $$

\begin{equation} \geq P( X = 0 | D = U) \end{equation}

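where \( D_0 = PN \) and \( U = \lfloor PN \rfloor + 1 \). Note that \(U\) denotes the smallest whole number of unacceptable items that may be in the population if \(H_0\) is true. Because the probability of observing \(X = 0\) decreases as \(D\) increases, the Type I error rate under \(H_0\) is largest when \(D = U\), so it suffices to control the error rate at that value. Rewriting the last line of (1) using the mass function of the hypergeometric variate \(X\), we have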

\begin{equation} \alpha \geq \frac{ \dbinom{U}{0} \dbinom{N-U}{n-0}}{\dbinom{N}{n}} = \frac{ (N-U)! \, (N-n)!}{(N-U-n)! \, N!} \end{equation}

Thus, the sample size is the smallest integer-valued \(n\) which satisfies (2). While this sample size is an exact solution, it results in oscillating sample sizes as \(N\) increases, as illustrated in Figure 2 below. To avoid this phenomenon, VSP applies a slightly conservative continuous approximation to the sample size in (2), which ensures that as \(N\) increases, so does \(n\). Using a result given by Jaech (1973, p. 327), the right-hand side of equation (2) can be approximated as

\begin{equation} \alpha \approx \left( 1 - \frac{2n}{2N - V + 1} \right)^V \end{equation}

where \( V = \max(1, D_0) \). Bounding \(V\) below at 1 ensures the continuous approximation agrees with the exact solution when \(D_0 < 1 \) (which occurs when \(P\) is close to 0).

image\Compliance1.gif

Figure 2: The sample size (\(n\)) required to achieve 95% confidence that at least 99% of the sampling area is acceptable, expressed as a function of the population size, \(N\). The exact sample sizes are obtained as the smallest integer-valued \(n\) which satisfy equation (2), and the approximate sample sizes are given by equation (4).
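The exact sample size, i.e. the smallest \(n\) satisfying equation (2), can be found by a direct search over \(n\). The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, \(P < 1\) is assumed, and the rounding of \(PN\) before taking the floor is a guard against floating-point noise added for this example, not something described above.

```python
from math import comb, floor


def prob_zero_unacceptable(N, U, n):
    """P(X = 0 | D = U): hypergeometric probability that a sample of n of
    the N grid cells contains none of the U unacceptable cells."""
    # comb(N - U, n) is 0 when n > N - U, so the probability is then 0.
    return comb(N - U, n) / comb(N, n)


def exact_sample_size(N, P, alpha):
    """Smallest n with P(X = 0 | D = U) <= alpha, U = floor(P*N) + 1."""
    # round() protects the floor against floating-point noise in P * N
    # (an assumption of this sketch).
    U = floor(round(P * N, 9)) + 1
    for n in range(N + 1):
        if prob_zero_unacceptable(N, U, n) <= alpha:
            return n
    return N  # fallback; not reached for 0 <= alpha and U >= 1


if __name__ == "__main__":
    # 95% confidence that at least 99% of 1000 grid cells are acceptable.
    print(exact_sample_size(N=1000, P=0.01, alpha=0.05))
```

Because \(U\) changes in whole-number steps as \(N\) grows, sample sizes computed this way are not monotone in \(N\), which is the oscillation shown in Figure 2.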

Calculating the sample size

A continuous approximation for the sample size can be obtained by solving (3) for \(n\):

\begin{equation} n \approx \lceil 0.5(1 - \alpha^{1/V})(2N-V+1) \rceil \end{equation}
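Equation (4) translates directly into code. The sketch below is a minimal Python illustration, not VSP's implementation; the function name `approx_sample_size` is an assumption.

```python
from math import ceil


def approx_sample_size(N, P, alpha):
    """Continuous approximation to the sample size, equation (4).

    N     : total number of grid cells
    P     : tolerated fraction of unacceptable grid cells (D_0 = P * N)
    alpha : Type I error rate (confidence is 1 - alpha)
    """
    V = max(1.0, P * N)  # V = max(1, D_0)
    return ceil(0.5 * (1.0 - alpha ** (1.0 / V)) * (2 * N - V + 1))


if __name__ == "__main__":
    # 95% confidence that at least 99% of 1000 grid cells are acceptable:
    # V = 10 and 0.5 * (1 - 0.05**0.1) * 1991 is about 257.7.
    print(approx_sample_size(N=1000, P=0.01, alpha=0.05))  # 258
```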

Calculating the confidence

For a given \(n\), \(N\), and \(P\), the achieved confidence level is obtained by subtracting the right-hand side of equation (3) from 1. If the number of samples is large, the calculated confidence can be 100%. When this happens, it is possible that the achieved percentage of the sampling area that is acceptable is higher than originally specified. This higher (or achieved) value of \((1- P)\), which we denote as \(\lambda_a\), can be calculated using the following algorithm.

  1. Set \(\alpha\) equal to a suitably small number (such as 0.000005) and then solve equation (3) numerically for \(V\). Let \(V_1\) denote this solution, with \( \lambda_1 = 1 - V_1/N \) being the corresponding fraction of acceptable grid cells that results in virtually 100% confidence. If there is no solution (which occurs for very small sample sizes), set \( \lambda_1 = 0\).

  2. If \( n \geq (1-P)N \), set \( \lambda_2 = n/N \) , otherwise, let \( \lambda_2 = 0\).

  3. For convenience, let \(\lambda_0 = 1-P\), the desired fraction of acceptable grid cells. Then

$$ \lambda_a = \begin{cases} \max( \lambda_0 , \lambda_2)  &   \mbox{if} \, \lambda_0 > \lambda_1 \\ \max( \lambda_1 , \lambda_2)   & \mbox{otherwise} \\ \end{cases} $$

Note that \( \lambda_a > \lambda_0 \) only occurs when the confidence is 100%. Consequently, when \( \lambda_a > \lambda_0 \), observing no unacceptable grid cells in a sample gives 100% confidence that at least \( \lambda_a \times 100 \% \) of the sampling area is acceptable.
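The confidence calculation and the three-step algorithm for \(\lambda_a\) can be sketched as follows. Several details are assumptions of this example rather than statements about VSP: the function names, the use of bisection as the numerical solver, the restriction of the search for \(V_1\) to the interval \([1, N]\), and the treatment of a non-positive base in equation (3) as a probability of 0 (that is, 100% confidence).

```python
def _rhs(N, n, V):
    """Right-hand side of equation (3); a non-positive base is treated as 0."""
    base = 1.0 - 2.0 * n / (2.0 * N - V + 1.0)
    return max(base, 0.0) ** V


def confidence(N, n, P):
    """Achieved confidence: 1 minus the right-hand side of equation (3)."""
    return 1.0 - _rhs(N, n, max(1.0, P * N))


def achieved_fraction(N, n, P, tiny_alpha=5e-6):
    """lambda_a from steps 1 to 3, solving for V by bisection on [1, N]."""
    # Step 1: solve equation (3) for V with alpha set to a very small value.
    lam1 = 0.0
    lo, hi = 1.0, float(N)
    if _rhs(N, n, lo) >= tiny_alpha >= _rhs(N, n, hi):
        for _ in range(200):  # the right-hand side decreases as V grows
            mid = 0.5 * (lo + hi)
            if _rhs(N, n, mid) > tiny_alpha:
                lo = mid
            else:
                hi = mid
        lam1 = 1.0 - 0.5 * (lo + hi) / N
    # Step 2: fraction of the area covered directly by the sample itself.
    lam0 = 1.0 - P
    lam2 = n / N if n >= lam0 * N else 0.0
    # Step 3: combine the candidates.
    return max(lam0, lam2) if lam0 > lam1 else max(lam1, lam2)


if __name__ == "__main__":
    print(confidence(N=1000, n=258, P=0.01))
    print(achieved_fraction(N=1000, n=999, P=0.01))
```

With a small sample the function simply returns the desired fraction \(\lambda_0\); only when nearly every grid cell is sampled does it return something larger.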

Calculating the percentage of acceptable grid cells

The desired fraction of the sampling area that is acceptable, \( 1- P \) , (given the values of \(\alpha\) , \( N\) , and \(n\) ) can be obtained by solving (3) numerically for \(P\). Because \(V\) is bounded below at 1, a unique solution for \(P\) can only be found if \( N (1- \alpha ) > n \). For the same reason, the numerical solution for \( 1-P\) must lie inside the interval \( \left( 0, \frac{N-1}{N} \right] \). However, if \( N (1- \alpha ) = n \), then \(P = 0\). Furthermore, if \( N(1- \alpha ) < n\), then \(P\) is still 0 but an even larger confidence is possible. In other words, if \( N( 1 - \alpha ) < n \) and all \(n\) samples are acceptable, we are \( \frac{n}{N} \times 100 \% \) confident that 100% of the sampling area is acceptable.
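Solving equation (3) for \(P\) can be sketched with a simple bisection on \(V = PN\). This is an illustration under stated assumptions: the function name is hypothetical, bisection is just one possible numerical solver, and returning `None` when no solution with \(P \leq 1\) exists (which can happen for very small \(n\)) is a choice made for this example.

```python
def acceptable_fraction(N, n, alpha):
    """Solve equation (3) for P and return 1 - P.

    Returns 1.0 when N * (1 - alpha) <= n (P = 0), and None when no
    solution with P <= 1 exists (an assumption of this sketch).
    """
    if N * (1.0 - alpha) <= n:
        return 1.0  # P = 0: the whole area is covered by the statement

    def rhs(V):
        # Right-hand side of equation (3); a non-positive base is treated as 0.
        base = 1.0 - 2.0 * n / (2.0 * N - V + 1.0)
        return max(base, 0.0) ** V

    lo, hi = 1.0, float(N)  # V = P * N is bounded below at 1
    if rhs(hi) > alpha:
        return None  # even P = 1 does not bring the error rate down to alpha
    for _ in range(200):  # rhs decreases as V increases, so bisection applies
        mid = 0.5 * (lo + hi)
        if rhs(mid) > alpha:
            lo = mid
        else:
            hi = mid
    P = 0.5 * (lo + hi) / N
    return 1.0 - P


if __name__ == "__main__":
    print(acceptable_fraction(N=1000, n=258, alpha=0.05))
```

For \(N = 1000\) and \(\alpha = 0.05\), a sample of 258 grid cells yields a fraction at or slightly above 0.99, while 257 falls just short, which is consistent with the sample size given by equation (4).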

References:

Bowen, M.W. and C.A. Bennett. 1988. Statistical Methods for Nuclear Material Management, NUREG/CR-4604, U.S. Nuclear Regulatory Commission, Washington, DC.

Jaech, J.L. 1973. Statistical Methods in Nuclear Material Control, TID-26298, NTIS, Springfield, Virginia.

Schilling, E.G. and D.V. Neubauer. 2009. Acceptance Sampling in Quality Control, 2nd ed. CRC Press, Taylor & Francis Group, NY.

Squeglia, N.L. 1994. Zero Acceptance Number Sampling Plans. ASQ Quality Press, Milwaukee, WI.

The Compliance Sampling dialog contains the following controls:

Total Number of Grids Cells in Decision Area

Confidence

Percentage of Decision Area that is Acceptable

Sample Placement Page

Cost page

Data Analysis page

 Data Entry sub-page

 Summary Statistics sub-page

 Tests sub-page

 Plots sub-page