The one-sample proportion test in VSP tests a hypothesis about a population proportion against a fixed proportion (the Action Level). To put this test in the context of environmental decision-making, please consult EPA's guidance document, Guidance on Systematic Planning Using the Data Quality Objectives Process (EPA 2006a).
Before developing a sampling plan based on the one-sample proportion test, consider the assumptions and limitations involved. For a discussion of these assumptions and limitations, and for the details of the test, please consult EPA's Data Quality Assessment: Statistical Methods for Practitioners (EPA 2006b). Both this document and the DQO guidance document are currently available at: http://www.epa.gov/quality/qa_docs.html
The example listed in Box 3-11 (EPA 2000b, p. 3-20) suggests a sample size of 422.18 is necessary to achieve the DQOs for this problem. However, VSP calculates a sample size of 368. The explanation is that the 1.04 in the numerator of the sample-size equation in Box 3-11 is a typo. Using the correct value for \(z_{1-\beta}\) of 0.8416, instead of 1.04, and a more precise value for \(z_{1-\alpha}\) of 1.645, instead of 1.64, produces VSP's result of \(n\) = 368.
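As a check on the arithmetic, both figures can be reproduced from the sample-size equation. The inputs are not stated on this page; the sketch below assumes \(P_0 = 0.20\), \(P_1 = 0.15\), \(\alpha = 0.05\), and \(\beta = 0.20\), chosen because they reproduce both quoted values. With the values printed in Box 3-11:
\[ n = \left(\frac{1.64\sqrt{0.20(0.80)} + 1.04\sqrt{0.15(0.85)}}{0.15 - 0.20}\right)^2 \approx \left(\frac{0.65600 + 0.37135}{-0.05}\right)^2 \approx 422.18 \]
With the corrected \(z_{1-\beta}\) and the more precise \(z_{1-\alpha}\):
\[ n = \left(\frac{1.645\sqrt{0.20(0.80)} + 0.8416\sqrt{0.15(0.85)}}{0.15 - 0.20}\right)^2 \approx \left(\frac{0.65800 + 0.30051}{-0.05}\right)^2 \approx 367.50 \]
Rounding 367.50 up to the next whole sample gives \(n\) = 368.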
The number of samples is calculated using Eq. (1) below (EPA 2006b, p. 59). No MQO option is currently provided for this test.
\begin{equation} n=\Bigg(\frac{z_{1-\alpha}\sqrt{P_0(1-P_0)}+z_{1-\beta}\sqrt{P_1(1-P_1)}}{P_1-P_0}\Bigg)^2 \end{equation}
where:
\(n\) | is the recommended minimum sample size |
\(z_{1-\alpha}\) | is the value of the standard normal distribution for which the proportion of the distribution to the left of \(z_{1-\alpha}\) is \(1-\alpha\) |
\(z_{1-\beta}\) | is the value of the standard normal distribution for which the proportion of the distribution to the left of \(z_{1-\beta}\) is \(1-\beta\) |
\(P_0\) | is the Action Level |
\(P_1\) | is the outer bound of the gray region. Note that \(\Delta\), the width of the gray region, is \(\lvert P_1 - P_0 \rvert\) |
\(\alpha\) | is the probability of rejecting the null hypothesis when the null hypothesis is true |
\(\beta\) | is the probability of not rejecting the null hypothesis when the null hypothesis is false |
Note: \(nP_0\) and \(n(1-P_0)\) must be at least 5.
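The calculation in Eq. (1) and the check in the note above can be sketched in a few lines of Python. This is a minimal illustration, not VSP's actual implementation: the function names `proportion_sample_size` and `normal_approximation_ok` are hypothetical, the example inputs are illustrative, and rounding the result up to the next whole sample is an assumption.

```python
import math
from statistics import NormalDist


def proportion_sample_size(p0, p1, alpha, beta):
    """Minimum n from Eq. (1), rounded up to a whole sample (assumed rounding)."""
    if not (0 < p0 < 1 and 0 < p1 < 1) or p0 == p1:
        raise ValueError("p0 and p1 must lie in (0, 1) and differ")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha)  # z_{1-alpha}
    z_beta = std_normal.inv_cdf(1 - beta)    # z_{1-beta}
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    return math.ceil((numerator / (p1 - p0)) ** 2)


def normal_approximation_ok(n, p0):
    """Check the note above: n*P0 and n*(1 - P0) must both be at least 5."""
    return n * p0 >= 5 and n * (1 - p0) >= 5


# Illustrative inputs: Action Level 0.20, gray-region bound 0.15.
n = proportion_sample_size(p0=0.20, p1=0.15, alpha=0.05, beta=0.20)
print(n, normal_approximation_ok(n, 0.20))
```

Because the quantiles come from `statistics.NormalDist` rather than a rounded table, the unrounded result can differ slightly from a hand calculation that uses two- or three-decimal z values.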
The assumptions associated with the formulas for computing the number of samples are:
1. The population values are not spatially or temporally correlated, and
2. The sampling locations will be selected probabilistically.
The first assumption will be assessed in a post-data-collection analysis. The second assumption is valid because the sample locations were selected using a randomization process.
For an illustration of how to use this sampling design, please refer to the Compare Proportion to Fixed Threshold section in chapter 3 of the VSP User’s Guide.
EPA. 2006a. Guidance on Systematic Planning Using the Data Quality Objectives Process. EPA QA/G-4, EPA/240/B-06/001, U.S. Environmental Protection Agency, Office of Environmental Information, Washington DC.
EPA. 2006b. Data Quality Assessment: Statistical Methods for Practitioners. EPA QA/G-9S, EPA/240/B-06/003, U.S. Environmental Protection Agency, Office of Environmental Information, Washington DC.
For Null Hypothesis = Site is Dirty:
Width of Gray Area (Delta) / LBGR / UBGR
For Null Hypothesis = Site is Clean:
Width of Gray Area (Delta) / LBGR / UBGR