Two-Sample t-Test When the Two Populations Have Unequal Variances

Background Information

The two-sample t-test is a statistical test that can be used to test whether the true survey-unit mean exceeds the true reference-area mean. The difference of means is compared to a specified difference, the action level, which may be zero. Because the standard two-sample t-test is valid only for populations with approximately equal variances, a separate method is required when the two populations cannot be assumed to have equal variances. Please consult EPA's guidance document, Guidance on Systematic Planning Using the Data Quality Objectives Process (EPA 2006a), to put this test in the context of environmental decision-making.

Performing the Two-Sample t-Test for Populations With Unequal Variances

The two-sample t-test for populations with unequal variances is performed according to the procedure described in EPA's Data Quality Assessment: Statistical Methods for Practitioners (EPA 2006b, section 3.3.1.1.2). The test is performed on sets of random samples drawn from both the reference area and the survey site populations. The sample mean and sample variance are calculated for each of these sets of samples and used to calculate the test statistic \( t_0 \) according to the following equation (EPA 2006b, Box 3-23):

$$ \large t_0 = \frac{( \bar X - \bar Y ) - AL}{ \sqrt{ \frac{S_X^2}{n} + \frac{S_Y^2}{m}}} $$

where

\( \bar X \) is the mean of the site samples

\( \bar Y \) is the mean of the reference area samples

\( AL \) is the action level, or the specified difference of true means

\( S_X^2 \) is the variance of the site samples

\( S_Y^2 \) is the variance of the reference area samples

\( n \) is the number of samples taken from the site population

\( m \) is the number of samples taken from the reference area population

The critical value \( t_{1- \alpha} \) is then determined, where \( t_{1- \alpha} \) is the value of the t-distribution with \( f \) degrees of freedom for which the proportion of the distribution to the left of \( t_{1- \alpha} \) is \( 1 - \alpha \). The degrees of freedom \( f \) are computed using Satterthwaite's approximation (EPA 2006b, Box 3-23):

$$ \large f = \frac{ \left( \frac{S_X^2}{n} + \frac{S_Y^2}{m} \right)^2}{\frac{S_X^4}{n^2(n-1)} + \frac{S_Y^4}{m^2(m-1)}} $$ , rounded down to the nearest integer.
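The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `welch_statistic` and its argument names are hypothetical, and the sample variance is taken with an \( n - 1 \) denominator.

```python
import math
from statistics import mean, variance


def welch_statistic(site, reference, action_level=0.0):
    """Return (t0, f) for the unequal-variance two-sample t-test.

    site, reference: sequences of measurements (hypothetical argument names).
    """
    n, m = len(site), len(reference)
    # Sample variances (n - 1 denominator) divided by the sample sizes
    vx = variance(site) / n
    vy = variance(reference) / m
    # Test statistic: observed difference minus action level, over its standard error
    t0 = ((mean(site) - mean(reference)) - action_level) / math.sqrt(vx + vy)
    # Satterthwaite's approximation; S_X^4 / (n^2 (n - 1)) equals vx^2 / (n - 1)
    f = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t0, math.floor(f)  # degrees of freedom rounded down


site = [12.0, 14.0, 16.0, 18.0, 20.0]
reference = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
t0, f = welch_statistic(site, reference, action_level=2.0)
print(round(t0, 3), f)
```

For these made-up data, the site mean is 16 with variance 10, the reference mean is 11 with variance 28/6, so \( t_0 = 3 / \sqrt{2 + 2/3} \approx 1.837 \) and \( f \) rounds down from about 6.62 to 6.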

Once both \( t_0 \) and \( t_{1- \alpha} \) have been determined, the test of the null hypothesis is performed by comparing the two values. Because \( t_0 \) is large and positive when the observed difference of means is well above the action level, and large and negative when it is well below, the rejection rule depends on the direction of the null hypothesis. If the null hypothesis is that the difference of true means exceeds the action level (i.e., assuming site is dirty), then the null hypothesis may be rejected if \( t_0 < -t_{1- \alpha} \). If the null hypothesis is that the difference of true means is less than the action level (i.e., assuming site is clean), then the null hypothesis may be rejected if \( t_0 > t_{1- \alpha} \).
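The comparison can be sketched as follows. Since \( t_0 \) grows with the observed site-minus-reference difference, evidence against "difference \( \geq AL \)" is a large negative \( t_0 \), and evidence against "difference \( \leq AL \)" is a large positive \( t_0 \). The helper names (`t_cdf`, `t_quantile`, `reject_null`) are hypothetical, and the critical value is approximated with Simpson's rule and bisection using only the standard library; a statistics package would normally supply it directly.

```python
import math


def t_cdf(x, df, steps=2000):
    """Approximate Student's t CDF by Simpson's rule on the density (steps must be even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = abs(x) / steps
    total = pdf(0.0) + pdf(abs(x))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Find t with CDF(t) = p by bisection (assumes the answer lies in [-100, 100])."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def reject_null(t0, df, alpha, null_is_dirty):
    """Apply the one-sided rejection rule for the chosen null hypothesis."""
    t_crit = t_quantile(1 - alpha, df)
    if null_is_dirty:
        # H0: difference of true means >= action level
        return t0 < -t_crit
    # H0: difference of true means <= action level
    return t0 > t_crit


print(reject_null(-2.5, 6, 0.05, null_is_dirty=True))
```

With 6 degrees of freedom and \( \alpha = 0.05 \), the critical value is about 1.943, so a statistic of \( -2.5 \) rejects the "site is dirty" null hypothesis.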

Method Used to Estimate Recommended Minimum Number of Samples

The minimum number of samples required is estimated by using an iterative simulation. The user specifies the known number of samples and standard deviation of the reference area and the acceptable error rates \( \alpha \) and \( \beta \). The simulation determines whether \(n \) site samples are sufficient to satisfy the required \( \alpha \) and \( \beta \) by repeatedly generating sets of \( n \) samples drawn from simulated site populations and performing the two-sample t-test as described above.

The false rejection error rate \( \alpha \) is approximated by the frequency with which the test rejects the null hypothesis when the set of \( n \) generated site samples is drawn from a population for which the null hypothesis should not be rejected. For example, if the null hypothesis is that the difference in means exceeds the action level (\(H_0\): Difference of True Means \( \geq \) Action Level), the approximate achieved \( \alpha \) is estimated by performing the two-sample t-test on sets of \( n \) site samples drawn from a population with a mean that is at least \( AL \) greater than the reference area mean, and calculating the percentage of times that the test erroneously concludes that the difference of true means is less than the action level.

Likewise, the achieved false acceptance error rate \( \beta \) is approximated by the frequency with which the t-test fails to reject the null hypothesis when the \( n \) generated site samples are drawn from a population for which the null hypothesis should be rejected.

If the estimated achieved error rates exceed the specified acceptable \( \alpha \) and \( \beta \), more samples are necessary to increase the power of the test, and the simulation is repeated with a larger \( n \) until the minimum number of site samples that satisfies the user requirements is found.
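The iterative search described above might look roughly like the sketch below. It is not the software's actual algorithm; it makes several assumptions that are not stated in the text: both populations are simulated as normal, the null hypothesis is "difference \( \geq \) action level", \( \alpha \) is estimated at a true difference equal to the action level, \( \beta \) is estimated at a true difference of \( AL - \Delta \) (the lower bound of the gray region), and a small tolerance `tol` absorbs Monte Carlo noise. All function and parameter names are hypothetical.

```python
import math
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def t_quantile(p, df, steps=400):
    """Approximate t quantile (Simpson's rule CDF plus bisection); assumes p >= 0.5."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    def cdf(x):  # valid for x > 0
        h = x / steps
        s = pdf(0.0) + pdf(x)
        s += sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
        return 0.5 + s * h / 3

    lo, hi = 0.0, 100.0  # assumes the quantile is below 100
    for _ in range(50):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if cdf(mid) < p else (lo, mid)
    return (lo + hi) / 2


def mean_var(values):
    """Sample mean and sample variance (n - 1 denominator)."""
    mu = sum(values) / len(values)
    return mu, sum((v - mu) ** 2 for v in values) / (len(values) - 1)


def rejection_rate(n, m, site_sd, ref_sd, action_level, true_diff, alpha, trials, rng):
    """Fraction of simulated data sets in which H0: difference >= AL is rejected."""
    rejected = 0
    for _ in range(trials):
        x = [rng.gauss(true_diff, site_sd) for _ in range(n)]  # simulated site samples
        y = [rng.gauss(0.0, ref_sd) for _ in range(m)]         # simulated reference samples
        (mx, sx2), (my, sy2) = mean_var(x), mean_var(y)
        vx, vy = sx2 / n, sy2 / m
        t0 = (mx - my - action_level) / math.sqrt(vx + vy)
        f = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
        df = max(1, math.floor(f))  # guard against floating-point values just below 1
        rejected += t0 < -t_quantile(1 - alpha, df)
    return rejected / trials


def minimum_site_samples(m, site_sd, ref_sd, action_level, delta, alpha, beta,
                         trials=1000, max_n=200, tol=0.01, seed=0):
    """Smallest n whose estimated error rates meet alpha and beta, or None."""
    rng = random.Random(seed)
    for n in range(2, max_n + 1):
        args = (n, m, site_sd, ref_sd, action_level)
        # H0 true at the boundary: any rejection is a false rejection
        achieved_alpha = rejection_rate(*args, action_level, alpha, trials, rng)
        # H0 false by delta: any failure to reject is a false acceptance
        achieved_beta = 1 - rejection_rate(*args, action_level - delta, alpha, trials, rng)
        if achieved_alpha <= alpha + tol and achieved_beta <= beta + tol:
            return n
    return None


print(minimum_site_samples(m=10, site_sd=1.0, ref_sd=1.0, action_level=0.0,
                           delta=1.0, alpha=0.05, beta=0.2))
```

Because the error rates are Monte Carlo estimates, the returned \( n \) can shift slightly with the seed and the number of trials; increasing `trials` reduces that variability at the cost of run time.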

Statistical Assumptions

The assumptions associated with the formulas for computing the number of samples are:

1. The data from each area (site and reference area) originate from normal populations,

2. The variance estimates (\(S_X^2\) and \(S_Y^2\)) are reasonable and representative of the population being sampled,

3. The population values are not spatially or temporally correlated, and

4. The sampling locations will be selected randomly.

The first three assumptions will be assessed in a post-data-collection analysis. The last assumption is valid because the sample locations are selected using a random process.

References:

EPA. 2006a. Guidance on Systematic Planning Using the Data Quality Objectives Process. EPA QA/G-4, EPA/240/B-06/001, U.S. Environmental Protection Agency, Office of Environmental Information, Washington DC.

EPA. 2006b. Data Quality Assessment: Statistical Methods for Practitioners. EPA QA/G-9S, EPA/240/B-06/003. U.S. Environmental Protection Agency, Office of Environmental Information, Washington DC.

The Two-Sample t-Test with Unequal Variances dialog contains the following controls:

Analyte

Null Hypothesis

Percent Confident

Action Level (Specified Difference of True Means)

Width of Gray Area (Delta) / LBGR / UBGR (when null hypothesis = "site is unacceptable")

Width of Gray Area (Delta) / LBGR / UBGR (when null hypothesis = "site is acceptable")

Type II Error Rate (Beta) (when null hypothesis = "site is unacceptable")

Type II Error Rate (Beta) (when null hypothesis = "site is acceptable")

Estimated Site Std Dev

Reference Area Std Dev

Number of Reference Samples

Calculate Button

Sample Placement page

Cost page

Data Analysis page

Data Entry sub-page

Summary Statistics sub-page

Tests sub-page

Plots sub-page

Analyte page