# Walsh's Outlier Test

Walsh's test can be used to detect multiple outliers in a data set that
is not required to be normally distributed. It can flag values that are
much smaller than the rest of the data, much larger, or both.

Although Walsh's test does not require that the data be normally distributed,
it does require a large sample: more than 60 observations to test at a
significance level of \(\alpha\)=0.10, and more than 220 observations to test
at a significance level of \(\alpha\)=0.05.

## Performing Walsh's Test

VSP performs Walsh's test as described in section 4.4.6 of the EPA's QA/G-9S document (EPA 2006).
The \( n \) observed values are ordered from smallest to largest. We specify
the maximum number of suspected outliers \( k \) and compute the following
values:

$$ c = \mathrm{ceiling}\left(\sqrt{2n}\right) $$

$$ r = k + c $$

$$ b^2 = \frac{1}{\alpha} $$

$$ a = \frac{1 + b \sqrt{\frac{(c-b^2)}{c-1}}}{c-b^2-1} $$

where \(\alpha\) = 0.10 for \(60 < n \leq 220\), \(\alpha\) = 0.05 for
\(n > 220\), \(b\) is the positive square root of \(b^2\), and ceiling( )
indicates rounding the value up to the next integer.
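
As a rough illustration of these formulas, the following Python sketch computes \(c\), \(r\), \(b^2\), and \(a\) for a given \(n\) and \(k\). The function name `walsh_constants` and its return layout are hypothetical; they are not taken from VSP or from the EPA guidance.

```python
import math


def walsh_constants(n, k):
    """Return (alpha, c, r, b2, a) for n observations and k suspected outliers.

    Hypothetical helper for illustration only.
    """
    if n <= 60:
        raise ValueError("Walsh's test needs more than 60 observations")
    # The significance level is fixed by the sample size, as stated in the text.
    alpha = 0.10 if n <= 220 else 0.05
    c = math.ceil(math.sqrt(2 * n))
    r = k + c
    b2 = 1.0 / alpha
    b = math.sqrt(b2)  # positive square root of b^2
    a = (1 + b * math.sqrt((c - b2) / (c - 1))) / (c - b2 - 1)
    return alpha, c, r, b2, a


alpha, c, r, b2, a = walsh_constants(n=100, k=2)
print(alpha, c, r, round(a, 4))  # 0.1 15 17 0.7225
```

At the boundary sample sizes the denominator \(c - b^2 - 1\) would be zero (for example, \(n = 60\) gives \(c = 11\) with \(b^2 = 10\)), which is consistent with the requirement that \(n\) exceed 60 or 220.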

Let \( X_{(i)} \) denote the \( i \)-th smallest observation. If the following inequality holds:

\( X_{(k)}-(1+a)X_{(k+1)}+aX_{(r)} < 0 \)

then the \( k \) smallest points are outliers with an \(\alpha\) level
of significance.

The \( k \) largest points are outliers with an \(\alpha\) level of
significance if

\( X_{(n+1-k)}-(1+a)X_{(n-k)}+aX_{(n+1-r)} > 0 \).

If both of the inequalities are true, then the test concludes that both
the \( k \) smallest and the \( k \) largest points are outliers, with
a significance level of \( \alpha \).
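
The whole procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not VSP's implementation: the function name `walsh_test`, the returned dictionary, and the input checks are all hypothetical, and the data in the usage example are made up.

```python
import math


def walsh_test(data, k):
    """Apply both Walsh inequalities for k suspected outliers per tail.

    Hypothetical sketch; returns the significance level used and a
    True/False flag for each tail.
    """
    x = sorted(data)  # order the observations from smallest to largest
    n = len(x)
    if n <= 60:
        raise ValueError("Walsh's test needs more than 60 observations")
    alpha = 0.10 if n <= 220 else 0.05
    c = math.ceil(math.sqrt(2 * n))
    r = k + c
    if k < 1 or r > n:
        raise ValueError("k must satisfy 1 <= k and k + c <= n")
    b2 = 1.0 / alpha
    b = math.sqrt(b2)
    a = (1 + b * math.sqrt((c - b2) / (c - 1))) / (c - b2 - 1)

    def X(i):
        # Order statistics in the text are 1-based; Python lists are 0-based.
        return x[i - 1]

    low_stat = X(k) - (1 + a) * X(k + 1) + a * X(r)
    high_stat = X(n + 1 - k) - (1 + a) * X(n - k) + a * X(n + 1 - r)
    return {
        "alpha": alpha,
        "smallest_k_are_outliers": low_stat < 0,
        "largest_k_are_outliers": high_stat > 0,
    }


# Made-up example: the values 1..99 plus one extreme high value.
sample = list(range(1, 100)) + [1000]
print(walsh_test(sample, k=1))
```

With this sample, only the upper-tail inequality holds, so the single largest value is flagged while the smallest is not.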

## References:

EPA. 2006. Data Quality Assessment: Statistical Methods for Practitioners.
EPA QA/G-9S, EPA/240/B-06/003, U.S. Environmental Protection Agency,
Office of Environmental Information, Washington, DC.