Kriging

Post-Processing Mapping

This tab provides access to algorithms that can be used to prepare maps addressing several common concerns in environmental remediation, including the probability of exceeding a specified threshold, several measures of uncertainty, and several concentration estimates. The approach used for generating these maps is based on multiGaussian kriging to produce a model of local uncertainty (Goovaerts 1997; Deutsch and Journel 1998).

This approach assumes that the normal score transform can be used to transform the data so that they have a multivariate spatial Gaussian distribution. Although there is no simple way to fully evaluate the multivariate Gaussian assumption, it is possible to test whether the transformed data are at least bivariate Gaussian (Goovaerts 1997). The Post-Processing Mapping tab includes a button, Test multiGaussian Assumption, that calculates the indicator variograms that would be predicted for the deciles of the data, given the variogram model fit to the normal score transformed data. The theoretical indicator variogram models for each decile are plotted together with the experimental indicator variograms calculated from the data. The most common form of departure is for the experimental indicator variograms to show greater spatial correlation than the multiGaussian model would predict, i.e., extreme values may be more connected than would be expected under the multiGaussian model. However, unless the departure appears to be very pronounced (a subjective judgment), standard practice is to adopt the multiGaussian assumption. In cases where the multiGaussian assumption does not appear to be justified, indicator kriging should be considered (Goovaerts 1997; Deutsch and Journel 1998).
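For reference, the theoretical curve for each decile follows from the bivariate Gaussian distribution (Goovaerts 1997; Deutsch and Journel 1998). The sketch below is an illustration, not the code used in VSP. It assumes a unit-sill normal score variogram, so that the correlogram at lag h is rho(h) = 1 - gamma_Y(h); the function name and the midpoint-rule integration are illustrative choices.

```python
import math
from statistics import NormalDist


def gaussian_indicator_variogram(p, rho, n_steps=2000):
    """Indicator semivariance implied by the bivariate Gaussian model.

    p   : cumulative probability of the threshold (0.1 for the first decile)
    rho : correlation of the normal scores at lag h (assumed 1 - gamma_Y(h))
    """
    y_p = NormalDist().inv_cdf(p)      # standard normal quantile of the threshold
    upper = math.asin(rho)             # upper limit of the integral
    step = upper / n_steps
    total = 0.0
    for k in range(n_steps):           # midpoint rule on [0, arcsin(rho)]
        theta = (k + 0.5) * step
        total += math.exp(-y_p ** 2 / (1.0 + math.sin(theta)))
    return p * (1.0 - p) - total * step / (2.0 * math.pi)


# Model values for the nine deciles at a lag where the normal scores
# have correlation 0.6 (an arbitrary value chosen for illustration).
deciles = [k / 10 for k in range(1, 10)]
model = [gaussian_indicator_variogram(p, 0.6) for p in deciles]
```

Each model value lies between 0 (perfect correlation) and the indicator sill p(1 - p) (no correlation). Experimental indicator variograms that rise more slowly than these curves point to the stronger-than-expected connectivity described above.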

If the multiGaussian assumption appears reasonable, the conditional cumulative distribution function (ccdf) at each location in the map area can be estimated through the kriging mean and variance of the normal score transformed data (Goovaerts 1997). For theoretical correctness, simple kriging should be used to determine the mean and variance. However, if there appear to be trends in the data, it is possible to use ordinary kriging to estimate the mean. Even when ordinary kriging is used to estimate the mean, the kriging variance used to estimate the ccdf should still be the simple kriging variance (Goovaerts 1997), and that is the approach taken in VSP for this module. If significant trends are present, another approach to take would be to fit a deterministic trend model to the data (e.g., by fitting a polynomial regression), and then perform the geostatistical analysis on the residuals from the trend; it would still be necessary to perform that analysis on a normal score transform of the residuals (Goovaerts 1997).
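The transform and the resulting local ccdf can be sketched as follows. This is a minimal illustration rather than the code used in VSP: the plotting position (k - 0.5)/n and the function names are assumptions, and the kriged mean and simple kriging variance are taken as given inputs from a kriging run.

```python
from statistics import NormalDist


def normal_score_transform(values):
    """Map each datum to a standard normal quantile based on its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices in ascending data order
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        p = (rank + 0.5) / n                           # assumed plotting position
        scores[idx] = NormalDist().inv_cdf(p)
    return scores


def local_ccdf(kriged_mean, sk_variance):
    """Gaussian ccdf of the normal scores at one grid node.

    kriged_mean may come from simple or ordinary kriging; the variance is
    the simple kriging variance in both cases, as described above.
    """
    return NormalDist(mu=kriged_mean, sigma=sk_variance ** 0.5)


scores = normal_score_transform([12.0, 3.5, 48.0, 7.1, 20.3])
ccdf = local_ccdf(kriged_mean=0.4, sk_variance=0.36)
prob_below_zero = ccdf.cdf(0.0)   # probability that the normal score is <= 0
```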

Goovaerts (1997, Section 7.2.4) describes the process used to derive the parameters of the Gaussian ccdf, while Section 7.2.5 describes the approach used to transform and back-transform measures of local uncertainty that can be derived from the ccdf. This includes the choice of the approach used to model the tails of the distribution, above the highest data value and below the lowest data value. The default approach used for the lower and upper tails in this module is linear interpolation to the minimum and maximum data values, respectively. However, there may be situations in which the user wishes to acknowledge the uncertainty in the minimum and maximum values during back transformation of the local ccdfs to the original data space. In that case, the user can select the Advanced Options tab and model the lower tail using a power model and the upper tail using either a power model or a hyperbolic model. See Deutsch and Journel (1998) or Goovaerts (1997) for a description of those models and the parameters required.
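With the default tails, the back-transform amounts to a lookup in the table of sorted data values and their normal scores. The sketch below shows one reading of that default and is not the code used in VSP: it assumes linear interpolation in normal score space between tabulated pairs and, because both tails are tied to the extreme data values, it returns the minimum or maximum datum beyond the ends of the table. The function names are illustrative, and the power and hyperbolic tail models are not shown.

```python
from bisect import bisect_left
from statistics import NormalDist


def build_transform_table(data):
    """Return (normal scores, sorted data values) as parallel lists."""
    z = sorted(data)
    n = len(z)
    y = [NormalDist().inv_cdf((k + 0.5) / n) for k in range(n)]  # assumed plotting position
    return y, z


def back_transform(y_value, table_y, table_z):
    """Convert a normal score back to the original data units."""
    if y_value <= table_y[0]:
        return table_z[0]              # lower tail ends at the minimum datum
    if y_value >= table_y[-1]:
        return table_z[-1]             # upper tail ends at the maximum datum
    j = bisect_left(table_y, y_value)  # table_y[j - 1] < y_value <= table_y[j]
    y0, y1 = table_y[j - 1], table_y[j]
    z0, z1 = table_z[j - 1], table_z[j]
    return z0 + (z1 - z0) * (y_value - y0) / (y1 - y0)


table_y, table_z = build_transform_table([1.0, 2.0, 4.0])
median_value = back_transform(0.0, table_y, table_z)
```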

Section 7.4 of Goovaerts (1997) describes the use of local uncertainty models, including the definitions of several of the uncertainty measures and concentration estimates provided in this module. Once a model of local uncertainty is available through the ccdf, we can map the probability that the concentration of the contaminant exceeds a specified cutoff; estimate and map several measures of uncertainty in the distribution of the contaminant; and create maps of the mean or median concentration, or a map of a specified percentile of the distribution at each grid node. The options available on the Post-Processing Mapping tab are split into three major groups of maps: 1) the probability of exceeding a threshold; 2) several measures of uncertainty; and 3) several estimates of the concentration. The mapping options available in each of the three groups are discussed below.

Probability of exceeding a threshold:

This option allows the user to create maps showing the probability that the contaminant level will exceed a given threshold.
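In normal score space, this probability is one minus the Gaussian ccdf evaluated at the transformed threshold. The sketch below is illustrative only. It assumes that the threshold has already been converted to a normal score y_c (for example, by interpolating in the normal score transform table) and that kriging has produced a mean and a simple kriging variance at each grid node; all names and values are hypothetical.

```python
from statistics import NormalDist


def exceedance_probability(y_threshold, kriged_mean, sk_variance):
    """P[Z(u) > threshold] at one node, evaluated on the normal score scale."""
    sigma = sk_variance ** 0.5
    if sigma == 0.0:
        # Zero variance (e.g., at a data location): the outcome is certain.
        return 1.0 if kriged_mean > y_threshold else 0.0
    return 1.0 - NormalDist().cdf((y_threshold - kriged_mean) / sigma)


# Hypothetical (kriged mean, simple kriging variance) pairs for three nodes.
nodes = [(-1.2, 0.30), (0.1, 0.80), (1.5, 0.25)]
y_c = 0.5   # normal score of the threshold (assumed value)
probability_map = [exceedance_probability(y_c, m, v) for m, v in nodes]
```

Because the normal score transform is monotonic, the probability computed on the transformed scale equals the probability of exceeding the threshold in the original units.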

Uncertainty maps:

Three maps can be generated using this option.

Conditional variance:

This measures the spread of the local distribution around the mean. Unlike the kriging variance, this measure of uncertainty is influenced by the data values, not just their location (Goovaerts 1997). This measure of uncertainty can be heavily influenced by the model chosen for the upper tail of the distribution.

Interquartile range (IQR):

This measure of variability is the difference between the upper and lower quartiles of the local distribution. It is more robust than the conditional variance and is less influenced by the choice of model for the upper tail of the distribution.
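The conditional variance and the IQR can both be approximated by discretizing the local ccdf into equally probable quantiles and back-transforming each one. The sketch below assumes that approach and is not necessarily how VSP computes these maps. The back_transform argument is a hypothetical callable that converts a normal score to original units, and the number of quantiles is an arbitrary choice.

```python
import math
from statistics import NormalDist


def spread_measures(kriged_mean, sk_variance, back_transform, n_quantiles=1000):
    """Return (conditional variance, IQR) of the back-transformed local ccdf."""
    ccdf = NormalDist(mu=kriged_mean, sigma=sk_variance ** 0.5)
    # Equally probable quantiles of the Gaussian ccdf, in original units.
    z = [back_transform(ccdf.inv_cdf((k + 0.5) / n_quantiles))
         for k in range(n_quantiles)]
    mean = sum(z) / n_quantiles
    variance = sum((v - mean) ** 2 for v in z) / n_quantiles
    iqr = back_transform(ccdf.inv_cdf(0.75)) - back_transform(ccdf.inv_cdf(0.25))
    return variance, iqr


# Illustration only: math.exp stands in for the real back-transform.
variance, iqr = spread_measures(0.0, 1.0, math.exp)
```

The IQR depends only on the two quartiles, whereas the variance sums over every quantile, including those in the upper tail. This is why the variance is the more sensitive of the two to the tail model.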

Reference uncertainty index (RUI):

This measure of uncertainty is influenced in part by the spatial uncertainty in the variable, similar to the conditional variance or IQR, but also accounts more specifically for the uncertainty with respect to a specific concentration level, e.g., a cleanup standard (Kyriakidis 1997). A grid node has a high RUI value when the spread of the conditional distribution is wide (which occurs if there are few nearby data or those data are highly variable), or when the median of the conditional distribution is very close to the regulatory threshold. The farther the median is from the threshold value, the more likely the grid node is to be classified as either clean (if the median is well below the cleanup threshold) or requiring remediation (if the median is well above the threshold). Conversely, the closer the median is to the threshold value, the greater the uncertainty in determining the correct action to take at that grid node.

Estimates of the concentration:

Mean:

Also known as the expected value or E-type estimate of the ccdf (Goovaerts 1997), this is based on the standard least-squares criterion. Like the conditional variance, the conditional mean of the concentration is heavily influenced by the model chosen for the upper tail.

Median:

This estimate is based on the 50th percentile of the local ccdfs and is a robust estimator, not as strongly influenced by the model chosen for the upper tail.

Xth percentile:

This estimator returns a map of the specified percentile of the local ccdfs. These maps can be used to identify areas that are almost sure to be high or low, depending on the percentile chosen (Deutsch and Journel 1998, p. 284). For example, a map of the 10th percentile can be used to identify locations that are almost sure to have high concentrations, because wherever the 10th percentile map shows a high value, there is a 90% probability that the true value is even higher than that shown on the map.
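The three concentration estimates can be read from the same local ccdf. The sketch below is illustrative and not the code used in VSP. It approximates the E-type mean by averaging equally probable back-transformed quantiles, and it takes the median and the chosen percentile directly from the Gaussian quantiles. The back_transform argument is a hypothetical callable that converts a normal score to original units.

```python
from statistics import NormalDist


def concentration_estimates(kriged_mean, sk_variance, back_transform,
                            percentile=0.10, n_quantiles=1000):
    """Return (E-type mean, median, chosen percentile) in original units."""
    ccdf = NormalDist(mu=kriged_mean, sigma=sk_variance ** 0.5)
    quantiles = [back_transform(ccdf.inv_cdf((k + 0.5) / n_quantiles))
                 for k in range(n_quantiles)]
    e_type_mean = sum(quantiles) / n_quantiles       # average over the whole ccdf
    median = back_transform(ccdf.inv_cdf(0.5))       # 50th percentile
    pth = back_transform(ccdf.inv_cdf(percentile))   # user-specified percentile
    return e_type_mean, median, pth


# Illustration only: a power of ten stands in for the real back-transform.
mean_est, median_est, p10_est = concentration_estimates(
    0.0, 1.0, lambda y: 10.0 ** y)
```

With a right-skewed back-transform such as this one, the mean exceeds the median, and the 10th percentile gives a value that the true concentration exceeds with 90% probability.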

References:

Deutsch, C.V. and A.G. Journel. 1998. GSLIB: Geostatistical Software Library and User's Guide, 2nd Edition. Applied Geostatistics Series, Oxford University Press, New York.

Goovaerts, P. 1997. Geostatistics for Natural Resources Evaluation. Applied Geostatistics Series, Oxford University Press, New York.

Kyriakidis, P.C. 1997. Selecting Panels for Remediation in Contaminated Soils Via Stochastic Imaging. In Geostatistics Wollongong '96, eds. E.Y. Baafi and N.A. Schofield, Vol. 2, pp. 973-83. Kluwer Academic Publishers, Dordrecht.

Webster, R. and M.A. Oliver. 1993. How Large a Sample Is Needed to Estimate the Regional Variogram Adequately? In Geostatistics Troia '92, ed. A. Soares, Vol. 1, pp. 155-66. Kluwer Academic Publishers, Dordrecht.