An investigation report should be prepared in accordance with Contaminated Land Management Guidelines No. 1: Reporting on Contaminated Sites in New Zealand (Revised 2011) (Ministry for the Environment, 2001). Data generated during a site investigation should be collated and presented in a logical form to enable the information to be assessed. Data are generated at different stages of the investigation and include information collated from the desk research, site walkover study, field screening, field observations, chain of custody documentation and analytical results.
Data from the site history research – which may include old site layout plans, photographs, material safety data sheets and permits – should be included in the report. Analytical data should be tabulated using the appropriate number of significant figures, and laboratory certificates of analysis must be appended. Analytical results can be presented on a site plan to indicate sample locations, numbers and depths for different concentration ranges in different colours.
Concentration contours for specific sample depths can be useful to show plumes, but should be used with caution where only a small number of sample locations are available, because they may be misleading. Uncertainty in contouring is usually identified by using broken contour lines.
Data generated from a site investigation should be related to the conceptual site model for the site, to see if the information makes sense in relation to the anticipated model conditions. The information should be assessed in the context of the model to determine the location, extent, trends and likely movement of the contamination through the soil profile. Analytical and field results should enable the conceptual site model to be refined, and issues relating to source, pathway and target identified and assessed.
The assessment of site data requires a review of all sources of information, including the conceptual site model and field and analytical results, and consideration of the site's use and intended uses. When interpreting the soil analytical results, the uncertainty in the data and any limitations in the sampling and analytical method must be understood. Data are often assessed by comparing results with guideline values as an initial screen of the data. The appropriateness of the values in the context of the site, exposure pathways and analyses must be considered. (Further information on the use of guidelines is provided in section 5.3.)
Professional judgement must be exercised if averaged concentrations are being used for comparison against guidelines. Averages must be used in the context of the exposure pathways, and in some instances may not be appropriate because they can 'hide' hot spot information. (Further details on statistical summaries are provided in section 5.4.1.)
The interpretation of numbers close to guidelines can be done using statistical methods, provided the assumptions and limitations of the statistical method are appropriate and a designed statistical investigation sampling pattern has been used. The recommended method is to use the upper confidence limit of the arithmetic mean (for further discussion see Appendix I). When comparing results to a long-term guideline value, the result will be acceptable if the 95% upper confidence limit is at or below the guideline, provided no result is more than twice the guideline value. Further guidance is provided in Contaminated Sites: Sampling design guidelines (New South Wales Environment Protection Authority, 1995) and Supplemental Guidance to RAGS: Calculating the concentration term (US EPA, 1992).
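The acceptance rule described above can be sketched in Python. This is a minimal illustration only, assuming the upper confidence limit is calculated as a one-sided Student's t limit on the arithmetic mean (Appendix I may specify a different formula, and lognormal data may need a different method). The function names, the short table of t-values and the concentrations used to exercise it are hypothetical.

```python
import math
import statistics

# One-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
# Standard table values; only a few are listed to keep this sketch short.
T_95_ONE_SIDED = {4: 2.132, 5: 2.015, 6: 1.943, 7: 1.895,
                  8: 1.860, 9: 1.833, 10: 1.812}


def ucl95_mean(results):
    """Assumed formula: mean + t * s / sqrt(n), one-sided 95% limit."""
    n = len(results)
    if (n - 1) not in T_95_ONE_SIDED:
        raise ValueError("no t-value tabulated for this sample size")
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # sample standard deviation
    return mean + T_95_ONE_SIDED[n - 1] * sd / math.sqrt(n)


def acceptable(results, guideline):
    """Apply both conditions from the text: the 95% UCL is at or below
    the guideline, and no single result is more than twice the guideline."""
    return (ucl95_mean(results) <= guideline
            and max(results) <= 2 * guideline)
```

For example, `acceptable([12.0, 18.0, 22.0, 15.0, 25.0, 20.0, 17.0, 19.0], 30.0)` applies both checks to eight hypothetical results in mg/kg. The sketch is only meaningful for data from a statistically designed sampling pattern.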
Limitations and uncertainties of the data must be identified, and any assumptions made in interpreting the data clearly stated. Uncertainty in the data can be determined from the use of replicate samples, which provide an indication of the precision of sampling and analysis procedure. Replicate samples should be collected from different locations and the mean and standard deviation calculated for the individual replicates. The information on precision can then be used when comparing results to the guideline value. An example of such a calculation is included in the spreadsheet in Appendix J (see also section 5.8).
The National Environmental Standard (NES) for Assessing and Managing Contaminants in Soil to Protect Human Health provides a suite of 12 soil contaminant standards and five land-use exposure scenarios that are legally binding. The way they were derived and a site-specific methodology to derive soil guideline values are contained in Methodology for Deriving Standards for Contaminants in Soil to Protect Human Health (Ministry for the Environment, 2011a).
A variety of guidelines are available in New Zealand and overseas, and are commonly used for assessing data generated from site investigations. Only guideline documents that are appropriate to the site conditions should be used, and practitioners are cautioned to have a thorough understanding of the basis of the derivation of the guideline numbers before applying them on a site-specific basis. The hierarchy for the selection of a guideline value for a contaminant not included within the NES is set out in Contaminated Land Management Guidelines No. 2: Hierarchy and Application in New Zealand of Environmental Guideline Values (Revised 2011) (Ministry for the Environment, 2003) and should be followed.
Statistical reports can be provided for data from site investigations that have been appropriately designed. Many statistical methods assume that data that have been randomly selected from a larger population of values are normally distributed, but this is often not the case in contaminated site investigations. Care must be taken when using statistical summaries for samples that have been collected from judgemental sample designs, because any interpretation will be based on professional judgement. The data must first be checked for integrity and to determine if there are any outliers, and the distribution of the data must be understood. Two common statistical terms widely used in this area are described below.
'Averages', in this context, refers to a range of summary statistics that indicate the central tendency or 'average', and can include the arithmetic mean, median, geometric mean and mode. In cases where the data set is positively skewed, such as in contaminated site investigations with a lognormal distribution, the median and geometric mean are usually more representative of the bulk of the data. The median and geometric means are relatively unaffected by extremes in data and may be more appropriate than the arithmetic mean for describing an 'average' concentration. The geometric mean is always less than or equal to the arithmetic mean. The mode is the most frequently occurring value.
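The four measures of central tendency named above can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the function name and the positively skewed concentrations are hypothetical and chosen only to show how one extreme value pulls the arithmetic mean away from the median and geometric mean.

```python
import statistics


def central_tendency(results):
    """Return the four 'averages' discussed in the text.
    The geometric mean requires all values to be positive."""
    return {
        "arithmetic_mean": statistics.mean(results),
        "median": statistics.median(results),
        "geometric_mean": statistics.geometric_mean(results),
        "mode": statistics.mode(results),
    }


# Hypothetical, positively skewed soil results (mg/kg) with one high value.
results = [5.0, 6.0, 6.0, 8.0, 10.0, 12.0, 95.0]
summary = central_tendency(results)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

In this illustration the single high result roughly doubles the arithmetic mean relative to the median and geometric mean, which is why the choice of 'average' should be stated when results are reported.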
Variability is another important characteristic of data. It can be described by the range, although the range may not be useful if it is affected by extremes of data. The variance, or its positive square root (the standard deviation), is often used to measure variability; the standard deviation is given in the same units as the original data. The coefficient of variation is more useful because it is comparable among different samples and is a dimensionless measure. The 95% confidence error (see Appendix I) is used as a measure of variability when interpreting a statistically designed site investigation. This is useful in appropriately designed validation sampling. Where hot spots do not appear to have been detected, the first step should be a statistical check on the chance of missing a hot spot of x size. The x size will be based on the DQOs (eg, what size hot spot were you attempting to find or considered significant?).
When reporting statistical summaries of site investigation data, it is advisable to 'over-report' the results by listing the number in the sample, the standard deviation and the 95% confidence error, because this gives subsequent users the flexibility of deriving other confidence intervals (such as the 99% confidence interval). The 95% confidence error should not be confused with a 95th percentile, which is the value that is greater than or equal to 95% of all values in a distribution. This is presented graphically in Figure 6 for site data.
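A summary that 'over-reports' in the way described might be assembled as follows. This is a sketch under a stated assumption: the 95% confidence error is taken here to be the half-width of the two-sided 95% confidence interval of the mean, t × s / √n, which may differ from the definition in Appendix I. The function name and the abbreviated t-table are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
# Standard table values; only a few are listed to keep this sketch short.
T_95_TWO_SIDED = {4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365,
                  8: 2.306, 9: 2.262, 10: 2.228}


def variability_summary(results):
    """Report n, range, standard deviation, coefficient of variation and
    an assumed 95% confidence error (t * s / sqrt(n))."""
    n = len(results)
    if (n - 1) not in T_95_TWO_SIDED:
        raise ValueError("no t-value tabulated for this sample size")
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # same units as the original data
    return {
        "n": n,
        "range": max(results) - min(results),
        "standard_deviation": sd,
        "coefficient_of_variation": sd / mean,  # dimensionless
        "confidence_error_95": T_95_TWO_SIDED[n - 1] * sd / math.sqrt(n),
    }
```

Because the number of samples and the standard deviation are reported alongside the confidence error, a later user can recompute an interval at another confidence level by substituting the corresponding t-value.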
If appropriate, the following statistics should be reported and can be summarised for each soil stratum tested:
Analytical data from site investigations where hazardous substances are present are generally lognormally distributed rather than normally distributed. Figure 7 shows the typical profile for normal and lognormal distributions. The distribution of the data set can be checked using statistical tests, and many statistical software packages have the facility for testing the assumption of normality (see Appendix I). Data should be plotted to assess whether the contaminant distribution is normal at the site. If the statistical tests show the data are not normally distributed, then the data should be transformed using the appropriate transformation.
For soil sampling, where data are generally lognormally distributed, an appropriate transformation is to use the natural logarithm function (ie, calculate yi = ln(xi), where xi is the original sample measurement and yi is the transformed sample measurement). Further details are provided in Statistical Methods for Environmental Pollution Monitoring (Gilbert, 1987).
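The transformation yi = ln(xi) is straightforward to apply in code. The sketch below is illustrative only, with hypothetical function names. It also shows a standard property of the transform: back-transforming the mean of the logged values gives the geometric mean of the original data, not the arithmetic mean.

```python
import math
import statistics


def log_transform(results):
    """Apply y_i = ln(x_i). Values must be positive; math.log raises
    ValueError for zero or negative inputs."""
    return [math.log(x) for x in results]


def back_transformed_mean(results):
    """exp(mean of ln(x_i)), which equals the geometric mean of x."""
    return math.exp(statistics.mean(log_transform(results)))
```

Results below the detection limit cannot be logged if they have been replaced with zero, so the treatment of non-detects needs to be settled before transforming the data.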
Validation information relating to accuracy and precision of the measurements should form part of any significant contaminated site investigation report. In analytical chemistry, accuracy refers to how close a measured value is to the true value. The true value is usually not known (that was the point of undertaking the measurement). However, analytical measurements are sometimes prone to systematic errors that can compromise accuracy. Accuracy is usually assessed by one of two methods:
Certified reference materials are homogeneous reference samples that have been previously analysed, and in which the true values of contaminants can be assumed. These are available in a range of sample types, such as soils, plants and foods, but are not available for all analytes. They essentially represent inter-laboratory comparison in a bottle, and are available from a number of international standards agencies, including LGC (UK), the International Atomic Energy Agency (IAEA, Vienna) and the National Institute of Standards and Technology (NIST, USA).
Analytical precision refers to the spread of results, and is usually assessed by repeated measurements of the same sample. Precision is described by the measures of variability outlined in section 5.4.1. The most common statistic used to describe precision is the coefficient of variation. The use of replicates in soil sampling can give an indication of the precision in the sampling and analysis process.
An outlier is one observation in a set of data that appears to be excessively high or low with respect to the mean value suggested by the other observations. Outliers may arise from analytical or sampling difficulties, but may also represent actual site contamination (eg, a hot spot). In other words, an outlier may be spurious or genuine. Each outlier should be evaluated to determine if it is a real result.
The prevalence of spurious analytical outliers increases as the relative concentrations being measured decrease. One reason for this is that minor sample contamination effects (via contact with the atmosphere, sampler, sample container, analyst, laboratory reagents and equipment or instrumental technique) make up a greater part of the overall measurement as the concentration being measured decreases. Due to differences in the magnitudes being measured, spurious outliers are more common in trace background analyses than in contaminated site investigation soil analyses.
The decision to identify an excessively high or low result as an outlier and discard it from the data set requires care and justification. Outliers must be looked at critically to ensure data are not mistakenly 'lost' from a site investigation. Where spurious outliers are identified, the original number must not be removed from the site investigation report. Instead, suspected outliers in the data set should be clearly identified (eg, with an asterisk and footnote). Reasons for the identification of the suspect observation should be provided in the text or a footnote.
There is a range of statistical methods for identifying outlying observations, but they all suffer from the problem that in order to definitively identify an outlier, the nature of the underlying population from which the samples were drawn must be known with reasonable certainty. The best way to get a good idea of the nature of the underlying population is to analyse at least 30 samples. In small data sets (fewer than 30 samples), statistical methods for outlier rejection should be used only as a last recourse. An outlier should only be rejected if a back check reveals an error. Otherwise it is a real result that requires an explanation.
The recommended checks when excluding outliers include the following.
An assessment of the validity of the data should be made and any uncertainty in the accuracy of the data explained. In particular, the data from the field and laboratory QA/QC must be within the acceptable criteria and any variability or exceedance in acceptability criteria explained. Any uncertainty in the accuracy of the data must also be clarified. A checklist for the data is recommended, as follows.
Common mistakes and pitfalls to be avoided in data interpretation include:
Numbers close to method detection limits carry uncertainty because the signal generated by the contaminant is small relative to the noise associated with the analytical equipment. There is also uncertainty due to the potential for sample contamination, which becomes more significant when undertaking trace level analysis.
Numbers below detection limits (also referred to as censored data) do not imply that the contaminant does not exist in the soil sample, only that the analytical method was not sufficiently sensitive to be able to detect that level of contaminants. The contaminant may be present at a concentration below the reported detection limit, or it may not be present in the sample at all (the concentration in the sample is zero). If numbers below detection limits are required for comparison against guideline values, then if possible the analysis should be undertaken again using a method with a more sensitive detection limit (the detection limit must be below the guideline value). When interpreting numbers below detection limits, the numbers should not be treated as 'missing', and non-detected results must not be omitted from the results.
The numbers below detection limits can be interpreted in a number of ways:
Data below the detection limit can cause problems with statistical analysis, as any of the above ways of data interpretation introduces constant values, and biases the results. Any data set with a significant proportion of results (eg, over 25%) below the detection limit should not have any form of confidence intervals reported. In other cases, the statistical analysis of the data should be performed twice – once using the detection limit (or half the detection limit) as the replacement value, and once using zero – to see if the results differ markedly. If they do, more sophisticated statistical methods are required. If they do not differ markedly, then the small proportion of the data set that is below the detection limit has little influence on the statistical analysis, and the results can be used.
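The 'analyse twice' check described above can be sketched as follows. This is an illustration only: non-detects are represented as `None`, the function name is hypothetical, and the arithmetic mean is used as the example statistic. The text does not define what counts as differing 'markedly', so the sketch reports the values and leaves that judgement to the assessor.

```python
import statistics


def censored_comparison(results, detection_limit):
    """Compare the mean under three replacement values for non-detects.

    results: list of concentrations, with None marking a result
    below the detection limit.
    Returns (fraction_censored, {replacement_label: mean}).
    """
    fraction = sum(r is None for r in results) / len(results)
    replacements = {
        "detection_limit": detection_limit,
        "half_detection_limit": detection_limit / 2,
        "zero": 0.0,
    }
    means = {}
    for label, value in replacements.items():
        filled = [value if r is None else r for r in results]
        means[label] = statistics.mean(filled)
    return fraction, means


# Hypothetical data set (mg/kg): one non-detect out of ten results.
results = [None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
fraction, means = censored_comparison(results, detection_limit=1.0)

# The text suggests (eg) over 25% non-detects as the point beyond which
# confidence intervals should not be reported.
report_confidence_intervals = fraction <= 0.25
```

The non-detects stay in the data set throughout; only the value used in the calculation changes, so the original results can still be reported as below the detection limit.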
Numbers close to guidelines should be interpreted with consideration to the following issues:
It is very rare for repeated analysis of the same sample to yield exactly the same result. The variability in results obtained from repeated analysis of the same sample represents the analytical precision (see section 5.4.3). In cases where replicate samples are collected from the same location and repeatedly analysed, this variability represents a combination of 'sampling and analytical' precision.
Where sufficient data on the precision of a given measurement are available, it is possible to better define the area around the guideline value where analytical results are ambiguous. An example of this procedure is given in Appendix J, where the sample design was sufficient to assess 'sampling and analytical' precision.
Example: In the case outlined in Appendix J, the precision of any given soil arsenic measurement (represented by the Student's t-test 95% confidence interval) was found to be plus or minus 5.5% of the measurement. This implied that, for that site and circumstances, 19 out of 20 analytical measurements of a sample containing 30 mg/kg arsenic would be in the region 28.4 mg/kg to 31.6 mg/kg. The practical upshot of this is that any result in this region is analytically indistinguishable from 30 mg/kg, which is the human health guideline value. In terms of practical implementation, analytical values below 28.4 mg/kg are taken to be 'below guideline', those in the range 28.4 mg/kg to 31.6 mg/kg are taken to be 'at guideline', and those above 31.6 mg/kg are taken to be 'above guideline'.
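The three-way classification in this example can be written as a short function. This is a sketch only: the function name is hypothetical, and the 5.5% precision applies to the Appendix J site and circumstances, not generally. The unrounded bounds for a 30 mg/kg guideline are 28.35 mg/kg and 31.65 mg/kg, which the example above reports to one decimal place.

```python
def classify_against_guideline(result, guideline, relative_precision):
    """Classify a result as below, at or above a guideline, treating any
    value within +/- relative_precision of the guideline as analytically
    indistinguishable from it."""
    lower = guideline * (1 - relative_precision)
    upper = guideline * (1 + relative_precision)
    if result < lower:
        return "below guideline"
    if result > upper:
        return "above guideline"
    return "at guideline"


# Arsenic example from the text: guideline 30 mg/kg, precision +/- 5.5%.
for value in (27.0, 30.5, 33.0):
    print(value, classify_against_guideline(value, 30.0, 0.055))
```

The precision figure has to come from replicate data collected for the site in question; using a value borrowed from another investigation would undermine the comparison.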
The guideline used and its appropriateness with respect to site-specific conditions should be considered and assessed. The results should always be assessed in the context of the site, proposed land use and DQOs, and be related to the known information about the site history, sources of contamination and pathways for migration and target receptors. The basis for the derivation of any guideline should be understood and the suitability for use considered in the context of the site.
When comparing results to guideline values, there are three possible outcomes in terms of how the results of any one measurement may relate to the guideline:
The 95% upper confidence limit of the arithmetic mean can be used for interpreting numbers against a specified level (see Appendix I), and is applicable only where a statistically designed investigation has been undertaken.
The use of judgemental sampling may preclude statistical methods, because the sampling design is biased. When using judgemental sampling, the confidence intervals cannot be reported and professional judgement is required. The use of blanks and replicates is required to assist in interpreting the data.
The blank analytical results should be reported, and if any corrections to analytical results are made based on the blank results these must be clearly documented.
When comparing results to guidelines for common contaminants, assess the significance of the results with caution. Examples include phthalate esters from plastic laboratory tubing, and traces of zinc from a range of sources (from galvanised iron to skin flakes), contamination from which becomes more important as the concentration being measured decreases. The use of blanks is important for determining the presence of common contaminants. Common organic contaminants include acetone, 2-butanone (or methyl ethyl ketone), methylene chloride (or dichloromethane), toluene and phthalate esters as defined by the US EPA. The recommended procedure for common laboratory contaminants is that sample results should be considered as positive only if the concentrations in the sample exceed 10 times the maximum amount detected in any blank. For other contaminants detected in the blank, the sample results should be considered positive if the sample exceeds five times the amount detected in any blank.
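The blank-comparison rule can be expressed as a simple check. The sketch below is illustrative: the function and set names are hypothetical, the set lists only the common organic contaminants named above, and two assumptions are made that the text does not state explicitly. First, the five-times rule is applied to the maximum blank value, as for the ten-times rule. Second, a detected result is treated as positive when the analyte was not detected in any blank.

```python
# Common laboratory contaminants named in the text (lower case for matching).
COMMON_LAB_CONTAMINANTS = {
    "acetone",
    "2-butanone",
    "methyl ethyl ketone",
    "methylene chloride",
    "dichloromethane",
    "toluene",
    "phthalate esters",
}


def is_positive_result(analyte, sample_conc, blank_concs):
    """Return True if a detected sample result should be considered positive.

    blank_concs: concentrations detected in blanks for this analyte
    (empty if the analyte was not detected in any blank).
    """
    max_blank = max(blank_concs, default=0.0)
    if max_blank == 0.0:
        # Assumption: with no blank detection, the rule places no restriction.
        return True
    factor = 10 if analyte.lower() in COMMON_LAB_CONTAMINANTS else 5
    return sample_conc > factor * max_blank
```

For instance, a sample containing 50 units of acetone would not be considered positive if a blank contained 6 units, because 50 does not exceed 10 × 6. The blank results themselves should still be reported, as noted above.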
Last updated: 19 October 2011