Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities Unified Guidance Fact Sheet


                        United States Environmental Protection Agency    EPA 530-F-09-020
                                                        March 2009
                        Fact Sheet
                        Statistical Analysis of Groundwater Monitoring
                        Data at RCRA Facilities—Unified  Guidance
   Features of the Unified Guidance

   What's new in the guidance? The March
2009 version of the Unified Guidance represents
more than a decade of input from EPA Regions,
states, statisticians working with groundwater
monitoring, and results of a formal peer review.
While the RCRA regulatory programs have been
established for some time, earlier guidance
did not fully cover newer methods and
experience gained in implementing the program.
Major features include:

•  Updated guidance for RCRA Subtitles C & D
   groundwater monitoring regulations covering
   all specified tests and performance criteria
•  A suggested systematic detection monitoring
   framework to balance false positive errors and
   power in light of multiple comparisons
•  Newer statistical methods for prediction limits
   and for diagnostic evaluation of outliers,
   normality, autocorrelation and non-detect data,
   and expanded use of non-parametric test methods

•  Use of trend testing when stationarity
   assumptions cannot be met

•  Expanded single-sample tests for compliance
   and corrective action monitoring, considering
   false positive errors and power

Organization. The guidance is laid out in four
parts, with extensive Appendix statistical tables
to support individual test methods:

•  Part I identifies the key RCRA regulatory
   provisions and general recommendations for
   implementing these rules. It addresses issues
   of statistical design: factors such as
   developing and updating background data and
   strategies for constructing an effective
   statistical monitoring program.

•  Part II covers diagnostic evaluations for
   checking key assumptions—outliers, normality,
   autocorrelation, non-detect data, spatial and
   temporal dependence. Useful exploratory
   techniques and tests are provided.

•  Part III presents formal testing procedures for
   detection monitoring, covering the requirements
   of 40 CFR Parts 265, 264, and 258.

•  Part IV is devoted to compliance and
   corrective action formal tests. Strategies are
   provided for a range of conditions including
   parametric and non-parametric alternatives.
What is the Unified Guidance?

     This latest version of Statistical Analysis of Groundwater
Monitoring Data at RCRA Facilities is termed the Unified Guidance,
since it integrates and supersedes two guidance documents of the same
title released in 1989 and 1992. It resolves certain problems in earlier
guidance while providing newer statistical methods and strategies
developed in the mid-1990s and later. The guidance applies to both
RCRA Subtitle C and D regulations. The focus is on RCRA hazardous
and solid waste facility regulatory requirements, although the general
statistical guidance is useful in other regulatory monitoring applications.

     The guidance contains a compilation of statistical methods
recommended for groundwater monitoring at RCRA and other facilities.
It provides comprehensive strategies for designing the statistical aspects
of facility detection, compliance, or corrective action monitoring
systems. Interpretations are suggested for key statistical provisions of the
RCRA groundwater monitoring regulations.

How was this guidance developed?

     In the mid-1990s, the EPA Office of Solid Waste convened a task
group consisting of state and EPA personnel, industry representatives,
and statisticians closely involved with groundwater monitoring issues.
The goal was to develop more current and relevant RCRA statistical
guidance. Following a number of preliminary drafts, a full version was
circulated in 2004 to interested state regulatory personnel for their
comments, as well as to three expert peer reviewers in 2005. The various
drafts were produced by Science Applications International Corporation
(SAIC), using the technical expertise of statistician Dr. Kirk Cameron
(MacStat Consulting Ltd). The Unified Guidance has been substantially
modified and expanded to address the issues raised by commenters.


Who are potential users of this guidance?

     The guidance is aimed at the informed professional working in the
groundwater monitoring field, assuming a limited background in
statistics. The primary users are expected to be:

•   Owners, operators, and personnel at Subtitle C hazardous waste or
    Subtitle D solid waste facilities

•   State and EPA regulatory personnel concerned with permits,
    enforcement and compliance at these facilities

•   Consultants and statisticians providing technical assistance to
    regulated facilities

•   Other groundwater and regulatory monitoring program personnel,
    such as in the CERCLA program.

-------
Fact Sheet-Statistical Analysis of Data at RCRA Facilities—Unified Guidance
   Features of the Unified Guidance

Part I-- Introductory Framework

 •  Regulatory Issues
   - Hypothesis testing frameworks
   - Sampling requirements
   - Limitations of certain tests like ANOVA
 •  The groundwater monitoring context
 •  Basic statistical concepts
 •  The nature of hypothesis testing
 •  Establishing and updating background data

 •  Detection Monitoring Design
   - Control of false positive errors with multiple
     comparisons
   - Sitewide False Positive Error Rate [SWFPR]
     application
   - Minimum power reference criteria
   - Using multiple test methods
   - Effect size power evaluation
   - Appropriate tests including trend analysis

 •  Compliance/Corrective Action Monitoring
    Design
   - Use of single sample tests against a fixed
     standard
   - Hypothesis framework
   - Centrality versus upper percentile parameters
   - Test types (parametric vs. non-parametric, trends)
   - Testing Against a Background Standard

Part II-- Diagnostic Evaluation and Testing

 •  Exploratory data tools
 •  Goodness-of-fit testing
   - Importance of the normal distribution
   - Other normalizing transformations
     (logarithmic, ladder-of-powers)
 •  Outliers
 •  Equality of Variance
 •  Managing Non-Detect Data
 •  Spatial Dependence
 •  Types of Temporal Dependence
   - autocorrelation, trends, seasonality, etc.

Part III-- Detection Monitoring Tests

 •   Coverage of all regulatory tests
    - t-tests, ANOVA, control charts, prediction and
    tolerance limits
 •   Parametric versus non-parametric methods
 •   Tests when non-detect data are present
 •   Use of trend analyses
 •   Emphasis on prediction limits for systematic
    design

Part IV-- Compliance/Corrective Action Tests

 •   Test of means versus upper percentiles
 •   Control of false positive errors and power
 •   Fixed standards vs. background limits
What legal limitations does this guidance impose?

     EPA makes it clear at the outset of the document that this present
work is guidance only, and does not confer any legal requirements or
obligations on regulated entities or regulatory programs. While it is
necessary to make interpretations of regulatory language to apply
statistical measures, those found in the guidance are only suggested.
Other approaches and statistical methods can work equally well or better
in specific instances.  As a practical matter, it is recognized that states
may choose to adopt requirements similar to guidance recommendations.
While we believe that the document offers reasonable current guidance,
experience and statistical applications in this field are continually
evolving.

What regulations and issues are covered?

        The guidance covers the statistical aspects of groundwater
monitoring regulations for 40 CFR Parts 265, 264, and 258.  These
include monitoring under Subtitle C interim status and RCRA permits, as
well as for Subtitle D solid waste facilities. These rules span a
considerable period of time from 1980 forward, with significant
modifications to the Part 264 regulations in 1988 and 2006. Key portions
of regulatory language pertaining to groundwater monitoring and
statistical testing are provided in the guidance. These include the
specified test procedures, performance criteria, sampling requirements,
and identification of relevant groundwater protection standards.

        Basic statistical interpretations include identifying the
appropriate hypothesis testing frameworks, meeting performance criteria,
the application of certain sampling data requirements, and the use and
limitation of designated tests. For some applications, the regulations do
not explicitly identify appropriate test methods; the Unified Guidance
makes reasonable judgments as to the more appropriate procedures. One
particular issue stressed throughout the guidance is the need to utilize
statistically independent data as identified in 1988 and later RCRA
regulatory language.  Certain regulatory restrictions also dictate the
appropriate responses for RCRA applications, but may not be limiting in
other monitoring situations.

How is this document organized?

        The guidance follows a logical progression from simple and
general discussions to more detailed coverage of specific test methods.
After presenting the regulatory context in Part I, a chapter is devoted to
basic statistical concepts. These include the assumptions found in the
RCRA performance criteria but are more broadly extended to include
other standard statistical factors. Terms such as independence, statistical
significance, stationarity, random sampling, spatial and temporal
dependence, normality, equality of variance, outliers and non-detect data
are defined and explained. The overall groundwater monitoring context
is presented, with special emphasis on hypothesis testing and the related
false positive and negative errors.  A separate chapter discusses
developing, assessing and updating background data.

      General design considerations are provided for
developing a detection monitoring system.  The guidance
provides a systematic approach to integrating false positive
errors and power in a site design. We specifically
recommend a 10% Site-Wide False Positive Rate
[SWFPR] partitioned among the total number of tests per
year. EPA Reference Power Curves [ERPC] are provided
as minimum criteria for sufficient statistical power, used to
gauge the effectiveness of particular detection monitoring
tests.
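
     As a rough illustration of what partitioning a 10% SWFPR
among the annual tests can look like, the sketch below
computes a per-test false positive rate. It assumes the tests
are statistically independent, and the well and constituent
counts are hypothetical. It is not the guidance's own
procedure or tables.

```python
def per_test_alpha(swfpr: float, n_tests: int) -> float:
    """Per-test false positive rate that yields `swfpr` overall.

    Assumes the n_tests annual tests are statistically independent.
    """
    if not 0.0 < swfpr < 1.0:
        raise ValueError("swfpr must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return 1.0 - (1.0 - swfpr) ** (1.0 / n_tests)


# Hypothetical facility: 10 wells x 5 constituents x 1 evaluation per year.
n_tests = 10 * 5 * 1
alpha_per_test = per_test_alpha(0.10, n_tests)

# Simple Bonferroni-style split of the same 10%, shown for comparison.
alpha_bonferroni = 0.10 / n_tests

print(f"per-test alpha (independent tests): {alpha_per_test:.5f}")
print(f"per-test alpha (Bonferroni split):  {alpha_bonferroni:.5f}")
```

     Both splits give a per-test rate near 0.2% for this
hypothetical facility, far below the 1% or 5% levels often
applied to a single test.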

     Design of compliance or corrective action monitoring
systems follows. Because most groundwater protection
standards [GWPS] are in the form of fixed, risk- or
health-based limits, the design differs from that for
detection monitoring, as do the appropriate types of
statistical tests.  Unlike highly site-specific
detection programs, key decisions need to be made by
regulatory agencies. These include the appropriate type of
parameter for comparison to the GWPS, false positive and
negative error rates, and the form of hypothesis testing.
The use of a background GWPS is also discussed.
     Following a summary chapter of recommended
methods, detailed consideration of diagnostic evaluations
and testing of data is provided in Part II.  These include
general exploratory techniques such as box plots or
probability plots, and testing for goodness-of-fit, outliers,
non-detect data, equality of variance, and spatial and
temporal dependence. If assumptions critical to statistical
tests are not met, the guidance suggests potential data
adjustments for these situations.
     Part III provides the specific detection monitoring
tests found in the RCRA regulations.  Each test is
discussed in overall terms including necessary
assumptions, followed by a detailed procedure and
example.  All formal tests in the guidance follow this
same approach.

     Part IV contains detailed methods for compliance and
assessment monitoring using confidence intervals.
Consideration is given to the design aspects presented
earlier, including the parameter choice and hypothesis
framework.  A discussion of cumulative false positive
errors and power is provided. Depending on whether
compliance or corrective action monitoring is involved,
false positive error and power criteria can vary based on
different perspectives of the regulated entity and agency.
The guidance offers recommendations which place priority
on EPA and state regulatory needs to enhance protection of
public health and the environment.

     The appendices contain references, a glossary and
index, as well as extensive tables for specific test methods
which span the range of conditions likely to occur at
regulated facilities.
Why is it recommended to use the SWFPR and
ERPC in detection monitoring design?

       These criteria stem from problems historically
experienced at facilities conducting multiple statistical
tests for a wide range of monitoring constituents at
numerous compliance wells. This is the classic multiple
comparisons problem. When many tests are conducted at
a fixed error rate, the chances of one or more false positive
errors (a condition when one concludes that a release has
occurred when there is in fact none) can become
unreasonably high. A second and very important
consideration is that statistical tests must have sufficient
ability (or power) to detect such a release when it occurs.
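
     The growth of the cumulative false positive rate can be
illustrated with a short calculation. The sketch assumes
independent tests, each run at the same fixed error rate; the
5% rate and the test counts are illustrative only.

```python
def prob_any_false_positive(alpha: float, n_tests: int) -> float:
    """Chance of one or more false positives among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Illustrative only: a 5% per-test error rate at increasing test counts.
for n in (1, 10, 50, 100):
    rate = prob_any_false_positive(0.05, n)
    print(f"{n:>3} tests -> chance of at least one false positive: {rate:.1%}")
```

     Under these assumptions, ten tests at a 5% rate already
carry roughly a 40% chance of at least one false positive.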

     Within the limits of the RCRA regulations, certain
opportunities were afforded to control this potentially high
rate of false positive error.  This is especially true if
prediction limits are used as tests, although two other
identified methods—control charts and tolerance limits—
can be similarly designed. By maintaining a consistent
overall annual error rate, all regulated facilities will be
afforded the same risk.
     Based on earlier work by EPA and others, prediction
limit tests typical of the RCRA groundwater monitoring
context were identified as a minimally acceptable criterion
for power to detect real releases to groundwater. While a
relative measure, this criterion can be applied universally
to all detection monitoring tests. The March 2009 Unified
Guidance extends this approach to consider the cumulative
power of tests, based on the number of evaluations per
year.  It provides a common framework for considering both
cumulative false positive errors and power.

     The guidance also discusses effect size power as an
alternative to the relative power criteria.  This approach
requires a regulatory agency determination of a specific
increase of concern. At present, there are few if any such
criteria established.  This approach may find use in
specific applications discussed in the guidance.

     While the SWFPR and ERPC approaches are
recommended for detection monitoring, the guidance
reaches different conclusions for compliance and
corrective action monitoring when fixed limits are used as
standards. The situation is too uncertain and problematic
to apply the same concepts, and other strategies are
recommended.
Why is diagnostic testing important and when
should it be used?
     In addition to addressing the RCRA regulatory
requirements for performance criteria, it is good statistical
practice to know one's data closely. Checking key
assumptions is critical to proper performance of any
statistical test. Misapplication can also generate results
which do not follow the expected outcomes of a given test.
Diagnostic testing is performed primarily during permit or
remedial action plan development. Once a set of tests is
selected for formal permit or remedial plan monitoring,
diagnostic testing might only be periodically expected
(e.g., for updating background data).
     Many important statistical tests assume a normal
distribution. Goodness-of-fit techniques for identifying a
probable normal distribution are found in the guidance.  In
many situations, a transformation of data (e.g.,
logarithmic, square root) can result in approximately
normal data.  Other parametric distributions may work
equally well or better in some situations, but the guidance
generally focuses on the family of normal distributions.  If
no transformation is suitable, non-parametric test methods
can be used.
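
     One simple way to see whether a transformation helps is
to compare how closely the ordered data track normal
quantiles before and after transforming. The sketch below
computes a probability plot correlation for raw and
log-transformed values. The concentrations are made up, the
plotting positions are an assumed common choice, and no
critical values are applied, so it is an exploratory aid
rather than a formal goodness-of-fit test.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def normal_probability_plot_corr(values):
    """Correlation between ordered data and standard normal quantiles."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    ordered = sorted(values)
    # Assumed plotting positions: (i - 0.375) / (n + 0.25).
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
                 for i in range(1, n + 1)]
    return pearson(ordered, quantiles)


# Hypothetical right-skewed background concentrations (all detected).
conc = [1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 13.0, 27.5]

raw_corr = normal_probability_plot_corr(conc)
log_corr = normal_probability_plot_corr([math.log(c) for c in conc])
print(f"raw data: {raw_corr:.3f}   log-transformed: {log_corr:.3f}")
```

     A correlation closer to 1 after the log transformation
suggests the transformed data are more nearly normal.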

     Equality of variance is an additional assumption
necessary for some tests.  The guidance provides both
exploratory measures and a formal statistical test.

     Outliers, often very large values of dubious quality,
can significantly weaken the ability of tests to perform as
expected. The guidance offers two test methods for
identifying outliers, and suggestions for when they might
be removed, replaced or otherwise avoided.

     Spatial variability is a very important consideration.
If mean background concentrations of a monitoring
constituent vary by well, assumptions for certain detection
monitoring tests like Analysis of Variance (ANOVA) will not
be met. More importantly, it will generally be impossible to
determine if mean well differences are due to existing
background conditions or a true release.  Parametric or
non-parametric ANOVAs are recommended in the
guidance as diagnostic tests to initially establish whether
pre-existing spatial differences are present.  The outcomes
may vary with the types of constituents being monitored.
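
     A parametric one-way ANOVA used as a spatial diagnostic
reduces to comparing between-well and within-well
variability. The sketch below computes only the F statistic
for hypothetical background data from three wells. Judging
significance would still require an F critical value or
p-value from standard tables or statistical software, and
the usual ANOVA assumptions would need to be checked first.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA on a list of per-well samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-well and within-well sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical background concentrations at three wells.
wells = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(wells)
print(f"F = {f_stat:.2f} on {len(wells) - 1} and "
      f"{sum(len(w) for w in wells) - len(wells)} degrees of freedom")
```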

     Several forms of temporal variation can occur.
Temporal variation is some non-random pattern in data
over time.  It could include autocorrelation, seasonal
variation, well-to-well constituent correlation, correlation
among monitoring constituents in a well, and the presence
of trends.  Each of these types of temporal dependence
requires somewhat different diagnostic testing and
potential adjustments provided in the guidance.

     Non-detect values are a common feature of many
RCRA constituent data sets.  Those containing multiple
non-detect limits are of particular concern. The Unified
Guidance provides a number of non-detect data adjustment
procedures, including two fairly recent methods for
multiple non-detect limits.
Which detection monitoring tests are
recommended?
     While the guidance covers all of the regulatory tests,
there is a clear preference for prediction limits or control
charts as detection monitoring tests.  The guidance
specifically recommends the Shewhart-CUSUM option
when choosing control charts.
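
     A combined Shewhart-CUSUM chart can be sketched in a few
lines. The parameter values used here (k = 1, h = 4.5 and a
Shewhart limit of 4.5, in background standard deviation
units) and the data are illustrative assumptions, not values
taken from this fact sheet. The sketch also assumes roughly
normal, independent background data.

```python
from statistics import mean, stdev


def shewhart_cusum(background, new_values, k=1.0, h=4.5, scl=4.5):
    """Return (z, cusum, flagged) for each new compliance measurement.

    k, h and scl are assumed illustrative settings in units of the
    background standard deviation.
    """
    mu, sd = mean(background), stdev(background)
    cusum = 0.0
    results = []
    for x in new_values:
        z = (x - mu) / sd                  # standardized measurement
        cusum = max(0.0, z - k + cusum)    # upward cumulative sum
        results.append((z, cusum, z > scl or cusum > h))
    return results


# Hypothetical background and a gradually rising compliance series.
background = [10, 11, 9, 10, 12, 8, 10, 10]
new_values = [10, 11, 13, 14, 15]

chart = shewhart_cusum(background, new_values)
for x, (z, s, flagged) in zip(new_values, chart):
    print(f"x={x:>4}  z={z:5.2f}  cusum={s:5.2f}  flagged={flagged}")
```

     In this made-up series no single value exceeds the
Shewhart limit, but the cumulative sum crosses its threshold
on the last measurement, which is the kind of gradual
increase the CUSUM portion is meant to catch.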

     For interim status facilities or facilities with few
annual tests, variants of the Student-t or alternative
non-parametric two-sample tests may be sufficient. Other
facilities will need to apply tests which account for
multiple comparisons. Because of both the common presence
of spatial variability and regulatory restrictions, neither
parametric nor non-parametric ANOVA tests are likely to be
used frequently.
Tolerance limits are similar to prediction limits, but their
usefulness in designing a systematic detection monitoring
program is more limited.  Prediction limits provide the
greatest flexibility, and the guidance provides the most
extensive details for this method.  By careful use of repeat
testing, prediction limits can minimize future sampling
requirements and meet the SWFPR and ERPC criteria.
Nine different parametric and six non-parametric variants
are provided to address most monitoring situations.
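
     The simplest parametric case, an upper prediction limit
for one future measurement with no retesting, can be
sketched as follows. The background data are hypothetical
and assumed to be normal and independent, and the Student-t
critical value is supplied by the user from a standard
table. The retesting variants mentioned above rely on
multipliers tabulated in the guidance, which are not
reproduced here.

```python
import math
from statistics import mean, stdev


def upper_prediction_limit(background, t_crit):
    """Upper prediction limit for a single future measurement.

    t_crit is the one-sided Student-t critical value with
    len(background) - 1 degrees of freedom, taken from a table.
    """
    n = len(background)
    if n < 2:
        raise ValueError("need at least 2 background values")
    return mean(background) + t_crit * stdev(background) * math.sqrt(1 + 1 / n)


# Hypothetical background data (n = 8, so 7 degrees of freedom).
background = [10, 11, 9, 10, 12, 8, 10, 10]
T_99_DF7 = 2.998  # 99th percentile of Student's t with 7 degrees of freedom

upl = upper_prediction_limit(background, T_99_DF7)
new_measurement = 14.2  # hypothetical compliance well result
print(f"UPL = {upl:.2f}; exceeds limit: {new_measurement > upl}")
```
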
Which compliance/corrective action monitoring
tests are recommended?
     The regulatory agency first determines the
appropriate form of comparison to groundwater protection
standards [GWPS]. The guidance offers a number of
single-sample tests for centrality parameters such as the
arithmetic mean, geometric mean, arithmetic mean of a
lognormal distribution, and median.  If the decision
is that a maximum limit is appropriate, the guidance offers
parametric or non-parametric upper percentiles as options.
Confidence intervals around trend lines may be appropriate
in some instances. Testing against a background GWPS can
use either the options provided here or those for detection
monitoring.
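
     For a test of the arithmetic mean against a fixed
standard, the comparison can be sketched with a confidence
interval. The data and the GWPS value below are hypothetical,
normality is assumed, and the decision rules in the comments
reflect one common hypothesis framework. The regulatory
agency determines the actual framework and error rates.

```python
import math
from statistics import mean, stdev


def mean_confidence_limits(samples, t_crit):
    """Lower and upper confidence limits on the arithmetic mean.

    t_crit is the one-sided Student-t critical value with
    len(samples) - 1 degrees of freedom, taken from a table.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least 2 samples")
    half_width = t_crit * stdev(samples) / math.sqrt(n)
    return mean(samples) - half_width, mean(samples) + half_width


# Hypothetical compliance well data and fixed standard.
samples = [12, 14, 13, 15, 16, 14, 13, 15]
gwps = 10.0
T_95_DF7 = 1.895  # 95th percentile of Student's t with 7 degrees of freedom

lcl, ucl = mean_confidence_limits(samples, T_95_DF7)

# Compliance monitoring: an exceedance is indicated when even the
# lower limit lies above the standard.
print(f"LCL = {lcl:.2f}; exceedance indicated: {lcl > gwps}")
# Corrective action: success is indicated when even the upper limit
# lies below the standard.
print(f"UCL = {ucl:.2f}; remedy success indicated: {ucl < gwps}")
```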


Where can the public get more information
about this guidance?

     The guidance will be available on the EPA website:
http://www.epa.gov/epawaste/hazard/correctiveaction/resources/guidance/sitechar/gwstats/index.htm
For further assistance, please contact Mike Gansecki, EPA
Region 8 (email: gansecki.mike@epa.gov; phone:
303-312-6150).
