530-SW-89-026
STATISTICAL ANALYSIS OF
GROUND-WATER MONITORING DATA
AT RCRA FACILITIES
INTERIM FINAL GUIDANCE
OFFICE OF SOLID WASTE
WASTE MANAGEMENT DIVISION
U.S. ENVIRONMENTAL PROTECTION AGENCY
401 M STREET, S.W.
WASHINGTON, D.C. 20460
FEBRUARY 1989
DISCLAIMER
This document is intended to assist Regional and State personnel in evaluating ground-water monitoring data from RCRA facilities. Conformance with this guidance is expected to result in statistical methods and sampling procedures that meet the regulatory standard of protecting human health and the environment. However, EPA will not in all cases limit its approval of statistical methods and sampling procedures to those that comport with the guidance set forth herein. This guidance is not a regulation (i.e., it does not establish a standard of conduct which has the force of law) and should not be used as such. Regional and State personnel should exercise their discretion in using this guidance document as well as other relevant information in choosing a statistical method and sampling procedure that meet the regulatory requirements for evaluating ground-water monitoring data from RCRA facilities.
This document has been reviewed by the Office of Solid Waste, U.S. Envi-
ronmental Protection Agency, Washington, D.C., and approved for publication.
Approval does not signify that the contents necessarily reflect the views and
policies of the U.S. Environmental Protection Agency, nor does mention of
trade names, commercial products, or publications constitute endorsement or
recommendation for use.
ACKNOWLEDGMENT
This document was developed by EPA's Office of Solid Waste under the direction of Dr. Vernon Myers, Chief of the Ground-Water Section of the Waste Management Division. The document was prepared by the joint efforts of Dr. Vernon B. Myers and Mr. James R. Brown of the Waste Management Division, Mr. James Craig of the Office of Policy, Planning and Information, and Mr. Barnes Johnson of the Office of Policy, Planning, and Evaluation. Technical support in the preparation of this document was provided by Midwest Research Institute (MRI) under a subcontract to NUS Corporation, the prime contractor with EPA's Office of Solid Waste. MRI staff who assisted with the preparation of the document were Jairus D. Flora, Jr., Ph.D., Principal Statistician, Ms. Karin M. Bauer, Senior Statistician, and Mr. Joseph S. Bartling, Assistant Statistician.
PREFACE
This guidance document has been developed primarily for evaluating ground-water monitoring data at RCRA (Resource Conservation and Recovery Act) facilities. The statistical methodologies described in this document can be applied to both hazardous (Subtitle C of RCRA) and municipal (Subtitle D of RCRA) waste land disposal facilities.

The recently amended regulations concerning the statistical analysis of ground-water monitoring data at RCRA facilities (53 FR 39720: October 11, 1988) provide a wide variety of statistical methods that may be used to evaluate ground-water quality. To the experienced and inexperienced water quality professional, the choice of which test to use under a particular set of conditions may not be apparent. The reader is referred to Section 4 of this guidance, "Choosing a Statistical Method," for assistance in choosing an appropriate statistical test. For relatively new facilities that have only limited amounts of ground-water monitoring data, it is recommended that a form of hypothesis test (e.g., parametric analysis of variance) be employed to evaluate the data. Once sufficient data are available (after 12 to 24 months or eight background samples), another method of analysis such as the control chart methodology described in Section 7 of the guidance is recommended. Each method of analysis and the conditions under which it will be used can be written in the facility permit. This will eliminate the need for a permit modification each time more information about the hydrogeochemistry is collected and more appropriate methods of data analysis become apparent.
This guidance was written primarily for the statistical analysis of ground-water monitoring data at RCRA facilities. The guidance has wider applications, however, if one examines the spatial relationships involved between the monitoring wells and the potential contaminant source. For example, Section 5 of the guidance describes background well (upgradient) vs. compliance well (downgradient) comparisons. This scenario can be applied to other non-RCRA situations involving the same spatial relationships and the same null hypothesis. The explicit null hypothesis (H0) for testing contrasts between means, or where appropriate between medians, is that the means between groups (here monitoring wells) are equal (i.e., no release has been detected), or that the group means are below a prescribed action level (e.g., the ground-water protection standard). Statistical methods that can be used to evaluate these conditions are described in Sections 5.2 (Analysis of Variance), 5.3 (Tolerance Intervals), and 5.4 (Prediction Intervals).
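In symbols (notation added here for illustration; it is not taken from the regulation), with k monitoring wells whose group means are mu_1, ..., mu_k and with C a fixed action level, the two forms of this null hypothesis are

    H0: mu_1 = mu_2 = ... = mu_k   (no release detected), or
    H0: mu_j <= C for every well j   (means at or below the action level),

and rejection of H0 is taken as statistically significant evidence of contamination.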
A different situation exists when compliance wells (downgradient) are compared to a fixed standard (e.g., the ground-water protection standard). In that case, Section 6 of the guidance should be consulted. The value to which the constituent concentrations at compliance wells are compared can be any
standard established by a Regional Administrator, State or county health
official, or another appropriate official.
A note of caution applies to Section 6. The examples used in Section 6 are used to determine whether ground water has been contaminated as a result of a release from a facility. When the lower confidence limit is exceeded, further action or assessment may be warranted. If one wishes to determine whether a cleanup standard has been attained for a Superfund site or a RCRA facility in corrective action, another EPA guidance document entitled Statistical Methods for the Attainment of Superfund Cleanup Standards (Volume 2: Ground Water, Draft) should be consulted. This draft Superfund guidance is a multivolume set that addresses questions regarding the success of air, ground-water, and soil remediation efforts. Information about the availability of this draft guidance, currently being developed, can be obtained by calling the RCRA/Superfund Hotline, telephone (800) 424-9346 or (202) 382-3000.
Those interested in evaluating individual uncontaminated wells or in an intrawell comparison are referred to Section 7 of the guidance, which describes the use of Shewhart-CUSUM control charts and trend analysis. Municipal water supply engineers, for example, who wish to monitor water quality parameters in supply wells, may find this section useful.

Other sections of this guidance have wide applications in the field of applied statistics, regardless of the intended use or purpose. Sections 4.2 and 4.3 provide information on checking distributional assumptions and equality of variance, while Sections 8.1 and 8.2 cover limit of detection problems and outliers. Helpful advice and references for many experiments involving the use of statistics can be found in these sections.
Finally, it should be noted that this guidance is not intended to be the final chapter on the statistical analysis of ground-water monitoring data, nor should it be used as such. 40 CFR Part 264 Subpart F offers an alternative [§264.97(h)(5)] to the methods suggested and described in this guidance document. In fact, the guidance recommends a procedure (confidence intervals) for comparing monitoring data to a fixed standard that is not mentioned in the Subpart F regulations. This is neither contradictory nor inconsistent, but rather epitomizes the complexities of the subject matter and exemplifies the need for flexibility due to the site-specific monitoring requirements of the RCRA program.
CONTENTS
1. Introduction 1-1
2. Regulatory Overview 2-1
2.1 Background 2-1
2.2 Overview of Methodology 2-3
2.3 General Performance Standards 2-3
2.4 Basic Statistical Methods and Sampling
Procedures 2-6
3. Choosing a Sampling Interval 3-1
4. Choosing a Statistical Method 4-1
4.1 Flowcharts--Overview and Use 4-1
4.2 Checking Distributional Assumptions 4-4
4.3 Checking Equality of Variance: Bartlett's Test 4-16
5. Background Well to Compliance Well Comparisons 5-1
5.1 Summary Flowchart for Background Well to
Compliance Well Comparisons 5-2
5.2 Analysis of Variance 5-5
5.3 Tolerance Intervals Based on the Normal
Distribution 5-19
5.4 Prediction Intervals 5-23
6. Comparisons with MCLs or ACLs 6-1
6.1 Summary Chart for Comparison with MCLs or ACLs 6-1
6.2 Statistical Procedures 6-1
7. Control Charts for Intra-Well Comparisons 7-1
7.1 Advantages of Plotting Data 7-1
7.2 Correcting for Seasonality 7-2
7.3 Combined Shewhart-CUSUM Control Charts for Each
Well and Constituent 7-5
7.4 Update of a Control Chart 7-10
7.5 Nondetects in a Control Chart 7-12
8. Miscellaneous Topics 8-1
8.1 Limit of Detection 8-1
8.2 Outliers 8-10
Appendices
A. General Statistical Considerations and Glossary of
Statistical Terms A-1
B. Statistical Tables B-1
C. General Bibliography C-1
FIGURES
Number Page
3-1 Hydraulic conductivity (in three units) of selected rocks 3-3
3-2 Total porosity and drainable porosity for typical
geologic materials 3-6
3-3 Potentiometric surface map for computation of hydraulic
gradient 3-8
4-1 Flowchart overview 4-3
4-2 Probability plot of raw chlordane concentrations 4-11
4-3 Probability plot of log-transformed chlordane concentrations.. 4-12
5-1 Background well to compliance well comparisons 5-3
5-2 Tolerance limits: alternate approach to background
well to compliance well comparisons 5-4
6-1 Comparisons with MCLs/ACLs 6-2
7-1 Plot of unadjusted and seasonally adjusted monthly
observations 7-6
7-2 Combined Shewhart-CUSUM chart 7-11
TABLES
Number Page
2-1 Summary of Statistical Methods 2-7
3-1 Default Values for Effective Porosity (Ne) for Use in Time
of Travel (TOT) Analyses 3-4
3-2 Specific Yield Values for Selected Rock Units 3-5
3-3 Determining a Sampling Interval 3-10
4-1 Example Data for Coefficient-of-Variation Test 4-7
4-2 Example Data Computations for Probability Plotting 4-10
4-3 Cell Boundaries for the Chi-Squared Test 4-13
4-4 Example Data for Chi-Squared Test 4-15
4-5 Example Data for Bartlett's Test 4-18
5-1 One-Way Parametric ANOVA Table 5-8
5-2 Example Data for One-Way Parametric Analysis of Variance 5-11
5-3 Example Computations in One-Way Parametric ANOVA Table 5-12
5-4 Example Data for One-Way Nonparametric ANOVA--Benzene
Concentrations (ppm) 5-17
5-5 Example Data for Normal Tolerance Interval 5-22
5-6 Example Data for Prediction Interval--Chlordane Levels 5-26
6-1 Example Data for Normal Confidence Interval--Aldicarb
Concentrations in Compliance Wells (ppb) 6-4
6-2 Example Data for Log-Normal Confidence Interval--EDB
Concentrations in Compliance Wells (ppb) 6-6
6-3 Values of M and n+1-M and Confidence Coefficients for
Small Samples 6-9
6-4 Example Data for Nonparametric Confidence Interval--Silvex
Concentrations (ppm) 6-10
6-5 Example Data for a Tolerance Interval Compared to an ACL...... 6-13
7-1 Example Computation for Deseasonalizing Data 7-4
7-2 Example Data for Combined Shewhart-CUSUM Chart--Carbon
Tetrachloride Concentration (µg/L) 7-9
8-1 Methods for Below Detection Limit Values 8-2
8-2 Example Data for a Test of Proportions 8-5
8-3 Example Data for Testing Cohen's Test 8-8
8-4 Example Data for Testing for an Outlier 8-12
EXECUTIVE SUMMARY
The hazardous waste regulations under the Resource
Conservation and Recovery Act (RCRA) require owners and operators
of hazardous waste facilities to utilize design features and
control measures that prevent the release of hazardous waste into
ground water. Further, regulated units (i.e., all surface
impoundments, waste piles, land treatment units, and landfills
that receive hazardous waste after July 26, 1982) are also
subject to the ground-water monitoring and corrective action
standards of 40 CFR Part 264, Subpart F. These regulations
require that a statistical method and sampling procedure approved
by EPA be used to determine whether there are releases from
regulated units into ground water.
This document provides guidance to RCRA Facility permit
applicants and writers concerning the statistical analysis of
ground-water monitoring data at RCRA facilities. Section 1 is an
introduction to the guidance; it describes the purpose and intent
of the document, and emphasizes the need for site-specific
considerations in implementing the Subpart F regulations of 40
CFR Part 264.
Section 2 provides the reader with an overview of the
recently promulgated regulations concerning the statistical
analysis of ground-water monitoring data (53 FR 39720: October
11, 1988). The requirements of the regulation are reviewed, and
the need to consider site-specific factors in evaluating data at
a hazardous waste facility is emphasized.
Section 3 discusses the important hydrogeologic parameters to
consider when choosing a sampling interval. The Darcy equation
is used to determine the horizontal component of the average
linear velocity of ground water. This parameter provides a good
estimate of time of travel for most soluble constituents in
ground water, and may be used to determine a sampling interval.
Example calculations are provided at the end of the section to
further assist the reader.
Section 4 provides guidance on choosing an appropriate
statistical method. A flowchart to guide the reader through this
section, as well as procedures to test the distributional
assumptions of data are presented. Finally, this section
outlines procedures to test specifically for equality of
variance.
Section 5 covers statistical methods that may be used to evaluate ground-water monitoring data when background wells have been sited hydraulically upgradient from the regulated unit, and a second set of wells is sited hydraulically downgradient from the regulated unit at the point of compliance. The data from these compliance wells are compared to data from the background wells to determine whether a release from a facility has occurred. Parametric and nonparametric analysis of variance, tolerance intervals, and prediction intervals are suggested methods for this type of comparison. Flowcharts, procedures, and example calculations are given for each testing method.
Section 6 includes statistical procedures that are
appropriate when comparing ground-water constituent
concentrations to fixed concentration limits (e.g., alternate
concentration limits or maximum concentration limits). The
methods applicable to this type of comparison are confidence
intervals and tolerance intervals. As in Section 5, flowcharts,
procedures, and examples explain the calculations necessary for
each testing method.
Section 7 presents the case where the level of each
constituent within a single, uncontaminated well is being
compared to its historic background concentrations. This is
known as an intra-well comparison. In essence, the data for each
constituent in each well are plotted on a time scale and
inspected for obvious features such as trends or sudden changes
in concentration levels. The method suggested in this section is
a combined Shewhart-CUSUM control chart.
Section 8 contains a variety of special topics that are
relatively short and self-contained. These topics include methods to deal with data that are below the limit of analytical detection and methods to test for outliers or extreme values in the data.
Finally, the guidance presents appendices that cover general
statistical considerations, a glossary of statistical terms,
statistical tables, and a listing of references. These
appendices provide necessary and ancillary information to aid the
user in evaluating ground-water monitoring data.
SECTION 1
INTRODUCTION
The U.S. Environmental Protection Agency (EPA) promulgated regulations
for detecting contamination of ground water at hazardous waste land disposal
facilities under the Resource Conservation and Recovery Act (RCRA) of 1976.
The statistical procedures specified for use to evaluate the presence of contamination have been criticized and require improvement. Therefore, EPA has revised those statistical procedures in 40 CFR Part 264, "Statistical Methods for Evaluating Ground-Water Monitoring Data From Hazardous Waste Facilities."
In 40 CFR Part 264, EPA has recently amended the Subpart F regulations
with statistical methods and sampling procedures that are appropriate for
evaluating ground-water monitoring data under a variety of situations (53 FR
39720: October 11, 1988). The purpose of this document is to provide guidance in determining which situation applies and consequently which statistical procedure may be used. In addition to providing guidance on selection of an appropriate statistical procedure, this document provides instructions on carrying out the procedure and interpreting the results.
The regulations provide three levels of monitoring for a regulated
unit: detection monitoring; compliance monitoring; and corrective action.
The regulations define conditions for a regulated unit to be changed from one
level of monitoring to a more stringent level of monitoring (e.g., from detec-
tion monitoring to compliance monitoring). These conditions are that there is statistically significant evidence of contamination [40 CFR §264.91(a)(1) and (a)(2)].
The regulations allow the benefit of the doubt to reside with the current stage of monitoring. That is, a unit will remain in its current monitoring stage unless there is convincing evidence to change it. This means that a unit will not be changed from detection monitoring to compliance monitoring (or from compliance monitoring to corrective action) unless there is statistically significant evidence of contamination (or contamination above the compliance limit).
The main purpose of this document is to guide owners, operators, Regional Administrators, State Directors, and other interested parties in the selection, use, and interpretation of appropriate statistical methods for monitoring the ground water at each specific regulated unit. Topics to be covered include sampling needed, sample sizes, selection of appropriate statistical design, matching analysis of data to design, and interpretation of results. Specific recommended methods are detailed and a general discussion of evaluation of alternate methods is provided. Statistical concepts are discussed in
Appendix A. References for suggested procedures are provided as well as
references to alternate procedures and general statistics texts. Situations calling for external consultation are mentioned, as well as sources for obtaining expert assistance when needed.
EPA would like to emphasize the need for site-specific considerations in implementing the Subpart F regulations of 40 CFR Part 264 (especially as amended, 53 FR 39720: October 11, 1988). It has been an ongoing strategy to promulgate regulations that are specific enough to implement, yet flexible enough to accommodate a wide variety of site-specific environmental factors. This is usually achieved by specifying criteria that are appropriate for the majority of monitoring situations, while at the same time allowing alternatives that are also protective of human health and the environment. This philosophy is maintained in the recently promulgated amendments entitled "Statistical Methods for Evaluating Ground-Water Monitoring Data From Hazardous Waste Facilities" (53 FR 39720: October 11, 1988). The sections that allow for the use of an alternate sampling procedure and statistical method [§264.97(g)(2) and §264.97(h)(5), respectively] are as viable as those that are explicitly referenced [§264.97(g)(1) and §264.97(h)(1)-(4)], provided they meet the performance standards of §264.97(i). Due consideration to this should be given when preparing and reviewing Part B permits and permit applications.
SECTION 2
REGULATORY OVERVIEW
In 1982, EPA promulgated ground-water monitoring and response standards for permitted facilities in Subpart F of 40 CFR Part 264, for detecting releases of hazardous wastes into ground water from storage, treatment, and disposal units at permitted facilities (47 FR 32274: July 26, 1982).
The Subpart F regulations required ground-water data to be examined by Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) to determine whether there was a significant exceedance of background levels, or other allowable levels, of specified chemical parameters and hazardous waste constituents. One concern was that this procedure could result in a high rate of "false positives" (Type I error), thus requiring an owner or operator unnecessarily to advance into a more comprehensive and expensive phase of monitoring. More importantly, another concern was that the procedure could result in a high rate of "false negatives" (Type II error), i.e., instances where actual contamination would go undetected.

As a result of these concerns, EPA amended the CABF procedure with five different statistical methods that are more appropriate for ground-water monitoring (53 FR 39720: October 11, 1988). These amendments also outline sampling procedures and performance standards that are designed to help minimize the chance that a statistical method will indicate contamination when it is not present (Type I error), or fail to detect contamination when it is present (Type II error).
2.1 BACKGROUND
Subtitle C of the Resource Conservation and Recovery Act of 1976 (RCRA) creates a comprehensive program for the safe management of hazardous waste. Section 3004 of RCRA requires owners and operators of facilities that treat, store, or dispose of hazardous waste to comply with standards established by EPA that are "necessary to protect human health and the environment." Section 3005 provides for implementation of these standards under permits issued to owners and operators by EPA or authorized States. Section 3005 also provides that owners and operators of existing facilities that apply for a permit and comply with applicable notice requirements may operate until a permit determination is made. These facilities are commonly known as "interim status" facilities. Owners and operators of interim status facilities also must comply with standards set under Section 3004.
EPA promulgated ground-water monitoring and response standards for permitted facilities in 1982 (47 FR 32274, July 26, 1982), codified in 40 CFR Part 264, Subpart F. These standards establish programs for protecting ground water from releases of hazardous wastes from treatment, storage, and disposal units. Facility owners and operators were required to sample ground water at specified intervals and to use a statistical procedure to determine whether or not hazardous wastes or constituents from the facility are contaminating ground water. As explained in more detail below, the Subpart F regulations regarding statistical methods used in evaluating ground-water monitoring data that EPA promulgated in 1982 have generated criticism.
The Part 264 regulations prior to the October 11, 1988 amendments provided that the Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) or an alternate statistical procedure approved by EPA be used to determine whether there is a statistically significant exceedance of background levels, or other allowable levels, of specified chemical parameters and hazardous waste constituents. Although the regulations have always provided latitude for the use of an alternate statistical procedure, concerns were raised that the CABF statistical procedure in the regulations was not appropriate. It was pointed out that: (1) the replicate sampling method is not appropriate for the CABF procedure, (2) the CABF procedure does not adequately consider the number of comparisons that must be made, and (3) the CABF does not control for seasonal variation. Specifically, the concerns were that the CABF procedure could result in "false positives" (Type I error), thus requiring an owner or operator unnecessarily to collect additional ground-water samples, to further characterize ground-water quality, and to apply for a permit modification, which is then subject to EPA review. In addition, there was concern that CABF may result in "false negatives" (Type II error), i.e., instances where actual contamination goes undetected. This could occur because the background data, which are often used as the basis of the statistical comparisons, are highly variable due to temporal, spatial, analytical, and sampling effects.
As a result of these concerns, on October 11, 1988 EPA amended both the statistical methods and the sampling procedures of the regulations, by requiring (if necessary) that owners or operators more accurately characterize the hydrogeology and potential contaminants at the facility, and by including in the regulations performance standards that all the statistical methods and sampling procedures must meet. Statistical methods and sampling procedures meeting these performance standards would have a low probability of indicating contamination when it is not present, and of failing to detect contamination that actually is present. The facility owner or operator would have to demonstrate that a procedure is appropriate for the site-specific conditions at the facility, and to ensure that it meets the performance standards outlined below. This demonstration holds for any of the statistical methods and sampling procedures outlined in this regulation as well as any alternate methods or procedures proposed by facility owners and operators.
EPA recognizes that the selection of appropriate monitoring parameters is also an essential part of a reliable statistical evaluation. The Agency addressed this issue in a previous Federal Register notice (52 FR 25942, July 9, 1987).
2.2 OVERVIEW OF METHODOLOGY
EPA has elected to retain the idea of general performance requirements that the regulated community must meet. This approach allows for flexibility in tailoring statistical methods and sampling procedures to site-specific considerations.

EPA has tried to bring a measure of certainty to these methods, while accommodating the unique nature of many of the regulated units in question. Consistent with this general strategy, the Agency is establishing several options for the sampling procedures and statistical methods to be used in detection monitoring and, where appropriate, in compliance monitoring.
The owner or operator shall submit, for each of the chemical parameters and hazardous constituents listed in the facility permit, one or more of the statistical methods and sampling procedures described in the regulations promulgated on October 11, 1988. In deciding which statistical test is appropriate, he or she will consider the theoretical properties of the test, the data available, the site hydrogeology, and the fate and transport characteristics of potential contaminants at the facility. The Regional Administrator will review, and if appropriate, approve the proposed statistical methods and sampling procedures when issuing the facility permit.
The Agency recognizes that there may be situations where any one statistical test may not be appropriate. This is true of new facilities with little or no ground-water monitoring data. If insufficient data prevent the owner or operator from specifying a statistical method of analysis, then contingency plans containing several methods of data analysis, and the conditions under which each method can be used, will be specified by the Regional Administrator in the permit. In many cases, the parametric ANOVA can be performed after six months of data have been collected. This will eliminate the need for a permit modification in the event that data collected during future sampling and analysis events indicate the need to change to a more appropriate statistical method of analysis.
2.3 GENERAL PERFORMANCE STANDARDS
EPA's basic concern in establishing these performance standards for statistical methods is to achieve a proper balance between the risk that the procedures will falsely indicate that a regulated unit is causing background values or concentration limits to be exceeded (false positives) and the risk that the procedures will fail to indicate that background values or concentration limits are being exceeded (false negatives). EPA's approach is designed to address that concern directly. Thus any statistical method or sampling procedure, whether specified here or as an alternative to those specified, should meet the following performance standards contained in 40 CFR §264.97(i):
1. The statistical test is to be conducted separately for each hazardous constituent in each well [under §264.97(g)]. If the distribution of the chemical parameters or constituents is shown by the owner or operator to be inappropriate for a normal theory test, then
the data should be transformed or a distribution-free theory test should be used. If the distributions for the constituents differ, more than one statistical method may be needed.
2. If an individual well comparison procedure is used to compare an individual compliance well constituent concentration with background constituent concentrations or a ground-water protection standard, the test shall be done at a Type I error level of no less than 0.01 for each testing period. If a multiple comparisons procedure is used, the Type I experimentwise error rate shall be no less than 0.05 for each testing period; however, the Type I error of no less than 0.01 for individual well comparisons must be maintained. This performance standard does not apply to control charts, tolerance intervals, or prediction intervals unless they are modeled after hypothesis testing procedures that involve setting significance levels.
3. If a control chart approach is used to evaluate ground-water monitoring data, the specific type of control chart and its associated parameters shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds it to be protective of human health and the environment.
4. If a tolerance interval or a prediction interval is used to evaluate ground-water monitoring data, then the levels of confidence shall be proposed; in addition, for tolerance intervals, the proportion of the population that the interval must contain (with the proposed confidence) shall be proposed by the owner or operator and approved by the Regional Administrator if he or she finds these parameters to be protective of human health and the environment. These parameters will be determined after considering the number of samples in the background data base, the distribution of the data, and the range of the concentration values for each constituent of concern.
5. The statistical method will include procedures for handling data below the limit of detection with one or more procedures that are protective of human health and the environment. Any practical quantitation limit (PQL) approved by the Regional Administrator under §264.97(h) that is used in the statistical method shall be the lowest concentration level that can be reliably achieved within specified limits of precision and accuracy during routine laboratory operating conditions available to the facility.
6. If necessary, the statistical method shall include procedures to control or correct for seasonal and spatial variability as well as temporal correlation in the data.
In referring to "statistical methods," EPA means to emphasize that the concept of "statistical significance" must be reflected in several aspects of the monitoring program. This involves not only the choice of a level of significance, but also the choice of a statistical test, the sampling requirements, the number of samples, and the frequency of sampling. Since all of
these parameters interact to determine the ability of the procedure to detect contamination, the statistical methods, like a comprehensive ground-water monitoring program, must be evaluated in their entirety, not by individual components. Thus a systems approach to ground-water monitoring is endorsed.
The second performance standard requires further comment. For individual well comparisons in which an individual compliance well is compared to background, the Type I error level shall be no less than 1% (0.01) for each testing period. In other words, the probability of the test resulting in a false positive is no less than 1 in 100. EPA believes that this significance level is sufficient in limiting the false positive rate while at the same time controlling the false negative (missed detection) rate.
Owners and operators of facilities that have an extensive network of ground-water monitoring wells may find it more practical to use a multiple well comparisons procedure. Multiple comparisons procedures control the experimentwise error rate for comparisons involving multiple upgradient and downgradient wells. If this method is used, the Type I experimentwise error rate for each constituent shall be no less than 5% (0.05) for each testing period.
In using a multiple well comparisons procedure, if the owner or operator chooses to use a t-statistic rather than an F-statistic, the individual well Type I error level must be maintained at no less than 1% (0.01). This provision should be considered if a facility owner or operator wishes to use a procedure that distributes the risk of a false positive evenly throughout all monitoring wells (e.g., Bonferroni t-test).
Setting these levels of significance at 1% and 5%, respectively, raises an important question of how the false positive rate will be controlled at facilities with a large number of ground-water monitoring wells and monitoring constituents. The Agency set these levels of significance on the basis of a single testing period and not on the entire operating life of the facility. Further, large facilities can reduce the false positive rate by implementing a unit-specific monitoring approach. Nonetheless, it is evident that facilities with an extensive number of ground-water monitoring wells which are monitored for many constituents may still generate a large number of comparisons during each testing period.
In these particular situations, a determination of whether a release from a facility has occurred may require the Regional Administrator to evaluate the site hydrogeology, geochemistry, climatic factors, and other environmental parameters to determine if a statistically significant result is indicative of an actual release from the facility. In making this determination, the Regional Administrator may note the relative magnitude of the concentration of the constituent(s). If the exceedance is based on an observed compliance well value that is of the same relative magnitude as the PQL (practical quantitation limit) or the background concentration level, then a false positive may have occurred, and further sampling and testing may be appropriate. If, however, the background concentration level or an action level is substantially
exceeded, then the exceedance is more likely to be indicative of a release from the facility.
2.4 BASIC STATISTICAL METHODS AND SAMPLING PROCEDURES
The October 11, 1988 rule specifies five types of statistical methods to detect contamination in ground water. EPA believes that at least one of these types of procedures will be appropriate for virtually all facilities. To address situations where these methods may not be appropriate, EPA has included a provision for the owner or operator to select an alternate method, which is subject to approval by the Regional Administrator.
2.4.1 The Five Statistical Methods Outlined in the October 11, 1988 Final Rule
1. A parametric analysis of variance (ANOVA) followed by multiple comparison procedures to identify specific sources of difference. The procedures will include estimation and testing of the contrasts between the mean of each compliance well and the background mean for each constituent.
2. An analysis of variance (ANOVA) based on ranks followed by multiple comparison procedures to identify specific sources of difference. The procedure will include estimation and testing of the contrasts between the median of each compliance well and the median background levels for each constituent.
3. A procedure in which a tolerance interval or a prediction interval for each constituent is established from the background data, and the level of each constituent in each compliance well is compared to its upper tolerance or prediction limit.
4. A control chart approach which will give control limits for each constituent. If any compliance well has a value or a sequence of values that lie outside the control limits for that constituent, it may constitute statistically significant evidence of contamination.
5. Another statistical method submitted by the owner or operator and
approved by the Regional Administrator.
A summary of these statistical methods and their applicability is presented in Table 2-1. The table lists types of comparisons and the recommended procedure, and refers the reader to the appropriate sections where a discussion and example can be found.
EPA is specifying multiple statistical methods and sampling procedures and has allowed for alternatives because no one method or procedure is appropriate for all circumstances. EPA believes that the suggested methods and procedures are appropriate for the site-specific design and analysis of data from ground-water monitoring systems and that they can account for more of the site-specific factors than Cochran's Approximation to the Behrens-Fisher Student's t-test (CABF) and the accompanying sampling procedures in the past
TABLE 2-1. SUMMARY OF STATISTICAL METHODS

Type of comparison        Compound            Recommended method       Section of guidance document
Background vs.            Any compound in     ANOVA                    5.2
compliance well           background          Tolerance limits         5.3
                                              Prediction intervals     5.4
Intra-well                Any compound        Control charts           7
Fixed standard            ACL/MCL specific    Confidence intervals     6.2.1
                                              Tolerance limits         6.2.2
Many nondetects           Synthetic           See below detection      8.1
in data set                                   limit, Table 8-1
regulations could. The statistical methods specified here address the multiple comparison problems and provide for documenting and accounting for sources of natural variation. EPA believes that the specified statistical methods and procedures consider and control for natural temporal and spatial variation.
2.4.2 Site-Specific Considerations for Sampling
The decision on the number of wells needed in a monitoring system will be made on a site-specific basis by the Regional Administrator and will consider the statistical method being used, the site hydrogeology, the fate and transport characteristics of potential contaminants, and the sampling procedure. The number of wells must be sufficient to ensure a high probability of detecting contamination when it is present. To determine which sampling procedure should be used, the owner or operator shall consider existing data and site characteristics, including the possibility of trends and seasonality. These sampling procedures are:
1. Obtain a sequence of at least four samples taken at an interval that ensures, to the greatest extent technically feasible, that an independent sample is obtained, by reference to the uppermost aquifer's effective porosity, hydraulic conductivity, and hydraulic gradient, and the fate and transport characteristics of potential contaminants. The sampling interval that is proposed must be approved by the Regional Administrator.
2. An alternate sampling procedure proposed by the owner or operator and approved by the Regional Administrator if he or she finds it to be protective of human health and the environment.
EPA believes that the above sampling procedures will allow the use of statistical methods that will accurately detect contamination. These sampling procedures may be used to replace the sampling method present in the former Subpart F regulations. Rather than taking a single ground-water sample and dividing it into four replicate samples, a sequence of at least four samples taken at intervals far enough apart in time (daily, weekly, or monthly, depending on rates of ground-water flow and contaminant fate and transport characteristics) will help ensure the sampling of a discrete portion (i.e., an independent sample) of ground water. In hydrogeologic environments where the ground-water velocity prohibits one from obtaining four independent samples on a semiannual basis, an alternate sampling procedure approved by the Regional Administrator may be utilized [40 CFR §264.97(g)(1) and (2)].
The Regional Administrator shall approve an appropriate sampling procedure and interval submitted by the owner or operator after considering the effective porosity, hydraulic conductivity, and hydraulic gradient in the uppermost aquifer under the waste management area, and the fate and transport characteristics of potential contaminants. Most of this information is already required to be submitted in the facility's Part B permit application under §270.14(c) and may be used by the owner or operator to make this determination. Further, the number and kinds of samples collected to establish background concentration levels should be appropriate to the form of statistical test employed, following generally accepted statistical principles
[40 CFR §264.97(g)]. For example, the use of control charts presumes a well-defined background of at least eight samples per well. By contrast, ANOVA alternatives might require only four samples per well.
It seems likely that most facilities will be sampling monthly over four consecutive months, twice a year. In order to maintain a complete annual record of ground-water data, the facility owner or operator may find it desirable to obtain a sample each month of the year. This will help identify seasonal trends in the data and permit evaluation of the effects of autocorrelation and seasonal variation if present in the samples.
The concentrations of a constituent determined in these samples are intended to be used in one-point-in-time comparisons between background and compliance wells. This approach will help reduce the components of seasonal variation by providing for simultaneous comparisons between background and compliance well information.
The flexibility for establishing sampling intervals was chosen to allow for the unique nature of the hydrogeologic systems beneath hazardous waste sites. This sampling scheme will give proper consideration to the temporal variation of, and autocorrelation among, the ground-water constituents. The specified procedure requires sampling data from background wells, at the compliance point, and according to a specific test protocol. The owner or operator should use a background value determined from data collected under this scenario if a test approved by the Regional Administrator requires it or if a concentration limit in compliance monitoring is to be based upon background data.
EPA recognizes that there may be situations where the owner or operator can devise alternate statistical methods and sampling procedures that are more appropriate to the facility and that will provide reliable results. Therefore, today's regulations allow the Regional Administrator to approve such procedures if he or she finds that the procedures balance the risk of false positives and false negatives in a manner comparable to that provided by the above-specified tests and that they meet specified performance standards [40 CFR §264.97(g)]. In examining the ability of the procedure to provide a reasonable balance between the risk of false positives and false negatives, the owner or operator will specify in the alternate plan such parameters as sampling frequency and sample size.
2.4.3 The "Reasonable Confidence" Requirement
The methods indicate that the procedure must provide reasonable confidence that the migration of hazardous constituents from a regulated unit into and through the aquifer will be detected. (The reference to hazardous constituents does not mean that this option applies only to compliance monitoring; the procedure also applies to monitoring parameters and constituents in the detection monitoring program since they are surrogates indicating the presence of hazardous constituents.) The protocols for the specific tests, however, will be used as a general benchmark to define "reasonable confidence" in the proposed procedure. If the owner or operator shows that his or her suggested test is comparable in its results to one of the specified tests,
then it is likely to be acceptable under the "reasonable confidence" test. There may be situations, however, where it will be difficult to directly compare the performance of an alternate test to the protocols for the specified tests. In such cases the alternate test will have to be evaluated on its own merits.
SECTION 3
CHOOSING A SAMPLING INTERVAL
This section discusses the important hydrogeologic parameters to consider when choosing a sampling interval. The Darcy equation is used to determine the horizontal component of the average linear velocity of ground water. This value provides a good estimate of time of travel for most soluble constituents in ground water, and can be used to determine a sampling interval. Example calculations are provided at the end of the section to further assist the reader.
Section 264.97(g) of 40 CFR Part 264 Subpart F provides the owner or operator of a RCRA facility with a flexible sampling schedule that will allow him or her to choose a sampling procedure that will reflect site-specific concerns. This section specifies that the owner or operator shall, on a semiannual basis, obtain a sequence of at least four samples from each well, based on an interval that is determined after evaluating the uppermost aquifer's effective porosity, hydraulic conductivity, and hydraulic gradient, and the fate and transport characteristics of potential contaminants. The intent of this provision is to set a sampling frequency that allows sufficient time to pass between sampling events to ensure, to the greatest extent technically feasible, that an independent ground-water sample is taken from each well. For further information on ground-water sampling, refer to the EPA "Practical Guide for Ground-Water Sampling," Barcelona et al., 1985.
The sampling frequency of the four semiannual sampling events required in Part 264 Subpart F can be based on estimates using the average linear velocity of ground water. Two forms of the Darcy equation stated below relate ground-water velocity (V) to effective porosity (Ne), hydraulic gradient (i), and hydraulic conductivity (K):

    Vh = (Kh x i)/Ne    and    Vv = (Kv x i)/Ne

where Vh and Vv are the horizontal and vertical components of the average linear velocity of ground water, respectively; Kh and Kv are the horizontal and vertical components of hydraulic conductivity; i is the head gradient; and Ne is the effective porosity. In applying these equations to ground-water monitoring, the horizontal component of the average linear velocity (Vh) can be used to determine an appropriate sampling interval. Usually, field investigations will yield bulk values for hydraulic conductivity. In most cases, the bulk hydraulic conductivity determined by a pump test, tracer test, or a slug test will be sufficient for these calculations. The vertical component of the average linear velocity of ground water (Vv), however, should
be considered in estimating flow velocities in areas with significant components of vertical velocity, such as recharge and discharge zones.
To apply the Darcy equation to ground-water monitoring, one needs to determine the parameters K, i, and Ne. The hydraulic conductivity, K, is the volume of water at the existing kinematic viscosity that will move in unit time under a unit hydraulic gradient through a unit area measured at right angles to the direction of flow. The reference to "existing kinematic viscosity" relates to the fact that hydraulic conductivity is determined not only by the media (aquifer), but also by fluid properties (ground water or potential contaminants). Thus, it is possible to have several hydraulic conductivity values for many different chemical substances that are present in the same aquifer. In either case it is advisable to use the greatest value for velocity that is calculated using the Darcy equation to determine sampling intervals. This will provide for the earliest detection of a leak from a hazardous waste facility and expeditious remedial action procedures. A range of hydraulic conductivities (the transmitted fluid is water) for various aquifer materials is given in Figure 3-1. The conductivities are given in three units: the top line is in meters per day; the middle line, in feet per day, is commonly used; the last line is expressed in gallons per day per square foot.
The hydraulic gradient, i, is the change in hydraulic head per unit of distance in a given direction. It can be determined by dividing the difference in head between two points on a potentiometric surface map by the orthogonal distance between those two points (see example calculation). Water level measurements are normally used to determine the natural hydraulic gradient at a facility. However, the effects of mounding in the event of a leak from a waste disposal facility may produce a steeper local hydraulic gradient in the vicinity of the monitoring well. These local changes in hydraulic gradient should be accounted for in the velocity calculations.
The effective porosity, Ne, is the ratio, usually expressed as a percentage, of the total volume of voids available for fluid transmission to the total volume of the porous medium dewatered. It can be estimated during a pump test by dividing the volume of water removed from an aquifer by the total volume of aquifer dewatered (see example calculation). Table 3-1 presents approximate effective porosity values for a variety of aquifer materials. In cases where the effective porosity is unknown, specific yield may be substituted into the equation. Specific yields of selected rock units are given in Table 3-2. In the absence of measured values, drainable porosity is often used to approximate effective porosity. Figure 3-2 illustrates representative values of drainable porosity and total porosity as a function of aquifer particle size.
Once the values for K, i, and Ne are determined, the horizontal component of the average linear velocity of ground water can be calculated. Using the Darcy equation, we can determine the time required for ground water to pass through the complete monitoring well diameter by dividing the monitoring well diameter by the horizontal component of the average linear velocity of ground water. This value will represent the minimum time interval required between sampling events that will yield an independent ground-water sample.
Figure 3-1. Hydraulic conductivity (in three units) of selected rocks. [The figure shows conductivity ranges, in m/day, ft/day, and gal/day-ft2, for igneous and metamorphic rocks (unfractured, fractured), basalt (unfractured, fractured, lava flow), sandstone (fractured, semiconsolidated), shale (unfractured, fractured), carbonate rocks (fractured, cavernous), clay/silt/loess, silty sand, clean sand (fine, coarse), glacial till, and gravel.] Source: Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey Water Supply Paper 2220, 84 pp.
TABLE 3-1. DEFAULT VALUES FOR EFFECTIVE POROSITY (Ne) FOR USE IN TIME OF TRAVEL (TOT) ANALYSES

Soil textural classes                              Effective porosity of saturation(a)

Unified soil classification system
  GW, GP, GM, GC, SW, SP, SM, SC                   0.20 (20%)
  ML, MH                                           0.15 (15%)
  CL, OL, CH, OH, PT                               0.01 (1%)(b)

USDA soil textural classes
  Clays, silty clays, sandy clays                  0.01 (1%)(b)
  Silts, silt loams, silty clay loams              0.10 (10%)
  All others                                       0.20 (20%)

Rock units (all)
  Porous media (nonfractured rocks such as         0.15 (15%)
    sandstone and some carbonates)
  Fractured rocks (most carbonates, shales,        0.0001 (0.01%)
    granites, etc.)

Source: Barari, A., and L. S. Hedges. 1985. Movement of Water in Glacial Till. Proceedings of the 17th International Congress of the International Association of Hydrogeologists, pp. 129-134.

(a) These values are estimates and there may be differences between similar units. For example, recent studies indicate that weathered and unweathered glacial till may have markedly different effective porosities (Barari and Hedges, 1985; Bradbury et al., 1985).

(b) Assumes de minimis secondary porosity. If fractures or soil structure are present, effective porosity should be 0.001 (0.1%).
TABLE 3-2. SPECIFIC YIELD VALUES FOR SELECTED ROCK TYPES

Rock type                         Specific yield (%)
Clay                                      2
Sand                                     22
Gravel                                   19
Limestone                                18
Sandstone (semiconsolidated)              6
Granite                                   0.09
Basalt (young)                            8

Source: Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey, Water Supply Paper 2220, 84 pp.
Figure 3-2. Total porosity and drainable porosity for typical geologic materials. [The figure plots total porosity and specific yield (drainable porosity), in percent, against maximum 10% grain size in millimeters, for materials ranging from clay through silt, sand, and gravel.] Source: Todd, D. K. 1980. Ground Water Hydrology. John Wiley and Sons, New York. 534 pp.
(Three-dimensional mixing of ground water in the vicinity of the monitoring well will occur when the well is purged before sampling, which is one reason why this method provides only an estimate of travel time.)
In determining these sampling intervals, one should note that many chemical compounds will not travel at the same velocity as ground water. Chemical characteristics such as adsorptive potential, specific gravity, and molecular size will influence the way chemicals travel in the subsurface. Large molecules, for example, will tend to travel more slowly than the average linear velocity of ground water because of matrix interactions. Compounds that exhibit a strong adsorptive potential will undergo a similar fate that will dramatically change time of travel predictions made using the Darcy equation. In some cases chemical interaction with the matrix material will alter the matrix structure and its associated hydraulic conductivity, which may result in an increase in contaminant mobility. This effect has been observed with certain organic solvents in clay units (see Brown and Andersen, 1981). Contaminant fate and transport models may be useful in determining the influence of these effects on movement in the subsurface. A variety of these models are available on the commercial market for private use.
EXAMPLE CALCULATION NO. 1: DETERMINING THE EFFECTIVE POROSITY (Ne)

The effective porosity, Ne, expressed in %, can be determined during a pump test using the following method:

    Ne = 100% x (volume of water removed)/(volume of aquifer dewatered)

Based on a pumping rate of 50 gal/min and a pumping duration of 30 min, compute the volume of water removed as:

    50 gal/min x 30 min = 1,500 gal

To calculate the volume of aquifer dewatered, use the formula:

    V = (1/3)(pi)(r^2)(h)

where r is the radius (ft) of the area affected by pumping and h (ft) is the drop in the water level. If, for example, h = 3 ft and r = 18 ft, then:

    V = (1/3)(3.14)(18^2)(3) = 1,018 ft^3

Next, converting ft^3 of water to gallons of water,

    V = (1,018 ft^3)(7.48 gal/ft^3) = 7,615 gal

Substituting the two volumes in the equation for the effective porosity, obtain

    Ne = 100% x 1,500/7,615 = 19.7%
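The same arithmetic can be scripted; the sketch below simply restates the pump-test numbers of this example (variable names are ours):

    import math

    # Effective porosity from a pump test (Example Calculation No. 1):
    # Ne = 100% x (water removed) / (aquifer volume dewatered).
    pump_rate_gpm = 50.0      # pumping rate, gal/min
    duration_min = 30.0       # pumping duration, min
    r_ft, h_ft = 18.0, 3.0    # radius of influence and water-level drop, ft

    water_removed_gal = pump_rate_gpm * duration_min        # 1,500 gal
    cone_ft3 = (1.0 / 3.0) * math.pi * r_ft ** 2 * h_ft     # dewatered cone, ft^3
    cone_gal = cone_ft3 * 7.48                              # ft^3 -> gal

    Ne_percent = 100.0 * water_removed_gal / cone_gal
    print(f"Ne = {Ne_percent:.1f}%")                        # about 19.7%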
EXAMPLE CALCULATION NO. 2: DETERMINING THE HYDRAULIC GRADIENT (i)

The hydraulic gradient, i, can be determined from a potentiometric
surface map (Figure 3-3 below) as i = Δh/L, where Δh is the difference in
head measured at piezometers Pz1 and Pz2, and L is the orthogonal distance
between the two piezometers.

Using the values given in Figure 3-3, obtain

    i = Δh/L = (29.2 ft - 29.1 ft)/100 ft = 0.001 ft/ft
[Figure 3-3. Potentiometric surface map for computation of hydraulic
gradient.]
This method provides only a very general estimate of the natural hydraulic
gradient that exists in the vicinity of the two piezometers. Chemical
gradients are known to exist and may override the effects of the hydraulic
gradient. A detailed study of the effects of multiple chemical contaminants
may be necessary to determine the actual average linear velocity (horizontal
component) of ground water in the vicinity of the monitoring wells.
EXAMPLE CALCULATION NO. 3: DETERMINING THE HORIZONTAL COMPONENT OF THE
AVERAGE LINEAR VELOCITY OF GROUND WATER (Vh)

A land disposal facility has ground-water monitoring wells that are
screened in an unconfined silty sand aquifer. Slug tests, pump tests, and
tracer tests conducted during a hydrogeologic site investigation have
revealed that the aquifer has a horizontal hydraulic conductivity (Kh) of
15 ft/day and an effective porosity (Ne) of 15%. Using a potentiometric map
(as in Example Calculation No. 2), the regional hydraulic gradient (i) has
been determined to be 0.003 ft/ft.
To estimate the minimum time interval between sampling events that will
allow one to obtain an independent sample of ground water, proceed as
follows.

Calculate the horizontal component of the average linear velocity of
ground water (Vh) using the Darcy equation, Vh = (Kh x i)/Ne.

With Kh = 15 ft/day,
     Ne = 15% (0.15), and
     i = 0.003 ft/ft, calculate

    Vh = (15)(0.003)/(0.15) = 0.3 ft/day, or equivalently
    Vh = (0.3 ft/day)(12 in/ft) = 3.6 in/day

Discussion: The horizontal component of the average linear velocity of
ground water, Vh, has been calculated and is equal to 3.6 in/day. Monitoring
well diameters at this particular facility are 4 in. We can determine the
minimum time interval between sampling events that will allow one to obtain
an independent sample of ground water by dividing the monitoring well
diameter by the horizontal component of the average linear velocity of
ground water:

    Minimum time interval = (4 in)/(3.6 in/day) = 1.1 days

Based on the above calculations, the owner or operator could sample every
other day. However, because the velocity can vary seasonally with recharge
rates, a weekly sampling interval would be advised.
Suggested Sampling Interval

    Date        Sample No.
    June 1          1
    June 8          2
    June 15         3
    June 22         4
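The velocity and minimum-interval arithmetic of Example Calculation No. 3
is a few lines in most software. The Python sketch below, with illustrative
(hypothetical) names, reproduces the numbers above.

    def vh_in_per_day(kh_ft_per_day, gradient, ne_fraction):
        # Darcy equation: Vh = (Kh x i)/Ne, converted from ft/day to in/day
        return (kh_ft_per_day * gradient / ne_fraction) * 12.0

    # Values from the example: Kh = 15 ft/day, i = 0.003 ft/ft, Ne = 15%
    vh = vh_in_per_day(15, 0.003, 0.15)
    interval_days = 4 / vh  # 4-in well diameter divided by Vh
    print(round(vh, 1), round(interval_days, 1))  # prints 3.6 1.1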
Table 3-3 gives some results for common situations.
TABLE 3-3. DETERMINING A SAMPLING INTERVAL

UNIT           Kh (ft/day)   Ne (%)   Vh (in/mo)   SAMPLING INTERVAL
GRAVEL         10³           19       9.6x10³      DAILY
SAND           10²           22       8.3x10²      DAILY
SILTY SAND     10            14       1.3x10²      WEEKLY
TILL           10⁻⁴           2       9.1x10⁻³     MONTHLY*
SS (SEMICON)   1              6       30           WEEKLY
BASALT         10⁻¹           8       2.28         MONTHLY

The horizontal component of the average linear velocity, Vh, is based on
a hydraulic gradient, i, of 0.005 ft/ft.
* Use a monthly sampling interval or an alternate sampling procedure.
SECTION 4
CHOOSING A STATISTICAL METHOD
This section discusses the choice of an appropriate statistical method.
Section 4.1 includes a flowchart to guide this selection, Section 4.2
contains procedures to test the distributional assumptions of statistical
methods, and Section 4.3 has procedures to test specifically for equality
of variances.

The choice of an appropriate statistical test depends on the type of
monitoring and the nature of the data. The proportion of values in the data
set that are below detection is one important consideration. If most of the
values are below detection, a test of proportions is suggested.
One set of statistical procedures is suggested when the monitoring consists
of comparisons of water sample data from the background (hydraulically
upgradient) well with the sample data from compliance (hydraulically
downgradient) wells. The recommended approach is analysis of variance
(ANOVA). Also, for a facility with limited amounts of data, it is advisable
to initially use the ANOVA method of data evaluation and later, when
sufficient amounts of data are collected, to change to a tolerance interval
or a control chart approach for each compliance well. However, alternate
approaches are allowed. These include adjustments for seasonality, use of
tolerance intervals, and use of prediction intervals. These methods are
discussed in Section 5.
When the monitoring objective is to compare the concentration of a hazardous
constituent to a fixed level such as a maximum concentration limit (MCL), a
different type of approach is needed. This type of comparison commonly
serves as a basis of compliance monitoring. Control charts may be used, as
may tolerance or confidence intervals. Methods for comparison with a fixed
level are presented in Section 6.

When a long history of data from each well is available, intra-well
comparisons are appropriate. That is, the data from a single uncontaminated
well are compared over time to detect shifts in concentration, or gradual
trends in concentration, that may indicate contamination. Methods for this
situation are presented in Section 7.
4.1 FLOWCHARTS: OVERVIEW AND USE

The selection and use of a statistical procedure for ground-water monitoring
is a detailed process. Because a single flowchart would become too
complicated for easy use, a series of flowcharts has been developed. These
flowcharts are found at the beginning of each section and are intended to
guide the user in the selection and use of procedures in that section. The
more detailed flowcharts can be thought of as attaching to the general
flowcharts at the indicated points.
Three general types of statistical procedures are presented in the flowchart
overview (Figure 4-1): (1) background well to compliance well data
comparisons; (2) comparison of compliance well data with a constant limit
such as an alternate concentration limit (ACL) or a maximum concentration
limit (MCL); and (3) intra-well comparisons. The first question to be asked
in determining the appropriate statistical procedure is the type of
monitoring program specified in the facility permit. The type of monitoring
program may determine if the appropriate comparison is among wells,
comparison of downgradient well data to a constant, intra-well comparisons,
or a special case.

If the facility is in detection monitoring, the appropriate comparison is
between wells that are hydraulically upgradient from the facility and those
that are hydraulically downgradient. The statistical procedures for this
type of monitoring are presented in Section 5. In detection monitoring, it
is likely that many of the monitored constituents may result in few
quantified results (i.e., much of the data are below the limit of analytical
detection). If this is the case, then the test of proportions
(Section 8.1.3) may be recommended. If the constituent occurs in measurable
concentrations in background, then analysis of variance (Section 5.2) is
recommended. This method of analysis is preferred when the data lack
sufficient quantity to allow for the use of tolerance intervals or control
charts.
If the facility is in compliance monitoring, the permit will specify the
type of compliance limit. If the compliance limit is determined from the
background, the statistical method is chosen from those that compare
background well to compliance well data. Statistical methods for this case
are presented in Section 5. The preferred method is the appropriate analysis
of variance method in Section 5.2 or, if sufficient data permit, tolerance
intervals or control charts. The flowchart in Section 5 aids in determining
which method is applicable.

If a facility in compliance monitoring has a constant maximum concentration
limit (MCL) or alternate concentration limit (ACL) specified, then the
appropriate comparison is with a constant. Methods for comparison with MCLs
or ACLs are presented in Section 6, which contains a flowchart to aid in
determining which method to use.
Finally, when more than one year of data have been collected from each well,
the facility owner or operator may find it useful to perform intra-well
comparisons over time to supplement the other methods. This is not a
regulatory requirement, but it could provide the facility owner or operator
with information about the site hydrogeology. This method of analysis may be
used when sufficient data from an individual uncontaminated well exist and
the data allow for the identification of trends. A recommended control chart
procedure (Starks, 1988) suggests that a minimum background sample of eight
observations is needed. Thus an intra-well control chart approach could
begin after the first complete year of data collection. These methods are
presented in Section 7.
[Figure 4-1. Flowchart overview. The type of permit determines the route:
detection monitoring leads to background/compliance well comparisons
(Section 5); compliance monitoring or corrective action leads either to
background/compliance well comparisons (Section 5) when the compliance limit
is based on background, or to comparisons with MCL/ACLs (Section 6) when an
MCL/ACL is specified; and, if more than 1 yr. of data is available,
intra-well comparisons with control charts (Section 7) may be used.]
4.2 CHECKING DISTRIBUTIONAL ASSUMPTIONS

The purpose of this section is to provide users with methods to check the
distributional assumptions of the statistical procedures recommended for
ground-water monitoring. It is emphasized that one need not do an extensive
study of the distribution of the data unless a nonparametric method of
analysis is used to evaluate the data. If the owner or operator wishes to
transform the data in lieu of using a nonparametric method, it must first be
shown that the untransformed data are inappropriate for a normal theory
test. Similarly, if the owner or operator wishes to use nonparametric
methods, he or she must demonstrate that the data do violate normality
assumptions.
EPA has adopted this approach because most of the statistical procedures
that meet the criteria set forth in the regulations are robust with respect
to departures from many of the normal distributional assumptions. That is,
only extreme violations of assumptions will result in an incorrect outcome
of a statistical test. Moreover, it is only in situations where it is
unclear whether contamination is present that departures from assumptions
will alter the outcome of a statistical test. EPA therefore believes that it
is protective of the environment to adopt the approach of not requiring
testing of assumptions of a normal distribution on a wide scale.
It should be noted that the normal distributional assumptions for
statistical procedures apply to the errors of the observations. Application
of the distributional tests to the observations themselves may lead to the
conclusion that the distribution does not fit the observations. In some
cases this lack of fit may be due to differences in means for the different
wells or some other cause. The tests for distributional assumptions are best
applied to the residuals from a statistical analysis. A residual is the
difference between the original observation and the value predicted by a
model. For example, in analysis of variance, the predicted values are the
group means and the residual is the difference between each observation and
its group mean.
If the conclusion from testing the assumptions is that the assumptions are
not adequately met, then a transformation of the data may be used or a
nonparametric statistical procedure selected. Many types of concentration
data have been reported in the literature to be adequately described by a
lognormal distribution. That is, the natural logarithm of the original
observations has been found to follow the normal distribution. Consequently,
if the normal distributional assumptions are found to be violated for the
original data, a transformation by taking the natural logarithm of each
observation is suggested. This assumes that the data are all positive. If
the log transformation does not adequately normalize the data or stabilize
the variance, one should use a nonparametric procedure or seek the
consultation of a professional statistician to determine an appropriate
statistical procedure.
The following sections present four selected approaches to check for
normality. The first option refers to literature citation; the other three
are statistical procedures. The choice is left to the user. The availability
of statistical software and the user's familiarity with it will be a factor
in the choice of a method. The coefficient-of-variation method, for example,
requires only the computation of the mean and standard deviation of the
data.
Plotting on probability paper can be done by hand but becomes tedious with
large data sets. However, the commercial Statistical Analysis System (SAS)
software package provides a computerized version of a probability plot in
its PROC UNIVARIATE procedure. SYSTAT, a package for PCs, also has a
probability plot procedure. The chi-squared test is not readily available
through commercial software but can be programmed on a PC (for example, in
LOTUS 1-2-3) or in any other (statistical) software language with which the
user is familiar. The amount of data available will also influence the
choice. All tests of distributional assumptions require a fairly large
sample size to detect moderate to small deviations from normality. The
chi-squared test requires a minimum of 20 samples for a reasonable test.
Other statistical procedures are available for checking distributional
assumptions. The more advanced user is referred to the Kolmogorov-Smirnov
test (see, for example, Lindgren, 1976), which is used to test the
hypothesis that data come from a specific (that is, completely specified)
distribution. The normal distribution assumption can thus be tested for. A
minimum sample size of 50 is recommended for using this test.

A modification to the Kolmogorov-Smirnov test has been developed by
Lilliefors, who uses the sample mean and standard deviation from the data as
the parameters of the distribution (Lilliefors, 1967). Again, a sample size
of at least 50 is recommended.

Another alternative for testing normality is provided by the rather involved
Shapiro-Wilk's test. The interested user is referred to the relevant article
in Biometrika by Shapiro and Wilk (1965).
4.2.1 Literature Citation
PURPOSE
An owner or operator may wish to consult the literature to determine what
type of distribution the ground-water monitoring data for a specific
constituent are likely to follow. This may avoid unnecessary computations
and make it easier to determine whether there is statistically significant
evidence of contamination.
PROCEDURE
One simple way to select a procedure based on a specific statistical
distribution is by citing a relevant published reference. The owner or
operator may find papers that discuss data resulting from sampling ground
water and conclude that such data for a particular constituent follow a
specified distribution. Citing such a reference may be sufficient
justification for using a method based on that distribution, provided that
the data do not show evidence that the assumptions are violated.

To justify the use of a literature citation, the owner or operator needs to
make sure that the reference cited considers the distribution of data for
the specific compound being monitored. In addition, he or she must evaluate
the similarity of their site to the site that was discussed in the
literature,
especially with respect to hydrogeologic and potential contaminant
characteristics. However, because many of the compounds may not be studied
in the literature, extrapolations to compounds with similar chemical
characteristics and to sites with similar hydrogeologic conditions are also
acceptable. Basically, the owner or operator needs to provide some reason or
justification for choosing a particular distribution.
4.2.2 Coefficient-of-Variation Test

Many statistical procedures assume that the data are normally distributed.
The concentration of a hazardous constituent in ground water is inherently
nonnegative, while the normal distribution allows for negative values.
However, if the mean of the normal distribution is sufficiently above zero,
the distribution places very little probability on negative observations and
is still a valid approximation.
One simple check that can rule out use of the normal distribution is to
calculate the coefficient of variation of the data. The use of this method
was required by the former Part 264 Subpart F regulations pursuant to
Section 264.97(h)(1). Because most owners and operators as well as Regional
personnel are already familiar with this procedure, it will probably be used
frequently. The coefficient of variation, CV, is the standard deviation of
the observations divided by their mean. If the normal distribution is to be
a valid model, there should be very little probability of negative values.
The number of standard deviations by which the mean exceeds zero determines
the probability of negative values. For example, if the mean exceeds zero by
one standard deviation, the normal distribution will have less than 0.159
probability of a negative observation.

Consequently, one can calculate the standard deviation of the observations,
calculate the mean, and form the ratio of the standard deviation divided by
the mean. If this ratio exceeds 1.00, there is evidence that the data are
not normal and the normal distribution should not be used for those data.
(There are other possibilities for nonnormality, but this is a simple check
that can rule out obviously nonnormal data.)
PURPOSE

This test is a simple check for evidence of gross nonnormality in the
ground-water monitoring data.

PROCEDURE

To apply the coefficient-of-variation check for normality, proceed as
follows.

Step 1. Calculate the sample mean, X̄, of the n observations Xᵢ,
i = 1, ..., n:

    X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ
Step 2. Calculate the sample standard deviation, S:

    S = [ Σᵢ₌₁ⁿ (Xᵢ - X̄)² / (n - 1) ]^(1/2)

Step 3. Divide the sample standard deviation by the sample mean. This ratio
is the CV:

    CV = S/X̄

Step 4. Determine if the result of Step 3 exceeds 1.00. If so, this is
evidence that the normal distribution does not fit the data adequately.
EXAMPLE

Table 4-1 is an example data set of chlordane concentrations in 24 water
samples from a fictitious site. The data are presented in order from least
to greatest.
TABLE 4-1. EXAMPLE DATA FOR COEFFICIENT-OF-VARIATION TEST

Chlordane concentration (ppm)

Dissolved phase          Immiscible phase
0.04                     2.58
0.18                     2.69
0.18                     2.80
0.25                     3.33
0.29                     4.50
0.38                     6.60
0.50
0.50
0.60
0.93
0.97
1.10
1.16
1.29
1.37
1.38
1.45
1.46
Applying the procedure steps to the data of Table 4-1, we have:

Step 1. X̄ = 1.52

Step 2. S = 1.56

Step 3. CV = 1.56/1.52 = 1.03

Step 4. Because the result of Step 3 was 1.03, which exceeds 1.00, we
conclude that there is evidence that the data do not adequately follow the
normal distribution. As will be discussed in other sections, one would then
either transform the data, use a nonparametric procedure, or seek
professional guidance.

NOTE. The owner or operator may choose to use parametric tests, since 1.03
is so close to the limit, but should use a transformation or a nonparametric
test if he or she believes that the parametric test results would be
incorrect due to the departure from normality.
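The coefficient-of-variation check is a few lines in most statistical
software. The following minimal Python sketch, assuming only the standard
library, reproduces the example above; names are illustrative.

    import statistics

    # Chlordane concentrations (ppm) from Table 4-1
    data = [0.04, 0.18, 0.18, 0.25, 0.29, 0.38, 0.50, 0.50, 0.60, 0.93,
            0.97, 1.10, 1.16, 1.29, 1.37, 1.38, 1.45, 1.46, 2.58, 2.69,
            2.80, 3.33, 4.50, 6.60]

    mean = statistics.mean(data)   # Step 1: sample mean
    sd = statistics.stdev(data)    # Step 2: sample standard deviation (n - 1)
    cv = sd / mean                 # Step 3: coefficient of variation
    print(round(mean, 2), round(sd, 2), round(cv, 2))  # 1.52 1.56 1.03
    if cv > 1.00:                  # Step 4: compare against 1.00
        print("evidence that the data are not adequately normal")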
4.2.3 Plotting on Probability Paper
PURPOSE
Probability paper is a visual aid and diagnostic tool for determining
whether a set of data follows a normal distribution. Also, approximate
estimates of the mean and standard deviation of the distribution can be read
from the plot.
PROCEDURE

Let X be the variable, and X₁, X₂, ..., Xᵢ, ..., Xₙ the set of n
observations. The values of X can be raw data, residuals, or transformed
data.

Step 1. Rearrange the observations in ascending order:

    X(1), X(2), ..., X(n)

Step 2. Compute the cumulative frequency for each distinct value X(i) as
(i/(n+1)) x 100%. The divisor of (n+1) is a plotting convention to avoid
cumulative frequencies of 100%, which would be at infinity on the
probability paper.

If a value of X occurs more than once, then the corresponding value of i
increases appropriately. For example, if X(2) = X(3), then the cumulative
frequency for X(1) is 100x1/(n+1), but the cumulative frequency for X(2) or
X(3) is 100x3/(n+1).

Step 3. Plot the distinct pairs [X(i), (i/(n+1)) x 100] on probability paper
(this paper is commercially available), using an appropriate scale for X on
the horizontal axis. The vertical axis for the cumulative frequencies is
already scaled from 0.01 to 99.99%.
If the points fall roughly on a straight line (the line can be drawn with a
ruler), then one can conclude that the underlying distribution is
approximately normal. Also, an estimate of the mean and standard deviation
can be made from the plot. The horizontal line drawn through 50% cuts the
plotted line at the mean of the X values. The horizontal line going through
84% cuts the line at a value corresponding to the mean plus one standard
deviation. By subtraction, one obtains the standard deviation.
REFERENCE

Dixon, W. J., and F. J. Massey, Jr. 1983. Introduction to Statistical
Analysis. Fourth Edition, McGraw-Hill.
EXAMPLE

Table 4-2 lists the 22 distinct chlordane concentration values (X) along
with their frequencies. These are the same values as those listed in
Table 4-1. There is a total of n = 24 observations.

Step 1. Sort the values of X in ascending order (column 1).

Step 2. Compute 100 x (i/25), column 4, for each distinct value of X, based
on the values of i (column 3).

Step 3. Plot the pairs [X(i), 100 x (i/25)] on probability paper
(Figure 4-2).

INTERPRETATION

The points in Figure 4-2 do not fall on a straight line; therefore, the
hypothesis of an underlying normal distribution is rejected. However, the
shape of the curve indicates a lognormal distribution. This is checked in
the next step.

Also, information about the solubility of chlordane in this example is
helpful. Chlordane has a solubility (in water) that ranges between 0.0156
and 1.85 mg/L. Because the last six measurements exceed this solubility
range, contamination is suspected.

Next, take the natural logarithm of the X-values, ln(X) (column 5 in
Table 4-2). Repeat Step 3 above using the pairs [ln(X), 100 x (i/25)]. The
resulting plot is shown in Figure 4-3. The points fall approximately on a
straight line (hand-drawn), and the hypothesis of lognormality of X, i.e.,
that ln(X) is normally distributed, can be accepted. The mean can be
estimated at slightly below 0 and the standard deviation at about 1.2 on the
log scale.
TABLE 4-2. EXAMPLE DATA COMPUTATIONS FOR PROBABILITY PLOTTING

Concentration    Absolute
X                frequency      i    100x(i/(n+1))    ln(X)

Dissolved phase
0.04                 1          1          4          -3.22
0.18                 2          3         12          -1.71
0.25                 1          4         16          -1.39
0.29                 1          5         20          -1.24
0.38                 1          6         24          -0.97
0.50                 2          8         32          -0.69
0.60                 1          9         36          -0.51
0.93                 1         10         40          -0.07
0.97                 1         11         44          -0.03
1.10                 1         12         48           0.10
1.16                 1         13         52           0.15
1.29                 1         14         56           0.25
1.37                 1         15         60           0.31
1.38                 1         16         64           0.32
1.45                 1         17         68           0.37
1.46                 1         18         72           0.38

Immiscible phase
2.58                 1         19         76           0.95
2.69                 1         20         80           0.99
2.80                 1         21         84           1.03
3.33                 1         22         88           1.20
4.50                 1         23         92           1.50
6.60                 1         24         96           1.89
[Figure 4-2. Probability plot of raw chlordane concentrations; the points
do not fall on a straight line.]
[Figure 4-3. Probability plot of log-transformed chlordane concentrations,
plotting ln(X) against the cumulative frequency 100x(i/(n+1)); the points
fall approximately on a straight line.]
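Probability paper can also be emulated numerically: plotting the ordered
values (or their logs) against the standard normal quantiles of the
cumulative frequencies i/(n+1) gives the same straight-line diagnostic. The
Python sketch below, which assumes Python 3.8 or later for NormalDist,
computes the plotting coordinates for the chlordane data; it is a
computational stand-in for the hand plots, not a prescribed procedure.

    import math
    from statistics import NormalDist

    # Chlordane concentrations (ppm) from Table 4-2, in ascending order
    data = [0.04, 0.18, 0.18, 0.25, 0.29, 0.38, 0.50, 0.50, 0.60, 0.93,
            0.97, 1.10, 1.16, 1.29, 1.37, 1.38, 1.45, 1.46, 2.58, 2.69,
            2.80, 3.33, 4.50, 6.60]
    n = len(data)

    for i, x in enumerate(data, start=1):
        freq = i / (n + 1)              # Step 2: cumulative frequency
        z = NormalDist().inv_cdf(freq)  # normal quantile (probability-paper axis)
        print(f"x={x:5.2f}  ln(x)={math.log(x):6.2f}  {100 * freq:5.1f}%  z={z:5.2f}")

    # A roughly linear ln(x)-versus-z relationship supports lognormality,
    # mirroring the hand-drawn line in Figure 4-3.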
4.2.4 The Chi-Squared Test

The chi-squared test can be used to test whether a set of data properly fits
a specified distribution within a specified probability. Most introductory
courses in statistics explain the chi-squared test, and its familiarity
among owners and operators as well as Regional personnel may make it a
frequently used method of analysis. In this application the assumed
distribution is the normal distribution, but other distributions could also
be used. The test consists of defining cells or ranges of values and
determining the expected number of observations that would fall in each cell
according to the hypothesized distribution. The actual number of data points
in each cell is compared with that predicted by the distribution to judge
the adequacy of the fit.
PURPOSE

The chi-squared test is used to test the adequacy of the assumption of
normality of the data.

PROCEDURE

Step 1. Determine the appropriate number of cells, K. This number usually
ranges from 5 to 10. Divide the number of observations, N, by 4. Dividing
the total number of observations by 4 will guarantee the minimum of four
observations necessary for each of the K = N/4 cells. Use the largest whole
number of this result, using 10 if the result exceeds 10.

Step 2. Standardize the data by subtracting the sample mean and dividing by
the sample standard deviation:

    Zᵢ = (Xᵢ - X̄)/S

Step 3. Determine the number of observations that fall in each of the cells
defined according to Table 4-3. The expected number of observations for each
cell is N/K, where N is the total number of observations and K is the number
of cells. Let Nᵢ denote the observed number in cell i (for i taking values
from 1 to K) and let Eᵢ denote the expected number of observations in
cell i. Note that in this case the cells are chosen to make the Eᵢ's equal.
TABLE 4-3. CELL BOUNDARIES FOR THE CHI-SQUARED TEST
(cell boundaries for equal expected cell sizes with the normal distribution)

Number of cells K    Cell boundaries
 5                   -0.84, -0.25, 0.25, 0.84
 6                   -0.97, -0.43, 0.00, 0.43, 0.97
 7                   -1.07, -0.57, -0.18, 0.18, 0.57, 1.07
 8                   -1.15, -0.67, -0.32, 0.00, 0.32, 0.67, 1.15
 9                   -1.22, -0.76, -0.43, -0.14, 0.14, 0.43, 0.76, 1.22
10                   -1.28, -0.84, -0.52, -0.25, 0.00, 0.25, 0.52, 0.84, 1.28
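The boundaries in Table 4-3 are simply the standard normal quantiles that
split the distribution into K cells of equal probability 1/K, so they can be
regenerated rather than looked up. A short, illustrative Python check
(assuming Python 3.8 or later):

    from statistics import NormalDist

    def cell_boundaries(k):
        # Cut points at cumulative probabilities 1/K, 2/K, ..., (K-1)/K
        nd = NormalDist()
        return [round(nd.inv_cdf(i / k), 2) for i in range(1, k)]

    for k in range(5, 11):
        print(k, cell_boundaries(k))
    # k = 5 reproduces -0.84, -0.25, 0.25, 0.84 from Table 4-3, and so on.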
Step 4. Calculate the chi-squared statistic by the formula below:

    χ² = Σᵢ₌₁ᴷ (Nᵢ - Eᵢ)²/Eᵢ

Step 5. Compare the calculated result to the table of the chi-squared
distribution with K-3 degrees of freedom (Table 1, Appendix B). Reject the
hypothesis of normality if the calculated value exceeds the tabulated value.

REFERENCE

Remington, R. D., and M. A. Schork. 1970. Statistics with Applications to
the Biological and Health Sciences. Prentice-Hall, pp. 235-236.
EXAMPLE

The data in Table 4-4 are N = 21 residuals from an analysis of variance on
dioxin concentrations. The analysis of variance assumes that the errors
(estimated by the residuals) are normally distributed. The chi-squared test
is used to check this assumption.

Step 1. Divide the number of observations, 21, by 4 to get 5.25. Keep only
the integer part, 5, so the test will use K = 5 cells.

Step 2. The sample mean and standard deviation are calculated and found to
be: X̄ = 0.00, S = 0.24. The data are standardized by subtracting the mean
(0 in this case) and dividing by S. The results are also shown in Table 4-4.

Step 3. Determine the number of (standardized) observations that fall into
the five cells determined from Table 4-3. These divisions are: (1) less than
or equal to -0.84, (2) greater than -0.84 and less than or equal to -0.25,
(3) greater than -0.25 and less than or equal to +0.25, (4) greater than
0.25 and less than or equal to 0.84, and (5) greater than 0.84. We find
4 observations in cell 1, 6 in cell 2, 2 in cell 3, 4 in cell 4, and 5 in
cell 5.

Step 4. Calculate the chi-squared statistic. The expected number in each
cell is N/K, or 21/5 = 4.2:

    χ² = [(4-4.2)² + (6-4.2)² + (2-4.2)² + (4-4.2)² + (5-4.2)²]/4.2
       = 8.8/4.2 = 2.10

Step 5. The critical value at the 5% level for a chi-squared test with
2 (K-3 = 5-3 = 2) degrees of freedom is 5.99 (Table 1, Appendix B). Because
the calculated value of 2.10 is less than 5.99, there is no evidence that
these data are not normal.
TABLE 4-4. EXAMPLE DATA FOR CHI-SQUARED TEST

Observation    Residual    Standardized residual
 1             -0.45       -1.90
 2             -0.35       -1.48
 3             -0.35       -1.48
 4             -0.22       -0.93
 5             -0.16       -0.67
 6             -0.13       -0.55
 7             -0.11       -0.46
 8             -0.10       -0.42
 9             -0.10       -0.42
10             -0.06       -0.25
11             -0.05       -0.21
12              0.04        0.17
13              0.11        0.47
14              0.13        0.55
15              0.16        0.68
16              0.17        0.72
17              0.20        0.85
18              0.21        0.89
19              0.30        1.27
20              0.34        1.44
21              0.41        1.73
INTERPRETATION

The cell boundaries are determined from the normal distribution so that
equal numbers of observations should fall in each cell. If there are large
differences between the number of observations in each cell and that
predicted by the normal distribution, this is evidence that the data are not
normal. The chi-squared statistic is a nonnegative statistic that increases
as the difference between the predicted and observed number of observations
in each cell increases.

If the calculated value of the chi-squared statistic exceeds the tabulated
value, there is statistically significant evidence that the data do not
follow the normal distribution. In that case, one would need to do a
transformation, use a nonparametric procedure, or seek consultation before
interpreting the results of the test of the ground-water data. If the
calculated value of the chi-squared statistic does not exceed the tabulated
critical value, there is no significant lack of fit to the normal
distribution and one can proceed assuming that the assumption of normality
is adequately met.
REMARK

The chi-squared statistic can be used to test whether the residuals from an
analysis of variance or other procedure are normal. In this case the degrees
of freedom are found as (number of cells minus one minus the number of
parameters that have been estimated). This may require more than the
suggested 10 cells. The chi-squared test does require a fairly large sample
size in that there should generally be at least four observations per cell.
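The whole test is mechanical and readily programmed (the text above mentions
LOTUS 1-2-3; the sketch below uses Python instead, with illustrative names).
Note that because the hand calculation in the example rounds X̄, S, and the
standardized residuals, an exact computation can place a borderline
observation in a neighboring cell and give a somewhat different statistic;
for the Table 4-4 data the conclusion at the 5% level is unchanged.

    from statistics import NormalDist, mean, stdev

    def chi_squared_normality(data):
        n = len(data)
        k = min(n // 4, 10)                        # Step 1: number of cells
        m, s = mean(data), stdev(data)
        z = [(x - m) / s for x in data]            # Step 2: standardize
        bounds = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
        counts = [0] * k                           # Step 3: observed cell counts
        for v in z:
            counts[sum(1 for b in bounds if v > b)] += 1
        e = n / k                                  # expected count per cell
        return sum((c - e) ** 2 / e for c in counts)   # Step 4

    # Residuals from Table 4-4
    residuals = [-0.45, -0.35, -0.35, -0.22, -0.16, -0.13, -0.11, -0.10,
                 -0.10, -0.06, -0.05, 0.04, 0.11, 0.13, 0.16, 0.17, 0.20,
                 0.21, 0.30, 0.34, 0.41]
    print(round(chi_squared_normality(residuals), 2))  # compare to 5.99 (2 df)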
4.3 CHECKING EQUALITY OF VARIANCE: BARTLETT'S TEST

The analysis of variance procedures presented in Section 5 are often more
sensitive to unequal variances than to moderate departures from normality.
The procedures described in this section allow for testing to determine
whether group variances are equal or differ significantly. Often in practice
unequal variances and nonnormality occur together. Sometimes a
transformation to stabilize or equalize the variances also produces a
distribution that is more nearly normal. This often occurs if the initial
distribution was positively skewed, with the variance increasing with the
mean. Only Bartlett's test for checking equality, or homogeneity, of
variances is presented here. It encompasses checking equality of more than
two variances with unequal sample sizes. Other tests are available for
special cases. The F-test applies to the special situation when there are
only two groups to be compared. The user is referred to classical textbooks
for this test (e.g., Snedecor and Cochran, 1980). In the case of equal
sample sizes but more than two variances to be compared, the user might want
to use Hartley's, or maximum F-ratio, test (see Nelson, 1987). This test
provides a quick procedure to test for variance homogeneity.
PURPOSE

Bartlett's test is a test of homogeneity of variances. In other words, it is
a means of testing whether a number of population variances of normal
distributions are equal. Homogeneity of variances is an assumption made in
analysis of variance when comparing concentrations of constituents between
background and compliance wells, or among compliance wells. It should be
noted that Bartlett's test is itself sensitive to nonnormality in the data.
With long-tailed distributions the test too often rejects equality
(homogeneity) of the variances.
PROCEDURE

Assume that data from k wells are available and that there are nᵢ data
points for well i.

Step 1. Compute the k sample variances S₁², ..., Sₖ². The sample variance,
Sᵢ², is the square of the sample standard deviation and is given by the
general equation

    Sᵢ² = Σⱼ₌₁ⁿⁱ (Xᵢⱼ - X̄ᵢ)² / (nᵢ - 1)

Step 2. Compute the degrees of freedom for each well, fᵢ = nᵢ - 1, their
total f = Σᵢ fᵢ, and the pooled variance

    Sp² = (1/f) Σᵢ fᵢSᵢ²

then compute the test statistic

    X² = f ln(Sp²) - Σᵢ fᵢ ln(Sᵢ²)

Step 3. Compare the calculated X² to the tabulated chi-squared value with
(k - 1) degrees of freedom at the chosen significance level (Table 1,
Appendix B).
INTERPRETATION

If the calculated value X² is larger than the tabulated value, then conclude
that the variances are not equal at that significance level.

REFERENCE

Johnson, N. L., and F. C. Leone. 1977. Statistics and Experimental Design in
Engineering and the Physical Sciences. Vol. I, John Wiley and Sons, New
York.
EXAMPLE

Manganese concentrations are given for k = 6 wells in Table 4-5 below.

TABLE 4-5. EXAMPLE DATA FOR BARTLETT'S TEST

Sampling date   Well 1   Well 2   Well 3     Well 4   Well 5   Well 6
January 1           50       46      272         34       48       68
February 1          73       77      171      3,940       54      991
March 1            244               32                            54
April 1            202               53

nᵢ                   4        2        4          2        2        3
fᵢ = nᵢ - 1          3        1        3          1        1        2
Sᵢ                  95       22      112      2,762        3      537
Sᵢ²              9,076      481   12,454  7,628,418        8  288,349
fᵢSᵢ²           27,229      481   37,362  7,628,418        8  576,698
ln(Sᵢ²)              9        6        9         16        2       13
fᵢ ln(Sᵢ²)          27        6       28         16        2       25
Step 1. Compute the six sample variances and take their natural logarithms,
ln(S₁²), ..., ln(S₆²), obtaining 9, 6, ..., 13, respectively.

Step 2. Compute

    Σᵢ fᵢ ln(Sᵢ²) = 105
This is the sum of the last line in Table 4-5.

Compute f = Σᵢ₌₁⁶ fᵢ = 3 + 1 + ... + 2 = 11

Compute Sp²:

    Sp² = (1/f) Σᵢ fᵢSᵢ² = (1/11)(27,229 + ... + 576,698)
        = (1/11)(8,270,196) = 751,836

Take the natural logarithm of Sp²: ln(Sp²) = 13.5

Compute X² = f ln(Sp²) - Σᵢ fᵢ ln(Sᵢ²) = 11(13.5) - 105 ≈ 44

Step 3. The critical X² value with 6 - 1 = 5 degrees of freedom at the 5%
significance level is 11.1 (Table 1 in Appendix B). Since 44 is larger than
11.1, we conclude that the six variances S₁², ..., S₆² are not homogeneous
at the 5% significance level.
INTERPRETATION

The sample variances of the data from the six wells were compared by means
of Bartlett's test. The test was significant at the 5% level, suggesting
that the variances are significantly unequal (heterogeneous). A
log-transform of the data can be done and the same test performed on the
transformed data. Generally, if the data follow a skewed distribution, this
approach resolves the problem of unequal variances and the user can proceed,
for example, with an ANOVA.

On the other hand, unequal variances among well data could be a direct
indication of well contamination, since the individual data could come from
different distributions (i.e., different means and variances). Then the user
may wish to test which variance differs from which one. The reader is
referred here to the literature for a gap test of variance (Tukey, 1949;
David, 1956; or Nelson, 1987).
NOTE

In the case of k = 2 variances, the test of equality of variances is the
F-test (Snedecor and Cochran, 1980).

Bartlett's test simplifies in the case of equal sample sizes, nᵢ = n,
i = 1, ..., k. One test used in this case is Cochran's test. Cochran's test
focuses on the largest variance and compares it to the sum of all the
variances. Hartley introduced a quick test of homogeneity of variances that
uses the ratio of the largest to the smallest variance. Technical aids for
the procedures under the assumption of equal sample sizes are given by
L. S. Nelson in the Journal of Quality Technology, Vol. 19, 1987, pp. 107
and 165.
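The uncorrected Bartlett statistic used above is straightforward to compute
in software. The Python sketch below applies the formula
X² = f ln(Sp²) - Σ fᵢ ln(Sᵢ²) to the Table 4-5 data; names are illustrative.
(Library routines such as scipy.stats.bartlett apply an additional
small-sample correction factor, so their statistic differs slightly.)
Because the hand calculation above rounds the intermediate variances and
logarithms, the exact result here is about 43 rather than 44; the conclusion
is the same.

    import math
    from statistics import variance

    def bartlett_statistic(groups):
        f_i = [len(g) - 1 for g in groups]     # degrees of freedom per well
        s2_i = [variance(g) for g in groups]   # sample variances
        f = sum(f_i)
        sp2 = sum(fi * s2 for fi, s2 in zip(f_i, s2_i)) / f   # pooled variance
        return f * math.log(sp2) - sum(fi * math.log(s2)
                                       for fi, s2 in zip(f_i, s2_i))

    # Manganese data from Table 4-5; wells have unequal numbers of samples
    wells = [[50, 73, 244, 202], [46, 77], [272, 171, 32, 53],
             [34, 3940], [48, 54], [68, 991, 54]]
    print(round(bartlett_statistic(wells), 1))  # compare to 11.1 (5 df, 5%)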
SECTION 5
BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS
There are many situations in ground-water monitoring that call for the
comparison of data from different wells. The assumption is that a set of
uncontaminated wells can be defined. Generally these are background wells
and have been sited to be hydraulically upgradient from the regulated unit.
A second set of wells is sited hydraulically downgradient from the regulated
unit; these are known as compliance wells. The data from these compliance
wells are compared to the data from the background wells to determine
whether there is any evidence of contamination in the compliance wells that
would presumably result from a release from the regulated unit.

If the owner or operator of a hazardous waste facility does not have reason
to suspect that the test assumptions of equal variance or normality will be
violated, then he or she may simply choose the parametric analysis of
variance as a default method of statistical analysis. In the event that this
method indicates a statistically significant difference between the groups
being tested, then the test assumptions should be evaluated.
This situation, where the relevant comparison is between data from
background wells and data from compliance wells, is the topic of this
section. Comparisons between background well data and compliance well data
may be called for in all phases of monitoring. This type of comparison is
the general case for detection monitoring. It is also the usual approach for
compliance monitoring if the compliance limits are determined by the
background well constituent concentration levels. Compounds that are present
in background wells (e.g., naturally occurring metals) are most
appropriately evaluated using this comparison method.

Section 5.1 provides a flowchart and overview for the selection of methods
for comparison of background well and compliance well data. Section 5.2
contains analysis of variance methods. These provide methods for directly
comparing background well data to compliance well data. Section 5.3
describes a tolerance interval approach, where the background well data are
used to define the tolerance limits for comparison with the compliance well
data. Section 5.4 contains an approach based on prediction intervals, again
using the background well data to determine the prediction interval for
comparison with the compliance well data. Methods for comparing data to a
fixed compliance limit (an MCL or ACL) are described in Section 6.
5.1 SUMMARY FLOWCHART FOR BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS

Figure 5-1 is a flowchart to aid in selecting the appropriate statistical
procedure for background well to compliance well comparisons. The first step
is to determine whether most of the observations are quantified (that is,
above the detection limits) or not. Generally, if more than 50% of the
observations are below the detection limit (as might be the case with
detection or compliance monitoring for volatile organics), then the
appropriate comparison is a test of proportions. The test of proportions
compares the proportion of detected values in the background wells to that
in the compliance wells. See Section 8.1 for a discussion of dealing with
data below the detection limit.

If the proportion of detected values is 50% or more, then an analysis of
variance procedure is the first choice. Tolerance limits or prediction
intervals are acceptable alternate choices that the user may select. The
analysis of variance procedures give a more thorough picture of the
situation at the facility. However, the tolerance limit or prediction
interval approach is acceptable and requires less computation in many
situations.
Figure 5-2 is a flowchart to guide the user if a tolerance limits approach
is selected. The first step in using Figure 5-2 is to determine whether the
facility is in detection monitoring. If so, much of the data may be below
the detection limit. See Section 8.1 for a discussion of this case, which
may call for consulting a statistician. If most of the data are quantified,
then follow the flowchart to determine if normal tolerance limits can be
used. If the data are not normal (as determined by one of the procedures in
Section 4.2), then the logarithm transformation may be done and the
transformed data checked for normality. If the log data are normal, the
lognormal tolerance limit should be used. If neither the original data nor
the log-transformed data are normal, seek consultation with a professional
statistician.

If a prediction interval is selected as the method of choice, see
Section 5.4 for guidance in performing the procedure.
If analysis of variance is to be used, then continue with Figure 5-1 to
select the specific method that is appropriate. A one-way analysis of
variance is recommended. If the data show evidence of seasonality (observed,
for example, in a plot of the data over time), a trend analysis or perhaps a
two-way analysis of variance may be the appropriate choice. These instances
may require consultation with a professional statistician.

If the one-way analysis of variance is appropriate, the computations are
performed, then the residuals are checked to see if they meet the
assumptions of normality and equal variance. If so, the analysis concludes.
If not, a logarithm transformation may be tried and the residuals from the
analysis of variance on the log data are checked for assumptions. If these
still do not adequately satisfy the assumptions, then a one-way
nonparametric analysis of variance may be done, or professional consultation
may be sought.
[Figure 5-1. Background well to compliance well comparisons flowchart,
leading from the proportion of quantified observations to a test of
proportions, an analysis of variance, or the tolerance limit and prediction
interval alternatives.]
[Figure 5-2. Tolerance limits: alternate approach to background well to
compliance well comparisons. If the data are normal, use normal tolerance
limits; if not, take the log of the data, and if the log data are normal,
use lognormal tolerance limits; otherwise, consult with a professional
statistician.]
5.2 ANALYSIS OF VARIANCE

If contamination of the ground water occurs from the waste disposal facility
and if the monitoring wells are hydraulically upgradient and hydraulically
downgradient from the site, then contamination is unlikely to change the
levels of a constituent in all wells by the same amount. Thus, contamination
from a disposal site can be seen as differences in average concentration
among wells, and such differences can be detected by analysis of variance.
Analysis of variance (ANOVA) is the name given to a wide variety of
statistical procedures. All of these procedures compare the means of
different groups of observations to determine whether there are any
significant differences among the groups, and if so, contrast procedures may
be used to determine where the differences lie. Such procedures are also
known in the statistical literature as general linear model procedures.

Because of its flexibility and power, analysis of variance is the preferred
method of statistical analysis when the ground-water monitoring is based on
a comparison of background and compliance well data. Two types of analysis
of variance are presented: parametric and nonparametric one-way analyses of
variance. Both methods are appropriate when the only factor of concern is
the different monitoring wells at a given sampling period.

The hypothesis tests with parametric analysis of variance usually assume
that the errors (residuals) are normally distributed with constant variance.
These assumptions can be checked by saving the residuals (the differences
between the observations and the values predicted by the analysis of
variance model) and using the tests of assumptions presented in Section 4.
Since the data will generally be concentrations, and since concentration
data are often found to follow the lognormal distribution, the log
transformation is suggested if substantial violations of the assumptions are
found in the analysis of the original concentration data. If the residuals
from the transformed data do not meet the parametric ANOVA requirements,
then nonparametric approaches to analysis of variance are available using
the ranks of the observations. A one-way analysis of variance using the
ranks is presented in Section 5.2.2.
When several sampling periods have been used and it is important to consider
the sampling periods as a second factor, then two-way analysis of variance,
parametric or nonparametric, is appropriate. This would be one way to test
for and adjust the data for seasonality. Also, trend analysis (e.g., time
series) may be used to identify seasonality in the data set. If necessary,
data that exhibit seasonal trends can be adjusted. Usually, however,
seasonal variation will affect all wells at a facility by nearly the same
amount, and in most circumstances, corrections will not be necessary.
Further, the effects of seasonality will be substantially reduced by
simultaneously comparing aggregate compliance well data to background well
data. Situations that require an analysis procedure other than a one-way
ANOVA should be referred to a professional statistician.
5.2.1 One-Way Parametric Analysis of Variance

In the context of ground-water monitoring, two situations exist for which a
one-way analysis of variance is most applicable:

* Data for a water quality parameter are available from several wells but
for only one time period (e.g., monitoring has just begun).

* Data for a water quality parameter are available from several wells for
several time periods. However, the data do not exhibit seasonality.

In order to apply a parametric one-way analysis of variance, a minimum
number of observations is needed to give meaningful results. At least p ≥ 2
groups are to be compared (i.e., two or more wells). It is recommended that
each group (here, wells) have at least three observations and that the total
sample size, N, be large enough so that N - p ≥ 5. A variety of combinations
of groups and numbers of observations in groups will fulfill this minimum.
One sampling interval with four independent samples per well and at least
three wells would fulfill the minimum sample size requirements. The wells
should be spaced so as to maximize the probability of intercepting a plume
of contamination. The samples should be taken far enough apart in time to
guard against autocorrelation.
PURPOSE

One-way analysis of variance is a statistical procedure to determine whether
differences in mean concentrations among wells, or groups of wells, are
statistically significant. For example, is there significant contamination
of one or more compliance wells as compared to background wells?
PROCEDURE

Suppose the regulated unit has p wells and that nᵢ data points
(concentrations of a constituent) are available for the ith well. These data
can be from either a single sampling period or from more than one. In the
latter case, the user could check for seasonality before proceeding by
plotting the data over time. Usually the computation will be done on a
computer using a commercially available program. However, the procedure is
presented so that computations can be done using a desk calculator, if
necessary.

Step 1. Arrange the N = Σᵢ₌₁ᵖ nᵢ data points in a data table as follows
(N is the total sample size at this specific regulated unit):
                                         Well total    Well mean
Well No.   Observations                  (from Step 2) (from Step 2)
1          X₁₁  X₁₂  ...  X₁ₙ₁           X₁.           X̄₁.
2          X₂₁  X₂₂  ...  X₂ₙ₂           X₂.           X̄₂.
...
p          Xₚ₁  Xₚ₂  ...  Xₚₙₚ           Xₚ.           X̄ₚ.

                                         X..           X̄..
Step 2. Compute well totals and well means as follows:

    Xᵢ. = Σⱼ₌₁ⁿⁱ Xᵢⱼ , the total of all nᵢ observations at well i

    X̄ᵢ. = Xᵢ./nᵢ , the average of all nᵢ observations at well i

    X.. = Σᵢ₌₁ᵖ Σⱼ₌₁ⁿⁱ Xᵢⱼ , the grand total of all N observations

    X̄.. = X../N , the grand mean of all observations

These totals and means are shown in the last two columns of the table above.

Step 3. Compute the sum of squares of differences between well means and the
grand mean:

    SSWells = Σᵢ₌₁ᵖ nᵢ(X̄ᵢ. - X̄..)² = Σᵢ₌₁ᵖ (Xᵢ.²/nᵢ) - (X..²/N)

(The formula on the far right is usually most convenient for calculation.)
This sum of squares has (p-1) degrees of freedom associated with it and is a
measure of the variability between wells.
Step 4. Compute the corrected total sum of squares:

    SSTotal = Σᵢ₌₁ᵖ Σⱼ₌₁ⁿⁱ (Xᵢⱼ - X̄..)² = Σᵢ Σⱼ Xᵢⱼ² - (X..²/N)

(The formula on the far right is usually most convenient for calculation.)
This sum of squares has (N-1) degrees of freedom associated with it and is a
measure of the variability in the whole data set.

Step 5. Compute the sum of squares of differences of observations within
wells from the well means. This is the sum of squares due to error and is
obtained by subtraction:

    SSError = SSTotal - SSWells

It has associated with it (N-p) degrees of freedom and is a measure of the
variability within wells.

Step 6. Set up the ANOVA table as shown below in Table 5-1. The sums of
squares and their degrees of freedom were obtained from Steps 3 through 5.
The mean square quantities are obtained by dividing each sum of squares by
its corresponding degrees of freedom.
TABLE 5-1. ONE-WAY PARAMETRIC ANOVA TABLE

Source of        Sums of     Degrees of
variation        squares     freedom     Mean squares                F
Between wells    SSWells     p-1         MSWells = SSWells/(p-1)     F = MSWells/MSError
Error (within
wells)           SSError     N-p         MSError = SSError/(N-p)
Total            SSTotal     N-1
Step 7. To test the hypothesis of equal means for all p wells, compute
F = MSWells/MSError (last column in Table 5-1). Compare this statistic to
the tabulated F statistic with (p-1) and (N-p) degrees of freedom (Table 2,
Appendix B) at the 5% significance level. If the calculated F value exceeds
the tabulated value, reject the hypothesis of equal well means. Otherwise,
conclude that there is no significant difference between the concentrations
at the p wells and thus no evidence of well contamination.

In the case of a significant F (calculated F greater than tabulated F in
Step 7), the user will conduct the next few steps to determine which
compliance well(s) is (are) contaminated. This will be done by comparing
each compliance well with the background well(s). Concentration differences
between a pair of background wells and compliance wells, or between a
compliance well and a set of background wells, are called contrasts in the
ANOVA and multiple comparisons framework.

Step 8. Determine if the significant F is due to differences between
background and compliance wells (computation of Bonferroni t-statistics).

Assume that of the p wells, u are background wells and m are compliance
wells (thus u + m = p). Then m differences, each compliance well compared
with the average of the background wells, need to be computed and tested for
statistical significance. If there are more than five downgradient wells,
the individual comparisons are done at the comparisonwise significance level
of 1%, which may make the experimentwise significance level greater than 5%.

    Obtain the total sample size of all u background wells:

        nup = n₁ + ... + nᵤ

    Compute the average concentration from the u background wells:

        X̄up = (X₁. + ... + Xᵤ.)/nup

    Compute the m differences between the average concentration from each
    compliance well and the average background concentration:

        X̄ᵢ. - X̄up , for each compliance well i

    Compute the standard error of each difference as:

        SEᵢ = [MSError (1/nup + 1/nᵢ)]^(1/2)

    where MSError is determined from the ANOVA table (Table 5-1) and nᵢ is
    the number of observations at well i.

    Obtain the t-statistic t = t((N-p), m, 0.05) from Bonferroni's t-table
    (Table 3, Appendix B) with α = 0.05 and (N-p) degrees of freedom.
    Compute the quantities Dᵢ = SEᵢ x t for each compliance well i. If
    m > 5, use the table entry for t((N-p), m, 0.01); that is, use the entry
    at the 1% level.

INTERPRETATION

If the difference X̄ᵢ. - X̄up exceeds the value Dᵢ, conclude that the ith
compliance well has significantly higher concentrations than the average of
the background wells. Otherwise conclude that the well is not contaminated.
This exercise needs to be performed for each of the m compliance wells
individually. The test is designed so that the overall experimentwise error
is 5% if there are no more than five compliance wells.
CAUTIONARY NOTE

Should the regulated unit consist of more than five compliance wells, then
the Bonferroni t-test should be modified by doing the individual comparisons
at the 1% level so that the Part 264 Subpart F regulatory requirement
pursuant to §264.97(i)(2) will be met. Alternately, a different analysis of
contrasts, such as Scheffe's, may be used. The more advanced user is
referred to the second reference below for a discussion of multiple
comparisons.
REFERENCES

Johnson, Norman L., and F. C. Leone. 1977. Statistics and Experimental
Design in Engineering and the Physical Sciences. Vol. II, Second Edition,
John Wiley and Sons, New York.

Miller, Rupert G., Jr. 1981. Simultaneous Statistical Inference. Second
Edition, Springer-Verlag, New York.
EXAMPLE

Four lead concentration values at each of six wells are given in Table 5-2
below. The wells consist of u = 2 background and m = 4 compliance wells.
(The values in Table 5-2 are actually the natural logarithms of the original
lead concentrations.)

Step 1. Arrange the 4 x 6 = 24 observations in a data table as follows:
TABLE 5-2. EXAMPLE DATA FOR ONE-WAY PARAMETRIC ANALYSIS OF VARIANCE

Natural logs of Pb concentrations (μg/L)

                                                    Well    Well    Well
Well No.             Jan 1   Feb 1   Mar 1   Apr 1  total   mean    std. dev.
1  Background wells   4.06    3.99    3.40    3.83  15.28   3.82    0.295
2                     3.83    4.34    3.47    4.22  15.86   3.96    0.398
3  Compliance wells   5.61    5.14    3.47    3.97  18.18   4.55    0.996 (max)
4                     3.53    4.54    4.26    4.42  16.75   4.19    0.456
5                     3.91    4.29    5.50    5.31  19.01   4.75    0.771
6                     5.42    5.21    5.29    5.08  21.01   5.25    0.142 (min)

                                       X.. = 106.08   X̄.. = 4.42
Step 2. The calculations are shown on the right-hand side of the data table
above. Sample standard deviations have been computed also.

Step 3. Compute the between-well sum of squares:

    SSWells = (1/4)(15.28² + ... + 21.01²) - (1/24)(106.08²) = 5.76

with [6 (wells) - 1] = 5 degrees of freedom.

Step 4. Compute the corrected total sum of squares:

    SSTotal = 4.06² + 3.99² + ... + 5.08² - (1/24)(106.08²) = 11.94

with [24 (observations) - 1] = 23 degrees of freedom.

Step 5. Obtain the within-well or error sum of squares by subtraction:

    SSError = 11.94 - 5.76 = 6.18

with [24 (observations) - 6 (wells)] = 18 degrees of freedom.

Step 6. Set up the one-way ANOVA as in Table 5-3 below:
TABLE 5-3. EXAMPLE COMPUTATIONS IN ONE-WAY PARAMETRIC ANOVA TABLE

Source of        Sums of    Degrees of
variation        squares    freedom      Mean squares        F
Between wells     5.76       5           5.76/5 = 1.15       1.15/0.34 = 3.38
Error
(within wells)    6.18      18           6.18/18 = 0.34
Total            11.94      23

Step 7. The calculated F statistic is 3.38. The tabulated F value with 5 and
18 degrees of freedom at the α = 0.05 level is 2.77 (Table 2, Appendix B).
Since the calculated value exceeds the tabulated value, the hypothesis of
equal well means must be rejected, and post hoc comparisons are necessary.
Step 8. Computation of Bonferroni t-statistics.

Note that there are four compliance wells, so m = 4 comparisons will be
made.

    nup = 8, the total number of samples in the background wells

    X̄up = 3.89, the average concentration of the background wells

Compute the differences between the four compliance wells and the average of
the two background wells:

    X̄₃. - X̄up = 4.55 - 3.89 = 0.66
    X̄₄. - X̄up = 4.19 - 3.89 = 0.30
    X̄₅. - X̄up = 4.75 - 3.89 = 0.86
    X̄₆. - X̄up = 5.25 - 3.89 = 1.36

Compute the standard error of each difference. Since the number of
observations is the same for all compliance wells, the standard errors for
the four differences will be equal:

    SEᵢ = [0.34 (1/8 + 1/4)]^(1/2) = 0.357 for i = 3, ..., 6
From Table 3, Appendix B, obtain the critical t with (24 - 6) = 18 degrees
of freedom, m = 4, and α = 0.05. The approximate value is 2.43, obtained by
linear interpolation between 15 and 20 degrees of freedom.

Compute the quantities Dᵢ. Again, due to equal sample sizes, they will all
be equal:

    Dᵢ = SEᵢ x t = 0.357 x 2.43 = 0.868 for i = 3, ..., 6
INTERPRETATION

The F test was significant at the 5% level. The Bonferroni multiple
comparisons procedure was then used to determine for which wells there was
statistically significant evidence of contamination. Of the four differences
X̄ᵢ. - X̄up, only X̄₆. - X̄up = 1.36 exceeds the critical value of 0.868.
From this it is concluded that there is significant evidence of
contamination at Well 6. Well 5 is right on the boundary of significance. It
is likely that Well 6 has intercepted a plume of contamination, with Well 5
being on the edge of the plume.

All the compliance well concentrations were somewhat above the mean
concentration of the background levels. The well means should be used to
indicate the location of the plume. The findings should be reported to the
Regional Administrator.
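Most of this procedure is available in commercial statistical packages. The
following Python sketch, using only the standard library and illustrative
names, reproduces the computations of this example; the Bonferroni critical
value t = 2.43 is taken from the table lookup described above. As in the
hand calculation, only Well 6 exceeds its critical difference, with Well 5
falling just short; because the hand calculation rounds the sums of squares,
the exact F here is about 3.35 rather than 3.38.

    from statistics import mean

    # Natural logs of Pb concentrations from Table 5-2; wells 1-2 background
    wells = [[4.06, 3.99, 3.40, 3.83], [3.83, 4.34, 3.47, 4.22],
             [5.61, 5.14, 3.47, 3.97], [3.53, 4.54, 4.26, 4.42],
             [3.91, 4.29, 5.50, 5.31], [5.42, 5.21, 5.29, 5.08]]

    all_obs = [x for w in wells for x in w]
    N, p = len(all_obs), len(wells)
    grand_mean = mean(all_obs)

    ss_wells = sum(len(w) * (mean(w) - grand_mean) ** 2 for w in wells)
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
    ms_wells = ss_wells / (p - 1)                 # Step 6: mean squares
    ms_error = (ss_total - ss_wells) / (N - p)
    print(round(ms_wells / ms_error, 2))          # Step 7: F statistic

    background = wells[0] + wells[1]              # Step 8: Bonferroni comparisons
    bg_mean, n_up = mean(background), len(background)
    t_crit = 2.43                                 # Bonferroni t (18 df, m = 4, 5%)
    for i, w in enumerate(wells[2:], start=3):
        se = (ms_error * (1 / n_up + 1 / len(w))) ** 0.5
        diff = mean(w) - bg_mean
        print(i, round(diff, 2), diff > se * t_crit)  # True flags contamination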
5.2.2 One-Way Nonparametric Analysis of Variance

This procedure is appropriate for interwell comparisons when the data or the residuals from a parametric ANOVA have been found to be significantly different from normal and when a log transformation fails to adequately normalize the data. In one-way nonparametric ANOVA, the assumption under the null hypothesis is that the data from each well come from the same continuous distribution and hence have the same median concentration of a specific hazardous constituent. The alternatives of interest are that the data from some wells show increased levels of the hazardous constituent in question.

The procedure is called the Kruskal-Wallis test. For meaningful results, there should be at least three groups with a minimum sample size of three in each group. For large data sets, use of a computer program is recommended. In the case of large data sets, a good approximation to the procedure is to replace each observation by its rank (its numerical place when the data are ordered from least to greatest) and perform the (parametric) one-way analysis of variance (Section 5.2.1) on the ranks, as sketched below. Such an approach can be done with some commercial statistical packages such as SAS.
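As an illustration of the rank-transform approximation just described, the following Python sketch (hypothetical data; assumes the scipy library is available) replaces each observation by its rank and runs a parametric one-way ANOVA on the ranks:

    from scipy import stats

    # Hypothetical well data: background plus two compliance wells
    groups = [[1.7, 1.9, 1.5, 1.3], [11.0, 8.0, 9.5], [4.9, 3.7, 2.3]]

    # Rank all observations together (ties receive the average rank)
    flat = [x for g in groups for x in g]
    ranks = stats.rankdata(flat)

    # Split the ranks back into their groups, then run the parametric ANOVA
    ranked, i = [], 0
    for g in groups:
        ranked.append(ranks[i:i + len(g)])
        i += len(g)
    f_stat, p_value = stats.f_oneway(*ranked)
    print(f"F on ranks = {f_stat:.2f}, p = {p_value:.4f}")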
PURPOSE

The purpose of the procedure is to test the hypothesis that all wells (or groups of wells) around regulated units have the same median concentration of a hazardous constituent. If the wells are found to differ, post-hoc comparisons are again necessary to determine if contamination is present.

PROCEDURE

Note that the wells define the groups. All wells will have at least four observations. Denote the number of groups by K and the number of observations in each group by n_i, with N the total number of observations and group 1 the background.

Step 1. Rank all N observations from least to greatest. Assign rank 1 to the smallest observation and rank N to the largest.

Step 2. Compute the sum of the ranks, R_i, and the average rank, R̄_i = R_i/n_i, for each group.

Step 3. Compute the Kruskal-Wallis statistic:

H = [12/(N(N+1))] Σ (R_i²/n_i) - 3(N+1)

where the sum is over the K groups.

Step 4. Compare H with the critical value of the chi-squared distribution with (K-1) degrees of freedom at the 5% significance level (Table 1, Appendix B). If H exceeds the critical value, there is statistically significant evidence of differences among the wells, and comparisons of each compliance well with the background are warranted.

Step 5. Compute the approximate critical difference for comparing each compliance well with the background as

C_i = Z [N(N+1)/12]^(1/2) (1/n_1 + 1/n_i)^(1/2)

where Z is the upper 0.05/(K-1) percentile from the standard normal distribution (Table 4, Appendix B). If there are many comparisons (K > 6), use Z.01, the upper one-percentile from the standard normal distribution.
Step 6. Form the differences of the average ranks for each group to the background and compare these with the critical values found in Step 5 to determine which wells give evidence of contamination. That is, compare R̄_i - R̄_1 to C_i for i taking the values 2 through K. (Recall that group 1 is the background.)
While the above steps are the general procedure, some details need to be specified further to handle special cases. First, it may happen that two or more observations are numerically equal, or tied. When this occurs, determine the ranks that the tied observations would have received if they had been slightly different from each other, but still in the same places with respect to the rest of the observations. Add these ranks and divide by the number of observations tied at that value to get an average rank. This average rank is used for each of the tied observations. This same procedure is repeated for any other groups of tied observations. Second, if there are any values below detection, consider all values below detection as tied at zero. (It is irrelevant what number is assigned to nondetected values as long as all such values are assigned the same number, and it is smaller than any detected or quantified value.)

The effect of the correction for tied observations is to increase the value of the statistic, H. Unless there are many observations tied at the same value, the effect of ties on the computed test statistic is negligible (in practice, the effect of ties can probably be neglected unless some group contains 10 percent of the observations all tied, which is most likely to occur for concentrations below the detection limit). In the present context, the term "negligible" can be more specifically defined as follows. Compute the Kruskal-Wallis statistic without the adjustment for ties. If the test statistic is significant at the 5% level, then conclude the test, since the statistic with the correction for ties will be significant as well. If the test statistic falls between the 10% and the 5% critical values, then proceed with the adjustment for ties as shown below.
ADJUSTMENT FOR TIES

If 50% or more of the observations fall below the detection limit, then this method of adjustment for ties is inappropriate. The user is referred to Section 8, "Miscellaneous Topics." Otherwise, if there are tied values present in the data, use the following correction for the H statistic:

H' = H / [1 - Σ T_i/(N³-N)]

where the sum is over the g groups of distinct tied observations, T_i = (t_i³ - t_i), and t_i is the number of observations in tied group i. Note that unique observations can be considered tied groups of size 1, with the corresponding T_i = (1³ - 1) = 0.
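The correction is one line of arithmetic; a minimal Python sketch (the function name is hypothetical):

    def tie_corrected_h(h, tie_group_sizes, n_total):
        """Apply H' = H / [1 - sum(t_i**3 - t_i) / (N**3 - N)]."""
        t_sum = sum(t**3 - t for t in tie_group_sizes)
        return h / (1 - t_sum / (n_total**3 - n_total))

    # Values from the example that follows: tie groups of sizes 2, 2, 3, 2
    print(round(tie_corrected_h(14.68, [2, 2, 3, 2], 20), 2))   # 14.76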
REFERENCE

Hollander, Myles, and D. A. Wolfe. 1973. Nonparametric Statistical Methods. John Wiley and Sons, New York.
EXAMPLE

The data in Table 5-4 represent benzene concentrations in water samples taken at one background and five compliance wells.

Step 1. The 20 observations have been ranked from least to greatest. The limit of detection was 1.0 ppm. Note that two values in Well 4 were below detection and were assigned the value zero. These two are tied for the smallest value and have consequently been assigned the average of the two ranks 1 and 2, or 1.5. The ranks of the observations are indicated in parentheses after the observations in Table 5-4. Note that there are 3 observations tied at 1.3 that would have had ranks 4, 5, and 6 if they had been slightly different. These three have been assigned the average rank of 5 resulting from averaging 4, 5, and 6. Other ties occurred at 1.5 (ranks 7 and 8) and 1.9 (ranks 11 and 12).
Step 2. The values of the sums of ranks and average ranks are indicated at the bottom of Table 5-4.

Step 3. Compute the Kruskal-Wallis statistic:

H = [12/(20(20+1))] (34²/4 + ... + 35.5²/3) - 3(20+1) = 14.68
ADJUSTMENT FOR TIES

There are four groups of ties in the data of Table 5-4:

T1 = (2³-2) = 6 for the 2 observations of 1.9.
T2 = (2³-2) = 6 for the 2 observations of 1.5.
T3 = (3³-3) = 24 for the 3 observations of 1.3.
T4 = (2³-2) = 6 for the 2 observations of 0.

Thus Σ T_i = 6 + 6 + 24 + 6 = 42

and H' = 14.68/[1 - 42/(20³-20)] = 14.76, a negligible change from 14.68.
Step 4. To test the null hypothesis of no contamination, obtain the critical chi-squared value with (6-1) = 5 degrees of freedom at the 5% significance level from Table 1, Appendix B. The value is 11.07. Compare the calculated value, H', with the tabulated value. Since 14.76 is greater than 11.07, reject the hypothesis of no contamination at the 5% level. If the site was in detection monitoring, it should move into compliance monitoring. If the site was in compliance monitoring, it should move into corrective action. If the site was in corrective action, it should stay there.
TABLE 5-4. EXAMPLE DATA FOR ONE-WAY NONPARAMETRIC ANOVA--BENZENE CONCENTRATIONS (ppm)

         Background                          Compliance wells
Date     Well 1       Well 2      Well 3      Well 4     Well 5     Well 6
Jan 1    1.7 (10)     11.0 (20)   1.3 (5)     0 (1.5)    4.9 (17)   1.6 (9)
Feb 1    1.9 (11.5)    8.0 (18)   1.2 (3)     1.3 (5)    3.7 (16)   2.5 (15)
Mar 1    1.5 (7.5)     9.5 (19)   1.5 (7.5)   0 (1.5)    2.3 (14)   1.9 (11.5)
Apr 1    1.3 (5)                              2.2 (13)

n_i         4            3           3           4          3          3
R_i        34           57          15.5        21         47         35.5
R̄_i        8.5          19           5.17        5.25      15.67      11.83

K = 6, the number of wells. N = Σ n_i = 20, the total number of observations. (Ranks are shown in parentheses; values below the detection limit of 1.0 ppm are entered as 0.)
In the case where the hydraulically upgradient wells serve as the background against which the compliance wells are to be compared, comparisons of each compliance well with the background wells should be performed in addition to the analysis of variance procedure. In this example, data from each of the compliance wells would be compared with the background well data. This comparison is accomplished as follows. The average ranks for each group, R̄_i, are used to compute differences. If a group of compliance wells for a regulated unit has larger concentrations than those found in the background wells, the average rank for the compliance wells at that unit will be larger than the average rank for the background wells.
Step 5. Calculate the critical values to compare each compliance well to the background well.

In this example, K = 6, so there are 5 comparisons of the compliance wells with the background well. Using an experimentwise significance level of α = 0.05, we find the upper 0.05/5 = 0.01 percentile of the standard normal distribution to be 2.33 (Table 4, Appendix B). The total sample size, N, is 20. The approximate critical value, C2, is computed for compliance Well 2, which has the largest average rank, as:

C2 = 2.33 [20(21)/12]^(1/2) (1/4 + 1/3)^(1/2) = 10.5

The critical values for the other wells are: 10.5 for Wells 3, 5, and 6; and 9.8 for Well 4.
Step 6. Compute the differences between the average rank of each compliance well and the average rank of the background well, and compare each with the corresponding critical value:

Differences                   Critical values
19.00 - 8.5 =  10.50          C2 = 10.5
 5.17 - 8.5 =  -3.33          C3 = 10.5
 5.25 - 8.5 =  -3.25          C4 =  9.8
15.67 - 8.5 =   7.17          C5 = 10.5
11.83 - 8.5 =   3.33          C6 = 10.5

Compare each difference with the corresponding critical difference. D2 = 10.5 equals the critical value C2 = 10.5. We conclude that the concentration of benzene averaged over compliance Well 2 is significantly greater than that at the background well. None of the other compliance well concentrations of benzene is significantly higher than the average background value. Based upon these results, only compliance Well 2 can be singled out as being contaminated.
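The entire example, including the tie correction and the approximate critical differences, can be checked with a short Python script (a sketch assuming the scipy library, whose kruskal function applies the tie correction automatically):

    import math
    from scipy import stats

    # Benzene data from Table 5-4; nondetects entered as 0
    wells = {"bkgd":  [1.7, 1.9, 1.5, 1.3],
             "well2": [11.0, 8.0, 9.5],
             "well3": [1.3, 1.2, 1.5],
             "well4": [0.0, 1.3, 0.0, 2.2],
             "well5": [4.9, 3.7, 2.3],
             "well6": [1.6, 2.5, 1.9]}

    h, p = stats.kruskal(*wells.values())       # tie-corrected H' = 14.76
    print(f"H' = {h:.2f}, chi-squared crit = {stats.chi2.ppf(0.95, 5):.2f}")

    # Average ranks, then the approximate critical differences C_i
    flat = [x for v in wells.values() for x in v]
    ranks, n_total, i, avg = stats.rankdata(flat), len(flat), 0, {}
    for name, v in wells.items():
        avg[name] = ranks[i:i + len(v)].mean()
        i += len(v)
    z = stats.norm.ppf(1 - 0.05 / 5)            # about 2.33 for 5 comparisons
    for name in list(wells)[1:]:
        c = z * math.sqrt(n_total * (n_total + 1) / 12
                          * (1 / len(wells["bkgd"]) + 1 / len(wells[name])))
        print(f"{name}: difference {avg[name] - avg['bkgd']:.2f}, critical {c:.1f}")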
For data sets with more than 30 observations, the parametric analysis of variance performed on the rank values is a good approximation to the Kruskal-Wallis test (Quade, 1966). If the user has access to SAS, the PROC RANK procedure is used to obtain the ranks of the data. The analysis of variance procedure detailed in Section 5.2.1 is then performed on the ranks. Contrasts are tested as in the parametric analysis of variance.
INTERPRETATION

The Kruskal-Wallis test statistic is compared to the tabulated critical value from the chi-squared distribution. If the test statistic does not exceed the tabulated value, there is no statistically significant evidence of contamination and the analysis would stop and report this finding. If the test statistic exceeds the tabulated value, there is significant evidence that the hypothesis of no differences in compliance concentrations from the background level is not true. Consequently, if the test statistic exceeds the critical value, one concludes that there is significant evidence of contamination. One then proceeds to investigate where the differences lie, that is, which wells are indicating contamination.

The multiple comparisons procedure described in Steps 5 and 6 compares each compliance well to the background well. This determines which compliance wells show statistically significant evidence of contamination at an experimentwise error rate of 5 percent. In many cases, inspection of the mean or median concentrations will be sufficient to indicate where the problem lies.
5.3 TOLERANCE INTERVALS BASED ON THE NORMAL DISTRIBUTION

An alternate approach to analysis of variance to determine whether there is statistically significant evidence of contamination is to use tolerance intervals. A tolerance interval is constructed from the data on (uncontaminated) background wells. The concentrations from compliance wells are then compared with the tolerance interval. With the exception of pH, if the compliance concentrations do not fall in the tolerance interval, this provides statistically significant evidence of contamination.

Tolerance intervals are most appropriate for use at facilities that do not exhibit high degrees of spatial variation between background wells and compliance wells. Facilities that overlie extensive, homogeneous geologic deposits (for example, thick, homogeneous lacustrine clays) that do not naturally display hydrogeochemical variations may be suitable for this statistical method of analysis.

A tolerance interval establishes a concentration range that is constructed to contain a specified proportion (P%) of the population with a specified confidence coefficient, γ. The proportion of the population included, P%, is referred to as the coverage. The probability with which the tolerance interval includes the proportion P% of the population is referred to as the tolerance coefficient.
A coverage of 95% is recommended. If this is used, random observations from the same distribution as the background well data would exceed the upper
tolerance limit less than 5% of the time. Similarly, a tolerance coefficient of 95% is recommended. This means that one has a confidence level of 95% that the upper 95% tolerance limit will contain at least 95% of the distribution of observations from background well data. These values were chosen to be consistent with the performance standards described in Section 2. The use of these values corresponds to the selection of α = 0.05 in the multiple well testing situation.
The procedure can be applied with as few as three observations from the background distribution. However, doing so would result in a large upper tolerance limit. A sample size of eight or more results in an adequate tolerance interval. The minimum sampling schedule called for in the regulations would result in at least four observations from each background well. Only if a single background well is sampled at a single point in time is the sample size so small as to make use of the procedure questionable.
Tolerance intervals can be constructed assuming that the data or the transformed data are normally distributed. Tolerance intervals can also be constructed assuming other distributions. It is also possible to construct nonparametric tolerance intervals using only the assumption that the data came from some continuous population. However, the nonparametric tolerance intervals require such a large number of observations to provide a reasonable coverage and tolerance coefficient that they are impractical in this application.

The range of the concentration data in the background well samples should be considered in determining whether the tolerance interval approach should be used, and if so, what distribution is appropriate. The background well concentration data should be inspected for outliers and tests of normality applied before selecting the tolerance interval approach. Tests of normality were presented in Section 4.2. Note that in this case, the test of normality would be applied to the background well data that are used to construct the tolerance interval. These data should all be from the same normal distribution.

In this application, unless pH is being monitored, a one-sided tolerance interval or an upper tolerance limit is desired, since contamination is indicated by large concentrations of the hazardous constituents monitored. Thus, for concentrations, the appropriate tolerance interval is (0, TL), with the comparison of importance being the larger limit, TL.
PURPOSE

The purpose of the tolerance interval approach is to define a concentration range from background well data, within which a large proportion of the monitoring observations should fall with high probability. Once this is done, data from compliance wells can be checked for evidence of contamination by simply determining whether they fall in the tolerance interval. If they do not, this is evidence of contamination.

In this case the data are assumed to be approximately normally distributed. Section 4.2 provided methods to check for normality. If the data are
not normal, take the natural logarithm of the data and see if the transformed data are approximately normal. If so, this method can be used on the logarithms of the data. Otherwise, seek the assistance of a professional statistician.
PROCEDURE

Step 1. Calculate the mean, X̄, and the standard deviation, S, from the background well data.

Step 2. Construct the one-sided upper tolerance limit as

TL = X̄ + K S

where K is the one-sided normal tolerance factor found in Table 5, Appendix B.

Step 3. Compare each observation from the compliance wells to the tolerance limit found in Step 2. If any observation exceeds the tolerance limit, that is statistically significant evidence that the well is contaminated. Note that if the tolerance interval was constructed on the logarithms of the original background observations, the logarithms of the compliance well observations should be compared to the tolerance limit. Alternatively, the tolerance limit may be transferred to the original data scale by taking the antilogarithm.
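A minimal Python sketch of Steps 1 and 2 (the function name is hypothetical; the K factor must still come from Table 5, Appendix B, since it depends on n and on the chosen coverage and confidence):

    import statistics

    def upper_tolerance_limit(background, k_factor):
        """One-sided upper normal tolerance limit, TL = mean + K * S."""
        mean = statistics.mean(background)
        s = statistics.stdev(background)
        return mean + k_factor * s

    # Background lead data from the example that follows; K = 3.188 for n = 8
    lead = [58.0, 54.1, 30.0, 46.1, 46.1, 76.7, 32.1, 68.0]
    print(round(upper_tolerance_limit(lead, 3.188), 1))
    # roughly 103.3; the rounded inputs used in the text give 103.4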
REFERENCE

Lieberman, Gerald J. 1958. "Tables for One-Sided Statistical Tolerance Limits." Industrial Quality Control. Vol. XIV, No. 10.
EXAMPLE

Table 5-5 contains example data that represent lead concentration levels in parts per million in water samples at a hypothetical facility. The background well data are in columns 1 and 2, while the other four columns represent compliance well data.

Step 1. The mean and standard deviation of the n = 8 observations have been calculated for the background well. The mean is 51.4 and the standard deviation is 16.3.

Step 2. The tolerance factor for a one-sided normal tolerance interval is found from Table 5, Appendix B, as 3.188. This is for 95% coverage with probability 95% and for n = 8. The upper tolerance limit is then calculated as 51.4 + (3.188)(16.3) = 103.4.

Step 3. The tolerance limit of 103.4 is compared with the compliance well data. Any value that exceeds the tolerance limit indicates statistically significant evidence of contamination. Two observations from Well 1, two observations from Well 3, and all four observations from Well 4 exceed the tolerance limit. Thus there is statistically significant evidence of contamination at Wells 1, 3, and 4.
TABLE 5-5. EXAMPLE DATA FOR NORMAL TOLERANCE INTERVAL

                      Lead concentrations (ppm)
        Background well             Compliance wells
Date      A       B        Well 1    Well 2    Well 3    Well 4
Jan 1    58.0    46.1      273.1*     34.1      49.9     225.9*
Feb 1    54.1    76.7      170.7*     93.7      73.0     183.1*
Mar 1    30.0    32.1       32.1      70.8     244.7*    198.3*
Apr 1    46.1    68.0       53.0      83.1     202.4*    160.8*

Background: n = 8, Mean = 51.4, SD = 16.3

The upper 95% coverage tolerance limit with tolerance coefficient of 95% is 51.4 + (3.188)(16.3) = 103.4

* Indicates contamination
INTERPRETATION

A tolerance limit with 95% coverage gives an upper bound below which 95% of the observations of the distribution should fall. The tolerance coefficient used here is 95%, implying that at least 95% of the observations should fall below the tolerance limit with probability 95%, if the compliance well data come from the same distribution as the background data. In other words, in this example, we are 95% certain that 95% of the background lead concentrations are below 104 ppm. If observations exceed the tolerance limit, this is evidence that the compliance well data are not from the same distribution, but rather are from a distribution with higher concentrations. This is interpreted as statistically significant evidence of contamination.
5.4 PREDICTION INTERVALS

A prediction interval is a statistical interval calculated to include one or more future observations from the same population with a specified confidence. This approach is algebraically equivalent to the average replicate (AR) test that is presented in the Technical Enforcement Guidance Document (TEGD), September 1986. In ground-water monitoring, a prediction interval approach may be used to make comparisons between background and compliance well data. This method of analysis is similar to that for calculating a tolerance limit, and familiarity with prediction intervals or personal preference would be the only reason for selecting them over the method for tolerance limits. The concentrations of a hazardous constituent in the background wells are used to establish an interval within which K future observations from the same population are expected to lie with a specified confidence. Then each of K future observations of compliance well concentrations is compared to the prediction interval. The interval is constructed to contain all of K future observations with the stated confidence. If any future observation exceeds the prediction interval, this is statistically significant evidence of contamination. In application, the number of future observations to be collected, K, must be specified. Thus, the prediction interval is constructed for a specified time period in the future. One year is suggested. The interval can be constructed either to contain all K individual observations with a specified probability, or to contain the K means observed at the K sampling periods.

The prediction interval presented here is constructed assuming that the background data all follow the same normal distribution. If that is not the case (see Section 4.2 for tests of normality), but a log transformation results in data that are adequately normal on the log scale, then the interval may still be used. In this case, use the data after transforming by taking the logarithm. The future observations also need to be transformed by taking logarithms before comparison to the interval. (Alternatively, the end points of the interval could be converted back to the original scale by taking their antilogarithms.)
PURPOSE

The prediction interval is constructed so that K future compliance well observations can be tested by determining whether they lie in the interval or not. If not, evidence of contamination is found. Note that the number of future observations, K, for which the interval is to be used must be specified in advance. In practice, an owner or operator would need to construct the prediction interval on a periodic (at least yearly) basis, using the most recent background data. The interval is described using the 95% confidence factor appropriate for individual well comparisons. It is recommended that a one-sided prediction interval be constructed for the mean of the four observations from each compliance well at each sampling period.
PROCEDURE

Step 1. Calculate the mean, X̄, and the standard deviation, S, for the background well data (used to form the prediction interval).

Step 2. Specify the number of future observations for a compliance well to be included in the interval, K. Then the interval is given by

[0, X̄ + S (1/m + 1/n)^(1/2) t(n-1, K, 0.95)]

where it is assumed that the mean of the m observations taken at each of the K sampling periods will be used. Here n is the number of observations in the background data, and t(n-1, K, 0.95) is found from Table 3 in Appendix B. The table is entered with K as the number of future observations and degrees of freedom, v = n-1. If K > 5, use the column for K = 5.
Step 3. Once the interval has been calculated, at each sampling period the mean of the compliance well observations is obtained. This mean is compared to see if it falls in the interval. If it does, this is reported and monitoring continues. If a mean concentration at a sampling period does not fall in the prediction interval, this is statistically significant evidence of contamination. This is also reported and the appropriate action taken.
REMARK

For a single future observation, t is given by the t-distribution found in Table 6 of Appendix B. In general, the interval to contain K future means of sample size m each is given by

[0, X̄ + S (1/m + 1/n)^(1/2) t(n-1, K, 0.95)]

where t is as before from Table 3 of Appendix B and where m is the number of observations in each mean. Note that for K single observations, m = 1, while for the mean of four samples from a compliance well, m = 4.

Note, too, that the prediction intervals are one-sided, giving a value that should not be exceeded by the future observations. The 5% experimentwise significance level is used with the Bonferroni approach. However, to ensure that the significance level for the individual comparisons does not go below 1%, α/K is restricted to be 1% or larger. If more than five comparisons are used, the comparisonwise significance level of 1% is used, implying that the experimentwise level may exceed 5%.
EXAMPLE

Table 5-6 contains chlordane concentrations measured at a hypothetical facility. Twenty-four background observations are available and are used to develop the prediction interval. The prediction interval is applied to K = 2 sampling periods with m = 4 observations at a single compliance well each.

Step 1. Find the mean and standard deviation of the 24 background well measurements. These are 101 and 11, respectively.

Step 2. There are K = 2 future observations of means of 4 observations to be included in the prediction interval. Entering Table 3 of Appendix B at K = 2 and 20 degrees of freedom (the nearest entry to the 23 degrees of freedom), we find t(20, 2, 0.95) = 2.09. The interval is given by

[0, 101 + (11)(2.09)(1/4 + 1/24)^(1/2)] = (0, 113.4)
Step 3. The mean of the four compliance well observations at each of sampling periods one and two is found and compared with the interval found in Step 2. The mean for the first sampling period is 122 and that for the second sampling period is 113. Comparing the first of these to the prediction interval for two means based on samples of size 4, we find that the mean of 122 exceeds the upper limit of the prediction interval. This is statistically significant evidence of contamination and should be reported to the Regional Administrator. Since the second sampling period mean of 113 is within the prediction interval, the Regional Administrator may allow the facility to remain in its current stage of monitoring.
TABLE 5-6. EXAMPLE DATA FOR PREDICTION INTERVAL--CHLORDANE LEVELS

Background well data--Well 1
                        Chlordane concentration (ppb)
Sampling date
January 1, 1985          97   103   104    85
April 1, 1985           120   105   104   108
July 1, 1985            110    95   102    78
October 1, 1985         105    94   110   111
January 1, 1986          80   106   115   105
April 1, 1986           100    93    89   113

n = 24, Mean = 101, SD = 11

Compliance well data--Well 2
                        Chlordane concentration (ppb)
Sampling date
July 1, 1986            123   120   116   128
                        m = 4, Mean = 122, SD = 5
October 1, 1986         116   117   119   101
                        m = 4, Mean = 113, SD = 8
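The interval arithmetic can be verified with a few lines of Python (a sketch using the chlordane example values; the t factor still comes from Table 3, Appendix B):

    import math

    n, mean, s = 24, 101.0, 11.0     # background summary statistics
    m, t = 4, 2.09                    # mean of 4 observations; t(20, K=2, 0.95)

    upper = mean + s * t * math.sqrt(1 / m + 1 / n)
    print(f"Prediction interval: (0, {upper:.1f})")      # (0, 113.4)
    for period_mean in (122, 113):
        status = "exceeds" if period_mean > upper else "within"
        print(f"Sampling period mean {period_mean}: {status}")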
INTERPRETATION

A prediction interval is a statistical interval constructed from background sample data to contain a specified number of future observations from the same distribution with specified probability. That is, the prediction interval is constructed so as to have a 95% probability of containing the next K sampling period means, provided that there is no contamination. If the future observations are found to be in the prediction interval, this is evidence that there has been no change at the facility and that no contamination is occurring. If a future observation falls outside of the prediction interval, this is statistical evidence that the new observation does not come from the same distribution, that is, from the population of uncontaminated water samples previously sampled. Consequently, if the observation is a concentration above the prediction interval's upper limit, it is statistically significant evidence of contamination.

The prediction interval could be constructed in several ways. It can be developed for means of observations at each sampling period, or for each individual observation at each sampling period.
It should also be noted that the estimate of the standard deviation, S, that is used should be an unbiased estimator. The usual estimator, presented above, assumes that there is only one source of variation. If there are other sources of variation, such as time effects or spatial variation in the data used for the background, these should be included in the estimate of the variability. This can be accomplished by use of an appropriate analysis-of-variance model to include the other factors affecting the variability. Determination of the components of variance in complicated models is beyond the scope of this document and requires consultation with a professional statistician.
REFERENCE

Hahn, G., and Wayne Nelson. 1973. "A Survey of Prediction Intervals and Their Applications." Journal of Quality Technology. 5:178-188.
SECTION 6

COMPARISONS WITH MCLs OR ACLs

This section includes statistical procedures appropriate when the monitoring aims at determining whether ground-water concentrations of hazardous constituents are below or above fixed concentration limits. In this situation the maximum concentration limit (MCL) or alternate concentration limit (ACL) is a specified concentration limit rather than being determined by the background well concentrations. Thus the applicable statistical procedures are those that compare the compliance well concentrations estimated from sampling with the prespecified fixed limits. Methods for comparing compliance well concentrations to a (variable) background concentration were presented in Section 5.

The methods applicable to the type of comparisons described in this section include confidence intervals and tolerance intervals. A special section deals with cases where the observations exhibit very small or no variability.
6.1 SUMMARY CHART FOR COMPARISON WITH MCLs OR ACLs

Figure 6-1 is a flow chart to aid the user in selecting and applying a statistical method when the permit specifies an MCL or ACL.

As with each type of comparison, a determination is made first to see if there are enough data for intra-well comparisons. If so, these should be done in parallel with the other comparisons.

Here, whether the compliance limit is a maximum concentration limit (MCL) or an alternate concentration limit (ACL), the recommended procedure to compare the mean compliance well concentration against the compliance limit is the construction of a confidence interval. This approach is presented in Section 6.2.1. Section 6.2.3 addresses a special case of limited variance in the data. If the permit requires that a compliance limit is not to be exceeded more than a specified fraction of the time, then the construction of tolerance limits is the recommended procedure, discussed in Section 6.2.2.
6.2 STATISTICAL PROCEDURES

This section presents the statistical procedures appropriate for comparison of ground-water monitoring data to a constant compliance limit, a fixed standard. The interpretation of the fixed compliance limit (MCL or ACL) is that the mean concentration should not exceed this fixed limit. An alternate interpretation may be specified. The permit could specify a compliance limit as a concentration not to be exceeded by more than a small, specified proportion of the observations. A tolerance interval approach for such a situation is also presented.
[Figure 6-1 is a flow chart: depending on the data, the user selects normal, lognormal, or nonparametric confidence intervals, or tolerance limits based on the upper 95th percentile, consulting a professional statistician where needed; with more than one year of data, intra-well comparisons using control charts (Section 7) proceed in parallel.]

Figure 6-1. Comparisons with MCLs/ACLs.
6.2.1 Confidence Intervals

When a regulated unit is in compliance monitoring with a fixed compliance limit (either an MCL or an ACL), confidence intervals are the recommended procedure pursuant to §264.97(h)(5) in the Subpart F regulations. The unit will remain in compliance monitoring unless there is statistically significant evidence that the mean concentration at one or more of the downgradient wells exceeds the compliance limit. A confidence interval for the mean concentration is constructed from the sample data for each compliance well individually. These confidence intervals are compared with the compliance limit. If the entire confidence interval exceeds the compliance limit, this is statistically significant evidence that the mean concentration exceeds the compliance limit.

Confidence intervals can generally be constructed for any specified distribution. General methods can be found in texts on statistical inference, some of which are referenced in Appendix C. A confidence limit based on the normal distribution is presented first, followed by a modification for the log-normal distribution. A nonparametric confidence interval is also presented.
6.2.1.1 Confidence Interval Based on the Normal Distribution

PURPOSE

The confidence interval for the mean concentration is constructed from the compliance well data. Once the interval has been constructed, it can be compared with the MCL or ACL by inspection to determine whether the mean concentration significantly exceeds the MCL or ACL.

PROCEDURE

Step 1. Calculate the mean, X̄, and standard deviation, S, of the sample concentration values. Do this separately for each compliance well.

Step 2. For each well, calculate the confidence interval as

X̄ ± t(0.99, n-1) S/√n

where t(0.99, n-1) is obtained from the t-table (Table 6, Appendix B). Generally, there will be at least four observations at each sampling period, so t will usually have at least 3 degrees of freedom.
Step 3. Compare the intervals calculated in Step 2 to the compliance limit (the MCL or ACL, as appropriate). If the compliance limit is contained in the interval or is above the upper limit, the unit remains in compliance. If any well confidence interval's lower limit exceeds the compliance limit, this is statistically significant evidence of contamination.
REMARK

The 99th percentile of the t-distribution is used in constructing the confidence interval. This is consistent with an alpha (probability of Type I error) of 0.01, since the decision on compliance is made by comparing the lower confidence limit to the MCL or ACL. Although the interval as constructed with both upper and lower limits is a 98% confidence interval, the use of it is one-sided, which is consistent with the 1% alpha level of individual well comparisons.
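A minimal Python sketch of the procedure (assuming the scipy library for the t percentile; the data are from the example that follows):

    import math
    import statistics
    from scipy import stats

    def confidence_interval(data, alpha=0.01):
        """Interval using the 99th t-percentile; two-sided it is a 98%
        interval, used one-sidedly at the 1% level as in the Remark."""
        n = len(data)
        mean = statistics.mean(data)
        s = statistics.stdev(data)
        t = stats.t.ppf(1 - alpha, n - 1)      # 4.541 for n = 4
        half = t * s / math.sqrt(n)
        return mean - half, mean + half

    print(confidence_interval([19.9, 29.6, 18.7, 24.2]))
    # Well 1: about (11.9, 34.3); the text's rounded S = 4.9 gives (12.0, 34.2)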
EXAMPLE

Table 6-1 lists hypothetical concentrations of Aldicarb in three compliance wells. For illustration purposes, the MCL for Aldicarb has been set at 7 ppb. There is no evidence of nonnormality, so the confidence interval based on the normal distribution is used.
TABLE 6-1. EXAMPLE DATA FOR NORMAL CONFIDENCE INTERVAL--ALDICARB
CONCENTRATIONS IN COMPLIANCE WELLS (ppb)

Sampling
date       Well 1    Well 2    Well 3
Jan. 1      19.9      23.7      5.6
Feb. 1      29.6      21.9      3.3
Mar. 1      18.7      26.9      2.3
Apr. 1      24.2      26.1      6.9

X̄ =         23.1      24.6      4.5
S =          4.9       2.3      2.1

MCL = 7 ppb
Step 1. Calculate the mean and standard deviation of the concentrations for each compliance well. These statistics are shown in the table above.

Step 2. Obtain the 99th percentile of the t-distribution with (4-1) = 3 degrees of freedom from Table 6, Appendix B, as 4.541. Then calculate the confidence interval for each well's mean concentration:

Well 1: 23.1 ± 4.541(4.9)/√4 = (12.0, 34.2)
Well 2: 24.6 ± 4.541(2.3)/√4 = (19.4, 29.8)
Well 3:  4.5 ± 4.541(2.1)/√4 = (-0.3, 9.3)
where the usual convention of expressing the upper and lower limits of the confidence interval in parentheses separated by a comma has been followed.

Step 3. Compare each confidence interval to the MCL of 7 ppb. When this is done, the confidence interval for Well 1 lies entirely above the MCL of 7, indicating that the mean concentration of Aldicarb in Well 1 significantly exceeds the MCL. Similarly, the confidence interval for Well 2 lies entirely above the MCL of 7. This is significant evidence that the mean concentration in Well 2 exceeds the MCL. However, the confidence interval for Well 3 is mostly below the MCL. Thus, there is no statistically significant evidence that the mean concentration in Well 3 exceeds the MCL.
INTERPRETATION

The confidence interval is an interval constructed so that it should contain the true or population mean with specified confidence (98% in this case). If this interval does not contain the compliance limit, then the mean concentration must differ from the compliance limit. If the lower end of the interval is above the compliance limit, then the mean concentration must be significantly greater than the compliance limit, indicating noncompliance.
6.2.1.2 Confidence Interval for Log-Normal Data

PURPOSE

The purpose of a confidence interval for the mean concentration of log-normal data is to determine whether there is statistically significant evidence that the mean concentration exceeds a fixed compliance limit. The interval gives a range that includes the true mean concentration with confidence 98%. The lower limit will be below the true mean with confidence 99%, corresponding to an alpha of 1%.

PROCEDURE

This procedure is used to construct a confidence interval for the mean concentration from the compliance well data when the data are log-normal (that is, when the logarithms of the data are normally distributed). Once the interval has been constructed, it can be compared with the MCL or ACL by inspection to determine whether the mean concentration significantly exceeds the MCL or ACL. Throughout the following procedures and examples, natural logarithms (ln) are used.
Step 1. Take the natural logarithm of each data point (concentration measurement). Also, take the natural logarithm of the compliance limit.

Step 2. Calculate the sample mean and standard deviation of the log-transformed data from each compliance well. (This is Step 1 of the previous section, working now with logarithms.)

Step 3. Form the confidence interval for each compliance well as

X̄ ± t(0.99, n-1) S/√n

where X̄ and S are computed from the log-transformed data and t(0.99, n-1) is from the t-distribution in Table 6 of Appendix B. Here t will typically have 3 degrees of freedom.

Step 4. Compare the confidence intervals found in Step 3 to the logarithm of the compliance limit found in Step 1. If the lower limit of a well's confidence interval lies above the logarithm of the compliance limit, there is statistically significant evidence that the unit is out of compliance. Otherwise, the unit is in compliance.
EXAMPLE

Table 6-2 contains EDB concentration data from three compliance wells at a hypothetical site. The MCL is assumed to be 20 ppb. For demonstration purposes, the data are assumed not normal; a natural log transformation normalized them adequately. The lower part of the table contains the natural logarithms of the concentrations.
TABLE 6-2. EXAMPLE DATA FOR LOG-NORMAL CONFIDENCE INTERVAL--EDB
CONCENTRATIONS IN COMPLIANCE WELLS (ppb)

Sampling
date       Well 1    Well 2    Well 3

Concentrations
Jan. 1      24.2      39.7      55.7
Apr. 1      10.2      75.7      17.0
Jul. 1      17.4      60.2      97.8
Oct. 1      39.7      10.9      25.3

X̄ =         22.9      46.6      49.0
S =         12.6      28.0      36.6

MCL = 20 ppb

Natural log concentrations
Jan. 1      3.19      3.68      4.02
Apr. 1      2.32      4.33      2.84
Jul. 1      2.85      4.10      4.58
Oct. 1      3.68      2.39      3.23

X̄ =         3.01      3.62      3.67
S =         0.57      0.86      0.78

ln(MCL) = 3.00
Step 1. The logarithms of the data are used to calculate a confidence interval. Take the natural log of the concentrations in the top part of Table 6-2 to find the values given in the lower part of the table. For example, ln(24.2) = 3.19, . . ., ln(25.3) = 3.23. Also, take the logarithm of the MCL to find that ln(20) = 3.00.

Step 2. Calculate the mean and standard deviation of the log concentrations for each compliance well. These are shown in the table.

Step 3. Form the confidence interval for each compliance well:

Well 1: 3.01 ± 4.541(0.57)/√4 = (1.72, 4.30)
Well 2: 3.62 ± 4.541(0.86)/√4 = (1.67, 5.57)
Well 3: 3.67 ± 4.541(0.78)/√4 = (1.90, 5.44)

where 4.541 is the value obtained from the t-table (Table 6 in Appendix B) as in the previous example.

Step 4. Compare the individual well confidence intervals with the MCL (expressed on the log scale). The natural log of the MCL of 20 ppb is 3.00. None of the individual well confidence intervals for the mean has a lower limit that exceeds this value, so none of the individual well mean concentrations is significantly different from the MCL.

Note: The lower and upper limits of the confidence interval for each well's mean concentration could be converted back to the original scale by taking antilogs. For example, on the original scale, the confidence intervals would be:

Well 1: (exp(1.72), exp(4.30)) or (5.58, 73.70)
Well 2: (exp(1.67), exp(5.57)) or (5.31, 262.43)
Well 3: (exp(1.90), exp(5.44)) or (6.69, 230.44)

These limits could be compared directly with the MCL of 20 ppb. It is generally easier to take the logarithm of the MCL rather than the antilogarithm of all of the intervals for comparison.
INTERPRETATION

If the original data are not normal, but the log transformation adequately normalizes the data, the confidence interval (on the log scale) is an interval constructed so that the lower confidence limit should be less than the true or population mean (on the log scale) with specified confidence (99% in this case). If the lower end of the confidence interval exceeds the appropriate compliance limit, then the mean concentration must exceed that compliance limit. These results provide statistically significant evidence of contamination.
6.2.1.3 Nonparametric Confidence Interval

If the data do not adequately follow the normal distribution even after the logarithm transformation, a nonparametric confidence interval can be constructed. This interval is for the median concentration (which equals the mean if the distribution is symmetric). The nonparametric confidence interval is generally wider and requires more data than the corresponding normal distribution interval, and so the normal or log-normal distribution interval should be used whenever it is appropriate. It requires a minimum of seven (7) observations in order to construct an interval with a two-sided confidence coefficient of 98%, corresponding to a one-sided confidence coefficient of 99%. Consequently, it is applicable only for the pooled concentration of compliance wells at a single point in time or for special sampling to produce a minimum of seven observations at a single well during the sampling period.
PURPOSE

The nonparametric confidence interval is used when the raw data have been found to violate the normality assumption, a log transformation fails to normalize the data, and no other specific distribution is assumed. It produces a simple confidence interval that is designed to contain the true or population median concentration with specified confidence (here 99%). If this confidence interval contains the compliance limit, it is concluded that the median concentration does not differ significantly from the compliance limit. If the interval's lower limit exceeds the compliance limit, this is statistically significant evidence that the concentration exceeds the compliance limit and the unit is out of compliance.
PROCEDURE

Step 1. Within each compliance well, order the n data from least to greatest, denoting the ordered data by X(1), . . ., X(n), where X(i) is the ith value in the ordered data.

Step 2. Determine the critical values of the order statistics as follows. If the minimum of seven observations is used, the critical values are 1 and 7. Otherwise, find the smallest integer, M, such that the cumulative binomial distribution with parameters n (the sample size) and p = 0.5 is at least 0.99. Table 6-3 gives the values of M and n+1-M together with the exact confidence coefficient for sample sizes from 4 to 11. For larger samples, take as an approximation the nearest integer value to

M = n/2 + 1 + Z(0.99) (n/4)^(1/2)

where Z(0.99) is the 99th percentile from the standard normal distribution (Table 4, Appendix B) and equals 2.33.
TABLE 6-3. VALUES OF M AND n+1-M AND CONFIDENCE
COEFFICIENTS FOR SMALL SAMPLES

                        Two-sided
 n      M     n+1-M     confidence
 4      4       1         87.5%
 5      5       1         93.8%
 6      6       1         96.9%
 7      7       1         98.4%
 8      8       1         99.2%
 9      9       1         99.6%
10      9       2         97.9%
11     10       2         98.8%
Step 3. Once M has been determined in Step 2, find n+1-M and take as the confidence limits the order statistics X(n+1-M) and X(M). (With the minimum of seven observations, use X(1) and X(7).)

Step 4. Compare the confidence limits found in Step 3 to the compliance limit. If the lower limit, X(n+1-M), exceeds the compliance limit, there is statistically significant evidence of contamination. Otherwise, the unit remains in compliance.
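A small Python sketch of Steps 2 and 3 using the large-sample approximation (the function name is hypothetical; for n of 11 or fewer, use Table 6-3 instead):

    import math

    def nonparametric_limits(data, z99=2.33):
        """Order-statistic limits for the median: M = n/2 + 1 + z*sqrt(n/4),
        rounded to the nearest integer; limits are X(n+1-M) and X(M)."""
        x = sorted(data)
        n = len(x)
        m = round(n / 2 + 1 + z99 * math.sqrt(n / 4))
        return x[n - m], x[m - 1]        # 1-based X(n+1-M) and X(M)

    # Well 1 silvex data from Table 6-4 (in the example that follows)
    well1 = [3.17, 2.32, 7.37, 4.44, 9.50, 21.36, 5.15, 15.70,
             5.58, 3.39, 8.44, 10.25, 3.65, 6.15, 6.94, 3.74]
    print(nonparametric_limits(well1))   # (3.39, 10.25)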
REMARK

The nonparametric confidence interval procedure requires at least seven observations in order to obtain a (one-sided) significance level of 1% (confidence of 99%). This means that data from two (or more) wells or sampling periods would have to be pooled to achieve this level. If only the four observations from one well taken at a single sampling period were used, the one-sided significance level would be 6.25%. This would also be the false alarm rate.

Ties do not affect the procedure. If there are ties, order the observations as before, including all of the tied values as separate observations. That is, each of the observations with a common value is included in the ordered list (e.g., 1, 2, 2, 2, 3, 4, etc.). For ties, use the average of the tied ranks as in Section 5.2.2, Step 1 of the example. The order statistics are found by counting positions up from the bottom of the list as before. Multiple values from separate observations are counted separately.
EXAMPLE

Table 6-4 contains concentrations of Silvex in parts per million from two hypothetical compliance wells. The data are assumed to consist of four samples taken each quarter for a year, so that sixteen observations are available
TABLE 6-4. EXAMPLE DATA FOR NONPARAMETRIC CONFIDENCE
INTERVAL--SILVEX CONCENTRATIONS (ppm)

                 Well 1                      Well 2
Sampling   Concentration              Concentration
date       (ppm)       Rank           (ppm)       Rank
Jan. 1      3.17        (2)            3.52        (6)
            2.32        (1)           12.32       (15)
            7.37       (11)            2.28        (4)
            4.44        (6)            5.30        (7)
Apr. 1      9.50       (13)            8.12       (11)
           21.36       (16)            3.36        (5)
            5.15        (7)           11.02       (14)
           15.70       (15)           35.05       (16)
Jul. 1      5.58        (8)            2.20        (3)
            3.39        (3)            0.00        (1.5)
            8.44       (12)            9.30       (12)
           10.25       (14)           10.30       (13)
Oct. 1      3.65        (4)            5.93        (8)
            6.15        (9)            6.39        (9)
            6.94       (10)            0.00        (1.5)
            3.74        (5)            6.53       (10)
from each well. The data are not normally distributed, neither as raw data nor when log-transformed. Thus, the nonparametric confidence interval is used. The MCL is taken to be 25 ppm.

Step 1. Order the 16 measurements from least to greatest within each well separately. The numbers in parentheses beside each concentration in Table 6-4 are the ranks or order of the observations. For example, in Well 1, the smallest observation is 2.32, which has rank 1. The second smallest is 3.17, which has rank 2, and so forth, with the largest observation of 21.36 having rank 16.

Step 2. The sample size is large enough so that the approximation is used to find M:

M = 16/2 + 1 + 2.33 (16/4)^(1/2) = 13.7, which rounds to 14.

Step 3. The approximate 98% confidence limits are given by the (16 + 1 - 14) = 3rd and the 14th order statistics. For
Well 1, the 3rd order statistic is 3.39 and the 14th is 10.25. Thus the confidence limits for Well 1 are (3.39, 10.25). Similarly for Well 2, the 3rd and the 14th order statistics are found to give the confidence interval (2.20, 11.02). Note that for Well 2 there were two values below detection. These were assigned a value of zero and received the two smallest ranks. Had there been three or more values below the limit of detection, the lower limit of the confidence interval would have been the limit of detection, because these values would have been the smallest values and so would have included the third order statistic.

Step 4. Neither of the two confidence intervals' lower limit exceeds the MCL of 25. In fact, each upper limit is less than the MCL, implying that the concentration in each well is significantly below the MCL.
INTERPRETATION

The rank-order statistics used to form the confidence interval in the nonparametric confidence interval procedure will contain the population median with a confidence coefficient of 98%. The population median equals the mean whenever the distribution is symmetric. The nonparametric confidence interval is generally wider and requires more data than the corresponding normal distribution interval, and so the normal or log-normal distribution interval should be used whenever it is appropriate.

If the confidence interval contains the compliance limit (either MCL or ACL), then it is reasonable to conclude that the median compliance well concentration does not differ significantly from the compliance limit. If the lower end of the confidence interval exceeds the compliance limit, this is statistically significant evidence at the 1% level that the median compliance well concentration exceeds the compliance limit and the unit is out of compliance.
6.2.2 Tolerance Intervals for Compliance Limits

In some cases a permit may specify that a compliance limit (MCL or ACL) is not to be exceeded more than a specified fraction of the time. Since limited data will be available from each monitoring well, these data can be used to estimate a tolerance interval for concentrations from that well. If the upper end of the tolerance interval (i.e., the upper tolerance limit) is less than the compliance limit, the data indicate that the unit is in compliance. That is, concentrations should be less than the compliance limit at least the specified fraction of the time. If the upper tolerance limit of the interval exceeds the compliance limit, then the concentration of the hazardous constituent could exceed the compliance limit more than the specified proportion of the time.

This procedure compares an upper tolerance limit to the MCL or ACL. With small sample sizes the upper tolerance limit can be fairly large, particularly if large coverage with high confidence is desired. If the owner or operator wishes to use a tolerance limit in this application, he/she should suggest values for the parameters of the procedure subject to the approval of the Regional Administrator. For example, the owner or operator could suggest a 95% coverage with 95% confidence. This means that the upper tolerance limit is a value which, with 95% confidence, will be exceeded less than 5% of the time.
PURPOSE

The purpose of the tolerance interval approach is to construct an interval that should contain a specified fraction of the concentration measurements from compliance wells with a specified degree of confidence. In this application it is generally desired to have the tolerance interval contain 95% of the measurements of concentration with confidence at least 95%.

PROCEDURE

It is assumed that the data used to construct the tolerance interval are approximately normal. The data may consist of the concentration measurements themselves if they are adequately normal (see Section 4.2 for tests of normality), or the data used may be the natural logarithms of the concentration data. It is important that the compliance limit (MCL or ACL) be expressed in the same units (either concentrations or logarithms of the concentrations) as the observations.

Step 1. Calculate the mean, X̄, and the standard deviation, S, of the compliance well concentration data.

Step 2. Determine the factor, K, from Table 5, Appendix B, for the sample size, n, and form the one-sided tolerance interval

[0, X̄ + KS]

Table 5, Appendix B, contains the factors for a 95% coverage tolerance interval with confidence factor 95%.

Step 3. Compare the upper limit of the tolerance interval computed in Step 2 to the compliance limit. If the upper limit of the tolerance interval exceeds that limit, this is statistically significant evidence of contamination.
EXAMPLE

Table 6-5 contains Aldicarb concentrations at a hypothetical facility in compliance monitoring. The data are concentrations in parts per million (ppm) and represent observations at three compliance wells. Assume that the permit establishes an ACL of 50 ppm that is not to be exceeded more than 5% of the time.

Step 1. Calculate the mean and standard deviation of the observations from each well. These are given in the table.
TABLE 6-5. EXAMPLE DATA FOR A TOLERANCE
INTERVAL COMPARED TO AN ACL

           Aldicarb concentrations (ppm)
Sampling
date       Well 1    Well 2    Well 3
Jan. 1      19.9      23.7      25.6
Feb. 1      29.6      21.9      23.3
Mar. 1      18.7      26.9      22.3
Apr. 1      24.2      26.1      26.9

Mean        23.1      24.7      24.5
SD           4.93      2.28      2.10

ACL = 50 ppm
Step 2. For n = 4, the factor, K, in Table 5, Appendix B, is found to be 5.145. Form the upper tolerance interval limits as:

Well 1: 23.1 + 5.145(4.93) = 48.5
Well 2: 24.7 + 5.145(2.28) = 36.4
Well 3: 24.5 + 5.145(2.10) = 35.3

Step 3. Compare the tolerance limits with the ACL of 50 ppm. Since the upper tolerance limits are below the ACL, there is no statistically significant evidence of contamination at any well. The site remains in compliance monitoring.
INTERPRETATION

It may be desirable in a permit to specify a compliance limit that is not to be exceeded more than 5% of the time. A tolerance interval constructed from the compliance well data provides an estimated interval that will contain 95% of the data with confidence 95%. If the upper limit of this interval is below the selected compliance limit, concentrations measured at the compliance wells should exceed the compliance limit less than 5% of the time. If the upper limit of the tolerance interval exceeds the compliance limit, then more than 5% of the concentration measurements would be expected to exceed the compliance limit.
6.2.3 Special Cases with Limited Variance

Occasionally, all four concentrations from a compliance well at a particular sampling period could be identical. If this is the case, the formula for estimating the standard deviation at that specific sampling period would give zero, and the methods for calculating parametric confidence intervals would give the same limits for the upper and lower ends of the intervals, which is not appropriate.

In the case of identical concentrations, one should assume that there is some variation in the data, but that the concentrations were rounded and give the same values after rounding. To account for the variability that was present before rounding, take the least significant digit in the reported concentration as having resulted from rounding. Assume that rounding results in a uniform error on the interval centered at the reported value, with the interval ranging up or down one half unit from the reported value. This assumed rounding is used to obtain a nonzero estimate of the variance for use in cases where all the measured concentrations were found to be identical.
PURPOSE

The purpose of this procedure is to obtain a nonzero estimate of the variance when all observations from a well during a given sampling period gave identical results. Once this modified variance is obtained, its square root is used in place of the usual sample standard deviation, S, to construct confidence intervals or tolerance intervals.

PROCEDURE

Step 1. Determine the least significant value of any data point. That is, determine whether the data were reported to the nearest 10 ppm, nearest 1 ppm, nearest 100 ppm, etc. Denote this value by 2R.

Step 2. The data are assumed to have been rounded to the nearest 2R, so each observation is actually the reported value ±R. Assuming that the observations were identical because of rounding, the variance is estimated to be R²/3, assuming the uniform distribution for the rounding error. This gives the estimated standard deviation as

S' = R/√3

Step 3. Take this estimated value from Step 2 and use it as the estimate of the standard deviation in the appropriate parametric procedure. That is, replace S by S'.
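In Python, the whole adjustment is a couple of lines (a sketch using the values from the example that follows):

    import math

    reported, two_r, n = 590.0, 10.0, 4   # four identical values, nearest 10 ppm
    r = two_r / 2
    s_prime = r / math.sqrt(3)            # 2.89 ppm from the uniform rounding error
    t = 4.541                             # t(0.99, 3 df), Table 6, Appendix B
    half = t * s_prime / math.sqrt(n)
    print(f"({reported - half:.1f}, {reported + half:.1f})")   # (583.4, 596.6)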
EXAMPLE

In calculating a confidence interval for a single compliance well, suppose that four observations were taken during a sampling period and all resulted in 590 ppm. There is no variance among the four values 590, 590, 590, and 590.

Step 1. Assume that each of the values 590 came from rounding the concentration to the nearest 10 ppm. That is, 590 could actually be any value between 585.0 and 594.99. Thus, 2R is 10 ppm, so R is 5 ppm.
Step 2. The estimate of the standard deviation is

S' = 5/√3 = 5/1.732 = 2.89 ppm

Step 3. Use S' = 2.89 and X̄ = 590 to calculate the confidence interval (see Section 6.2.1) for the mean concentration from this well. This gives

590 ± (4.541)(2.89/√4) = (583.4, 596.6)

as the 98% confidence interval for the average concentration. Note that 4.541 is the 99th percentile from the t-distribution (Table 6, Appendix B) with 3 degrees of freedom, since the sample size was 4.
INTERPRETATION

When identical results are obtained from several different samples, the interpretation is that the data are not reported to enough significant figures to show the random differences. If there is no extrinsic evidence invalidating the data, the data are regarded as having resulted from rounding more precise results to the reported observations. The rounding is assumed to result in variability that follows the uniform distribution on the range ±R, where 2R is the smallest unit of reporting. This assumption is used to calculate a standard deviation for the observations that otherwise appear to have no variability.
REMARK

Assuming that the data are reported correctly to the units indicated, other distributions for the rounding variability could be assumed. The maximum standard deviation that could result from rounding when the observation is ±R is the value R.
SECTION 7
CONTROL CHARTS FOR INTRA-WELL COMPARISONS
The previous sections cover various situations where the compliance weTI
data are compared to the background well data or to specified concentration
Halts (ACL or MCL) to detect possible contamination. This section discusses
the case where the level of each constituent within a single uncontamlnated
well 1s being monitored over time. In essence, the data for each constituent
In each well are plotted on a time scale and Inspected for obvious features
such as trends or sudden changes In concentration levels. The method sug-
gested here 1s a combined Shewhart-CUSUM control chart for each well and
constituent.
The control chart method 1s recommended for uncontamlnated wells only,
when data comprising at least eight Independent samples over a one-year period
are available. This requirement 1s specified under current RCRA regulations
and applies to each constituent 1n each well.
As discussed in Section 2, a common sampling plan will obtain four independent samples from each well on a semiannual basis. With this plan a control chart can be implemented when one year's data are available. As a result of Monte Carlo simulations, Starks (1988) recommended at least four sampling periods at a unit of eight or more wells, and at least eight sampling periods at a unit with fewer than four wells.
The use of control charts can be an effective technique for monitoring the levels of a constituent at a given well over time. It also provides a visual means of detecting deviations from a "state of control." It is clear that plotting of the data is an important part of the analysis process. Plotting is an easy task, although time-consuming if many data sets need to be plotted. Advantage should be taken of graphics software, since plotting of time series data will be an ongoing process. New data points will be added to the already existing data base each time new data are available. The following few sections discuss, in general terms, the advantages of plotting time series data and the corrective steps one could take when seasonality is present in the data; finally, the detailed procedure for constructing a Shewhart-CUSUM control chart is presented, along with a demonstration of that procedure.
7.1 ADVANTAGES OF PLOTTING DATA
While analyzing the data by means of any of the appropriate statistical procedures discussed in earlier sections is recommended, we also recommend plotting the data. Each data point should be plotted against time using a time scale (e.g., month, quarter). A plot should be generated for each
constituent measured in each well. For visual comparison purposes, the scale should be kept identical from well to well for a given constituent.
Another important application of the plotting procedure is for detecting possible trends or drifts in the data from a given well. Furthermore, when visually comparing the plots from several wells within a unit, possible contamination of one rather than all downgradient wells could be detected, which would then warrant a closer look at that well. In general, graphs can provide highly effective illustrations of the time series, allowing the analyst to obtain a much greater sense of the data. Seasonal fluctuations or sudden changes, for example, may become quite evident, thereby supporting the analyst in his/her decision of which statistical procedure to use. General upward or downward trends, if present, can be detected, and the analyst can follow up with a test for trend, such as the nonparametric Mann-Kendall test (Mann, 1945; Kendall, 1975). If, in addition, seasonality is suspected, the user can perform the seasonal Kendall test for trend developed by Hirsch et al. (1982). The reader is also referred to Chapters 16, "Detecting and Estimating Trends," and 17, "Trends and Seasonality," of Gilbert's Statistical Methods for Environmental Pollution Monitoring (1987). In any of the above cases, the help of a professional statistician is recommended.
Another important use of data plots is that of identifying unusual data points (e.g., outliers). These points should then be investigated for possible QC problems or data entry errors, or to determine whether they are truly outliers.
Many software packages are available for computer graphics, developed for mainframes, mini-, or microcomputers. For example, SAS features an easy-to-use plotting procedure, PROC PLOT; where the hardware and software are available, a series of more sophisticated plotting routines can be accessed through SAS GRAPH. On microcomputers, most users have their own preferred graphics software, and no recommendation will be made as to the most appropriate package. The plots shown in this document were generated using LOTUS 1-2-3.
Once the data for each constituent and each well are plotted, the plots should be examined for seasonality, and a correction is recommended should seasonality be present. A fairly simple-to-use procedure for deseasonalizing data is presented in the following paragraphs.
7.2 CORRECTING FOR SEASONALITY
A necessary precaution before constructing a control chart is to take into account seasonal variation of the data, to minimize the chance of mistaking seasonal effects for evidence of well contamination. Seasonal variation could result from changes in chemical concentrations with recharge rates during different seasons throughout the years. If seasonality is present, then deseasonalizing the data prior to using the combined Shewhart-CUSUM control chart procedure is recommended.
Many approaches to deseasonalizing data exist. If the seasonal pattern is regular, it may be modeled with a sine or cosine function. Moving averages can be used, or differences (of order 12 for monthly data, for example) can be
used. However, time series models may involve rather complicated methods for deseasonalizing the data. A simpler method exists which should be adequate for the situations described in this document. It has the advantage of being easy to understand and apply, and of providing natural estimates of the monthly or quarterly effects via the monthly or quarterly means. The method proposed here can be applied to any seasonal cycle, typically an annual cycle for monthly or quarterly data.
NOTE
Corrections for seasonality should be used with great caution, as they represent extrapolation into the future. There should be a good scientific explanation for the seasonality, as well as good empirical evidence for it, before corrections are made. Larger than average rainfall for two or three Augusts in a row does not justify the belief that there will never be a drought in August, and this idea extends directly to ground-water quality. In addition, the quality (bias, robustness, and variance) of the estimates of the proper corrections must be considered even in cases where corrections are called for. If seasonality is suspected, the user might want to seek the help of a professional statistician.
PURPOSE
When seasonality is known to exist in a time series of concentrations, the data should be deseasonalized prior to constructing control charts, so that seasonal variation is taken into account rather than mistaken for evidence of contamination.
PROCEDURE
The following instructions to adjust a time series for seasonality are based on monthly data with a yearly cycle. The procedure can be easily modified to accommodate a yearly cycle of quarterly data.
Assume that N years of monthly data are available. Let X_ij denote the unadjusted observation for the ith month during the jth year.
Step 1. Compute the average concentration for month i over the N-year period:

X̄_i = (X_i1 + ... + X_iN)/N

This is the average of all observations taken in different years but during the same month. That is, calculate the mean concentration for all Januarys, then the mean for all Februarys, and so on for each of the 12 months.
Step 2. Calculate the grand mean, X̄, of all N×12 observations:

X̄ = (sum of all X_ij)/(12N) = (X̄_1 + X̄_2 + ... + X̄_12)/12
Step 3. Compute the adjusted concentrations:

X_ij (adjusted) = X_ij - X̄_i + X̄

Subtracting X̄_i removes the average effect of month i from the monthly data, and adding X̄, the overall mean, places the adjusted values about the same mean, X̄. It follows that the overall mean adjusted observation equals the overall mean unadjusted value, X̄.
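As a minimal illustration (not part of the guidance; the function name and the use of numpy are assumptions made here), the three steps can be written in Python as:

    import numpy as np

    def deseasonalize(x):
        """x: array of shape (N, 12) holding N years of monthly observations.
        Returns the adjusted values X_ij - monthly mean + grand mean."""
        monthly_means = x.mean(axis=0)   # Step 1: mean for each month over N years
        grand_mean = x.mean()            # Step 2: mean of all N*12 observations
        return x - monthly_means + grand_mean   # Step 3: adjusted concentrations

Applied to the 3-by-12 array of unadjusted concentrations in Table 7-1 below, this reproduces, up to rounding, the monthly adjusted concentrations shown in the last three columns of that table.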
EXAMPLE
Columns 2 through 4 of Table 7-1 show monthly unadjusted concentrations of a fictitious analyte over a 3-year period.
TABLE 7-1. EXAMPLE COMPUTATION FOR DESEASONALIZING DATA

             Unadjusted concentrations   3-year   Monthly adjusted concentrations
             1983     1984     1985      average  1983     1984     1985
January      1.99     2.01     2.15      2.05     2.11     2.13     2.27
February     2.10     2.10     2.17      2.12     2.14     2.15     2.21
March        2.12     2.17     2.27      2.19     2.10     2.15     2.25
April        2.12     2.13     2.23      2.16     2.13     2.14     2.24
May          2.11     2.13     2.24      2.16     2.12     2.13     2.25
June         2.15     2.18     2.26      2.20     2.12     2.15     2.23
July         2.19     2.25     2.31      2.25     2.11     2.16     2.23
August       2.18     2.24     2.32      2.25     2.10     2.16     2.24
September    2.16     2.22     2.28      2.22     2.11     2.17     2.22
October      2.08     2.13     2.22      2.14     2.10     2.16     2.24
November     2.05     2.08     2.19      2.11     2.11     2.14     2.25
December     2.08     2.16     2.22      2.16     2.09     2.17     2.23

Overall 3-year average = 2.17
Step 1. Compute the monthly averages across the 3 years. These values are shown in the fifth column of Table 7-1.
Step 2. The grand mean over the 3-year period is calculated to be 2.17.
Step 3. Within each month and year, subtract the average monthly concentration for that month and add the grand mean. For example, for January 1983, the adjusted concentration becomes

1.99 - 2.05 + 2.17 = 2.11

The adjusted concentrations are shown in the last three columns of Table 7-1. The reader can check that the average of all 36 adjusted concentrations equals 2.17, the average unadjusted concentration. Figure 7-1 shows the plot of the unadjusted and adjusted data. The raw data clearly exhibit seasonality as well as an upward trend, which is less evident by simply looking at the data table.
INTERPRETATION
As can be seen in Figure 7-1, seasonal effects were present in the data. After adjusting for monthly effects, the seasonality was removed, as can be seen in the adjusted data plotted in the same figure.
7.3 COMBINED SHEWHART-CUSUM CONTROL CHARTS FOR EACH WELL AND CONSTITUENT
Control charts are widely used as a statistical tool in industry as well as in research and development laboratories. The concept of control charts is relatively simple, which makes them attractive to use. From the population distribution of a given variable, such as concentrations of a given constituent, repeated random samples are taken at intervals over time. Statistics, for example the mean of replicate values at a point in time, are computed and plotted together with upper and/or lower predetermined limits on a chart where the x-axis represents time. If a result falls outside these boundaries, the process is declared to be "out of control"; otherwise, the process is declared to be "in control." The widespread use of control charts is due to their ease of construction and the fact that they can provide a quick visual evaluation of a situation, so that remedial action can be taken, if necessary.
In the context of ground-water monitoring, control charts can be used to monitor the inherent statistical variation of the data collected within a single well, and to flag anomalous results. Further investigation of data points lying outside the established boundaries will be necessary before any direct action is taken.
A control chart that can be used on a real-time basis must be constructed from a data set large enough to characterize the behavior of a specific well. It is recommended that data from a minimum of eight samples within a year be collected for each constituent at each well to permit an evaluation of the consistency of monitoring results with the current concept of the hydrogeology of the site. Starks (1988) recommends a minimum of four sampling periods at a unit with eight or more wells and a minimum of eight sampling periods at a unit with fewer than four wells. Once the control chart for the specific constituent at a given well is acceptable, then subsequent data
Figure 7-1. Time series of monthly observations (unadjusted, adjusted, and 3-year mean), January 1983 through September 1985.
points can be plotted on it to provide a quick evaluation as to whether the process is in control.
The standard assumptions in the use of control charts are that the data generated by the process, when it is in control, are independently (see Section 2.4.2) and normally distributed with a fixed mean μ and constant variance σ². The most important assumption is that of independence; control charts are not robust with respect to departures from independence (e.g., serial correlation; see glossary). In general, the sampling scheme will be such that the possibility of obtaining serially correlated results is minimized, as noted in Section 2. The assumption of normality is of somewhat less concern, but should be investigated before plotting the charts. A transformation (e.g., log transform, square root transform) can be applied to the raw data so as to obtain errors normally distributed about the mean. An additional situation that may decrease the effectiveness of control charts is seasonality in the data. The problem of seasonality can be handled by removing the seasonal effect from the data, provided that sufficient data to cover at least two seasons of the same type are available (e.g., 2 years for a monthly or quarterly seasonal effect). A procedure to correct a time series for seasonality was shown above in Section 7.2.
PURPOSE
Combined Shewhart-cumulative sum (CUSUM) control charts are constructed for each constituent at each well to provide a visual tool for detecting both trends and abrupt changes in concentration levels.
PROCEDURE
Assume that data from at least eight independent samples of monitoring are available to provide reliable estimates of the mean, μ, and standard deviation, σ, of the constituent's concentration levels in a given well.
Step 1. To construct a combined Shewhart-CUSUM chart, three parameters need to be selected prior to plotting:

h = a decision interval value
k = a reference value
SCL = Shewhart control limit (denoted by U in Starks (1988))

The parameter k of the CUSUM scheme is directly obtained from the value, D, of the displacement that should be quickly detected: k = D/2. It is recommended to select k = 1, which will allow a displacement of two standard deviations to be detected quickly.
When k is selected to be 1, the parameter h is usually set at a value of 4 or 5. The parameter h is the value against which the cumulative sum in the CUSUM scheme will be compared. In the context of ground-water monitoring, a value of h = 5 is recommended (Starks, 1988; Lucas, 1982).
The upper Shewhart limit is set at SCL = 4.5 in units of standard deviation. This combination of k = 1, h = 5, and SCL = 4.5 was found most appropriate for the application of combined Shewhart-CUSUM charts to ground-water monitoring (Starks, 1988).
Step 2. Assume that at time period T_i, n_i concentration measurements X_1, ..., X_ni are available. Compute their average, X̄_i.

Step 3. Calculate the standardized mean

Z_i = (X̄_i - μ)/(σ/√n_i)

where μ and σ are the mean and standard deviation obtained from prior monitoring at the same well (at least four sampling periods in a year).
Step 4. At each time period, T_i, compute the cumulative sum, S_i, as

S_i = max{0, (Z_i - k) + S_(i-1)}

where max{A, B} is the maximum of A and B, starting with S_0 = 0.

Step 5. Plot the values of S_i versus T_i on a time chart for this combined Shewhart-CUSUM scheme. Declare an "out-of-control" situation at sampling period T_i if, for the first time, S_i ≥ h or Z_i ≥ SCL. This will indicate probable contamination at the well, and further investigations will be necessary.
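The steps above can be expressed compactly in code. The following Python sketch (an illustration, not part of the regulation; the function name and defaults are assumptions made here) computes Z_i and S_i for a series of sampling-period means and flags out-of-control periods:

    import math

    def shewhart_cusum(means, n, mu, sigma, k=1.0, h=5.0, scl=4.5):
        """Return the sampling periods at which the combined scheme signals.
        means: per-period averages of n measurements each;
        mu, sigma: mean and standard deviation from prior monitoring."""
        S, flagged = 0.0, []
        for i, xbar in enumerate(means, start=1):
            z = (xbar - mu) / (sigma / math.sqrt(n))   # Step 3: standardized mean
            S = max(0.0, (z - k) + S)                  # Step 4: cumulative sum
            if S >= h or z >= scl:                     # Step 5: out-of-control test
                flagged.append(i)
        return flagged

Applied to the monthly means of Table 7-2 below (with n = 2, mu = 5.5, sigma = 0.4), this flags sampling periods 9 through 12, matching the example.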
REFERENCES
Lucas, J. M. 1982. "Combined Shewhart-CUSUM Quality Control Schemes." Journal of Quality Technology. Vol. 14, pp. 51-59.

Starks, T. H. 1988 (Draft). "Evaluation of Control Chart Methodologies for RCRA Waste Sites."

Hockman, K. K., and J. M. Lucas. 1987. "Variability Reduction Through Sub-vessel CUSUM Control." Journal of Quality Technology. Vol. 19, pp. 113-121.
EXAMPLE
The procedure is demonstrated on a set of carbon tetrachloride measurements taken monthly at a compliance well over a 1-year period. The monthly means of two measurements each (n_i = 2 for all i's) are presented in the third column of Table 7-2 below. Estimates of μ and σ, the mean and standard deviation of carbon tetrachloride measurements at that particular well, were obtained from a preceding monitoring period at that well: μ = 5.5 µg/L and σ = 0.4 µg/L.
TABLE 7-2. EXAMPLE DATA FOR COMBINED SHEWHART-CUSUM CHART.
CARBON TETRACHLORIDE CONCENTRATION (µg/L)

         Sampling    Mean            Standardized
Date     period T_i  concentration   mean Z_i      Z_i - k    CUSUM S_i
Jan 6     1          5.52             0.07         -0.93       0
Feb 3     2          5.60             0.35         -0.65       0
Mar 3     3          5.45            -0.18         -1.18       0
Apr 7     4          5.15            -1.24         -2.24       0
May 5     5          5.95             1.59          0.59       0.59
Jun 2     6          5.54             0.14         -0.86       0.00
Jul 7     7          5.49            -0.04         -1.04       0.00
Aug 4     8          6.08             2.05          1.05       1.05
Sep 1     9          6.91             4.99a         3.99       5.04b
Oct 6    10          6.78             4.53a         3.53       8.56b
Nov 3    11          6.71             4.28          3.28      11.84b
Dec 1    12          6.65             4.07          3.07      14.91b

Parameters: mean = 5.50; std = 0.4; k = 1; h = 5; SCL = 4.5.
a Indicates "out-of-control" process via the Shewhart control limit (Z_i > 4.5).
b CUSUM "out-of-control" signal (S_i > 5).
Step 1. The three parameters necessary to construct a combined Shewhart-CUSUM chart were selected as h = 5, k = 1, and SCL = 4.5 in units of standard deviation.

Step 2. The monthly means are presented in the third column of Table 7-2.

Step 3. Standardize the means within each sampling period. These computations are shown in the fourth column of Table 7-2. For example,

Z_1 = (5.52 - 5.50)·√2/0.4 = 0.07
Step 4. Compute the quantities S_i, i = 1, ..., 12. For example,

S_1 = max{0, -0.93 + 0} = 0
S_2 = max{0, -0.65 + 0} = 0
S_5 = max{0, 0.59 + S_4} = max{0, 0.59 + 0} = 0.59
S_6 = max{0, -0.86 + S_5} = max{0, -0.86 + 0.59} = max{0, -0.27} = 0

etc.
These quantities are shown in the last column of Table 7-2.

Step 5. Construct the control chart. The y-axis is in units of standard deviations. The x-axis represents time, or the sampling periods. For each sampling period, T_i, record the values of Z_i and S_i. Draw horizontal lines at the values h = 5 and SCL = 4.5. These two lines represent the upper control limit for the CUSUM scheme and the Shewhart control limit, respectively. The chart for this example data set is shown in Figure 7-2.

The combined chart indicates statistically significant evidence of contamination starting at sampling period T_9. Both the CUSUM scheme and the Shewhart control limit were exceeded, by S_9 and Z_9, respectively. Investigation of the situation should begin to confirm contamination, and action should be required to bring the variability of the data back to its previous level.
INTERPRETATION
The combined Shewhart-CUSUM control scheme was applied to an example data set of carbon tetrachloride measurements taken on a monthly basis at a well. The statistic used in the construction of the chart was the mean of two measurements per sampling period. (It should be noted that this method can be used on an individual measurement as well, in which case n_i = 1.) Estimates of the mean and standard deviation of the measurements were available from previous data collected at that well over at least four sampling periods.

The parameters of the combined chart were selected to be k = 1 unit, the reference value or allowable slack for the process; h = 5 units, the decision interval for the CUSUM scheme; and SCL = 4.5 units, the upper Shewhart control limit. All parameters are in units of σ, the standard deviation obtained from the previous monitoring results. Various combinations of parameter values can be selected. The particular values recommended here appear to be the best for the initial use of the procedure, based on a review of the simulations and recommendations in the references. A discussion of this subject is given by Lucas (1982), Hockman and Lucas (1987), and Starks (1988). The choice of the parameters h and k of a CUSUM chart is based on the desired performance of the chart. The criterion used to evaluate a control scheme is the average number of samples or time periods before an out-of-control signal is obtained. This criterion is denoted by ARL, or average run length. The ARL should be large when the mean concentration of a hazardous constituent is near its target value, and small when the mean has shifted too far from the target. Tables have been developed by simulation methods to estimate ARLs for given combinations of the parameters (Lucas, 1982; Hockman and Lucas, 1987; Starks, 1988). The user is referred to these articles for further reading.
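Because the ARL is the expected number of sampling periods until a signal, it can also be estimated by direct simulation. The sketch below is an illustration under the assumptions stated in its comments (names are chosen here; it is not a reproduction of the referenced tables):

    import random

    def estimate_arl(shift, k=1.0, h=5.0, scl=4.5, trials=1000):
        """Monte Carlo estimate of the average run length: the mean number of
        sampling periods until S_i >= h or Z_i >= SCL, when the standardized
        mean Z_i is normal with mean `shift` and unit variance."""
        total = 0
        for _ in range(trials):
            S, periods = 0.0, 0
            while True:
                periods += 1
                z = random.gauss(shift, 1.0)
                S = max(0.0, (z - k) + S)
                if S >= h or z >= scl:
                    break
            total += periods
        return total / trials

    print(estimate_arl(2.0))   # large shift: short run length (roughly 5 periods)
    print(estimate_arl(1.0))   # smaller shift: noticeably longer run length

(The in-control ARL, with shift = 0, is very large for these parameter values and is slow to simulate; the published tables are the practical source for that case.)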
7.4 UPDATE OF A CONTROL CHART
The control chart is based on preselected performance parameters as well as on estimates of μ and σ, the parameters of the distribution of the measurements in question. As monitoring continues and the process is found to be in control, these parameters need periodic updating so as to incorporate this new information into the control charts. Starks (1988) has suggested that, in
Figure 7-2. Combined Shewhart-CUSUM chart (standardized mean Z_i and CUSUM S_i, in units of standard deviation, versus sampling period; h = 5; SCL = 4.5).
general, adjustments in sample means and standard deviations be made after sampling periods 4, 8, 12, 20, and 32 following the initial monitoring period, which is recommended to be at least eight sampling periods. Also, the performance parameters h, k, and SCL would need to be updated. The author suggests that h = 5, k = 1, and SCL = 4.5 be kept at those values for the first 12 sampling periods following the initial monitoring plan, and that k be reduced to 0.75 and SCL to 4.0 for all subsequent sampling periods. These values and sampling period numbers are not mandatory. In the event of an out-of-control state or a trend, the control chart should not be updated.
7.5 NONDETECTS IN A CONTROL CHART
Regulations require that four independent water samples be taken at each well at a given sampling period. The mean of the four concentration measurements of a particular constituent is used in the construction of a control chart. Situations will arise when the concentration of a constituent is below the detection limit for one or more samples. The following approach is suggested for treating nondetects when plotting control charts, and is illustrated in the sketch after these rules.

If only one of the four measurements is a nondetect, then replace it with one half of the detection limit (MDL/2) or with one half of the practical quantitation limit (PQL/2) and proceed as described in Section 7.3.

If either two or three of the measurements are nondetects, use only the quantitated values (two or one, respectively) for the control chart and proceed as discussed earlier in Section 7.3.

If all four measurements are nondetects, then use one half of the detection limit or practical quantitation limit as the value for the construction of the control chart. This is an obvious situation of no contamination of the well.
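A minimal sketch of these rules in Python (illustrative only; the function and variable names are assumptions made here, and dl stands for the MDL or PQL as appropriate):

    def plotting_value(measurements, detected, dl):
        """measurements: the four values for one sampling period;
        detected: parallel list of booleans (False = nondetect);
        dl: detection limit (MDL) or practical quantitation limit (PQL)."""
        quantified = [v for v, d in zip(measurements, detected) if d]
        if len(quantified) == 4:        # no nondetects: ordinary mean
            return sum(quantified) / 4
        if len(quantified) == 3:        # one nondetect: substitute dl/2
            return (sum(quantified) + dl / 2) / 4
        if len(quantified) in (1, 2):   # two or three nondetects:
            return sum(quantified) / len(quantified)  # quantified values only
        return dl / 2                   # all four below detection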
In the event that a control chart requires updating and a certain proportion of the measurements is below the detection limit, then adjust the mean and standard deviation necessary for the control chart by using Cohen's method, described in Section 8.1.3. In that case, the proportion of nondetects applies to the pool of data available at the time of the updating and would include all nondetects up to that time, not just the four measurements taken at the last sampling period.
CAUTIONARY NOTE: Control charts are a useful supplement to other statistical techniques because they are graphical and simple to use. However, it is inappropriate to construct a control chart on wells that have shown evidence of contamination or an increasing trend (see §264.97(a)(1)(i)). Further, contamination may not be present in a well in the form of a steadily increasing concentration profile; it may be present intermittently or may increase in a step function. Therefore, the absence of an increasing trend does not necessarily prove that a release has not occurred.
SECTION 8
MISCELLANEOUS TOPICS
This chapter contains a variety of special topics that are relatively short and self-contained. These topics include methods to deal with data below the limit of detection and methods to check for, and deal with, outliers or extreme values in the data.
8.1 LIMIT OF DETECTION
In a chemical analysis, some compounds may be below the detection limit (DL) of the analytical procedure. These are generally reported as not detected (rather than as zero or not present), and the appropriate limit of detection is usually given. Data that include not-detected results are a special case referred to as censored data in the statistical literature. For compounds not detected, the concentration of the compound is not known. Rather, it is only known that the concentration of the compound is less than the detection limit.
There are a variety of ways to deal with data that include values below detection. There is no general procedure that is applicable in all cases. However, there are some general guidelines that usually prove adequate. If these do not cover a specific situation, the user should consult a professional statistician for the most appropriate way to deal with the values below detection.
A summary of suggested approaches to deal with data below the detection limit is presented as Table 8-1. The method suggested depends on the amount of data below the detection limit. For small amounts of below-detection values, simply replacing an "ND" (not detected) report with a small number, say the detection limit divided by two, and proceeding with the usual analysis is satisfactory. For moderate amounts of below-detection-limit data, a more detailed adjustment is appropriate, while for large amounts one may need to consider only whether a compound was detected or not as the variable of analysis.
The meaning of small, moderate, and large above is subject to judgment. Table 8-1 contains some suggested values. It should be recognized that these values are not hard and fast rules, but are based on judgment. If there is a question about how to handle values below detection, consult a statistician.
TABLE 8-1. METHODS FOR BELOW DETECTION LIMIT VALUES

Percentage of nondetects    Statistical                     Section of
in the data base            analysis method                 guidance document

Less than 15%               Replace NDs with MDL/2          Section 8.1.1
                            or PQL/2, then proceed
                            with parametric procedures:
                              ANOVA                         Section 5.2.1
                              Tolerance limits              Section 5.3
                              Prediction intervals          Section 5.4
                              Control charts                Section 7

Between 15% and 50%         Use NDs as ties, then           Section 5.2.2
                            proceed with nonparametric
                            ANOVA, or
                            use Cohen's adjustment,         Section 8.1.3
                            then proceed with:
                              Tolerance limits              Section 5.3
                              Confidence intervals          Section 6.2.1
                              Control charts                Section 7

More than 50%               Test of proportions             Section 8.1.2
It should be noted that the nonparametric methods presented earlier automatically deal with values below detection by regarding them as all tied at a level below any quantitated results. The nonparametric methods may be used if there is a moderate amount of data below detection. If the proportion of nonquantified values in the data exceeds 25%, these methods should be used with caution. They should probably not be used if less than half of the data consists of quantified concentrations.
8.1.1 The DL/2 Method
The amount of data that are below detection plays an important role in selecting the method to deal with the limit-of-detection problem. If a small proportion of the observations are not detected, these may be replaced with a small number, usually the method detection limit divided by 2 (MDL/2), and the usual analysis performed. This is the recommended method for use with the analysis of variance procedures of Section 5.2.1. Seek professional help if in doubt about dealing with values below the detection limit. The results of the analysis are generally not sensitive to the specific choice of the replacement number.
As a guideline, if 15% or fewer of the values are not detected, replace them with the method detection limit divided by two and proceed with the appropriate analysis using these modified values, as in the sketch below. Practical quantitation limits (PQL) for Appendix IX compounds were published by EPA in the Federal Register (Vol. 52, No. 131, July 9, 1987, pp. 25947-25952). These give practical quantitation limits by compound and analytical method that may be used in replacing a small amount of nondetected data with the quantitation limit divided by 2. If approved by the Regional Administrator, site-specific PQLs may be used in this procedure. If more than 15% of the values are reported as not detected, it is preferable to use a nonparametric method or a test of proportions.
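As a brief illustration (a sketch with names chosen here, not a prescribed implementation), the substitution can be written in Python as:

    def replace_nondetects(values, nd_flags, mdl):
        """Replace each nondetect with MDL/2 (or PQL/2) before the usual
        parametric analysis; values and nd_flags are parallel lists."""
        return [mdl / 2 if nd else v for v, nd in zip(values, nd_flags)]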
8.1.2 Test of Proportions
If more than 50% of the data are below detection but at least 10% of the observations are quantified, a test of proportions may be used to compare the background well data with the compliance well data. Clearly, if none of the background well observations were above the detection limit, but all of the compliance well observations were above the detection limit, one would suspect contamination. In general, the difference may not be as obvious. However, a higher proportion of quantitated values in compliance wells could provide evidence of contamination. The test of proportions is a method to determine whether a difference in the proportion of detected values in the background well observations and compliance well observations provides statistically significant evidence of contamination.
The test of proportions should be used when the proportion of quantified values is small to moderate (i.e., between 10% and 50%). If very few quantified values are found, a method based on the Poisson distribution may be used as an alternative approach. A method based on a tolerance limit for the number of detected compounds and the maximum concentration found for any detected compound has been proposed by Gibbons (1988). This alternative would
be appropriate when the number of detected compounds is quite small relative to the number of compounds analyzed for, as might occur in detection monitoring.
PURPOSE

The test of proportions determines whether the proportion of compounds detected in the compliance well data differs significantly from the proportion of compounds detected in the background well data. If there is a significant difference, this is statistically significant evidence of contamination.
PROCEDURE
The procedure uses the normal distribution approximation to the binomial distribution. This assumes that the sample size is reasonably large. Generally, if the proportion of detected values is denoted by P, and the sample size is n, then the normal approximation is adequate provided that nP and n(1-P) are both greater than or equal to 5.

Step 1. Determine X, the number of background well samples in which the compound was detected. Let n be the total number of background well samples analyzed. Compute the proportion of detects:

P_U = x/n
Step 2. Determine Y, the number of compliance well samples in which the compound was detected. Let m be the total number of compliance well samples analyzed. Compute the proportion of detects:

P_D = y/m

Step 3. Compute the standard error of the difference in proportions:

S_D = {[(x+y)/(n+m)]·[1 - (x+y)/(n+m)]·[1/n + 1/m]}^(1/2)

and form the statistic:

Z = (P_D - P_U)/S_D

Step 4. Compare the statistic Z to the critical value from the standard normal distribution (1.96 at the 5% significance level). If the absolute value of Z exceeds the critical value, the proportions differ significantly; a significantly higher proportion of detects in the compliance wells provides statistically significant evidence of contamination.
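The following Python sketch (illustrative; the function name and the fixed 5% critical value are assumptions made here) carries out Steps 1 through 4:

    import math

    def test_of_proportions(x, n, y, m, z_crit=1.96):
        """x of n background samples and y of m compliance samples detected."""
        p_u, p_d = x / n, y / m        # Steps 1 and 2: proportions of detects
        p = (x + y) / (n + m)          # pooled proportion of detects
        sd = math.sqrt(p * (1 - p) * (1 / n + 1 / m))   # Step 3: standard error
        z = (p_d - p_u) / sd
        return z, abs(z) > z_crit      # Step 4: compare to critical value

    z, significant = test_of_proportions(8, 24, 24, 64)
    print(round(z, 2), significant)    # 0.36 False; the example's hand
                                       # computation with rounded values gives 0.37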
TABLE 8-2. EXAMPLE DATA FOR A TEST OF PROPORTIONS

Cadmium concentration (µg/L)

Background well (24 samples):
0.1    BDL    0.12   BDL    BDL    BDL    0.26   BDL
BDL    0.1    BDL    0.014  BDL    BDL    BDL    BDL
BDL    0.12   BDL    0.21   BDL    0.12   BDL    BDL

Compliance wells (64 samples):
0.12   0.08   BDL    0.2    BDL    0.1    BDL    0.012
BDL    BDL    BDL    BDL    BDL    0.12   0.07   BDL
0.19   BDL    0.1    BDL    0.01   BDL    BDL    BDL
BDL    BDL    0.11   0.06   BDL    0.23   BDL    0.11
BDL    0.031  BDL    BDL    BDL    BDL    BDL    0.12
0.08   BDL    0.26   BDL    0.02   BDL    0.024  BDL
BDL    BDL    BDL    BDL    0.1    0.04   BDL    BDL
0.1    BDL    0.01   BDL    BDL    BDL    BDL    BDL

BDL means below detection limit.
Step 1. Estimate the proportion above detection in the background wells. As shown in Table 8-2, there were 24 samples from background wells analyzed for cadmium, so n = 24. Of these, 16 were below detection and x = 8 were above detection, so P_U = 8/24 = 0.333.

Step 2. Estimate the proportion above detection in the compliance wells. There were 64 samples from compliance wells analyzed for cadmium, with 40 below detection and 24 detected values. This gives m = 64 and y = 24, so P_D = 24/64 = 0.375.
Step 3. Calculate the standard error of the difference in proportions:

S_D = {[(8+24)/(24+64)]·[1 - (8+24)/(24+64)]·[1/24 + 1/64]}^(1/2) = 0.115
Step 4. Form the statistic Z and compare it to the normal distribution:

Z = (0.375 - 0.333)/0.115 = 0.37

which is less in absolute value than the critical value from the normal distribution, 1.96. Consequently, there is no statistically significant evidence that the proportion of samples with cadmium levels above the detection limit differs between the background well and compliance well samples.
INTERPRETATION
Since the proportion of water samples with detected amounts of cadmium in the compliance wells was not significantly different from that in the background wells, the data are interpreted to provide no evidence of contamination. Had the proportion of samples with detectable levels of cadmium in the compliance wells been significantly higher than that in the background wells, this would have been evidence of contamination. Had the proportion been significantly higher in the background wells, additional study would have been required. This could indicate that contamination was migrating from an off-site source, or it could mean that the hydraulic gradient had been incorrectly estimated or had changed, and that contamination was occurring from the facility but the ground-water flow was not in the direction originally estimated. Mounding of contaminants in the ground water near the background wells could also be a possible explanation of this observation.
8.1.3 Cohen's Method
If a confidence interval or a tolerance interval based upon the normal distribution is being constructed, a technique presented by Cohen (1959) specifies a method to adjust the sample mean and sample standard deviation to account for data below the detection limit. The only requirements for the use of this technique are that the data be normally distributed and that the detection limit always be the same. This technique is demonstrated below.
PURPOSE
Cohen's method provides estimates of the sample mean and standard deviation when some (< 50%) observations are below detection. These estimates can then be used to construct tolerance, confidence, or prediction intervals.
PROCEDURE
Let n be the total number of observations, m represent the number of data points above the detection limit (DL), and X_i represent the ith constituent value above the detection limit.

Step 1. Compute the sample mean, x̄_d, from the data above the detection limit as follows:

x̄_d = (X_1 + ... + X_m)/m

Step 2. Compute the sample variance, s_d², from the data above the detection limit as follows:

s_d² = [(X_1 - x̄_d)² + ... + (X_m - x̄_d)²]/(m - 1)

Step 3. Compute the two parameters, h and γ (lowercase gamma), as follows:

h = (n - m)/n

γ = s_d²/(x̄_d - DL)²

where n is the total number of observations (i.e., above and below the detection limit) and DL is the detection limit.
These values are then used to determine the value of the parameter λ from Table 7 in Appendix B.
Step 4. Estimate the corrected sample mean, which accounts for the data below the detection limit, as follows:

x̄ = x̄_d - λ(x̄_d - DL)

Step 5. Estimate the corrected sample standard deviation, which accounts for the data below the detection limit, as follows:

S = [s_d² + λ(x̄_d - DL)²]^(1/2)
Step 6. Use the corrected values of x̄ and S in the procedure for constructing a tolerance interval (Section 5.3) or a confidence interval (Section 6.2.1).
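For computation, Steps 1 through 5 can be scripted; only the table lookup for λ remains manual (or can be interpolated, as in the remark following the example). A minimal Python sketch, with names chosen here for illustration:

    def cohen_adjust(detected_values, dl, lam):
        """detected_values: the m observations above the detection limit dl;
        lam: lambda read from Table 7, Appendix B, at
        h = (n - m)/n and gamma = s_d^2/(xbar_d - dl)^2."""
        m = len(detected_values)
        xbar_d = sum(detected_values) / m
        s2_d = sum((x - xbar_d) ** 2 for x in detected_values) / (m - 1)
        xbar = xbar_d - lam * (xbar_d - dl)               # Step 4: corrected mean
        s = (s2_d + lam * (xbar_d - dl) ** 2) ** 0.5      # Step 5: corrected SD
        return xbar, s

With the sulfate data of Table 8-3 below (m = 21, dl = 1,450, lam = 0.14986), this returns approximately (1,723.7, 155.3), matching the example.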
REFERENCE
Cohen, A. C., Jr. 1959. "Simplified Estimators for the Normal Distribution When Samples Are Singly Censored or Truncated." Technometrics. Vol. 1, pp. 217-237.
EXAMPLE
Table 8-3 contains data on sulfate concentrations. Three observations of the 24 were below the detection limit of 1,450 mg/L and are denoted by "< 1,450" in the table.
TABLE 8-3. EXAMPLE DATA FOR COHEN'S TEST

Sulfate concentration (mg/L)

1,850     1,760     < 1,450   1,710
1,575     1,475     1,780     1,790
1,780     < 1,450   1,790     1,800
< 1,450   1,800     1,840     1,820
1,860     1,780     1,760     1,800
1,900     1,770     1,790     1,780

DL = 1,450 mg/L

Note: A symbol "<" before a number indicates that the value
is not detected. The number following is then the limit of
detection.
Step 1. Calculate the mean from the m = 21 values above detection:

x̄_d = 1,771.9

Step 2. Calculate the sample variance from the 21 quantified values:

s_d² = 8,593.69

Step 3. Determine

h = (24 - 21)/24 = 0.125

and

γ = 8,593.69/(1,771.9 - 1,450)² = 0.083

Enter Table 7 of Appendix B at h = 0.125 and γ = 0.083 to determine the value of λ. Since the table does not contain these entries exactly, double linear interpolation was used to estimate λ = 0.14986.
REMARK

For the interested reader, the details of the double linear interpolation are provided below.

The values from Table 7 between which the user needs to interpolate are:

              h = 0.10     h = 0.15
γ = 0.05      0.11431      0.17935
γ = 0.10      0.11804      0.18479
There are 0.025 units between 0.10 and 0.125 on the h-scale, and 0.05 units between 0.10 and 0.15. Therefore, the value of interest (0.125) lies (0.025/0.05 × 100) = 50% of the distance along the interval between 0.10 and 0.15. To linearly interpolate between the tabulated values on the h-axis, the range between the values must be calculated, the value that is 50% of the distance along the range must be computed, and then that value must be added to the lower of the tabulated values. The result is the interpolated value. The interpolated points on the h-scale for the current example are:

0.17935 - 0.11431 = 0.06504;  0.06504 × 0.50 = 0.03252
0.11431 + 0.03252 = 0.14683

0.18479 - 0.11804 = 0.06675;  0.06675 × 0.50 = 0.033375
0.11804 + 0.033375 = 0.151415
On the γ-axis there are 0.033 units between 0.05 and 0.083, and 0.05 units between 0.05 and 0.10. The value of interest (0.083) lies
(0.033/0.05 × 100) = 66% of the distance along the interval between 0.05 and 0.10. The interpolated point on the γ-axis is:

0.151415 - 0.14683 = 0.004585;  0.004585 × 0.66 = 0.0030261
0.14683 + 0.0030261 = 0.14986

Thus, λ = 0.14986.
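The same double linear interpolation can be written generically. The sketch below (names are illustrative choices) reproduces the hand computation:

    def double_linear_interp(h, g, h0, h1, g0, g1, f00, f01, f10, f11):
        """f00 = table value at (h0, g0), f10 at (h1, g0),
        f01 at (h0, g1), f11 at (h1, g1)."""
        th = (h - h0) / (h1 - h0)       # fractional distance along h
        tg = (g - g0) / (g1 - g0)       # fractional distance along gamma
        fa = f00 + th * (f10 - f00)     # interpolate along h at gamma = g0
        fb = f01 + th * (f11 - f01)     # interpolate along h at gamma = g1
        return fa + tg * (fb - fa)      # interpolate along gamma

    print(double_linear_interp(0.125, 0.083, 0.10, 0.15, 0.05, 0.10,
                               0.11431, 0.11804, 0.17935, 0.18479))  # 0.14986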
Steps 4 and 5. The corrected sample mean and standard deviation are then estimated as follows:

x̄ = 1,771.9 - 0.14986 (1,771.9 - 1,450) = 1,723.66

S = [8,593.69 + 0.14986 (1,771.9 - 1,450)²]^(1/2) = 155.31

Step 6. These modified estimates of the mean, x̄ = 1,723.66, and of the standard deviation, S = 155.31, would be used in the tolerance or confidence interval procedure. For example, if the sulfate concentrations represent background at a facility, the upper 95% tolerance limit becomes

1,723.7 + (155.3)(2.309) = 2,082.3 mg/L

Observations from compliance wells in excess of 2,082 mg/L would give statistically significant evidence of contamination.
INTERPRETATION
Cohen's method provides maximum likelihood estimates of the mean and variance of a censored normal distribution; that is, of observations that follow a normal distribution except for those below a limit of detection, which are reported as "not detected." The modified estimates reflect the fact that the not-detected observations are below the limit of detection, but not necessarily zero. The large-sample properties of the modified estimates allow them to be used with the normal theory procedures as a means of adjusting for not-detected values in the data. Use of Cohen's method in more complicated calculations, such as those required for analysis of variance procedures, requires special consideration from a professional statistician.
8.2 OUTLIERS
A ground-water constituent concentration value that is much different from most other values in a data set for the same ground-water constituent can be referred to as an "outlier." Possible reasons for outliers can be:

A catastrophic unnatural occurrence such as a spill;

Inconsistent sampling or analytical chemistry methodology that may result in laboratory contamination or other anomalies;

Errors in the transcription of data values or decimal points; and
True but extreme ground-water constituent concentration measurements.
There are several tests to determine whether there is statistical evidence that an observation is an outlier. The reference for the test presented here is ASTM paper E178-75.
PURPOSE
The purpose of a test for outliers is to determine whether there is statistical evidence that an observation that appears extreme does not fit the distribution of the rest of the data. If a suspect observation is identified as an outlier, then steps need to be taken to determine whether it is the result of an error or a valid extreme observation.
PROCEDURE
Let the sample of observations of a hazardous constituent of ground water be denoted by X_1, ..., X_n. For specificity, assume that the data have been ordered and that the largest observation, denoted by X_n, is suspected of being an outlier. Generally, inspection of the data suggests values that do not appear to belong to the data set. For example, if the largest observation is an order of magnitude larger than the other observations, it would be suspect.
Step 1. Calculate the mean, X̄, and the standard deviation, S, of the data, including all observations.

Step 2. Form the statistic, T_n:

T_n = (X_n - X̄)/S

Note that T_n is the difference between the largest observation and the sample mean, divided by the sample standard deviation.
Step 3. Compare the statistic T_n to the critical value given the sample size, n, in Table 8 in Appendix B. If the T_n statistic exceeds the critical value from the table, this is evidence that the suspect observation, X_n, is a statistical outlier.
Step 4. If the value is identified as an outlier, one of the actions outlined below should be taken. (The appropriate action depends on what can be learned about the observation.) The records of the sampling and analysis of the sample that led to it should be investigated to determine whether the outlier resulted from an error that can be identified.

If an error (in transcription, dilution, analytical procedure, etc.) can be identified and the correct value recovered, the observation should be replaced by its corrected value and the appropriate statistical analysis done with the corrected value.
If it can be determined that the observation is in error but the correct value cannot be determined, then the observation should be deleted from the data set and the appropriate statistical analysis performed. The fact that the observation was deleted, and the reason for its deletion, should be reported when reporting the results of the statistical analysis.

If no error in the value can be documented, then it must be assumed that the observation is a true but extreme value. In this case it must not be altered. It may be desirable to obtain another sample to confirm the observation. However, analysis and reporting should retain the observation and state that no error was found in tracing the sample that led to the extreme observation.
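A short Python sketch of the computation (illustrative; the critical value must still be taken from Table 8, Appendix B, for the sample size at hand):

    import statistics

    def outlier_statistic(data):
        """T_n for the largest observation: (max - mean) / standard deviation,
        computed with the suspect value included."""
        xbar = statistics.mean(data)
        s = statistics.stdev(data)
        return (max(data) - xbar) / s

For the TOC data of Table 8-4 below, outlier_statistic returns about 3.74, which exceeds the tabulated critical value of 2.532 for n = 19.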
EXAMPLE
Table 8-4 contains 19 values of total organic carbon (TOC) that were obtained from a monitoring well. Inspection shows one value, 11,000 mg/L, which is nearly an order of magnitude larger than most of the other observations. It is a suspected outlier.

Step 1. Calculate the mean and standard deviation of the data:

X̄ = 2,300 and S = 2,325.9
TABLE 8-4. EXAMPLE DATA FOR TESTING FOR AN OUTLIER

Total organic carbon (mg/L)

1,700    1,900    1,500    1,300    11,000
1,250    1,000    1,300    1,200    1,450
1,000    1,300    1,000    2,200    4,900
3,700    1,600    2,500    1,900
Step 2. Calculate the statistic T_19:

T_19 = (11,000 - 2,300)/2,325.9 = 3.74

Step 3. Referring to Table 8 of Appendix B for the upper 5% significance level, with n = 19, the critical value is 2.532. Since the value of the statistic T_19 = 3.74 is greater than 2.532, there is statistical evidence that the largest observation is an outlier.
Step 4. In this case, tracing the data revealed that the unusual value of 11,000 resulted from a keying error and that the correct value was 1,100. This correction was then made in the data.
INTERPRETATION
An observation that is 4 or 5 times as large as the rest of the data is generally viewed with suspicion. An observation that is an order of magnitude different could arise from a common error of misplacing a decimal. The test for an outlier provides a statistical basis for determining whether an observation is statistically different from the rest of the data. If it is, then it is a statistical outlier. However, a statistical outlier may not be dropped or altered just because it has been identified as an outlier. The test provides a formal identification of an observation as an outlier, but it does not identify the cause of the difference.
Whether or not a statistical test is done, any suspect data point should be checked. An observation may be corrected or dropped only if it can be determined that an error has occurred. If the error can be identified and corrected (as in transcription or keying), the correction should be made and the corrected values used. A value that is demonstrated to be incorrect may be deleted from the data. However, if no specific error can be documented, the observation must be retained in the data. Identification of an observation as an outlier, but with no error documented, could be used to suggest resampling to confirm the value.
APPENDIX A
GENERAL STATISTICAL CONSIDERATIONS AND
GLOSSARY OF STATISTICAL TERMS
STATISTICAL CONSIDERATIONS
FALSE ALARMS OR TYPE I ERRORS
The statistical analysis of data from ground-water monitoring at RCRA sites has as its goal the determination of whether the data provide evidence of the presence of, or an increase in the level of, contamination. In the case of detection monitoring, the goal of the statistical analysis is to determine whether statistically significant evidence of contamination exists. In the case of compliance monitoring, the goal is to determine whether statistically significant evidence of concentration levels exceeding compliance limits exists. In monitoring sites in corrective action, the goal is to determine whether levels of the hazardous constituents are still above compliance limits or have been reduced to, at, or below the compliance limit.
These questions are addressed by the use of hypothesis tests. In the case of detection monitoring, it is hypothesized that a site is not contaminated; that is, the hazardous constituents are not present in the ground water. Samples of the ground water are taken and analyzed for the constituents in question. A hypothesis test is used to decide whether the data indicate the presence of the hazardous constituent. The test consists of calculating one or more statistics from the data and comparing the calculated results to some prespecified critical levels.
In performing a statistical test, there are four possible outcomes. Two of the possible outcomes result in the correct decision: (a) the test may correctly indicate that no contamination is present, or (b) the test may correctly indicate the presence of contamination. The other two possibilities are errors: (c) the test may indicate that contamination is present when in fact it is not, or (d) the test may fail to detect contamination when it is present.
If the stated hypothesis is that no contamination is present (usually called the null hypothesis) and the test indicates that contamination is present when in fact it is not, this is called a Type I error. Statistical hypothesis tests are generally set up to control the probability of Type I error to be no more than a specified value, called the significance level and usually denoted by α. Thus in detection monitoring, the null hypothesis would be that the level of each hazardous constituent is zero (or at least below detection). The test would reject this hypothesis if some measure of concentration were too large, indicating contamination. A Type I error would be a false alarm or a triggering event that is inappropriate.
In compliance monitoring, the null hypothesis is that the level of each hazardous constituent is less than or equal to the appropriate compliance
limit. For the purpose of setting up the statistical procedure, the simple null hypothesis that the level is equal to the compliance limit would be used. As in detection monitoring, the test would indicate contamination if some measure of concentration is too large. A false alarm or Type I error would occur if the statistical procedure indicated that levels exceed the appropriate compliance limits when, in fact, they do not. Such an error would be a false alarm in that it would indicate falsely that compliance limits were being exceeded.
PROBABILITY OF DETECTION AND TYPE II ERROR
The other type of error that can occur is called a Type II error. It occurs if the test fails to detect contamination that is present. Thus a Type II error is a missed detection. While the probability of a Type I error can be specified, since it is the probability that the test will give a false alarm, the probability of a Type II error depends on several factors, including the statistical test, the sample size, and the significance level or probability of Type I error. In addition, it depends on the degree of contamination present. In general, the probability of a Type II error decreases as the level of contamination increases. Thus a test may be likely to miss low levels of contamination, less likely to miss moderate contamination, and very unlikely to miss high levels of contamination.
One can discuss the probability of a Type II error as the probability of a missed detection, or one can discuss its complement (one minus the probability of a Type II error). The complement, or probability of detection, is also called the power of the test. It depends on the magnitude of the contamination, so that the power or probability of detecting contamination increases with the degree of contamination.

If the probability of a Type I error is specified, then for a given statistical test the power depends on the sample size and the alternative of interest. In order to specify a desired power or probability of detection, one must specify the alternative that should be detected. Since generally the power will increase as the alternative differs more and more from the null hypothesis, one usually tries to specify the alternative that is closest to the null hypothesis, yet enough different that it is important to detect.
In the detection monitoring situation, the null hypothesis is that the concentration of the hazardous constituent is zero (or at least below detection). In this case the alternative of interest is that there is a concentration of the hazardous constituent that is above the detection limit and is large enough so that the monitoring procedure should detect it. Since it is a very difficult problem to select a concentration of each hazardous constituent that should be detectable with specified power, a more useful approach is to determine the power of a test at several alternatives and decide whether the procedure is acceptable on the basis of this power function, rather than on the power against a single alternative.
In order to increase the power, a larger sample must be taken. This would mean sampling at more frequent intervals. There is a limit to how much can be achieved, however. In cases with limited water flow, it may not be possible to sample wells as frequently as desired. If samples close together
in time prove to be correlated, this correlation reduces the information available from the different samples. The additional cost of sampling and analysis will also impose practical limitations on the sample size that can be used.

Additional wells could also be used to increase the performance of the test. The additional monitoring wells would primarily be helpful in ensuring that a plume would not escape detection by missing the monitoring wells. However, in some situations the additional wells would contribute to a larger sample size and so improve the power.
In compliance monitoring the emphasis is on determining whether additional contamination has occurred, raising the concentration above a compliance limit. If the compliance limit is determined from the background well levels, the null hypothesis is that the difference between the background and compliance well concentrations is zero. The alternative of interest is that the compliance well concentration exceeds the background concentration. This situation is essentially the same for power considerations as that of the detection monitoring situation.

If compliance monitoring is relative to a compliance limit (MCL or ACL) specified as a constant, then the situation is different. Here the null hypothesis is that the concentration is less than or equal to the compliance limit, with equality used to establish the test. The alternative is that the concentration is above the compliance limit. In order to specify power, a minimum amount above the compliance limit must be established and power specified for that alternative, or the power function must be evaluated for several possible alternatives.
SAMPLE DESIGNS AND ASSUMPTIONS
As discussed in Section 2, the sample design to be employed at a regulated unit will primarily depend on the hydrogeologic evaluation of the site. Wells should be sited to provide multiple background wells hydraulically upgradient from the regulated unit. The background wells allow for determination of natural spatial variability in ground-water quality. They also allow for estimation of background levels with greater precision than would be possible from a single upgradient well. Compliance wells should be sited hydraulically downgradient of each regulated unit. The location and spacing of the wells, as well as the depth of sampling, would be determined from the hydrogeology to ensure that at least one of the wells should intercept a plume of contamination of reasonable size.
Thus the assumed sample design is for a sample of wells to include a number of background wells for the site, together with a number of compliance wells for each regulated unit at the site. In the event that a site has only a single regulated unit, there would be two groups of wells, background and compliance. If a site has multiple regulated units, there would be a set of compliance wells for each regulated unit, allowing for detection monitoring or compliance monitoring separately at each regulated unit.
Data from the analysis of the water at each well are initially assumed to follow a normal distribution. This is likely to be the case for detection
monitoring of analytes, in that levels should be near zero and errors would likely represent instrument or other sampling and analysis variability. If contamination is present, then the distribution of the data may be skewed to the right, giving a few very large values. The assumption of normality of errors in the detection monitoring case is quite reasonable, with deviations from normality likely indicating some degree of contamination. Tests of normality are recommended to ensure that the data are adequately represented by the normal distribution.

In the compliance monitoring case, the data for each analyte will again initially be assumed to follow the normal distribution. In this case, however, since there is a nonzero concentration of the analyte in the ground water, normality is more of an issue. Tests of normality are recommended. If evidence of nonnormality is found, the data should be transformed or a distribution-free test should be used to determine whether statistically significant evidence of contamination exists.
The standard situation would result in multiple samples (taken at different times) of water from each well. The wells would form groups of background wells and compliance wells for each regulated unit. The statistical procedures recommended would allow for testing each compliance well group against the background group. Further, tests among the compliance wells within a group are recommended to determine whether a single well might be intercepting an isolated plume. The specific procedures discussed and recommended in the preceding sections should cover the majority of cases. They do not cover all of the possibilities. In the event that none of the procedures described and illustrated appears to apply to a particular case at a given regulated site, consultation with a statistician should be sought to determine an appropriate statistical procedure.
The following approach is recommended. If a regulated unit is in detection monitoring, it will remain in detection monitoring until or unless there is statistically significant evidence of contamination, in which case it would be placed in compliance monitoring. Likewise, if a regulated unit is in compliance monitoring, it will remain in compliance monitoring unless or until there is statistically significant evidence of further contamination, in which case it would move into corrective action.
In monitoring a regulated unit with multiple compliance wells, two types
of significance levels are considered. One is an experimentwise significance
level and the other is a comparisonwise significance level. When a procedure
such as analysis of variance is used that considers several compliance wells
simultaneously, the significance is an experimentwise significance. If
individual well comparisons are made, each of those comparisons is done at a
comparisonwise significance level.
The fact that many comparisons will be made at a regulated unit with
multiple compliance wells can make the probability that at least one of the
comparisons will be incorrectly significant too high. To control the false
positive rate, multiple comparisons procedures are allowed that control the
experimentwise significance level to be 5%. That is, the probability that one
or more of the comparisons will falsely indicate contamination is controlled
A-5
-------
at 5%. However, to provide some assurance of adequate power to detect real
contamination, the comparisonwise significance level for comparing each
individual well to the background is required to be no less than 1%.
Control of the experimentwise significance level via multiple comparisons
procedures is allowed for comparisons among several wells. However, use of an
experimentwise significance level for the comparisons among the different haz-
ardous constituents is not permitted. Each hazardous constituent to be moni-
tored for in the permit must be treated separately.
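The arithmetic behind this trade-off is simple enough to automate. The sketch below (in Python; an illustrative sketch only, with names of our own choosing, not part of the regulation) computes the comparisonwise level implied by a 5% experimentwise level under a Bonferroni-type split, subject to the 1% floor described above.

    # Illustrative sketch: Bonferroni-type split of a 5% experimentwise
    # significance level across k individual well comparisons, with the
    # 1% comparisonwise floor described above.
    def comparisonwise_alpha(k, experimentwise=0.05, floor=0.01):
        return max(experimentwise / k, floor)

    for k in (1, 2, 5, 10, 20):
        print(k, comparisonwise_alpha(k))
    # Once experimentwise/k falls below the 1% floor (k > 5 here), the
    # floor binds and the realized experimentwise rate exceeds 5%; this
    # is the power/false-positive trade-off noted above.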
A-6
-------
GLOSSARY OF STATISTICAL TERMS
(underlined terms are explained subsequently)
Alpha (α):  A Greek letter used to denote the significance level or
probability of a Type I error.

Alpha-error:  Sometimes used for Type I error.

Alternative hypothesis:  An alternative hypothesis specifies that the
underlying distribution differs from the null hypothesis. The alternative
hypothesis usually specifies the value of a parameter, for example the mean
concentration, that one is trying to detect.

Arithmetic average:  The arithmetic average of a set of observations is
their sum divided by the number of observations.

Confidence coefficient:  The confidence coefficient of a confidence
interval for a parameter is the probability that the random interval
constructed from the sample data contains the true value of the parameter.
The confidence coefficient is related to the significance level of an
associated hypothesis test by the fact that the significance level (in
percent) is one hundred minus the confidence coefficient (in percent).

Confidence interval:  A confidence interval for a parameter is a random
interval constructed from sample data in such a way that the probability that
the interval will contain the true value of the parameter is a specified
value.

Cumulative distribution function:  Distribution function.

Distribution-free:  This is sometimes used as a synonym for nonparametric.
A statistic is distribution-free if its distribution does not depend upon
which specific distribution function (in a large class) the observations
follow.
A-7
-------
Distribution function:  The distribution function for a random variable,
X, is a function that specifies the probability that X is less than or equal
to t, for all real values of t.

Experimentwise error rate:  This term refers to multiple comparisons. If a
total of n decisions are made about comparisons (for example, of compliance
wells to background wells) and x of the decisions are wrong, then the
experimentwise error rate is x/n.

Hypothesis:  This is a formal statement of the parameter of interest and
the distribution of a statistic. It is usually used as a null hypothesis or
an alternative hypothesis. For example, the null hypothesis might specify
that ground water had a zero concentration of benzene and that analytical
errors followed a normal distribution with mean zero and standard deviation
1 ppm.

Independence:  A set of events are independent if the probability of the
joint occurrence of any subset of the events factors into the product of the
probabilities of the events. A set of observations is independent if the
joint distribution function of the random errors associated with the
observations factors into the product of the distribution functions.

Mean:  Arithmetic average.

Median:  This is the middle value of a sample when the observations have
been ordered from least to greatest. If the number of observations is odd,
it is the middle observation. If the number of observations is even, it is
customary to take the midpoint between the two middle observations. For a
distribution, the median is a value such that the probability is one-half
that an observation will fall above or below the median.

Multiple comparison procedure:  This is a statistical procedure that makes
a large number of decisions or comparisons on one set of data. For example,
at a sampling period, several compliance well concentrations may be compared
to the background well concentration.
A-8
-------
Nonparametric statistical procedure:  A nonparametric statistical
procedure is a statistical procedure that has desirable properties that hold
under mild assumptions regarding the data. Typically the procedure is valid
for a large class of distributions rather than for a specific distribution
of the data such as the normal.

Normal population, normality:  The errors associated with the observations
follow the normal or Gaussian distribution function.

Null hypothesis:  A null hypothesis specifies the underlying distribution
of the data completely. Often the null distribution specifies that there is
no difference between the mean concentration in background well water samples
and compliance well water samples.

One-sided test:  A one-sided test is appropriate if concentrations higher
than those specified by the null hypothesis are of concern. A one-sided test
only rejects for differences that are large and in a prespecified direction.

One-sided tolerance limit:  This is an upper limit on observations from a
specified distribution.

One-sided confidence limit:  This is an upper limit on a parameter of a
distribution.

Order statistics:  The sample values observed after they have been
arranged in increasing order.

Outlier:  An outlier is an observation that is found to lie an unusually
long way from the rest of the observations in a series of replicate
observations.

Parameter:  A parameter is an unknown constant associated with a
population. For example, the mean concentration of a hazardous constituent
in ground water is a parameter of interest.

Percentile:  A percentile of a distribution is a value below which a
specified proportion or percent of the observations from that distribution
will fall.
A-9
-------
Power:  The power of a test is the probability that the test will reject
under a specified alternative hypothesis. This is one minus the probability
of a Type II error. The power is a measure of the test's ability to detect
a difference of specified size from the null hypothesis.

Sample standard deviation:  This is the square root of the sample
variance.

Sample variance:  This is a statistic (computed on a sample of
observations rather than on the whole population) that measures the
variability or spread of the observations about the sample mean. It is the
sum of the squared differences from the sample mean, divided by the number
of observations less one.

Serial correlation:  This is the correlation of observations spaced a
constant interval apart in a series. For example, the first order serial
correlation is the correlation between adjacent observations. The first
order serial correlation is found by correlating the pairs consisting of the
first and second, second and third, third and fourth, etc., observations.

Significance level:  Sometimes referred to as the alpha level, the
significance level of a test is the probability of falsely rejecting a true
null hypothesis. The probability of a Type I error.

Type I error:  A Type I error occurs when a true null hypothesis is
rejected erroneously. In the monitoring context, a Type I error occurs when
a test incorrectly indicates contamination or an increase in contamination
at a regulated unit.

Type II error:  A Type II error occurs when one fails to reject a null
hypothesis that is false. In the monitoring context, a Type II error occurs
when monitoring fails to detect contamination or an increase in a
concentration of a hazardous constituent.
A-10
-------
APPENDIX B
STATISTICAL TABLES
B-1
-------
CONTENTS
Table                                                                 Page
1  Percentiles of the χ² Distribution With
   ν Degrees of Freedom, χ²(ν, p)                                      B-3
2  95th Percentiles of the F-Distribution With ν1 and
   ν2 Degrees of Freedom, F(ν1, ν2, 0.95)                              B-4
3  95th Percentiles of the Bonferroni t-Statistics,
   t(ν, α/m)                                                           B-5
4  Percentiles of the Standard Normal Distribution, Up                 B-6
5  Tolerance Factors (K) for One-Sided Normal Tolerance
   Intervals With Probability Level (Confidence Factor)
   γ = 0.95 and Coverage P = 95%                                       B-8
6  Percentiles of Student's t-Distribution                             B-9
7  Values of the Parameter λ for Cohen's Estimates
   Adjusting for Nondetected Values                                    B-10
8  Critical Values for T (One-Sided Test) When the
   Standard Deviation Is Calculated From the Same Sample               B-11
B-2
-------
TABLE 1. PERCENTILES OF THE χ² DISTRIBUTION WITH
ν DEGREES OF FREEDOM, χ²(ν, p)

[The table gives χ²(ν, p) for ν = 1 through 30, 40, 50, 60, 70, 80, 90, and
100 degrees of freedom at p = 0.750, 0.900, 0.950, 0.975, 0.990, 0.995, and
0.999. The tabulated values are illegible in this reproduction; consult the
source below or any standard χ² table.]

SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental
Design in Engineering and the Physical Sciences. Vol. I, Second Edition. John
Wiley and Sons, New York.
B-3
-------
TABLE 2. 95th PERCENTILES OF THE F-DISTRIBUTION WITH
ν1 AND ν2 DEGREES OF FREEDOM, F(ν1, ν2, 0.95)

[The tabulated values are illegible in this reproduction; consult the source
below or any standard F table.]

NOTE: ν1: Degrees of freedom for numerator
      ν2: Degrees of freedom for denominator
SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental
Design in Engineering and the Physical Sciences. Vol. I, Second Edition. John
Wiley and Sons, New York.
B-4
-------
TABLE 3. 95th PERCENTILES OF THE BONFERRONI
t-STATISTICS, t(ν, α/m)

where ν = degrees of freedom associated with the mean squares error
      m = number of comparisons
      α = 0.05, the experimentwise error level

              m = 1    m = 2    m = 3    m = 4    m = 5
   ν   α/m =  0.05     0.025    0.0167   0.0125   0.01
   4          2.13     2.78     3.20     3.51     3.75
   5          2.02     2.57     2.90     3.17     3.37
   6          1.94     2.45     2.74     2.97     3.14
   7          1.90     2.37     2.63     2.83     3.00
   8          1.86     2.31     2.55     2.74     2.90
   9          1.83     2.26     2.50     2.67     2.82
  10          1.81     2.23     2.45     2.61     2.76
  15          1.75     2.13     2.32     2.47     2.60
  20          1.73     2.09     2.27     2.40     2.53
  30          1.70     2.04     2.21     2.34     2.46
  ∞           1.65     1.96     2.13     2.24     2.33

SOURCE: For α/m = 0.05, 0.025, and 0.01, the percentiles were extracted from
the t-table (Table 6, Appendix B) for values of F = 1-α of 0.95, 0.975, and
0.99, respectively. For α/m = 0.05/3 and 0.05/4, the percentiles were
estimated using "A Nomograph of Student's t" by Nelson, L. S. 1975. Journal
of Quality Technology, Vol. 7, pp. 200-201.
B-5
-------
TABLE 4. PERCENTILES OF THE STANDARD NORMAL DISTRIBUTION, Up

[The table gives Up for P = 0.500 through 0.749; rows index P in steps of
0.01 (0.50 through 0.74) and columns give the third decimal place (0.000
through 0.009). The tabulated values are garbled in this reproduction;
consult the source below or any standard Normal table.]

NOTE: For values of P below 0.5, obtain the value of U(1-P) from Table 4 and
change its sign. For example, U(0.45) = -U(1-0.45) = -U(0.55) = -0.1257.
(Continued)
B-6
-------
TABLE 4 (Continued)

[Values of Up for P = 0.75 through 0.99; garbled in this reproduction.]

SOURCE: Johnson, Norman L. and F. C. Leone. 1977. Statistics and Experimental
Design in Engineering and the Physical Sciences. Vol. I, Second Edition. John
Wiley and Sons, New York.
B-7
-------
TABLE 5. TOLERANCE FACTORS (K) FOR ONE-SIDED NORMAL TOLERANCE
INTERVALS WITH PROBABILITY LEVEL (CONFIDENCE FACTOR)
γ = 0.95 AND COVERAGE P = 95%

   n      K           n       K
   3    7.655         75    1.972
   4    5.145        100    1.924
   5    4.202        125    1.891
   6    3.707        150    1.868
   7    3.399        175    1.850
   8    3.188        200    1.836
   9    3.031        225    1.824
  10    2.911        250    1.814
  11    2.815        275    1.806
  12    2.736        300    1.799
  13    2.670        325    1.792
  14    2.614        350    1.787
  15    2.566        375    1.782
  16    2.523        400    1.777
  17    2.486        425    1.773
  18    2.453        450    1.769
  19    2.423        475    1.766
  20    2.396        500    1.763
  21    2.371        525    1.760
  22    2.350        550    1.757
  23    2.329        575    1.754
  24    2.309        600    1.752
  25    2.292        625    1.750
  30    2.220        650    1.748
  35    2.166        675    1.746
  40    2.126        700    1.744
  45    2.092        725    1.742
  50    2.065        750    1.740
                     775    1.739
                     800    1.737
                     825    1.736
                     850    1.734
                     875    1.733
                     900    1.732
                     925    1.731
                     950    1.729
                     975    1.728
                    1000    1.727

SOURCE: (a) For sample sizes ≤ 50: Lieberman, Gerald F. 1958. "Tables for
One-Sided Statistical Tolerance Limits." Industrial Quality Control. Vol. XIV,
No. 10. (b) For sample sizes > 50: K values were calculated from a large-
sample approximation.
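Where statistical software is available, the one-sided tolerance factor need not be read from the table. The following is a minimal sketch (Python with scipy assumed; function name is ours) of the exact noncentral-t construction of K, of which entry (b) above is the large-sample approximation. Its output should agree closely with the tabulated values.

    # Sketch: the one-sided Normal tolerance factor K for confidence gamma
    # and coverage P can be computed from the noncentral t distribution as
    # K = t(gamma; n-1, delta) / sqrt(n), with noncentrality delta = z_P*sqrt(n).
    from math import sqrt
    from scipy.stats import norm, nct

    def tolerance_factor_k(n, coverage=0.95, confidence=0.95):
        delta = norm.ppf(coverage) * sqrt(n)            # noncentrality
        return nct.ppf(confidence, df=n - 1, nc=delta) / sqrt(n)

    print(round(tolerance_factor_k(3), 3))    # approx. 7.655, as in the table
    print(round(tolerance_factor_k(50), 3))   # approx. 2.065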
B-8
-------
TABLE 6. PERCENTILES OF STUDENT'S t-DISTRIBUTION
(F = 1 - α; ν degrees of freedom)

[The tabulated values are illegible in this reproduction; consult the source
below or any standard t table.]

SOURCE: CRC Handbook of Tables for Probability and Statistics. 1966.
W. H. Beyer, Editor. Published by the Chemical Rubber Company, Cleveland,
Ohio.
B-9
-------
TABLE 7. VALUES OF THE PARAMETER λ FOR COHEN'S ESTIMATES
ADJUSTING FOR NONDETECTED VALUES

[The tabulated values are illegible in this reproduction; consult the source
below.]

SOURCE: Cohen, A. C., Jr. 1961. "Tables for Maximum Likelihood Estimates:
Singly Truncated and Singly Censored Samples." Technometrics. 3:535-541.
B-10
-------
TABLE 8. CRITICAL VALUES FOR T (ONE-SIDED TEST) WHEN THE
STANDARD DEVIATION IS CALCULATED FROM THE SAME SAMPLE

[The tabulated values are illegible in this reproduction; consult the source
below.]

SOURCE: ASTM Designation E178-75, 1975. "Standard Recommended Practice for
Dealing With Outlying Observations."
B-13
-------
APPENDIX C
GENERAL BIBLIOGRAPHY
C-l
-------
The following list provides the reader with those references directly
mentioned in the text. It also includes, for those readers desiring further
information, references to literature dealing with selected subject matters in
a broader sense. This list is in alphabetical order.
ASTM Designation: E178-75. 1975. "Standard Recommended Practice for Dealing
with Outlying Observations."
ASTM Manual on Presentation of Data and Control Chart Analysis. 1976. ASTM
Special Technical Publication 150.
Barari, A., and L. S. Hedges. 1985. "Movement of Water in Glacial Till."
Proceedings of the 17th International Congress of the International
Association of Hydrogeologists. pp. 129-134.
Barcelona, M. J., J. P. Gibb, J. A. Helfrich, and E. E. Garske. 1985. "Prac-
tical Guide for Ground-Water Sampling." Report by Illinois State Water Sur-
vey, Department of Energy and Natural Resources for USEPA. EPA/600/2-85/104.
Bartlett, M. S. 1937. "Properties of Sufficiency and Statistical Tests."
Journal of the Royal Statistical Society, Series A. 160:268-282.
Box, G. E. P., and G. M. Jenkins. 1970. Time Series Analysis. Holden-Day, San
Francisco, California.
Brown, K. W., and D. C. Andersen. 1981. "Effects of Organic Solvents on the
Permeability of Clay Soils." EPA 600/2-83-016, Publication No. 83179978, U.S.
EPA, Cincinnati, Ohio.
Cohen, A. C., Jr. 1959. "Simplified Estimators for the Normal Distribution
When Samples Are Singly Censored or Truncated." Technometrics. 1:217-237.
Cohen, A. C., Jr. 1961. "Tables for Maximum Likelihood Estimates: Singly
Truncated and Singly Censored Samples." Technometrics. 3:535-541.
Conover, W. J. 1980. Practical Nonparametric Statistics. Second Edition, John
Wiley and Sons, New York, New York.
CRC Handbook of Tables for Probability and Statistics. 1966. William H. Beyer
(ed.). The Chemical Rubber Company.
Current Index to Statistics: Applications, Methods and Theory. Sponsored by
American Statistical Association and Institute of Mathematical Statistics.
Annual series providing indexing coverage for the broad field of statistics.
David, H. A. 1956. "The Ranking of Variances in Normal Populations." Jour-
nal of the American Statistical Association. Vol. 51, pp. 621-626.
Davis, J. C. 1986. Statistics and Data Analysis in Geology. Second Edition.
John Wiley and Sons, New York, New York.
C-2
-------
Dixon, W. J., and F. J. Massey, Jr. 1983. Introduction to Statistical Analysis.
Fourth Edition. McGraw-Hill, New York, New York.
Gibbons, R. D. 1987. "Statistical Prediction Intervals for the Evaluation of
Ground-Water Quality." Ground Water. Vol. 25, pp. 455-465.
Gibbons, R. D. 1988. "Statistical Models for the Analysis of Volatile
Organic Compounds in Waste Disposal Sites." Ground Water. Vol. 26.
Gilbert, R. 1987. Statistical Methods for Environmental Pollution Monitoring.
Professional Books Series, Van Nostrand Reinhold.
Hahn, G. and W. Nelson. 1973. "A Survey of Prediction Intervals and Their
Applications." Journal of Quality Technology. 5:178-188.
Heath, R. C. 1983. Basic Ground-Water Hydrology. U.S. Geological Survey
Water Supply Paper 2220, 84 p.
Hirsch, R. M., J. R. Slack, and R. A. Smith. 1982. "Techniques of Trend
Analysis for Monthly Water Quality Data." Water Resources Research. Vol. 18,
No. 1, pp. 107-121.
Hockman, K. K., and J. M. Lucas. 1987. "Variability Reduction Through Sub-
vessel CUSUM Control." Journal of Quality Technology. Vol. 19, pp. 113-121.
Hollander, M., and D. A. Wolfe. 1973. Nonparametric Statistical Methods. John
Wiley and Sons, New York, New York.
Huntsberger, D. V., and P. Billingsley. 1981. Elements of Statistical Infer-
ence. Fifth Edition. Allyn and Bacon, Inc., Boston, Massachusetts.
Johnson, N. L., and F. C. Leone. 1977. Statistics and Experimental Design in
Engineering and the Physical Sciences. 2 Vol., Second Edition. John Wiley and
Sons, New York, New York.
Kendall, M. G., and A. Stuart. 1966. The Advanced Theory of Statistics.
3 Vol. Hafner Publishing Company, Inc., New York, New York.
Kendall, M. G., and W. R. Buckland. 1971. A Dictionary of Statistical Terms.
Third Edition. Hafner Publishing Company, Inc., New York, New York.
Kendall, M. G. 1975. Rank Correlation Methods. Charles Griffin, London.
Langley, R. A. 1971. Practical Statistics Simply Explained. Second Edition.
Dover Publications, Inc., New York, New York.
Lehmann, E. L. 1975. Nonparametrics: Statistical Methods Based on Ranks.
Holden-Day, San Francisco, California.
Lieberman, G. J. 1958. "Tables for One-Sided Statistical Tolerance
Limits." Industrial Quality Control. Vol. XIV, No. 10.
C-3
-------
Lilliefors, H. W. 1967. "On the Kolmogorov-Smirnov Test for Normality with
Mean and Variance Unknown." Journal of the American Statistical Association.
62:399-402.
Lindgren, B. W. 1976. Statistical Theory. Third Edition. Macmillan.
Lucas, J. M. 1982. "Combined Shewhart-CUSUM Quality Control Schemes." Jour-
nal of Quality Technology. Vol. 14, pp. 51-59.
Mann, H. B. 1945. "Non-parametric Tests Against Trend." Econometrica.
Vol. 13, pp. 245-259.
Miller, R. G., Jr. 1981. Simultaneous Statistical Inference. Second Edition.
Springer-Verlag, New York, New York.
Nelson, L. S. 1987. "Upper 10%, 5%, and 1% Points of the Maximum F-
Ratio." Journal of Quality Technology. Vol. 19, p. 165.
Nelson, L. S. 1987. "A Gap Test for Variances." Journal of Quality Technol-
ogy. Vol. 19, pp. 107-109.
Noether, G. E. 1967. Elements of Nonparametric Statistics. Wiley, New York.
Pearson, E. S., and H. O. Hartley. 1976. Biometrika Tables for Statisticians.
Vol. 1, Biometrika Trust, University College, London.
Quade, D. 1966. "On Analysis of Variance for the k-Sample Problem." Annals
of Mathematical Statistics. 37:1747-1748.
Remington, R. D., and M. A. Schork. 1970. Statistics with Applications to the Bio-
logical and Health Sciences. Prentice-Hall, pp. 235-236.
Shapiro, S. S., and M. B. Wilk. 1965. "An Analysis of Variance Test for Nor-
mality (Complete Samples)." Biometrika. Vol. 52, pp. 591-611.
Snedecor, G. W., and W. G. Cochran. 1980. Statistical Methods. Seventh Edi-
tion. The Iowa State University Press, Ames, Iowa.
Starks, T. H. 1988 (Draft). "Evaluation of Control Chart Methodologies for
RCRA Waste Sites." Report by Environmental Research Center, University of
Nevada, Las Vegas, for Exposure Assessment Research Division, Environmental
Monitoring Systems Laboratory-Las Vegas, Nevada. CR814342-01-3.
"Statistical Methods for the Attainment of Superfund Cleanup Standards
(Volume 2: Ground WaterDraft)."
Steel, R. G. D., and J. H. Torrie. 1980. Principles and Procedures of Statistics,
A Biometrical Approach. Second Edition. McGraw-Hill Book Company, New York,
New York.
C-4
Todd, D. K. 1980. Groundwater Hydrology. John Wiley and Sons, New York,
534 p.
Tukey, J. W. 1949. "Comparing Individual Means in the Analysis of Vari-
ance." Biometrics. 5:99-114.

Statistical Software:

BMDP Statistical Software. 1985 printing. University of California Press.
Lotus 1-2-3. Lotus Development Corporation, Cambridge Parkway, Cambridge,
Massachusetts 02142.
SAS: Statistical Analysis System, SAS Institute, Inc.
User's Guide: Basics. Version 5 Edition, 1985.
User's Guide: Statistics. Version 5 Edition, 1985.
SPSS: Statistical Package for the Social Sciences. McGraw-Hill.
SYSTAT: Statistical Software Package for the PC, Systat, Inc., 1300 Sherman
Avenue, Evanston, Illinois 60201.
-------
STATISTICAL ANALYSIS OF
GROUND-WATER MONITORING
DATA AT RCRA FACILITIES
DRAFT
ADDENDUM TO INTERIM FINAL
GUIDANCE
OFFICE OF SOLID WASTE
PERMITS AND STATE PROGRAMS DIVISION
U.S. ENVIRONMENTAL PROTECTION AGENCY
401 M STREET, S.W.
WASHINGTON, D.C. 20460
JULY 1992
Printed on Recycled Paper
-------
DISCLAIMER
This document is intended to assist Regional and State personnel in evaluating ground-water
monitoring data from RCRA facilities. Conformance with this guidance is expected to result in
statistical methods and sampling procedures that meet the regulatory standard of protecting human
health and the environment. However, EPA will not in all cases limit its approval of statistical
methods and sampling procedures to those that comport with the guidance set forth herein. This
guidance is not a regulation (i.e., it does not establish a standard of conduct which has the force of
law) and should not be used as such. Regional and State personnel should exercise their discretion
in using this guidance document as well as other relevant information in choosing a statistical
method and sampling procedure that meet the regulatory requirements for evaluating ground-water
monitoring data from RCRA facilities.
This document has been reviewed by the Office of Solid Waste, U.S. Environmental
Protection Agency, Washington, D.C., and approved for publication. Approval does not signify
that the contents necessarily reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names, commercial products, or publications constitute
endorsement or recommendation for use.
-------
CONTENTS
1. CHECKING ASSUMPTIONS FOR STATISTICAL PROCEDURES 1
1.1 Normality of Data 1
1.1.1 Interim Final Guidance Methods for Checking Normality... 3
1.1.2 Probability Plots 5
1.1.3 Coefficient of Skewness 8
1.1.4 The Shapiro-Wilk Test of Normality (n<50) 9
1.1.5 The Shapiro-Francia Test of Normality (n>50) 12
1.1.6 The Probability Plot Correlation Coefficient 13
1.2 Testing for Homogeneity of Variance 20
1.2.1 Box Plots 20
1.2.2 Levene's Test 23
2. RECOMMENDATIONS FOR HANDLING NONDETECTS 25
2.1 Nondetects in ANOVA Procedures 26
2.2 Nondetects in Statistical Intervals 27
2.2.1 Censored and Detects-Only Probability Plots 28
2.2.2 Aitchison's Adjustment 33
-------
2.2.3 More Than 50% Nondetects 34
2.2.4 Poisson Prediction Limits 35
2.2.5 Poisson Tolerance Limits 38
3. NON-PARAMETRIC COMPARISON OF COMPLIANCE DATA TO BACKGROUND.. 41
3.1 Kruskal-Wallis Test 41
3.1.1 Adjusting for Tied Observations 42
3.2 Wilcoxon Rank-Sum Test for Two Groups 45
3.2.1 Handling Ties in the Wilcoxon Test 48
4. STATISTICAL INTERVALS: CONFIDENCE, TOLERANCE, AND PREDICTION 49
4.1 Tolerance Intervals 51
4.1.1 Non-parametric Tolerance Intervals 54
4.2 Prediction Intervals 56
4.2.1 Non-parametric Prediction Intervals 59
4.3 Confidence Intervals 60
5. STRATEGIES FOR MULTIPLE COMPARISONS 62
5.1 Background of Problem 62
5.2 Possible Strategies 67
5.2.1 Parametric and Non-parametric ANOVA 67
-------
5.2.2 Retesting with Parametric Intervals 67
5.2.3 Retesting with Non-parametric Intervals 71
6. OTHER TOPICS 75
6.1 Control Charts 75
6.2 Outlier Testing 80
-------
ACKNOWLEDGMENT
This document was developed by EPA's Office of Solid Waste under the direction of Mr.
James R. Brown of the Permits and State Programs Division. The Addendum was prepared by the
joint efforts of Mr. James R. Brown and Kirk M. Cameron, Ph.D., Senior Statistician at Science
Applications International Corporation (SAIC). SAIC provided technical support in developing
this document under EPA Contract No. 68-WO-0025. Other SAIC staff who assisted in the
preparation of the Addendum include Mr. Robert D. Aaron, Statistician.
-------
Draft 1/28/93
STATISTICAL ANALYSIS OF
"GROUND-WATER MONITORING DATA
AT RCRA FACILITIES
ADDENDUM TO INTERIM FINAL GUIDANCE
JULY 1992
This Addendum offers a series of recommendations and updated advice concerning the
Interim Final Guidance document for statistical analysis of ground-water monitoring data. Some
procedures in the original guidance are replaced by alternative methods that reflect more current
thinking within the statistics profession. In other cases, further clarification is offered for currently
recommended techniques to answer questions and address public comments that EPA has received
both formally and informally since the Interim Final Guidance was published.
1. CHECKING ASSUMPTIONS FOR STATISTICAL
PROCEDURES
Because any statistical or mathematical model of actual data is an approximation of reality, all
statistical tests and procedures require certain assumptions for the methods to be used correctly and
for the results to have a proper interpretation. Two key assumptions addressed in the Interim
Guidance concern the distributional properties of the data and the need for equal variances among
subgroups of the measurements. In the Addendum, new techniques are outlined for testing both
assumptions that offer distinct advantages over the methods in the Interim Final Guidance.
1.1 NORMALITY OF DATA
Most statistical tests assume that the data come from a Normal distribution. Its density
function is the familiar bell-shaped curve. The Normal distribution is the assumed underlying
model for such procedures as parametric analysis of variance (ANOVA), t-tests, tolerance
intervals, and prediction intervals for future observations. Failure of the data to follow a Normal
distribution at least approximately is not always a disaster, but can lead to false conclusions if the
data really follow a more skewed distribution like the Lognormal. This is because the extreme tail
behavior of a data distribution is often the most critical factor in deciding whether to apply a
statistical test based on the assumption of Normality.
-------
Draft 1/28/93
The Interim Final Guidance suggests that one begin by assuming that the original data are
Normal prior to testing the distributional assumptions. If the statistical test rejects the model of
Normality, the data can be tested for Lognormality instead by taking the natural logarithm of each
observation and repeating the test. If the original data are Lognormal, taking the natural logarithm
of the observations will result in data that are Normal. As a consequence, tests for Normality can
also be used to test for Lognormality by applying the tests to the logarithms of the data.
Unfortunately, all of the available tests for Normality do at best a fair job of rejecting non-
Normal data when the sample size is small (say less than 20 to 30 observations). That is, the tests
do not exhibit high degrees of statistical power. As such, small samples of untransformed
Lognormal data can be accepted by a test of Normality even though the skewness of the data may
lead to poor statistical conclusions later. EPA's experience with environmental concentration data,
and ground-water data in particular, suggests that a Lognormal distribution is generally more
appropriate as a default statistical model than the Normal distribution, a conclusion shared by
researchers at the United States Geological Survey (USGS, Dennis Helsel, personal
communication, 1991). There also appears to be a plausible physical explanation as to why
pollutant concentrations so often seem to follow a Lognormal pattern (Ott, 1990). In Ott's model,
pollutant sources are randomly diluted in a multiplicative fashion through repeated dilution and
mixing with volumes of uncontaminated air or water, depending on the surrounding medium.
Such random and repeated dilution of pollutant concentrations can lead mathematically to a
Lognormal distribution.
Because the Lognormal distribution appears to be a better default statistical model than the
Normal distribution for most ground-water data, it is recommended that all data first be logged
prior to checking distributional assumptions. McBean and Rovers (1992) have noted that
"[s]upport for the lognormal distribution in many applications also arises from the shape of the
distribution, namely constrained on the low side and unconstrained on the high side.... The
logarithmic transform acts to suppress the outliers so that the mean is a much better representation
of the central tendency of the sample data."
Transformation to the logarithmic scale is not done to make "large numbers look smaller."
Performing a logarithmic or other monotonic transformation preserves the basic ordering within a
data set, so that the data are merely rescaled with a different set of units. Just as the physical
difference between 80° Fahrenheit and 30° Fahrenheit does not change if the temperatures are
rescaled or transformed to the numerically lower Celsius scale, so too the basic statistical
relationships between data measurements remain the same whether or not the log transformation is
-------
Draft 1/28/93
applied. What does change is that the logarithms of Lognormally distributed data are more nearly
Normal in character, thus satisfying a key assumption of many statistical procedures. Because of
this fact, the same tests used to check Normality, if run on the logged data, become tests for
Lognormality.
If the assumption of Lognormality is not rejected, further statistical analyses should be
performed on the logged observations, not the original data. If the Lognormal distribution is
rejected by a statistical test, one can either test the Normality of the original data, if it was not
already done, or use a non-parametric technique on the ranks of the observations.
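For readers working in a scripting environment, the decision sequence just described can be expressed compactly. The following is a minimal sketch (Python with numpy and scipy assumed; the Shapiro-Wilk test used here is discussed in Section 1.1.4, and the function name is illustrative), not a prescribed implementation.

    # Sketch of the recommended order of testing: log the data first, test
    # for Lognormality, then fall back to the raw scale or to ranks.
    import numpy as np
    from scipy import stats

    def choose_scale(data, alpha=0.05):
        w, p = stats.shapiro(np.log(data))      # test Lognormality first
        if p > alpha:
            return "analyze the logged observations"
        w, p = stats.shapiro(data)              # then test Normality
        if p > alpha:
            return "analyze the original observations"
        return "use a non-parametric (rank-based) technique"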
If no data are initially available to test the distributional assumptions, "referencing" may be
employed to justify the use of, say, a Normal or Lognormal assumption in developing a statistical
testing regimen at a particular site. "Referencing" involves the use of historical data or data from
sites in similar hydrogeologic settings to justify the assumptions applied to currently planned
statistical tests. These initial assumptions must be checked when data from the site become
available, using the procedures described in this Addendum. Subsequent changes to the initial
assumptions should be made if formal testing contradicts the initial hypothesis.
1.1.1 Interim Final Guidance Methods for Checking Normality
The Interim Final Guidance outlines three different methods for checking Normality: the
Coefficient-of-Variation (CV) test, Probability Plots, and the Chi-squared test. Of these three,
only Probability Plots are recommended within this Addendum. The Coefficient-of-Variation and
the Chi-squared test each have potential problems that can be remedied by using alternative tests.
These alternatives include the Coefficient of Skewness, the Shapiro-Wilk test, the Shapiro-Francia
test, and the Probability Plot Correlation Coefficient.
The Coefficient-of-Variation is recommended within the Interim Guidance because it is easy
to calculate and is amenable to small sample sizes. To ensure that a Normal model which predicts a
significant fraction of negative concentration values is not fitted to positive data, the Interim Final
Guidance recommends that the sample Coefficient of Variation be less than one; otherwise this
"test" of Normality fails. A drawback to using the sample CV is that for Normally distributed data,
one can often get a sample CV greater than one when the true CV is only between 0.5 and 1. In
other words, the sample CV, being a random variable, often estimates the true Coefficient of
Variation with some error. Even if a Normal distribution model is appropriate, the Coefficient of
Variation test may reject the model because the sample CV (but not the true CV) is too large.
-------
Draft 1/28/93
The real purpose of the CV is to estimate the skewness of a dataset, not to test Normality.
Truly Normal data can have any non-zero Coefficient of Variation, though the larger the CV, the
greater the proportion of negative values predicted by the model. As such, a Normal distribution
with large CV may be a poor model for positive concentration data. However, if the Coefficient of
Variation test is used on the logarithms of the data to test Lognormality, negative logged
concentrations will often be expected, nullifying the rationale used to support the CV test in the
first place. A better way to estimate the skewness of a dataset is to compute the Coefficient of
Skewness directly, as described below.
The Chi-square test is also recommended within the Interim Guidance. Though an acceptable
goodness-of-fit test, it is not considered the most sensitive or powerful test of Normality in the
current literature (Gan and Koehler, 1990). The major drawback to the Chi-square test can be
explained by considering the behavior of parametric tests based on the Normal distribution. Most
tests like the t-test or Analysis of Variance (ANOVA), which assume the underlying data to be
Normally distributed, give fairly robust results when the Normality assumption fails over the
middle ranges of the data distribution. That is, if the extreme tails are approximately Normal in
shape even if the middle part of the density is not, these parametric tests will still tend to produce
valid results. However, if the extreme tails are non-Normal in shape (e.g., highly skewed),
Normal-based tests can lead to false conclusions, meaning that either a transformation of the data
or a non-parametric technique should be used instead.
The Chi-square test entails a division of the sample data into bins or cells representing
distinct, non-overlapping ranges of the data values (see figure below). In each bin, an expected
value is computed based on the number of data points that would be found if the Normal
distribution provided an appropriate model. The squared difference between the expected number
and observed number is then computed and summed over all the bins to calculate the Chi-square
test statistic.
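As a concrete illustration of the computation just described, the sketch below (Python with numpy and scipy assumed; names are ours) bins the data into cells of equal expected probability under a fitted Normal model and sums the scaled squared differences. It is included only to make the mechanics explicit; the tests of Sections 1.1.3 through 1.1.6 are recommended instead.

    # Sketch of the binned Chi-square statistic described above. Bins have
    # equal expected probability under the fitted Normal model; the statistic
    # has roughly n_bins - 3 degrees of freedom because the mean and SD are
    # estimated from the data.
    import numpy as np
    from scipy import stats

    def chi_square_normal_fit(data, n_bins=6):
        data = np.asarray(data, dtype=float)
        mean, sd = data.mean(), data.std(ddof=1)
        edges = stats.norm.ppf(np.linspace(0, 1, n_bins + 1), mean, sd)
        edges[0], edges[-1] = -np.inf, np.inf
        observed = np.array([np.sum((data > lo) & (data <= hi))
                             for lo, hi in zip(edges[:-1], edges[1:])])
        expected = len(data) / n_bins            # equal count per bin
        return np.sum((observed - expected) ** 2 / expected)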
[Figure: Chi-square goodness of fit, comparing observed counts in bins of the data to the counts expected under a Normal distribution]
-------
Draft 1/28/93
If the Chi-square test indicates that the data are not Normally distributed, it may not be clear
what ranges of the data most violate the Normality assumption. Departures from Normality in the
middle bins are given nearly the same weight as departures from the extreme tail bins, and all the
departures are summed together to form the test statistic. As such, the Chi-square test is not as
powerful for detecting departures from Normality in the extreme tails of the data, the areas most
crucial to the validity of parametric tests like the t-test or ANOVA (Miller, 1986). Furthermore,
even if there are departures in the tails, but the middle portion of the data distribution is
approximately Normal, the Chi-square test may not register as statistically significant in certain
cases where better tests of Normality would. Because of this, four alternative, more sensitive tests
of Normality are suggested below which can be used in conjunction with Probability Plots.
1.1.2 Probability Plots
As suggested within the Interim Final Guidance, a simple, yet useful graphical test for
Normality is to plot the data on probability paper. The y-axis is scaled to represent probabilities
according to the Normal distribution and the data are arranged in increasing order. An observed
value is plotted on the x-axis and the proportion of observations less than or equal to each observed
value is plotted as the y-coordinate. The scale is constructed so that, if the data are Normal, the
points when plotted will approximate a straight line. Visually apparent curves or bends indicate
that the data do not follow a Normal distribution (see Interim Final Guidance, pp. 4-8 to 4-11).
Probability Plots are particularly useful for spotting irregularities within the data when
compared to a specific distributional model like the Normal. It is easy to determine whether
departures from Normality are occurring more or less in the middle ranges of the data or in the
extreme tails. Probability Plots can also indicate the presence of possible outlier values that do not
follow the basic pattern of the data and can show the presence of significant positive or negative
skewness.
If a (Normal) Probability Plot is done on the combined data from several wells and Normality
is accepted, it implies that all of the data came from the same Normal distribution. Consequently,
each subgroup of the data set (e.g., observations from distinct wells), has the same mean and
standard deviation. If a Probability Plot is done on the data residuals (each value minus its
subgroup mean) and is not a straight line, the interpretation is more complicated. In this case,
either the residuals are not Normal, or there is a subgroup of the data with a Normal distribution
but a different mean or standard deviation than the other subgroups. The Probability Plot will
indicate a deviation from the underlying Normality assumption either way.
-------
Draft 1/28/93
The same Probability Plot technique may be used to investigate whether a set of data or
residuals follows the Lognormal distribution. The procedure is the same, except that one first
replaces each observation by its natural logarithm. After the data have been transformed to their
natural logarithms, the Probability Plot is constructed as before. The only difference is that the
natural logarithms of the observations are used on the x-axis. If the data are Lognormal, the
Probability Plot (on Normal probability paper) of the logarithms of the observations will
approximate a straight line.
Many statistical software packages for personal computers will construct Probability Plots
automatically with a simple command or two. If such software is available, there is no need to
construct Probability Plots by hand or to obtain special graph paper. The plot itself may be
generated somewhat differently than the method described above. In some packages, the observed
value is plotted as before on the x-axis. The y-axis, however, now represents the quantile of the
Normal distribution (often referred to as the "Normal score of the observation") corresponding to
the cumulative probability of the observed value. The y-coordinate is often computed by the
following formula:
.-1
n + 1
where <£"' denotes the inverse of the cumulative Normal distribution, n represents the sample size,
and i represents the rank position of the ith ordered concentration. Since the computer does these
calculations automatically, the formula does not have to be computed by hand.
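For example, a minimal sketch of this calculation (Python with numpy and scipy assumed) is:

    # Sketch: compute Probability Plot coordinates as described above.
    # Plotting y against x gives a nearly straight line when the data
    # follow a Normal distribution.
    import numpy as np
    from scipy import stats

    def probability_plot_coordinates(data):
        x = np.sort(np.asarray(data, dtype=float))
        n = len(x)
        ranks = np.arange(1, n + 1)
        y = stats.norm.ppf(ranks / (n + 1.0))   # Normal scores, inverse CDF
        return x, y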
EXAMPLE 1
Determine whether the following data set follows the Normal distribution by using a
Probability Plot.
-------
Draft 1/28/93
Nickel Concentration (ppb)

Month    Well 1    Well 2    Well 3    Well 4
  1        58.8      19        39        3.1
  2         1.0      81.5     151      942
  3       262       331        27       85.6
  4        56        14        21.4     10
  5         8.7      64.4     578      637
SOLUTION
Step 1. List the measured nickel concentrations in order from lowest to highest.
   Nickel
   Concentration    Order                     Normal
      (ppb)          (i)     Probability     Quantile
        1             1           5           -1.645
        3.1           2          10           -1.28
        8.7           3          14           -1.08
       10             4          19           -0.88
       14             5          24           -0.706
       19             6          29           -0.55
       21.4           7          33           -0.44
       27             8          38           -0.305
       39             9          43           -0.176
       56            10          48           -0.05
       58.8          11          52            0.05
       64.4          12          57            0.176
       81.5          13          62            0.305
       85.6          14          67            0.44
      151            15          71            0.55
      262            16          76            0.706
      331            17          81            0.88
      578            18          86            1.08
      637            19          90            1.28
      942            20          95            1.645
Step 2. The cumulative probability is given in the third column and is computed as 100*(i/(n+1))
where n is the total number of samples (n=20). The last column gives the Normal
quantiles corresponding to these probabilities.
Step 3. If using special graph paper, plot the probability versus the concentration for each
sample. Otherwise, plot the Normal quantile versus the concentration for each sample,
as in the plot below. The curvature found in the Probability Plot indicates that there is
evidence of non-Normality in the data.
-------
Draft 1/28/93
[Figure: Probability Plot of the nickel data, Normal quantile versus concentration]
-------
Draft 1/28/93
1.1.3 Coefficient of Skewness

The Skewness Coefficient may be computed using the following formula:

    γ1 = [ (1/n) Σ (xi - x̄)³ ] / SD³

where the numerator represents the average cubed residual and SD denotes the standard deviation
of the measurements. Most statistics computer packages (e.g., Minitab, GEO-EAS) will compute
the Skewness Coefficient automatically via a simple command.
EXAMPLE 2
Using the data in Example 1, compute the Skewness Coefficient to test for approximate
symmetry in the data.
SOLUTION
Step 1. Compute the mean, standard deviation (SD), and average cubed residual for the nickel
concentrations:

    x̄ = 169.52 ppb
    SD = 259.72 ppb
    (1/n) Σ (xi - x̄)³ = 2.98923 x 10⁷ ppb³

Step 2. Calculate the Coefficient of Skewness using the previous formula to get γ1 = 1.84. Since
the skewness is much larger than 1, the data appear to be significantly positively
skewed. Do not assume that the data follow a Normal distribution.
Step 3. Since the original data evidence a high degree of skewness, one can attempt to compute
the Skewness Coefficient on the logged data instead. In that case, the skewness works
out to be |γ1| = 0.24 < 1, indicating that the logged data values are slightly skewed, but
not enough to reject an assumption of Normality in the logged data. In other words, the
original data may be Lognormally distributed.
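The computation in this example is easy to reproduce with standard software. A sketch follows (Python with numpy and scipy assumed); note that scipy's default (biased) skewness estimator reproduces the 1.84 reported above.

    # Sketch reproducing Example 2. scipy's default skewness (third central
    # moment over the cube of the n-divisor SD) gives the 1.84 above.
    import numpy as np
    from scipy import stats

    nickel = np.array([58.8, 1.0, 262, 56, 8.7, 19, 81.5, 331, 14, 64.4,
                       39, 151, 27, 21.4, 578, 3.1, 942, 85.6, 10, 637])

    print(round(stats.skew(nickel), 2))               # approx. 1.84: strongly skewed
    print(round(abs(stats.skew(np.log(nickel))), 2))  # approx. 0.24: nearly symmetric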
1.1.4 The Shapiro-Wilk Test of Normality (n<50)
The Shapiro-Wilk test is recommended as a superior alternative to the Chi-square test for
testing Normality of the data. It is based on the premise that if a set of data are Normally
distributed, the ordered values should be highly correlated with corresponding quantiles taken from
a Normal distribution (Shapiro and Wilk, 1965). In particular, the Shapiro-Wilk test gives
-------
Draft 1/28/93
substantial weight to evidence of non-Normality in the tails of a distribution, where the robustness
of statistical tests based on the Normality assumption is most severely affected. The Chi-square
test treats departures from Normality in the tails nearly the same as departures in the middle of a
distribution, and so is less sensitive to the types of non-Normality that are most crucial. One
cannot tell from a significant Chi-square goodness-of-fit test what sort of non-Normality is
indicated.
The Shapiro-Wilk test statistic (W) will tend to be large when a Probability Plot of the data
indicates a nearly straight line. Only when the plotted data show significant bends or curves will
the test statistic be small. The Shapiro-Wilk test is considered to be one of the very best tests of
Normality available (Miller, 1986; Madansky, 1988).
To calculate the test statistic W, one can use the following formula:

    W = [ b / ( SD √(n-1) ) ]²

where the numerator is computed as

    b = Σ(j=1 to k) a(n-j+1) [ x(n-j+1) - x(j) ]

In this last formula, x(j) represents the jth smallest ordered value in the sample and the
coefficients a(j) depend on the sample size n. The coefficients can be found for any sample size
from 3 up to 50 in Table A-1 of Appendix A. The value of k can be found as the greatest integer
less than or equal to n/2.
Normality of the data should be rejected if the Shapiro-Wilk statistic is too low when
compared to the critical values provided in Table A-2 of Appendix A. Otherwise one can assume
the data are approximately Normal for purposes of further statistical analysis. As before, it is
recommended that the test first be performed on the logarithms of the original data to test for
Lognormality. If the logged data indicate non-Normality by the Shapiro-Wilk test, a re-test can be
performed on the original data to test for Normality of the original concentrations.
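In practice the coefficient table need not be keyed in by hand; most statistical packages implement the test directly. A minimal sketch follows (Python with scipy assumed; scipy reports a p-value rather than the Table A-2 critical value, and the function name is ours).

    # Sketch: Shapiro-Wilk test on the logged data first, then on the
    # original scale, as recommended above.
    import numpy as np
    from scipy import stats

    def shapiro_wilk_check(data, alpha=0.05):
        w, p = stats.shapiro(np.log(data))
        if p > alpha:
            return "consistent with Lognormality", w
        w, p = stats.shapiro(data)
        if p > alpha:
            return "consistent with Normality", w
        return "rejects both; consider non-parametric methods", w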
EXAMPLE 3
Use the data of Example 1 to compute the Shapiro- Wilk test of Normality.
10
-------
Draft 1/28/93
SOLUTION
Step 1. Order the data from smallest to largest and list, as in the following table. Also list the
data in reverse order alongside the first column.
Step 2. Compute the differences x(n-i+1) - x(i) in the last column of the table by subtracting
the second column from the third.

   i      x(i)     x(n-i+1)    x(n-i+1) - x(i)
   1       1.0      942.0          941.0
   2       3.1      637.0          633.9
   3       8.7      578.0          569.3
   4      10.0      331.0          321.0
   5      14.0      262.0          248.0
   6      19.0      151.0          132.0
   7      21.4       85.6           64.2
   8      27.0       81.5           54.5
   9      39.0       64.4           25.4
  10      56.0       58.8            2.8
  11      58.8       56.0
  12      64.4       39.0
  13      81.5       27.0
  14      85.6       21.4
  15     151.0       19.0
  16     262.0       14.0
  17     331.0       10.0
  18     578.0        8.7
  19     637.0        3.1
  20     942.0        1.0

(Only the first k = 10 differences are needed in the computation of b.)
-------
Draft 1/28/93
(Note: the original concentration data are used in this example to illustrate how the assumption of
Normality can be rejected.)
1.1.5 The Shapiro-Francia Test of Normality (n>50)
The Shapiro-Wilk test of Normality can be used for sample sizes up to 50. When the sample
is larger than 50, a slight modification of the procedure called the Shapiro-Francia test (Shapiro and
Francia, 1972) can be used instead.
Like the Shapiro-Wilk test, the Shapiro-Francia test statistic (W′) will tend to be large when a
Probability Plot of the data indicates a nearly straight line. Only when the plotted data show
significant bends or curves will the test statistic be small.
To calculate the test statistic W′, one can use the following formula:

    W′ = [ Σ(i=1 to n) mi x(i) ]² / [ (n-1) SD² Σ(i=1 to n) mi² ]

where x(i) represents the ith ordered value of the sample and where mi denotes the approximate
expected value of the ith ordered Normal quantile. The values for mi can be approximately
computed as

    mi = Φ⁻¹( i / (n+1) )

where Φ⁻¹ denotes the inverse of the standard Normal cumulative distribution.
Draft 1/28/93
1.1.6 The Probability Plot Correlation Coefficient
One other alternative test for Normality that is roughly equivalent to the Shapiro-Wilk and
Shapiro-Francia tests is the Probability Plot Correlation Coefficient test described by Filliben
(1975). This test fits in perfectly with the use of Probability Plots, because the essence of the test
is to compute the common correlation coefficient for points on a Probability Plot. Since the
correlation coefficient is a measure of the linearity of the points on a scatterplot, the Probability Plot
Correlation Coefficient, like the Shapiro-Wilk test, will be high when the plotted points fall along a
straight line and low when there are significant bends and curves in the Probability Plot.
Comparison of the Shapiro-Wilk and Probability Plot Correlation Coefficient tests has indicated
very similar statistical power for detecting non-Normality (Ryan and Joiner, 1976).
The construction of the test statistic is somewhat different from the Shapiro-Wilk W, but not
difficult to implement. Also, tabled critical values for the correlation coefficient have been derived
for sample sizes up to n=100 (and are reproduced in Table A-4 of Appendix A). The Probability
Plot Correlation Coefficient may be computed as

r = \frac{\sum_i X_{(i)} M_{(i)} - n \bar{X} \bar{M}}{C_n \cdot SD \cdot \sqrt{n-1}}

where X_(i) represents the ith smallest ordered concentration value, M_(i) is the median of the ith order statistic from a standard Normal distribution, \bar{X} and \bar{M} represent the average values of X_(i) and M_(i), and C_n = \sqrt{\sum_i (M_{(i)} - \bar{M})^2}. The ith Normal order statistic median may be approximated as M_i = \Phi^{-1}(m_i), where, as before, \Phi^{-1} is the inverse of the standard Normal cumulative distribution and m_i can be computed as follows (given sample size n):

m_i = 1 - m_n                   for i = 1
m_i = (i - .3175)/(n + .365)    for 1 < i < n
m_i = (.5)^{1/n}                for i = n
-------
Draft 1/28/93
When working with a complete sample (i.e., containing no nondetects or censored values), the average value \bar{M} = 0, and so the formula for the Probability Plot Correlation Coefficient simplifies to

r = \frac{\sum_i X_{(i)} M_i}{C_n \cdot SD \cdot \sqrt{n-1}}
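As an illustration, the following sketch computes Filliben's correlation coefficient for a complete sample using the formulas above; applied to the nickel data of Example 1 it should reproduce r of about 0.819.

    import numpy as np
    from scipy import stats

    def filliben_r(x):
        # Probability Plot Correlation Coefficient for a complete sample
        x = np.sort(np.asarray(x, dtype=float))
        n = len(x)
        m = np.empty(n)
        m[-1] = 0.5 ** (1.0 / n)                           # m_n
        m[0] = 1.0 - m[-1]                                 # m_1
        m[1:-1] = (np.arange(2, n) - 0.3175) / (n + 0.365)
        M = stats.norm.ppf(m)                              # order statistic medians
        Cn = np.sqrt(np.sum(M ** 2))                       # M-bar = 0 here
        return np.dot(x, M) / (Cn * np.std(x, ddof=1) * np.sqrt(n - 1))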
EXAMPLE 4
Use the data of Example 1 to compute the Probability Plot Correlation Coefficient test.
SOLUTION
Step 1. Order the data from smallest to largest and list, as in the following table.
Step 2. Compute the quantities m_i from Filliben's formula above for each i in column 2 and the order statistic medians, M_i, in column 3 by applying the inverse Normal transformation to column 2.
Step 3. Since this sample contains no nondetects, the simplified formula for r may be used. Compute the products X_(i)*M_i in column 4 and sum to get the numerator of the correlation coefficient (equal to 3,836.81 in this case). Also compute M_i^2 in column 5 and sum to find the quantity C_n^2 = 17.12.
    i     X_(i)     m_i       M_i       X_(i)*M_i    M_i^2
    1       1.0    .03406    -1.8242      -1.824      3.328
    2       3.1    .08262    -1.3877      -4.302      1.926
    3       8.7    .13172    -1.1183      -9.729      1.251
    4      10.0    .18082    -0.9122      -9.122      0.832
    5      14.0    .22993    -0.7391     -10.347      0.546
    6      19.0    .27903    -0.5857     -11.129      0.343
    7      21.4    .32814    -0.4451      -9.524      0.198
    8      27.0    .37724    -0.3127      -8.444      0.098
    9      39.0    .42634    -0.1857      -7.242      0.034
    10     56.0    .47545    -0.0616      -3.448      0.004
    11     58.8    .52455     0.0616       3.621      0.004
    12     64.4    .57366     0.1857      11.959      0.034
    13     81.5    .62276     0.3127      25.488      0.098
    14     85.6    .67186     0.4451      38.097      0.198
    15    151.0    .72097     0.5857      88.445      0.343
    16    262.0    .77007     0.7391     193.638      0.546
    17    331.0    .81918     0.9122     301.953      0.832
    18    578.0    .86828     1.1183     646.376      1.251
    19    637.0    .91738     1.3877     883.941      1.926
    20    942.0    .96594     1.8242    1718.408      3.328
Step 4. Compute the Probability Plot Correlation Coefficient using the simplified formula for r, where SD = 259.72 and C_n = 4.1375, to get

r = \frac{3836.81}{(4.1375)(259.72)\sqrt{19}} = 0.819
Step 5. Compare the computed value of r = 0.819 to the 5% critical value for sample size 20 in Table A-4, namely R_{.05,20} = 0.950. Since r < 0.950, the sample shows significant evidence of non-Normality by the Probability Plot Correlation Coefficient test. The data should be transformed using natural logs and the correlation coefficient recalculated before proceeding with further statistical analysis.
EXAMPLE 5
The data in Examples 1, 2, 3, and 4 showed significant evidence of non-Normality. Instead
of first logging the concentrations before testing for Normality, the original data were used. This
was done to illustrate why the Lognormal distribution is usually a better default model than the
Normal. In this example, use the same data to determine whether the measurements better follow a
Lognormal distribution.
Computing the natural logarithms of the data gives the table below.
Logged Nickel Concentrations log(ppb)

    Month    Well 1    Well 2    Well 3    Well 4
      1       4.07      2.94      3.66      1.13
      2       0.00      4.40      5.02      6.85
      3       5.57      5.80      3.30      4.45
      4       4.03      2.64      3.06      2.30
      5       2.16      4.17      6.36      6.46
SOLUTION
Method 1. Probability Plots
Step 1. List the natural logarithms of the measured nickel concentrations in order from lowest to
highest.
    Order    Log Nickel        Probability      Normal
     (i)     Concentration    100*(i/(n+1))    Quantiles
              log(ppb)
      1         0.00                5            -1.645
      2         1.13               10            -1.28
      3         2.16               14            -1.08
      4         2.30               19            -0.88
      5         2.64               24            -0.706
      6         2.94               29            -0.55
      7         3.06               33            -0.44
      8         3.30               38            -0.305
      9         3.66               43            -0.176
     10         4.03               48            -0.05
     11         4.07               52             0.05
     12         4.17               57             0.176
     13         4.40               62             0.305
     14         4.45               67             0.44
     15         5.02               71             0.55
     16         5.57               76             0.706
     17         5.80               81             0.88
     18         6.36               86             1.08
     19         6.46               90             1.28
     20         6.85               95             1.645
Step 2. Compute the probability as shown in the third column by calculating 100*(i/(n+1)), where n is the total number of samples (n=20). The corresponding Normal quantiles are given in column 4.
Step 3. Plot the Normal quantiles against the natural logarithms of the observed concentrations
to get the following graph. The plot indicates a nearly straight line fit (verified by
calculation of the Correlation Coefficient given in Method 4). There is no substantial
evidence that the data do not follow a Lognormal distribution. The Normal-theory
procedure(s) should be performed on the log-transformed data.
[Figure: Probability Plot of Normal quantiles versus LN(Nickel), ln(ppb)]
Method 2. Coefficient of Skewness
Step 1. Calculate the mean, SD, and average cubed residual of the natural logarithms of the data.

\bar{x} = 3.918 log(ppb)
SD = 1.802 log(ppb)
\frac{1}{n}\sum_i (x_i - \bar{x})^3 = -1.325 log^3(ppb)

Step 2. Calculate the Skewness Coefficient, \gamma_1.

\gamma_1 = \frac{-1.325}{(.95)^{3/2}(1.802)^3} = -0.244
Step 3. Compute the absolute value of the skewness, |\gamma_1| = |-0.244| = 0.244.
Step 4. Since the absolute value of the Skewness Coefficient is less than 1, the data do not show
evidence of significant skewness. A Normal approximation to the log-transformed data
may therefore be appropriate, but this model should be further checked.
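A short sketch of the skewness calculation used in Method 2 follows; the moment estimator shown is an assumption on my part, but it reproduces the value of -0.244 computed above for the logged nickel data.

    import numpy as np

    def skewness_coef(x):
        # Coefficient of skewness: average cubed residual over sigma-hat cubed
        x = np.asarray(x, dtype=float)
        n = len(x)
        m3 = np.mean((x - x.mean()) ** 3)   # average cubed residual
        sd = np.std(x, ddof=1)              # sample SD (n-1 denominator)
        return m3 / (((n - 1) / n) ** 1.5 * sd ** 3)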
Method 3. Shapiro-Wilk Test
Step 1. Order the logged data from smallest to largest and list, as in the following table. Also list the data in reverse order and compute the differences x_(n-i+1) - x_(i), as in Example 3.
    i    LN(x_(i))   LN(x_(n-i+1))   a_{n-i+1}     b_i
    1      0.00          6.85          .4734       3.24
    2      1.13          6.46          .3211       1.71
    3      2.16          6.36          .2565       1.08
    4      2.30          5.80          .2085       0.73
    5      2.64          5.57          .1686       0.49
    6      2.94          5.02          .1334       0.28
    7      3.06          4.45          .1013       0.14
    8      3.30          4.40          .0711       0.08
    9      3.66          4.17          .0422       0.02
    10     4.03          4.07          .0140       0.00
    11     4.07          4.03
    12     4.17          3.66
    13     4.40          3.30
    14     4.45          3.06
    15     5.02          2.94
    16     5.57          2.64
    17     5.80          2.30
    18     6.36          2.16
    19     6.46          1.13
    20     6.85          0.00
                                              b = 7.77
Step 2. Compute k=10, since n/2=10. Look up the coefficients a_{n-i+1} from Table A-1 and multiply by the first k differences between columns 2 and 1 to get the quantities b_i. Add these 10 products to get b = 7.77.
Step 3. Compute the standard deviation of the logged data, SD = 1.8014. Then the Shapiro-Wilk statistic is given by

W = \left[ \frac{7.77}{1.8014\sqrt{19}} \right]^2 = 0.979.
Step 4. Compare the computed value of W to the 5% critical value for sample size 20 in Table A-2, namely W_{.05,20} = 0.905. Since W = 0.979 > 0.905, the sample shows no significant evidence of non-Normality by the Shapiro-Wilk test. Proceed with further statistical analysis using the log-transformed data.
Method 4. Probability Plot Correlation Coefficient
Step 1. Order the logged data from smallest to largest and list below.
    Order    Log Nickel       m_i        M_i       X_(i)*M_i    M_i^2
     (i)     Concentration
              log(ppb)
      1         0.00         .03406    -1.8242      0.000       3.328
      2         1.13         .08262    -1.3877     -1.568       1.926
      3         2.16         .13172    -1.1183     -2.416       1.251
      4         2.30         .18082    -0.9122     -2.098       0.832
      5         2.64         .22993    -0.7391     -1.951       0.546
      6         2.94         .27903    -0.5857     -1.722       0.343
      7         3.06         .32814    -0.4451     -1.362       0.198
      8         3.30         .37724    -0.3127     -1.032       0.098
      9         3.66         .42634    -0.1857     -0.680       0.034
     10         4.03         .47545    -0.0616     -0.248       0.004
     11         4.07         .52455     0.0616      0.251       0.004
     12         4.17         .57366     0.1857      0.774       0.034
     13         4.40         .62276     0.3127      1.376       0.098
     14         4.45         .67186     0.4451      1.981       0.198
     15         5.02         .72097     0.5857      2.940       0.343
     16         5.57         .77007     0.7391      4.117       0.546
     17         5.80         .81918     0.9122      5.291       0.832
     18         6.36         .86828     1.1183      7.112       1.251
     19         6.46         .91738     1.3877      8.965       1.926
     20         6.85         .96594     1.8242     12.496       3.328
Step 2. Compute the quantities m_i and the order statistic medians M_i, according to the procedure in Example 4 (note that these values depend only on the sample size and are identical to the quantities in Example 4).
Step 3. Compute the products X_(i)*M_i in column 4 and sum to get the numerator of the correlation coefficient (equal to 32.226 in this case). Also compute M_i^2 in column 5 and sum to find the quantity C_n^2 = 17.12.
Step 4. Compute the Probability Plot Correlation Coefficient using the simplified formula for r, where SD = 1.8025 and C_n = 4.1375, to get

r = \frac{32.226}{(4.1375)(1.8025)\sqrt{19}} = 0.991

Step 5. Compare the computed value of r = 0.991 to the 5% critical value for sample size 20 in Table A-4, namely R_{.05,20} = 0.950. Since r > 0.950, the logged data show no significant evidence of non-Normality by the Probability Plot Correlation Coefficient test. Therefore, Lognormality of the original data could be assumed in subsequent statistical procedures.
1.2 TESTING FOR HOMOGENEITY OF VARIANCE
One of the most important assumptions for the parametric analysis of variance (ANOVA) is
that the different groups (e.g., different wells) have approximately the same variance. If this is not
the case, the power of the F-test (its ability to detect differences among the group means) is
reduced. Mild differences in variance have relatively little effect. The effect becomes noticeable when the largest and smallest group variances differ by a ratio of about 4 and becomes quite severe when the ratio is 10 or more (Milliken and Johnson, 1984).
The procedure suggested in the EPA guidance document, Bartlett's test, is one way to test
whether the sample data give evidence that the well groups have different variances. However,
Bartlett's test is sensitive to non-Normality in the data and may give misleading results unless one
knows in advance that the data are approximately Normal (Milliken and Johnson, 1984). As an
alternative to Bartlett's test, two procedures for testing homogeneity of the variances are described
below that are less sensitive to non-Normality.
1.2.1 Box Plots
Box Plots were first developed for exploratory data analysis as a quick way to visualize the
"spread" or dispersion within a data set. In the context of variance testing, one can construct a Box
Plot for each well group and compare the boxes to see if the assumption of equal variances is
reasonable. Such a comparison is not a formal test procedure, but is easier to perform and is often
sufficient for checking the group variance assumption.
The idea behind a Box Plot is to order the data from lowest to highest and to trim off 25
percent of the observations on either end, leaving just the middle 50 percent of the sample values.
The spread between the lowest and highest values of this middle 50 percent (known as the
interquartile range or IQR) is represented by the length of the box. The very middle observation
(i.e., the median) can also be shown as a line cutting the box in two.
To construct a Box Plot, calculate the median and the lower and upper quartiles of the data set (respectively, the 50th, 25th, and 75th percentiles). To do this, calculate k=p(n+1)/100 where n=number of samples and p=percentile of interest. If k is an integer, let the kth ordered or ranked value be an estimate of the pth percentile of the data. If k is not an integer, let the pth percentile be equal to the average of the two values closest in rank position to k. For example, if the data set
consists of the 10 values {1, 4, 6.2, 10, 15, 17.1, 18, 22, 25, 30.5}, the position of the median
20
-------
Draft 1/28/93
would be found as 50*(10+1)/100=5.5. The median would then be computed as the average of
the 5th and 6th ordered values, or (15+17.1)/2=16.05.
Likewise, the position of the lower quartile would be 25*(10+1)/100=2.75. Calculate the average of the 2nd and 3rd ordered observations to estimate this percentile, i.e., (4+6.2)/2=5.1. Since the upper quartile is found to be 23.5, the length of the Box Plot would be the difference between the upper and lower quartiles, or (23.5-5.1)=18.4. The box itself should be drawn on a graph with the y-axis representing concentration and the x-axis denoting the wells being plotted. Three horizontal lines are drawn for each well, one line each at the lower and upper quartiles and another at the median concentration. Vertical connecting lines are drawn to complete the box.
Most statistics packages can directly calculate the statistics needed to draw a Box Plot, and
many will construct the Box Plots as well. In some computer packages, the Box Plot will also
have two "whiskers" extending from the edges of the box. These lines indicate the positions of
extreme values in the data set, but generally should not be used to approximate the overall
dispersion.
If the box length for each group is less than 3 times the length of the shortest box, the sample
variances are probably close enough to assume equal group variances. If, however, the box length
for any group is at least triple the length of the box for another group, the variances may be
significantly different (Kirk Cameron, SAIC, personal communication). In that case, the data
should be further checked using Levene's test described in the following section. If Levene's test
is significant, the data may need to be transformed or a non-parametric rank procedure considered
before proceeding with further analysis.
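The positional percentile rule described above is easy to script. The sketch below is an illustration only; applied to the 10-value data set used in the text it reproduces the quartiles 5.1, 16.05, and 23.5.

    import numpy as np

    def percentile_pn1(data, p):
        # pth percentile by the k = p(n+1)/100 positional rule described above
        x = np.sort(np.asarray(data, dtype=float))
        k = p * (len(x) + 1) / 100.0
        lo = int(np.floor(k))
        if k == lo:                        # k is an integer: take the kth value
            return x[lo - 1]
        return 0.5 * (x[lo - 1] + x[lo])   # else average the straddling values

    data = [1, 4, 6.2, 10, 15, 17.1, 18, 22, 25, 30.5]
    print([percentile_pn1(data, p) for p in (25, 50, 75)])  # [5.1, 16.05, 23.5]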
EXAMPLE 6
Construct Box Plots for each well group to test for equality of variances.
Arsenic Concentration (ppm)

    Month    Well 1    Well 2    Well 3    Well 4    Well 5    Well 6
      1       22.9      2.0       2.0       7.84     24.9      0.34
      2       3.09      1.25    109.4       9.3       1.3      4.78
      3       35.7      7.8       4.5      25.9       0.75     2.85
      4       4.18     52.0       2.5       2.0      27.0      1.2
SOLUTION
Step 1. Compute the 25th, 50th, and 75th percentiles for the data in each well group. To
calculate the pth percentile by hand, order the data from lowest to highest. Calculate
p*(n+1)/100 to find the ordered position of the pth percentile. If necessary, interpolate
between sample values to estimate the desired percentile.
Step 2. Using well 1 as an example, n+1=5 (since there are 4 data values). To calculate the 25th percentile, compute its ordered position (i.e., rank) as 25*5/100=1.25. Average the 1st and 2nd ranked values at well 1 (i.e., 3.09 and 4.18) to find an estimated lower quartile of 3.64. This estimate gives the lower end of the Box Plot. The upper end or 75th percentile can be computed similarly as the average of the 3rd and 4th ranked values, or (22.9+35.7)/2=29.3. The median is the average of the 2nd and 3rd ranked values, giving an estimate of 13.54.
Step 3. Construct Box Plots for each well group, lined up side by side on the same axes.
[Figure: Box Plots of Well Data; y-axis: Concentration (ppm), 0 to 120; x-axis: Well, 1 to 6]
Step 4. Since the box length for well 3 is more than three times the box lengths for wells 4 and
6, there is evidence that the group variances may be significantly different. These data
should be further checked using Levene's test described in the next section.
1.2.2 Levene's Test
Levene's test is a more formal procedure than Box Plots for testing homogeneity of variance that, unlike Bartlett's test, is not sensitive to non-Normality in the data. Levene's test has been shown to have power nearly as great as Bartlett's test for Normally distributed data and power superior to Bartlett's for non-Normal data (Milliken and Johnson, 1984).
To conduct Levene's test, first compute the new variables

z_{ij} = \left| x_{ij} - \bar{x}_i \right|

where x_{ij} represents the jth value from the ith well and \bar{x}_i is the ith well mean. The values z_{ij} represent the absolute values of the usual residuals. Then run a standard one-way analysis of variance (ANOVA) on the variables z_{ij}. If the F-test is significant, reject the hypothesis of equal group variances. Otherwise, proceed with analysis of the x_{ij}'s as initially planned.
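Because Levene's test is simply a one-way ANOVA on the absolute residuals, it is easy to script. The sketch below uses the arsenic data of Example 6 together with scipy's standard one-way ANOVA routine; it should reproduce the F-ratio of about 4.56 found in Example 7.

    import numpy as np
    from scipy import stats

    # Arsenic concentrations (ppm) from Example 6, one list per well
    wells = [[22.9, 3.09, 35.7, 4.18], [2.0, 1.25, 7.8, 52.0],
             [2.0, 109.4, 4.5, 2.5], [7.84, 9.3, 25.9, 2.0],
             [24.9, 1.3, 0.75, 27.0], [0.34, 4.78, 2.85, 1.2]]

    # Levene's test: ANOVA on the absolute residuals z_ij = |x_ij - xbar_i|
    z = [np.abs(np.asarray(w) - np.mean(w)) for w in wells]
    F, p = stats.f_oneway(*z)
    print(f"F = {F:.2f}, p = {p:.3f}")

(scipy also offers stats.levene(..., center='mean'), which performs the same computation directly.)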
EXAMPLE 7
Use the data from Example 6 to conduct Levene's test of equal variances.
SOLUTION
Step 1. Calculate the group mean for each well (\bar{x}_i):

    Well 1 mean = 16.47     Well 4 mean = 11.26
    Well 2 mean = 15.76     Well 5 mean = 13.49
    Well 3 mean = 29.60     Well 6 mean =  2.29
23
-------
Draft 1/28/93
Step 2. Compute the absolute residuals z_{ij} in each well and the well means of the residuals (\bar{z}_i).

                               Absolute Residuals
    Month        Well 1   Well 2   Well 3   Well 4   Well 5   Well 6
      1           6.43    13.76     27.6     3.42    11.41     1.95
      2          13.38    14.51     79.8     1.96    12.19     2.49
      3          19.23     7.96     25.1    14.64    12.74     0.56
      4          12.29    36.24     27.1     9.26    13.51     1.09

    Well Mean    12.83    18.12     39.9     7.32    12.46     1.52

    Overall Mean (\bar{z}) = 15.36
Step 3. Compute the sums of squares for the absolute residuals.

SS_TOTAL = (N-1) SD_z^2 = 6300.89
SS_WELLS = \sum_i n_i (\bar{z}_i - \bar{z})^2 = 3522.90
SS_ERROR = SS_TOTAL - SS_WELLS = 2777.99
Step 4. Construct an analysis of variance table to calculate the F-statistic. The degrees of freedom (df) are computed as (#groups-1)=(6-1)=5 df and (#samples-#groups)=(24-6)=18 df.

ANOVA Table

    Source           Sum-of-Squares    df    Mean-Square    F-Ratio      P
    Between Wells        3522.90        5       704.58        4.56     0.007
    Error                2777.99       18       154.33
    Total                6300.89       23
Step 5. Since the F-statistic of 4.56 exceeds the tabulated value of F_{.05} = 2.77 with 5 and 18 df, the assumption of equal variances should be rejected. Since the original concentration data are used in this example, the data should be logged and retested.
2. RECOMMENDATIONS FOR HANDLING
NONDETECTS
The basic recommendations within the Interim Final Guidance for handling nondetect
analyses include the following (see p. 8-2): 1) if less than 15 percent of all samples are nondetect,
replace each nondetect by half its detection or quantitation limit and proceed with a parametric
analysis, such as ANOVA, Tolerance Limits, or Prediction Limits; 2) if the percent of nondetects is
between 15 and 50, either use Cohen's adjustment to the sample mean and variance in order to
proceed with a parametric analysis, or employ a non-parametric procedure by using the ranks of
the observations and by treating all nondetects as tied values; 3) if the percent of nondetects is
greater than 50 percent, use the Test of Proportions.
As to the first recommendation, experience at EPA and research at the United States
Geological Survey (USGS, Dennis Helsel, personal communication, 1991) has indicated that if
less than 15 percent of the samples are nondetect, the results of parametric statistical tests will not
be substantially affected if nondetects are replaced by half their detection limits. When more than
15 percent of the samples are nondetect, however, the handling of nondetects is more crucial to the
outcome of statistical procedures. Indeed, simple substitution methods tend to perform poorly in
statistical tests when the nondetect percentage is substantial (Gilliom and Helsel, 1986).
Even with a small proportion of nondetects, however, care should be taken when choosing
between the method detection limit (MDL) and the practical quantitation limit (PQL) in
characterizing "nondetect" concentrations. Many nondetects are characterized by analytical
laboratories with one of three data qualifier flags: "U," "J," or "E." Samples with a "U" data
qualifier represent "undetected" measurements, meaning that the signal characteristic of that analyte
could not be observed or distinguished from "background noise" during lab analysis. Inorganic
samples with an "E" flag and organic samples with a "J" flag may or may not be reported with an
estimated concentration. If no concentration is estimated, these samples represent "detected but not
quantified" measurements. In this case, the actual concentration is assumed to be positive, but
somewhere between zero and the PQL. Since all of these non-detects may or may not have actual
positive concentrations between zero and the PQL, the suggested substitution for parametric
statistical procedures is to replace each nondetect by one-half the PQL (note, however, that "E" and
"J" samples reported with estimated concentrations should be treated, for statistical purposes, as
valid measurements. Substitution of one-half the PQL is not recommended for these samples).
In no case should nondetect concentrations be assumed to be bounded above by the MDL.
The MDL is estimated on the basis of ideal laboratory conditions with ideal analyte samples and
does not account for matrix or other interferences encountered when analyzing specific, actual field
samples. For this reason, the PQL should be taken as the most reasonable upper bound for
nondetect concentrations.
It should also be noted that the distinction between "undetected" and "detected but not
quantified" measurements has more specific implications for rank-based non-parametric
procedures. Rather than assigning the same tied rank to all nondetects (see below and in Section
3), "detected but not quantified" measurements should be given larger ranks than those assigned to
"undetected" samples. In fact the two types of nondetects should be treated as two distinct groups
of tied observations for use in the Wilcoxon and Kruskal-Wallis non-parametric procedures.
2.1 NONDETECTS IN ANOVA PROCEDURES
For a moderate to large percentage of nondetects (i.e., over 15%), the handling of nondetects
should vary depending on the statistical procedure to be run. If background data from one or more
upgradient wells are to be compared simultaneously with samples from one or more downgradient
wells via a t-test or ANOVA type procedure, the simplest and most reliable recommendation is to
switch to a non-parametric analysis. The distributional assumptions for parametric procedures can
be rather difficult to check when a substantial fraction of nondetects exists. Furthermore, the non-
parametric alternatives described in Section 3 tend to be efficient at detecting contamination when
the underlying data are Normally distributed, and are often more powerful than the parametric
methods when the underlying data do not follow a Normal distribution.
Nondetects are handled easily in a nonparametric analysis. All data values are first ordered
and replaced by their ranks. Nondetects are treated as tied values and replaced by their midranks
(see Section 3). Then a Wilcoxon Rank-Sum or Kruskal-Wallis test is run on the ranked data
depending on whether one or more than one downgradient well is being tested.
The Test of Proportions is not recommended in this Addendum, even if the percentage of
nondetects is over 50 percent. Instead, for all two-group comparisons that involve more than 15
percent nondetects, the non-parametric Wilcoxon Rank-Sum procedure is recommended.
Although acceptable as a statistical procedure, the Test of Proportions does not account for
potentially different magnitudes among the concentrations of detected values. Rather, each sample
is treated as a 0 or 1 depending on whether the measured concentration is below or above the
detection limit. The Test of Proportions ignores information about concentration magnitudes, and
hence is usually less powerful than a non-parametric rank-based test like the Wilcoxon Rank-Sum,
even after adjusting for a large fraction of tied observations (e.g., nondetects). This is because the
ranks of a dataset preserve additional information about the relative magnitudes of the concentration
values, information which is lost when all observations are scored as 0's and 1's.
Another drawback to the Test of Proportions, as presented in the Interim Final Guidance, is
that the procedure relies on a Normal probability approximation to the Binomial distribution of 0's and 1's. This approximation is recommended only when the quantities n(%NDs) and n(1-%NDs) are no smaller than 5. If the percentage of nondetects is quite high and/or the sample size is fairly small, these conditions may be violated, leading potentially to inaccurate results.
Comparison of the Test of Proportions to the Wilcoxon Rank-Sum test shows that for small
to moderate proportions of nondetects (say 0 to 60 percent), the Wilcoxon Rank-Sum procedure
adjusted for ties is more powerful in identifying real concentration differences than the Test of
Proportions. When the percentage of nondetects is quite high (at least 70 to 75 percent), the Test
of Proportions appears to be slightly more powerful in some cases than the Wilcoxon, but the
results of the two tests almost always lead to the same conclusion, so it makes sense to simply
recommend the Wilcoxon Rank-Sum test in all cases where nondetects constitute more than 15
percent of the samples.
2.2 NONDETECTS IN STATISTICAL INTERVALS
If the chosen method is a statistical interval (Confidence, Tolerance or Prediction limit) used
to compare background data against each downgradient well separately, more options are available
for handling moderate proportions of nondetects. The basis of any parametric statistical interval
limit is the formula \bar{x} ± K \cdot s, where \bar{x} and s represent the sample mean and standard deviation of the (background) data and K depends on the interval type and characteristics of the monitoring
network. To use a parametric interval in the presence of a substantial number of nondetects, it is
necessary to estimate the sample mean and standard deviation. But since nondetect concentrations
are unknown, simple formulas for the mean and standard deviation cannot be computed directly.
Two basic approaches to estimating or "adjusting" the mean and standard deviation in this situation
have been described by Cohen (1959) and Aitchison (1955).
The underlying assumptions of these procedures are somewhat different. Cohen's
adjustment (which is described in detail on pp. 8-7 to 8-11 of the Interim Final Guidance) assumes
that all the data (detects and nondetects) come from the same Normal or Lognormal population, but
that nondetect values have been "censored" at their detection limits. This implies that the
contaminant of concern is present in nondetect samples, but the analytical equipment is not
sensitive to concentrations lower than the detection limit. Aitchison's adjustment, on the other
hand, is constructed on the assumption that nondetect samples are free of contamination, so that all
nondetects may be regarded as zero concentrations. In some situations, particularly when the
analyte of concern has been detected infrequently in background measurements, this assumption
may be practical, even if it cannot be verified directly.
Before choosing between Cohen's and Aitchison's approaches, it should be cautioned that Cohen's adjustment may not give valid results if the proportion of nondetects exceeds 50%. In a
case study by McNichols and Davis (1988), the false positive rate associated with the use of t-tests
based on Cohen's method rose substantially when the fraction of nondetects was greater than 50%.
This occurred because the adjusted estimates of the mean and standard deviation are more highly
correlated as the percentage of nondetects increases, leading to less reliable statistical tests
(including statistical interval tests).
On the other hand, with less than 50% nondetects, Cohen's method performed adequately in
the McNichols and Davis case study, provided the data were not overly skewed and that more
extensive tables than those included within the Interim Final Guidance were available to calculate
Cohen's adjustment parameter. As a remedy to the latter caveat, a more extensive table of Cohen's
adjustment parameter is provided in Appendix A (Table A-5). It is also recommended that the data (detected measurements and nondetect detection limits) first be log-transformed prior to computing either Cohen's or Aitchison's adjustment, especially since both procedures assume that the underlying data are Normally distributed.
2.2.1 Censored and Detects-Only Probability Plots
To decide which approach is more appropriate for a particular set of ground water data, two
separate Probability Plots can be constructed. The first is called a Censored Probability Plot and is
a test of Cohen's underlying assumption. In this method, the combined set of detects and
nondetects is ordered (with nondetects being given arbitrary but distinct ranks). Cumulative
probabilities or Normal quantiles (see Section 1.1) are then computed for the data set as in a
regular Probability Plot. However, only the detected values and their associated Normal quantiles
are actually plotted. If the shape of the Censored Probability Plot is reasonably linear, then
Cohen's assumption that nondetects have been "censored" at their detection limit is probably
acceptable and Cohen's adjustment can be made to estimate the sample mean and standard
deviation. If the Censored Probability Plot has significant bends and curves, particularly in one or
both tails, one might consider Aitchison's procedure instead.
To test the assumptions of Aitchison's method, a Detects-Only Probability Plot may be
constructed. In this case, nondetects are completely ignored and a standard Probability Plot is
constructed using only the detected measurements. Thus, cumulative probabilities or Normal
quantiles are computed only for the ordered detected values. Comparison of a Detects-Only
Probability Plot with a Censored Probability Plot will indicate that the same number of points and
concentration values are plotted on each graph. However, different Normal quantiles are
associated with each detected concentration. If the Detects-Only Probability Plot is reasonably
linear, then the assumptions underlying Aitchison's adjustment (i.e., that "nondetects" represent
zero concentrations, and that detects and nondetects follow separate probability distributions) are
probably reasonable.
If it is not clear which of the Censored or Detects-Only Probability Plots is more linear,
Probability Plot Correlation Coefficients can be computed for both approaches (note that the
correlations should only involve the points actually plotted, that is, detected concentrations). The
plot with the higher correlation coefficient will represent the most linear trend. Be careful,
however, to use other, non-statistical judgments to help decide which of Cohen's and Aitchison's
underlying assumptions appears to be most reasonable based on the specific characteristics of the
data set. It is also likely that these Probability Plots may have to be constructed on the logarithms
of the data instead of the original values, if in fact the most appropriate underlying distribution is
the Lognormal instead of the Normal.
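The following sketch illustrates, under the assumptions just described, how the two sets of plotting positions and their correlation coefficients might be computed for the zinc data of Example 8.

    import numpy as np
    from scipy import stats

    detects = np.array([8.74, 9.36, 10.00, 10.50, 10.90, 11.05, 11.15, 11.41,
                        11.56, 11.69, 12.00, 12.22, 12.35, 12.59, 12.85, 13.24,
                        13.31, 13.70, 14.20, 15.00])
    n_total, n_det = 40, len(detects)     # 20 nondetects (<7 ppb) rank lowest

    # Censored plot: detects occupy ranks 21..40 among all 40 samples
    cens_q = stats.norm.ppf(np.arange(n_total - n_det + 1, n_total + 1)
                            / (n_total + 1.0))
    # Detects-only plot: detects are ranked 1..20 among themselves
    only_q = stats.norm.ppf(np.arange(1, n_det + 1) / (n_det + 1.0))

    for name, q in (("censored", cens_q), ("detects-only", only_q)):
        r = np.corrcoef(detects, q)[0, 1]
        print(f"{name:>12}: r = {r:.3f}")   # about .969 versus .998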
EXAMPLE 8
Create Censored and Detects-Only Probability Plots with the following zinc data to determine
whether Cohen's adjustment or Aitchison's adjustment is most appropriate for estimating the true
mean and standard deviation.
Zinc Concentrations (ppb) at Background Wells
    Sample    Well 1    Well 2    Well 3    Well 4    Well 5
      1        <7        <7        <7       11.69      <7
      2       11.41      <7       12.85     10.90      <7
      3        <7       13.70     14.20      <7        <7
      4        <7       11.56      9.36     12.22     11.15
      5        <7        <7        <7       11.05     13.31
      6       10.00      <7       12.00      <7       12.35
      7       15.00     10.50      <7       13.24      <7
      8        <7       12.59      <7        <7        8.74
SOLUTION
Step 1. Pool together the data from the five background wells and list in order in the table
below.
Step 2. To construct the Censored Probability Plot, compute the probabilities i/(n+1) using the combined set of detects and nondetects, as in column 3. Find the Normal quantiles associated with these probabilities by applying the inverse standard Normal transformation, \Phi^{-1}.
Step 3. To construct the Detects-Only Probability Plot, compute the probabilities in column 5
using only the detected zinc values. Again apply the inverse standard Normal
transformation to find the associated Normal quantiles in column 6. Note that
nondetects are ignored completely in this method.
    Order    Zinc Conc.    Censored    Normal       Detects-Only    Normal
     (i)       (ppb)        Probs.     Quantiles       Probs.       Quantiles
      1         <7          .024       -1.971
      2         <7          .049       -1.657
      3         <7          .073       -1.453
      4         <7          .098       -1.296
      5         <7          .122       -1.165
      6         <7          .146       -1.052
      7         <7          .171       -0.951
      8         <7          .195       -0.859
      9         <7          .220       -0.774
     10         <7          .244       -0.694
     11         <7          .268       -0.618
     12         <7          .293       -0.546
     13         <7          .317       -0.476
     14         <7          .341       -0.408
     15         <7          .366       -0.343
     16         <7          .390       -0.279
     17         <7          .415       -0.216
     18         <7          .439       -0.153
     19         <7          .463       -0.092
     20         <7          .488       -0.031
     21         8.74        .512        0.031         .048          -1.668
     22         9.36        .537        0.092         .095          -1.309
     23        10.00        .561        0.153         .143          -1.068
     24        10.50        .585        0.216         .190          -0.876
     25        10.90        .610        0.279         .238          -0.712
     26        11.05        .634        0.343         .286          -0.566
     27        11.15        .659        0.408         .333          -0.431
     28        11.41        .683        0.476         .381          -0.303
     29        11.56        .707        0.546         .429          -0.180
     30        11.69        .732        0.618         .476          -0.060
     31        12.00        .756        0.694         .524           0.060
     32        12.22        .780        0.774         .571           0.180
     33        12.35        .805        0.859         .619           0.303
     34        12.59        .829        0.951         .667           0.431
     35        12.85        .854        1.052         .714           0.566
     36        13.24        .878        1.165         .762           0.712
     37        13.31        .902        1.296         .810           0.876
     38        13.70        .927        1.453         .857           1.068
     39        14.20        .951        1.657         .905           1.309
     40        15.00        .976        1.971         .952           1.668
Step 4. Plot the detected zinc concentrations versus each set of probabilities or Normal quantiles,
as per the procedure for constructing Probability Plots (see figures below). The
nondetect values should not be plotted. As can be seen from the graphs, the Censored
Probability Plot indicates a definite curvature in the tails, especially the lower tail. The
Detects-Only Probability Plot, however, is reasonably linear. This visual impression is
bolstered by calculation of a Probability Plot Correlation Coefficient for each set of
detected values: the Censored Probability Plot has a correlation of r=.969, while the Detects-Only Probability Plot has a correlation of r=.998.
Step 5. Because the Detects-Only Probability Plot is substantially more linear than the Censored
Probability Plot, it may be appropriate to consider detects and nondetects as arising from
statistically distinct distributions, with nondetects representing "zero" concentrations.
Therefore, Aitchison's adjustment may lead to better estimates of the true mean and
standard deviation than Cohen's adjustment for censored data.
[Figure: Censored Probability Plot; x-axis: Zinc Concentrations (ppb), 7 to 16; y-axis: Normal Quantiles]
[Figure: Detects-Only Probability Plot; x-axis: Zinc Concentrations (ppb), 7 to 16; y-axis: Normal Quantiles]
2.2.2 Aitchison's Adjustment

To actually compute Aitchison's adjustment (Aitchison, 1955), it is assumed that the detected samples follow an underlying Normal distribution. If the detects are Lognormal, compute Aitchison's adjustment on the logarithms of the data instead. Let d = # nondetects and let n = total # of samples (detects and nondetects combined). Then if \bar{x}^* and s^* denote respectively the sample mean and standard deviation of the detected values, the adjusted overall mean can be estimated as

\hat{\mu} = \left(1 - \frac{d}{n}\right)\bar{x}^*

and the adjusted overall standard deviation may be estimated as the square root of the quantity

\hat{\sigma}^2 = \frac{n-d-1}{n-1}(s^*)^2 + \frac{d(n-d)}{n(n-1)}(\bar{x}^*)^2

The general formula for a parametric statistical interval adjusted for nondetects by Aitchison's method is given by \hat{\mu} ± K \cdot \hat{\sigma}, with K depending on the type of interval being constructed.
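A minimal sketch of these two formulas, written to match the notation above:

    import numpy as np

    def aitchison(detects, d, n):
        # Aitchison-adjusted mean and SD; nondetects treated as true zeros
        xbar = np.mean(detects)             # mean of detected values only
        s = np.std(detects, ddof=1)         # SD of detected values only
        mu = (1.0 - d / n) * xbar
        var = ((n - d - 1) / (n - 1)) * s**2 \
              + (d * (n - d) / (n * (n - 1))) * xbar**2
        return mu, np.sqrt(var)

Applied to the zinc data of Example 8 (d=20 nondetects out of n=40 samples), this returns the adjusted estimates computed in Example 9 below.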
EXAMPLE 9
In Example 8, it was determined that Aitchison's adjustment might lead to more appropriate estimates of the true mean and standard deviation than Cohen's adjustment. Use the data in Example 8 to compute Aitchison's adjustment.
SOLUTION
Step 1. The zinc data consists of 20 nondetects and 20 detected values; therefore d=20 and n=40
in the above formulas.
Step 2. Compute the average \bar{x}^* = 11.891 and the standard deviation s^* = 1.595 of the set of detected values.
Step 3. Use the formulas for Aitchison's adjustment to compute estimates of the true mean and standard deviation:

\hat{\mu} = \left(1 - \frac{20}{40}\right) \times 11.891 = 5.95

\hat{\sigma}^2 = \frac{19}{39}(1.595)^2 + \frac{(20)(20)}{(40)(39)}(11.891)^2 = 37.50, so that \hat{\sigma} = 6.12.
If Cohen's adjustment is mistakenly computed on these data instead, with a detection limit of 7 ppb, the estimates become \hat{\mu} = 7.63 and \hat{\sigma} = 4.83. Thus, the choice of adjustment can have a significant impact on the upper limits computed for statistical intervals.
2.2.3 More Than 50% Nondetects
If more than 50% but less than 90% of the samples are nondetect or the assumptions of
Cohen's and Aitchison's methods cannot be justified, parametric statistical intervals should be
abandoned in favor of non-parametric alternatives (see Section 3 below). Nonparametric
statistical intervals are easy to construct and apply to ground water data measurements, and no
special steps need be taken to handle nondetects.
When 90% or more of the data values are nondetect (as often occurs when measuring volatile
organic compounds [VOCs] in ground water, for instance), the detected samples can often be
modeled as "rare events" by using the Poisson distribution. The Poisson model describes the
behavior of a series of independent events over a large number of trials, where the probability of
occurrence is low but stays constant from trial to trial. The Poisson model is similar to the
Binomial model in that both models represent "counting processes." In the Binomial case,
nondetects are counted as 'misses' or zeroes and detects are counted (regardless of contamination
level) as 'hits' or ones; in the case of the Poisson, each particle or molecule of contamination is counted separately but cumulatively, so that the counts for detected samples with high concentrations are larger than counts for samples with smaller concentrations. As Gibbons (1987, p. 574) has noted, it can be postulated

...that the number of molecules of a particular compound out of a much larger number of molecules of water is the result of a Poisson process. For example, we might consider 12 ppb of benzene to represent a count of 12 units of benzene for every billion units examined. In this context, Poisson's approach is justified in that the number of units (i.e., molecules) is large, and the probability of the occurrence (i.e., a molecule being classified as benzene) is small.
For a detect with concentration of 50 ppb, the Poisson count would be 50. Counts for nondetects can be taken as zero or perhaps equal to half the detection limit (e.g., if the detection limit were 10 ppb, the Poisson count for that sample would be 5). Unlike the Binomial (Test of Proportions) model, the Poisson model has the ability to utilize the magnitudes of detected concentrations in statistical tests.
The Poisson distribution is governed by the average rate of occurrence, \lambda, which can be estimated by summing the Poisson counts of all samples in the background pool of data and dividing by the number of samples in the pool. Once the average rate of occurrence has been estimated, the formula for the Poisson distribution is given by

P(x) = \frac{e^{-\lambda}\lambda^x}{x!}

where x represents the Poisson count and \lambda represents the average rate of occurrence. To use the Poisson distribution to predict concentration values at downgradient wells, formulas for constructing Poisson Prediction and Tolerance limits are given below.
2.2.4 Poisson Prediction Limits
To estimate a Prediction limit at a particular well using the Poisson model, the approach
described by Gibbons (1987b) and based on the work of Cox and Hinkley (1974) can be used. In this case, an upper limit is estimated for an interval that will contain all of k future measurements of an analyte with confidence level 1-\alpha, given n previous background measurements.
To do this, let T_n represent the sum of the Poisson counts of n background samples. The goal is to predict T_k^*, representing the total Poisson count of the next k sample measurements. As Cox and Hinkley show, if T_n has a Poisson distribution with mean \mu and if no contamination has occurred, it is reasonable to assume that T_k^* will also have a Poisson distribution but with mean c\mu, where c depends on the number of future measurements being predicted.

In particular, Cox and Hinkley demonstrate that the quantity

z = \frac{T_k^* - cT_n}{\sqrt{cT_n(1+c)}}

has an approximate standard Normal distribution. From this relation, an upper prediction limit for T_k^* is calculated by Gibbons to be approximately

T_k^* = cT_n + \frac{t^2 c}{2} + t\sqrt{cT_n(1+c) + \frac{t^2 c^2}{4}}

where t = t_{n-1,\alpha} is the upper (1-\alpha) percentile of the Student's t distribution with (n-1) degrees of freedom. The quantity c in the above formulas may be computed as k/n, where, as noted, k is the number of future samples being predicted.
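The prediction limit formula translates directly into a few lines of code; the sketch below should reproduce the limit of about 15.3 computed in Example 10.

    import numpy as np
    from scipy import stats

    def poisson_pred_limit(Tn, n, k, alpha=0.01):
        # Upper Poisson Prediction limit for the total count of k future
        # samples, per the Gibbons (1987b) formula above
        c = k / n
        t = stats.t.ppf(1 - alpha, n - 1)   # upper (1-alpha) t percentile
        return (c * Tn + t**2 * c / 2
                + t * np.sqrt(c * Tn * (1 + c) + t**2 * c**2 / 4))

    print(poisson_pred_limit(Tn=70.0, n=36, k=4))   # about 15.3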
EXAMPLE 10
Use the following benzene data from six background wells to estimate an upper 99% Poisson
Prediction limit for the next four measurements from a single downgradient well.
Benzene Concentrations (ppb)

    Month    Well 1    Well 2    Well 3    Well 4    Well 5    Well 6
      1       <2        <2        <2        <2        <2        <2
      2       <2        <2        <2       15.0       <2        <2
      3       <2        <2        <2        <2        <2        <2
      4       <2       12.0       <2        <2        <2        <2
      5       <2        <2        <2        <2        <2       10.0
      6       <2        <2        <2        <2        <2        <2
SOLUTION
Step 1. Pooling the background data yields n=36 samples, of which 33 (92%) are nondetect. Because the rate of detection is so infrequent (i.e., <10%), a Poisson-based Prediction limit may be appropriate. Since four future measurements are to be predicted, k=4, and hence, c = k/n = 1/9.
Step 2. Set each nondetect to half the detection limit or 1 ppb. Then compute the Poisson count of the sum of all the background samples, in this case, T_n = 33(1) + (12.0+15.0+10.0) = 70.0. To calculate an upper 99% Prediction limit, the upper 99th percentile of the t-distribution with (n-1)=35 degrees of freedom must be taken from a reference table, namely t_{35,.01} = 2.4377.
Step 3. Using Gibbons' formula above, calculate the upper Prediction limit as:

T_k^* = \frac{70}{9} + \frac{(2.4377)^2}{2 \cdot 9} + 2.4377\sqrt{\frac{70}{9}\left(1+\frac{1}{9}\right) + \frac{(2.4377)^2}{4 \cdot 81}} = 15.3
Step 4. To test the upper Prediction limit, the Poisson count of the sum of the next four downgradient well measurements should be calculated. If this sum is greater than 15.3 ppb, there is significant evidence of contamination at the downgradient well. If not, the well may be regarded as clean until the next testing period.
The procedure for generating Poisson prediction limits is somewhat flexible. The value k
above, for instance, need not represent multiple samples from a single well. It could also denote a
collection of single samples from k distinct wells, all of which are assumed to follow the same
Poisson distribution in the absence of contamination. The Poisson distribution also has the
desirable property that the sum of several Poisson variables also has a Poisson distribution, even if
the individual components are not identically distributed. Because of this, Gibbons (1987b) has
suggested that if several analytes (e.g., different VOCs) can all be modeled via the Poisson
distribution, the combined sum of the Poisson counts of all the analytes will also have a Poisson
distribution, meaning that a single prediction limit could be estimated for the combined group of
analytes, thus reducing the necessary number of statistical tests.
A major drawback to Gibbons' proposal of establishing a combined prediction limit for
several analytes is that if the limit is exceeded, it will not be clear which analyte is responsible for
"triggering" the test. In pan this problem explains why the ground-water monitoring regulations
mandate that each analyte be tested separately. Still, if a large number of analytes must be regularly
tested and the detection rate is quite low, the overall facility-wide false positive rate may be
unacceptably high. To remedy this situation, it is probably wisest to do enough initial testing of
background and facility leachate and waste samples to determine those specific parameters present
at levels substantially greater than background. By limiting monitoring and statistical tests to a few
parameters meeting the above conditions, it should be possible to contain the overall facility-wide false positive rate while satisfying the regulatory requirements and assuring reliable identification of ground-water contamination if it occurs.
Though quantitative information on a suite of VOCs may be automatically generated as a
consequence of the analytical method configuration (e.g., SW-846 method 8260 can provide
quantitative results for approximately 60 different compounds), it is usually unnecessary to
designate all of these compounds as leak detection indicators. Such practice generally aggravates
the problem of many comparisons and results in elevated false positive rates for the facility as a
whole. This makes accurate statistical testing especially difficult. EPA therefore recommends that
the results of leachate testing or the waste analysis plan serve as the primary basis for designating
reliable leak detection indicator parameters.
2.2.5 Poisson Tolerance Limits
To apply an upper Tolerance limit using the Poisson model to a group of downgradient
wells, the approach described by Gibbons (1987b) and based on the work of Zacks (1970) can be
taken. In this case, if no contamination has occurred, the estimated interval upper limit will contain
a large fraction of all measurements from the downgradient wells, often specified at 95% or more.
The calculations involved in deriving Poisson Tolerance limits can seem non-intuitive, primarily because the argument leading to a mathematically rigorous Tolerance limit is complicated. The basic idea, however, uses the fact that if each individual measurement follows a common Poisson distribution with rate parameter \lambda, the sum of n such measurements will also follow a Poisson distribution, this time with rate n\lambda.

Because the Poisson distribution has the property that its true mean is equal to the rate parameter \lambda, the concentration sum of n background samples can be manipulated to estimate this rate. But since we know that the distribution of the concentration sum is also Poisson, the possible values of \lambda can actually be narrowed to within a small range with fixed confidence probability (\gamma). For each "possible" value of \lambda in this confidence range, one can compute the percentile of the Poisson distribution with rate \lambda that would lie above, say, 95% of all future downgradient measurements. By setting as the "probable" rate that \lambda which is greater than all but a small
percentage \alpha of the most extreme possible \lambda's, given the values of n background samples, one can compute an upper tolerance limit with, say, 95% coverage and (1-\alpha)% confidence.
To actually make these computations, Zacks (1970) shows that the most probable rate \lambda can be calculated approximately as

\lambda_{T_n} = \frac{1}{2n}\chi^2_{\gamma}\left[2T_n + 2\right]

where as before T_n represents the Poisson count of the sum of n background samples (setting nondetects to half the method detection limit), and \chi^2_{\gamma}[2T_n + 2] represents the \gamma percentile of the Chi-square distribution with (2T_n + 2) degrees of freedom.
To find the upper Tolerance limit with \beta% coverage (e.g., 95%) once a probable rate \lambda_{T_n} has been estimated, one must compute the Poisson percentile that is larger than \beta% of all possible measurements from that distribution, that is, the \beta% quantile of the Poisson distribution with mean rate \lambda_{T_n}, denoted by P^{-1}(\beta, \lambda_{T_n}). Using a well-known mathematical relationship between the Poisson and Chi-square distributions, finding the \beta% quantile of the Poisson amounts to determining the least positive integer k such that

\chi^2_{1-\beta}\left[2k + 2\right] > 2\lambda_{T_n}

where, as above, the quantity [2k+2] represents the degrees of freedom of the Chi-square distribution. By calculating two times the estimated probable rate \lambda_{T_n} on the right-hand side of the above inequality, and then finding the smallest degrees of freedom so that the (1-\beta)% percentile of the Chi-square distribution is bigger than 2\lambda_{T_n}, the upper tolerance limit k can be determined fairly easily.
Once the upper tolerance limit, k, has been estimated, it will represent an upper Poisson Tolerance limit having approximately \beta% coverage with \gamma% confidence in all comparisons with downgradient well measurements.
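In code, the search for the smallest qualifying k is a short loop; the sketch below mirrors the procedure just described and should return k=5 for the benzene data of Example 11.

    from scipy import stats

    def poisson_tol_limit(Tn, n, coverage=0.95, confidence=0.95):
        # Probable rate from Zacks' formula, then the smallest k with
        # chi2_{1-coverage}[2k+2] > 2*lambda
        lam = stats.chi2.ppf(confidence, 2 * Tn + 2) / (2.0 * n)
        k = 0
        while stats.chi2.ppf(1 - coverage, 2 * k + 2) <= 2 * lam:
            k += 1
        return k

    print(poisson_tol_limit(Tn=70.0, n=36))   # k = 5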
EXAMPLE 11
Use the benzene data of Example 10 to estimate an upper Poisson Tolerance limit with 95%
coverage and 95% confidence probability.
SOLUTION

Step 1. The benzene data consist of 33 nondetects with detection limit equal to 2 ppb and 3 detected values for a total of n=36. By setting each nondetect to half the detection limit as before, one finds a total Poisson count of the sum equal to T_n = 70.0. It is also known that the desired confidence probability is \gamma = .95 and the desired coverage is \beta = .95.
Step 2. Based on the observed Poisson count of the sum of background samples, estimate the probable occurrence rate \lambda_{T_n} using Zacks' formula above as

\lambda_{T_n} = \frac{1}{2 \cdot 36}\chi^2_{.95}\left[2(70) + 2\right] = 2.37

Step 3. Compute twice the probable occurrence rate as 2\lambda_{T_n} = 4.74. Now using a Chi-square table, find the smallest degrees of freedom (df), k, such that

\chi^2_{.05}\left[2k + 2\right] \geq 4.74

Since the 5th percentile of the Chi-square distribution with 12 df equals 5.23 (but only 4.57 with 11 df), it is seen that (2k+2)=12, leading to k=5. Therefore, the upper Poisson Tolerance limit is estimated as k=5 ppb.
Step 4. Because the estimated upper Tolerance limit with 95% coverage equals 5 ppb, any
detected value among downgradient samples greater than 5 ppb may indicate possible
evidence of contamination.
3. NON-PARAMETRIC COMPARISON OF
COMPLIANCE WELL DATA
TO BACKGROUND
When concentration data from several compliance wells are to be compared with
concentration data from background wells, one basic approach is analysis of variance (ANOVA).
The ANOVA technique is used to test whether there is statistically significant evidence that the
mean concentration of a constituent is higher in one or more of the compliance wells than the
baseline provided by background wells. Parametric ANOVA methods make two key assumptions:
1) that the data residuals are Normally distributed and 2) that the group variances are all
approximately equal. The steps for calculating a parametric ANOVA are given in the Interim Final
Guidance (pp. 5-6 to 5-14).
If either of the two assumptions crucial to a parametric ANOVA is grossly violated, it is
recommended that a non-parametric test be conducted using the ranks of the observations rather
than the original observations themselves. The Interim Final Guidance describes the Kruskal-
Wallis test when three or more well groups (including background data, see pp. 5-14 to 5-20) are
being compared. However, the Kruskal-Wallis test is not amenable to two-group comparisons,
say of one compliance well to background data. In this case, the Wilcoxon Rank-Sum procedure
(also known as the Mann-Whitney U Test) is recommended and explained below. Since most
situations will involve the comparison of at least two downgradient wells with background data,
the Kruskal-Wallis test is presented first with an additional example.
3.1 KRUSKAL-WALLIS TEST
When the assumptions used in a parametric analysis of variance cannot be verified, e.g.,
when the original or transformed residuals are not approximately Normal in distribution or have
significantly different group variances, an analysis can be performed using the ranks of the
observations. Usually, a non-parametric procedure will be needed when a substantial fraction of
the measurements are below detection (more than 15 percent), since then the above assumptions
are difficult to verify.
The assumption of independence of the residuals is still required. Under the null hypothesis
that there is no difference among the groups, the observations are assumed to come from identical
distributions. However, the form of the distribution need not be specified.
A non-parametric ANOVA can be used in any situation that the parametric analysis of
variance can be used. However, because the ranks of the data are being used, the minimum
sample sizes for the groups must be a little larger. A useful rule of thumb is to require a minimum
of three well groups with at least four observations per group before using the Kruskal-Wallis
procedure.
Non-parametric procedures typically need a few more observations than parametric
procedures for two reasons. On the one hand, non-parametric tests make fewer assumptions
concerning the distribution of the data and so more data is often needed to make the same judgment
that would be rendered by a parametric test. Also, procedures based on ranks have a discrete
distribution (unlike the continuous distributions of parametric tests). Consequently, a larger sample size is usually needed to produce test statistics that will be significant at a specified alpha level such as 5 percent.
The relative efficiency of two procedures is defined as the ratio of the sample sizes needed by
each to achieve a certain level of power against a specified alternative hypothesis. As sample sizes
get larger, the efficiency of the Kruskal-Wallis test relative to the parametric analysis of variance
test approaches a limit that depends on the underlying distribution of the data, but is always at least
86 percent. This means roughly that in the worst case, if 86 measurements are available for a parametric ANOVA, 100 sample values would be needed to have an equivalently powerful Kruskal-Wallis test.
parametric ANOVA is much smaller or not needed at all. The efficiency of the Kruskal-Wallis test
is 95 percent if the data are really Normal, and can be much larger than 100 percent in other cases
(e.g., it is 150 percent if the residuals follow a distribution called the double exponential).
These results concerning efficiency imply that the Kruskal-Wallis test is reasonably powerful
for detecting concentration differences despite the fact that the original data have been replaced by
their ranks, and can be used even when the data are Normally distributed. When the data are not
Normal or cannot be transformed to Normality, the Kruskal-Wallis procedure tends to be more
powerful for detecting differences than the usual parametric approach.
3.1.1 Adjusting for Tied Observations
Frequently, the Kruskal-Wallis procedure will be used when the data contain a significant
fraction of nondetects (e.g., more than 15 percent of the samples). In these cases, the parametric
assumptions necessary for the usual one-way ANOVA are difficult or impossible to verify, making
the non-parametric alternative attractive. However, the presence of nondetects prevents a unique
ranking of the concentration values, since nondetects are, up to the limit of measurement, all tied at
the same value.
To get around this problem, two steps are necessary. First, in the presence of ties (e.g.,
nondetects), all tied observations should receive the same rank. This rank (sometimes called the
midrank (Lehmann, 1975)) is computed as the average of the ranks that would be given to a group
of ties if the tied values actually differed by a tiny amount and could be ranked uniquely. For
example, if the first four ordered observations are all nondetects, the midrank given to each of these samples would be equal to (1+2+3+4)/4 = 2.5. If the next highest measurement is a unique
detect, its rank would be 5 and so on until all observations are appropriately ranked.
The second step is to compute the Kruskal-Wallis statistic as described in the Interim Final Guidance, using the midranks computed for the tied values. Then an adjustment to the Kruskal-Wallis statistic must be made to account for the presence of ties. This adjustment is described on page 5-17 of the Interim Final Guidance and requires computation of the formula:

H' = \frac{H}{1 - \sum_{i=1}^{g} \frac{t_i^3 - t_i}{N^3 - N}}

where g equals the number of groups of distinct tied observations and t_i is the number of observations in the ith tied group.
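For reference, scipy's Kruskal-Wallis routine assigns midranks to tied values and applies this same tie correction automatically. The sketch below reproduces Example 12, with the nondetects set to a common placeholder value below all detects so that they form one tied group (any such value gives the same ranks).

    from scipy import stats

    ND = 2.5   # common placeholder below the smallest detect
    background = [ND, 7.5, ND, ND, 6.4, ND, ND, ND, ND, ND]  # wells 1 and 2
    well3 = [ND, 12.5, 8.0, ND, 11.2]
    well4 = [ND, 13.7, 15.3, 20.2, 25.1]
    well5 = [ND, 20.1, 35.0, 28.2, 19.0]

    H, p = stats.kruskal(background, well3, well4, well5)
    print(f"H' = {H:.2f}, p = {p:.4f}")   # H' near 11.87, as in Example 12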
EXAMPLE 12
Use the non-parametric analysis of variance on the following data to determine whether there
is evidence of contamination at the monitoring site.
Toluene Concentration (ppb)
                 Background Wells            Compliance Wells
    Month        Well 1    Well 2        Well 3    Well 4    Well 5
      1            <5        <5            <5        <5        <5
      2            7.5       <5           12.5      13.7      20.1
      3            <5        <5            8.0      15.3      35.0
      4            <5        <5            <5       20.2      28.2
      5            6.4       <5           11.2      25.1      19.0
SOLUTION
Step 1. Compute the overall percentage of nondetects. In this case, nondetects account for 48 percent of the data. The usual parametric analysis of variance would be inappropriate. Use the Kruskal-Wallis test instead, pooling both background wells into one group and treating each compliance well as a separate group.
Step 2. Compute ranks for all the data including tied observations (e.g., nondetects) as in the
following table. Note that each nondetect is given the same midrank, equal to the
average of the first 12 unique ranks.
                                Toluene Ranks
                 Background Wells            Compliance Wells
    Month        Well 1    Well 2        Well 3    Well 4    Well 5
      1            6.5       6.5           6.5       6.5       6.5
      2           14         6.5          17        18        21
      3            6.5       6.5          15        19        25
      4            6.5       6.5           6.5      22        24
      5           13         6.5          16        23        20

    Rank Sum     R_b = 79 (both wells)    R_3 = 61          R_4 = 88.5        R_5 = 96.5
    Rank Mean    \bar{R}_b = 7.9          \bar{R}_3 = 12.2  \bar{R}_4 = 17.7  \bar{R}_5 = 19.3
Step 3. Calculate the sums of the ranks in each group (R_i) and the mean ranks in each group (\bar{R}_i). These results are given above.
Step 4. Compute the Kruskal-Wallis statistic H using the formula on p. 5-15 of the Interim Final Guidance

H = \left[ \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{N_i} \right] - 3(N+1)

where N=total number of samples, N_i=number of samples in the ith group, and K=number of groups. In this case, N=25, K=4, and H can be computed as

H = \frac{12}{25 \cdot 26}\left[ \frac{79^2}{10} + \frac{61^2}{5} + \frac{88.5^2}{5} + \frac{96.5^2}{5} \right] - 78 = 10.56.
Step 5. Compute the adjustment for ties. There is only one group of distinct tied observations,
containing 12 samples. Thus, the adjusted Kruskal-Wallis statistic is given by:
$$ H' = \frac{10.56}{1 - \frac{12^3 - 12}{25^3 - 25}} = \frac{10.56}{0.89} = 11.87 $$
Step 6. Compare the calculated value of H' to the tabulated Chi-square value with (K-1) = (#
groups - 1) = 3 df, $\chi^2_{3,.05} = 7.81$. Since the observed value of 11.87 is greater than the
Chi-square critical value, there is evidence of significant differences between the well
groups. Post-hoc pairwise comparisons are necessary.
Step 7. Calculate the critical difference for compliance well comparisons to the background
using the formula on p. 5-16 of the Interim Final Guidance document. Since the number
of samples at each compliance well is five, the same critical difference can be used for
each comparison. With the 5 percent significance level divided among the three
comparisons, $z_{.05/3} = 2.128$, and

$$ C_i = z_{.05/3}\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{N_b} + \frac{1}{N_i}\right)} = 2.128\sqrt{\frac{25 \cdot 26}{12}\left(\frac{1}{10} + \frac{1}{5}\right)} = 8.58 $$
Step 8. Form the differences between the average ranks of each compliance well and the
background and compare these differences to the critical value of 8.58.
Well 3: R̄_3 - R̄_b = 12.2 - 7.9 = 4.3
Well 4: R̄_4 - R̄_b = 17.7 - 7.9 = 9.8
Well 5: R̄_5 - R̄_b = 19.3 - 7.9 = 11.4
Since the average rank differences at wells 4 and 5 exceed the critical difference, there is
significant evidence of contamination at wells 4 and 5, but not at well 3.
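As an informal check of the hand calculation above, scipy's built-in Kruskal-Wallis routine
applies the same midrank and tie adjustment. Coding each nondetect at an arbitrary common
placeholder below the detection limit (2.5 ppb here, chosen purely for illustration) should
reproduce H':

```python
from scipy.stats import kruskal

# Nondetects (<5 ppb) coded at a common placeholder value so they tie
background = [2.5, 7.5, 2.5, 2.5, 6.4] + [2.5] * 5   # Wells 1 and 2 pooled
well3 = [2.5, 12.5, 8.0, 2.5, 11.2]
well4 = [2.5, 13.7, 15.3, 20.2, 25.1]
well5 = [2.5, 20.1, 35.0, 28.2, 19.0]

H_adj, p = kruskal(background, well3, well4, well5)
print(round(H_adj, 2), round(p, 4))   # H' = 11.87; p < 0.05 indicates differences
```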
3.2 WILCOXON RANK-SUM TEST FOR TWO GROUPS
When a single compliance well group is being compared to background data and a non-
parametric test is needed, the Kruskal-Wallis procedure should be replaced by the Wilcoxon Rank-
Sum test (Lehmann, 1975; also known as the two-sample Mann-Whitney U test). For most
ground-water applications, the Wilcoxon test should be used whenever the proportion of
nondetects in the combined data set exceeds 15 percent. However, to provide valid results, do not
use the Wilcoxon test unless the compliance well and background data groups both contain at least
four samples each.
To run the Wilcoxon Rank-Sum Test, use the following algorithm. Combine the compliance
and background data and rank the ordered values from 1 to N. Assume there are n compliance
samples and m background samples so that N=m+n. Denote the ranks of the compliance samples
by C_i and the ranks of the background samples by B_j. Then add up the ranks of the compliance
samples and subtract n(n+1)/2 to get the Wilcoxon statistic W:

$$ W = \sum_{i=1}^{n} C_i - \frac{n(n+1)}{2} $$
The rationale of the Wilcoxon test is that if the ranks of the compliance data are quite large
relative to the background ranks, then the hypothesis that the compliance and background values
came from the same population should be rejected. Large values of the statistic W give evidence of
contamination at the compliance well site.
To find the critical value of W, a Normal approximation to its distribution is used. The
expected value and standard deviation of W under the null hypothesis of no contamination are
given by the formulas
$$ E(W) = \frac{mn}{2}; \qquad SD(W) = \sqrt{\frac{mn(N+1)}{12}} $$
An approximate Z-score for the Wilcoxon Rank-Sum Test then follows as:
$$ Z = \frac{W - E(W) - \frac{1}{2}}{SD(W)} $$
The factor of 1/2 in the numerator serves as a continuity correction since the discrete distribution of
the statistic W is being approximated by the continuous Normal distribution.
Once an approximate Z-score has been computed, it may be compared to the upper 1 percent
point of the standard Normal distribution, $z_{.01} = 2.326$, in order to determine the statistical
significance of the test. If the observed Z-score is greater than 2.326, the null hypothesis may be
rejected at the 1 percent significance level, suggesting that there is significant evidence of
contamination at the compliance well site.
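The full test can be sketched in a few lines of Python (a minimal illustration, not a prescribed
implementation; scipy's rankdata is assumed for the ranking step, and the function name is
invented for this sketch):

```python
import numpy as np
from scipy.stats import rankdata

def wilcoxon_rank_sum_z(background, compliance):
    """Continuity-corrected Z-score for the Wilcoxon Rank-Sum test;
    compare the result to z_.01 = 2.326 for a 1 percent one-sided test."""
    m, n = len(background), len(compliance)
    N = m + n
    ranks = rankdata(np.concatenate([background, compliance]))  # midranks if tied
    W = ranks[m:].sum() - n * (n + 1) / 2.0   # sum of compliance ranks - n(n+1)/2
    EW = m * n / 2.0
    SDW = np.sqrt(m * n * (N + 1) / 12.0)
    return (W - EW - 0.5) / SDW
```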
EXAMPLE 13
The table below contains copper concentration data (ppb) found in water samples at a
monitoring facility. Wells 1 and 2 are background wells and well 3 is a single compliance well
suspected of contamination. Calculate the Wilcoxon Rank-Sum Test on these data.
Copper Concentration (ppb)

             Background            Compliance
Month      Well 1    Well 2         Well 3
  1          4.2       5.2            9.4
  2          5.8       6.4           10.9
  3         11.3      11.2           14.5
  4          7.0      11.5           16.1
  5          7.3      10.1           21.5
  6          8.2       9.7           17.6
SOLUTION
Step 1. Rank the N=18 observations from 1 to 18 (smallest to largest) as in the following table.
Ranks of Copper Concentrations

             Background            Compliance
Month      Well 1    Well 2         Well 3
  1           1         2              8
  2           3         4             11
  3          13        12             15
  4           5        14             16
  5           6        10             18
  6           7         9             17
Step 2. Compute the Wilcoxon statistic by adding up the compliance well ranks and subtracting
n(n+l)/2, so that W=85-21=64.
Step 3. Compute the expected value and standard deviation of W.

$$ E(W) = \frac{mn}{2} = 36; \qquad SD(W) = \sqrt{\frac{mn(N+1)}{12}} = \sqrt{114} = 10.677 $$

Step 4. Form the approximate Z-score.

$$ Z = \frac{W - E(W) - \frac{1}{2}}{SD(W)} = \frac{64 - 36 - 0.5}{10.677} = 2.576 $$
Step 5. Compare the observed Z-score to the upper 1 percent point of the Normal distribution.
Since Z = 2.576 > 2.326 = z_{.01}, there is significant evidence of contamination at the
compliance well at the 1 percent significance level.
3.2.1 Handling Ties in the Wilcoxon Test
Tied observations in the Wilcoxon test are handled in similar fashion to the Kruskal-Wallis
procedure. First, midranks are computed for all tied values. Then the Wilcoxon statistic is
computed as before but with a slight difference. To form the approximate Z-score, an adjustment
is made to the formula for the standard deviation of W in order to account for the groups of tied
values. The necessary formula (Lehmann, 1975) is:
$$ SD'(W) = \sqrt{\frac{mn}{12}\left[(N+1) - \frac{\sum_{i=1}^{g} t_i\,(t_i^2 - 1)}{N(N-1)}\right]} $$
where, as in the Kruskal-Wallis method, g equals the number of groups of distinct tied
observations and t_i represents the number of tied values in the ith group.
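A short Python sketch of this adjusted standard deviation, assuming nondetects have been
coded at a common value so that they tie (the function name is illustrative):

```python
import numpy as np

def sd_w_tie_adjusted(background, compliance):
    """Tie-adjusted standard deviation of the Wilcoxon statistic W
    (Lehmann, 1975); untied values contribute nothing to the correction."""
    data = np.concatenate([background, compliance])
    m, n, N = len(background), len(compliance), len(data)
    _, t = np.unique(data, return_counts=True)   # sizes of the tied groups
    ties = (t**3 - t).sum() / (N * (N - 1.0))
    return np.sqrt(m * n / 12.0 * ((N + 1.0) - ties))
```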
4. STATISTICAL INTERVALS: CONFIDENCE,
TOLERANCE, AND PREDICTION
Three types of statistical intervals are often constructed from data: Confidence intervals,
Tolerance intervals, and Prediction intervals. Though often confused, the interpretations and uses
of these intervals are quite distinct. The most common interval encountered in a course on statistics
is a Confidence interval for some parameter of the distribution (e.g., the population mean). The
interval is constructed from sample data and is thus a random quantity. This means that each set of
sample data will generate a different Confidence interval, even though the algorithm for
constructing the interval stays the same every time.
A Confidence interval is designed to contain the specified population parameter (usually the
mean concentration of a well in ground-water monitoring) with a designated level of confidence or
probability, denoted as 1-α. The interval will fail to include the true parameter in approximately α
percent of the cases where such intervals are constructed.
The usual Confidence interval for the mean gives information about the average concentration
level at a particular well or group of wells. It offers little information about the highest or most
extreme sample concentrations one is likely to observe over time. Often, it is those extreme values
one wants to monitor to be protective of human health and the environment. As such, a
Confidence interval generally should be used only in two situations for ground-water data analysis:
(1) when directly specified by the permit or (2) in compliance monitoring, when downgradient
samples are being compared to a Ground-Water Protection Standard (GWPS) representing the
average of onsite background data, as is sometimes the case with an Alternate Concentration
Limit (ACL). In other situations it is usually desirable to employ a Tolerance or Prediction interval.
A Tolerance interval is designed to contain a designated proportion of the population (e.g.,
95 percent of all possible sample measurements). Since the interval is constructed from sample
data, it also is a random interval. And because of sampling fluctuations, a Tolerance interval can
contain the specified proportion of the population only with a certain confidence level. Two
coefficients are associated with any Tolerance interval. One is the proportion of the population that
the interval is supposed to contain, called the coverage. The second is the degree of confidence
with which the interval reaches the specified coverage. This is known as the tolerance coefficient.
A Tolerance interval with coverage of 95 percent and a tolerance coefficient of 95 percent is
constructed to contain, on average, 95 percent of the distribution with a probability of 95 percent.
Tolerance intervals are very useful for ground-water data analysis, because in many
situations one wants to ensure that at most a small fraction of the compliance well sample
measurements exceed a specific concentration level (chosen to be protective of human health and
the environment). Since a Tolerance interval is designed to cover all but a small percentage of the
population measurements, observations should very rarely exceed the upper Tolerance limit when
testing small sample sizes. The upper Tolerance limit allows one to gauge whether or not too many
extreme concentration measurements are being sampled from compliance point wells.
Tolerance intervals can be used in detection monitoring when comparing compliance data to
background values. They also should be used in compliance monitoring when comparing
compliance data to certain Ground-Water Protection Standards. Specifically, the tolerance interval
approach is recommended for comparison with a Maximum Contaminant Level (MCL) or with an
ACL if the ACL is derived from health-based risk data.
Prediction intervals are constructed to contain the next sample value(s) from a population or
distribution with a specified probability. That is, after sampling a background well for some time
and measuring the concentration of an analyte, the data can be used to construct an interval that will
contain the next analyte sample or samples (assuming the distribution has not changed). A
Prediction interval will thus contain a future value or values with specified probability. Prediction
intervals can also be constructed to contain the average of several future observations.
Prediction intervals are probably most useful for two kinds of detection monitoring. The first
kind is when compliance point well data are being compared to background values. In this case the
Prediction interval is constructed from the background data and the compliance well data are
compared to the upper Prediction limits. The second kind is when intrawell comparisons are being
made on an uncontaminated well. In this case, the Prediction interval is constructed on past data
sampled from the well, and used to predict the behavior of future samples from the same well.
In summary, a Confidence interval usually contains an average value, a Tolerance interval
contains a proportion of the population, and a Prediction interval contains one or more future
observations. Each has a probability statement or "confidence coefficient" associated with it. For
further explanation of the differences between these interval types, see Hahn (1970).
One should note that all of these intervals assume that the sample data used to construct the
intervals are Normally distributed. In light of the fact that much ground-water concentration data is
better modeled by a Lognormal distribution, it is recommended that tests for Normality be run on
the logarithms of the original data before constructing the random intervals. If the data follow the
Lognormal model, then the intervals should be constructed using the logarithms of the sample
values. In this case, the limits of these intervals should not be compared to the original compliance
data or GWPS. Rather, the comparison should involve the logged compliance data or logged
GWPS. When neither the Normal nor the Lognormal model can be justified, a non-parametric version
of each interval may be utilized.
4.1 TOLERANCE INTERVALS
In detection monitoring, the compliance point samples are assumed to come from the same
distribution as the background values until significant evidence of contamination can be shown.
To test this hypothesis, a 95 percent coverage Tolerance interval can be constructed on the
background data. The background data should first be tested to check the distributional
assumptions. Once the interval is constructed, each compliance sample is compared to the upper
Tolerance limit. If any compliance point sample exceeds the limit, the well from which it was
drawn is judged to have significant evidence of contamination (note that when testing a large
number of samples, the nature of a Tolerance interval practically ensures that a few measurements
will be above the upper Tolerance limit, even when no contamination has occurred. In these cases,
the offending wells should probably be resampled in order to verify whether or not there is definite
evidence of contamination.)
If the Tolerance limit has been constructed using the logged background data, the compliance
point samples should first be logged before comparing with the upper Tolerance limit. The steps
for computing the actual Tolerance interval in detection monitoring are detailed in the Interim Final
Guidance on pp. 5-20 to 5-24. One point about the table of factors K used to adjust the width of
the Tolerance interval is that these factors are designed to provide at least 95% coverage of the
population. Applied over many data sets, the average coverage of these intervals will often be
close to 98% or more (see Guttman, 1970). To construct a one-sided upper Tolerance interval
with average coverage of (1-β)%, the K multiplier can be computed directly with the aid of a
Student's t-distribution table. In this case, the formula becomes

$$ K = t_{n-1,\,1-\beta}\sqrt{1 + \frac{1}{n}} $$

where the t-value represents the (1-β)th quantile (i.e., the upper β point) of the t-distribution
with (n-1) degrees of freedom.
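A minimal sketch of this K computation in Python (the function name is illustrative; scipy's
t-distribution quantile function is assumed):

```python
import numpy as np
from scipy.stats import t

def k_average_coverage(n, coverage=0.95):
    """K multiplier for a one-sided upper Tolerance limit with the stated
    average coverage, computed from the Student's t-distribution."""
    return t.ppf(coverage, n - 1) * np.sqrt(1.0 + 1.0 / n)

# k_average_coverage(4) -> about 2.631, the factor used in Example 14 below
```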
In compliance monitoring, the Tolerance interval is calculated on the compliance point data,
so that the upper one-sided Tolerance limit may be compared to the appropriate Ground-Water
Protection Standard (i.e., MCL or ACL). If the upper Tolerance limit exceeds the fixed standard,
and especially if the Tolerance limit has been constructed to have an average coverage of 95% as
described above, there is significant evidence that as much as 5 percent or more of all the
compliance well measurements will exceed the limit and consequently that the compliance point
wells are in violation of the facility permit. The algorithm for computing Tolerance limits in
compliance monitoring is given on pp. 6-11 to 6-15 of the Interim Final Guidance.
EXAMPLE 14
The table below contains data that represent chrysene concentration levels (ppb) found in
water samples obtained from the five compliance wells at a monitoring facility. Compute the upper
Tolerance limit at each well for an average of 95% coverage with 95% confidence and determine
whether there is evidence of contamination. The alternate concentration limit (ACL) is 80 ppb.
Chrysene Concentration (ppb)

Month     Well 1    Well 2    Well 3    Well 4    Well 5
  1        19.7      10.2      68.0      26.8      47.0
  2        39.2       7.2      48.9      17.7      30.5
  3         7.8      16.1      30.1      31.9      15.0
  4        12.8       5.7      38.1      22.2      23.4
Mean       19.88      9.80     46.28     24.65     28.98
SD         13.78      4.60     16.40      6.10     13.58

SOLUTION
Step 1. Before constructing the tolerance intervals, check the distributional assumptions. The
algorithm for a parametric Tolerance interval assumes that the data used to compute the
interval are Normally distributed. Because these data are more likely to be Lognormal in
distribution than Normal, check the assumptions on the logarithms of the original data
given in the table below. Since each well has only four observations, Probability Plots
are not likely to be informative. The Shapiro-Wilk or Probability Plot Correlation
Coefficient tests can be run, but in this example only the Skewness Coefficient is
examined to ensure that gross departures from Lognormality are not missed.
Logged Chrysene Concentration [log(ppb)]

Month     Well 1    Well 2    Well 3    Well 4    Well 5
  1        2.98      2.32      4.22      3.29      3.85
  2        3.67      1.97      3.89      2.87      3.42
  3        2.05      2.78      3.40      3.46      2.71
  4        2.55      1.74      3.64      3.10      3.15
Mean       2.81      2.20      3.79      3.18      3.28
SD         0.68      0.45      0.35      0.25      0.48

Step 2. The Skewness Coefficients for each well are given in the following table. Since none of
the coefficients is greater than 1 in absolute value, approximate Lognormality (that is,
Normality of the logged data) is assumed for the purpose of constructing the tolerance
intervals.

Well     Skewness    |Skewness|
  1        .210         .210
  2        .334         .334
  3        .192         .192
  4       -.145         .145
  5       -.020         .020
Step 3. Compute the tolerance interval for each compliance well using the logged concentration
data. The means and SDs are given in the second table above.
Step 4. The tolerance factor for a one-sided Normal tolerance interval with an average of 95%
coverage with 95% probability and n=4 observations is given by
$$ K = t_{3,.95}\sqrt{1 + \frac{1}{4}} = 2.353 \times 1.118 = 2.631 $$
The upper tolerance limit is calculated below for each of the five wells.
Well 1   2.81 + 2.631(0.68) = 4.61 log(ppb)
Well 2   2.20 + 2.631(0.45) = 3.38 log(ppb)
Well 3   3.79 + 2.631(0.35) = 4.71 log(ppb)
Well 4   3.18 + 2.631(0.25) = 3.85 log(ppb)
Well 5   3.28 + 2.631(0.48) = 4.54 log(ppb)
Step 5. Compare the upper tolerance limit for each well to the logarithm of the ACL, that is
log(80)=4.38. Since the upper tolerance limits for wells 1, 3, and 5 exceed the logged
ACL of 4.38 log(ppb), there is evidence of chrysene contamination in wells 1, 3, and 5.
4.1.1 Non-parametric Tolerance Intervals
When the assumptions of Normality and Lognormality cannot be justified, especially when a
significant portion of the samples are nondetect, the use of non-parametric tolerance intervals
should be considered. The upper Tolerance limit in a non-parametric setting is usually chosen as
an order statistic of the sample data (see Guttman, 1970), commonly the maximum value or maybe
the second largest value observed. As a consequence, non-parametric intervals should be
constructed only from wells that are not contaminated. Because the maximum sample value is
often taken as the upper Tolerance limit, non-parametric Tolerance intervals are very easy to
construct and use. The sample data must be ordered, but no ranks need be assigned to the
concentration values other than to determine the largest measurements. This also means that
nondetects do not have to be uniquely ordered or handled in any special manner.
One advantage to using the maximum concentration instead of assigning ranks to the data is
that non-parametric intervals (including Tolerance intervals) are sensitive to the actual magnitudes
of the concentration data. Another plus is that unless all the sample data are nondetect, the
maximum value will be a detected concentration, leading to a well-defined upper Tolerance limit.
Once an order statistic of the sample data (e.g., the maximum value) is chosen to represent
the upper tolerance limit, Guttman (1970) has shown that the coverage of the interval, constructed
repeatedly over many data sets, has a Beta probability density with cumulative distribution
$$ B(t;\, n+1-m,\, m) = \int_0^t \frac{n!}{(n-m)!\,(m-1)!}\; u^{n-m}(1-u)^{m-1}\, du $$
where n=# samples in the data set and m=[(n+l)-(rank of upper tolerance limit value)]. If the
maximum sample value is selected as the tolerance limit, its rank is equal to n and so m=1. If the
second largest value is chosen as the limit, its rank would be equal to (n-1) and so m=2.
Since the Beta distribution is closely related to the more familiar Binomial distribution,
Guttman has shown that in order to construct a non-parametric tolerance interval with at least β%
coverage and (1-α) confidence probability, the number of (background) samples must be chosen
such that

$$ \sum_{k=n+1-m}^{n} \binom{n}{k} \beta^{k} (1-\beta)^{n-k} \le \alpha $$
Table A-6 in Appendix A provides the minimum coverage levels with 95% confidence for
various choices of n, using either the maximum sample value or the second largest measurement as
the tolerance limit. As an example, with 16 background measurements, the minimum coverage is
β=83% if the maximum background value is designated as the upper Tolerance limit and β=74% if
the Tolerance limit is taken to be the second largest background value. In general, Table A-6
illustrates that if the underlying distribution of concentration values is unknown, more background
samples are needed compared to the parametric setting in order to construct a tolerance interval with
sufficiently high coverage. Parametric tolerance intervals do not require as many background
samples precisely because the form of the underlying distribution is assumed to be known.
Because the coverage of the above non-parametric Tolerance intervals follows a Beta
distribution, it can also be shown that the average (not the minimum as discussed above) level of
coverage is equal to 1-[m/(n+1)] (see Guttman, 1970). In particular, when the maximum sample
value is chosen as the upper tolerance limit, m=1, and the expected coverage is equal to n/(n+1).
This implies that at least 19 background samples are necessary to achieve 95% coverage on
average.
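Both the minimum and the average coverage are easy to compute from the Beta distribution;
the following Python sketch (illustrative function names; scipy assumed) mirrors the quantities
tabulated in Table A-6:

```python
from scipy.stats import beta

def min_coverage(n, m=1, conf=0.95):
    """Minimum coverage (with the given confidence) when the (n+1-m)th order
    statistic of n background samples is the upper Tolerance limit; the
    coverage follows a Beta(n+1-m, m) distribution."""
    return beta.ppf(1.0 - conf, n + 1 - m, m)

def expected_coverage(n, m=1):
    """Average coverage, 1 - m/(n+1)."""
    return 1.0 - m / (n + 1.0)

# min_coverage(16, m=1) -> about 0.83, matching the Table A-6 entry cited above
# min_coverage(16, m=2) -> about 0.74 (second largest value as the limit)
```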
EXAMPLE 15
Use the following copper background data to establish a non-parametric upper Tolerance
limit and determine if either compliance well shows evidence of copper contamination.
Copper Concentration (ppb)

              Background Wells              Compliance Wells
Month     Well 1    Well 2    Well 3      Well 4    Well 5
  1         <5        9.2       <5
  2         <5        <5        5.4
  3         7.5       <5        6.7
  4         <5        6.1       <5
  5         <5        8.0       <5          6.2       <5
  6         <5        5.9       <5          <5        <5
  7         6.4       <5        <5          7.8       5.6
  8         6.0       <5        <5         10.4       <5
SOLUTION
Step 1. Examine the background data in Wells 1, 2, and 3 to determine that the maximum
observed value is 9.2 ppb. Set the 95% confidence upper Tolerance limit equal to this
value. Because 24 background samples are available, Table A-6 indicates that the
minimum coverage is equal to 88% (the expected average coverage, however, is equal to
24/25=96%). To increase the coverage level, more background samples would have to
be collected.
Step 2. Compare each sample in compliance Wells 4 and 5 to the upper Tolerance limit. Since
none of the measurements at Well 5 is above 9.2 ppb, while one sample from Well 4 is
above the limit, conclude that there is significant evidence of copper contamination at
Well 4 but not Well 5.
4.2 PREDICTION INTERVALS
When comparing background data to compliance point samples, a Prediction interval can be
constructed on the background values. If the distributions of background and compliance point
data are really the same, all the compliance point samples should be contained below the upper
Prediction interval limit. Evidence of contamination is indicated if one or more of the compliance
samples lies above the upper Prediction limit.
With intrawell comparisons, a Prediction interval can be computed on past data to contain a
specified number of future observations from the same well, provided the well has not been
previously contaminated. If any one or more of the future samples falls above the upper Prediction
limit, there is evidence of recent contamination at the well. The steps to calculate parametric
Prediction intervals are given on pp. 5-24 to 5-28 of the Interim Final Guidance.
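A hedged sketch of the Prediction limit calculation in Python, following the formula cited
above (the function name is illustrative; m is the number of samples averaged in each future
period and k the number of future periods):

```python
import numpy as np
from scipy.stats import t

def upper_prediction_limit(background, m, k, conf=0.95):
    """Upper one-sided Prediction limit for the means of m samples in each
    of k future sampling periods, using a Bonferroni t-statistic."""
    x = np.asarray(background, dtype=float)
    n = x.size
    t_bonf = t.ppf(1 - (1 - conf) / k, n - 1)    # t_(n-1, k, conf)
    return x.mean() + x.std(ddof=1) * t_bonf * np.sqrt(1.0 / m + 1.0 / n)

# In Example 16 below, upper_prediction_limit(bg, m=4, k=2) -> about 49.25 ppb
```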
EXAMPLE 16
The data in the table below are benzene concentrations measured at a groundwater monitoring
facility. Calculate the Prediction interval and determine whether there is evidence of contamination.
Background Well Data                 Compliance Well Data
Sampling    Benzene                  Sampling    Benzene
Date        Concentration (ppb)      Date        Concentration (ppb)

Month 1        12.6                  Month 4        48.0
               30.8                                 30.3
               52.0                                 42.5
               28.1                                 15.0
Month 2        33.3                                 n=4
               44.0                                 Mean=33.95
                3.0                                 SD=14.64
               12.8
Month 3        58.1                  Month 5        47.6
               12.6                                  3.8
               17.6                                  2.6
               25.3                                 51.9
               n=12                                 n=4
               Mean=27.52                           Mean=26.48
               SD=17.10                             SD=26.94
SOLUTION
Step 1. First test the background data for approximate Normality. Only the background data are
included since these values are used to construct the Prediction interval.
Step 2. A Probability Plot of the 12 background values is given below. The plot indicates an
overall pattern that is reasonably linear with some modest departures from Normality.
To further test the assumption of Normality, run the Shapiro-Wilk test on the
background data.
[Probability Plot: Normal quantiles versus benzene concentration (ppb) for the 12 background measurements]
Step 3. List the data in ascending and descending order as in the following table. Also calculate
the differences X_(n-i+1) - X_(i) and multiply by the coefficients a_{n-i+1} taken from Table A-1
to get the components b_i used to calculate the Shapiro-Wilk statistic (W).
  i     X(i)     X(n-i+1)    a_{n-i+1}      b_i
  1      3.0       58.1        0.548      30.167
  2     12.6       52.0        0.333      13.101
  3     12.6       44.0        0.235       7.370
  4     12.8       33.3        0.159       3.251
  5     17.6       30.8        0.092       1.217
  6     25.3       28.1        0.030       0.085
  7     28.1       25.3
  8     30.8       17.6
  9     33.3       12.8
 10     44.0       12.6
 11     52.0       12.6
 12     58.1        3.0
                                       b = 55.191

Step 4. Sum the components b_i in column 5 to get quantity b. Compute the standard deviation
of the background benzene values. Then the Shapiro-Wilk statistic is given as
$$ W = \left[\frac{b}{s\sqrt{n-1}}\right]^2 = \left[\frac{55.191}{17.10\sqrt{11}}\right]^2 = 0.947 $$
Step 5. The critical value at the 5% level for the Shapiro-Wilk test on 12 observations is 0.859.
Since the calculated value of W=0.947 is well above the critical value, there is no
evidence to reject the assumption of Normality.
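For comparison, scipy's built-in Shapiro-Wilk routine (which uses an algorithmic
approximation to the Table A-1 coefficients, so the statistic may differ slightly in the last
decimal place) can be run on the same background data:

```python
from scipy.stats import shapiro

background = [12.6, 30.8, 52.0, 28.1, 33.3, 44.0,
              3.0, 12.8, 58.1, 12.6, 17.6, 25.3]
W, p = shapiro(background)
print(round(W, 3))   # about 0.947; since p > 0.05, Normality is not rejected
```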
Step 6. Compute the Prediction interval using the original background data. The mean and
standard deviation of the 12 background samples are given by 27.52 ppb and 17.10
ppb, respectively.
Step 7. Since there are two future months of compliance data to be compared to the Prediction
limit, the number of future sampling periods is k=2. At each sampling period, a mean of
four independent samples will be computed, so m=4 in the prediction interval formula
(see Interim Final Guidance, p. 5-25). The Bonferroni t-statistic, t_{(n-1, k, .95)}, with k=2
and 11 df is equivalent to the usual t-statistic at the .975 level with 11 df, i.e.,
t_{11,.975} = 2.201.
Step 8. Compute the upper one-sided Prediction limit (UL) using the formula:

$$ UL = \bar{x} + s\, t_{(n-1,\,k,\,.95)} \sqrt{\frac{1}{m} + \frac{1}{n}} $$

Then the UL is given by:

$$ UL = 27.52 + (17.10)(2.201)\sqrt{\frac{1}{4} + \frac{1}{12}} = 49.25 \text{ ppb} $$

Step 9. Compare the UL to the compliance data. The means of the four compliance well
observations for months 4 and 5 are 33.95 ppb and 26.48 ppb, respectively. Since the
mean concentrations for months 4 and 5 are below the upper Prediction limit, there is no
evidence of recent contamination at the monitoring facility.
4.2.1 Non-parametric Prediction Intervals
When the parametric assumptions of a Normal-based Prediction limit cannot be justified,
often due to the presence of a significant fraction of nondetects, a non-parametric Prediction
interval may be considered instead. A non-parametric upper Prediction limit is typically
constructed in the same way as a non-parametric upper Tolerance limit, that is, by estimating the
limit to be the maximum value of the set of background samples.
The difference between non-parametric Tolerance and Prediction limits is one of
interpretation and probability. Given n background measurements and a desired confidence level,
a non-parametric Tolerance interval will have a certain coverage percentage. With high probability,
the Tolerance interval is designed to miss only a small percentage of the samples from
downgradient wells. A Prediction limit, on the other hand, involves the confidence probability that
the next future sample or samples will all fall below the upper Prediction limit. In this
sense, the Prediction limit may be thought of as a 100% coverage Tolerance limit for the next k
future samples.
As Guttman (1970) has indicated, the confidence probability associated with predicting that
the next single observation from a downgradient well will fall below the upper Prediction limit --
estimated as the maximum background value -- is the same as the expected coverage of a similarly
constructed upper Tolerance limit, namely (1-α) = n/(n+1). Furthermore, it can be shown from
Gibbons (1991b) that the probability of having k future samples all fall below the upper non-
parametric Prediction limit is (1-α) = n/(n+k). Table A-7 in Appendix A lists these confidence
levels for various choices of n and k. The false positive rate associated with a single Prediction
limit can be computed as one minus the confidence level.
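These confidence calculations are simple enough to verify directly; a small Python sketch
(function names are illustrative) follows:

```python
import math

def np_prediction_confidence(n, k):
    """Confidence that the next k samples all fall below the background max."""
    return n / float(n + k)

def n_required(k, conf):
    """Smallest number of background samples n with n/(n+k) >= conf."""
    return math.ceil(conf * k / (1.0 - conf))

# np_prediction_confidence(18, 2) -> 0.90, as in Example 17 below;
# n_required(2, 0.95) -> 38 background samples for 95% confidence
```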
Balancing the ease with which non-parametric upper Prediction limits are constructed is the
fact that, given fixed numbers of background samples and future sample values to be predicted, the
maximum confidence level associated with the Prediction limit is also fixed. To increase the level
of confidence, the only choices are to 1) decrease the number of future values to be predicted at any
testing period, or 2) increase the number of background samples used in the test. Table A-7 can be
used along these lines to plan an appropriate sampling strategy so that the false positive rate can be
minimized and the confidence probability maximized to a desired level.
EXAMPLE 17
Use the following arsenic data from a monitoring facility to compute a non-parametric upper
Prediction limit that will contain the next 2 monthly measurements from a downgradient well and
determine the level of confidence associated with the Prediction limit.
Arsenic Concentrations (ppb)

              Background Wells            Compliance
Month     Well 1    Well 2    Well 3       Well 4
  1         <5        7         <5
  2         <5        6.5       <5
  3          8        <5       10.5
  4         <5        6         <5
  5          9       12         <5            8
  6         10       <5          9           14
SOLUTION
Step 1. Determine the maximum value of the background data and use this value to estimate the
upper Prediction limit. In this case, the Prediction limit is set to the maximum value of
the n=18 samples, or 12 ppb. As is true of non-parametric Tolerance intervals, only
uncontaminated wells should be used in the construction of Prediction limits.
Step 2. Compute the confidence level and false positive rate associated with the Prediction limit.
Since two future samples are being predicted and n=18, the confidence level is found to
be n/(n+k)= 18/20=90%. Consequently, the Type I error or false positive rate is equal to
(1-.90)=10%. If a lower false positive rate is desired, the number of background
samples used in the test must be enlarged.
Step 3. Compare each of the downgradient samples against the upper Prediction limit. Since the
value of 14 ppb for the second month of compliance sampling exceeds the limit, conclude
that there is significant evidence of contamination at the downgradient well at the 10%
level of significance.
4.3 CONFIDENCE INTERVALS
Confidence intervals should only be constructed on data collected during compliance
monitoring, in particular when the Ground-Water Protection Standard (GWPS) is an ACL
computed from the average of background samples. Confidence limits for the average
concentration levels at compliance wells should not be compared to MCLs. Unlike a Tolerance
interval, Confidence limits for an average do not indicate how often individual samples will exceed
the MCL. Conceivably, the lower Confidence limit for the mean concentration at a compliance
well could fall below the MCL, yet 50 percent or more of the individual samples might exceed the
MCL. Since an MCL is designed to set an upper bound on the acceptable contamination, this
would not be protective of human health or the environment.
When comparing individual compliance wells to an ACL derived from average background
levels, a lower one-sided 99 percent Confidence limit should be constructed. If the lower
Confidence limit exceeds the ACL, there is significant evidence that the true mean concentration at
the compliance well exceeds the GWPS and that the facility permit has been violated. Again, in
most cases, a Lognormal model will approximate the data better than a Normal distribution model.
It is therefore recommended that the initial data checking and analysis be performed on the
logarithms of the data. If a Confidence interval is constructed using logged concentration data, the
lower Confidence limit should be compared to the logarithm of the ACL rather than the original
GWPS. Steps for computing Confidence intervals are given on pp. 6-3 to 6-11 of the Interim
Final Guidance.
5. STRATEGIES FOR MULTIPLE COMPARISONS
5.1 BACKGROUND OF PROBLEM
Multiple comparisons occur whenever more than one statistical test is performed during any
given monitoring or evaluation period. These comparisons can arise as a result of the need to test
multiple downgradient wells against a pool of upgradient background data or to test several
indicator parameters for contamination on a regular basis. Usually the same statistical test is
performed in every comparison, each test having a fixed level of confidence (1-α), and a
corresponding false positive rate, α.
The false positive rate (or Type I error) for an individual comparison is the probability that
the test will falsely indicate contamination, i.e., that the test will "trigger," though no contamination
has occurred. If ground-water data measurements were always constant in the absence of
contamination, false positives would never occur. But ground-water measurements typically vary,
either due to natural variation in the levels of background concentrations or to variation in lab
measurement and analysis.
Applying the same test to each comparison is acceptable if the number of comparisons is
small, but when the number of comparisons is moderate to large the false positive rate associated
with the testing network as a whole (that is, across all comparisons involving a separate statistical
test) can be quite high. This means that if enough tests are run, there will be a significant chance
that at least one test will indicate contamination, even if no actual contamination has occurred. As
an example, if the testing network consists of 20 separate comparisons (some combination of
multiple wells and/or indicator parameters) and a 99% confidence level Prediction interval limit is
used on each comparison, one would expect an overall network-wide false positive rate of over
18%, even though the Type I error for any single comparison is only 1%. This means there is
nearly 1 chance in 5 that one or more comparisons will falsely register potential contamination even
if none has occurred. With 100 comparisons and the same testing procedure, the overall network-
wide false positive rate jumps to more than 63%, adding additional expense to verify the lack of
contamination at falsely triggered wells.
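The network-wide rates quoted above follow directly from the independence assumption; a
one-line Python check (illustrative):

```python
def network_false_positive_rate(alpha, m):
    """Network-wide false positive rate for m independent comparisons,
    each run at individual Type I error rate alpha."""
    return 1.0 - (1.0 - alpha) ** m

# network_false_positive_rate(0.01, 20)  -> about 0.18 (over 18%)
# network_false_positive_rate(0.01, 100) -> about 0.63 (more than 63%)
```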
To lower the network-wide false positive rate, there are several important considerations. As
noted in Section 2.2.4, only those constituents that have been shown to be reliable indicators of
potential contamination should be statistically tested on a regular basis. By limiting the number of
tested constituents to the most useful indicators, the overall number of statistical comparisons that
must be made can be reduced, lowering the facility-wide false alarm rate. In addition, depending
on the hydrogeology of the site, some indicator parameters may need to be tested only at one (or a
few adjacent) regulated waste units, as opposed to testing across the entire facility, as long as the
permit specifies a common point of compliance, thus further limiting the number of total statistical
comparisons necessary.
One could also try to lower the Type I error applied to each individual comparison.
Unfortunately, for a given statistical test in general, the lower the false positive rate, the lower the
power of the test to detect real contamination at the well. If the statistical power drops too much,
real contamination will not be identified when it occurs, creating a situation not protective of the
environment or human health. Instead, alternative testing strategies can be considered that
specifically account for the number of statistical comparisons being made during any evaluation
period. All alternative testing strategies should be evaluated in light of two basic goals:
1. Is the network-wide false positive rate (across all constituents and wells being
tested) acceptably low? and
2. Does the testing strategy have adequate statistical power to detect real contamination
when it occurs?
To establish a standard recommendation for the network-wide overall false positive rate, it
should be noted that for some statistical procedures, EPA specifications mandate that the Type I
error for any individual comparison be at least 1%. The rationale for this minimum requirement is
motivated by statistical power. For a given test, if the Type I error is set too low, the power of the
test will dip below "acceptable" levels. EPA was not able to specify a minimum level of acceptable
power within the regulations because to do so would require specification of a minimum difference
of environmental concern between the null and alternative hypotheses. Limited current knowledge
about the health and/or environmental effects associated with incremental changes in concentration
levels of Appendix IX constituents greatly complicates this task. Therefore, minimum false
positive rates were adopted for some statistical procedures until more specific guidance could be
recommended. EPA's main objective, however, as in the past, is to approve tests that have
adequate statistical power to detect real contamination of ground water, and not to enforce
minimum false positive rates.
This emphasis is evident in §264.98(g)(6) for detection monitoring and §264.99(i) for
compliance monitoring. Both of these provisions allow the owner or operator to demonstrate that
the statistically significant difference between background and compliance point wells or between
compliance point wells and the Ground-Water Protection Standard is an artifact caused by an error
in sampling, analysis, statistical evaluation, or natural variation in ground-water chemistry. To
make the demonstration that the statistically significant difference was caused by an error in
sampling, analysis, or statistical evaluation, re-testing procedures that have been approved by the
Regional Administrator can be written into the facility permit, provided their statistical power is
comparable to the EPA Reference Power Curve given below.
For large monitoring networks, it is almost impossible to maintain a low network-wide
overall false positive rate if the Type I errors for individual comparisons must be kept above 1%.
As will be seen, some alternative testing strategies can achieve a low network-wide false positive
rate while maintaining adequate power to detect contamination. EPA therefore recommends that
instead of the 1% criterion for individual comparisons, the overall network-wide false positive rate
(across all wells and constituents) of any alternative testing strategy should be kept to
approximately 5% for each monitoring or evaluation period, while maintaining statistical power
comparable to the procedure below.
The other goal of any testing strategy should be to maintain adequate statistical power for
detecting contamination. Technically, power refers to the probability that a statistical testing
procedure will register and identify evidence of contamination when it exists. However, power is
typically defined with respect to a single comparison, not a network of comparisons. Since some
testing procedures may identify contamination more readily when several wells in the network are
contaminated as opposed to just one or two, it is suggested that all testing strategies be compared
on the following more stringent, but common, basis. Let the effective power of a testing
procedure be defined as the probability of detecting contamination in the monitoring network when
one and only one well is contaminated with a single constituent. Note that the effective power is a
conservative measure of how a testing regimen will perform over the network, because the test
must uncover one contaminated well among many clean ones (i.e., like "finding a needle in a
haystack").
To establish a recommended standard for the statistical power of a testing strategy, it must be
understood that the power is not a single number, but rather a function of the level of contamination
actually present. For most tests, the higher the level of contamination, the higher the statistical
power; likewise, the lower the contamination level, the lower the power. As such, when
increasingly contaminated ground water passes a particular well, it becomes easier for the statistical
test to distinguish background levels from the contaminated ground water; consequently, the power
is an increasing function of the contamination level.
Perhaps the best way to describe the power function associated with a particular testing
procedure is via a graph, such as the example below of the power of a standard Normal-based
upper Prediction limit with 99% confidence. The power in percent is plotted along the y-axis
against the standardized mean level of contamination along the x-axis. The standardized
contamination levels are in units of standard deviations above the baseline (estimated from
background data), allowing different power curves to be compared across indicator parameters,
wells, and so forth. The standardized units, Δ, may be computed as

$$ \Delta = \frac{(\text{Mean Contamination Level}) - (\text{Mean Background Level})}{(\text{SD of Background Data})} $$
In some situations, the probability that contamination will be detected by a particular testing
procedure may be difficult if not impossible to derive analytically and will have to be simulated on
a computer. In these cases, the power is typically estimated by generating Normally-distributed
random values at different mean levels and repeatedly simulating the test procedure. With enough
repetitions a reliable power curve can be plotted (e.g., see figure below).
[Figure: EPA REFERENCE POWER CURVE (16 Background Samples). Power (%) plotted against Δ, the standardized units above background, from 0 to 4.]
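A rough Python sketch of such a simulation for the Reference Power Curve setting (a minimal
illustration assuming Normally-distributed data and a t-based prediction limit for the next
single sample; not a prescribed procedure, and the function name is invented here):

```python
import numpy as np
from scipy.stats import t

def simulated_power(delta, n=16, conf=0.99, nsim=20000, seed=1):
    """Monte Carlo estimate of the power of a Normal-based upper Prediction
    limit for the next single sample, at standardized mean shift delta."""
    rng = np.random.default_rng(seed)
    bg = rng.standard_normal((nsim, n))              # simulated background sets
    limits = bg.mean(axis=1) + bg.std(axis=1, ddof=1) * \
             t.ppf(conf, n - 1) * np.sqrt(1 + 1.0 / n)
    future = rng.standard_normal(nsim) + delta       # one downgradient sample
    return (future > limits).mean()                  # fraction detected

# Sweeping delta from 0 to 4 traces out a power curve; at delta=0 the value
# should fall near the nominal 1% false positive rate.
```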
Notice that the power at Δ=0 represents the false positive rate of the test, because at that point
no contamination is actually present and the curve is indicating how often contamination will be
"detected" anyway. As long as the power at Δ=0 is approximately 5% (except for tests on an
individual constituent at an individual well, where the false positive rate should approximate 1%)
and the rest of the power curve is acceptably high, the testing strategy should be adequately
comparable to EPA standards.
To determine an acceptable power curve for comparison to alternative testing strategies, the
following EPA Reference Power Curve is suggested. For a given and fixed number of
background measurements, and based on Normally-distributed data from a single downgradient
well generated at various mean levels above background, the EPA Reference Power Curve will
represent the power associated with a 99% confidence upper prediction limit on the next single
future sample from the well (see figure above for n=16).
Since the power of a test depends on several factors, including the background sample size,
the type of test, and the number of comparisons, a different EPA Reference Power Curve will be
associated with each distinct number of background samples. Power curves of alternative tests
should only be compared to the EPA Reference Power Curve using a comparable number of
background measurements. If the power of the alternative test is at least as high as the EPA
reference, while maintaining an approximate 5% overall false positive rate, the alternative
procedure should be acceptable.
With respect to power curves, keep in mind three important considerations: 1) the power of
any testing method can be increased merely by relaxing the false positive rate requirement, letting α
become larger than 5%. This is why an approximate 5% alpha level is suggested as the standard
guidance, to ensure fair power comparisons among competing tests and to limit the overall
network-wide false positive rate. 2) The simulation of alternative testing methods should
incorporate every aspect of the procedure, from initial screens of the data to final decisions
concerning the presence of contamination. This is especially applicable to strategies that involve
some form of retesting at potentially contaminated wells. 3) When the testing strategy incorporates
multiple comparisons, it is crucial that the power be gauged by simulating contamination in one and
only one indicator parameter at a single well (i.e., by measuring the effective power). As noted
earlier, EPA recommends that power be defined conservatively, forcing any test procedure to find
"the needle in the haystack."
5.2 POSSIBLE STRATEGIES
5.2.1 Parametric and Non-parametric ANOVA
As described in the Interim Final Guidance, ANOVA procedures (either the parametric
method or the Kruskal-Wallis test) allow multiple downgradient wells (but not multiple
constituents) to be combined into a single statistical test, thus enabling the network-wide false
positive rate for any single constituent to be kept at 5% regardless of the size of the network. The
ANOVA method also maintains decent power for detecting real contamination, though only for
small to moderately-sized networks. In large networks, even the parametric ANOVA has a
difficult time finding the "needle in a haystack." The reason for this is that the ANOVA F-test
combines all downgradient wells simultaneously, so that "clean" wells are mixed together with the
single contaminated well, potentially masking the test's ability to detect the source of
contamination.
Because of these characteristics, the ANOVA procedure may have poorer power for detecting
a narrow plume of contamination which affects only one or two wells in a much larger network
(say 20 or more comparisons). Another drawback is that a significant ANOVA test result will not
indicate which well or wells is potentially contaminated without further post-hoc testing.
Furthermore, the power of the ANOVA procedure depends significantly on having at least 3 to 4
samples per well available for testing. Since the samples must be statistically independent,
collection of 3 or more samples at a given well may necessitate a several-month wait if the natural
ground-water velocity at that well is low. In this case, it may be tempting to look for other
strategies (e.g., Tolerance or Prediction intervals) that allow statistical testing of each new ground
water sample as it is collected and analyzed. Finally, since the simple one-way ANOVA procedure
outlined in the Interim Final Guidance is not designed to test multiple constituents simultaneously,
the overall false positive rate will be approximately 5% per constituent, leading to a potentially high
overall network-wide false positive rate (across wells and constituents) if many constituents need
to be tested.
5.2.2 Retesting with Parametric Intervals
One strategy alternative to ANOVA is a modification of approaches suggested by Gibbons
(1991a) and Davis and McNichols (1987). The basic idea is to adopt a two-phase testing strategy.
First, new samples from each well in the network are compared, for each designated constituent
parameter, against an upper Tolerance limit with pre-specified average coverage (Note that the
upper Tolerance limit will be different for each constituent). Since some constituents at some wells
in a large network would be expected to fail the Tolerance limit even in the absence of
contamination, each well that triggers the Tolerance limit is resampled and only those constituents
that "triggered" the limit are retested via an upper Prediction limit (again differing by constituent).
If one or more resamples fails the upper Prediction limit, the specific constituent at that well failing
the test is deemed to have a concentration level significantly greater than background. The overall
strategy is effective for large networks of comparisons (e.g., 100 or more comparisons), but also
flexible enough to accommodate smaller networks.
To design and implement an appropriate pair of Tolerance and Prediction intervals, one must
know the number of background samples available and the number of comparisons in the network.
Since parametric intervals are used, it is assumed that the background data are either Normal or can
be transformed to an approximate Normal distribution. The tricky part is to choose an average
coverage for the Tolerance interval and confidence level for the Prediction interval such that the
twin goals are met of keeping the overall false positive rate to approximately 5% and maintaining
adequate statistical power.
To derive the overall false positive rate for this retesting strategy, assume that when no
contamination is present each constituent and well in the network behaves independently of other
constituents and wells. Then if A_i denotes the event that well i is triggered falsely at some stage of
the testing, the overall false positive rate across m such comparisons can be written as

$$ \text{total } \alpha = \Pr\{A_1 \text{ or } A_2 \text{ or } \ldots \text{ or } A_i \text{ or } \ldots \text{ or } A_m\} = 1 - \prod_{i=1}^{m} \Pr\{\bar{A}_i\} $$

where Ā_i denotes the complement of event A_i. Since Pr{Ā_i} is the probability of not registering a
false trigger at uncontaminated well i, it may be written as

$$ \Pr\{\bar{A}_i\} = \Pr\{X_i \le TL\} + \Pr\{X_i > TL\} \times \Pr\{Y_i < PL \mid X_i > TL\} $$

where X_i represents the original sample at well i, Y_i represents the concentrations of one or more
resamples at well i, TL and PL denote the upper Tolerance and Prediction limits respectively, and
the right-most probability is the conditional event that all resample concentrations fall below the
Prediction limit when the initial sample fails the Tolerance limit.
Letting x = Pr{X_i ≤ TL} and y = Pr{Y_i < PL | X_i > TL}, the overall false positive rate across m
constituent-well combinations can be expressed as
$$ \text{total } \alpha = 1 - \left[x + (1-x)\,y\right]^m $$
As noted by Guttman (1970), the probability that any random sample will fall below the
upper Tolerance limit (i.e., quantity x above) is equal to the expected or average coverage of the
Tolerance interval. If the Tolerance interval has been constructed to have average coverage of
95%, x=0.95. Then given a predetermined value for x, a fixed number of comparisons m, and a
desired overall false positive rate α, we can solve for the conditional probability y as follows:

$$ y = \frac{(1-\alpha)^{1/m} - x}{1 - x} $$
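A small Python sketch of this calculation (illustrative only; as discussed below, the value of y
obtained this way is nominal, since the shared background data induce dependence between
the Tolerance and Prediction tests):

```python
def prediction_confidence(x, m, alpha=0.05):
    """Solve total_alpha = 1 - [x + (1-x)*y]**m for the retest confidence y,
    given Tolerance coverage x and m constituent-well comparisons."""
    return ((1 - alpha) ** (1.0 / m) - x) / (1.0 - x)

# e.g., with 95% average Tolerance coverage and m=20 comparisons:
# prediction_confidence(0.95, 20) -> about 0.95 (nominal, before simulation)
```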
If the conditional probability y were equal to the probability that the resample(s) for the ith
constituent-well combination falls below the .upper Prediction limit, one could fix a at, say, 5%,
and construct the Prediction interval to have confidence level y. In that way, one could guarantee
an expected network-wide false positive rate of 5%. Unfortunately, whether or not one or more
resamples falls below the Prediction limit depends partly on whether the initial sample for that
comparison eclipsed the Tolerance limit. This is because the same background data are used to
construct both the Tolerance limit and the Prediction limit, creating a statistical dependence between
the tests.
The exact relationship between the conditional probability y and the unconditional probability
Pr{Y_i < PL}
Tolerance limits for smaller networks and higher coverage Tolerance limits for larger networks.
That way (as can be seen in the table), the resulting Prediction limit confidence levels will be low
enough to allow the construction of Prediction limits with decent statistical power.
PARAMETRIC RETESTING STRATEGIES

# COMPARISONS    # BG SAMPLES    TOLERANCE        PREDICTION     RATING
                                 COVERAGE (%)     LEVEL (%)
      5                8              95              90           **
                      16              95              90           **
                      16              95              85           *
                      24              95              85           **
                      24              95              90           *
     20                8              95              98           **
                      16              95              97           **
                      24              95              97           **
     50               16              98              97           **
                      16              99              92           *
                      24              98              95           **
                      24              99              90           **
    100               16              98              98           *
                      24              99              95           *
                      24              98              98           *

Note: ** = strongly recommended
       * = recommended
Only strategies that approximately met the selection criteria are listed in the table. It can be
seen that some, but not all, of these strategies are strongly recommended. Those that are merely
"recommended" failed in the simulations to fully meet one or both of the selection criteria. The
performance of all the recommended strategies, however, should be adequate to correctly identify
contamination while maintaining a modest facility-wide false positive rate.
Once a combination of coverage and confidence levels for the Tolerance-Prediction interval
pair is selected, the statistical power of the testing strategy should be estimated in order to compare
with the EPA Reference Power Curve (particularly if the testing scenario is different from those
computed in this Addendum). Simulation results have suggested that the above method for
choosing a two-phase testing regimen can offer statistical power comparable to the EPA Reference
for almost any sized monitoring network (see power curves in Appendix B).
Several examples of simulated power curves are presented in Appendix B. The range of
downgradient wells tested is from 5 to 100 (note that the number of wells could actually represent
the number of constituent-well combinations if testing multiple parameters), and each curve is
based on either 8, 16, or 24 background samples. The y-axis of each graph measures the effective
power of the testing strategy, i.e., the probability that contamination is detected when one and only
one constituent at a single well has a mean concentration higher than background level. For each
case, the EPA Reference Power Curve is compared to two different two-phase testing strategies. In
the first case, wells that trigger the initial Tolerance limit are resampled once. This single resample
is compared to a Prediction limit for the next future sample. In the second case, wells that trigger
the Tolerance limit are resampled twice. Both resamples are compared to an upper Prediction limit
for the next two future samples at that well.
The simulated power curves suggest two points. First, with an appropriate choice of
coverage and prediction levels, the two-phase retesting strategies have comparable power to the
EPA Reference Power Curve, while maintaining low overall network-wide false positive rates.
Second, the power of the retesting strategy is slightly improved by the addition of a second
resample at wells that fail the initial Tolerance limit, because the sample size is increased.
Overall, the two-phase testing strategy defined above (i.e., first screening the network of
wells with a single upper Tolerance limit, and then applying an upper Prediction limit to resamples
from wells which fail the Tolerance interval) appears to meet EPA's objectives of maintaining
adequate statistical power for detecting contamination while limiting network-wide false positive
rates to low levels. Furthermore, since each compliance well is compared against the interval limits
separately, a narrow plume of contamination can be identified more efficiently than with an
ANOVA procedure (e.g., no post-hoc testing is necessary to finger the guilty wells, and the two-
phase interval testing method has more power against the "needle-in-a-haystack" contamination
hypothesis).
5.2.3 Retesting with Non-parametric Intervals
When parametric intervals are not appropriate for the data at hand, either due to a large
fraction of nondetects or a lack of fit to Normality or Lognormality, a network of individual
comparisons can be handled via retesting using non-parametric Prediction limits. The strategy is to
establish a non-parametric prediction limit for each designated indicator parameter based on
background samples that accounts for the number of well-constituent comparisons in the overall
network.
In order to meet the twin goals of maintaining adequate statistical power and a low overall
rate of false positives, a non-parametric strategy must involve some level of retesting at those wells
which initially indicate possible contamination. Retesting can be accomplished by taking a specific
number of additional, independent samples from each well in which a specific constituent triggers
the initial test and then comparing these samples against the non-parametric prediction limit for that
parameter.
Because more independent data is added to the overall testing procedure, retesting of
additional samples, in general, enables one to make more powerful and more accurate
determinations of possible contamination. Retesting does, however, involve a trade-off. Because
the power of the test increases with the number of resamples, one must decide how quickly
resamples can be collected to ensure 1) quick identification and confirmation of contamination and
yet, 2) the statistical independence of successive resamples from any particular well. Do not forget
that the performance of a non-parametric retesting strategy depends substantially on the
independence of the data from each well.
Two basic approaches to non-parametric retesting have been suggested by Gibbons (1990
and 1991b). Both strategies define the upper Prediction limit for each designated parameter to be
the maximum value of that constituent in the set of background data. Consequently, the
background wells used to construct the limits must be uncontaminated. After the Prediction limits
have been calculated, one sample is collected from each downgradient well in the network. If any
sample constituent value is greater than its upper prediction limit, the initial test is "triggered" and
one or more resamples must be collected at that downgradient well for that constituent for further
testing.
At this point, the similarity between the two approaches ends. In his 1990 article, Gibbons
computes the probability that at least one of m independent samples taken from each of k
downgradient wells will be below (i.e., pass) the prediction limit. The m samples include both the
initial sample and (m-1) resamples. Because retesting only occurs when the initial well sample fails
the limit, a given well fails the overall test (initial comparison plus retests) only if all (m-1)
resamples are above the prediction limit. If any resample passes the prediction limit, that well is
regarded as showing no significant evidence of contamination.
Initially, this first strategy may not appear to be adequately sensitive to mild contamination at
a given downgradient well. For example, suppose two resamples are to be collected whenever the
initial sample fails the upper prediction limit. If the initial sample is above the background
maximum and one of the resamples is also above the prediction limit, the well can still be classified
as "clean" if the other resample is below the prediction limit. Statistical power simulations (see
Appendix B), however, suggest that this strategy will perform adequately under a number of
monitoring scenarios. Still, EPA recognizes that a retesting strategy which might classify a well as
"clean" when the initial sample and a resample both fail the upper Prediction limit could offer
problematic implications for permit writers and enforcement personnel.
A more stringent approach was suggested by Gibbons in 1991. In that article (1991b),
Gibbons computes, as "passing behavior," the probability that all but one of m samples taken from
each of k wells pass the upper prediction limit. Under this definition, if the initial sample fails the
upper Prediction limit, all (m-1) resamples must pass the limit in order for the well to be classified as
"clean" during that testing period. Consequently, if any single resample falls above the background
maximum, that well is judged as showing significant evidence of contamination.
Either non-parametric retesting approach offers the advantage of being extremely easy to
implement in field testing of a large downgradient well network. In practice, one has only to
determine the maximum background sample to establish the upper prediction limit against which all
other comparisons are made. Gibbons' 1991 retesting scheme offers the additional advantage of
requiring less overall sampling at a given well to establish significant evidence of contamination.
Why? If the testing procedure calls for, say, two resamples at any well that fails the initial
prediction limit screen, retesting can end whenever either one of the two resamples falls above the
prediction limit. That is, the well will be designated as potentially contaminated if the first resample
fails the prediction limit even if the second resample has not yet been collected.
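Both decision rules reduce to one-line logical tests once the background maximum is in hand.
The SAS sketch below is hypothetical (the values of MAXBG, the initial sample, and the two
resamples are assumed for illustration); FAIL_A implements the 1990 rule and FAIL_B the 1991b
rule for a well whose initial sample has already triggered the screen:

* Sketch of the two non-parametric retesting rules for one triggered well;
* MAXBG, X1, X2, and X3 are assumed values for illustration;
DATA _NULL_;
   MAXBG = 64.2;                      /* maximum of the background samples */
   X1 = 70.0; X2 = 61.5; X3 = 66.0;   /* initial sample and two resamples  */
   IF X1 > MAXBG THEN DO;             /* initial screen triggered          */
      FAIL_A = (X2 > MAXBG) & (X3 > MAXBG);  /* 1990 rule: all resamples fail */
      FAIL_B = (X2 > MAXBG) | (X3 > MAXBG);  /* 1991b rule: any resample fails */
      PUT FAIL_A= FAIL_B=;            /* here FAIL_A=0, FAIL_B=1           */
   END;
RUN;

For the assumed values, rule A would report the well clean (only one resample exceeds the
background maximum), while the more stringent rule B would report significant evidence of
contamination.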
In both of his papers, Gibbons offers tables that can be used to compute the overall network-
wide false positive rate, given the number of background samples, the number of downgradient
comparisons, and the number of retests for each comparison. It is clear that there is less flexibility
in adjusting a non-parametric as opposed to a parametric prediction limit to achieve a certain Type I
error rate. In fact, if only a certain number of retests are feasible at any given well (e.g., in order
to maintain independence of successive samples), the only recourse to maintain a low false positive
rate is to collect a larger number of background samples. This inflexibility also illustrates why
non-parametric tests, which make no distributional assumptions about the data, are on the whole
less efficient and less powerful than their parametric counterparts.
Unfortunately, the power of these non-parametric retesting strategies is not explored in detail
by Gibbons. To compare the power of both Gibbons' strategies against the EPA Reference Power
Curve, Normally distributed data were simulated for several combinations of numbers of
background samples and downgradient wells (again, if multiple constituents are being tested, the
number of wells in the simulations may be regarded as the number of constituent-well
combinations). Up to three resamples were allowed in the simulations for comparative purposes.
EPA recognizes, however, that it will be feasible in general to collect only one or two independent
resamples from any given well. Power curves representing the results of these simulations are
given in Appendix B. For each scenario, the EPA Reference Power Curve is compared with the
simulated powers of six different testing strategies. These strategies include collection of no
resamples, one resample, two resamples under Gibbons' 1990 approach (designated as A on the
curves) and his 1991 approach (labelled as B), and three resamples (under approaches A and B).
Under the one resample strategy, a potentially contaminated compliance well is designated as
"clean" if the resample passes the retest and "contaminated" otherwise.
The following table lists the best-performing strategies under each scenario. As with the use
of parametric intervals for retesting, the criteria for selecting the best-performing strategies required
1) an approximate 5% facility-wide false positive rate and 2) power equivalent to or better than the
EPA Reference Power Curve. Because Normal data were used in these power simulations, more
realistically skewed data would likely result in greater advantages for the non-parametric retesting
strategies over the EPA Reference test.
Examination of the table and the power curves in Appendix B shows that the number of
background samples has an important effect on the recommended testing strategy. For instance,
with 8 background samples in a network of at least 20 wells, the best performing strategies all
involve collection of 3 resamples per "triggered" compliance well (EPA regards such a strategy as
impractical for permitting and enforcement purposes at most RCRA facilities). It tends to be true
that as the number of available background samples grows, fewer resamples are needed from each
potentially contaminated compliance well to maintain adequate power. If, as is expected, the
number of feasible, independent retests is limited, a facility operator may have to collect additional
background measurements in order to establish an adequate retesting strategy.
NON-PARAMETRIC RETESTING STRATEGIES

# WELLS   # BG SAMPLES   STRATEGY          REFERENCE       RATING
  5            8         1 Resample                          *
  5            8         2 Resamples (A)   Gibbons, 1990    **
  5           16         1 Resample                         **
  5           16         2 Resamples (B)   Gibbons, 1991    **
  5           24         2 Resamples (B)   Gibbons, 1991    **
 20            8         2 Resamples (A)   Gibbons, 1990     *
 20           16         1 Resample                          *
 20           16         2 Resamples (A)   Gibbons, 1990     *
 20           24         1 Resample                         **
 20           24         2 Resamples (B)   Gibbons, 1991     *
 20           32         1 Resample                          *
 20           32         2 Resamples (B)   Gibbons, 1991    **
 50           16         2 Resamples (A)   Gibbons, 1990    **
 50           24         1 Resample                          *
 50           24         2 Resamples (A)   Gibbons, 1990     *
 50           32         1 Resample                         **
100           16         2 Resamples (A)   Gibbons, 1990    **
100           24         2 Resamples (A)   Gibbons, 1990     *
100           32         1 Resample                          *

Note: ** = very good performance; * = good performance
6. OTHER TOPICS
6.1 CONTROL CHARTS
Control Charts are an alternative to Prediction limits for performing either intrawell
comparisons or comparisons to historically monitored background wells during detection
monitoring. Since the baseline parameters for a Control Chart are estimated from historical data,
this method is only appropriate for initially uncontaminated compliance wells. The main advantage
of a Control Chart over a Prediction limit is that a Control Chart allows data from a well to be
viewed graphically over time. Trends and changes in the concentration levels can be seen easily,
because all sample data is consecutively plotted on the chart as it is collected, giving the data
analyst an historical overview of the pattern of contamination. Prediction limits allow only point-
in-time comparisons between the most recent data and past information, making long-term trends
difficult to identify.
More generally, intrawell comparison methods eliminate the need to worry about spatial
variability between wells in different locations. Whenever background data is compared to
compliance point measurements, there is a risk that any statistically significant difference in
concentration levels is due to spatial and/or hydrogeological differences between the wells rather
than contamination at the facility. Because intrawell comparisons involve but a single well,
significant changes in the level of contamination cannot be attributed to spatial differences between
wells, regardless of whether the method used is a Prediction limit or Control Chart.
Of course, past observations can be used as baseline data in an intrawell comparison only if
the well is known to be uncontaminated. Otherwise, the comparison between baseline data and
newly collected samples may negate the goal in detection monitoring of identifying evidence of
contamination. Furthermore, without specialized modification, Control Charts do not efficiently
handle truncated data sets (i.e., those with a significant fraction of nondetects), making them
appropriate only for those constituents with a high frequency of occurrence in monitoring wells.
Control Charts tend to be most useful, therefore, for inorganic parameters (e.g., some metals and
geochemical monitoring parameters) that occur naturally in the ground water.
The steps to construct a Control Chart can be found on pp. 7-3 to 7-10 of the Interim Final
Guidance. The way a Control Chart works is as follows. Initial sample data is collected (from the
specific compliance well in an intrawell comparison or from background wells in comparisons of
compliance data with background) in order to establish baseline parameters for the chart,
specifically, estimates of the well mean and well variance. These samples are meant to characterize
the concentration levels of the uncontaminated well, before the onset of detection monitoring.
Since the estimate of well variance is particularly important, it is recommended that at least 8
samples be collected (say, over a year's time) to estimate the baseline parameters. Note that none
of these 8 or more samples is actually plotted on the chart.
As future samples are collected, the baseline parameters are used to standardize the data. At
each sampling period, a standardized mean is computed using the formula below, where μ
represents the baseline mean concentration, σ represents the baseline standard deviation, and X̄i
is the mean of the ni samples collected in the ith sampling period:

     Zi = (X̄i - μ)√ni/σ

A cumulative sum (CUSUM) for the ith period is also computed, using the formula
Si = max{0, (Zi - k) + Si-1}, where Zi is the standardized mean for that period and k represents a
pre-chosen Control Chart parameter.
Once the data have been standardized and plotted, a Control Chart is declared out-of-control
if the sample concentrations become too large when compared to the baseline parameters. An out-
of-control situation is indicated on the Control Chart when either the standardized means or
CUSUMs cross one of two pre-determined threshold values. These thresholds are based on the
rationale that if the well remains uncontaminated, new sample values standardized by the original
baseline parameters should not deviate substantially from the baseline level. If contamination does
occur, the old baseline parameters will no longer accurately represent concentration levels at the
well and, hence, the standardized values should significantly deviate from the baseline levels on the
Control Chart.
In the combined Shewhart-cumulative sum (CUSUM) Control Chart recommended by the
Interim Final Guidance (Section 7), the chart is declared out-of-control in one of two ways. First,
the standardized means (Zi) computed at each sampling period may cross the Shewhart control
limit (SCL). Such a change signifies a rapid increase in well concentration levels among the most
recent sample data. Second, the cumulative sum (CUSUM) of the standardized means may
become too large, crossing the "decision interval value" (h). Crossing the h threshold can mean
either a sudden rise in concentration levels or a gradual increase over a longer span of time. A
gradual increase or trend is particularly indicated if the CUSUM crosses its threshold but the
standardized mean Zi does not. The reason for this is that several consecutive small increases in Zi
will not trigger the SCL threshold, but may trigger the CUSUM threshold. As such, the Control
Chart can indicate the onset of either sudden or gradual contamination at the compliance point.
As with other statistical methods, Control Charts are based on certain assumptions about the
sample data. The first is that the data at an uncontaminated well (i.e., a well process that is "in
control") are Normally distributed. Since estimates of the baseline parameters are made using
initially collected data, these data should be tested for Normality using one of the goodness-of-fit
techniques described earlier. Better yet, the logarithms of the data should be tested first, to see if a
Lognormal model is appropriate for the concentration data. If the Lognormal model is not rejected,
the Control Chart should be constructed solely on the basis of logged data.
The methodology for Control Charts also assumes that the sample data are independently
distributed from a statistical standpoint. In fact, these charts can easily give misleading results if
the consecutive sample data are not independent. For this reason, it is important to design a
sampling plan so that distinct volumes of water are analyzed each sampling period and that
duplicate sample analyses are not treated as independent observations when constructing the
Control Chart.
The final assumption is that the baseline parameters at the well reflect current background
concentration levels. Some long-term fluctuation in background levels may be possible even
though contamination has not occurred at a given well. Because of this possibility, if a Control
Chart remains "in control" for a long period of time, the baseline parameters should be updated to
include more recent observations as background data. After all, the original baseline parameters
will often be based only on the first year's data. Much better estimates of the true background
mean and variance can be obtained by including more data at a later time.
To update older background data with more recent samples, a two-sample t-test can be run to
compare the older concentration levels with the concentrations of the proposed update samples. If
the t-test does not show a significant difference at the 5 percent significance level, proceed to re-
estimate the baseline parameters by including more recent data. If the t-test does show a significant
difference, the newer data should not be characterized as background unless some specific factor
can be pinpointed explaining why background levels on the site have naturally changed.
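A minimal SAS sketch of this check is given below, assuming a data set BASELINE with the
concentration variable CONC and an indicator PERIOD coded 1 for the original background
samples and 2 for the proposed update samples (both names are hypothetical):

* Two-sample t-test of original background (PERIOD=1) versus proposed
  update samples (PERIOD=2); data set and variable names are assumed;
PROC TTEST DATA=BASELINE ALPHA=0.05;
   CLASS PERIOD;
   VAR CONC;
RUN;

If the reported p-value exceeds 0.05 (use the Satterthwaite result when the group variances appear
unequal), the update samples may be pooled with the original background and the baseline mean
and standard deviation re-estimated from the combined data.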
EXAMPLE 18
Construct a control chart for the 8 months of data collected below.
μ = 27 ppb
σ = 25 ppb

          Nickel Concentration (ppb)
Month       Sample 1       Sample 2
  1           15.3           22.6
  2           41.1           27.8
  3           17.5           18.1
  4           15.7           31.5
  5           37.2           32.4
  6           25.1           32.5
  7           19.9           27.5
  8           99.3           64.2
SOLUTION
Step 1. The three parameters necessary to construct a combined Shewhart-CUSUM chart are
h=5, k=1, and SCL=4.5 in units of standard deviation (SD).
Step 2. List the sampling periods and monthly means, as in the following table.
Month    Ti    Mean (ppb)      Zi      Zi - k      Si
  1       1       19.0       -0.45     -1.45      0.00
  2       2       34.5        0.42     -0.58      0.00
  3       3       17.8       -0.52     -1.52      0.00
  4       4       23.6       -0.19     -1.19      0.00
  5       5       34.8        0.44     -0.56      0.00
  6       6       28.8        0.10     -0.90      0.00
  7       7       23.7       -0.19     -1.19      0.00
  8       8       81.8        3.10      2.10      2.10
Step 3. Compute the standardized means Zi and the quantities Si. List in the table above. Each
Si is computed for consecutive months using the formula on p. 7-8 of the EPA guidance
document.
S1 = max {0, -1.45 + 0} = 0.00
S2 = max {0, -0.58 + 0} = 0.00
S3 = max {0, -1.52 + 0} = 0.00
S4 = max {0, -1.19 + 0} = 0.00
S5 = max {0, -0.56 + 0} = 0.00
S6 = max {0, -0.90 + 0} = 0.00
S7 = max {0, -1.19 + 0} = 0.00
S8 = max {0, 2.10 + 0} = 2.10
Step 4. Plot the control chart as given below. The combined chart indicates that there is no
evidence of contamination at the monitored facility, because neither the standardized
means nor the CUSUM statistics cross their respective thresholds (SCL and h) for the
months examined.
[Figure: CONTROL CHART FOR NICKEL DATA (MU = 27 ppb, SIGMA = 25 ppb). Standardized
concentration plotted against sampling period (months 1-8), showing the standardized means (Z)
and the CUSUM trace together with the SCL and h thresholds.]
Note: In the above Control Chart, the CUSUMs are compared to threshold h, while the
standardized means (Z) are compared to the SCL threshold.
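The chart statistics in Steps 2 and 3 can be reproduced with a short SAS data step (a sketch only;
the data set and variable names are illustrative):

* Sketch reproducing the Example 18 chart statistics (2 samples per month);
DATA NICKEL;
   RETAIN S 0;                        /* CUSUM carried across months  */
   MU = 27; SIGMA = 25; N = 2;        /* baseline parameters          */
   K = 1; H = 5; SCL = 4.5;           /* chart parameters from Step 1 */
   INPUT MONTH XBAR;                  /* monthly mean nickel (ppb)    */
   Z = (XBAR - MU)*SQRT(N)/SIGMA;     /* standardized mean Zi         */
   S = MAX(0, (Z - K) + S);           /* Si = max{0, (Zi - k) + Si-1} */
   OUT = (Z > SCL) | (S > H);         /* out-of-control flag          */
   DATALINES;
1 19.0
2 34.5
3 17.8
4 23.6
5 34.8
6 28.8
7 23.7
8 81.8
;
RUN;
PROC PRINT DATA=NICKEL; VAR MONTH XBAR Z S OUT; RUN;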
6.2 OUTLIER TESTING
Formal testing for outliers should be done only if an observation seems particularly high (by
orders of magnitude) compared to the rest of the data set. If a sample value is suspect, one should
run the outlier test described on pp. 8-11 to 8-14 of the EPA guidance document. It should be
cautioned, however, that this outlier test assumes that the rest of the data values, except for the
suspect observation, are Normally distributed (Barnett and Lewis, 1978). Since Lognormally
distributed measurements often contain one or more values that appear high relative to the rest, it is
recommended that the outlier test be run on the logarithms of the data instead of the original
observations. That way, one can avoid classifying a high Lognormal measurement as an outlier
just because the test assumptions were violated.
If the test designates an observation as a statistical outlier, the sample should not be treated as
such until a specific reason for the abnormal measurement can be determined. Valid reasons may,
for example, include contaminated sampling equipment, laboratory contamination of the sample, or
errors in transcription of the data values. Once a specific reason is documented, the sample should
be excluded from any further statistical analysis. If a plausible reason cannot be found, the sample
should be treated as a true but extreme value, not to be excluded from further analysis.
EXAMPLE 19
The table below contains data from five wells measured over a 4-month period. The value
7066 is found in the second month at well 3. Determine whether there is statistical evidence that
this observation is an outlier.
            Carbon Tetrachloride Concentration (ppb)
Month     Well 1     Well 2     Well 3     Well 4     Well 5
  1         1.69      302        16.2      199        275
  2         3.25       35.1     7066        41.6        6.5
  3         7.3        15.6      350        75.4       59.7
  4        12.1        13.7       70.14     57.9       68.4
SOLUTION
Step 1. Take logarithms of each observation. Then order and list the logged concentrations.
Order     Concentration (ppb)     Logged Concentration
  1             1.69                    0.525
  2             3.25                    1.179
  3             6.5                     1.872
  4             7.3                     1.988
  5            12.1                     2.493
  6            13.7                     2.617
  7            15.6                     2.747
  8            16.2                     2.785
  9            35.1                     3.558
 10            41.6                     3.728
 11            57.9                     4.059
 12            59.7                     4.089
 13            68.4                     4.225
 14            70.1                     4.250
 15            75.4                     4.323
 16           199                       5.293
 17           275                       5.617
 18           302                       5.710
 19           350                       5.878
 20          7066                       8.863
Step 2. Calculate the mean and SD of all the logged measurements. In this case, the mean and
SD are 3.789 and 1.916, respectively.
Step 3. Calculate the outlier test statistic T20 as

     T20 = (X(20) - X̄)/SD = (8.863 - 3.789)/1.916 = 2.648.
Step 4. Compare the observed statistic T20 with the critical value of 2.557 for a sample size
n=20 and a significance level of 5 percent (taken from Table 8 on p. B-12 of the Interim
Final Guidance). Since the observed value T20=2.648 exceeds the critical value, there is
significant evidence that the largest observation is a statistical outlier. Before excluding
this value from further analysis, a valid explanation for this unusually high value should
be found. Otherwise, treat the outlier as an extreme but valid concentration
measurement.
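The same computation can be scripted in SAS (a sketch; the input data set WELLS and its variable
CONC are assumed, and the critical value must still be taken from Table 8 of the Interim Final
Guidance):

* Outlier statistic Tn computed on the logged concentrations; the input
  data set WELLS and its variable CONC are assumed;
DATA LOGGED;
   SET WELLS;
   LOGCONC = LOG(CONC);               /* natural logarithm */
RUN;
PROC MEANS DATA=LOGGED NOPRINT;
   VAR LOGCONC;
   OUTPUT OUT=STATS N=N MEAN=LMEAN STD=LSD MAX=LMAX;
RUN;
DATA _NULL_;
   SET STATS;
   T = (LMAX - LMEAN)/LSD;            /* compare with tabled critical value */
   PUT N= T= 6.3;                     /* for Example 19: T = 2.648          */
RUN;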
REFERENCES
Aitchison, J. (1955) On the distribution of a positive random variable having a discrete probability
mass at the origin. Journal of American Statistical Association, 50(272): 901-8.
Barnett, V. and Lewis, T. (1978) Outliers in statistical data. New York: John Wiley & Sons.
Cohen, A.C., Jr. (1959) Simplified estimators for the normal distribution when samples are singly
censored or truncated. Technometrics, 1:217-37.
Cox, D.R. and Hinkley, D.V. (1974) Theoretical statistics. London: Chapman & Hall.
Davis, C.B. and McNichols, R.J. (1987) One-sided intervals for at least p of m observations from a
normal population on each of r future occasions. Technometrics, 29(3):359-70.
Filliben, J.J. (1975) The probability plot correlation coefficient test for normality. Technometrics,
17:111-7.
Gan, F.F. and Koehler, K.J. (1990) Goodness-of-fit tests based on p-p probability plots.
Technometrics, 32(3):289-303.
Gayen, A.K. (1949) The distribution of "Student's" t in random samples of any size drawn from non-
normal universes. Biometrika, 36:353-69.
Gibbons, R.D. (1987a) Statistical prediction intervals for the evaluation of ground-water quality.
Ground Water, 25(4):455-65.
Gibbons, R.D. (1987b) Statistical models for the analysis of volatile organic compounds in waste
disposal sites. Ground Water, 25(5):572-80.
Gibbons, R.D. (1990) A general statistical procedure for ground-water detection monitoring at waste
disposal facilities. Ground Water, 28(2):235-43.
Gibbons, R.D. (1991a) Statistical tolerance limits for ground-water monitoring. Ground Water,
29(4):563-70.
Gibbons, R.D. (1991b) Some additional nonparametric prediction limits for ground-water detection
monitoring at waste disposal facilities. Ground Water, 29(5):729-36.
Gilliom, R.J. and Helsel, D.R. (1986) Estimation of distributional parameters for censored trace level
water quality data: part 1, estimation techniques. Water Resources Research, 22(2): 135-46.
Guttman, I. (1970) Statistical tolerance regions: classical and bayesian. Darien, Connecticut: Hafner
Publishing.
Hahn, G.J. (1970) Statistical intervals for a normal population: part 1, tables, examples, and
applications. Journal of Quality Technology, 2(3): 115-25.
Lehmann, E.L. (1975) Nonparametrics: statistical methods based on ranks. San Francisco: Holden
Day, Inc.
Madansky, A. (1988) Prescriptions for working statisticians. New York: Springer-Verlag.
McBean, E.A. and Rovers, F.A. (1992) Estimation of the probability of exceedance of contaminant
concentrations. Ground Water Monitoring Review, Winter, 115-9.
McNichols, R.J. and Davis, C.B. (1988) Statistical issues and problems in ground water detection
monitoring at hazardous waste facilities. Ground Water Monitoring Review, Fall.
Miller, R.G., Jr. (1986) Beyond ANOVA: basics of applied statistics. New York: John Wiley &
Sons.
Milliken, G.A. and Johnson, D.E. (1984) Analysis of messy data: volume 1, designed experiments.
Belmont, California: Lifetime Learning Publications.
Ott, W.R. (1990) A physical explanation of the lognormality of pollutant concentrations. Journal of
Air Waste Management Association, 40:1378-83.
Ryan, T.A., Jr. and Joiner, B.L. (1990) Normal probability plots and tests for normality. Minitab
Statistical Software: Technical Reports, November, 1-1 to 1-14.
Shapiro, S.S. and Wilk, M.B. (1965) An analysis of variance test for normality (complete samples).
Biometrika, 52:591-611.
Shapiro, S.S. and Francia, R.S. (1972) An approximate analysis of variance test for normality.
Journal of American Statistical Association, 67(337):215-6.
Zacks, S. (1970) Uniformly most accurate upper tolerance limits for monotone likelihood ratio
families of discrete distributions. Journal of American Statistical Association, 65(329):307-
16.
TABLE A-1.
COEFFICIENTS {a(n-i+1)} FOR W TEST OF
NORMALITY, FOR N=2(1)50
i/n
1
2
3
4
5
i/n
1
2
3
4
5
6
7
8
9
10
i/n
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
i/n
1
2
3
4
5
6
7
8
9
10
2
0.7071
_ _
11
0.5601
.3315
.2260
.1429
.0695
0.0000
21
0.4643
.3185
.2578
.2119
.1736
0.1399
.1092
.0804
.0530
.0263
0.0000
31
0.4220
.2921
.2475
.2145
.1874
0.1641
.1433
.1243
.1066
.0899
3
0.7071
.0000
12
0.5475
.3325
.2347
.1586
.0922
0.0303
22
0.4590
.3156
.2571
.2131
.1764
0.1443
.1150
.0878
.0618
.0368
0.0122
32
0.4188
.2898
.2463
.2141
.1878
0.1651
.1449
.1265
.1093
.0931
4
0.6872
.1677
13
0.5359
.3325
.2412
.1707
.1099
0.0539
.0000
23
0.4542
.3126
.2563
.2139
.1787
0.1480
.1201
.0941
.0696
.0459
0.0228
.0000
33
0.4156
.2876
.2451
.2137
.1880
0.1660
.1463
.1284
.1118
.0961
5
0.6646
.2413
.0000
14
0.5251
.3318
.2460
.1802
.1240
0.0727
.0240
24
0.4493
.3098
.2554
.2145
.1807
0.1512
.1245
.0997
.0764
.0539
0.0321
.0107
34
0.4127
.2854
.2439
.2132
.1882
0.1667
.1475
.1301
.1140
.0988
6
0.6431
.2806
.0875
15
0.5150
.3306
.2495
.1878
.1353
0.0880
.0433
.0000
25
0.4450
.3069
.2543
.2148
.1822
0.1539
.1283
.1046
.0823
.0610
0.0403
.0200
.0000
35
0.4096
.2834
.2427
.2127
.1883
0.1673
.1487
.1317
.1160
.1013
7
0.6233
.3031
.1401
.0000
16
0.5056
.3290
.2521
.1939
.1447
0.1005
.0593
.0196
26
0.4407
.3043
.2533
.2151
.1836
0.1563
.1316
.1089
.0876
.0672
0.0476
.0284
.0094
36
0.4068
.2813
.2415
.2121
.1883
0.1678
.1496
.1331
.1179
.1036
8
0.6052
.3164
.1743
.0561
17
0.4968
.3273
.2540
.1988
.1524
0.1109
.0725
.0359
.0000
27
0.4366
.3018
.2522
.2152
.1848
0.1584
.1346
.1128
.0923
.0728
0.0540
.0358
.0178
.0000
37
0.4040
.2794
.2403
.2116
.1883
0.1683
.1503
.1344
.1196
.1056
9
0.5888
.3244
.1976
.0947
.0000
18
0.4886
.3253
.2553
.2027
.1587
0.1197
.0837
.0496
.0163
28
0.4328
.2992
.2510
.2151
.1857
0.1601
.1372
.1162
.0965
.0778
0.0598
.0424
.0253
.0084
38
0.4015
.2774
.2391
.2110
.1881
0.1686
.1513
.1356
.1211
.1075
10
0.5739
.3291
.2141
.1224
.0399
19
0.4808
.3232
.2561
.2059
.1641
0.1271
.0932
.0612
.0303
.0000
29
0.4291
.2968
.2499
.2150
.1864
0.1616
.1395
.1192
.1002
.0822
0.0650
.0483
.0320
.0159
.0000
39
0.3989
.2755
.2380
.2104
.1880
0.1689
.1520
.1366
.1225
.1092
20
0.4734
.3211
.2565
.2085
.1686
0.1334
.1013
.0711
.0422
.0140
30
0.4254
.2944
.2487
.2148
.1870
0.1630
.1415
.1219
.1036
.0862
0.0697
.0537
.0381
.0227
.0076
40
0.3964
.2737
.2368
.2098
.1878
0.1691
.1526
.1376
.1237
.1108
TABLE A-1. (CONTINUED)
COEFFICIENTS {a(n-i+1)} FOR W TEST OF
NORMALITY, FOR N=2(1)50
i/n
11
12
13
14
15
16
17
18
19
20
i/n
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
31
0.0739
.0585
.0435
.0289
.0144
0.0000
41
0.3940
.2719
.2357
.2091
.1876
0.1693
.1531
.1384
.1249
.1123
0.1004
.0891
.0782
.0677
.0575
0.0476
.0379
.0283
.0188
.0094
0.0000
32
0.0777
.0629
.0485
.0344
.0206
0.0068
42
0.3917
.2701
.2345
.2085
.1874
0.1694
.1535
.1392
.1259
.1136
0.1020
.0909
.0804
.0701
.0602
0.0506
.0411
.0318
.0227
.0136
0.0045
33
0.0812
.0669
.0530
.0395
.0262
0.0131
.0000
43
0.3894
.2684
.2334
.2078
.1871
0.1695
.1539
.1398
.1269
.1149
0.1035
.0927
.0824
.0724
.0628
0.0534
.0442
.0352
.0263
.0175
0.0087
.0000
34
0.0844
.0706
.0572
.0441
.0314
0.0187
.0062
44
0.3872
.2667
.2323
.2072
.1868
0.1695
.1542
.1405
.1278
.1160
0.1049
.0943
.0842
.0745
.0651
0.0560
.0471
.0383
.0296
.0211
0.0126
.0042
35
0.0873
.0739
.0610
.0484
.0361
0.0239
.0119
.0000
45
0.3850
.2651
.2313
.2065
.1865
0.1695
.1545
.1410
.1286
.1170
0.1062
.0959
.0860
.0775
.0673
0.0584
.0497
.0412
.0328
.0245
0.0163
.0081
.0000
36
0.0900
.0770
.0645
.0523
.0404
0.0287
.0172
.0057
46
0.3830
.2635
.2302
.2058
.1862
0.1695
.1548
.1415
.1293
.1180
0.1073
.0972
.0876
.0785
.0694
0.0607
.0522
.0439
.0357
.0277
0.0197
.0118
.0039
37
0.0924
.0798
.0677
.0559
.0444
0.0331
.0220
.0110
.0000
47
0.3808
.2620
.2291
.2052
.1859
0.1695
.1550
.1420
.1300
.1189
0.1085
.0986
.0892
.0801
.0713
0.0628
.0546
.0465
.0385
.0307
0.0229
.0153
.0076
.0000
38
0.0947
.0824
.0706
.0592
.0481
0.0372
.0264
.0158
.0053
48
0.3789
.2604
.2281
.2045
.1855
0.1693
.1551
.1423
.1306
.1197
0.1095
.0998
.0906
.0817
.0731
0.0648
.0568
.0489
.0411
.0335
0.0259
.0185
.0111
.0037
39
0.0967
.0848
.0733
.0622
.0515
0.0409
.0305
.0203
.0101
.0000
49
0.3770
.2589
.2271
.2038
.1851
0.1692
.1553
.1427
.1312
.1205
0.1105
.1010
.0919
.0832
.0748
0.0667
.0588
.0511
.0436
.0361
0.0288
.0215
.0143
.0071
.0000
40
0.0986
.0870
.0759
.0651
.0546
0.0444
.0343
.0244
.0146
.0049
50
0.3751
.2574
.2260
.2032
.1847
0.1691
.1554
.1430
.1317
.1212
0.1113
.1020
.0932
.0846
.0764
0.0685
.0608
.0532
.0459
.0386
0.0314
.0244
.0174
.0104
.0035
TABLE A-2.
PERCENTAGE POINTS OF THE W TEST FOR N=3(1)50
n 0.01 0.05
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
0.753
.687
.686
0.713
.730
.749
.764
.781
0.792
.805
.814
.825
.835
0.844
.851
.858
.863
.868
0.873
.878
.881
.884
.888
0.891
.894
.896
.898
.900
0.902
.904
.906
.908
.910
0.767
.748
.762
0.788
.803
.818
.829
.842
0.850
.859
.866
.874
.881
0.887
.892
.897
.901
.905
0.908
.911
.914
.916
.918
0.920
.923
.924
.926
.927
0.929
.930
.931
.933
.934
TABLE A-2. (CONTINUED)
PERCENTAGE POINTS OF THE W TEST FOR N=3(1)50
n 0.01 0.05
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
0.912
.914
.916
.917
.919
0.920
.922
.923
.924
.926
0.927
.928
.929
.929
.930
0.935
.936
.938
.939
.940
0.941
.942
.943
.944
.945
0.945
.946
.947
.947
.947
TABLE A-3.
PERCENTAGE POINTS OF THE W TEST FOR N>35
n .01 .05
35
50
51
53
55
57
59
61
63
65
67
69
71
73
75
77
79
81
83
85
87
89
91
93
95
97
99
0.919
.935
0.935
.938
.940
.944
.945
0.947
.947
.948
.950
.951
0.953
.956
.956
.957
.957
0.958
.960
.961
.961
.961
0.962
.963
.965
.965
.967
0.943
.953
0.954
.957
.958
.961
.962
0.963
.964
.965
.966
.966
0.967
.968
.969
.969
.970
0.970
.971
.972
.972
.972
0.973
.973
.974
.975
.976
TABLE A-4.
PERCENT POINTS OF THE NORMAL PROBABILITY PLOT
CORRELATION COEFFICIENT FOR N=3(1)50(5)100
n
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
.01
.869
.822
.822
.835
.847
.859
.868
.876
.883
.889
.895
.901
.907
.912
.912
.919
.923
.925
.928
.930
.933
.936
.937
.939
.941
.943
.945
.947
.948
.949
.950
.951
.952
.953
.955
.956
.957
.958
.025
.872
.845
.855
.868
.876
.886
.893
.900
.906
.912
.917
.921
.925
.928
.931
.934
.937
.939
.942
.944
.947
.949
.950
.952
.953
.955
.956
.957
.958
.959
.960
.960
.961
.962
.962
.964
.965
.966
.05
.879
.868
.879
.890
.899
.905
.912
.917
.922
.926
.931
.934
.937
.940
.942
.945
.947
.950
.952
.954
.955
.957
.958
.959
.960
.962
.962
.964
.965
.966
.967
.967
.968
.968
.969
.970
.971
.972
TABLE A-4. (CONTINUED)
PERCENT POINTS OF THE NORMAL PROBABILITY PLOT
CORRELATION COEFFICIENT FOR N=3(1)50(5)100
n
41
42
43
44
45
46
47
48
49
50
55
60
65
70
75
80
85
90
95
100
.01
.958
.959
.959
.960
.961
.962
.963
.963
.964
.965
.967
.970
.972
.974
.975
.976
.977
.978
.979
.981
.025
.967
.967
.967
.968
.969
.969
.970
.970
.971
.972
.974
.976
.977
.978
.979
.980
.981
.982
.983
.984
.05
.973
.973
.973
.974
.974
.974
.975
.975
.977
.978
.980
.981
.982
.983
.984
.985
.985
.985
.986
.987
TABLE A-5.
VALUES OF LAMBDA FOR COHEN'S METHOD
γ
.01
.05
.10
.15
.20
.25
.30
.35
.40
.45
.50
.55
.60
.65
.70
.75
.80
.85
.90
.95
1.00
1.05
1.10
1.15
1.20
1.25
1.30
1.35
1.40
1.45
1.50
1.55
1.60
1.65
1.70
1.75
1.80
1.85
1.90
1.95
.01
.0102
.0105
.0110
.0113
.0116
.0120
.0122
.0125
.0128
.0130
.0133
.0135
.0137
.0140
.0142
.0144
.0146
.0148
.0150
.0152
.0153
.0155
.0157
.0159
.0160
.0162
.0164
.0165
.0167
.0168
.0170
.0171
.0173
.0174
.0176
.0177
.0179
.0180
.0181
.0183
.05
.0530
.0547
.0566
.0584
.0600
.0615
.0630
.0643
.0657
.0669
.0681
.0693
.0704
.0715
.0726
.0736
.0747
.0756
.0766
.0775
.0785
.0794
.0803
.0811
.0820
.0828
.0836
.0845
.0853
.0860
.0868
.0876
.0883
.0891
.0898
.0905
.0913
.0920
.0927
.0933
.10
.1111
.1143
.1180
.1215
.1247
.1277
.1306
.1333
.1360
.1385
.1409
.1432
.1455
.1477
.1499
.1520
.1540
.1560
.1579
.1598
.1617
.1635
.1653
.1671
.1688
.1705
.1722
.1738
.1754
.1770
.1786
.1801
.1817
.1832
.1846
.1861
.1876
.1890
.1904
.1918
.15
.1747
.1793
.1848
.1898
.1946
.1991
.2034
.2075
.2114
.2152
.2188
.2224
.2258
.2291
.2323
.2355
.2386
.2416
.2445
.2474
.2502
.2530
.2557
.2584
.2610
.2636
.2661
.2686
.2710
.2735
.2758
.2782
.2805
.2828
.2851
.2873
.2895
.2917
.2938
.2960
Percentage
.20 .25
.2443
.2503
.2574
.2640
.2703
.2763
.2819
.2874
.2926
.2976
.3025
.3073
.3118
.3163
.3206
.3249
.3290
.3331
.3370
.3409
.3447
.3484
.3521
.3557
.3592
.3627
.3661
.3695
.3728
.3761
.3793
.3825
.3856
.3887
.3918
.3948
.3978
.4007
.4036
.4065
.3205
.3279
.3366
.3448
.3525
.3599
.3670
.3738
.3803
.3866
.3928
.3987
.4045
.4101
.4156
.4209
.4261
.4312
.4362
.4411
.4459
.4506
.4553
.4598
.4643
.4687
.4730
.4773
.4815
.4856
.4897
.4938
.4977
.5017
.5055
.5094
.5132
.5169
.5206
.5243
of Non-detects
.30 .35
.4043
.4130
.4233
.4330
.4422
.4510
.4595
.4676
.4755
.4831
.4904
.4976
.5046
.5114
.5180
.5245
.5308
.5370
.5430
.5490
.5548
.5605
.5662
.5717
.5771
.5825
.5878
.5930
.5981
.6031
.6081
.6130
.6179
.6227
.6274
.6321
.6367
.6413
.6458
.6502
.4967
.5066
.5184
.5296
.5403
.5506
.5604
.5699
.5791
.5880
.5967
.6051
.6133
.6213
.6291
.6367
.6441
.6515
.6586
.6656
.6725
.6793
.6860
.6925
.6990
.7053
.7115
.7177
.7238
.7298
.7357
.7415
.7472
.7529
.7585
.7641
.7696
.7750
.7804
.7857
.40
.5989
.6101
.6234
.6361
.6483
.6600
.6713
.6821
.6927
.7029
.7129
.7225
.7320
.7412
.7502
.7590
.7676
.7761
.7844
.7925
.8005
.8084
.8161
.8237
.8312
.8385
.8458
.8529
.8600
.8670
.8738
.8806
.8873
.8939
.9005
.9069
.9133
.9196
.9259
.9321
.45
.7128
.7252
.7400
.7542
.7678
.7810
.7937
.8060
.8179
.8295
.8408
.8517
.8625
.8729
.8832
.8932
.9031
.9127
.9222
.9314
.9406
.9496
.9584
.9671
.9756
.9841
.9924
1.0006
1.0087
1.0166
1.0245
1.0323
1.0400
1.0476
1.0551
1.0625
1.0698
1.0771
1.0842
1.0913
.50
.8402
.8540
.8703
.8860
.9012
.9158
.9300
.9437
.9570
.9700
.9826
.9950
1.0070
1.0188
1.0303
1.0416
1.0527
1.0636
1.0743
1.0847
1 '
1
I.io2
1.1250
1.1347
1.1443
1.1537
1.1629
1.1721
1.1812
1.1901
1.1989
1.2076
1.2162
1.2248
1.2332
1.2415
1.2497
1.2579
1.2660
TABLE A-5. (CONTINUED)
VALUES OF LAMBDA FOR COHEN'S METHOD
γ
2.00
2.05
2.10
2.15
2.20
2.25
2.30
2.35
2.40
2.45
2.50
2.55
2.60
2.65
2.70
2.75
2.80
2.85
2.90
2.95
3.00
3.05
3.10
3.15
3.20
3.25
3.30
3.35
3.40
3.45
3.50
3.55
3.60
3.65
3.70
3.75
3.80
3.85
3.90
3.95
.01
.0184
.0186
.0187
.0188
.0189
.0191
.0192
.0193
.0194
.0196
.0197
.0198
.0199
.0201
.0202
.0203
.0204
.0205
.0206
.0207
.0209
.0210
.0211
.0212
.0213
.0214
.0215
.0216
.0217
.0218
.0219
.0220
.0221
.0222
.0223
.0224
.0225
.0226
.0227
.0228
.05
.0940
.0947
.0954
.0960
.0967
.0973
.0980
.0986
.0992
.0998
.1005
.1011
.1017
.1023
.1029
.1035
.1040
.1046
.1052
.1058
.1063
.1069
.1074
.1080
.1085
.1091
.1096
.1102
.1107
.1112
.1118
.1123
.1128
.1133
.1138
.1143
.1148
.1153
.1158
.1163
.10
.1932
.1945
.1959
.1972
.1986
.1999
.2012
.2025
.2037
.2050
.2062
.2075
.2087
.2099
.2111
.2123
.2135
.2147
.2158
.2170
.2182
.2193
.2204
.2216
.2227
.2238
.2249
.2260
.2270
.2281
.2292
.2303
.2313
.2324
.2334
.2344
.2355
.2365
.2375
.2385
.15
.2981
.3001
.3022
.3042
.3062
.3082
.3102
.3122
.3141
.3160
.3179
.3198
.3217
.3236
.3254
.3272
.3290
.3308
.3326
.3344
.3361
.3378
.3396
.3413
.3430
.3447
.3464
.3480
.3497
.3513
.3529
.3546
.3562
.3578
.3594
.3609
.3625
.3641
.3656
.3672
Percentage
.20 .25
.4093
.4122
.4149
.4177
.4204
.4231
.4258
.4285
.4311
.4337
.4363
.4388
.4414
.4439
.4464
.4489
.4513
.4537
.4562
.4585
.4609
.4633
.4656
.4679
.4703
.4725
.4748
.4771
.4793
.4816
.4838
.4860
.4882
.4903
.4925
.4946
.4968
.4989
.5010
.5031
.5279
.5315
.5350
.5385
.5420
.5454
.5488
.5522
.5555
.5588
.5621
.5654
.5686
.5718
.5750
.5781
.5812
.5843
.5874
.5905
.5935
.5965
.5995
.6024
.6054
.6083
.6112
.6141
.6169
.6197
.6226
.6254
.6282
.6309
.6337
.6364
.6391
.6418
.6445
.6472
of Non-
.30
.6547
.6590
.6634
.6676
.6719
.6761
.6802
.6844
.6884
.6925
.6965
.7005
.7044
.7083
.7122
.7161
.7199
.7237
.7274
.7311
.7348
.7385
.7422
.7458
.7494
.7529
.7565
.7600
.7635
.7670
.7704
.7739
.7773
.7807
.7840
.7874
.7907
.7940
.7973
.8006
detects
.35
.7909
.7961
.8013
.8063
.8114
.8164
.8213
.8262
.8311
.8359
.8407
.8454
.8501
.8548
.8594
.8639
.8685
.8730
.8775
.8819
.8863
.8907
.8950
.8993
.9036
.9079
.9121
.9163
.9205
.9246
.9287
.9328
.9369
.9409
.9449
.9489
.9529
.9568
.9607
.9646
.40
.9382
.9442
.9502
.9562
.9620
.9679
.9736
.9794
.9850
.9906
.9962
1.0017
1.0072
1.0126
1.0180
1.0234
1.0287
1.0339
1.0392
1.0443
1.0495
1.0546
1.0597
1.0647
1.0697
1.0747
1.0796
1.0845
1.0894
1.0942
1.0990
1.1038
1.1086
1.1133
1.1180
1.1226
1.1273
1.1319
1.1364
1.1410
.45
1.0984
1.1053
1.1122
1.1190
1.1258
1.1325
1.1391
1.1457
1.1522
1.1587
1.1651
1.1714
1.1777
1.1840
1.1902
1.1963
1.2024
1.2085
1.2145
1.2205
1.2264
1.2323
1.2381
1.2439
1.2497
1.2554
1.2611
1.2668
1.2724
1.2779
1.2835
1.2890
1.2945
1.2999
1.3053
1.3107
1.3160
1.3213
1.3266
1.3318
.50
1.2739
1.2819
1.2897
1.2974
1.3051
1.3127
1.3203
1.3278
1.3352
1.3425
1.3498
1.3571
1.3642
1.3714
1.3784
1.3854
1.3924
1.3993
1.4061
1.4129
1.4197
1.4264
1.4330
1.4396
1.4462
1.4527
1.4592
1.4657
1.4720
1.4784
1.4847
1.4910
1.4972
1.5034
1.5096
1.5157
1.5218
1.5279
1.5339
1.5399
TABLE A-5. (CONTINUED)
VALUES OF LAMBDA FOR COHEN'S METHOD
γ
4.00
4.05
4.10
4.15
4.20
4.25
4.30
4.35
4.40
4.45
4.50
4.55
4.60
4.65
4.70
4.75
4.80
4.85
4.90
4.95
5.00
5.05
5.10
5.15
5.20
5.25
5.30
5.35
5.40
5.45
5.50
5.55
5.60
5.65
5.70
5.75
5.80
5.85
5.90
5.95
6.00
.01
.0229
.0230
.0231
.0232
.0233
.0234
.0235
.0236
.0237
.0238
.0239
.0240
.0241
.0241
.0242
.0243
.0244
.0245
.0246
.0247
.0248
.0249
.0249
.0250
.0251
.0252
.0253
.0254
.0255
.0255
.0256
.0257
.0258
.0259
.0260
.0260
.0261
.0262
.0263
.0264
.0264
.05
.1168
.1173
.1178
.1183
.1188
.1193
.1197
.1202
.1207
.1212
.1216
.1221
.1225
.1230
.1235
.1239
.1244
.1248
.1253
.1257
.1262
.1266
.1270
.1275
.1279
.1284
.1288
.1292
.1296
.1301
.1305
.1309
.1313
.1318
.1322
.1326
.1330
.1334
.1338
.1342
.1346
.10
.2395
.2405
.2415
.2425
.2435
.2444
.2454
.2464
.2473
.2483
.2492
.2502
.2511
.2521
.2530
.2539
.2548
.2558
.2567
.2576
.2585
.2594
.2603
.2612
.2621
.2629
.2638
.2647
.2656
.2664
.2673
.2682
.2690
.2699
.2707
.2716
.2724
.2732
.2741
.2749
.2757
.15
.3687
.3702
.3717
.3732
.3747
.3762
.3777
.3792
.3806
.3821
.3836
.3850
.3864
.3879
.3893
.3907
.3921
.3935
.3949
.3963
.3977
.3990
.4004
.4018
.4031
.4045
.4058
.4071
.4085
.4098
.4111
.4124
.4137
.4150
.4163
.4176
.4189
.4202
.4215
.4227
.4240
Percentage
.20 .25
.5052
.5072
.5093
.5113
.5134
.5154
.5174
.5194
.5214
.5234
.5253
.5273
.5292
.5312
.5331
.5350
.5370
.5389
.5407
.5426
.5445
.5464
.5482
.5501
.5519
.5537
.5556
.5574
.5592
.5610
.5628
.5646
.5663
.5681
.5699
.5716
.5734
.5751
.5769
.5786
.5803
.6498
.6525
.6551
.6577
.6603
.6629
.6654
.6680
.6705
.6730
.6755
.6780
.6805
.6830
.6855
.6879
.6903
.6928
.6952
.6976
.7000
.7024
.7047
.7071
.7094
.7118
.7141
.7164
.7187
.7210
.7233
.7256
.7278
.7301
.7323
.7346
.7368
.7390
.7412
.7434
.7456
of Non-detects
.30 .35
.8038
.8070
.8102
.8134
.8166
.8198
.8229
.8260
.8291
.8322
.8353
.8384
.8414
.8445
.8475
.8505
.8535
.8564
.8594
.8623
.8653
.8682
.8711
.8740
.8768
.8797
.8825
.8854
.8882
.8910
.8938
.8966
.8994
.9022
.9049
.9077
.9104
.9131
.9158
.9185
.9212
.9685
.9723
.9762
.9800
.9837
.9875
.9913
.9950
.9987
1.0024
1.0060
1.0097
1.0133
1.0169
1.0205
1.0241
1.0277
1.0312
1.0348
1.0383
1.0418
1.0452
1.0487
1.0521
1.0556
1.0590
1.0624
1.0658
1.0691
1.0725
1.0758
1.0792
1.0825
1.0858
1.0891
1.0924
1.0956
1.0989
1.1021
1.1053
1.1085
.40
1.1455
1.1500
1.1545
1.1590
1.1634
1.1678
1.1722
1.1765
1.1809
1.1852
1.1895
1.1937
1.1980
1.2022
1.2064
1.2106
1.2148
1.2189
1.2230
1.2272
1.2312
1.2353
1.2394
1.2434
1.2474
1.2514
1.2554
1.2594
1.2633
1.2672
1.2711
1.2750
1.2789
1.2828
1.2866
1.2905
1.2943
1.2981
1.3019
1.3057
1.3094
.45
1.3371
1.3423
1.3474
1.3526
1.3577
1.3627
1.3678
1.3728
1.3778
1.3828
1.3878
1.3927
1.3976
1.4024
1.4073
1.4121
1.4169
1.4217
1.4265
1.4312
1.4359
1.4406
1.4453
1.4500
1.4546
1.4592
1.4638
1.4684
1.4729
1.4775
1.4820
1.4865
1.4910
1.4954
1.4999
1.5043
1.5087
1.5131
1.5175
1.5218
1.5262
.50
1.5458
1.5518
1.5577
1.5635
1.5693
1.5751
1.5809
1.5866
1.5924
1.5980
1.6037
1.6093
1.6149
1.6205
1.6260
1.6315
1.6370
1.6425
1.6479
1.6r
1.
l.bu-rl
1.6694
1.6747
1.6800
1.6853
1.6905
1.6958
1.7010
1.7061
1.7113
1.7164
1.7215
1.7266
1.7317
1.7368
1.7418
1.7468
1.7518
1.7568
1.7617
TABLE A-6.
MINIMUM COVERAGE (BETA) OF 95% CONFIDENCE
"NON-PARAMETRIC UPPER TOLERANCE LIMITS
N
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
P(maximum)
5.0
22.4
36.8
47.3
54.9
60.7
65.2
68.8
71.7
74.1
76.2
77.9
79.4
80.7
81.9
82.9
83.8
84.7
85.4
86.1
86.7
87.3
87.8
88.3
88.7
89.1
89.5
89.9
90.2
90.5
90.8
91.1
91.3
91.6
91.8
92.0
92.2
92.4
92.6
92.8
p(2nd largest)
....
2.6
13.6
24.8
34.2
41.8
48.0
53.0
57.0
60.6
63.6
66.2
68.4
70.4
72.0
73.6
75.0
76.2
77.4
78.4
79.4
80.2
81.0
81.8
82.4
83.0
83.6
84.2
84.6
85.2
85.6
86.0
86.4
86.8
87.2
87.4
87.8
88.2
88.4
88.6
TABLE A-6. (CONTINUED)
MINIMUM COVERAGE (BETA) OF 95% CONFIDENCE
NON-PARAMETRIC UPPER TOLERANCE LIMITS
N P(maximum) P(2nd largest)
41 93.0 89.0
42 93.1 89.2
43 93.3 89.4
44 93.4 89.6
45 93.6 89.8
46 93.7 90.0
47 93.8 90.2
48 93.9 90.4
49 94.1 90.6
50 94.2 90.8
55 94.7 91.6
60 95.1 92.4
65 95.5 93.0
70 95.8 93.4
75 96.1 93.8
80 96.3 94.2
85 96.5 94.6
90 96.7 94.8
95 96.9 95.0
100 97.0 95.4
TABLE A-7.
CONFIDENCE LEVELS FOR NON-PARAMETRIC
PREDICTION LIMITS FOR N=l(l)100
N
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
k = 1
50.0
66.7
75.0
80.0
83.3
85.7
87.5
88.9
90.0
90.9
91.7
92.3
92.9
93.3
93.8
94.1
94.4
94.7
95.0
95.2
95.5
95.7
95.8
96.0
96.2
96.3
96.4
96.6
96.7
96.8
96.9
97.0
97.1
97.1
97.2
97.3
97.4
97.4
97.5
97.6
k = 2
33.3
50.0
60.0
66.7
71.4
75.0
77.8
80.0
81.8
83.3
84.6
85.7
86.7
87.5
88.2
88.9
89.5
90.0
90.5
90.9
91.3
91.7
92.0
92.3
92.6
92.9
93.1
93.3
93.5
93.8
93.9
94.1
94.3
94.4
94.6
94.7
94.9
95.0
95.1
95.2
NUMBER
k=3
25.0
40.0
50.0
57.1
62.5
66.7
70.0
72.7
75.0
76.9
78.6
80.0
81.3
82.4
83.3
84.2
85.0
85.7
86.4
87.0
87.5
88.0
88.5
88.9
89.3
89.7
90.0
90.3
90.6
90.9
91.2
91.4
91.7
91.9
92.1
92.3
92.5
92.7
92.9
93.0
OF FUTURE
k=4 k=5
20.0
33.3
42.9
50.0
55.6
60.0
63.6
66.7
69.2
71.4
73.3
75.0
76.5
77.8
78.9
80.0
81.0
81.8
82.6
83.3
84.0
84.6
85.2
85.7
86.2
86.7
87.1
87.5
87.9
88.2
88.6
88.9
89.2
89.5
89.7
90.0
90.2
90.5
90.7
90.9
16.7
28.6
37.5
44.4
50.0
54.5
58.3
61.5
64.3
66.7
68.8
70.6
72.2
73.7
75.0
76.2
77.3
78.3
79.2
80.0
80.8
81.5
82.1
82.8
83.3
83.9
84.4
84.8
85.3
85.7
86.1
86.5
86.8
87.2
87.5
87.8
88.1
88.4
88.6
88.9
SAMPLES
k = 6
14.3
25.0
33.3
40.0
45.5
50.0
53.8
57.1
60.0
62.5
64.7
66.7
68.4
70.0
71.4
72.7
73.9
75.0
76.0
76.9
77.8
78.6
79.3
80.0
80.6
81.3
81.8
82.4
82.9
83.3
83.8
84.2
84.6
85.0
85.4
85.7
86.0
86.4
86.7
87.0
k = 7
12.5
22.2
30.0
36.4
41.7
46.2
50.0
53.3
56.3
58.8
61.1
63.2
65.0
66.7
68.2
69.6
70.8
72.0
73.1
74.1
75.0
75.9
76.7
77.4
78.1
78.8
79.4
80.0
80.6
81.1
81.6
82.1
82.5
82.9
83.3
83.7
84.1
84.4
84.8
85.1
k = 8
11.1
20.0
27.3
33.3
38.5
42.9
46.7
50.0
52.9
55.6
57.9
60.0
61.9
63.6
65.2
66.7
68.0
69.2
70.4
71.4
72.4
73.3
74.2
75.0
75.8
76.5
77.1
77.8
78.4
78.9
79.5
80.0
80.5
81.0
81.4
81.8
82.2
82.6
83.0
83.3
TABLE A-7. (CONTINUED)
CONFIDENCE LEVELS FOR NON-PARAMETRIC
PREDICTION LIMITS FOR N=l(l)100
N
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
k = 1
97.6
97.7
97.7
97.8
97.8
97.9
97.9
98.0
98.0
98.0
98.1
98.1
98.1
98.2
98.2
98.2
98.3
98.3
98.3
98.4
98.4
98.4
98.4
98.5
98.5
98.5
98.5
98.6
98.6
98.6
98.6
98.6
98.6
98.7
98.7
98.7
98.7
98.7
98.8
98.8
k = 2
95.3
95.5
95.6
95.7
95.7
95.8
95.9
96.0
96.1
96.2
96.2
96.3
96.4
96.4
96.5
96.6
96.6
96.7
96.7
96.8
96.8
96.9
96.9
97.0
97.0
97.1
97.1
97.1
97.2
97.2
97.3
97.3
97.3
97.4
97.4
97.4
97.5
97.5
97.5
97.6
NUMBER
k = 3
93.2
93.3
93.5
93.6
93.8
93.9
94.0
94.1
94.2
94.3
94.4
94.5
94.6
94.7
94.8
94.9
95.0
95.1
95.2
95.2
95.3
95.4
95.5
95.5
95.6
95.7
95.7
95.8
95.8
95.9
95.9
96.0
96.1
96.1
96.2
96.2
96.3
96.3
96.3
96.4
OF
k = 4
91.1
91.3
91.5
91.7
91.8
92.0
92.2
92.3
92.5
92.6
92.7
92.9
93.0
93.1
93.2
93.3
93.4
93.5
93.7
93.8
93.8
93.9
94.0
94.1
94.2
94.3
94.4
94.4
94.5
94.6
94.7
94.7
94.8
94.9
94.9
95.0
95.1
95.1
95.2
95.2
FUTURE
k = 5
89.1
89.4
89.6
89.8
90.0
90.2
90.4
90.6
90.7
90.9
91.1
91.2
91.4
91.5
91.7
91.8
91.9
92.1
92.2
92.3
92.4
92.5
92.6
92.8
92.9
93.0
93.1
93.2
93.2
93.3
93.4
93.5
93.6
93.7
93.8
93.8
93.9
94.0
94.0
94.1
SAMPLES
k = 6
87.2
87.5
87.8
88.0
88.2
88.5
88.7
88.9
89.1
89.3
89.5
89.7
89.8
90.0
90.2
90.3
90.5
90.6
90.8
90.9
91.0
91.2
91.3
91.4
91.5
91.7
91.8
91.9
92.0
92.1
92.2
92.3
92.4
92.5
92.6
92.7
92.8
92.9
92.9
93.0
k = 7
85.4
85.7
86.0
86.3
86.5
86.8
87.0
87.3
87.5
87.7
87.9
88.1
88.3
88.5
88.7
88.9
89.1
89.2
89.4
89.6
89.7
89.9
90.0
90.1
90.3
90.4
90.5
90.7
90.8
90.9
91.0
91.1
91.3
91.4
91.5
91.6
91.7
91.8
91.9
92.0
k = 8
83.7
84.0
84.3
84.6
84.9
85.2
85.5
85.7
86.0
86.2
86.4
86.7
86.9
87.1
87.3
87.5
87.7
87.9
88.1
88.2
88.4
88.6
88.7
88.9
89.0
89.2
89.3
89.5
89.6
89.7
89.9
90.0
90.1
90.2
90.4
90.5
90.6
90.7
90.8
90.9
TABLE A-7. (CONTINUED)
CONFIDENCE LEVELS FOR NON-PARAMETRIC
PREDICTION LIMITS FOR N=l(l)100
N
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
k = 1
98.8
98.8
98.8
98.8
98.8
98.9
98.9
98.9
98.9
98.9
98.9
98.9
98.9
98.9
99.0
99.0
99.0
99.0
99.0
99.0
k = 2
97.6
97.6
97.6
97.7
97.7
97.7
97.8
97.8
97.8
97.8
97.8
97.9
97.9
97.9
97.9
98.0
98.0
98.0
98.0
98.0
NUMBER
k = 3
96.4
96.5
96.5
96.6
96.6
96.6
96.7
96.7
96.7
96.8
96.8
96.8
96.9
96.9
96.9
97.0
97.0
97.0
97.1
97.1
OF
k=4
95.3
95.3
95.4
95.5
95.5
95.6
95.6
95.7
95.7
95.7
95.8
95.8
95.9
95.9
96.0
96.0
96.0
96.1
96.1
96.2
FUTURE
k = 5
94.2
94.3
94.3
94.4
94.4
94.5
94.6
94.6
94.7
94.7
94.8
94.8
94.9
94.9
95.0
95.0
95.1
95.1
95.2
95.2
SAMPLES
k = 6
93.1
93.2
93.3
93.3
93.4
93.5
93.5
93.6
93.7
93.8
93.8
93.9
93.9
94.0
94.1
94.1
94.2
94.2
94.3
94.3
k = 7
92.0
92.1
92.2
92.3
92.4
92.5
92.6
92.6
92.7
92.8
92.9
92.9
93.0
93.1
93.1
93.2
93.3
93.3
93.4
93.5
k=8
91.0
91.1
91.2
91.3
91.4
91.5
91.6
91.7
91.8
91.8
91.9
92.0
92.1
92.2
92.2
92.3
92.4
92.5
92.5
92.6
I. CONSTRUCTION OF POWER CURVES
To construct power curves for each of the parametric and non-parametric retesting strategies,
random standard Normal deviates were generated on an IBM mainframe computer using SAS. The
background level mean concentration was set to zero, while the alternative mean concentration level
was incremented in steps of Δ=0.5 standardized units above the background level. At each increment,
5000 iterations of the retesting strategy were simulated; the proportion of iterations indicating
contamination at any one of the wells in the downgradient monitoring network was designated as the
effective power of the retesting strategy (for that Δ and configuration of background samples and
monitoring wells).
Power values for the EPA Reference Power Curves were not simulated, but represent analytical
calculations based on the non-central t-distribution with non-centrality parameter Δ. SAS programs for
simulating the effective power of any of the parametric or non-parametric retesting strategies are
presented below.
//*********************************************************************
//* DESCRIPTION: *** PARAMETRIC SIMULATIONS ***
//*
//* This program produces power curves for 35 different curve
//* simulations (refer to the %LET statements below). Delta ranges
//* from 0 to 5 by 0.5. The variable list is as follows for the
//* input parameters:
//*
//* BG = Background
//* WL = Well
//* TL = Tolerance Limit
//* PL = Prediction Limit
//*********************************************************************
// EXEC SAS
// OUTSAS DD DSN=XXXXXXX.GWT03000.SJA3092.CURVES,
// DISP=OLD
// SYSIN DD *
OPTIONS LS=132 PS=57;
%LET ISTART=1;
%LET CURVENUM=35;
%LET RSEED=2020;
%LET REPEAT=5000;
%LET ITPRINT=1000;
%LET BG1 =24;  %LET WL1 =5;    %LET TL1 =0.95;  %LET PL1 =0.80;
%LET BG2 =24;  %LET WL2 =5;    %LET TL2 =0.95;  %LET PL2 =0.85;
%LET BG3 =8;   %LET WL3 =5;    %LET TL3 =0.95;  %LET PL3 =0.80;
%LET BG4 =8;   %LET WL4 =5;    %LET TL4 =0.95;  %LET PL4 =0.85;
%LET BG5 =24;  %LET WL5 =20;   %LET TL5 =0.95;  %LET PL5 =0.95;
%LET BG6 =24;  %LET WL6 =20;   %LET TL6 =0.95;  %LET PL6 =0.97;
%LET BG7 =8;   %LET WL7 =20;   %LET TL7 =0.95;  %LET PL7 =0.95;
%LET BG8 =8;   %LET WL8 =20;   %LET TL8 =0.95;  %LET PL8 =0.97;
%LET BG9 =24;  %LET WL9 =50;   %LET TL9 =0.95;  %LET PL9 =0.98;
%LET BG10=24;  %LET WL10=50;   %LET TL10=0.95;  %LET PL10=0.99;
%LET BG11=24;  %LET WL11=50;   %LET TL11=0.99;  %LET PL11=0.90;
%LET BG12=24;  %LET WL12=50;   %LET TL12=0.99;  %LET PL12=0.93;
%LET BG13=24;  %LET WL13=50;   %LET TL13=0.99;  %LET PL13=0.94;
%LET BG14=24;  %LET WL14=50;   %LET TL14=0.98;  %LET PL14=0.95;
%LET BG15=24;  %LET WL15=50;   %LET TL15=0.98;  %LET PL15=0.97;
%LET BG16=24;  %LET WL16=100;  %LET TL16=0.98;  %LET PL16=0.97;
%LET BG17=24;  %LET WL17=100;  %LET TL17=0.98;  %LET PL17=0.99;
%LET BG18=24;  %LET WL18=100;  %LET TL18=0.99;  %LET PL18=0.95;
%LET BG19=24;  %LET WL19=100;  %LET TL19=0.99;  %LET PL19=0.97;
%LET BG20=24;  %LET WL20=100;  %LET TL20=0.99;  %LET PL20=0.98;
%LET BG21=8;   %LET WL21=20;   %LET TL21=0.95;  %LET PL21=0.98;
%LET BG22=8;   %LET WL22=5;    %LET TL22=0.95;  %LET PL22=0.90;
%LET BG23=16;  %LET WL23=5;    %LET TL23=0.95;  %LET PL23=0.85;
%LET BG24=16;  %LET WL24=5;    %LET TL24=0.95;  %LET PL24=0.90;
%LET BG25=24;  %LET WL25=5;    %LET TL25=0.95;  %LET PL25=0.90;
%LET BG26=16;  %LET WL26=20;   %LET TL26=0.95;  %LET PL26=0.95;
%LET BG27=16;  %LET WL27=20;   %LET TL27=0.95;  %LET PL27=0.97;
%LET BG28=16;  %LET WL28=50;   %LET TL28=0.98;  %LET PL28=0.95;
%LET BG29=16;  %LET WL29=50;   %LET TL29=0.98;  %LET PL29=0.97;
%LET BG30=16;  %LET WL30=50;   %LET TL30=0.99;  %LET PL30=0.90;
%LET BG31=16;  %LET WL31=50;   %LET TL31=0.99;  %LET PL31=0.92;
%LET BG32=24;  %LET WL32=100;  %LET TL32=0.98;  %LET PL32=0.98;
%LET BG33=16;  %LET WL33=100;  %LET TL33=0.98;  %LET PL33=0.98;
%LET BG34=16;  %LET WL34=100;  %LET TL34=0.99;  %LET PL34=0.95;
%LET BG35=16;  %LET WL35=100;  %LET TL35=0.99;  %LET PL35=0.96;
%MACRO PARSIM;
DATA ITERATE;
*** Set changing simulation variables to common variable names;
BG=&&BG&I;
WL=&&WL&I;
TL=&&TL&I;
PL=&&PL&I;
DO DELTA=0 TO 5 BY 0.5;
*** Initialize TP0, TP1 & TP2 to 0 before entering simulation;
TP0=0;
TP1=0;
TP2=0;
DO J=1 TO &REPEAT;
*** Initialize CNT0, CNT1 & CNT2 to 0;
CNT0=0;
CNT1=0;
CNT2=0;
XB=RANNOR(&RSEED)/SQRT(BG);
SB=SQRT(2*RANGAM(&RSEED,(BG-1)/2)/(BG-1));
PL2=XB+SB*SQRT(1+1/BG)*TINV((1-(1-PL)/2),(BG-1));
PL1=XB+SB*SQRT(1+1/BG)*TINV((1-(1-PL)),(BG-1));
PL0=XB+SB*SQRT(1+1/BG)*TINV((1-(1-TL)),(BG-1));
TLIM=XB+SB*SQRT(1+1/BG)*TINV((1-(1-TL)),(BG-1));
DO K=1 TO WL;
*** assumed: only the last well receives the mean shift DELTA;
IF K<WL THEN DO;
X1=RANNOR(&RSEED);
X2=RANNOR(&RSEED);
X3=RANNOR(&RSEED);
END;
ELSE DO;
X1=RANNOR(&RSEED)+DELTA;
X2=RANNOR(&RSEED)+DELTA;
X3=RANNOR(&RSEED)+DELTA;
END;
IF X1>TLIM THEN DO;
CNT0=CNT0+1;
IF X2>PL1 THEN CNT1=CNT1+1;
IF X2>PL2 OR X3>PL2 THEN CNT2=CNT2+1;
END;
END;
IF CNT0>0 THEN TP0=TP0+100/&REPEAT;
IF CNT1>0 THEN TP1=TP1+100/&REPEAT;
IF CNT2>0 THEN TP2=TP2+100/&REPEAT;
*** Print iteration information every &ITPRINT iterations;
I=&I;
IF MOD(J,&ITPRINT)=0 THEN
PUT '>>> CURVE ' I ', ITERATION ' J ': ' BG= WL= TL= PL= DELTA= TP0= TP1= TP2= ' <<<';
END;
OUTPUT;
END;
RUN;
DATA OUTSAS.PCURVE&I; SET ITERATE(KEEP=BG WL TL PL TP0 TP1 TP2 DELTA);
RUN;
PROC PRINT DATA=OUTSAS.PCURVE&I;
FORMAT TP0 TP1 TP2 8.4;
TITLE1"TEST PRINT OF PARAMETRIC SIMULATION PCURVE&I";
TITLE2"NUMBER OF ITERATIONS = &REPEAT";
RUN;
%MEND PARSIM;
%MACRO CURVE;
%DO I=&ISTART %TO &CURVENUM;
%PARSIM
%END;
%MEND CURVE;
%CURVE
//*********************************************************************
//* DESCRIPTION: *** NON-PARAMETRIC SIMULATION ***
//*
//* This program produces power curves for 15 different curve
//* simulations (refer to the %LET statements below). Delta ranges
//* from 0 to 5 by 0.5. The variable list is as follows for the
//* input parameters:
//*
//* BG = Background
//* WL = Well
//*
//*********************************************************************
// EXEC SAS
// OUTSAS DD DSN=XXXXXXX.GWT03000.SJA3092.CURVES,DISP=OLD
// SYSIN DD *
OPTIONS LS=132 PS=57;
%LET ISTART=1;
%LET CURVENUM=15;
%LET RSEED=3030;
%LET REPEAT=5000;
%LET ITPRINT=1000;
%LET BG1 =8;   %LET WL1 =5;
%LET BG2 =16;  %LET WL2 =5;
%LET BG3 =24;  %LET WL3 =5;
%LET BG4 =8;   %LET WL4 =20;
%LET BG5 =16;  %LET WL5 =20;
%LET BG6 =24;  %LET WL6 =20;
%LET BG7 =8;   %LET WL7 =50;
%LET BG8 =16;  %LET WL8 =50;
%LET BG9 =24;  %LET WL9 =50;
%LET BG10=8;   %LET WL10=100;
%LET BG11=16;  %LET WL11=100;
%LET BG12=24;  %LET WL12=100;
%LET BG13=32;  %LET WL13=100;
%LET BG14=32;  %LET WL14=20;
%LET BG15=32;  %LET WL15=50;
%MACRO NPARSIM;
DATA ITERATE;
*** Set changing simulation variable to common variable names;
BG=&&BG&I;
WL=&&WL&I;
DO DELTA=0 TO 5 BY 0.5;
*** Initialize PLx variables to 0 before entering simulation;
PL0=0;
PL1=0;
PL2A=0;
PL2B=0;
PL3A=0;
PL3B=0;
DO J=1 TO &REPEAT;
*** Initialize CNTx variables to 0;
CNT0=0;
CNT1=0;
CNT2=0;
CNT3=0;
CNT4=0;
CNT5=0;
DO K=1 TO BG;
TEST=RANNOR(&RSEED);
IF K=1 THEN MAX=TEST;
ELSE IF TEST>MAX THEN MAX=TEST;
END;
DO L=1 TO WL;
*** assumed: only the last well receives the mean shift DELTA;
IF L<WL THEN DO;
X1=RANNOR(&RSEED);
X2=RANNOR(&RSEED);
X3=RANNOR(&RSEED);
X4=RANNOR(&RSEED);
END;
ELSE DO;
X1=RANNOR(&RSEED)+DELTA;
X2=RANNOR(&RSEED)+DELTA;
X3=RANNOR(&RSEED)+DELTA;
X4=RANNOR(&RSEED)+DELTA;
END;
IF X1>MAX THEN DO;
CNT0=CNT0+1;
IF X2>MAX THEN CNT1=CNT1+1;
IF X2>MAX & X3>MAX THEN CNT2=CNT2+1;
IF X2>MAX OR X3>MAX THEN CNT3=CNT3+1;
IF X2>MAX & X3>MAX & X4>MAX THEN CNT4=CNT4+1;
IF X2>MAX OR X3>MAX OR X4>MAX THEN CNT5=CNT5+1;
END;
END;
IF CNT0>0 THEN PL0=PL0+100/&REPEAT;
IF CNT1>0 THEN PL1=PL1+100/&REPEAT;
IF CNT2>0 THEN PL2A=PL2A+100/&REPEAT;
IF CNT3>0 THEN PL2B=PL2B+100/&REPEAT;
IF CNT4>0 THEN PL3A=PL3A+100/&REPEAT;
IF CNT5>0 THEN PL3B=PL3B+100/&REPEAT;
*** Print iteration information every &ITPRINT iterations;
I=&I;
IF MOD(J,&ITPRINT)=0 THEN
PUT '>>> CURVE ' I ', ITERATION ' J ': ' BG= WL= DELTA= PL0= PL1= PL2A= PL2B= PL3A= PL3B= ' <<<';
END;
OUTPUT;
END;
RUN;
DATA OUTSAS.NCURVE&I; SET ITERATE(KEEP=BG WL PL0 PL1 PL2A PL2B PL3A PL3B DELTA);
RUN;
PROC PRINT DATA=OUTSAS.NCURVE&I;
FORMAT PL0 PL1 PL2A PL2B PL3A PL3B 8.4;
TITLE1"TEST PRINT OF NON-PARAMETRIC SIMULATION NCURVE&I";
TITLE2"NUMBER OF ITERATIONS = &REPEAT";
RUN;
%MEND NPARSIM;
%MACRO CURVE;
%DO I=&ISTART %TO &CURVENUM;
%NPARSIM
%END;
%MEND CURVE;
%CURVE
EPA REFERENCE POWER CURVES
[Figure: EPA Reference Power Curves for 8, 16, and 24 background samples; effective power (%)
plotted against Δ (units above background).]
II. PARAMETRIC RETESTING STRATEGIES
[The figures on original pages B-7 through B-15 show simulated power curves for the parametric retesting strategies. Each figure plots power (%) against Δ (units above background) and compares the EPA reference power curve with the simulated curves for zero, one, and two resamples. The figures are:]

POWER CURVE FOR 95% TOLERANCE AND 90% PREDICTION LIMIT (8 Background Samples; 5 wells)
POWER CURVE FOR 95% TOLERANCE AND 90% PREDICTION LIMIT (16 Background Samples; 5 wells)
POWER CURVE FOR 95% TOLERANCE AND 85% PREDICTION LIMIT (16 Background Samples; 5 wells)
POWER CURVE FOR 95% TOLERANCE AND 85% PREDICTION LIMIT (24 Background Samples; 5 wells)
POWER CURVE FOR 95% TOLERANCE AND 90% PREDICTION LIMIT (24 Background Samples; 5 wells)
POWER CURVE FOR 95% TOLERANCE AND 98% PREDICTION LIMIT (8 Background Samples; 20 wells)
POWER CURVE FOR 95% TOLERANCE AND 97% PREDICTION LIMIT (16 Background Samples; 20 wells)
POWER CURVE FOR 95% TOLERANCE AND 97% PREDICTION LIMIT (24 Background Samples; 20 wells)
POWER CURVE FOR 98% TOLERANCE AND 97% PREDICTION LIMIT (16 Background Samples; 50 wells)
POWER CURVE FOR 99% TOLERANCE AND 92% PREDICTION LIMIT (16 Background Samples; 50 wells)
POWER CURVE FOR 98% TOLERANCE AND 95% PREDICTION LIMIT (24 Background Samples; 50 wells)
POWER CURVE FOR 99% TOLERANCE AND 90% PREDICTION LIMIT (24 Background Samples; 50 wells)
POWER CURVE FOR 98% TOLERANCE AND 97% PREDICTION LIMIT (24 Background Samples; 50 wells)
POWER CURVE FOR 95% TOLERANCE AND 98% PREDICTION LIMIT (24 Background Samples; 50 wells)
POWER CURVE FOR 98% TOLERANCE AND 98% PREDICTION LIMIT (16 Background Samples; 100 wells)
POWER CURVE FOR 99% TOLERANCE AND 95% PREDICTION LIMIT (24 Background Samples; 100 wells)
POWER CURVE FOR 98% TOLERANCE AND 98% PREDICTION LIMIT (24 Background Samples; 100 wells)
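The parametric strategies evaluated above are based on normal tolerance and prediction limits computed from the background mean and standard deviation. For orientation, the sketch below computes the simplest case, a one-sided 90% prediction limit for a single future observation, PL = XBAR + t(0.90, N-1) * S * SQRT(1 + 1/N); the limits behind the curves above are adjusted for multiple wells and resamples, so this is illustrative only. The data set name WELLDATA and the variable name CONC are assumptions.

*** Sketch: one-sided 90% prediction limit for one future observation;
*** WELLDATA and CONC are assumed names for the background data;
PROC MEANS DATA=WELLDATA NOPRINT;
VAR CONC;
OUTPUT OUT=STATS MEAN=XBAR STD=S N=N;
RUN;
DATA LIMIT;
SET STATS;
PL=XBAR+TINV(0.90,N-1)*S*SQRT(1+1/N);   * t quantile from the TINV function;
RUN;
PROC PRINT DATA=LIMIT;
VAR XBAR S N PL;
RUN;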
III. NON-PARAMETRIC RETESTING STRATEGIES
[The figures on original pages B-16 through B-38 show simulated power curves for the non-parametric retesting strategies. For each combination of background sample size and number of wells there are three plots: (1) the EPA reference curve with the zero-resample and one-resample curves; (2) the EPA reference curve with the two-resample (A) and two-resample (B) curves; and (3) the EPA reference curve with the three-resample (A) and three-resample (B) curves. Each plot shows power (%) against Δ (units above background). The combinations are:]

POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (8 Background Samples; 5 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (16 Background Samples; 5 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (24 Background Samples; 5 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (8 Background Samples; 20 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (16 Background Samples; 20 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (24 Background Samples; 20 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (32 Background Samples; 20 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (8 Background Samples; 50 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (16 Background Samples; 50 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (24 Background Samples; 50 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (32 Background Samples; 50 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (8 Background Samples; 100 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (16 Background Samples; 100 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (24 Background Samples; 100 wells)
POWER CURVE FOR NON-PARAMETRIC PREDICTION LIMITS (32 Background Samples; 100 wells)
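Once the OUTSAS.NCURVE&I data sets have been generated, curves like those described above can be reproduced directly from the simulation output. The sketch below overlays three of the stored power estimates against DELTA for curve 1 with a line-printer plot; it assumes OUTSAS.NCURVE1 exists, and the labels follow the apparent correspondence of PL0, PL1, and PL2A to the zero-, one-, and two-resample (A) strategies.

*** Sketch: overlay simulated power curves from NCURVE1;
PROC PLOT DATA=OUTSAS.NCURVE1;
PLOT PL0*DELTA='0' PL1*DELTA='1' PL2A*DELTA='A' / OVERLAY;
RUN;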