Guidance Document On The Statistical Analysis Of Groundwater Monitoring Data At RCRA Facilities Interim Final Guidance

-------

Guidance Document on the Statistical Analysis
           of Ground-Water Monitoring Data
                         at RCRA Facilities

-------
                                   PREFACE
     This  guidance  document  has   been  developed  primarily  for  evaluating
ground-water monitoring data at RCRA  (Resource  Conservation and Recovery Act)
facilities.  The  statistical methodologies described in this  document  can be
applied to  both  hazardous (Subtitle C of  RCRA) and municipal  (Subtitle D of
RCRA) waste land disposal  facilities.

     The recently  amended regulations concerning the  statistical  analysis of
ground-water monitoring   data  at  RCRA  facilities  (53 FR   39720,  October 11,
1988),  provide  a  wide  variety of  statistical  methods  that  may  be used to
evaluate ground-water  quality.   To the  experienced  and  inexperienced water
quality professional, the choice  of which test to  use  under a particular set
of conditions may not be  apparent.   The  reader  is referred  to Section 4 of
this guidance,  "Choosing  a Statistical Method," for assistance in  choosing an
appropriate statistical test.   For relatively  new  facilities that  have only
limited amounts  of ground-water monitoring data, it  is recommended  that a form
of hypothesis  test  (e.g.,  parametric analysis  of  variance)  be  employed to
evaluate the data.  Once  sufficient data  are  available (after 12 to 24 months
or eight background  samples),  another method  of analysis  such as  the control
chart methodology described in  Section 7  of the  guidance is recommended.  Each
method  of  analysis and the  conditions under  which they will  be  used  can be
written in  the  facility  permit.   This  will  eliminate  the need for  a  permit
modification  each  time  more   information  about  the  hydrogeochemistry  is
collected,  and more appropriate methods of data  analysis become apparent.

     This  guidance was  written  primarily for  the  statistical   analysis  of
ground-water monitoring  data  at  RCRA facilities.    The  guidance has  wider
applications however,   if one  examines   the  spatial  relationships  involved
between  the monitoring  wells   and  the  potential  contaminant source.    For
example, Section 5 of the guidance describes  background well  (upgradient)  vs.
compliance  well  (downgradient)  comparisons.   This scenario can be  applied to
other  non-RCRA  situations involving  the   same  spatial  relationships and  the
same null  hypothesis.  The explicit null  hypothesis  (H0)  for testing contrasts
between means,  or where appropriate between medians, is that the means between
groups  (here monitoring wells)  are equal  (i.e.,  no release  has been detected),
or that the group means are below a prescribed action  level (e.g.,  the ground-
water protection standard).  Statistical  methods that  can  be used  to evaluate
these  conditions   are  described  in  Section 5.2 (Analysis  of Variance),  5.3
(Tolerance Intervals), and 5.4  (Prediction Intervals).

     A  different  situation  exists  when  compliance  wells   (downgradient)  are
compared to a fixed standard (e.g., the ground-water protection standard).  In
that case,  Section 6 of the  guidance  should be  consulted.   The value to which
the  constituent  concentrations at  compliance wells are compared  can  be  any

-------
standard  established  by  a  Regional  Administrator,  State  or county  health
official, or another appropriate official.

     A note of  caution  applies  to Section 6.  The  examples  used  in Section 6
are used  to determine whether ground  water has been  contaminated  as a result
of a release  from  a facility.  When the lower confidence  limit lies entirely
above the  ACL  (alternate concentration  limit)  or MCL  (maximum concentration
limit),   further action  or  assessment may  be warranted.    If  one  wishes  to
determine whether a cleanup standard has been attained for a Superfund site or
a RCRA facility in corrective action, another EPA guidance document entitled,
"Statistical Methods for the Attainment of Superfund Cleanup Standards (Volume
2:  Ground Water--Draft)," should be consulted.  This draft Superfund
guidance  is a  multivolume set that addresses questions  regarding  the success
of air,  ground-water,  and  soil  remediation efforts.    Information  about  the
availability  of  this   draft guidance,  currently  being  developed,  can  be
obtained  by  calling the  RCRA/Superfund  Hotline,  telephone  (800)  424-9346  or
(202) 382-3000.

     Those  interested in  evaluating individual  uncontaminated  wells or in  an
intrawell comparison are referred to Section 7 of the guidance which describes
the use  of  Shewhart-CUSUM control  charts and  trend  analysis.  Municipal water
supply engineers, for example, who wish to  monitor water quality parameters in
supply wells, may find this section useful.

     Other  sections  of  this  guidance have  wide  applications  in the field  of
applied  statistics,  regardless  of the  intended  use or  purpose.   Sections 4.2
and  4.3  provide   information  on  checking  distributional  assumptions  and
equality  of variance,  while Sections 8.1   and  8.2 cover  limit of  detection
problems  and  outliers.    Helpful  advice and  references for  many  experiments
involving the use of statistics can be found in these sections.

     Finally,  it should be  noted  that  this guidance is  not  intended to be the
final chapter on the statistical analysis of ground-water monitoring data, nor
should  it be used  as such.  40 CFR Part 264 Subpart F  offers an  alternative
[§264.97(h)(5)]  to  the  methods   suggested  and  described  in this guidance
document.   In  fact, the guidance recommends a procedure  (confidence  intervals)
for comparing monitoring  data to a fixed standard that is not mentioned in the
Subpart  F  regulations.    This is  neither contradictory  nor  inconsistent  but
rather  epitomizes  the  complexities of the  subject  matter  and exemplifies the
need  for flexibility due to  the  site-specific monitoring  requirements of the
RCRA program.

-------
                                   CONTENTS

Preface	  iii
Figures	   vi
Tables	  vii
Executive Summary	  E-1

     1.   Introduction	  1-1
     2.   Regulatory Overview	  2-1
               2.1  Background	  2-1
               2.2  Overview of Methodology	  2-3
               2.3  General Performance Standards	  2-3
               2.4  Basic Statistical Methods and Sampling
                      Procedures	  2-6
     3.   Choosing a Sampling Interval	  3-1
               3.1  Example Calculations	  3-8
               3.2  Flow Through Karst and "Pseudo-Karst" Terranes	 3-11
     4.   Choosing a Statistical Method	  4-1
               4.1  Flowcharts—Overview and Use	  4-1
               4.2  Checking Distributional Assumptions	  4-4
               4.3  Checking Equality of Variance:  Bartlett's Test	 4-17
     5.   Background Well to Compliance Well Comparisons	  5-1
               5.1  Summary Flowchart for Background Well to
                      Compliance Well Comparisons	  5-2
               5.2  Analysis of Variance	  5-5
               5.3  Tolerance Intervals Based on the Normal
                      Distribution	 5-20
               5.4  Prediction Intervals	 5-24
     6.   Comparisons with MCLs or ACLs	  6-1
               6.1  Summary Chart for Comparison with MCLs or ACLs	  6-1
               6.2  Statistical Procedures	  6-1
     7.   Control Charts for Intra-Well Comparisons	  7-1
               7.1  Advantages of Plotting Data	  7-1
               7.2  Correcting for Seasonality	  7-2
               7.3  Combined Shewhart-CUSUM Control Charts for Each
                      Well and Constituent	  7-5
               7.4  Update of a Control Chart	 7-10
               7.5  Nondetects in a Control Chart	 7-12
     8.   Miscellaneous Topics	  8-1
               8.1  Limit of Detection	  8-1
               8.2  Outliers	 8-11
Appendices
     A.   General Statistical Considerations and Glossary of
            Statistical Terms	  A-1
     B.   Statistical Tables	  B-1
     C.   General Bibliography	  C-1
     D.   Federal Register,  40 CFR,  Part 264	  D-1

-------
                                    FIGURES
Number                                                                   Page
 3-1      Hydraulic conductivity of selected rocks	   3-3
 3-2      Range of values of hydraulic conductivity and permeability....   3-4
 3-3      Conversion factors for permeability and hydraulic
            conductivity units.	   3-4
 3-4      Total porosity and drainable porosity for typical
            geologic materials	   3-7
 3-5      Potentiometric surface map for computation of hydraulic
            gradient	   3-9
 4-1      Flowchart overview	   4-3
 4-2      Probability plot of raw chlordane concentrations	 4-11
 4-3      Probability plot of log-transformed chlordane concentrations.. 4-13
 5-1      Background well to compliance well comparisons	   5-3
 5-2      Tolerance limits:  alternate approach to background
            well to compliance well comparisons	   5-4
 6-1      Comparisons with MCLs/ACLs	   6-2
 7-1      Plot of unadjusted and seasonally adjusted monthly
            observations	   7-6
 7-2      Combined Shewhart-CUSUM chart	 7-11
                                       vi

-------
                                    TABLES
Number                                                                   Page
 2-1      Summary of Statistical  Methods	  2-7
 3-1      Default Values for Effective Porosity (Ne) for Use in Time
            of Travel (TOT)  Analyses	  3-5
 3-2      Specific Yield Values for Selected Rock Types	  3-6
 3-3      Determining a Sampling Interval	 3-11
 4-1      Example Data for Coefficient-of-Variation Test	  4-8
 4-2      Example Data Computations for Probability Plotting	 4-10
 4-3      Cell Boundaries for the Chi-Squared Test	 4-14
 4-4      Example Data for Chi-Squared Test	 4-15
 4-5      Example Data for Bartlett's Test	 4-19
 5-1      One-Way Parametric ANOVA Table	  5-8
 5-2      Example Data for One-Way Parametric Analysis of Variance	 5-11
 5-3      Example Computations in One-Way Parametric ANOVA Table	 5-12
 5-4      Example Data for One-Way Nonparametric ANOVA--Benzene
            Concentrations (ppm)	 5-18
 5-5      Example Data for Normal Tolerance Interval	 5-23
 5-6      Example Data for Prediction Interval—Chlordane Levels	 5-27
 6-1      Example Data for Normal Confidence Interval--Aldicarb
            Concentrations in Compliance Wells (ppb)	  6-4
 6-2      Example Data for Log-Normal Confidence Interval--EDB
            Concentrations in Compliance Wells (ppb)	  6-6
 6-3      Values of M and n+l-M and Confidence Coefficients for
            Small Samples	  6-9
 6-4      Example Data for Nonparametric Confidence Interval—T-29
            Concentrations (ppm)	 6-10
                                      vii

-------
                              TABLES  (continued)
Number                                                                   Page
 6-5      Example Data for a Tolerance  Interval Compared to  an ACL	6-13
 7-1      Example Computation for  Deseasonalizing  Data	   7-4
 7-2      Example Data for Combined Shewhart-CUSUM Chart—Carbon
            Tetrachloride Concentration (µg/L)	   7-9
 8-1      Methods for Below Detection Limit  Values	   8-2
 8-2      Example Data for a Test  of  Proportions	   8-6
 8-3      Example Data for Testing Cohen's Test	   8-9
 8-4      Example Data for Testing for  an Outlier	  8-13
                                      viii

-------
                                ACKNOWLEDGMENT
     This document  was developed  by EPA's  Office of  Solid  Waste under  the
direction of Dr. Vernon Myers, Chief of  the  Ground-Water  Section of the Waste
Management  Division.    The  document was  prepared by  the  joint  efforts  of
Dr. Vernon B.  Myers,  Mr. James R.  Brown  of the  Waste Management  Division,
Mr. James  Craig  of  the  Office  of  Policy  Planning  and  Information,  and
Mr. Barnes Johnson  of  the  Office  of Policy, Planning,  and  Evaluation.   Tech-
nical  support  in the  preparation  of  this  document  was  provided  by  Midwest
Research  Institute  (MRI)  under a  subcontract to  NUS Corporation, the  prime
contractor with EPA's Office of Solid Waste.   MRI  staff who assisted  with the
preparation  of the document  were   Jairus  D.  Flora,  Jr.,  Ph.D.,  Principal
Statistician,  Ms. Karin M.   Bauer,   Senior  Statistician,  and  Mr. Joseph S.
Bartling, Assistant Statistician.
                                      ix

-------
                               EXECUTIVE SUMMARY


     The  hazardous waste  regulations  under  the  Resource  Conservation  and
Recovery Act (RCRA) require owners and operators of hazardous waste facilities
to utilize  design  features and control  measures that prevent  the release of
hazardous waste into ground water.  Further, regulated units (i.e., all surface
impoundments, waste piles, land treatment units, and landfills that
receive hazardous  waste  after  July  26, 1982)  are also subject  to the ground-
water  monitoring   and  corrective action  standards  of  40 CFR  Part  264, Sub-
part F.  These regulations require that a statistical method and sampling pro-
cedure approved by EPA be used  to  determine  whether there  are releases from
regulated units into ground water.

     This document provides  guidance  to  RCRA facility  permit  applicants  and
writers concerning the statistical analysis of ground-water monitoring data at
RCRA facilities.   Section  1  is an introduction  to  the  guidance; it describes
the  purpose and  intent  of  the  document  and  emphasizes  the  need  for  site-
specific  considerations  in implementing  the Subpart F  regulations  of  40 CFR
Part 264.

     Section 2 provides  the  reader  with  an  overview of the recently promul-
gated  regulations  concerning the statistical  analysis  of  ground-water  moni-
toring  data  (53   FR   39720,  October  11,  1988).    The  requirements  of  the
regulation  are  reviewed, and  the need to  consider site-specific  factors  in
evaluating data at a hazardous waste facility  is emphasized.

     Section 3 discusses  the  important  hydrogeologic parameters  to consider
when choosing  a  sampling interval.    The Darcy equation is  used to determine
the horizontal component of the average linear velocity of ground water.   This
parameter provides a good estimate  of time of  travel  for most soluble con-
stituents in  ground  water and may  be used to determine a sampling interval.
In karst, cavernous  volcanics, and fractured  geologic  environments, alterna-
tive methods are needed  to determine  an  appropriate sampling interval.  Exam-
ple calculations are  provided  at  the  end  of  the section  to further assist the
reader.
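
     The velocity and time-of-travel calculation summarized above can be
sketched in a few lines of code.  This is only an illustration under stated
assumptions:  it uses the standard form of the Darcy relationship, v = (K * i)
/ n_e, where K is horizontal hydraulic conductivity, i is hydraulic gradient,
and n_e is effective porosity.  The function names and the numeric inputs are
hypothetical and are not taken from the example calculations in Section 3.

```python
def average_linear_velocity(k_h, gradient, effective_porosity):
    """Horizontal average linear velocity, v = (K * i) / n_e.

    k_h is the horizontal hydraulic conductivity (e.g., ft/day), gradient is
    the dimensionless hydraulic gradient, and effective_porosity is a
    fraction between 0 and 1. The result has the same units as k_h.
    """
    if not 0 < effective_porosity <= 1:
        raise ValueError("effective porosity must be in (0, 1]")
    return k_h * gradient / effective_porosity


def time_of_travel(distance, velocity):
    """Time for ground water to move `distance` at a constant `velocity`."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return distance / velocity


# Hypothetical inputs, chosen only to illustrate the arithmetic.
velocity_ft_per_day = average_linear_velocity(
    k_h=10.0, gradient=0.005, effective_porosity=0.25
)
travel_days = time_of_travel(distance=50.0, velocity=velocity_ft_per_day)
```

     With these assumed inputs the velocity is about 0.2 ft/day, so water
takes roughly 250 days to travel 50 ft.  A travel time of this kind is one
input to choosing a sampling interval; it does not apply to karst or fractured
settings, where the text notes that alternative methods are needed.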

     Section 4  provides  guidance  on  choosing  an  appropriate  statistical
method.   A  flow chart to  guide  the reader  through this  section,  as  well  as
procedures  to test  the  distributional  assumptions  of  data,   are presented.
Finally, this section outlines procedures to test specifically for equality of
variance.

     Section 5 covers statistical methods that may be used to evaluate ground-
water  monitoring   data when  background  wells have  been  sited hydraulically
upgradient  from  the  regulated unit,  and a  second  set  of  wells  are  sited


                                      E-1

-------
hydraulically downgradient  from  the regulated  unit at  the  point of  compli-
ance.   The data  from  these compliance  wells  are  compared  to data from the
background  wells  to  determine  whether  a  release   from   a  facility  has
occurred.  Parametric and nonparametric analysis of variance, tolerance inter-
vals, and prediction intervals are suggested  methods for this type of compari-
son.   Flow charts,  procedures,  and example calculations  are given for each
testing method.

     Section 6  includes  statistical   procedures   that   are  appropriate  when
comparing  ground-water  constituent  concentrations  to  fixed  concentration
limits  (e.g.,  alternate  concentration  limits  or  maximum concentration lim-
its).  The methods applicable to this type of comparison are confidence inter-
vals and  tolerance  intervals.   As in Section 5, flow  charts,  procedures, and
examples explain the calculations necessary for each testing method.
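
     As a rough illustration of the confidence-interval comparison described
above, the sketch below computes a one-sided lower confidence limit on the
mean, assuming approximately normal data, and checks whether it lies above a
fixed limit.  The function names and the data are hypothetical.  The critical
value of Student's t must be supplied by the user from a t-table for n - 1
degrees of freedom; the procedures in Section 6 should be consulted for the
actual method.

```python
import math
import statistics


def lower_confidence_limit(samples, t_critical):
    """One-sided lower confidence limit on the mean: xbar - t * s / sqrt(n).

    t_critical is the upper percentile of Student's t with n - 1 degrees of
    freedom for the chosen confidence level, looked up by the caller.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation (n - 1)
    return mean - t_critical * sd / math.sqrt(n)


def exceeds_fixed_limit(samples, limit, t_critical):
    """True when the lower confidence limit lies above the fixed limit."""
    return lower_confidence_limit(samples, t_critical) > limit


# Hypothetical concentrations; 4.541 is the tabled 99th percentile of
# Student's t with 3 degrees of freedom.
concentrations = [8.0, 10.0, 12.0, 10.0]
lcl = lower_confidence_limit(concentrations, t_critical=4.541)
```

     Only when the lower limit is above the MCL or ACL does this sketch
report an exceedance, which mirrors the idea that the data must give
statistically significant evidence before further action is triggered.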

     Section 7 presents the case where the level of each constituent within a
single, uncontaminated well is being compared to  its  historic background con-
centrations.  This is known as an intra-well  comparison.  In essence, the data
for  each  constituent in  each well  are  plotted  on a time  scale  and  inspected
for  obvious  features   such  as  trends  or  sudden changes  in  concentration
levels.   The method  suggested  in this  section is a  combined Shewhart-CUSUM
control chart.
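
     A minimal sketch of a combined Shewhart-CUSUM calculation follows.  It
assumes the usual textbook form:  each observation is standardized against the
well's historical mean and standard deviation, the standardized value is
compared with a Shewhart control limit, and a cumulative sum of exceedances
over a reference value k is compared with a decision limit h.  The default
values k = 1, h = 5, and SCL = 4.5, the use of "greater than or equal to" for
flagging, and the function name are assumptions for illustration; Section 7
gives the procedure actually recommended.

```python
def shewhart_cusum(values, mean, sd, k=1.0, h=5.0, scl=4.5):
    """Return (z, cusum, flagged) for each observation, in time order.

    z      = (x - mean) / sd, the standardized observation
    cusum  = max(0, z - k + previous cusum)
    flagged is True when z >= scl (Shewhart limit) or cusum >= h.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    cusum = 0.0
    results = []
    for x in values:
        z = (x - mean) / sd
        cusum = max(0.0, z - k + cusum)
        results.append((z, cusum, z >= scl or cusum >= h))
    return results


# Hypothetical concentrations with an assumed background mean of 10 and
# standard deviation of 2.
chart = shewhart_cusum([10, 12, 14, 16, 18, 20], mean=10.0, sd=2.0)
flags = [flagged for _, _, flagged in chart]
```

     In this made-up series the fifth observation is flagged by the
cumulative sum before any single value reaches the Shewhart limit, which shows
why the two checks are combined:  the CUSUM responds to a gradual upward
drift, while the Shewhart limit responds to a sudden large change.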

     Section 8 contains a variety of special topics that are relatively short
and  self-contained.   These topics  include methods to deal  with  data  that are
below  the limit  of  analytical detection and methods to test  for outliers or
extreme values in the data.

     Finally, the  guidance  presents  appendices  that cover general statistical
considerations,  a glossary  of  statistical  terms,  statistical tables,  and a
listing  of  references.    These  appendices  provide  necessary and  ancillary
information to aid the user in evaluating ground-water monitoring data.
                                      E-2

-------
                                  SECTION  1

                                 INTRODUCTION


     The U.S. Environmental  Protection Agency  (EPA) promulgated  regulations
for detecting contamination of  ground water at hazardous  waste  land disposal
facilities under  the  Resource Conservation  and  Recovery Act  (RCRA)  of 1976.
The statistical  procedures specified  for use to  evaluate the presence of con-
tamination have been  criticized  and require improvement.   Therefore,  EPA has
revised those statistical procedures  in 40 CFR  Part 264, "Statistical Methods
for Evaluating Ground-Water Monitoring Data From Hazardous Waste Facilities."

     In 40 CFR  Part 264,  EPA has  recently amended  the  Subpart  F  regulations
with  statistical  methods  and sampling  procedures  that are  appropriate  for
evaluating ground-water monitoring  data  under a variety of situations (53 FR
39720, October 11, 1988).   The purpose of  this document is to provide guidance
in  determining  which  situation applies  and  consequently  which  statistical
procedure may be  used.   In addition  to providing  guidance on selection of an
appropriate  statistical  procedure,   this  document  provides  instructions  on
carrying out the procedure and interpreting the results.

     The  regulations provide three  levels  of  monitoring  for a  regulated
unit:   detection monitoring;  compliance   monitoring;  and  corrective  action.
The regulations define conditions for a regulated  unit  to be changed from one
level of monitoring to a more stringent level of monitoring  (e.g.,  from detec-
tion monitoring to compliance monitoring).   These conditions are that there is
statistically significant evidence  of contamination [40 CFR §264.91(a)(1) and
(2)].

     The regulations allow the benefit of  the doubt to reside with  the current
stage of monitoring.   That is,  a  unit will remain  in  its current  monitoring
stage unless  there  is convincing evidence to change it.   This means  that  a
unit will  not be changed from  detection  monitoring to  compliance  monitoring
(or from compliance monitoring to corrective action) unless there is statisti-
cally significant evidence  of contamination (or contamination above the com-
pliance limit).

     The main purpose of this document is  to guide owners, operators, Regional
Administrators,  State  Directors, and  other  interested  parties  in  the selec-
tion, use, and  interpretation of appropriate statistical methods for monitor-
ing  the  ground  water at each specific regulated  unit.   Topics  to  be covered
include sampling  needed,  sample sizes,  selection  of appropriate  statistical
design, matching  analysis  of data  to design,  and  interpretation of results.
Specific recommended methods  are detailed  and a general  discussion  of evalu-
ation of alternate methods is provided.  Statistical concepts are discussed in


                                      1-1

-------
Appendix A.  References for suggested procedures are provided as well as
references to alternate procedures and general statistics texts.  Situations
calling for external consultation are mentioned as well as sources for
obtaining expert assistance when needed.

     EPA would like to emphasize the need for site-specific considerations in
implementing the Subpart F regulations of 40 CFR Part 264 (especially as
amended, 53 FR 39720, October 11, 1988).  It has been an ongoing strategy to
promulgate regulations that are specific enough to implement, yet flexible
enough to accommodate a wide variety of site-specific environmental factors.
This is usually achieved by specifying criteria that are appropriate for the
majority of monitoring situations, while at the same time allowing
alternatives that are also protective of human health and the environment.
This philosophy is maintained in the recently promulgated amendments entitled,
"Statistical Methods for Evaluating Ground-Water Monitoring Data From
Hazardous Waste Facilities" (53 FR 39720, October 11, 1988).  The sections that
allow for the use of an alternate sampling procedure and statistical method
[§264.97(g)(2) and §264.97(h)(5), respectively] are as viable as those that
are explicitly referenced [§264.97(g)(1) and §264.97(h)(1-4)], provided they
meet the performance standards of §264.97(i).  Due consideration to this
should be given when preparing and reviewing Part B permits and permit
applications.
                                      1-2

-------
                                  SECTION 2

                             REGULATORY OVERVIEW
     In 1982, EPA promulgated ground-water monitoring and response standards
for permitted facilities in Subpart F of 40 CFR Part 264, for detecting
releases of hazardous wastes into ground water from storage, treatment, and
disposal units (47 FR 32274, July 26, 1982).

     The Subpart F  regulations  required ground-water  data  to  be  examined  by
Cochran's  Approximation  to  the  Behrens-Fisher  Student's  t-test  (CABF)   to
determine whether there was  a significant exceedance of background levels,  or
other  allowable  levels,  of specified chemical parameters and hazardous waste
constituents.  One concern was that this procedure could result in a high rate
of  "false  positives"  (Type I error),  thus  requiring  an  owner  or  operator
unnecessarily  to  advance  into  a more  comprehensive and  expensive phase  of
monitoring.   More  importantly,  another concern  was that the procedure could
result  in  a  high rate of  "false negatives"  (Type  II error), i.e., instances
where actual  contamination would go undetected.

     As a result of these concerns, EPA amended the regulations, replacing the
CABF procedure with five different statistical methods that are more
appropriate for ground-water monitoring (53 FR 39720, October 11, 1988).
These amendments also outline sampling procedures and performance standards
that are designed to help minimize the chance that a statistical method will
indicate contamination when it is not present (Type I error) or fail to
detect contamination when it is present (Type II error).

2.1  BACKGROUND

     Subtitle C of the Resource Conservation and Recovery Act of 1976 (RCRA)
creates a comprehensive program for the safe management of hazardous waste.
Section 3004 of RCRA requires owners and operators of facilities that treat,
store,  or  dispose  of hazardous  waste to comply  with  standards established  by
EPA that  are "necessary to  protect  human  health and the environment."   Sec-
tion 3005  provides  for implementation of these  standards  under permits issued
to owners  and  operators  by EPA or authorized States.   Section 3005 also pro-
vides that owners and operators of existing  facilities  that  apply for a permit
and comply with  applicable notice  requirements  may operate until a  permit
determination  is made.    These  facilities  are  commonly  known  as  "interim
status"  facilities.   Owners and operators  of interim  status facilities also
must comply with standards set under Section 3004.

     EPA promulgated ground-water monitoring and response standards for
permitted facilities in 1982 (47 FR 32274, July 26, 1982), codified in 40 CFR


                                      2-1

-------
Part 264, Subpart F.  These standards establish programs for protecting ground
water from releases of hazardous wastes  from  treatment,  storage, and disposal
units.  Facility owners and operators were  required  to sample ground water at
specified intervals and to use a statistical procedure to determine whether or
not  hazardous  wastes  or  constituents   from  the  facility  are  contaminating
ground water.   As explained  in  more detail  below,  the  Subpart F regulations
regarding statistical methods used  in evaluating  ground-water monitoring data
that EPA promulgated in 1982 have generated criticism.

     The  Part 264  regulations prior to  the October 11,  1988  amendments pro-
vided that the Cochran's  Approximation  to  the Behrens-Fisher Student's t-test
(CABF) or an alternate statistical  procedure approved by EPA be used to deter-
mine  whether  there  is  a  statistically  significant  exceedance  of background
levels, or other  allowable levels,  of specified  chemical  parameters and haz-
ardous  waste  constituents.   Although  the regulations  have  always provided
latitude  for  the  use  of  an  alternate  statistical  procedure,  concerns  were
raised that the  CABF statistical  procedure in the regulations was not appro-
priate.   It  was pointed  out  that:   (1) the  replicate sampling  method is not
appropriate for the CABF procedure,  (2)  the CABF procedure does not adequately
consider  the  number of comparisons  that must be made,  and  (3)  the CABF does
not control for  seasonal  variation.   Specifically, the concerns were that the
CABF procedure could result in  "false positives"  (Type I error), thus requir-
ing  an owner  or  operator unnecessarily  to  collect  additional  ground-water
samples,  to  further characterize  ground-water quality,  and  to apply  for  a
permit modification, which  is then  subject  to EPA review.   In addition, there
was concern that  CABF  may result in  "false negatives" (Type II error), i.e.,
instances  where  actual   contamination   goes  undetected.    This  could  occur
because  the  background   data,  which  are  often   used  as  the  basis of  the
statistical  comparisons,  are  highly  variable  due  to  temporal,  spatial,
analytical, and sampling effects.
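
     The concern about the number of comparisons can be made concrete with a
short calculation.  The sketch below assumes that the individual tests are
statistically independent, which is a simplification and will not hold
exactly for real monitoring data; under that assumption the chance of at
least one false positive among m tests, each run at Type I error level alpha,
is 1 - (1 - alpha)^m.  The function name and the well and constituent counts
are hypothetical.

```python
def experimentwise_error(alpha, comparisons):
    """Chance of at least one false positive among independent tests.

    alpha is the Type I error level of each individual test and comparisons
    is the number of tests made in one testing period.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    if comparisons < 1:
        raise ValueError("at least one comparison is required")
    return 1.0 - (1.0 - alpha) ** comparisons


# Hypothetical facility: 10 compliance wells and 5 constituents, each
# tested separately at the 0.01 level in one testing period.
rate = experimentwise_error(alpha=0.01, comparisons=10 * 5)
```

     Under these assumptions the facility-wide false positive rate is close
to 40 percent in a single testing period, even though each individual test
is run at the 0.01 level.  This is the kind of inflation that a procedure
must account for when many wells and constituents are compared.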

     As  a result  of these concerns, on  October 11,  1988 EPA amended both the
statistical methods and the sampling procedures of the regulations, by requir-
ing  (if  necessary)  that  owners  or  operators  more accurately characterize the
hydrogeology and  potential  contaminants at the facility,  and  by including in
the  regulations  performance  standards  that  all  the  statistical  methods and
sampling  procedures must  meet.   Statistical  methods  and  sampling procedures
meeting these performance  standards would have a  low probability of  indicating
contamination when  it  is not present,  and  of failing to detect contamination
that  actually is present.  The facility owner or operator would have to demon-
strate that a procedure is appropriate for  the site-specific conditions at the
facility,  and to  ensure  that  it  meets the performance  standards outlined
below.   This  demonstration holds for any  of  the  statistical methods and sam-
pling  procedures  outlined  in  this regulation  as  well as any alternate methods
or procedures proposed by  facility  owners and operators.

      EPA  recognizes  that the selection of appropriate monitoring parameters is
also  an  essential  part  of a  reliable statistical   evaluation.    The Agency
addressed this  issue  in  a  previous   Federal  Register  notice  (52  FR 25942,
July  9,  1987).
                                      2-2

-------
2.2  OVERVIEW OF METHODOLOGY

     EPA has  elected  to retain the  idea of general  performance requirements
that the regulated community must meet.   This  approach allows for flexibility
in  designing statistical  methods and  sampling  procedures  to  site-specific
considerations.

     EPA has  tried  to  bring  a measure  of  certainty to these  methods,  while
accommodating the unique  nature of many of the regulated  units in question.
Consistent  with  this  general   strategy,  the Agency is establishing  several
options for the sampling  procedures and  statistical methods  to be  used in
detection monitoring and,  where appropriate, in compliance monitoring.

     The owner or operator  shall  submit, for each  of the  chemical parameters
and hazardous constituents  listed  in the facility  permit,  one  or more of the
statistical  methods  and  sampling  procedures  described  in  the  regulations
promulgated  on  October 11,  1988.    In  deciding  which statistical   test  is
appropriate, he or  she will consider the theoretical  properties of the test,
the data available, the site hydrogeology,  and  the  fate and transport charac-
teristics of potential contaminants at the facility.  The Regional Administra-
tor will review, and  if appropriate,  approve the proposed  statistical  methods
and sampling procedures when issuing the facility permit.

     The Agency recognizes  that there may be situations  where any one statis-
tical test may not be appropriate.  This is true of new facilities with little
or  no  ground-water  monitoring  data.   If  insufficient data  prohibit the owner
or operator from specifying a statistical method of analysis,  then contingency
plans  containing  several  methods  of  data  analysis  and the  conditions  under
which  the  method can  be used will be specified by  the Regional Administrator
in the permit.  In many cases,  the parametric ANOVA can be performed after six
months of data have been collected.  This will  eliminate the need for a permit
modification  in  the  event  that  data  collected during  future  sampling  and
analysis events indicate the need  to  change to  a more appropriate statistical
method of  analysis.   In the event that  a permit modification is necessary to
change a sampling procedure or a statistical method, the reader is referred to
53 FR 37912, September 28, 1988.  These  are considered Class 1 changes requir-
ing Director approval  and should follow  minor modification procedures.

2.3  GENERAL PERFORMANCE STANDARDS

     EPA's  basic concern  in establishing these  performance  standards  for sta-
tistical methods is to achieve a proper  balance between the risk that the pro-
cedures will  falsely  indicate  that  a  regulated unit  is   causing  background
values or  concentration  limits  to be exceeded  (false positives)  and  the risk
that the  procedures will  fail  to  indicate  that background values  or  concen-
tration  limits  are  being  exceeded   (false negatives).    EPA's approach  is
designed to address that concern  directly.   Thus  any  statistical method or
sampling  procedure,  whether  specified   here  or  as  an  alternative to  those
specified,  should   meet  the   following   performance  standards  contained  in
40 CFR §264.97(i):

    1.   The statistical method used to evaluate ground-water monitoring data
         shall  be  appropriate  for  the  distribution of chemical parameters or
         hazardous  constituents.    If  the  distribution  of  the  chemical
         parameters  or  hazardous   constituents  is  shown  by  the  owner  or
         operator to be inappropriate for a normal theory test, then the data
         should  be  transformed or a distribution-free  theory  test should be
         used.   If  the distributions for the  constituents  differ, more than
         one statistical method may be needed.

    2.   If  an individual well  comparison  procedure  is  used  to  compare an
         individual compliance well constituent concentration with background
         constituent  concentrations or  a  ground-water  protection standard,
         the test  shall be done  at a Type  I  error level of no  less than 0.01
         for  each  testing period.   If  a  multiple  comparisons procedure is
         used,  the Type I experimentwise  error  rate  shall  be no  less than
         0.05  for  each testing period; however, the Type  I  error of no less
         than  0.01  for individual  well  comparisons must be maintained.  This
         performance  standard  does  not apply to  control  charts, tolerance
         intervals, or prediction  intervals.

    3.   If  a  control  chart  approach is used  to  evaluate ground-water moni-
         toring  data,  the specific type of control  chart  and  its associated
         parameters  shall  be proposed  by the  owner  or operator and approved
         by  the  Regional Administrator if he or she  finds it to be protective
         of  human health and the environment.

    4.   If  a  tolerance interval or a prediction interval is used to evaluate
         ground-water  monitoring data, then the levels of confidence shall be
         proposed;  in addition, for  tolerance intervals,  the proportion of
         the  population that  the   interval must  contain  (with the proposed
         confidence)  shall be  proposed  by  the owner or operator and approved
         by  the  Regional Administrator if he or she  finds these parameters to
         be  protective of human health and the environment.  These parameters
         will  be determined  after  considering  the  number of  samples  in the
         background data base, the  distribution of the data, and the range of
         the concentration values  for each constituent of concern.

    5.   The  statistical  method  will   include  procedures for  handling data
         below the limit of detection  with  one or  more procedures that are
         protective of human health and the environment.  Any  practical quan-
         titation  limit (PQL)  approved  by the  Regional  Administrator under
         §264.97(h)  that  is  used  in the statistical method shall be the low-
         est concentration  level  that  can  be reliably achieved within  speci-
         fied   limits  of precision  and accuracy  during  routine laboratory
         operating  conditions  available to the facility.

    6.    If  necessary, the  statistical method  shall  include  procedures to
         control  or correct  for seasonal and  spatial  variability as well as
         temporal  correlation  in the data.

     In  referring to  "statistical  methods,"  EPA means  to  emphasize that the
concept  of  "statistical significance"  must  be reflected in several aspects of
the monitoring program.  This involves not only the choice of a level of
significance, but also the choice of a statistical test, the sampling require-
ments, the  number of samples,  and  the frequency  of  sampling.   Since  all of
these parameters  interact to determine  the  ability of the procedure to detect
contamination,  the  statistical  methods,  like  a comprehensive  ground-water
monitoring  program,  must be evaluated  in  their  entirety, not  by individual
components.  Thus a systems approach to ground-water monitoring is endorsed.

     The second performance standard requires further comment.  For individual
well comparisons  in  which an individual compliance well  is  compared to back-
ground, the Type  I error  level  shall  be no  less than 1% (0.01) for each test-
ing period.   In  other words, the probability of the test resulting in a false
positive is no less  than  1  in  100.   EPA believes that this significance level
is sufficient in  limiting the  false  positive rate while at the same time con-
trolling the false negative (missed detection) rate.

     Owners  and  operators  of  facilities that  have  an  extensive  network of
ground-water  monitoring  wells  may  find it  more  practical to use  a multiple
well  comparisons  procedure.     Multiple  comparisons  procedures  control  the
experimentwise error rate  for  comparisons   involving multiple  upgradient  and
downgradient wells.   If  this method  is used,  the Type I experimentwise error
rate  for  each constituent  shall  be  no less  than 5% (0.05)  for  each testing
period.

     In using a  multiple  well  comparisons procedure,  if the owner or operator
chooses to  use  a t-statistic rather  than an F-statistic,  the individual well
Type I  error  level  must  be maintained  at  no  less  than 1%  (0.01).    This
provision should  be considered if a facility owner or operator wishes to use a
procedure that distributes  the  risk  of  a  false positive evenly throughout all
monitoring wells  (e.g., Bonferroni t-test).
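
     The interaction of the 5% experimentwise floor and the 1% individual
well floor can be sketched as follows.  This is a minimal illustration, not
a prescribed procedure: the function names are hypothetical, the even
(Bonferroni) split is one possible way to distribute the risk, and the
experimentwise figure assumes the individual comparisons are independent.

```python
def per_comparison_alpha(n_comparisons, experimentwise=0.05, floor=0.01):
    """Split the experimentwise rate evenly, but never go below the floor."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return max(experimentwise / n_comparisons, floor)


def experimentwise_rate(alpha, n_comparisons):
    """Chance of at least one false positive, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


for m in (2, 5, 10):
    alpha = per_comparison_alpha(m)
    print(m, round(alpha, 4), round(experimentwise_rate(alpha, m), 4))
```

     With more than five comparisons the even split would fall below 0.01,
so the individual level is held at 0.01 and the experimentwise rate rises
above 0.05 as comparisons are added.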

     Setting  these  levels of significance at 1%  and  5%,  respectively,  raises
an  important  question in  how  the false  positive rate will  be  controlled at
facilities with a large number of ground-water monitoring wells and monitoring
constituents.  The  Agency set  these  levels  of  significance  on the basis of a
single testing  period  and  not  on the entire operating  life  of  the facility.
Further, large facilities can reduce the false positive rate by implementing a
unit-specific monitoring  approach.   Data  from uncontaminated upgradient wells
can be pooled and treated as one group.  This will not only reduce the number
of  comparisons  in a  multiple  well  comparisons  procedure but will  also take
into  account  spatial heterogeneities that  may  affect background ground-water
quality.   If  the  overall  F-test is  significant,  then testing of the contrasts
between the mean  of each compliance well concentration and the mean background
concentration must be performed for each constituent.  This will identify the
monitoring  wells  that  are out  of compliance.   The Type  I  error  level for the
individual comparisons shall be  no less than 0.01.  Nonetheless, it is evident
that  facilities   with  an  extensive  number  of  ground-water  monitoring  wells
which are monitored for many constituents may still generate a large number of
comparisons during each testing  period.
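
     The scale of the problem can be illustrated with a short calculation.
This is a sketch under stated assumptions: the well and constituent counts
are hypothetical, no release is occurring, and every individual comparison
is run at exactly the stated Type I error level.

```python
def comparisons_per_period(n_compliance_wells, n_constituents):
    """Number of individual well-by-constituent comparisons in one period."""
    return n_compliance_wells * n_constituents


def expected_false_positives(n_comparisons, alpha=0.01):
    """Expected count of false positives when no release has occurred."""
    return n_comparisons * alpha


# Hypothetical facility: 20 compliance wells, 15 constituents.
n = comparisons_per_period(20, 15)
print(n, expected_false_positives(n))
```

     For this hypothetical facility, 300 comparisons at the 0.01 level
correspond to about three expected false positives per testing period,
which is why site-specific review of significant results may be needed.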

     In these particular  situations, a determination of whether a release from
a facility  has occurred may require the Regional Administrator to evaluate the
site  hydrogeology,   geochemistry,  climatic   factors,  and  other  environmental
parameters to determine if a statistically significant result is indicative of
an  actual  release  from  the facility.    In  making  this determination,  the
Regional  Administrator may note the relative magnitude of the concentration of
the constituent(s).   If the exceedance is based on an observed compliance well
value that is  the  same relative magnitude as  the  PQL (practical  quantitation
limit) or the  background  concentration level, then a  false  positive may have
occurred, and  further  sampling  and testing may be  appropriate.   If, however,
the  background  concentration  level   or  an   action   level  is  substantially
exceeded, then the  exceedance  is  more  likely to  be  indicative of  a release
from the facility.

2.4  BASIC STATISTICAL METHODS AND SAMPLING PROCEDURES

     The October 11, 1988 rule  specifies five  types of statistical  methods to
detect contamination in ground water.  EPA believes that at least one of these
types of  procedures will  be appropriate  for virtually  all  facilities.   To
address  situations  where these  methods  may  not be appropriate,  EPA  has
included a provision  for  the owner or operator to select an alternate method
which is subject to approval  by the Regional Administrator.

2.4.1  The Five Statistical Methods Outlined in the October 11, 1988 Final
       Rule

     1.   A parametric analysis of  variance  (ANOVA)  followed  by multiple com-
          parison procedures  to identify specific  sources of  difference.  The
          procedures  will include  estimation  and testing  of the  contrasts
          between the mean of each compliance well and the background mean for
          each constituent.

     2.   An analysis of  variance  (ANOVA)  based on ranks  followed  by multiple
          comparison  procedures to  identify  specific sources  of  difference.
          The  procedure will include estimation and  testing  of the contrasts
          between the median of each compliance well and the median background
          levels for each constituent.

     3.   A procedure  in  which  a tolerance interval  or  a prediction interval
          for  each  constituent is  established from the  background  data, and
          the  level of each  constituent  in each compliance well is compared to
          its  upper tolerance or prediction limit.

     4.   A control  chart approach  which  will give  control   limits  for each
          constituent.   If any compliance  well  has a value  or a  sequence of
          values that  lie outside  the  control  limits  for that constituent, it
          may  constitute  statistically significant evidence of contamination.

     5.   Another  statistical  method  submitted by the owner  or  operator and
          approved by the Regional Administrator.
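
     As an illustration of the first method, the overall F-statistic of a
one-way parametric ANOVA can be computed as sketched below.  The function
name and the concentrations are hypothetical, and the sketch stops at the
F-statistic: the result would still have to be compared with a tabulated
F critical value and followed by the contrasts described in item 1.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and some replication")
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the well means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations about their own well mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical concentrations (same units), four samples per well.
background = [4.2, 5.1, 4.8, 5.0]
well_a = [5.3, 4.9, 5.6, 5.2]
well_b = [7.9, 8.4, 7.6, 8.1]

f_stat, df_b, df_w = one_way_anova_f([background, well_a, well_b])
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```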

     A  summary of  these  statistical  methods  and  their  applicability is pre-
sented in Table 2-1.  The table lists types of comparisons and the recommended
procedure and  refers the  reader to the appropriate sections where a discussion
and example can be found.

TABLE 2-1.  SUMMARY OF STATISTICAL METHODS

                                                                  Section of
                      Type of               Recommended           guidance
Compound              comparison            method                document
----------------------------------------------------------------------------
Any compound in       Background vs.        ANOVA                 5.2
background            compliance well       Tolerance limits      5.3
                                            Prediction intervals  5.4
                      Intra-well            Control charts        7

ACL/MCL specific      Fixed standard        Confidence intervals  6.2.1
                                            Tolerance limits      6.2.2

Synthetic             Many nondetects       See below detection   8.1
                      in data set           limit Table 8-1
----------------------------------------------------------------------------

     EPA is specifying multiple statistical methods and sampling procedures
and has allowed for alternatives because no one method or procedure is appro-
priate for all circumstances.  EPA believes that the suggested methods and
procedures are appropriate for the site-specific design and analysis of data
from ground-water monitoring systems and that they can account for more of the
site-specific factors than Cochran's Approximation to the Behrens-Fisher
Student's t-test (CABF) and the accompanying sampling procedures in the past
regulations.  The statistical methods specified here address the multiple
comparison problems and provide for documenting and accounting for sources of
natural variation.  EPA believes that the specified statistical methods and
procedures consider and control for natural temporal and spatial variation.

2.4.2  Site-Specific Considerations for Sampling

     The decision on the number of wells needed in a monitoring system will be
made on a site-specific basis by the Regional Administrator and will consider
the statistical method being used, the site hydrogeology, the fate and trans-
port characteristics of potential contaminants, and the sampling procedure.
The number of wells must be sufficient to ensure a high probability of detect-
ing contamination when it is present.  To determine which sampling procedure
should be used, the owner or operator shall consider existing data and site
characteristics, including the possibility of trends and seasonality.  These
sampling procedures are:

     1.   Obtain a sequence of at least four samples taken at an interval that
          ensures, to the greatest extent technically feasible, that an inde-
          pendent sample is obtained, by reference to the uppermost aquifer's
          effective porosity, hydraulic conductivity, and hydraulic gradient,
          and the fate and transport characteristics of potential contami-
          nants.  The sampling procedure and interval must be approved by
          the Regional Administrator.

     2.   An alternate sampling procedure proposed by the owner or operator
          and approved by the Regional Administrator.

     These sampling procedures will allow the use of statistical methods that
will accurately detect contamination.  These sampling procedures may be used
to replace the sampling method present in the former Subpart F regulations.
Rather than taking a single ground-water sample and dividing it into four
replicate samples, a sequence of at least four samples taken at intervals far
enough apart in time (daily, weekly, or monthly, depending on rates of ground-
water flow and contaminant fate and transport characteristics) will help
ensure the sampling of a discrete portion (i.e., an independent sample) of
ground water.  In hydrogeologic environments where the ground-water velocity
prohibits one from obtaining four independent samples on a semiannual basis,
an alternate sampling procedure approved by the Regional Administrator may be
utilized [40 CFR §264.97(g)(1) and (2)].

     The Regional Administrator shall approve an appropriate sampling proce-
dure and interval submitted by the owner or operator after considering the
effective porosity, hydraulic conductivity, and hydraulic gradient in the
uppermost aquifer under the waste management area, and the fate and transport
characteristics  of  potential  contaminants.    Most  of  this  information  is
already required to  be submitted in the  facility's  Part B permit application
under §270.14(c) and may  be  used by the owner or operator to make this deter-
mination.   Further,  the  number  and kinds  of samples  collected  to establish
background concentration  levels  should be appropriate to the form of statisti-
cal  test  employed,   following   generally   accepted  statistical  principles
[40 CFR §264.97(g)].    For example,  the  use  of control  charts presumes a well-
defined  background  of at least  eight  samples  per  well.   By contrast,  ANOVA
alternatives might require only  four samples per well.

     It seems  likely  that most  facilities will  be  sampling monthly over four
consecutive  months,  twice a year.    In order  to  maintain  a  complete annual
record  of ground-water   data,  the   facility  owner  or  operator  may  find  it
desirable to obtain a  sample each month of  the year.  This will help  identify
seasonal  trends in the  data and permit evaluation of the  effects  of  auto-
correlation and seasonal  variation  if present in the samples.

     The concentrations of a constituent determined in these samples are
intended  to  be used  in  one-point-in-time comparisons  between  background and
compliance wells.  This  approach will  help reduce  the components of  seasonal
variation  by providing  for  simultaneous comparisons  between  background  and
compliance well information.

     The  flexibility  for establishing sampling  intervals  was chosen to  allow
for  the unique nature of the hydrogeologic  systems  beneath  hazardous  waste
sites.   This sampling scheme  will  give proper  consideration to  the  temporal
variation  of and  autocorrelation   among  the  ground-water constituents.   The
specified  procedure   requires  sampling  data  from   background  wells, at  the
compliance point,  and according to a  specific  test protocol.    The  owner or
operator  should use  a background value determined  from  data collected  under
this scenario  if a test  approved by the Regional Administrator requires it or
if  a  concentration   limit   in   compliance  monitoring  is  to  be  based  upon
background data.

     EPA  recognizes  that  there may  be  situations where the owner or operator
can devise alternate statistical  methods and sampling procedures that  are more
appropriate  to  the facility  and that  will  provide  reliable results.   There-
fore,  today's   regulations allow the Regional  Administrator to  approve  such
procedures if  he  or she  finds that the procedures  balance  the risk  of  false
positives and  false  negatives in a manner  comparable  to that provided by the
above  specified tests and  that they  meet  specified  performance  standards
[40 CFR §264.97(g)].     In examining  the comparability of  the  procedure  to
provide  a reasonable  balance  between  the  risk  of  false  positives  and  false
negatives,  the owner  or operator  will  specify in the alternate  plan  such
parameters as sampling frequency and sample size.

2.4.3  The "Reasonable Confidence"  Requirement

     The  methods  indicate that  the procedure must  provide  reasonable confi-
dence that the  migration  of  hazardous constituents  from a regulated unit into
and  through   the   aquifer will  be   detected.    (The  reference to  hazardous
constituents  does  not mean that  this  option  applies  only  to  compliance
monitoring; the procedure also applies to monitoring parameters and
constituents in the detection monitoring program since they are surrogates
indicating the presence of hazardous constituents.)  The protocols for the
specific tests, however, will be used as a general benchmark to define
"reasonable confidence" in the proposed procedure.  If the owner or operator
shows that his or her suggested test is comparable in its results to one of
the specified tests, then it is likely to be acceptable under the "reasonable
confidence" test.  There may be situations, however, where it will be
difficult to directly compare the performance of an alternate test to the
protocols for the specified tests.  In such cases the alternate test will have
to be evaluated on its own merits.

2.4.4  Implementation

     Owners and operators currently operating under a RCRA permit and
employing the CABF procedure may change this procedure to a more appropriate
procedure at the time of State or Regional permit review and update.  Of
course, these owners and operators may also apply for a permit modification
under §270.41(a)(3).  This change is considered a Class 1 permit modifica-
tion.  Class 1 permit modifications are technical in nature and generally of
limited interest to the public.  Class 1 modifications may be made with prior
approval from the Director.  The reader is referred to 53 FR 37912,
September 28, 1988 for more detail about the permit modification process.

     Under appropriate circumstances, the owner or operator may wish to
continue using the CABF procedure.  This would involve a facility that has
comparably few monitoring wells (e.g., fewer than five) and monitors for only
a limited number of chemical parameters and hazardous constituents (e.g.,
fewer than four).  In this case, fewer than 20 comparisons would be made each
testing period, and performing the CABF procedure at the 0.05 level of sig-
nificance may result in no more than one false positive each testing period.
The owner or operator should consider a similar evaluation when deciding the
adequacy of the CABF procedure for his or her facility.  Likewise, the owner
or operator should also continually update the background concentrations in
upgradient monitoring wells and simultaneously compare aggregate upgradient
well data (background wells) to downgradient well data (compliance wells).
This practice will help reduce the component of temporal variability associ-
ated with the CABF procedure.  Further, efforts should be made to obtain
independent samples from the monitoring wells.  Section 3 of the guidance
addresses how one might accomplish this task.  If situations permit, the
replicate sampling procedure should be avoided.  Replicate samples provide
information about analytical variability and accuracy.  The goal of all RCRA
ground-water sampling programs should be to provide data about the hydro-
geochemical variability in the aquifers below the hazardous waste facility.
Obtaining independent samples when possible will help reduce the effects of
autocorrelation.

     In all cases any statistical method or sampling procedure must be
approved by the Regional Administrator or State Director.  Changing from one
statistical method or sampling procedure to another may be done at the time of
Regional or State permit review and update, or at any time a Class 1 permit
modification is approved (see 53 FR 37912, September 28, 1988).

-------
                                   SECTION  3

                         CHOOSING A SAMPLING INTERVAL
     This section discusses the important hydrogeologic parameters to consider
when choosing  a sampling  interval.   The Darcy equation  is  used to determine
the horizontal  component of the  average linear velocity of  ground water for
confined, semiconfined, and unconfined aquifers.  This value provides a good
estimate of time of  travel  for most soluble constituents  in ground water, and
can be  used  to determine a sampling  interval.   Example calculations are pro-
vided at  the end of  the section to  further assist the  reader.   Alternative
methods must  be employed  to  determine  a  sampling interval  in  hydrogeologic
environments where Darcy's law is invalid.  Karst, cavernous basalt, fractured
rocks,  and other "pseudo karst"  terranes usually  require  specialized monitor-
ing approaches.

     Section 264.97(g)  of  40  CFR Part 264  Subpart F  provides  the owner  or
operator of a  RCRA facility with  a  flexible sampling schedule that will allow
him or  her to choose a sampling procedure that will reflect site-specific con-
cerns.   This section specifies that  the owner or operator  shall,  on  a semi-
annual  basis, obtain a sequence of at least four samples from each well, based
on  an   interval that  is determined after  evaluating the uppermost aquifer's
effective porosity,  hydraulic conductivity,  and  hydraulic  gradient,  and the
fate and  transport characteristics  of potential contaminants.   The intent of
this provision  is  to set a sampling  frequency  that allows  sufficient  time to
pass  between  sampling events  to  ensure,  to the  greatest  extent  technically
feasible, that  an  independent  ground-water sample  is  taken from  each well.
For further  information  on  ground-water  sampling,  refer to  the EPA "Practical
Guide for Ground-Water Sampling," Barcelona et al., 1985.

     The sampling frequency of the four semiannual sampling events required in
Part 264 Subpart F can be based on estimates using the average linear velocity
of ground water.  Two forms  of the  Darcy equation stated  below relate ground-
water velocity  (V)  to  effective  porosity  (Ne),  hydraulic  gradient  (i),  and
hydraulic conductivity (K):
                     Vh = (Kh * i)/Ne      and     Vv = (Kv * i)/Ne

where Vh and Vv are the horizontal and vertical components of the average
linear velocity of ground water, respectively; Kh and Kv are the horizontal
and vertical components of hydraulic conductivity; i is the head gradient; and
Ne is the effective porosity.  In applying these equations to ground-water
monitoring, the horizontal component of the average linear velocity (Vh) can
be used to determine an appropriate sampling interval.  Usually, field
investigations will yield bulk values for hydraulic conductivity.  In most
cases, the bulk hydraulic conductivity determined by a pump test, tracer test,
or a slug test will be sufficient for these calculations.  The vertical com-
ponent of the average linear velocity of ground water (Vv) may be considered
in estimating flow velocities in areas with significant components of vertical
velocity such as recharge and discharge zones.
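
     The Darcy calculation of average linear velocity can be sketched as
follows.  This is a minimal illustration only: the function names, the input
values, and the 50 ft travel distance are hypothetical, and consistent units
(here feet and days) are assumed throughout.

```python
def average_linear_velocity(conductivity, gradient, effective_porosity):
    """Darcy average linear velocity, V = (K * i) / Ne."""
    if not 0.0 < effective_porosity <= 1.0:
        raise ValueError("effective porosity must be a fraction in (0, 1]")
    return conductivity * gradient / effective_porosity


def travel_time(distance, velocity):
    """Time for ground water to move the given distance at this velocity."""
    if velocity <= 0.0:
        raise ValueError("velocity must be positive")
    return distance / velocity


# Hypothetical site values: Kh = 10 ft/day, i = 0.005, Ne = 0.20.
vh = average_linear_velocity(10.0, 0.005, 0.20)
print(f"Vh = {vh:.3f} ft/day")
print(f"Days to travel 50 ft: {travel_time(50.0, vh):.0f}")
```

     A travel time of this kind gives a rough sense of how far apart in time
successive samples might need to be for them to represent different portions
of ground water.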

     To apply the Darcy equation to ground-water monitoring, one needs to
determine the parameters K, i, and Ne.  The hydraulic conductivity, K, is the
volume of water at the existing kinematic viscosity that will move in unit
time under a unit hydraulic gradient through a unit area measured at right
angles to the direction of flow.  The reference to "existing kinematic vis-
cosity" relates to the fact that hydraulic conductivity is not only determined
by the media (aquifer), but also by fluid properties (ground water or poten-
tial contaminants).  Thus, it is possible to have several hydraulic conduc-
tivity values for many different chemical substances that are present in the
same aquifer.  In either case it is advisable to use the greatest value for
velocity that is calculated using the Darcy equation to determine sampling
intervals.  This will provide for the earliest detection of a leak from a
hazardous waste facility and expeditious remedial action procedures.  A range
of hydraulic conductivities (the transmitted fluid is water) for various aqui-
fer materials is given in Figures 3-1 and 3-2.  The conductivities are given
in several units.  Figure 3-3 lists conversion factors to change between vari-
ous permeability and hydraulic conductivity units.

     The hydraulic gradient, i, is the change in hydraulic head per unit of
distance in a given direction.  It can be determined by dividing the differ-
ence in head between two points on a potentiometric surface map by the
orthogonal distance between those two points (see example calculation).  Water
level measurements are normally used to determine the natural hydraulic gradi-
ent at a facility.  However, the effects of mounding in the event of a leak
from a waste disposal facility may produce a steeper local hydraulic gradient
in the vicinity of the monitoring well.  These local changes in hydraulic
gradient should be accounted for in the velocity calculations.
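
     The gradient calculation described above can be sketched as follows.
The function name and the head values are hypothetical; heads and distance
are assumed to be in the same length unit.

```python
def hydraulic_gradient(head_upgradient, head_downgradient, distance):
    """Change in hydraulic head per unit distance between two points."""
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    return (head_upgradient - head_downgradient) / distance


# Hypothetical potentiometric heads of 100.0 ft and 99.5 ft, 100 ft apart.
print(hydraulic_gradient(100.0, 99.5, 100.0))
```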

     The effective porosity, Ne, is the ratio, usually expressed as a per-
centage, of the total volume of voids available for fluid transmission to the
total volume of the porous medium dewatered.  It can be estimated during a
pump test by dividing the volume of water removed from an aquifer by the total
volume of aquifer dewatered (see example calculation).  Table 3-1 presents
approximate effective porosity  values  for  a variety of aquifer materials.  In
cases  where the  effective porosity is unknown,  specific  yield may be substi-
tuted  into the equation.   Specific yields  of selected rock units are given in
Table  3-2.   In the  absence of  measured  values, drainable  porosity is often
used to approximate effective porosity.  Figure 3-4 illustrates representative
values  of drainable  porosity  and total  porosity as  a function  of  aquifer
particle  size.    If  available,  field  measurements of effective  porosity are
preferred.
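
     The pump-test estimate and the order of preference described above can
be sketched as follows.  The function names and volumes are hypothetical,
and the fallback order (field measurement first, then specific yield) is one
reading of the paragraph above rather than a prescribed rule.

```python
def porosity_from_pump_test(volume_water_removed, volume_aquifer_dewatered):
    """Estimate Ne as water volume removed over aquifer volume dewatered."""
    if volume_aquifer_dewatered <= 0.0:
        raise ValueError("dewatered volume must be positive")
    return volume_water_removed / volume_aquifer_dewatered


def porosity_for_velocity(field_measured=None, specific_yield=None):
    """Prefer a field-measured Ne; otherwise substitute specific yield."""
    if field_measured is not None:
        return field_measured
    if specific_yield is not None:
        return specific_yield
    raise ValueError("no porosity estimate available")


# Hypothetical pump test: 2,000 ft3 of water from 10,000 ft3 of aquifer.
ne = porosity_from_pump_test(2000.0, 10000.0)
print(porosity_for_velocity(field_measured=ne, specific_yield=0.15))
```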

[Figure: horizontal bars showing the range of hydraulic conductivity, on
logarithmic scales in m/day, ft/day, and gal/day-ft2, for igneous and
metamorphic rocks, basalt, sandstone, shale, carbonate rocks, clay, silt and
loess, silty sand, clean sand, glacial till, and gravel.]

Source:   Heath, R. C.   1983.   Basic Ground-Water Hydrology.   U.S. Geological
Survey Water Supply Paper 2220, 84 pp.

            Figure 3-1.  Hydraulic conductivity of selected rocks.

[Figure: ranges for rocks and unconsolidated deposits plotted against
logarithmic scales of permeability k (darcy, cm2) and hydraulic conductivity
K (cm/s, m/s, gal/day/ft2).]

Source:    Freeze, R. A., and J. A. Cherry.   1979.   Ground Water.   Prentice-
Hall, Inc., Englewood Cliffs, New Jersey, p. 29.

            Figure 3-2.  Range of values of hydraulic conductivity
                              and permeability.
                        Permeability, k*                       Hydraulic conductivity, K
               cm²            ft²            darcy            m/s            ft/s           gal/day/ft²

cm²            1              1.08 x 10^-3   1.01 x 10^8      9.80 x 10^2    3.22 x 10^3    1.85 x 10^9
ft²            9.29 x 10^2    1              9.42 x 10^10     9.11 x 10^5    2.99 x 10^6    1.71 x 10^12
darcy          9.87 x 10^-9   1.06 x 10^-11  1                9.66 x 10^-6   3.17 x 10^-5   1.82 x 10^1
m/s            1.02 x 10^-3   1.10 x 10^-6   1.04 x 10^5      1              3.28           2.12 x 10^6
ft/s           3.11 x 10^-4   3.35 x 10^-7   3.15 x 10^4      3.05 x 10^-1   1              5.74 x 10^5
gal/day/ft²    5.42 x 10^-10  5.83 x 10^-13  5.49 x 10^-2     4.72 x 10^-7   1.74 x 10^-6   1

*To obtain k in ft², multiply k in cm² by 1.08 x 10^-3.

Source:  Freeze, R. A., and J. A. Cherry.  1979.  Ground Water.  Prentice-
Hall, Inc., Englewood Cliffs, New Jersey, p. 29.

Figure 3-3.  Conversion factors for permeability and
           hydraulic conductivity  units.

-------
  TABLE 3-1.  DEFAULT VALUES FOR EFFECTIVE  POROSITY  (Ne)  FOR USE
                 IN TIME OF TRAVEL  (TOT) ANALYSES
                                                 Effective porosity
           Soil textural classes                   of  saturation (a)
Unified soil classification system

     GS, GP, GM, GC, SW, SP, SM, SC                    0.20
                                                       (20%)

     ML, MH                                            0.15
                                                       (15%)

     CL, OL, CH, OH, PT                                0.01 (b)
                                                       (1%)


USDA soil textural classes

     Clays, silty clays, sandy clays
     Silts, silt loams, silty clay  loams               0.10
                                                       (10%)

     All others                                        0.20
                                                       (20%)

Rock units (all)

     Porous media (nonfractured rocks                  0.15
     such as sandstone and some carbonates)            (15%)

     Fractured rocks (most carbonates,                 0.0001
     shales, granites, etc.)                           (0.01%)
Source:  Barari, A., and L. S. Hedges.   1985.  Movement  of  Water
in Glacial Till.  Proceedings of the 17th International Congress of the
International Association of Hydrogeologists,  pp. 129-134.

a  These values are estimates, and there may be differences between
   similar units.  For example, recent studies indicate that
   weathered and unweathered glacial till may have markedly
   different effective porosities (Barari and Hedges, 1985; Bradbury
   et al., 1985).

b  Assumes de minimis secondary porosity.  If fractures or soil
   structure are present, effective porosity should be 0.001
   (0.1%).

-------
       TABLE 3-2.  SPECIFIC YIELD VALUES FOR
                SELECTED ROCK TYPES
         Rock type               Specific yield (%)
Clay                                    2
Sand                                   22
Gravel                                 19
Limestone                              18
Sandstone (semiconsolidated)            6
Granite                                 0.09
Basalt (young)                          8

Source:  Heath, R. C.  1983.  Basic Ground-Water
Hydrology.  U.S. Geological Survey Water Supply
Paper 2220, 84 pp.

-------
[Figure 3-4 graphic: curves of porosity and specific yield (drainable porosity), in percent (0 to 50), plotted against maximum 10% grain size in millimeters (1/16 to 256).  The maximum 10% grain size is the grain size at which the cumulative total, beginning with the coarsest material, reaches 10% of the total sample.]

   Source:  Todd, D.  K.   1980.   Ground Water Hydrology.   John
   Wiley and  Sons,  New York.   534 pp.
     Figure 3-4.  Total porosity and  drainable porosity for
                     typical geologic  materials.

-------
     Once the values for K, i, and Ne are determined, the horizontal component
of the average  linear  velocity of ground water can  be  calculated.   Using the
Darcy equation,  we can determine  the  time  required for ground  water to pass
through the complete monitoring well diameter  by  dividing  the monitoring well
diameter by the  horizontal  component of  the average  linear velocity of ground
water.   (If considerable  exchange of water  occurs  during well  purging, the
diameter of the  filter  pack may be used  rather than the monitoring well diam-
eter.)  This  value will represent the minimum  time  interval  required between
sampling events that will yield an independent ground-water sample.  (Three-
dimensional mixing of ground water in the vicinity of the monitoring well will
occur when the well is purged before sampling, which is one reason why this
method provides only an estimate of travel time.)

     In determining these sampling intervals, one should note that many chemi-
cal compounds will  not  travel  at  the  same velocity as ground water.  Chemical
characteristics  such as adsorptive potential,  specific  gravity,  and molecular
size will influence the way chemicals travel in the subsurface.  Large
molecules, for example, will tend to travel more slowly than the average linear
velocity of ground water because of matrix interactions.  Compounds that exhibit
a strong adsorptive potential will be similarly retarded, which can dramatically
change time-of-travel predictions made using the Darcy equation.  In some cases,
chemical interaction with the matrix material will alter the matrix structure
and its associated hydraulic conductivity, which may result in an increase in
contaminant mobility.  This effect has been observed with certain organic
solvents in  clay units (see Brown and Andersen,  1981).  Contaminant fate and
transport models may  be useful in determining  the  influence  of  these effects
on movement in the  subsurface.  A variety of these models are available on the
commercial market for private use.

3.1  EXAMPLE CALCULATIONS

EXAMPLE CALCULATION NO. 1:  DETERMINING THE EFFECTIVE POROSITY (Ne)

     The effective  porosity,  Ne,   expressed  in  %, can  be  determined  during a
pump test using  the following method:

       Ne  = 100% x volume  of  water removed/volume of aquifer  dewatered

          Based  on a  pumping  rate of the  pump  of  50  gal/min  and  a pumping
          duration  of 30 min, compute the volume of water removed as:

                        50  gal/min x  30 min  = 1,500 gal

          To calculate  the  volume of aquifer dewatered, use the formula:

                                V = (1/3)πr²h

where r is the radius (ft) of the area affected by pumping and h (ft) is the drop
in the water level.  If, for example, h = 3 ft and r = 18 ft, then:

                        V = (1/3)π(18 ft)²(3 ft) = 1,018 ft³

-------
Next, converting ft³ of water to gallons of water,

                   V  =  (1,018  ft³)(7.48  gal/ft³)  =  7,615  gal

          Substituting  the two  volumes  in  the equation  for  the  effective
          porosity, obtain

                        Ne  = 100%  x  1,500/7,615 = 19.7%
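
The arithmetic of Example Calculation No. 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the dewatered zone is assumed to be the cone given by V = (1/3)πr²h, and 7.48 gal/ft³ is the conversion factor used above.

```python
import math

GAL_PER_FT3 = 7.48  # conversion factor used in the example above


def cone_volume_ft3(radius_ft, drawdown_ft):
    """Volume of aquifer dewatered, V = (1/3) * pi * r^2 * h, in cubic feet."""
    return math.pi * radius_ft ** 2 * drawdown_ft / 3.0


def effective_porosity_pct(pump_rate_gpm, duration_min, radius_ft, drawdown_ft):
    """Ne (%) = 100 * volume of water removed / volume of aquifer dewatered."""
    water_removed_gal = pump_rate_gpm * duration_min
    aquifer_dewatered_gal = cone_volume_ft3(radius_ft, drawdown_ft) * GAL_PER_FT3
    return 100.0 * water_removed_gal / aquifer_dewatered_gal


# Values from the example: 50 gal/min for 30 min, r = 18 ft, h = 3 ft
ne = effective_porosity_pct(50, 30, 18, 3)
print(f"Ne = {ne:.1f}%")  # prints: Ne = 19.7%
```

Keeping the two volumes in the same unit (gallons here) before taking the ratio is the step most easily overlooked when the calculation is done by hand.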

EXAMPLE CALCULATION NO. 2:  DETERMINING THE HYDRAULIC GRADIENT (i)

     The hydraulic gradient, i, can be determined from a potentiometric
surface map (Figure 3-5 below) as i = Δh/ℓ, where Δh is the difference in
water-level elevation measured at piezometers Pz1 and Pz2, and ℓ is the
orthogonal distance between the two piezometers.

     Using the values given in Figure 3-5, obtain

              i  = Δh/ℓ  = (29.2 ft  -  29.1 ft)/100  ft =  0.001  ft/ft

[Figure 3-5 graphic: potentiometric surface map showing the 29.2-ft and 29.1-ft contours.]

            Figure  3-5.   Potentiometric  surface map  for  computation
                            of hydraulic gradient.

     This  method  provides  only  a   very  general  estimate  of  the  natural
hydraulic  gradient  that   exists   in  the  vicinity of  the  two  piezometers.
Chemical  gradients  are known  to  exist  and  may override  the  effects  of  the
hydraulic  gradient.    A  detailed  study  of the  effects of multiple  chemical
contaminants may be necessary  to determine the  actual  average  linear velocity
(horizontal  component)  of  ground  water  in   the  vicinity  of  the  monitoring
wells.

-------
EXAMPLE  CALCULATION NO.  3:    DETERMINING  THE  HORIZONTAL  COMPONENT OF  THE
AVERAGE LINEAR VELOCITY OF GROUND WATER (Vh)

     A  land  disposal  facility  has ground-water  monitoring  wells  that  are
screened in  an unconfined  silty sand  aquifer.   Slug tests,  pump  tests,  and
tracer tests conducted during a hydrogeologic site investigation have revealed
that the aquifer has a horizontal hydraulic conductivity (Kh) of 15 ft/day and
an  effective  porosity  (Ne)  of  15%.    Using  a  potentiometric  map (as  in
example 2), the hydraulic gradient (i)  has been determined to be 0.003 ft/ft.

     To estimate the minimum time interval between sampling  events that will
allow one to obtain an independent sample of ground water proceed as follows.

     Calculate  the horizontal  component of  the  average  linear  velocity  of
ground water (Vh) using the Darcy equation, Vh = (Kh x i)/Ne.

With Kh = 15 ft/day,

     Ne = 15%, and
      i = 0.003 ft/ft, calculate

            Vh = (15 ft/day)(0.003)/(0.15) = 0.3 ft/day, or equivalently

            Vh = (0.3 ft/day)(12 in/ft) = 3.6 in/day

     Discussion:   The  horizontal component of the  average  linear velocity of
ground water,  Vh,  has  been  calculated  and  is equal to 3.6 in/day.  Monitoring
well  diameters at  this  particular  facility are 4 in.   We  can  determine  the
minimum time interval between sampling events that will allow one to obtain an
independent sample  of ground water by dividing the monitoring well diameter by
the horizontal component of the  average linear velocity of ground water:

            Minimum time interval = (4 in)/(3.6 in/day) = 1.1 days

     Based on  the above calculations, the owner or operator could sample every
other  day.   However,  because  the velocity can vary  with  recharge rates sea-
sonally, a weekly sampling interval would be advised.

                          Suggested  Sampling Interval

                      Date            Obtain Sample No.

                     June 1                   1
                     June 8                   2
                     June 15                  3
                     June 22                  4
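
Examples 2 and 3 chain together naturally, and the sketch below reproduces them under stated assumptions. The function names are hypothetical, effective porosity is passed as a fraction (0.15 for 15%), and the well diameter is used as the travel distance, as in the discussion above.

```python
def hydraulic_gradient(head_1_ft, head_2_ft, distance_ft):
    """i = (difference in water level) / (orthogonal distance between piezometers)."""
    return (head_1_ft - head_2_ft) / distance_ft


def average_linear_velocity(k_h_ft_per_day, gradient, ne_fraction):
    """Vh = (Kh * i) / Ne in ft/day, with Ne expressed as a fraction."""
    return k_h_ft_per_day * gradient / ne_fraction


def minimum_interval_days(well_diameter_in, vh_ft_per_day):
    """Time for ground water to cross the well diameter, in days."""
    vh_in_per_day = vh_ft_per_day * 12.0  # 12 in/ft
    return well_diameter_in / vh_in_per_day


i_example_2 = hydraulic_gradient(29.2, 29.1, 100.0)  # about 0.001 ft/ft
vh = average_linear_velocity(15.0, 0.003, 0.15)      # about 0.3 ft/day
interval = minimum_interval_days(4.0, vh)            # about 1.1 days
print(f"i = {i_example_2:.3f} ft/ft, Vh = {vh:.1f} ft/day, "
      f"minimum interval = {interval:.1f} days")
```

The computed interval is only a lower bound. As noted above, seasonal variation in recharge is the reason a longer (weekly) interval is advised for this example.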

Table  3-3 gives  some results for common situations.

-------
                  TABLE  3-3.   DETERMINING A SAMPLING INTERVAL

  UNIT            Kh (ft/day)    Ne (%)    Vh (in/mo)      SAMPLING INTERVAL

  GRAVEL          10^4           19        9.6 x 10^4      DAILY
  SAND            10^2           22        8.3 x 10^2      DAILY
  SILTY SAND      10             14        1.3 x 10^2      WEEKLY
  TILL            10^-3           2        9.1 x 10^-2     MONTHLY *
  SS (SEMICON)    1               6        30              WEEKLY
  BASALT          10^-1           8        2.28            MONTHLY *

        The horizontal component of the average linear velocities is based on
        a hydraulic gradient, i, of 0.005 ft/ft.

        * Use a monthly sampling interval or an alternate sampling procedure.

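
The velocities in Table 3-3 can be recomputed from the tabulated conductivity and effective porosity values. The sketch below is illustrative: the function name is hypothetical, and the conversion from days to months assumes an average month of 365/12 days, which is an inference that happens to reproduce the tabulated values and is not stated in the table.

```python
GRADIENT = 0.005               # ft/ft, as stated in the note to Table 3-3
INCHES_PER_FOOT = 12.0
DAYS_PER_MONTH = 365.0 / 12.0  # assumption: average month length

# (unit, Kh in ft/day, Ne in percent), taken from Table 3-3
UNITS = [
    ("GRAVEL", 1e4, 19),
    ("SAND", 1e2, 22),
    ("SILTY SAND", 10.0, 14),
    ("TILL", 1e-3, 2),
    ("SS (SEMICON)", 1.0, 6),
    ("BASALT", 1e-1, 8),
]


def velocity_in_per_month(k_h, ne_pct, gradient=GRADIENT):
    """Vh = (Kh * i) / Ne, converted from ft/day to in/mo."""
    ft_per_day = k_h * gradient / (ne_pct / 100.0)
    return ft_per_day * INCHES_PER_FOOT * DAYS_PER_MONTH


velocities = {name: velocity_in_per_month(k, ne) for name, k, ne in UNITS}
for name, v in velocities.items():
    print(f"{name:<13s} {v:10.3g} in/mo")
```

The sampling-interval column of the table is a judgment based on these velocities and is not computed here.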
3.2  FLOW THROUGH KARST AND "PSEUDO-KARST" TERRANES

     The Darcy  equation  is not valid  in  turbulent  and nonlinear laminar flow
regimes.   Examples of  these  particular  hydrogeological  environments include
karst and  "pseudo-karst"  (e.g.,  cavernous basalts  and  extensively fractured
rocks) terranes.  Specialized methods have been investigated by Quinlan (1989)
for developing  alternative monitoring  procedures  for karst and "pseudo-karst"
terranes.  Dye  tracing  as described by Quinlan  (1989)  and Mull  et al. (1988)
is useful  for  identifying  flow  paths  and travel times  in karst  and "pseudo-
karst"  terranes.     Conventional   ground-water   monitoring  wells  in  these
environments are  often  of  little  value  in designing  an  effective monitoring
system.   Field investigations are necessary to locate seeps and springs, which
may serve  as  better "monitoring wells" for  identifying  releases  of hazardous
constituents into ground water and surface water.

-------
                                  SECTION  4

                        CHOOSING A STATISTICAL METHOD


     This section  discusses  the choice of an appropriate  statistical  method.
Section 4.1 includes a flowchart to guide this selection. Section 4.2 contains
procedures to  test the distributional assumptions of  statistical  methods and
Section 4.3 has procedures to test specifically for equality of variances.

     The choice of an appropriate statistical test depends on the type of mon-
itoring and the nature of  the  data.   The proportion of values in the data set
that  are  below detection  is  one  important consideration.    If  most  of the
values are below detection, a test of proportions is suggested.

     One set of statistical  procedures is suggested when  the monitoring con-
sists of comparisons  of  water sample data from  the  background (hydraulically
upgradient) well  with the sample data  from compliance  (hydraulically down-
gradient) wells.   The recommended approach  is  analysis of  variance (ANOVA).
Also, for  a facility with  limited  amounts of data,  it is  advisable  to ini-
tially  use  the ANOVA  method  of data evaluation, and  later,  when sufficient
amounts of data are collected,  to  change to  a tolerance interval or a control
chart approach  for each compliance  well.   However, alternate  approaches are
allowed.   These include  adjustments  for seasonality,  use  of tolerance inter-
vals, and  use  of   prediction  intervals.   These methods  are  discussed  in Sec-
tion 5.

     When the  monitoring  objective  is to compare the  concentration of a haz-
ardous  constituent to a  fixed  level  such  as  a maximum  concentration limit
(MCL),  a different type  of approach  is  needed.   This  type of comparison com-
monly serves as a basis of compliance monitoring.  Control  charts may be used,
as may tolerance or confidence intervals.  Methods for comparison with a fixed
level are presented in Section 6.

     When a long history of data  from each well  is available, intra-well com-
parisons are appropriate.  That is, the data from a single uncontaminated well
are compared over time to detect shifts in concentration, or gradual trends in
concentration that may indicate contamination.  Methods for this situation are
presented in Section 7.

4.1  FLOWCHARTS—OVERVIEW AND USE

     The selection  and  use of a statistical procedure for ground-water moni-
toring  is  a detailed  process.   Because  a single flowchart  would  become too
complicated for easy  use, a  series  of flowcharts has  been  developed.   These
flowcharts  are found  at  the beginning  of each  section  and are  intended  to



-------
guide the  user  in the selection and  use  of procedures in that  section.   The
more detailed flowcharts can be thought of as attaching to the general
flowcharts at the indicated points.

     Three general types  of statistical procedures  are presented in the flow-
chart overview  (Figure 4-1):    (1) background  well  to  compliance  well  data
comparisons; (2) comparison of compliance well data with a constant limit such
as  an  alternate concentration  limit  (ACL) or  a maximum  concentration limit
(MCL);  and  (3)  intra-well  comparisons.   The first  question  to be  asked  in
determining the  appropriate statistical  procedure  is the type  of monitoring
program specified in the facility permit.  The type of monitoring program may
determine  if  the  appropriate  comparison  is  among wells,  comparison  of down-
gradient well data to a constant, intra-well comparisons, or a special case.

     If the facility is in detection monitoring, the appropriate comparison is
between wells that  are hydraulically  upgradient  from the facility  and those
that are hydraulically downgradient.  The statistical procedures for this type
of monitoring are presented in Section 5.  In detection monitoring, many of
the monitored constituents are likely to yield few quantified results (i.e.,
much of the data are below the limit of analytical detection).
If  this is the  case,  then the  test of proportions (Section 8.1.3) may be rec-
ommended.   If the  constituent occurs  in  measurable concentrations  in back-
ground, then  analysis  of  variance  (Section 5.2)  is  recommended.   This method
of  analysis  is  preferred  when the data lack  sufficient  quantity to allow for
the use of tolerance intervals or control  charts.

     If the facility  is in  compliance monitoring,  the permit will specify the
type  of compliance  limit.   If  the compliance  limit is determined  from the
background,  the  statistical method is chosen  from  those that  compare back-
ground  well  to  compliance well  data.   Statistical methods  for  this  case are
presented  in  Section  5.   The preferred method  is  the appropriate analysis of
variance method in Section 5.2, or if sufficient data permit, tolerance inter-
vals or control  charts.  The flow chart in Section 5 aids in determining which
method  is applicable.

     If a  facility  in  compliance monitoring  has a constant maximum concentra-
tion  limit  (MCL)  or  alternate  concentration  limit (ACL)  specified,  then the
appropriate comparison  is with a constant.   Methods  for comparison with MCLs
or  ACLs are  presented  in Section  6,  which  contains  a  flow chart  to  aid  in
determining which method to use.

     Finally, when  more than  one  year of data have  been  collected  from each
well, the  facility  owner  or operator  may  find it useful  to perform intra-well
comparisons over  time  to  supplement the other methods.  This is not a regula-
tory  requirement,  but  it could provide  the  facility owner or  operator with
information about the  site  hydrogeology.   This  method of analysis may be used
when sufficient data from an individual uncontaminated well exist and the data
allow for the identification of trends.  A recommended control chart procedure
(Starks, 1988) suggests that a minimum background sample of eight observations
is  needed.   Thus an  intra-well control chart  approach  could  begin  after the
first  complete  year of  data  collection.    These methods  are  presented  in
Section 7.

-------
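[Figure 4-1 flowchart: Detection monitoring leads to background/compliance well comparisons (Section 5).  Compliance monitoring or corrective action leads to a decision on the type of compliance limit: comparison with background leads to background/compliance well comparisons (Section 5), and comparison with an MCL/ACL leads to comparisons with MCL/ACLs (Section 6).  If more than 1 year of data is available, intra-well comparisons using control charts (Section 7) may also be made.]

                    Figure 4-1.  Flowchart overview.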

-------
4.2  CHECKING DISTRIBUTIONAL ASSUMPTIONS

     The purpose of this section is to provide users with methods to check the
distributional  assumptions  of  the   statistical  procedures  recommended  for
ground-water monitoring.   It  is emphasized that one  need  not  do an extensive
study of the distribution of the data unless  a nonparametric method of analy-
sis is used  to  evaluate the data.   If the owner or  operator wishes to trans-
form the data in lieu of  using a nonparametric method,  it must first be shown
that  the  untransformed  data  are  inappropriate  for  a  normal  theory  test.
Similarly, if the owner or operator wishes to use nonparametric methods, he or
she must demonstrate that the data  do violate normality assumptions.

     EPA has adopted this approach because most  of  the  statistical procedures
that meet the criteria set forth in the regulations  are robust with respect to
departures from many of the normal distributional assumptions.   That is, only
extreme violations  of assumptions will  result  in  an incorrect outcome  of a
statistical   test.   Moreover,  it is  only in  situations  where  it  is  unclear
whether contamination  is  present that departures from  assumptions will  alter
the outcome of a statistical test.   EPA  therefore believes that it is protec-
tive  of  the  environment  to adopt the approach of  not requiring  testing of
assumptions of a normal distribution on a wide scale.

     It  should  be  noted  that the   normal  distributional  assumptions  for
statistical  procedures  apply to the  errors of the  observations.   Application
of  the  distributional  tests to the  observations themselves may  lead  to the
conclusion that the distribution does not fit the observations.  In some cases
this lack of fit may be due to differences in means  for the different wells or
some other  cause.   The tests  for distributional assumptions are best applied
to  the  residuals  from a  statistical  analysis.  A residual  is  the difference
between the original observation  and the value predicted  by  a model.   For
example, in analysis of variance, the predicted values are the group means and
the residual is the difference between each observation and its group mean.
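
As a minimal sketch of the residual calculation just described, the following computes one-way ANOVA residuals (each observation minus its group mean). The well names and concentrations are hypothetical, as is the function name.

```python
from statistics import mean


def anova_residuals(groups):
    """Return residuals (observation minus group mean) for each group.

    `groups` maps a well name to its list of observations.
    """
    residuals = {}
    for well, values in groups.items():
        group_mean = mean(values)
        residuals[well] = [x - group_mean for x in values]
    return residuals


# Hypothetical concentrations (ppm) for three wells
data = {
    "background": [4.2, 5.8, 5.1, 4.9],
    "compliance_1": [7.9, 8.4, 9.1, 8.2],
    "compliance_2": [5.0, 6.1, 5.5, 5.8],
}
resid = anova_residuals(data)

# Pool the residuals before applying a distributional check
pooled = [r for values in resid.values() for r in values]
print(len(pooled), "pooled residuals")
```

Pooling the residuals removes the differences in well means, so a normality check applied to `pooled` is not misled by those differences.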

     If the  conclusion  from testing  the  assumptions is that  the  assumptions
are not  adequately met,  then  a transformation of the  data may be  used  or a
nonparametric  statistical  procedure  selected.   Many types of  concentration
data have been reported in the literature to be adequately described by a log-
normal distribution.  That  is,  the natural  logarithm of the original observa-
tions has been found  to  follow the  normal distribution.  Consequently, if the
normal distributional  assumptions  are found  to  be  violated for  the original
data, a transformation  by taking the  natural  logarithm  of each observation is
suggested.   This  assumes that  the data  are  all  positive.   If  the log trans-
formation does  not adequately  normalize  the  data  or stabilize  the variance,
one should use a nonparametric procedure or seek the consultation of a profes-
sional statistician to determine an appropriate statistical procedure.
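
The suggested transformation can be sketched as follows. The function name and the sample values are hypothetical. The check for nonpositive values reflects the requirement above that the data all be positive; how to treat nondetects or zeros is not addressed in this sketch.

```python
import math


def log_transform(observations):
    """Return the natural logarithm of each observation.

    Raises ValueError if any value is zero or negative, because the
    logarithm is defined only for positive data.
    """
    if any(x <= 0 for x in observations):
        raise ValueError("natural-log transformation requires positive data")
    return [math.log(x) for x in observations]


# Hypothetical right-skewed concentrations (ppm)
concentrations = [0.05, 0.2, 0.4, 0.9, 1.5, 6.0]
logged = log_transform(concentrations)
print([round(v, 2) for v in logged])
```

Any subsequent test would then be applied to `logged` rather than to the original concentrations.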

     The  following sections  present  four  selected  approaches  to  check  for
normality.   The first option refers to literature citation; the other three
are statistical procedures.  The choice is left to the user.  The availability
of  statistical software and the user's familiarity with it will be a factor in
the  choice  of a  method.   The  coefficient  of variation method,  for example,
requires only the  computation  of the  mean and standard  deviation of the data.



-------
Plotting on  probability paper  can be done  by  hand but  becomes  tedious with
many data  sets.    However,  the  commercial  Statistical Analysis  System (SAS)
software package provides a computerized  version  of a probability plot  in its
PROC UNIVARIATE procedure.   SYSTAT, a package for PCs, also has a probability
plot procedure.  The chi-squared test is not readily available through commer-
cial software but can be programmed on a PC (for example  in LOTUS 1-2-3) or in
any other  (statistical) software  language with  which the user  is familiar.
The amount  of data available  will also  influence  the choice.   All  tests of
distributional  assumptions  require  a  fairly  large  sample  size  to   detect
moderate to small  deviations from  normality.   The chi-squared test requires a
minimum of 20 samples for a reasonable test.

     Other  statistical  procedures  are  available for  checking distributional
assumptions.   The more  advanced  user  is referred to the Kolmogorov-Smirnov
test (see,  for  example, Lindgren,   1976) which  is used to test the hypothesis
that data come  from a specific  (that  is,  completely  specified) distribution.
The normal  distribution assumption can thus be tested for.  A minimum  sample
size of 50 is recommended for using this test.

     A  modification  to the  Kolmogorov-Smirnov  test  has  been  developed by
Lilliefors who  uses the sample  mean  and standard  deviation  from the data as
the parameters of  the distribution (Lilliefors,  1967).   Again, a sample  size
of at least 50 is  recommended.
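
For readers who want to see what the Lilliefors modification computes, the sketch below evaluates the test statistic only: the largest distance between the empirical distribution function and a normal distribution whose mean and standard deviation are estimated from the data. The function name and the data are hypothetical. No critical values are included; the statistic would have to be compared with the published tables (Lilliefors, 1967).

```python
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(data):
    """Maximum distance between the empirical CDF and a fitted normal CDF."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mu=mean(xs), sigma=stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical skewed sample (far fewer than the 50 values recommended above)
sample = [0.1, 0.2, 0.3, 0.5, 4.0]
print(f"D = {lilliefors_statistic(sample):.3f}")
```

The short sample is for illustration only and is too small to support a conclusion about normality.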

     Another alternative for testing normality is provided by the rather
involved Shapiro-Wilk test.  The interested user is referred to the relevant
article in Biometrika by Shapiro and Wilk (1965).

4.2.1   Literature  Citation

PURPOSE

     An owner or  operator  may wish  to  consult  literature  to determine what
type of distribution the ground-water monitoring data for a specific
constituent are likely to follow.  In cases where insufficient data prevent the
use of a quantitative method for checking distributional assumptions, this
approach may be necessary and may make it easier to determine whether there is
statistically significant evidence of contamination.

PROCEDURE

     One simple way to select a procedure based on a specific statistical
distribution is to cite a relevant published reference.   The owner or operator
may  find  papers that  discuss  data resulting  from sampling  ground water and
conclude that such data for a  particular constituent follow a specified dis-
tribution.  Citing such a  reference may be sufficient justification for using
a  method  based on that distribution, provided  that the data do not show evi-
dence that the assumptions are violated.

     To justify the  use of a literature citation, the owner or operator needs
to  make sure  that the  reference cited  considers  the distribution of data for
the  specific  compound being monitored.   In  addition,  he or she must evaluate



-------
the similarity of his or her site to the site discussed in the literature,
especially with respect to hydrogeologic and potential contaminant characteristics.
However, because many  of  the compounds may not  be  studied in the literature,
extrapolations to compounds with similar chemical characteristics and to sites
with  similar  hydrogeologic  conditions  are also  acceptable.    Basically,  the
owner or operator needs to provide some reason or justification for choosing a
particular distribution.

4.2.2  Coefficient-of-Variation Test

     Many  statistical  procedures assume that  the  data are normally distrib-
uted.  The  concentration of  a  hazardous  constituent in ground water is inher-
ently nonnegative, while  the normal distribution allows  for  negative values.
However, if the  mean of the normal distribution is  sufficiently above zero,
the  distribution  places very little probability on  negative  observations and
is still a  valid approximation.

     One simple check  that can rule out use of  the normal distribution is to
calculate  the  coefficient  of variation  of  the  data.  The use  of this method
was required by the former Part 264 Subpart F regulations pursuant to
Section 264.97(h)(1).  Because most owners and operators as well as Regional
personnel  are  already  familiar with this procedure,  it will  probably be used
frequently.   The  coefficient of variation,  CV, is  the standard deviation of
the observations, divided  by their  mean.   If  the normal  distribution is to be
a  valid  model, there  should be very  little  probability  of  negative values.
The  number of standard deviations  by  which the mean  exceeds  zero determines
the  probability of negative  values.   For example,  if the  mean exceeds zero by
one  standard  deviation, the normal distribution  will have  less than  0.159
probability of a negative observation.
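
The figure quoted above follows directly from the normal distribution function: if the mean lies k standard deviations above zero, the probability of a negative value is Φ(-k), which is the same as Φ(-1/CV). A brief sketch, with a hypothetical function name:

```python
from statistics import NormalDist


def prob_negative(mean, sd):
    """Probability that a normal(mean, sd) variable falls below zero."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Mean one, two, and three standard deviations above zero (CV = 1, 1/2, 1/3)
for k in (1, 2, 3):
    print(f"mean = {k} sd: P(X < 0) = {prob_negative(k, 1.0):.4f}")
```

For k = 1 the result is about 0.1587, just under the 0.159 cited above, and the probability falls quickly as the mean moves further from zero.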

     Consequently, one  can  calculate the  standard  deviation of  the observa-
tions, calculate  the mean,  and  form  the ratio of  the  standard deviation di-
vided by  the  mean.   If this ratio exceeds  1.00,  there  is evidence that the
data  are  not  normal  and the normal distribution should not be  used  for those
data.   (There are other possibilities for  nonnormality,  but  this is a simple
check that  can rule out obviously nonnormal data.)

PURPOSE

     This  test is a  simple  check  for  evidence  of  gross  nonnormality  in the
ground-water monitoring data.

PROCEDURE

     To apply  the coefficient-of-variation check for normality proceed as fol-
lows.

     Step 1.   Calculate the sample mean, $\bar{X}$, of n observations $X_i$, i = 1, ..., n.

$$\bar{X} = \left(\sum_{i=1}^{n} X_i\right)/n$$

-------
     Step 2.   Calculate the sample standard deviation, S.*

$$S = \left[\sum_{i=1}^{n} (X_i - \bar{X})^2/(n - 1)\right]^{1/2}$$

     Step 3.   Divide the sample standard deviation  by  the  sample mean.  This
ratio is the CV.

$$CV = S/\bar{X}$$

     Step 4.   Determine if the result of Step 3 exceeds 1.00.  If so, this is
evidence that the normal distribution does not fit the data adequately.
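
The four steps can be sketched in Python as shown below. The function names and the data are hypothetical. `statistics.stdev` uses the n - 1 denominator, matching the formula in Step 2.

```python
from statistics import mean, stdev


def coefficient_of_variation(observations):
    """CV = sample standard deviation / sample mean (Steps 1 to 3)."""
    x_bar = mean(observations)  # Step 1: sample mean
    s = stdev(observations)     # Step 2: sample standard deviation (n - 1)
    return s / x_bar            # Step 3: CV


def fails_cv_check(observations, limit=1.00):
    """Step 4: True if the CV exceeds the limit (evidence of nonnormality)."""
    return coefficient_of_variation(observations) > limit


# Hypothetical concentrations (ppm), one set skewed and one not
skewed = [0.1, 0.2, 0.3, 0.5, 4.0]
symmetric = [9.0, 10.0, 11.0]
print(f"skewed:    CV = {coefficient_of_variation(skewed):.2f}, "
      f"fails = {fails_cv_check(skewed)}")
print(f"symmetric: CV = {coefficient_of_variation(symmetric):.2f}, "
      f"fails = {fails_cv_check(symmetric)}")
```

A CV at or below 1.00 does not demonstrate normality; the check only rules out data that are obviously nonnormal.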

EXAMPLE

     Table 4-1 is an example data set of chlordane concentrations in 24 water
samples from a fictitious site.  The data are presented in order from least to
greatest.


Applying the procedure steps to the data of Table 4-1, we have:

     Step 1.    X = 1.52

     Step 2.    S = 1.56

     Step 3.   CV = 1.56/1.52 = 1.03

     Step 4.   Because the  result of  Step 3 was  1.03,  which  exceeds 1.00, we
conclude  that  there is  evidence  that the  data do not adequately  follow the
normal distribution.  As will be discussed in other sections, one would then
either transform the data, use a nonparametric procedure, or seek professional
guidance.

     * Throughout this document we use $S^2$ to denote the unbiased estimate of
     the population variance $\sigma^2$.  We refer to this unbiased estimate of
     the population variance as the sample variance.  The formula given in
     Step 2 above for S, the square root of the unbiased estimate of the
     population variance, is used as the sample estimate of the standard
     deviation and is referred to as the "sample standard deviation."  Any
     computation of the sample standard deviation or the sample variance,
     unless explicitly noted otherwise, refers to these formulas.  It should be
     noted that this estimate of the standard deviation is not unbiased in that
     its expected value is not equal to the population standard deviation.
     However, all of the statistical procedures have been developed using the
     formulas as we define them here.

-------
                  TABLE 4-1.  EXAMPLE DATA FOR COEFFICIENT-
                              OF-VARIATION TEST

                                         Chlordane
                                    concentration (ppm)

               Dissolved phase             0.04
                                           0.18
                                           0.18
                                           0.25
                                           0.29
                                           0.38
                                           0.50
                                           0.50
                                           0.60
                                           0.93
                                           0.97
                                           1.10
                                           1.16
                                           1.29
                                           1.37
                                           1.38
                                           1.45
                                           1.46

               Immiscible phase            2.58
                                           2.69
                                           2.80
                                           3.33
                                           4.50
                                           6.60

     NOTE.   The owner or operator may choose to use parametric tests since
1.03 is so close to the limit but should use a transformation or a
nonparametric test if he or she believes that the parametric test results
would be incorrect due to the departure from normality.

4.2.3  Plotting on Probability Paper

PURPOSE

     Probability paper is a visual aid and diagnostic tool in determining
whether a small set of data follows a normal distribution.  Also, approximate
estimates of the mean and standard deviation of the distribution can be read
from the plot.

PROCEDURE

     Let X be the variable; $X_1, X_2, \ldots, X_i, \ldots, X_n$ the set of n
observations.  The values of X can be raw data, residuals, or transformed data.

-------
     Step 1.   Rearrange the observations in ascending order:

                    X(1) <= X(2) <= ... <= X(n)

     Step 2.   Compute the cumulative frequency for each distinct value X(i)
as (i/(n+1)) x 100%.  The divisor of (n+1) is a plotting convention to avoid
cumulative frequencies of 100%, which would be at infinity on the probability
paper.

     If a value of X occurs more than once, then the corresponding value of i
increases appropriately.  For example, if X(2) = X(3), then the cumulative
frequency for X(1) is 100*1/(n+1), but the cumulative frequency for X(2) or
X(3) is 100*(1+2)/(n+1).

     Step 3.   Plot the distinct pairs [X(i), (i/(n+1)) x 100] on probability
paper (this paper is commercially available) using an appropriate scale for X
on the horizontal axis.  The vertical axis for the cumulative frequencies is
already scaled from 0.01 to 99.99%.

     If the points fall roughly on a straight line (the  line can  be drawn with
a ruler), then one  can conclude  that  the underlying distribution is approxi-
mately normal.   Also, an  estimate  of the mean  and  standard  deviation can be
made from the  plot.   The  horizontal line drawn through 50%  cuts the plotted
line at the mean  of  the X  values.  The horizontal line  going through 84% cuts
the line at a value corresponding to the mean plus one standard deviation.  By
subtraction, one obtains the standard deviation.

REFERENCE

Dixon, W.  J.,  and  F.  J.  Massey, Jr.   Introduction to Statistical  Analysis.
McGraw-Hill, Fourth Edition, 1983.

EXAMPLE

     Table 4-2 lists 22 distinct chlordane concentration values (X) along with
their  frequencies.   These  are the  same values  as  those listed  in Table 4-1.
There  is a total  of n=24 observations.

     Step 1.   Sort the values of X  in  ascending order (column 1).

     Step 2.   Compute [100 x (i/25)], column 4, for each distinct value of X,
based on the values of i (column 3).

     Step 3.   Plot the pairs [X(i), 100 x (i/25)] on probability paper
(Figure 4-2).
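
The cumulative frequencies of Steps 1 and 2, including the rule for repeated values, can be computed as in the following sketch.  The function name `plotting_positions` is an assumption introduced here for illustration; the data are the 24 chlordane concentrations of Table 4-1.

```python
from collections import Counter


def plotting_positions(values):
    """Return (distinct value, i, 100*i/(n+1)) triples in ascending order.

    For a repeated value, i is the rank of its last occurrence, which is
    the tie rule described under Step 2 of the procedure.
    """
    n = len(values)
    counts = Counter(values)
    rows = []
    i = 0
    for x in sorted(counts):
        i += counts[x]                      # rank of the last tied observation
        rows.append((x, i, 100 * i / (n + 1)))
    return rows


# The 24 chlordane concentrations (ppm) of Table 4-1.
chlordane = [0.04, 0.18, 0.18, 0.25, 0.29, 0.38, 0.50, 0.50, 0.60, 0.93,
             0.97, 1.10, 1.16, 1.29, 1.37, 1.38, 1.45, 1.46, 2.58, 2.69,
             2.80, 3.33, 4.50, 6.60]

for x, i, pct in plotting_positions(chlordane):
    print(f"{x:5.2f}  {i:2d}  {pct:5.1f}")
```

With n = 24 the divisor is 25, so each cumulative frequency is simply 4 x i percent.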

INTERPRETATION

     The  points in  Figure  4-2  do not  fall on  a straight line;  therefore,  the
hypothesis  of  an  underlying  normal  distribution  is  rejected.    However,  the



-------
                  TABLE 4-2.  EXAMPLE DATA COMPUTATIONS FOR
                             PROBABILITY PLOTTING

                    Concentration    Absolute
                          X          frequency     i    100 x (i/(n+1))   ln(X)

Dissolved phase          0.04            1          1          4          -3.22
                         0.18            2          3         12          -1.71
                         0.25            1          4         16          -1.39
                         0.29            1          5         20          -1.24
                         0.38            1          6         24          -0.97
                         0.50            2          8         32          -0.69
                         0.60            1          9         36          -0.51
                         0.93            1         10         40          -0.07
                         0.97            1         11         44          -0.03
                         1.10            1         12         48           0.10
                         1.16            1         13         52           0.15
                         1.29            1         14         56           0.25
                         1.37            1         15         60           0.31
                         1.38            1         16         64           0.32
                         1.45            1         17         68           0.37
                         1.46            1         18         72           0.38

Immiscible phase         2.58            1         19         76           0.95
                         2.69            1         20         80           0.99
                         2.80            1         21         84           1.03
                         3.33            1         22         88           1.20
                         4.50            1         23         92           1.50
                         6.60            1         24         96           1.89

-------
Figure 4-2.   Probability plot of raw chlordane concentrations.
[Figure not reproduced.  X-axis: concentration; plotted values: 100 x (i/(n+1))
from Table 4-2.]

-------
shape of the curve indicates a lognormal distribution.  This is checked in the
next step.

     Also, information about the solubility of chlordane is helpful.
Chlordane has a solubility (in water) that ranges up to 1.85 mg/L.  Because
the last six measurements exceed this solubility range, contamination is
suspected.

     Next, take the natural logarithm of the X-values (ln(X)) (last column of
Table 4-2).  Repeat Step 3 above using the pairs [ln(X), 100 x (i/25)].  The
resulting plot is shown in Figure 4-3.  The points fall approximately on a
straight line (hand-drawn) and the hypothesis of lognormality of X, i.e., that
ln(X) is normally distributed, can be accepted.  The mean can be estimated at
slightly below 0 and the standard deviation at about 1.2 on the log scale.
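
As a numerical cross-check on the values read from the plot (this check is an addition for illustration and is not part of the graphical procedure), the sample mean and sample standard deviation of ln(X) can be computed directly from the 24 observations of Table 4-1.

```python
import math
import statistics

# The 24 chlordane concentrations (ppm) of Table 4-1.
chlordane = [0.04, 0.18, 0.18, 0.25, 0.29, 0.38, 0.50, 0.50, 0.60, 0.93,
             0.97, 1.10, 1.16, 1.29, 1.37, 1.38, 1.45, 1.46, 2.58, 2.69,
             2.80, 3.33, 4.50, 6.60]

log_values = [math.log(x) for x in chlordane]   # natural logarithms
log_mean = statistics.mean(log_values)
log_sd = statistics.stdev(log_values)           # divisor n - 1

print(f"mean of ln(X) = {log_mean:.2f}")
print(f"standard deviation of ln(X) = {log_sd:.2f}")
```

The computed values should be close to, but need not equal, the graphical estimates, since the latter depend on a hand-drawn line.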

CAUTIONARY NOTE

     The probability  plot  is  not  a formal  test of whether  the  data follow a
normal  distribution.    It  is designed  as  a  quick,  graphical   procedure to
identify  cases  of  obvious  nonnormality.   Figure  4-3  is  an  example  of  a
probability plot of normal  data, illustrating how a probability plot of normal
data  looks.   Figure  4-2 is an example  of how  nonnormal data  look  on a prob-
ability  plot.   Data  that are sufficiently nonnormal  to require  use of a  pro-
cedure  not  based on  the normal  distribution will  show a definite  curve.  A
single  point  that does not fall on the straight line  does  not  indicate  non-
normality, but may be an outlier.

4.2.4  The Chi-Squared Test

     The chi-squared test  can be used to test  whether  a set of  data properly
fits  a  specified  distribution within  a  specified probability.   Most introduc-
tory  courses  in  statistics explain the chi-squared  test,  and  its familiarity
among  owners  and operators  as well  as  Regional   personnel  may  make  it  a
frequently used method of analysis.  In this application the assumed distribu-
tion  is  the  normal distribution, but other distributions  could  also be used.
The  test consists of defining  cells  or ranges  of values  and  determining the
expected number  of observations  that  would  fall  in  each cell according to the
hypothesized distribution.  The actual number of data  points  in each cell is
compared with that predicted  by  the distribution to  judge the adequacy of the
fit.

PURPOSE

      The chi-squared  test  is used  to  test the  adequacy  of  the  assumption of
normality of the  data.

PROCEDURE

      Step 1.  Determine the appropriate number of cells, K.  This number
usually ranges from 5 to 10.  Divide the number of observations, N, by 4 and
keep only the integer part of the result, using K = 10 if the result exceeds
10.  Choosing K in this way guarantees that the expected number of
observations in each cell, N/K, is at least four.
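
The rule of Step 1 amounts to one line of arithmetic.  A minimal sketch follows; the function name `number_of_cells` and the parameter `max_cells` are assumptions made here for illustration.

```python
def number_of_cells(n_obs, max_cells=10):
    """Integer part of N/4, capped at max_cells (Step 1)."""
    return min(n_obs // 4, max_cells)


for n_obs in (21, 24, 40, 100):
    k = number_of_cells(n_obs)
    print(f"N = {n_obs:3d}  ->  K = {k:2d}, expected per cell = {n_obs / k:.2f}")
```

For N below 20 this rule returns fewer than 5 cells, which is outside the usual range quoted in Step 1.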


-------
Figure 4-3.  Probability plot of log-transformed chlordane concentrations.
[Figure not reproduced.  X-axis: ln(concentration); plotted values:
100 x (i/(n+1)) from Table 4-2; the hand-drawn line is marked at the mean and
at the mean plus one standard deviation.]

-------
     Step 2.   Standardize the data by subtracting the sample mean and
dividing by the sample standard deviation:

$$ Z_i = (X_i - \bar{X}) / S $$

     Step 3.   Determine the number of observations that fall in each of the
cells defined according to Table 4-3.  The expected number of observations for
each cell is N/K, where N is the total number of observations and K is the
number of cells.  Let $N_i$ denote the observed number in cell i (for i taking
values from 1 to K) and let $E_i$ denote the expected number of observations in
cell i.  Note that in this case the cells are chosen to make the $E_i$'s equal.


             TABLE 4-3.  CELL BOUNDARIES FOR THE CHI-SQUARED TEST

     Cell boundaries for equal expected cell sizes with the normal
     distribution

                              Number of cells (K)
                  5        6        7        8        9       10

               -0.84    -0.97    -1.07    -1.15    -1.22    -1.28
               -0.25    -0.43    -0.57    -0.67    -0.76    -0.84
                0.25     0.00    -0.18    -0.32    -0.43    -0.52
                0.84     0.43     0.18     0.00    -0.14    -0.25
                         0.97     0.57     0.32     0.14     0.00
                                  1.07     0.67     0.43     0.25
                                           1.15     0.76     0.52
                                                    1.22     0.84
                                                             1.28
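
The boundaries in Table 4-3 are the points that divide the standard normal distribution into K cells of equal probability, that is, the quantiles at 1/K, 2/K, ..., (K-1)/K.  The sketch below shows how such boundaries can be computed with Python's standard library; the function name `cell_boundaries` is an assumption introduced here, and the computation is offered as an illustration rather than as the method used to prepare the table.

```python
from statistics import NormalDist


def cell_boundaries(k):
    """Standard normal quantiles at 1/k, ..., (k-1)/k, rounded to 2 decimals."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    return [round(std_normal.inv_cdf(j / k), 2) for j in range(1, k)]


for k in (5, 10):
    print(k, cell_boundaries(k))
```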

     Step 4.   Calculate the chi-squared statistic by the formula below:

$$ \chi^2 = \sum_{i=1}^{K} \frac{(N_i - E_i)^2}{E_i} $$

     Step  5.   Compare  the  calculated result to the  table of the chi-squared
distribution with  K-3 degrees of  freedom (Table 1,  Appendix B).   Reject  the
hypothesis of normality if the calculated value exceeds the tabulated  value.

REFERENCE

Remington, R. D., and M. A. Schork.  Statistics with Applications to the
Biological and Health Sciences.  Prentice-Hall, 1970.  pp. 235-236.

EXAMPLE

     The data  in Table  4-4 are N  =  21 residuals from an  analysis of  variance
on dioxin  concentrations.  The analysis of variance assumes that the errors

-------
             TABLE 4-4.  EXAMPLE DATA FOR CHI-SQUARED TEST

                                                Standardized
               Observation       Residual         residual

                    1             -0.45            -1.90
                    2             -0.35            -1.48
                    3             -0.35            -1.48
                    4             -0.22            -0.93
                    5             -0.16            -0.67
                    6             -0.13            -0.55
                    7             -0.11            -0.46
                    8             -0.10            -0.42
                    9             -0.10            -0.42
                   10             -0.06            -0.25
                   11             -0.05            -0.21
                   12              0.04             0.17
                   13              0.11             0.47
                   14              0.13             0.55
                   15              0.16             0.68
                   16              0.17             0.72
                   17              0.20             0.85
                   18              0.21             0.89
                   19              0.30             1.27
                   20              0.34             1.44
                   21              0.41             1.73

-------
(estimated by the residuals)  are normally distributed.   The  chi-squared test
is used to check this assumption.

     Step 1.   Divide the number of observations, 21,  by  4 to get 5.25.  Keep
only the integer part, 5, so the test will use K = 5 cells.

     Step 2.   The sample mean and standard deviation are calculated and found
to be:  $\bar{X}$ = 0.00, S = 0.24.  The data are standardized by subtracting the mean
(0 in this case) and dividing by S.  The results are also shown in Table 4-4.

     Step 3.   Determine the  number  of (standardized) observations  that fall
into the five cells  determined from Table 4-3.  These divisions are:  (1) less
than  or  equal  to  -0.84, (2) greater  than -0.84  and less  than or  equal  to
-0.25,  (3) greater  than -0.25  and less  than  or equal to  +0.25,  (4) greater
than 0.25 and less than  or  equal  to  0.84, and (5)  greater than 0.84.  We find
4 observations  in cell  1,  6  in cell 2,  2  in cell 3, 4  in cell 4,  and  5  in
cell 5.

     Step 4.   Calculate the  chi-squared statistic.   The expected  number  in
each cell is N/K or 21/5 = 4.2.


$$ \chi^2 = \frac{(4 - 4.2)^2}{4.2} + \cdots + \frac{(5 - 4.2)^2}{4.2} = 2.10 $$


     Step 5.   The critical  value  at the  5%  level  for a  chi-squared test with
2 (K-3 =  5-3  =  2)  degrees of freedom  is  5.99 (Table  1,  Appendix B).  Because
the calculated value of 2.10 is less than 5.99 there is no evidence that these
data are not normal.
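
The example can be retraced with the following sketch.  It is an illustration only: the function name `chi_squared_normality` is an assumption made here, the cell boundaries are the K = 5 column of Table 4-3, and the critical value 5.99 is the tabulated value quoted in Step 5.

```python
import statistics
from bisect import bisect_left

# Cell boundaries for K = 5 cells, taken from Table 4-3.
BOUNDARIES_K5 = [-0.84, -0.25, 0.25, 0.84]


def chi_squared_normality(values, boundaries):
    """Return (observed cell counts, chi-squared statistic)."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)              # divisor n - 1
    k = len(boundaries) + 1
    counts = [0] * k
    for x in values:
        z = (x - mean) / s                    # Step 2: standardize
        # bisect_left places a value equal to a boundary in the lower cell,
        # matching the "less than or equal to" cell definitions of Step 3.
        counts[bisect_left(boundaries, z)] += 1
    expected = len(values) / k                # equal expected counts, N/K
    chi2 = sum((n_i - expected) ** 2 / expected for n_i in counts)
    return counts, chi2


# The 21 residuals of Table 4-4.
residuals = [-0.45, -0.35, -0.35, -0.22, -0.16, -0.13, -0.11, -0.10, -0.10,
             -0.06, -0.05, 0.04, 0.11, 0.13, 0.16, 0.17, 0.20, 0.21, 0.30,
             0.34, 0.41]

counts, chi2 = chi_squared_normality(residuals, BOUNDARIES_K5)
print("cell counts:", counts)
print(f"chi-squared = {chi2:.2f}")
print("reject normality" if chi2 > 5.99 else "no evidence of nonnormality")
```

Observation 10 lies very close to the boundary at -0.25, so the cell counts depend on using the rounded boundaries of Table 4-3 rather than quantiles carried to more decimal places.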

INTERPRETATION

     The  cell boundaries are  determined from the normal  distribution so that
equal  numbers of observations  should  fall  in each  cell.   If  there  are large
differences between the number of observations in each cell and that predicted
by  the normal  distribution,  this  is evidence  that  the  data  are  not normal.
The  chi-squared statistic is a nonnegative  statistic  that increases  as  the
difference between  the  predicted and observed number of  observations in each
cell  increases.

      If  the  calculated  value of  the  chi-squared statistic exceeds  the tabu-
lated  value,  there  is statistically  significant  evidence  that the data do not
follow  the  normal distribution.   In that case, one would need to do a trans-
formation,  use  a nonparametric procedure, or  seek  consultation before inter-
preting  the  results  of  the  test of  the  ground-water  data.   If the calculated
value  of the chi-squared  statistic does not  exceed  the  tabulated critical
value, there  is no  significant  lack  of fit to the normal distribution and one
can proceed assuming that the assumption of normality is adequately met.

-------
REMARK

     The chi-squared statistic can be  used  to  test whether the residuals from
an analysis  of variance  or other  procedure  are  normal.   In this  case the
degrees of freedom are found by (number of cells minus one minus the number of
parameters that  have been  estimated).  This  may  require more than  the sug-
gested 10 cells.  The chi-squared test does require a fairly large sample size
in that there should be generally at least four observations per cell.

4.3  CHECKING EQUALITY OF VARIANCE:  BARTLETT'S TEST

     The analysis of variance procedures presented in Section 5 are often more
sensitive  to  unequal variances  than  to  moderate departures  from normality.
The  procedures described  in  this  section  allow  for  testing to  determine
whether group variances are  equal  or differ  significantly.  Often in practice
unequal variances and nonnormality occur together.  Sometimes a transformation
to stabilize  or equalize the  variances  also produces  a distribution that is
more  nearly  normal.  This  sometimes  occurs  if the  initial  distribution was
positively  skewed  with  variance  increasing  with the  number  of observations.
Only  Bartlett's  test for checking equality,  or homogeneity,  of  variances is
presented  here.   It encompasses checking equality of  more than two variances
with  unequal  sample  sizes.   Other tests  are available for special cases.  The
F-test is  a  special  situation when  there are  only two groups to be compared.
The user  is referred to classical  textbooks  for this test (e.g., Snedecor and
Cochran, 1980).  In the case of equal  sample sizes but more than two variances
to be compared,  the user might want  to use  Hartley's  or maximum F-ratio test
(see  Nelson,  1987).  This test provides a quick procedure to test for variance
homogeneity.

PURPOSE

      Bartlett's test  is  a  test of homogeneity  of  variances.   In other words,
it is a means of  testing whether a number of  population variances  of normal
distributions  are  equal.   Homogeneity of  variances is  an assumption made in
analysis  of   variance when  comparing  concentrations  of  constituents between
background  and compliance  wells,  or  among  compliance  wells.   It  should be
noted  that Bartlett's  test  is  itself sensitive to  nonnormality  in  the data.
With  long-tailed  distributions  the   test  too  often  rejects  equality  (homo-
geneity) of the variances.

PROCEDURE

      Assume that data from k wells are available and that there are $n_i$ data
points for well i.

-------
     Step 1.   Compute the k sample variances $S_1^2, \ldots, S_k^2$.  The sample
variance, $S^2$, is the square of the sample standard deviation and is given by
the general equation

$$ S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1) $$

where $\bar{X}$ is the average of the $X_1, \ldots, X_n$ values.  Each variance has
associated with it $f_i = n_i - 1$ degrees of freedom.  Take the natural logarithm
of each variance, $\ln(S_1^2), \ldots, \ln(S_k^2)$.

     Step 2.   Compute the test statistic

$$ \chi^2 = f \ln(S_p^2) - \sum_{i=1}^{k} f_i \ln(S_i^2) $$

where

$$ f = \sum_{i=1}^{k} f_i = \left( \sum_{i=1}^{k} n_i \right) - k ; $$

thus f is the total sample size minus the number of wells (groups); and

$$ S_p^2 = \frac{1}{f} \sum_{i=1}^{k} f_i S_i^2 $$

is the pooled variance across wells.

     Step 3.   Using the chi-squared table (Table 1, Appendix B), find the
critical value for $\chi^2$ with (k-1) degrees of freedom at a predetermined
significance level, for example, 5%.

INTERPRETATION

      If the calculated value $\chi^2$ is larger than the tabulated value, then
conclude that the variances are not equal at that significance level.

REFERENCE

Johnson, N. L., and F. C. Leone.  Statistics and Experimental Design in
Engineering and the Physical Sciences.  Vol. I, John Wiley and Sons, New York,
1977.

EXAMPLE

     Manganese concentrations are given for k=6 wells in Table 4-5  below.

     Note:  Some numbers in Table 4-5 have been rounded.

-------
                  TABLE 4-5.  EXAMPLE DATA FOR BARTLETT'S TEST

Sampling
date                 Well 1    Well 2    Well 3       Well 4    Well 5     Well 6

January 1                50        46       272           34        48         68
February 1               73        77       171        3,940        54        991
March 1                 244                  32                                54
April 1                 202                  53

n_i =                     4         2         4            2         2          3
f_i = n_i - 1 =           3         1         3            1         1          2
S_i =                 95.27     21.92    111.60     2,761.96      4.24     536.98
S_i^2 =               9,076       480    12,455    7,628,423        18    288,348
f_i*S_i^2 =          27,228       480    37,365    7,628,423        18    576,696
ln(S_i^2) =            9.11      6.17      9.43        15.85      2.89      12.57
f_i*ln(S_i^2) =       27.33      6.17     28.29        15.85      2.89      25.14

     Step 1.   Compute the six sample variances and take their natural
logarithm, $\ln(S_1^2), \ldots, \ln(S_6^2)$, as 9.11, 6.17, ..., 12.57, respectively.

     Step 2.   Compute

$$ \sum_{i=1}^{6} f_i \ln(S_i^2) = 105.67 $$

This is the sum of the last line in Table 4-5.

               Compute

$$ f = \sum_{i=1}^{6} f_i = 3 + 1 + \cdots + 2 = 11 $$

               Compute

$$ S_p^2 = \frac{1}{11} \sum_{i=1}^{6} f_i S_i^2 = \frac{1}{11} (27,228 + \cdots + 576,696) = \frac{1}{11} (8,270,210) = 751,837.27 $$

               Take the natural logarithm of $S_p^2$:  $\ln(S_p^2) = 13.53$

               Compute $\chi^2 = 11(13.53) - 105.67 = 43.16$

-------
     Step 3.   The critical $\chi^2$ value with 6-1 = 5 degrees of freedom at the 5%
significance level is 11.1 (Table 1 in Appendix B).  Since 43.16 is larger
than 11.1, we conclude that the six variances $S_1^2, \ldots, S_6^2$ are not
homogeneous at the 5% significance level.
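
The computation can be sketched as follows using the manganese data of Table 4-5.  The function name `bartlett_statistic` is an assumption made here for illustration.  The sketch implements the statistic exactly as written in Step 2 of the procedure; some textbook statements of Bartlett's test divide this quantity by a correction factor, which is not applied here.

```python
import math
import statistics


def bartlett_statistic(groups):
    """Return (chi-squared statistic, degrees of freedom) per Step 2."""
    dof = [len(g) - 1 for g in groups]                      # f_i = n_i - 1
    variances = [statistics.variance(g) for g in groups]    # S_i^2
    f = sum(dof)                                            # N - k
    pooled = sum(fi * s2 for fi, s2 in zip(dof, variances)) / f   # S_p^2
    chi2 = f * math.log(pooled) - sum(
        fi * math.log(s2) for fi, s2 in zip(dof, variances))
    return chi2, len(groups) - 1


# Manganese concentrations for the six wells of Table 4-5.
wells = [
    [50, 73, 244, 202],
    [46, 77],
    [272, 171, 32, 53],
    [34, 3940],
    [48, 54],
    [68, 991, 54],
]

chi2, df = bartlett_statistic(wells)
print(f"chi-squared = {chi2:.2f} with {df} degrees of freedom")
print("variances differ" if chi2 > 11.1 else "no evidence of unequal variances")
```

Because the sketch carries full precision, its result differs slightly from the 43.16 obtained above with rounded table entries; the conclusion is unchanged.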

INTERPRETATION

     The sample variances of the data from the six wells were compared by
means of Bartlett's test.  The test was significant at the 5% level, suggesting
that the variances are significantly unequal (heterogeneous).  A log
transform of the data can be done and the same test performed on the
transformed data.  Generally, if the data follow a skewed distribution, this
approach resolves the problem of unequal variances and the user can proceed
with an ANOVA, for example.

     On the  other hand, unequal variances  among well data could  be a direct
indication of  well  contamination,  since the individual  data could  come  from
different distributions (i.e., different means and  variances).   Then the user
may  wish  to  test  which  variance  differs   from  which  one.    The  reader  is
referred  here  to  the  literature for  a gap test  of  variance  (Tukey,  1949;
David, 1956; or Nelson, 1987).

NOTE

          In the  case  of  k=2 variances, the test of equality of variances is
the F-test (Snedecor and Cochran, 1980).

          Bartlett's test simplifies in the case of equal sample sizes, $n_i = n$,
i = 1,...,k.  The test used then is Cochran's test.  Cochran's test focuses on
the largest variance and compares it to the sum of all the variances.  Hartley
introduced a quick test of homogeneity of variances that uses the ratio of the
largest over the smallest variances.  Technical aids for the procedures under
the assumption of equal sample sizes are given by L. S. Nelson in the Journal
of Quality Technology, Vol. 19, 1987, pp. 107 and 165.

-------
                                  SECTION 5

                BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS


     There are  many  situations in ground-water  monitoring that call  for the
comparison of  data from  different  wells.   The  assumption  is  that a  set of
uncontaminated wells can be defined.   Generally these are background wells and
have been  sited to  be  hydraulically upgradient from  the regulated unit.   A
second set of  wells are  sited  hydraulically downgradient from  the regulated
unit and  are  otherwise  known as  compliance  wells.   The  data from  these com-
pliance wells are compared to the data  from  the  background wells to determine
whether there  is  any evidence  of contamination  in  the  compliance  wells that
would presumably result from a release from the regulated unit.

     If the  owner or  operator of a hazardous waste  facility  does not have
reason to  suspect that  the  test assumptions  of equal  variance  or normality
will be violated, then  he or  she  may simply  choose the parametric analysis of
variance as a default method  of statistical  analysis.   In the event that this
method indicates  a  statistically significant difference between  the  groups
being tested, then the test assumptions should be evaluated.

     This situation, where the  relevant comparison  is  between data  from back-
ground wells  and data  from  compliance wells, is  the  topic of  this section.
Comparisons  between  background  well data  and  compliance well  data   may  be
called for in  all  phases of monitoring.  This type  of comparison is the gen-
eral case  for detection  monitoring.   It  is  also the  usual  approach for com-
pliance monitoring  if the compliance limits are determined  by  the  background
well constituent  concentration  levels.    Compounds  that are present  in back-
ground  wells   (e.g.,  naturally  occurring  metals)  are  most   appropriately
evaluated using this comparison method.

     Section  5.1 provides a  flowchart  and  overview  for  the  selection  of
methods for  comparison  of background well  and compliance  well  data.   Sec-
tion 5.2  contains  analysis of  variance  methods.   These provide methods for
directly comparing background well data to compliance  well  data.   Section 5.3
describes  a  tolerance  interval  approach, where  the background  well data are
used to  define the  tolerance  limits for  comparison with the compliance well
data.  Section  5.4  contains  an approach based on  prediction intervals, again
using  the  background well data to determine the prediction  interval for com-
parison with the compliance well  data.  Methods  for comparing data to a fixed
compliance limit (an MCL  or ACL) will be described in Section 6.

-------
5.1  SUMMARY FLOWCHART FOR BACKGROUND WELL TO COMPLIANCE WELL COMPARISONS

     Figure 5-1 is a flowchart to aid in selecting the appropriate statistical
procedure for background well to compliance  well  comparisons.   The first step
is to  determine whether  most of  the observations  are quantified  (that  is,
above the detection limits) or not.  Generally, if more than 50% of the
observations are below the detection limit (as might be the case with detection or
compliance monitoring for  volatile organics)  then  the appropriate comparison
is a test of proportions.   The test  of proportions compares the proportion of
detected values in the background wells to those in the compliance wells.  See
Section 8.1 for a discussion of dealing with data below the detection limit.

     If the proportion of  detected values  is 50% or more,  then an analysis of
variance procedure is the first choice.  Tolerance limits or prediction inter-
vals are acceptable alternate choices  that  the user may select.  The analysis
of variance  procedures  give a more  thorough picture of the situation  at  the
facility.   However,  the tolerance limit or prediction interval  approach  is
acceptable and requires less computation in many situations.
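
The first branch of the flowchart, as described in the two paragraphs above, can be written as a small decision function.  This is a sketch of the text only; the function name `first_choice_procedure` and its arguments are hypothetical, and the returned strings are labels chosen here for illustration.

```python
def first_choice_procedure(n_detected, n_total):
    """Top-level branch of the Section 5.1 selection logic.

    n_detected: number of observations above the detection limit
    n_total:    total number of observations
    """
    fraction_detected = n_detected / n_total
    if fraction_detected < 0.50:
        # More than 50% of the observations are below the detection limit.
        return "test of proportions"
    # 50% or more detected: ANOVA is the first choice; tolerance limits and
    # prediction intervals are acceptable alternates.
    return "analysis of variance"


print(first_choice_procedure(4, 10))
print(first_choice_procedure(5, 10))
```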

     Figure 5-2  is  a  flowchart  to  guide  the  user  if  a tolerance  limits
approach  is  selected.    The first step  in  using  Figure 5-2  is  to  determine
whether the facility is  in detection monitoring.   If so, much  of the data may
be below the detection limit.  See Section 8.1 for a discussion of this case,
which may call for consulting a statistician.  If most of the data are quanti-
fied, then follow  the  flow chart to determine if normal tolerance limits  can
be used.   If the data are  not normal  (as  determined  by one of the procedures
in Section 4.2), then the  logarithm  transformation  may be  done and the trans-
formed data checked for  normality.   If the log data are normal, the lognormal
tolerance  limit  should  be  used.   If  neither  the original data  nor  the log-
transformed   data   are  normal,   seek  consultation   with   a  professional
statistician.
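
The tolerance-limit branch described above reduces to two normality checks in sequence.  A minimal sketch follows, assuming the outcomes of the Section 4.2 normality checks are already available as booleans; the function name `tolerance_limit_method` and its arguments are hypothetical.

```python
def tolerance_limit_method(raw_is_normal, log_is_normal):
    """Choose a tolerance-limit approach from two normality check outcomes.

    raw_is_normal: True if the original data pass a Section 4.2 check
    log_is_normal: True if the log-transformed data pass the same check
    """
    if raw_is_normal:
        return "normal tolerance limits"
    if log_is_normal:
        return "lognormal tolerance limits"
    return "consult a professional statistician"


print(tolerance_limit_method(raw_is_normal=False, log_is_normal=True))
```

The log-transformed data are examined only when the original data fail the check, so the second argument is ignored whenever the first is true.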

     If  a  prediction  interval is  selected  as the method  of choice,  see Sec-
tion 5.4 for guidance in performing the procedure.

     If  analysis of  variance is  to be used, then continue with Figure 5-1 to
select the  specific  method that  is appropriate.   A one-way analysis of vari-
ance is  recommended.   If the data  show evidence  of seasonality (observed,  for
example, in a plot of the  data over time), a trend analysis or perhaps a two-
way  analysis  of  variance may be the appropriate  choice.  These instances  may
require consultation with  a professional statistician.

     If  the one-way analysis  of variance  is appropriate, the computations  are
performed, then  the residuals are  checked  to see if they meet the assumptions
of normality  and equal  variance.   If  so,  the analysis  concludes.   If  not, a
logarithm transformation may be  tried and the residuals from  the analysis of
variance on  the  log  data  are checked  for  assumptions.   If these still  do  not
adequately satisfy  the  assumptions, then a  one-way nonparametric analysis of
variance may be done, or professional consultation may be sought.

-------
   Figure 5-2.  Tolerance limits:  alternate approach to background
              well to compliance well comparisons.
[Flowchart not reproduced.  Boxes: Tolerance Limits; Are Data Normal?; Normal
Tolerance Limits; Take Log of Data; Are Log Data Normal?; Lognormal Tolerance
Limits; Consult with Professional Statistician; Conclusions.]

-------
5.2  ANALYSIS OF VARIANCE

     If  contamination  of  the ground  water  occurs  from  the waste  disposal
facility  and  if  the  monitoring  wells  are  hydraulically  upgradient  and
hydraulically downgradient  from  the  site,  then contamination is  unlikely to
change the  levels  of a  constituent  in all wells  by the  same amount.   Thus,
contamination from a disposal  site can  be seen as differences in average con-
centration among wells,  and  such differences can be detected by  analysis of
variance.

     Analysis of variance (ANOVA) is  the  name given  to a wide variety of sta-
tistical procedures.  All  of these procedures  compare  the means of different
groups of observations to determine whether there are  any significant differ-
ences  among  the  groups,  and  if  so, contrast  procedures  may  be  used  to
determine where  the  differences  lie.   Such procedures are  also known  in the
statistical literature as general linear model procedures.

     Because of  its  flexibility  and  power,  analysis of  variance  is  the pre-
ferred  method of  statistical analysis  when  the ground-water  monitoring  is
based  on  a comparison of background  and  compliance  well  data.   The  ANOVA is
especially useful  in  situations  where sample  sizes  are small, as  is  the case
during the  initial  phases  of ground-water monitoring.   Two types of  analysis
of variance  are  presented:    parametric and nonparametric  one-way  analyses of
variance.  Both methods are appropriate when the only factor of concern is the
different monitoring wells at a given sampling period.

     The hypothesis tests with parametric analysis of  variance usually assume
that  the errors  (residuals)  are  normally distributed with  equal  variance.
These  assumptions  can  be  checked  by  saving  the  residuals  (the  difference
between the observations and  the  values  predicted by the analysis of variance
model) and  using the tests of assumptions presented  in Section  4.   Since the
data will  generally  be  concentrations and since  concentration data are often
found  to follow the  lognormal distribution,  the  log   transformation  is  sug-
gested if substantial violations of the  assumptions  are found in the analysis
of the original concentration data.    If the residuals  from  the  transformed
data  do  not  meet  the   parametric   ANOVA  requirements,  then  nonparametric
approaches to analysis of variance are available using the ranks  of the obser-
vations.   A  one-way analysis of  variance using the  ranks is  presented  in
Section 5.2.2.

     When several  sampling  periods have  been  used and  it is important to con-
sider  the sampling periods  as a  second factor, then  two-way analysis of vari-
ance,  parametric or  nonparametric,  is appropriate.  This  would  be one  way to
test  for and adjust  the data for seasonality.   Also, trend  analysis  (e.g.,
time series) may be used to identify seasonality in the data set.  If necessary,
data that exhibit seasonal trends can be adjusted.  Usually, however,
seasonal  variation will  affect  all  wells  at a  facility  by nearly  the  same
amount,  and  in most circumstances, corrections will not  be necessary.   Fur-
ther,  the  effects  of seasonality will be  substantially reduced  by simultane-
ously  comparing  aggregate   compliance  well  data  to   background  well  data.
Situations  that require  an  analysis  procedure  other  than  a  one-way  ANOVA
should be referred to a professional  statistician.



-------
5.2.1  One-Way Parametric Analysis of Variance

     In the context of ground-water monitoring, two situations exist for which
a one-way analysis of variance is most applicable:

     *    Data for a water quality parameter  are available  from several wells
          but for only one time period (e.g.,  monitoring has just begun).

     *    Data for a water quality parameter  are available  from several wells
          for several time periods.  However, the data do not exhibit
          seasonality.

     In order to apply a  parametric  one-way  analysis of variance,  a minimum
number of observations is needed to give meaningful results.  At least p ≥ 2
groups are to be compared (i.e., two or more wells).  It is recommended that
each group (here, wells) have at least three observations and that the total
sample size, N, be large enough so that N-p ≥ 5.  A variety of combinations of
groups and number of observations  in  groups  will  fulfill  this minimum.   One
sampling interval  with  four  independent samples per well  and  at least three
wells would fulfill the minimum sample size requirements.  The wells should be
spaced so as  to maximize the probability of intercepting a plume of contamina-
tion.  The samples should  be taken far enough  apart  in  time to guard against
autocorrelation.
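
The minimum design recommendations above can be checked mechanically before any analysis is run.  The following is a minimal sketch, not part of the guidance itself; the function name `check_anova_minimums` is hypothetical, and the thresholds are read as "at least two wells," "at least three observations per well," and "N - p of at least 5."

```python
def check_anova_minimums(group_sizes):
    """Return a list of problems with the recommended minimum design.

    group_sizes: number of observations at each well (one entry per well).
    An empty list means the recommendations in the text are met.
    """
    p = len(group_sizes)           # number of wells (groups)
    n_total = sum(group_sizes)     # total sample size N
    problems = []
    if p < 2:
        problems.append("fewer than two wells")
    if any(n < 3 for n in group_sizes):
        problems.append("a well has fewer than three observations")
    if n_total - p < 5:
        problems.append("N - p is less than 5")
    return problems


# One sampling interval, four independent samples at each of three wells.
print(check_anova_minimums([4, 4, 4]))
# Two wells with three samples each: N - p = 4, which falls short.
print(check_anova_minimums([3, 3]))
```

The three-well, four-sample design cited in the text gives N - p = 9 and passes all three checks.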

PURPOSE

     One-way  analysis  of  variance is  a statistical  procedure  to  determine
whether differences  in  mean  concentrations among  wells, or  groups  of wells,
are  statistically  significant.   For example,  is there  significant contamina-
tion of one or more compliance wells as compared to background wells?

PROCEDURE

     Suppose the regulated unit has p wells and that $n_i$ data points (concentrations
of a constituent) are available for the ith well.  These data can be
from either  a single sampling period  or from  more  than one.   In the latter
case,  the user  could check for  seasonality before  proceeding by  plotting the
data over  time.   Usually the computation will  be  done on  a computer using a
commercially available  program.   However,  the procedure  is  presented so that
computations can be done using a desk calculator,  if necessary.

     Step 1.   Arrange the $N = \sum_{i=1}^{p} n_i$ data points in a data table as
follows (N is the total sample size at this specific regulated unit):

-------
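                                                  Well total           Well mean
 Well No.        Observations                     (from Step 2)        (from Step 2)

    1       $X_{11}$  . . .  $X_{1 n_1}$          $X_{1\cdot}$         $\bar{X}_{1\cdot}$
    2           .
    3           .
    .           .
    u       $X_{u1}$  . . .  $X_{u n_u}$          $X_{u\cdot}$         $\bar{X}_{u\cdot}$
    .           .
    p       $X_{p1}$  . . .  $X_{p n_p}$          $X_{p\cdot}$         $\bar{X}_{p\cdot}$

                                                  $X_{\cdot\cdot}$     $\bar{X}_{\cdot\cdot}$
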
     Step 2.   Compute well totals and well means as follows:

          $X_{i\cdot} = \sum_{j=1}^{n_i} X_{ij}$,  total of all $n_i$ observations at well i

          $\bar{X}_{i\cdot} = \frac{1}{n_i} X_{i\cdot}$,  average of all $n_i$ observations at well i

          $X_{\cdot\cdot} = \sum_{i=1}^{p} \sum_{j=1}^{n_i} X_{ij}$,  grand total of all $n_i$ observations

          $\bar{X}_{\cdot\cdot} = \frac{1}{N} X_{\cdot\cdot}$,  grand mean of all observations

These totals and means are shown in the last two columns of the table above.

     Step 3.   Compute the  sum of  squares  of differences between  well  means
and the grand mean:
          $SS_{Wells} = \sum_{i=1}^{p} n_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{p} \frac{1}{n_i} X_{i\cdot}^2 - \frac{X_{\cdot\cdot}^2}{N}$

(The formula  on the  far right  is  usually most convenient  for calculation.)
This sum of  squares  has (p-1) degrees of  freedom  associated  with  it and is a
measure of the variability between wells.

-------
     Step 4.   Compute the corrected total sum of squares
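          $SS_{Total} = \sum_{i=1}^{p} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{p} \sum_{j=1}^{n_i} X_{ij}^2 - (X_{\cdot\cdot}^2 / N)$
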
(The formula  on the  far right  is  usually most convenient  for calculation.)
This sum of  squares  has (N-l) degrees of  freedom  associated  with  it and is a
measure of the variability in the whole data set.

     Step  5.  Compute  the  sum  of  squares  of differences  of  observations
within wells from the well means.  This is the sum of squares due to error and
is obtained by subtraction:



                          $SS_{Error} = SS_{Total} - SS_{Wells}$


It has  associated  with  it  (N-p) degrees of  freedom  and is  a  measure of the
variability within wells.

     Step 6.   Set up  the  ANOVA  table as  shown  below  in Table  5-1.   The sums
of squares and their degrees of freedom were obtained from Steps 3 through 5.
The mean square quantities are simply obtained by dividing each sum of squares
by its corresponding degrees of freedom.
                   TABLE 5-1.   ONE-WAY PARAMETRIC ANOVA  TABLE

Source of                             Degrees of
variation          Sums of squares     freedom     Mean squares                   F

Between wells      SS_Wells             p-1        MS_Wells = SS_Wells/(p-1)      F = MS_Wells/MS_Error

Error (within      SS_Error             N-p        MS_Error = SS_Error/(N-p)
wells)

Total              SS_Total             N-1




     Step  7.   To test  the  hypothesis  of  equal  means for all p wells, compute
F = MS_Wells/MS_Error (last column in Table 5-1).  Compare this statistic to the
tabulated  F statistic with  (p-1) and (N-p) degrees of freedom (Table 2, Appen-
dix B)  at  the 5% significance  level.   If the  calculated  F value exceeds the
tabulated  value, reject the  hypothesis  of equal  well  means.    Otherwise,

-------
conclude that there is no significant difference between the concentrations at
the p wells and thus no evidence of well contamination.

     In the case of a  significant  F (calculated F greater than tabulated F in
Step 7), the user  will  conduct the next  few  steps  to determine which compli-
ance well(s) is (are)  contaminated.   This will  be done by comparing each com-
pliance well with the background well(s).  Concentration differences between a
pair of background wells and compliance wells or between a compliance well and
a set of background wells are called contrasts in the ANOVA and multiple
comparisons framework.

     Step  8.   Determine  if  the significant  F  is due to  differences  between
background and compliance wells (computation of Bonferroni t-statistics).

     Assume that of  the p wells, u are  background  wells  and m are compliance
wells (thus u + m  = p).   Then  m differences — m  compliance wells each compared
with the average  of  the background wells — need to  be computed and tested for
statistical significance.  If there are more than five downgradient wells, the
individual   comparisons  are done at the comparisonwise significance  level  of
1%, which may make the experimentwise significance level greater than 5%.

          Obtain the total sample size of all u background wells,

               $n_b = \sum_{i=1}^{u} n_i$

          Compute the average concentration from the u background wells,

               $\bar{X}_b = \frac{1}{n_b} \sum_{i=1}^{u} X_{i\cdot}$

          Compute the m differences between the average concentration from
          each compliance well and the average of the background wells,

               $\bar{X}_{i\cdot} - \bar{X}_b$,   i = 1, ..., m

          Compute the standard error of each difference as

               $SE_i = [MS_{Error} (1/n_b + 1/n_i)]^{1/2}$

          where $MS_{Error}$ is determined from the ANOVA table (Table 5-1) and $n_i$
          is the number of observations at well i.

          Obtain the t-statistic $t = t_{(N-p),(1-\alpha/m)}$ from Bonferroni's t-table
          (Table 3, Appendix B) with α = 0.05 and (N-p) degrees of freedom.

-------
          Compute the m quantities $D_i = SE_i \times t$ for each compliance well i.
          If m > 5 use the entry for $t_{(N-p),(1-0.01)}$.  That is, use the entry
          at m = 5.

     Step  9.    Compute  the residuals.   The  residuals  are the  differences
between each observation  and  its predicted  value according  to  the particular
analysis of  variance model  under consideration.   In the  case of  a one-way
analysis of variance,  the predicted value  for each observation  is  the group
(that is, well) mean.  Thus the residuals are given by:

          $R_{ij} = X_{ij} - \bar{X}_{i\cdot}$

The residuals, $R_{ij}$, can be used to check for departures from normality as
described in Section 4.2.
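
As an illustration of Step 9, the residuals can be obtained by subtracting each well mean from that well's observations.  The sketch below is an assumption-laden example rather than a prescribed implementation: the function name `residuals_by_well` is hypothetical, and the data are the Well 1 and Well 6 values from the worked example later in this section (Table 5-2).

```python
def residuals_by_well(data):
    """Return residuals R_ij = X_ij - (mean of well i) for each well.

    data: dict mapping a well label to its list of observations.
    """
    residuals = {}
    for well, observations in data.items():
        well_mean = sum(observations) / len(observations)
        residuals[well] = [x - well_mean for x in observations]
    return residuals


# Log lead concentrations for Wells 1 and 6 of the worked example.
example = {
    1: [4.06, 3.99, 3.40, 3.83],
    6: [5.42, 5.21, 5.29, 5.08],
}
for well, values in residuals_by_well(example).items():
    print(well, [round(r, 2) for r in values])
```

Within each well the residuals sum to zero, which is a quick check that the well means were computed correctly.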

NOTE

     The data  can  also be checked  for equality of variances  as  described  in
Section 4.3.   The  last column of  Table 5-2 contains the  standard  deviations
estimated for each well, the $S_i$ used in Bartlett's test.

INTERPRETATION

     If the difference $\bar{X}_{i\cdot} - \bar{X}_b$ exceeds the value $D_i$, conclude that the ith

compliance well has significantly higher concentrations  than the average back-
ground wells.   Otherwise  conclude  that the well  is  not contaminated.   This
exercise needs  to  be  performed for  each of the m  compliance  wells individu-
ally.  The  test is  designed so that  the overall  experimentwise error is 5% if
there are no more than five compliance wells.

     In  some cases  it may be appropriate to  implement  the  ANOVA procedure
independently for  an  individual  regulated  unit.    If  there  are more than five
wells at the compliance point and the  waste management area consists  of more
than one  regulated unit, then the  data may be evaluated  separately for each
regulated unit  if  approved by the  Regional Administrator or  State Director.
In  many  cases the  monitoring well  system  design and  site  hydrogeology will
determine if this  approach is appropriate for  a particular  regulated  unit.
This will help  reduce the number of compliance wells used  in  a multiple well
comparisons procedure.

     If  a  single  regulated  unit  has  more than  five  wells  at the point  of
compliance,  refer to the caveat in the cautionary note.

CAUTIONARY NOTE

     Should  the  regulated unit  consist of more  than  five  compliance  wells,
then the Bonferroni t-test should be modified by doing the individual compari-
sons at  the  1%  level so  that the Part 264 Subpart  F  regulatory requirement



-------
pursuant to §264.97(i)(2)  will  be met.   Alternately,  a different analysis  of
contrasts, such as Scheffe's, may be used.  The more advanced user is  referred
to the second reference below for a discussion of multiple comparisons.
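
The rule for the individual comparison level can be sketched as follows.  This is an illustrative reading of the text, not regulatory language: the function names are hypothetical, the comparisonwise level is taken as 0.05/m with a floor of 0.01, and the experimentwise figure is the simple Bonferroni upper bound (m times the comparisonwise level).

```python
def comparisonwise_alpha(m, experimentwise=0.05, floor=0.01):
    """Significance level for each of m compliance-well comparisons.

    Uses experimentwise/m, but never less than the 1% floor that applies
    when there are more than five compliance wells.
    """
    return max(experimentwise / m, floor)


def experimentwise_bound(m, experimentwise=0.05, floor=0.01):
    """Bonferroni upper bound on the experimentwise error rate."""
    return min(1.0, m * comparisonwise_alpha(m, experimentwise, floor))


for m in (4, 5, 8):
    print(m, comparisonwise_alpha(m), experimentwise_bound(m))
```

With more than five compliance wells the bound exceeds 5%, which is the situation the cautionary note describes.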

REFERENCES

Johnson,  Norman L.,  and  F.  C.  Leone.   1977.   Statistics  and  Experimental
Design in Engineering  and  the Physical Sciences.   Vol.  II,  Second Edition,
John Wiley and Sons, New York.

Miller,  Rupert  G.,  Jr.   1981.   Simultaneous Statistical Inference.   Second
Edition, Springer-Verlag,  New York.

EXAMPLE

     Four  lead  concentration  values  at  each  of  six  wells  are   given  in
Table 5-2  below.   The  wells consist  of  u=2  background and  m=4  compliance
wells.   (The  values in  Table 5-2 are actually  the  natural  logarithms of  the
original lead concentrations.)

     Step 1.   Arrange the  4 x 6 = 24 observations in a data table as  follows:


      TABLE 5-2.  EXAMPLE DATA FOR ONE-WAY PARAMETRIC ANALYSIS OF VARIANCE

                        Natural log of Pb concentrations (µg/L)

                                                         Well total           Well mean             Well
Well No.                Jan 1   Feb 1   Mar 1   Apr 1    ($X_{i\cdot}$)       ($\bar{X}_{i\cdot}$)  std. dev.

1  Background wells     4.06    3.99    3.40    3.83       15.28                3.82                0.296
2                       3.83    4.34    3.47    4.22       15.86                3.97                0.395
3  Compliance wells     5.61    5.14    3.47    3.97       18.19                4.55                0.996 (max)
4                       3.53    4.54    4.26    4.42       16.75                4.19                0.453
5                       3.91    4.29    5.50    5.31       19.01                4.75                0.773
6                       5.42    5.21    5.29    5.08       21.00                5.25                0.143 (min)

                                             $X_{\cdot\cdot}$ = 106.09     $\bar{X}_{\cdot\cdot}$ = 4.42


     Step  2.   The  calculations  are shown on  the  right-hand side of the  data
table above.  Sample standard deviations  have  been  computed  also.

     Step  3.   Compute the between-well sum of squares.


           $SS_{Wells} = \frac{1}{4}(15.28^2 + \cdots + 21.01^2) - \frac{1}{24} \times 106.08^2 = 5.75$


               with  [6 (wells) -  1] = 5 degrees of  freedom.


-------
     Step 4.    Compute the corrected  total  sum of squares.
        $SS_{Total} = 4.06^2 + 3.99^2 + \cdots + 5.08^2 - \frac{1}{24} \times 106.08^2 = 11.92$

               with [24 (observations)  - 1]  = 23 degrees of freedom.

     Step 5.    Obtain the within-well  or error sum of squares by subtraction.
                        $SS_{Error} = 11.92 - 5.75 = 6.17$
               with [24 (observations)  - 6 (wells)]  = 18 degrees of freedom.

     Step 6.   Set up the one-way ANOVA as in Table  5-3 below:


      TABLE 5-3.   EXAMPLE COMPUTATIONS  IN ONE-WAY PARAMETRIC ANOVA TABLE

Source of            Sums of    Degrees of
variation            squares    freedom      Mean squares        F

Between wells         5.76         5         5.76/5 = 1.15       1.15/0.34 = 3.38

Error                 6.18        18         6.18/18 = 0.34
(within wells)

Total                11.94        23

     Step 7.   The calculated F statistic is 3.38.  The tabulated F value with
5 and 18 degrees of freedom at the α = 0.05 level is 2.77 (Table 2, Appendix B).
Since the calculated value exceeds the tabulated value, the hypothe-
sis  of  equal  well  means must  be  rejected,  and  post hoc  comparisons  are
necessary.
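
Steps 2 through 7 can be reproduced with a few lines of code.  The following is a minimal sketch using the computational formulas given above and the data of Table 5-2; the function name `one_way_anova` and the dictionary keys are hypothetical, and the tabulated F value (2.77) is simply copied from Step 7.  Because the sketch carries full precision, it gives an F of about 3.35 rather than the 3.38 obtained from the rounded mean squares in Table 5-3; the conclusion is the same.

```python
def one_way_anova(groups):
    """One-way parametric ANOVA using the computational formulas of Steps 2-7.

    groups: list of lists, one list of observations per well.
    """
    p = len(groups)                                   # number of wells
    n_total = sum(len(g) for g in groups)             # N
    well_totals = [sum(g) for g in groups]            # X_i.
    grand_total = sum(well_totals)                    # X..
    correction = grand_total ** 2 / n_total           # X..^2 / N

    ss_wells = sum(t ** 2 / len(g) for t, g in zip(well_totals, groups)) - correction
    ss_total = sum(x ** 2 for g in groups for x in g) - correction
    ss_error = ss_total - ss_wells                    # obtained by subtraction

    ms_wells = ss_wells / (p - 1)
    ms_error = ss_error / (n_total - p)
    return {
        "ss_wells": ss_wells, "ss_error": ss_error, "ss_total": ss_total,
        "df_wells": p - 1, "df_error": n_total - p,
        "ms_wells": ms_wells, "ms_error": ms_error,
        "f": ms_wells / ms_error,
    }


# Natural logs of lead concentrations, Wells 1-6 (Table 5-2).
lead = [
    [4.06, 3.99, 3.40, 3.83],
    [3.83, 4.34, 3.47, 4.22],
    [5.61, 5.14, 3.47, 3.97],
    [3.53, 4.54, 4.26, 4.42],
    [3.91, 4.29, 5.50, 5.31],
    [5.42, 5.21, 5.29, 5.08],
]
result = one_way_anova(lead)
F_TABULATED = 2.77   # 5 and 18 degrees of freedom, 5% level, from Step 7
print(round(result["f"], 2), result["f"] > F_TABULATED)
```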

     Step 8.   Computation of Bonferroni  t-statistics.

          Note that there are four compliance wells,  so m = 4 comparisons will
          be made

          $n_b = 8$              total number of samples in background wells

          $\bar{X}_b = 3.89$     average concentration of background wells

-------
     Compute the differences between the four compliance wells and the
     average of the two background wells:

     $\bar{X}_{3\cdot} - \bar{X}_b$ = 4.55 - 3.89 = 0.66

     $\bar{X}_{4\cdot} - \bar{X}_b$ = 4.19 - 3.89 = 0.30

     $\bar{X}_{5\cdot} - \bar{X}_b$ = 4.75 - 3.89 = 0.86

     $\bar{X}_{6\cdot} - \bar{X}_b$ = 5.25 - 3.89 = 1.36

     Compute the standard error of  each  difference.   Since the number of
     observations  is  the  same  for  all compliance  wells, the  standard
     errors for the four differences will be equal.


         $SE_i = [0.34 (1/8 + 1/4)]^{1/2} = 0.357$ for i = 3,..., 6

     From Table 3,  Appendix  B,  obtain the critical  t  with  (24 - 6) = 18
     degrees of freedom, m = 4, and for α = 0.05.   The approximate value
     is 2.43 obtained  by  linear interpolation between  15 and  20 degrees
     of freedom.

     Compute the quantities $D_i$.  Again, due to equal sample sizes, they
     will all be equal.
        $D_i = SE_i \times t = 0.357 \times 2.43 = 0.868$ for i = 3,..., 6
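
The Bonferroni comparison in Step 8 reduces to a short calculation once the critical t value has been read from the table.  The sketch below assumes the values quoted in this example (mean square error 0.34, eight background samples, t = 2.43); the function name `bonferroni_critical_differences` and the variable names are hypothetical.

```python
import math


def bonferroni_critical_differences(ms_error, n_background, well_sizes, t_crit):
    """Return D_i = t * sqrt(MS_Error * (1/n_b + 1/n_i)) for each compliance well."""
    return {
        well: t_crit * math.sqrt(ms_error * (1.0 / n_background + 1.0 / n))
        for well, n in well_sizes.items()
    }


background_mean = 3.89                                # average of Wells 1 and 2
well_means = {3: 4.55, 4: 4.19, 5: 4.75, 6: 5.25}     # compliance well means
well_sizes = {3: 4, 4: 4, 5: 4, 6: 4}                 # observations per well

# t = 2.43 is the interpolated table value quoted in the example (18 df, m = 4).
critical = bonferroni_critical_differences(0.34, 8, well_sizes, t_crit=2.43)

# A well is flagged when its mean exceeds the background mean by more than D_i.
flagged = [w for w in well_means if well_means[w] - background_mean > critical[w]]
print(flagged)
```

Only Well 6 is flagged; Well 5, with a difference of 0.86, falls just short of the critical difference.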


Step 9.  Compute the residuals using the data given in Table 5-2.

Residuals for Well 1:

     R11 = 4.06 - 3.82 =  0.24
     R12 = 3.99 - 3.82 =  0.17
     R13 = 3.40 - 3.82 = -0.42
     R14 = 3.83 - 3.82 =  0.01

Residuals for Well 2:

     R21 = 3.83 - 3.97 = -0.14
     R22 = 4.34 - 3.97 =  0.37
     R23 = 3.47 - 3.97 = -0.50
     R24 = 4.22 - 3.97 =  0.25

Residuals for Well 3:

     R31 = 5.61 - 4.55 =  1.06
     R32 = 5.14 - 4.55 =  0.59

-------
          R33 = 3.47 - 4.55 = -1.08
          R34 = 3.97 - 4.55 = -0.58

     Residuals for Well  4:

          R41 = 3.53 - 4.19 = -0.66
          R42 = 4.54 - 4.19 =  0.35
          R43 = 4.26 - 4.19 =  0.07
          R44 = 4.42 - 4.19 =  0.23

     Residuals for Well  5:

          R51 = 3.91 - 4.75 = -0.84
          R52 = 4.29 - 4.75 = -0.46
          R53 = 5.50 - 4.75 =  0.75
          R54 = 5.31 - 4.75 =  0.56

     Residuals for Well  6:

          R61 = 5.42 - 5.25 =  0.17
          R62 = 5.21 - 5.25 = -0.04
          R63 = 5.29 - 5.25 =  0.04
          R64 = 5.08 - 5.25 = -0.17

INTERPRETATION

     The F  test was  significant  at the  5% level.   The Bonferroni  multiple
comparisons procedure  was  then used  to determine  for which wells  there  was
statistically significant evidence of contamination.   Of  the four differences

$\bar{X}_{i\cdot} - \bar{X}_b$, only $\bar{X}_{6\cdot} - \bar{X}_b = 1.36$ exceeds the critical value of 0.868.   From
this it  is  concluded that there  is  significant evidence of contamination at
Well 6.  Well 5  is  right on the boundary of significance.   It  is likely that
Well 6 has intercepted a plume of contamination with  Well 5  being on the edge
of the plume.

     All the compliance well concentrations were somewhat above  the mean con-
centration of the background  levels.   The well means  should be  used to indi-
cate  the location  of the  plume.    The findings  should  be  reported  to  the
Regional Administrator.

5.2.2  One-way Nonparametric Analysis of Variance

     This procedure is appropriate for  interwell comparisons when the data or
the residuals from a parametric ANOVA have been found to be  significantly dif-
ferent from normal and when a log  transformation fails to  adequately normalize
the  data.    In   one-way  nonparametric  ANOVA,  the   assumption  under  the  null
hypothesis is that the data from each well  come from the  same continuous dis-
tribution and hence have the same median  concentrations of  a specific hazard-
ous  constituent.   The alternatives  of  interest are  that the data  from some
wells show increased levels of the hazardous constituent in  question.

-------
     The procedure is called the Kruskal-Wallis test.  For meaningful results,
there should be  at least three groups with  a  minimum sample size of three in
each group.  For large data sets use of a computer program is recommended.  In
the case  of  large data  sets  a  good approximation to  the  procedure  is to re-
place each  observation  by  its rank  (its numerical  place when  the  data are
ordered from least to greatest) and perform  the (parametric) one-way analysis
of variance  (Section 5.2.1)  on  the ranks.   Such  an  approach can be done with
some commercially available statistical packages such as SAS.

PURPOSE

     The purpose of the procedure is to test the hypothesis that  all wells (or
groups of wells) around  regulated  units  have the  same median concentration of
a hazardous  constituent.   If the wells are  found  to  differ,  post-hoc compari-
sons are again necessary to determine if contamination is present.

     Note that the wells define the groups.  All wells will have  at least four
observations.  Denote the number of groups by K and the number of observations
in each group by $n_i$, with N being the total number of all observations.  Let
$X_{ij}$ denote the jth observation in the ith group, where j runs from 1 to the
number of observations in the group, $n_i$, and i runs from 1 to the number of
groups, K.

PROCEDURE

     Step  1.   Rank all  N  observations  of the  groups  from least to greatest.
Let $R_{ij}$ denote the rank of the jth observation in the ith group.  As a

convention, denote the background well(s) as group 1.

     Step 2.   Add the ranks  of the observations  in  each group.   Call the sum
of the ranks for the ith group $R_i$.  Also calculate the average rank for each
group, $\bar{R}_i = R_i / n_i$.

Step 3.   Compute the Kruskal-Wallis statistic:


          $H = \left[ \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} \right] - 3(N+1)$

     Step  4.   Compare  the calculated  value  H  to the  tabulated  chi-squared
value with (K-l) degrees of freedom, where K is the number of groups (Table 1,
Appendix B).   Reject the  null  hypothesis  if  the computed  value  exceeds the
tabulated critical value.

-------
     Step 5.   If the  computed  value exceeds  the  value from  the  chi-squared
table, compute the critical difference for well comparisons to the background,
assumed to be group 1:
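          $C_i = Z_{\alpha/(K-1)} \left[ \frac{N(N+1)}{12} \right]^{1/2} \left[ \frac{1}{n_1} + \frac{1}{n_i} \right]^{1/2}$

for i taking values 2,..., K,
where $Z_{\alpha/(K-1)}$ is the upper (α/(K-1))-percentile from the standard normal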

distribution found in Table 4, Appendix B.  Note:  If there are more than five
compliance  wells  at  the regulated  unit  (K > 6),  use  Z.01,  the  upper  one-
percentile from the standard normal distribution.

     Step 6.   Form the differences of the average ranks for each group to the
background and compare these with the critical values found in step 5 to determine
which wells give evidence of contamination.  That is, compare $\bar{R}_i - \bar{R}_1$ to
$C_i$ for i taking the values 2 through K.  (Recall that group 1 is the
background.)

     While the  above steps  are  the general  procedure,  some details need to be
specified further to  handle special cases.  First,  it  may happen that two or
more observations are numerically  equal  or  tied.  When this occurs, determine
the  ranks that the  tied observations would  have received  if they  had  been
slightly different from  each  other,  but  still  in the same places with respect
to  the  rest of  the observations.   Add  these ranks and divide by the number of
observations tied at that value  to get an average rank.  This average rank is
used for  each  of  the tied observations.   This same  procedure is repeated for
any other groups of  tied observations.   Second,  if there are any values below
detection,  consider  all  values  below detection  as tied  at  zero.    (It  is
irrelevant what number  is assigned to nondetected values  as  long as all  such
values  are  assigned  the  same  number,  and it  is   smaller  than  any detected or
quantified value.)
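
The ranking convention for ties can be sketched as below.  This is an illustrative implementation, not one prescribed by the guidance; the function name `midranks` and the sample values are hypothetical, with nondetects already set to zero as described above.

```python
def midranks(values):
    """Rank values from least to greatest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend 'end' across the run of observations tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Ranks start+1 .. end+1 would have been assigned; use their mean.
        average = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


# Hypothetical values: two nondetects (0.0), three tied at 1.3, two tied at 1.5.
sample = [1.3, 0.0, 1.5, 1.3, 0.0, 1.2, 1.3, 1.5]
print(midranks(sample))
```

The two nondetects share rank 1.5, and the three values of 1.3 share rank 5, the average of ranks 4, 5, and 6.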

     The  effect of  tied observations  is  to  increase  the  value of  the  sta-
tistic,  H.   Unless  there  are many observations  tied  at the  same  value,  the
effect  of ties  on the  computed  test statistic  is negligible (in practice, the
effect  of ties can probably be neglected unless  some group contains 10 percent
of  the  observations all  tied, which is most likely to occur for concentrations
below detection limit).   In the present  context, the term "negligible" can be
more  specifically  defined  as  follows.   Compute  the Kruskal-Wallis statistic
without the adjustment  for  ties.   If the test  statistic is significant at the
5%  level  then  conclude the test  since the  statistic with correction for ties
will be significant  as  well.   If the test statistic falls between  the 10% and
the 5%  critical values,  then proceed with the  adjustment for  ties  as shown
below.

-------
ADJUSTMENT FOR TIES

     If  there are  50%  or  more  observations that  fell  below  the detection
limit, then this method for adjustment for ties is inappropriate.   The user  is
referred to  Section 8 "Miscellaneous Topics."   Otherwise,  if  there are tied
values present in the data, use the following correction for the H statistic:

          $H' = \frac{H}{1 - \sum_{i=1}^{g} T_i / (N^3 - N)}$

where g = the number of groups of distinct tied observations and $T_i = (t_i^3 - t_i)$,
where $t_i$ is the number of observations in the tied group i.  Note that unique
observations can be considered groups of size 1, with the corresponding
$T_i = (1^3 - 1) = 0$.

REFERENCE

Hollander,  Myles,  and  D.  A.  Wolfe.   1973.   Nonparametric  Statistical
Methods.   John Wiley  and Sons,  New York.

EXAMPLE

     The  data  in Table 5-4 represent  benzene  concentrations  in water samples
taken at one background and five compliance wells.

     Step  1.   The  20 observations have  been  ranked from  least  to greatest.
The limit of detection was 1.0 ppm.  Note that two values in Well 4 were below
detection  and  were  assigned  value zero.   These  two  are tied for the smallest
value and  have consequently  been assigned the average  of the two ranks 1 and
2, or  1.5.   The ranks of  the observations are indicated in parentheses after
the observation  in  Table 5-4.   Note  that there are 3 observations tied at 1.3
that would  have had  ranks 4,  5, and  6  if they had  been slightly different.
These three have been assigned the average rank  of 5 resulting from averaging
4, 5, and 6.   Other ties occurred at 1.5  (ranks 7 and 8)  and 1.9 (ranks 11 and
12).

     Step 2.   The values of the sums  of ranks and average ranks are indicated
at the bottom  of Table 5-4.

     Step 3.   Compute the Kruskal-Wallis statistic
          $H = \frac{12}{20(20+1)} \left( \frac{34^2}{4} + \cdots + \frac{35.5^2}{3} \right) - 3(20+1) = 14.68$
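
The Kruskal-Wallis statistic can be checked from the rank sums alone.  In the sketch below the function name `kruskal_wallis_h` is hypothetical, and the rank sums for Wells 2 through 5 are an assumption: they are back-calculated from the average ranks and group sizes used in Steps 5 and 6 of this example (only the first and last terms are written out in the formula above).

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    n_total = sum(sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Wells 1-6; Well 1 is the background well.
sizes = [4, 3, 3, 4, 3, 3]                          # n_i, totalling N = 20
rank_sums = [34.0, 57.0, 15.5, 21.0, 47.0, 35.5]    # R_i = n_i * average rank

h = kruskal_wallis_h(rank_sums, sizes)
print(round(h, 2))
```

As a consistency check, the rank sums must add to N(N+1)/2 = 210 for N = 20 observations.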

-------







     [Table 5-4.  Benzene concentrations, with ranks in parentheses, for one
     background well and five compliance wells.  The individual entries are
     not legible in this copy.  The group sizes, rank sums, and average ranks
     implied by Steps 3, 5, and 6 of this example are:

          Well                n_i     Sum of ranks     Average rank

          1 (background)       4         34.0              8.5
          2                    3         57.0             19.0
          3                    3         15.5              5.17
          4                    4         21.0              5.25
          5                    3         47.0             15.67
          6                    3         35.5             11.83

     K = 6 wells, N = 20 observations.]

-------
ADJUSTMENT FOR TIES

     There are four groups of ties in the data of Table 5-4:
          $T_1 = (2^3 - 2) = 6$     for the 2 observations of 1.9.
          $T_2 = (2^3 - 2) = 6$     for the 2 observations of 1.5.
          $T_3 = (3^3 - 3) = 24$    for the 3 observations of 1.3.
          $T_4 = (2^3 - 2) = 6$     for the 2 observations of 0.

     Thus  $\sum_{i=1}^{4} T_i = 6 + 6 + 24 + 6 = 42$

     and   $H' = \frac{14.68}{1 - 42/(20^3 - 20)} = 14.76$,  a negligible change from 14.68.

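
The adjustment for ties is a one-line correction once the sizes of the tied groups are known.  The sketch below is illustrative only; the function name `tie_corrected_h` is hypothetical, and the inputs are the uncorrected statistic (14.68), the four tied groups of this example, and N = 20.

```python
def tie_corrected_h(h, tie_sizes, n_total):
    """Return H' = H / (1 - sum(t^3 - t) / (N^3 - N)).

    tie_sizes: number of observations in each group of tied values.
    Untied observations are groups of size 1 and contribute zero.
    """
    tie_sum = sum(t ** 3 - t for t in tie_sizes)
    return h / (1.0 - tie_sum / (n_total ** 3 - n_total))


# Tied groups in the example: two pairs, one triple, and the pair of nondetects.
h_adjusted = tie_corrected_h(14.68, [2, 2, 3, 2], 20)
print(round(h_adjusted, 2))
```

The correction divides by a factor only slightly below one, so the statistic rises by less than one percent.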
     Step 4.   To  test the  null  hypothesis  of  no contamination,  obtain the
critical chi-squared value with (6-1) = 5 degrees of freedom at the 5% significance
level from Table 1, Appendix B.  The value is 11.07.  Compare the calculated
value, H', with the tabulated value.  Since 14.76 is greater than
11.07, reject the hypothesis of no contamination at the 5% level.  If the site
was in detection monitoring it should move into compliance monitoring.  If the
site was in compliance monitoring it should move into corrective action.  If
the site was in corrective action it should stay there.

     In the case  where the hydraulically upgradient wells  serve as the back-
ground  against  which the compliance wells  are to  be compared,  comparisons of
each compliance well with the background wells should be performed in addition
to the  analysis of variance procedure.  In  this example, data from each  of the
compliance wells would be compared with the background well data.  This comparison
is accomplished as follows.  The average ranks for each group, $\bar{R}_i$, are
used to compute differences.  If a  group  of  compliance  wells for a regulated
unit have  larger  concentrations than those  found in the background wells, the
average rank for  the  compliance  wells  at  that  unit  will   be larger  than the
average rank for the background wells.

     Step  5.   Calculate  the  critical  values to  compare each compliance well
to the  background  well.

     In this example, K=6, so there are 5 comparisons of the compliance wells
with the background wells.  Using an experimentwise significance level of α =
0.05, we find the upper 0.05/5 = 0.01 percentile of the standard normal
distribution to  be 2.33 (Table 4, Appendix B).   The  total  sample size, N, is
20.   The approximate  critical  value,  C2,  is  computed for compliance Well 2,
which has  the largest  average rank, as:
          $C_2 = 2.32 \left[ \frac{20(21)}{12} \right]^{1/2} \left[ \frac{1}{4} + \frac{1}{3} \right]^{1/2} = 10.5$

The critical values for the other wells are:  10.5 for Wells 3, 5, and 6; and
9.8 for Well 4.

-------
     Step 6.   Compute the  differences  between the average rank  of  each com-
pliance well and the average rank of the background well:


          Differences                      Critical values

     D2 = 19.0  - 8.5 = 10.5                 C2 = 10.5
     D3 = 5.17  - 8.5 = -3.33                C3 = 10.5
     D4 = 5.25  - 8.5 = -3.25                C4 = 9.8
     D5 = 15.67 - 8.5 =  7.17                C5 = 10.5
     D6 = 11.83 - 8.5 =  3.33                C6 = 10.5


Compare each difference with the corresponding critical value.  D2 = 10.5
equals the critical value of C2 = 10.5, so we conclude that the benzene
concentration at compliance Well 2 is significantly greater than that at the
background well.  None of the other compliance wells has a benzene
concentration significantly higher than the average background value.  Based
upon these results, only compliance Well 2 can be singled out as being
contaminated.
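
     The arithmetic of Steps 5 and 6 can be sketched in a few lines of Python.
This is only an illustration under stated assumptions:  the group sizes (4
background observations, 4 at Well 4, and 3 at each other well) are inferred
from the reported critical values, the variable and function names are
hypothetical, and the comparison is made at one-decimal precision with "greater
than or equal to" because the text treats the borderline case D2 = C2 = 10.5 as
significant.

```python
import math

Z_CRIT = 2.33                 # upper 0.01 percentile of the standard normal (Table 4)
N_TOTAL = 20                  # total sample size N
N_BACKGROUND = 4              # assumed number of background observations
BACKGROUND_MEAN_RANK = 8.5    # average rank of the background well

# well name -> (average rank, assumed number of observations)
wells = {
    "Well 2": (19.0, 3),
    "Well 3": (5.17, 3),
    "Well 4": (5.25, 4),
    "Well 5": (15.67, 3),
    "Well 6": (11.83, 3),
}


def critical_value(z, n_total, n_background, n_well):
    """Approximate critical difference in average ranks for one comparison."""
    rank_scale = math.sqrt(n_total * (n_total + 1) / 12)
    size_term = math.sqrt(1 / n_background + 1 / n_well)
    return z * rank_scale * size_term


results = {}
for name, (mean_rank, n_well) in wells.items():
    difference = mean_rank - BACKGROUND_MEAN_RANK
    critical = critical_value(Z_CRIT, N_TOTAL, N_BACKGROUND, n_well)
    # Compare rounded values, mirroring the one-decimal table in Step 6.
    significant = round(difference, 1) >= round(critical, 1)
    results[name] = (difference, critical, significant)

flagged_wells = [name for name, (_, _, sig) in results.items() if sig]
print(flagged_wells)
```

     Because Well 2 sits exactly at the critical value after rounding, a
different rounding of the normal percentile could change the outcome; the
sketch follows the rounding used in the example.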

     For data  sets with more  than  30  observations,  the parametric analysis of
variance performed on the  rank  values is a  good approximation  to the Kruskal-
Wallis test  (Quade,  1966).   If  the user has access to SAS, the PROC RANK pro-
cedure is used to obtain the ranks of the data.  The analysis of variance pro-
cedure detailed  in Section 5.2.1  is  then  performed on  the  ranks.   Contrasts
are tested as  in the parametric analysis of variance.

INTERPRETATION

     The  Kruskal-Wallis  test  statistic  is compared  to  the tabulated critical
value  from  the  chi-squared  distribution.    If the  test statistic  does  not
exceed the  tabulated  value,  there  is  no statistically significant evidence of
contamination  and  the analysis  would stop  and  report  this finding.   If  the
test statistic exceeds the tabulated value, there is significant evidence that
the  hypothesis  of  no differences  in compliance  concentrations  from  the back-
ground  level is not  true.   Consequently,  if the test  statistic  exceeds  the
critical  value,  one  concludes that there is  significant evidence of contami-
nation.   One then  proceeds to investigate where the differences lie, that is,
which wells  are indicating contamination.

     The  multiple  comparisons  procedure  described  in steps 5 and  6 compares
each compliance well to the background well.  This determines which compliance
wells  show statistically significant evidence  of contamination  at an experi-
mentwise  error rate of 5  percent.   In  many cases,  inspection of  the mean or
median concentrations will be sufficient to  indicate where the problem lies.

5.3  TOLERANCE  INTERVALS BASED ON  THE NORMAL DISTRIBUTION

     An alternate approach to analysis of variance to determine whether there
is statistically significant evidence of contamination is to use tolerance
intervals.  A tolerance interval is constructed from the data on
(uncontaminated) background wells.  The concentrations from compliance wells
are then compared with the tolerance interval.  With the exception of pH, if
the compliance concentrations do not fall in the tolerance interval, this
provides statistically significant evidence of contamination.

     Tolerance intervals are most appropriate  for  use at  facilities  that do
not exhibit  high degrees  of spatial variation  between background  wells and
compliance wells.    Facilities  that overlie  extensive, homogeneous geologic
deposits (for example, thick, homogeneous  lacustrine clays) that do not natu-
rally display hydrogeochemical variations may be suitable for this statistical
method of analysis.

     A tolerance interval establishes a concentration range that is
constructed to contain a specified proportion (P%) of the population with a
specified confidence coefficient, γ.  The proportion of the population
included, P, is referred to as the coverage.  The probability with which the
tolerance interval includes the proportion P% of the population is referred to
as the tolerance coefficient.

     A coverage  of 95% is  recommended.   If  this  is used,  random observations
from the same distribution  as  the  background  well  data would exceed the upper
tolerance limit  less  than  5% of  the time.  Similarly, a tolerance coefficient
of 95% is recommended.  This means that one has a confidence level of 95% that
the upper 95% tolerance limit will contain at least 95% of the distribution of
observations from  background well  data.   These values  were chosen to be con-
sistent  with  the performance  standards described  in  Section 2.   The  use of
these values corresponds to the selection of α of 5% in the multiple well
testing situation.

     The procedure can be  applied with as few  as  three observations from the
background distribution.    However,  doing so  would  result in a  large upper
tolerance limit.  A sample size of eight or more results in an adequate
tolerance interval.  The minimum sampling schedule called for in the regulations
would result in  at least four observations from each background well.  Only if
a  single  background  well is sampled at a single point in  time  is the sample
size so small as to make use of the procedure questionable.

     Tolerance  intervals can  be constructed  assuming that the  data  or the
transformed data are normally distributed.   Tolerance intervals  can  also be
constructed assuming  other distributions.   It  is also possible  to construct
nonparametric tolerance intervals using only the assumption that the data came
from  some  continuous  population.    However,  the  nonparametric  tolerance
intervals require  such  a large number  of observations to provide a reasonable
coverage  and  tolerance   coefficient   that   they   are  impractical  in  this
application.

     The range of the concentration data  in the background well samples should
be considered in determining whether the tolerance interval approach should be
used,  and  if  so, what distribution  is  appropriate.   The  background well con-
centration  data  should be  inspected  for outliers  and  tests  of  normality
applied  before selecting the tolerance  interval  approach.   Tests of normality
were presented in  Section  4.2.   Note that in this case, the test of normality
would be applied to the background well data that are used to construct the
tolerance interval.  These data should all be from the same normal
distribution.

     In this application, unless pH  is  being  monitored,  a one-sided tolerance
interval or an upper tolerance limit  is desired,  since contamination is indi-
cated by large concentrations of the  hazardous  constituents monitored.   Thus,
for concentrations,  the  appropriate tolerance  interval  is (0, TL),  with the
comparison of importance being the larger limit, TL.

PURPOSE

     The purpose of the tolerance  interval approach is to define a concentra-
tion range from  background  well  data, within which a large proportion  of the
monitoring observations should fall with high probability.  Once this is done,
data from  compliance wells  can  be checked  for evidence  of  contamination by
simply  determining  whether  they  fall in the tolerance interval.   If they do
not, this is evidence of contamination.

     In this case  the  data are assumed  to be approximately normally distrib-
uted.   Section  4.2 provided methods to  check for  normality.   If  the data are
not normal, take the natural  logarithm  of the data and see if the transformed
data are  approximately normal.   If  so,  this  method can be used  on the loga-
rithms  of  the  data.    Otherwise,  seek  the  assistance  of  a  professional
statistician.

PROCEDURE

     Step 1.   Calculate the mean, X, and  the standard deviation, S, from the
background well data.

     Step 2.   Construct the one-sided upper tolerance limit as

     TL = X + K S,

where K is the one-sided normal tolerance factor found in Table 5, Appendix B.

     Step 3.   Compare each observation from compliance wells to the tolerance
limit found  in  Step 2.  If  any observation  exceeds the tolerance limit, that
is  statistically significant evidence  that  the well  is  contaminated.   Note
that if the  tolerance  interval was constructed  on  the logarithms of the orig-
inal background  observations, the logarithms of the compliance  well observa-
tions should be  compared to the  tolerance limit.   Alternatively the tolerance
limit  may be  transferred  to the  original   data  scale  by taking  the anti-
logarithm.

REFERENCE

Lieberman,  Gerald  J.    1958.    "Tables for  One-sided Statistical  Tolerance
Limits."  Industrial Quality Control.   Vol.  XIV,  No. 10.

EXAMPLE

     Table 5-5 contains  example  data  that represent lead concentration levels
in  parts  per  million  in water samples  at  a hypothetical  facility.   The
background well  data are  in columns  1  and 2,  while the  other four columns
represent compliance well data.


          TABLE 5-5.  EXAMPLE DATA FOR NORMAL TOLERANCE INTERVAL

                         Lead concentrations (ppm)

               Background well             Compliance wells
     Date        A        B       Well 1    Well 2    Well 3    Well 4

     Jan 1      58.0     46.1     273.1*     34.1      49.9     225.9*
     Feb 1      54.1     76.7     170.7*     93.7      73.0     183.1*
     Mar 1      30.0     32.1      32.1      70.8     244.7*    198.3*
     Apr 1      46.1     68.0      53.0      83.1     202.4*    160.8*

     n    =  8
     Mean = 51.4
     SD   = 16.3

     The upper 95% coverage tolerance limit with tolerance coefficient of
     95% is 51.4 + (3.188)(16.3) = 103.4.

     * Indicates contamination

     Step 1.  The mean and standard deviation of the n = 8 observations have
been calculated for the background well.  The mean is 51.4 and the standard
deviation is 16.3.

     Step  2.   The tolerance  factor for a  one-sided normal  tolerance interval
is  found  from Table 5,  Appendix  B as  3.188.   This  is  for  95% coverage with
probability 95%  and  for n =  8.   The  upper tolerance limit  is then calculated
as  51.4 +  (3.188)(16.3)  = 103.4.

     Step 3.  The tolerance limit of 103.4 is compared with the compliance
well data.  Any value that exceeds the  tolerance limit indicates statistically
significant  evidence of contamination.   Two  observations  from  Well  1,  two
observations  from Well 3, and  all four observations from  Well 4  exceed  the
tolerance  limit.   Thus  there is  statistically significant evidence  of con-
tamination at Wells  1, 3, and 4.
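
     The three steps of this example can be sketched in Python as follows.
This is a minimal illustration, not a prescribed implementation:  the
tolerance factor 3.188 is simply copied from Table 5 of Appendix B for n = 8,
the data are the Table 5-5 values, and the function and variable names are
hypothetical.  Because the sketch works from unrounded statistics, its limit
can differ from the hand calculation in the last reported digit.

```python
import statistics

# One-sided normal tolerance factor for n = 8, 95% coverage,
# 95% tolerance coefficient (copied from Table 5, Appendix B).
K_FACTOR = 3.188

# Lead concentrations (ppm) from Table 5-5.
background = [58.0, 54.1, 30.0, 46.1, 46.1, 76.7, 32.1, 68.0]
compliance = {
    "Well 1": [273.1, 170.7, 32.1, 53.0],
    "Well 2": [34.1, 93.7, 70.8, 83.1],
    "Well 3": [49.9, 73.0, 244.7, 202.4],
    "Well 4": [225.9, 183.1, 198.3, 160.8],
}


def upper_tolerance_limit(data, k_factor):
    """Return TL = mean + K * S, with S the sample standard deviation."""
    return statistics.mean(data) + k_factor * statistics.stdev(data)


tolerance_limit = upper_tolerance_limit(background, K_FACTOR)

# Step 3: list the observations at each compliance well that exceed TL.
exceedances = {
    well: [x for x in values if x > tolerance_limit]
    for well, values in compliance.items()
}
contaminated = [well for well, hits in exceedances.items() if hits]

print(round(tolerance_limit, 1), contaminated)
```

     If the background data had been log-transformed, the same function would
be applied to the logarithms and the compliance observations would be
transformed before the comparison.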

INTERPRETATION

     A tolerance limit with 95% coverage gives  an  upper bound below which 95%
of the  observations  of the distribution  should fall.   The  tolerance coeffi-
cient used here is 95%,  implying  that  at  least  95% of the observations should
fall  below  the  tolerance limit with  probability 95%,  if  the compliance well
data  come from the same  distribution  as the  background  data.  In other words,
in this example, we are 95% certain that 95% of the background lead
concentrations are below 103.4 ppm.  If observations exceed the tolerance
limit, this is
evidence that the compliance well  data are not from the  same distribution, but
rather  are  from a distribution  with  higher  concentrations.  This  is inter-
preted as statistically significant evidence of  contamination.

5.4  PREDICTION INTERVALS

     A prediction interval is a statistical interval calculated to include one
or more  future  observations  from  the same population with a specified confi-
dence.   This approach  is algebraically equivalent  to  the  average  replicate
(AR)   test  that is presented  in  the  Technical   Enforcement  Guidance Document
(TEGD),  September  1986.    In  ground-water monitoring,  a  prediction interval
approach may  be used  to make comparisons  between background and compliance
well   data.    This  method  of  analysis  is  similar  to that  for  calculating  a
tolerance limit, and  familiarity with prediction intervals or personal prefer-
ence would be the only reason for selecting them over the method  for tolerance
limits.  The concentrations of a hazardous constituent in the background wells
are used to establish  an interval  within which  K future observations from the
same population are expected to lie with a specified confidence.   Then each of
K  future observations of  compliance  well  concentrations  is compared  to the
prediction  interval.    The interval is constructed  to contain all  of K future
observations  with  the stated confidence.   If any  future  observation exceeds
the prediction interval, this is statistically significant evidence of contam-
ination.   In  application, the number of future  observations to  be collected,
K, must be  specified.    Thus,  the prediction  interval  is  constructed  for  a
specified time period in the future.  One year is suggested.   The  interval can
be constructed  either to contain  all  K individual  observations  with a speci-
fied  probability,  or  to contain  the  K1   means  observed  at the  K1  sampling
periods.

     The prediction  interval  presented here  is  constructed  assuming that the
background  data  all  follow the same normal distribution.   If that is not the
case  (see  Section 4.2   for  tests  of  normality),  but   a  log  transformation
results  in data that are adequately normal  on the log scale, then the interval
may  still  be used.   In  this  case, use the data after  transforming  by taking
the  logarithm.   The  future observations  need  to also be transformed by taking
logarithms  before  comparison  to the interval.   (Alternatively,  the end points
of the  interval  could  be converted back  to the  original scale by taking their
anti-logarithms.)

PURPOSE

     The prediction  interval  is constructed so  that  K  future compliance well
observations can be tested by determining whether they lie in the interval or
not.  If not, evidence of contamination is found.  Note that the number of
future observations, K, for which the interval  is to  be  used,  must  be speci-
fied  in advance.   In practice,  an  owner or operator would  need to  construct
the prediction interval on  a periodic  (at  least yearly)  basis,  using the most
recent background  data.   The  interval  is  described using the  95% confidence
factor appropriate for individual well comparisons.   It  is  recommended that a
one-sided  prediction interval be constructed for the mean of the four observa-
tions from each compliance well at each sampling period.

PROCEDURE

     Step 1.   Calculate the mean,  X,  and the  standard deviation, S,  for the
background well data (used to form the prediction interval).

     Step 2.   Specify the number of future observations for a compliance well
to be included in the interval, K.  Then the interval is given by
$$\left[0,\ \bar{X} + S\sqrt{\frac{1}{m} + \frac{1}{n}}\; t_{(n-1,\,K,\,0.95)}\right]$$

where it is assumed that the mean of the m observations taken at each of the K
sampling periods will be used.  Here n is the number of observations in the
background data, and t_(n-1, K, 0.95) is found from Table 3 in Appendix B.  The
table is entered with K as the number of future observations, and degrees of
freedom, v = n-1.  If K > 5, use the column for K = 5.

     Step 3.   Once the interval has been calculated, at each sampling period,
the mean of the m compliance well observations is obtained.  This mean is com-
pared  to  see  if it falls in  the interval.   If  it  does,  this  is reported and
monitoring continues.   If a mean concentration  at  a sampling  period does not
fall in the prediction interval, this is statistically significant evidence of
contamination.  This  is also reported and the appropriate action taken.

REMARK

     For a single future observation, t is given by the t-distribution found
in Table 6 of Appendix B.  In general, the interval to contain K future means
of sample size m each is given by

$$\left[0,\ \bar{X} + S\sqrt{\frac{1}{m} + \frac{1}{n}}\; t_{(n-1,\,K,\,0.95)}\right]$$

where t is as before from Table 3 of Appendix B and where m is the number of
observations in each mean.  Note that for K single observations, m = 1, while
for the mean of four samples from a compliance well, m = 4.

     Note, too, that the prediction intervals are one-sided, giving a value
that should not be exceeded by the future observations.  The 5% experimentwise
significance level is used with the Bonferroni approach.  However, to ensure
that the significance level for the individual comparisons does not go below
1%, α/K is restricted to be 1% or larger.  If more than five comparisons are
made, the comparisonwise significance level of 1% is used, implying that the
experimentwise level may exceed 5%.
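
     The restriction on the comparisonwise level can be written as a one-line
rule.  The sketch below is an illustration only; the function names are
hypothetical, and the experimentwise figure it reports is the usual Bonferroni
upper bound (K times the comparisonwise level), not an exact error rate.

```python
def comparisonwise_alpha(k, experimentwise_alpha=0.05, floor=0.01):
    """Bonferroni level alpha/K, restricted to be no smaller than the floor."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return max(experimentwise_alpha / k, floor)


def experimentwise_bound(k, experimentwise_alpha=0.05, floor=0.01):
    """Bonferroni upper bound on the experimentwise level for K comparisons."""
    return min(1.0, k * comparisonwise_alpha(k, experimentwise_alpha, floor))


for k in (1, 2, 5, 8):
    print(k, comparisonwise_alpha(k), experimentwise_bound(k))
```

     For K of five or fewer the bound stays at 5%; beyond that the 1% floor
takes over and the bound grows with K.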

EXAMPLE

     Table 5-6 contains  chlordane  concentrations  measured at  a  hypothetical
facility.  Twenty-four  background observations are available  and  are used to
develop  the  prediction  interval.    The  prediction interval  is  applied to  K=2
sampling periods with m=4 observations at a single compliance well each.

     Step  1.   Find the mean and  standard deviation of the 24 background well
measurements.  These are 101 and 11, respectively.

     Step 2.   There are K = 2  future observations of means of 4  observations
to be included in  the prediction  interval.   Entering  Table 3 of Appendix B at
K  =  2  and 20 degrees  of freedom (the nearest  entry to  the 23  degrees of
freedom), we find t_(20, 2, 0.95) = 2.09.  The interval is given by

$$\left[0,\ 101 + (11)(2.09)\left(\frac{1}{4} + \frac{1}{24}\right)^{1/2}\right] = [0,\ 113.4]$$


     Step 3.  The mean of the four compliance well observations at each of
sampling periods one and two is found and compared with the interval found in
Step 2.  The mean for the first sampling period is 122 and that for the second
sampling period is 113.  The first of these exceeds the upper limit of the
prediction interval for two means based on samples of size 4.  This is
statistically significant evidence of contamination and should be reported to
the Regional Administrator.  Since the second sampling period mean is within
the prediction interval, the Regional Administrator may allow the facility to
remain in its current stage of monitoring.
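
     The calculation in Steps 2 and 3 can be sketched in Python as shown
below.  This is an illustrative sketch under stated assumptions:  the t value
of 2.09 is copied from Table 3 of Appendix B rather than computed, the
background mean and standard deviation are the rounded values from Step 1, and
the function and variable names are hypothetical.

```python
import math
import statistics


def upper_prediction_limit(mean, sd, n, m, t_value):
    """Upper limit for future means of m observations, given n background values."""
    return mean + sd * math.sqrt(1 / m + 1 / n) * t_value


# Rounded background summary statistics from Step 1 and the tabled t value.
limit = upper_prediction_limit(mean=101, sd=11, n=24, m=4, t_value=2.09)

# Compliance well observations (ppb) from Table 5-6, by sampling period.
periods = {
    "July 1, 1986": [123, 120, 116, 128],
    "October 1, 1986": [116, 117, 119, 101],
}
period_means = {date: statistics.mean(values) for date, values in periods.items()}

# A period mean above the limit is significant evidence of contamination.
exceeds = {date: mean > limit for date, mean in period_means.items()}

print(round(limit, 1), period_means, exceeds)
```

     Setting m = 1 in the same function gives the limit for K individual
future observations rather than K sampling-period means.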

INTERPRETATION

     A  prediction   interval is  a  statistical interval constructed  from back-
ground  sample  data to contain  a  specified number  of  future observations from
the same distribution with  specified  probability.   That is,  the  prediction
interval is constructed so as to have a 95% probability of containing the next
K  sampling period  means,  provided that  there is no  contamination.   If  the
future  observations are  found  to  be in  the  prediction interval,  this is evi-
dence that there has  been  no change at  the  facility  and that no contamination
is  occurring.    If  the  future  observation  falls outside  of  the  prediction
interval,  this is  statistical evidence  that  the new  observation does not come
from  the same distribution, that  is,  from  the population  of uncontaminated
water samples  previously sampled.   Consequently,  if  the observation is a con-
centration  above the prediction  interval's  upper limit,  it  is statistically
significant evidence of contamination.

TABLE 5-6.  EXAMPLE DATA FOR PREDICTION INTERVAL--CHLORDANE LEVELS

     Background well data--Well 1

     Sampling date         Chlordane concentration (ppb)

     January 1, 1985          97     103     104      85
     April 1, 1985           120     105     104     108
     July 1, 1985            110      95     102      78
     October 1, 1985         105      94     110     111
     January 1, 1986          80     106     115     105
     April 1, 1986           100      93      89     113

     n = 24     Mean = 101     SD = 11

     Compliance well data--Well 2

     Sampling date         Chlordane concentration (ppb)

     July 1, 1986            123     120     116     128
                             m = 4     Mean = 122     SD = 5

     October 1, 1986         116     117     119     101
                             m = 4     Mean = 113     SD = 8

     The prediction interval could be constructed in several ways.  It can be
developed for means of observations at each sampling period, or for each
individual observation at each sampling period.

     It should also be noted that the estimate of the standard deviation, S,
that is used should be an unbiased estimator.  The usual estimator, presented
above, assumes that there is only one source of variation.  If there are other
sources of variation, such as time effects, or spatial variation in the data
used for the background, these should be included in the estimate of the
variability.  This can be accomplished by use of an appropriate
analysis-of-variance model to include the other factors affecting the
variability.  Determination of the components of variance in complicated
models is beyond the scope of this document and requires consultation with a
professional statistician.

REFERENCE

Hahn, G. and Wayne Nelson.  1973.  "A Survey of Prediction Intervals and Their
Applications."  Journal of Quality Technology.  5:178-188.

                                  SECTION  6

                         COMPARISONS WITH MCLs OR ACLs
     This section  includes  statistical procedures appropriate  when the moni-
toring aims  at determining  whether ground-water concentrations  of hazardous
constituents are below or above fixed concentration limits.  In this situation
the maximum  concentration limit (MCL) or  alternate  concentration limit (ACL)
is a  specified  concentration limit rather than  being  determined  by the back-
ground well  concentrations.    Thus  the applicable statistical  procedures  are
those that compare the  compliance  well  concentrations  estimated from sampling
with  the  prespecified fixed  limits.   Methods  for comparing  compliance well
concentrations  to  a  (variable)  background  concentration  were  presented  in
Section 5.

     The methods applicable  to  the  type  of comparisons described  in this sec-
tion include confidence  intervals  and  tolerance  intervals.   A special section
deals with cases where the observations exhibit very small or no variability.

6.1  SUMMARY CHART FOR COMPARISON WITH MCLs OR ACLs

     Figure  6-1  is a flow chart  to aid  the user in selecting  and  applying a
statistical method when the permit  specifies an MCL or ACL.

     As with each  type  of comparison,  a  determination  is made first to see if
there are enough data for intra-well comparisons.  If so, these should be done
in parallel with the other comparisons.

     Here, whether the compliance limit is a maximum concentration  limit (MCL)
or an  alternate  concentration limit (ACL), the  recommended  procedure to com-
pare  the  mean compliance well  concentration  against the  compliance  limit  is
the  construction of  a  confidence  interval.   This  approach is  presented  in
Section 6.2.1.   Section 6.2.2 adds a special case of  limited variance in the
data.   If the permit requires  that a compliance limit  is not  to be exceeded
more than a specified fraction of the time, then the construction of tolerance
limits is the recommended procedure, discussed in Section 6.2.3.

6.2  STATISTICAL PROCEDURES

     This section presents the statistical procedures appropriate for
comparison of ground-water monitoring data to a constant compliance limit, a
fixed standard.  The interpretation of the fixed compliance limit (MCL or ACL)
is that the mean concentration should not exceed this fixed limit.  An
alternate interpretation may be specified.  The permit could specify a
compliance limit as a concentration not to be exceeded by more than a small,
specified proportion of the observations.  A tolerance interval approach for
such a situation is also presented.

[Figure 6-1.  Comparisons with MCLs/ACLs.  Flow chart nodes:  Comparisons with
MCL/ACLs (Section 6); Intra-Well Comparisons if More than 1 Yr of Data, Control
Charts (Section 7); Type of Comparison (with Upper 95th Percentile); Confidence
Intervals; Tolerance Limits; Are Data Normal?; Are Log Data Normal?; Enough
Data Available?; Nonparametric Confidence Intervals; Consult with Professional
Statistician; Conclusions.]

6.2.1  Confidence Intervals

     When a regulated unit is in compliance monitoring with a fixed compliance
limit (either an MCL or an ACL), confidence intervals are the recommended pro-
cedure pursuant to §264.97(h)(5)  in  the  Subpart  F  regulations.   The unit will
remain in compliance monitoring unless there is statistically significant evi-
dence that  the mean  concentration  at one  or  more of the  downgradient  wells
exceeds the compliance  limit.   A confidence interval  for  the mean concentra-
tion  is  constructed  from the  sample data for each  compliance  well individu-
ally.  These confidence  intervals are  compared with  the  compliance limit.  If
the entire confidence interval  exceeds the compliance limit, this is statisti-
cally significant evidence that  the  mean concentration exceeds  the compliance
limit.

     Confidence intervals can generally be constructed for any specified
distribution.  General methods can be found in texts on statistical inference,
some of which are referenced in Appendix C.  A confidence interval based on the
normal distribution is presented first, followed by a modification for the
log-normal distribution.  A nonparametric confidence interval is also
presented.

6.2.1.1  Confidence Interval Based on the Normal  Distribution

PURPOSE

      The  confidence  interval  for the  mean concentration is  constructed from
the  compliance  well  data.   Once the  interval  has  been constructed, it can be
compared with  the MCL  or ACL by inspection to  determine  whether the mean con-
centration significantly exceeds the MCL or ACL.

PROCEDURE

      Step 1.   Calculate  the  mean, X, and standard deviation, S, of the sample
concentration values.  Do this  separately for each compliance well.

      Step 2.  For each well calculate the confidence interval as

$$\bar{X} \pm t_{(0.99,\, n-1)}\, S/\sqrt{n}$$

where t_(0.99, n-1) is obtained from the t-table (Table 6, Appendix B).
Generally, there will be at least four observations at each sampling period,
so t will usually have at least 3 degrees of freedom.

      Step 3.  Compare the intervals calculated in Step 2 to the compliance
limit (the MCL or ACL, as appropriate).  If the compliance limit is contained
in the interval or is above the upper limit, the unit remains in compliance.
If any well confidence interval's lower limit exceeds the compliance limit,
this is statistically significant evidence of contamination.

REMARK

     The 99th  percentile  of  the  t-distribution is  used  in  constructing  the
confidence interval.  This is consistent  with an  alpha (probability of Type I
error) of  0.01, since  the  decision  on  compliance  is  made  by  comparing  the
lower  confidence  limit to  the MCL  or ACL.   Although the  interval as  con-
structed with  both  upper  and lower  limits is a 98% confidence  interval,  the
use  of  it  is  one-sided,  which  is   consistent  with  the  1%  alpha level  of
individual well comparisons.

EXAMPLE

     Table 6-1  lists hypothetical concentrations of  Aldicarb in  three  compli-
ance wells.   For  illustration purposes,  the MCL for Aldicarb  has  been  set at
7 ppb.  There is no evidence of nonnormality, so the confidence interval based
on the normal distribution is used.

      TABLE 6-1.  EXAMPLE  DATA FOR NORMAL CONFIDENCE INTERVAL—ALDICARB
                  CONCENTRATIONS  IN COMPLIANCE WELLS (ppb)
                   Sampling
                     date           Well  1          Well  2           Well  3
                    Jan.  1           19.9             23.7              5.6
                    Feb.  1           29.6             21.9              3.3
                    Mar.  1           18.7             26.9              2.3
                    Apr.  1           24.2             26.1              6.9
                           X  =       23.1             24.6              4.5
                           S  =        4.9              2.3              2.1
    MCL = 7 ppb
     Step 1.  Calculate  the  mean  and  standard  deviation of the concentrations
for each compliance well.  These statistics are shown in the table above.

     Step 2.  Obtain  the 99th  percentile  of  the t-distribution with (4-1) = 3
degrees of freedom from Table 6, Appendix B as 4.541.  Then calculate the con-
fidence interval for each well's mean concentration.

               Well 1:  $23.1 \pm 4.541(4.9)/\sqrt{4} = (12.0,\ 34.2)$

               Well 2:  $24.6 \pm 4.541(2.3)/\sqrt{4} = (19.4,\ 29.8)$

               Well 3:  $4.5 \pm 4.541(2.1)/\sqrt{4} = (-0.3,\ 9.3)$

where the usual convention of expressing the upper and lower limits of the
confidence interval in parentheses separated by a comma has been followed.

     Step 3.  Compare each confidence interval to the MCL of 7 ppb.  When this
is done, the confidence  interval  for  Well  1  lies entirely above the MCL of 7,
indicating  that  the mean  concentration of Aldicarb  in Well  1 significantly
exceeds the MCL.   Similarly,  the  confidence  interval  for Well 2 lies entirely
above the MCL of 7.   This  is significant evidence that the mean concentration
in Well 2 exceeds the MCL.  However, the confidence interval for Well 3
contains the MCL; its lower limit is below 7.  Thus, there is no statistically
significant evidence that the mean concentration in Well 3 exceeds the MCL.
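
     The example can be reproduced with a short Python sketch.  It is offered
as an illustration only:  the t value 4.541 is copied from Table 6 of Appendix
B for 3 degrees of freedom, and the function and variable names are
hypothetical.  Working from unrounded means and standard deviations, the
limits may differ slightly from the hand-calculated ones, but the comparison
with the MCL is the same.

```python
import math
import statistics

T_99_DF3 = 4.541   # 99th percentile of t with 3 degrees of freedom (Table 6)
MCL = 7.0          # illustrative compliance limit for Aldicarb, ppb

# Aldicarb concentrations (ppb) from Table 6-1.
wells = {
    "Well 1": [19.9, 29.6, 18.7, 24.2],
    "Well 2": [23.7, 21.9, 26.9, 26.1],
    "Well 3": [5.6, 3.3, 2.3, 6.9],
}


def confidence_interval(data, t_value):
    """Return (lower, upper) limits: mean +/- t * S / sqrt(n)."""
    mean = statistics.mean(data)
    half_width = t_value * statistics.stdev(data) / math.sqrt(len(data))
    return mean - half_width, mean + half_width


intervals = {well: confidence_interval(data, T_99_DF3) for well, data in wells.items()}

# Only a lower limit above the MCL is significant evidence of an exceedance.
exceeds_mcl = {well: lower > MCL for well, (lower, _) in intervals.items()}

for well, (lower, upper) in intervals.items():
    print(well, round(lower, 1), round(upper, 1), exceeds_mcl[well])
```

     The t value must match the number of observations at each well; with
other sample sizes the corresponding entry of the t-table would be used.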

INTERPRETATION

     The confidence interval is an interval constructed so that  it should con-
tain  the  true  or  population mean  with  specified   confidence  (98% in  this
case).  If this interval does not contain the compliance limit, then the mean
concentration differs significantly from the compliance limit.  If the lower
end of the interval is above the compliance limit, then the mean concentration
is significantly greater than the compliance limit, indicating noncompliance.

6.2.1.2  Confidence Interval for Log-Normal Data

PURPOSE

     The purpose of  a confidence interval for the mean concentration of log-
normal  data  is  to  determine  whether  there   is  statistically  significant
evidence that  the  mean  concentration  exceeds a fixed  compliance  limit.   The
interval  gives  a  range  that   includes  the true   mean  concentration  with
confidence  98%.   The lower limit will  be  below  the  true mean with confidence
99%, corresponding to an alpha of 1%.

PROCEDURE

     This  procedure  is used  to construct  a confidence  interval  for the mean
concentration from the compliance well data when the data are log-normal (that
is,  when the  logarithms of  the  data  are normally  distributed).    Once  the
interval  has been  constructed,  it can be compared  with the  MCL or  ACL by
inspection  to  determine  whether the mean  concentration significantly exceeds
the  MCL or  ACL.   Throughout the  following procedures  and  examples, natural
logarithms (ln) are used.

     Step  1.  Take  the  natural  logarithm of each data point  (concentration
measurement).  Also, take the natural logarithm of the compliance limit.

     Step  2.  Calculate the sample mean  and standard deviation of  the  log-
transformed  data from each compliance well.   (This  is  Step  1 of the previous
section, working now with  logarithms.)

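     Step 3.  Form the confidence interval for each compliance well as

               $$\bar{X} \pm t_{(0.99,\, n-1)} \, S/\sqrt{n}$$

where $\bar{X}$ and $S$ are the mean and standard deviation of the log-transformed
data from Step 2, and $t_{(0.99,\, n-1)}$ is from the t-distribution in Table 6 of
Appendix B.  Here t will typically have 3 degrees of freedom.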

     Step  4.   Compare  the  confidence  intervals  found   in  Step  3   to  the
logarithm of the compliance limit found  in Step 1.  If the lower limit of the
confidence interval lies entirely above the logarithm of the compliance limit,
there is  statistically  significant  evidence that  the unit  is  out  of  compli-
ance.  Otherwise, the unit is in compliance.

EXAMPLE

     Table 6-2 contains EDB concentration  data from three  compliance wells at
a hypothetical site.  The MCL is assumed to be 20 ppb.  For demonstration pur-
poses,  the  data  are  assumed   not   normal;   a  natural   log-transformation
normalized them adequately.  The lower part of the table contains the natural
logarithms of the concentrations.

      TABLE 6-2.  EXAMPLE DATA FOR LOG-NORMAL CONFIDENCE INTERVAL: EDB
                   CONCENTRATIONS IN COMPLIANCE WELLS (ppb)

             Sampling
               date           Well 1          Well 2           Well 3

                                         Concentrations
              Jan. 1           24.2            39.7             55.7
              Apr. 1           10.2            75.7             17.0
              Jul. 1           17.4            60.2             97.8
              Oct. 1           39.7            10.9             25.3

                      X =      22.9            46.6             49.0
                      S =      12.6            28.0             36.6

    MCL = 20 ppb

                                    Natural log concentrations
              Jan. 1            3.19            3.68            4.02
              Apr. 1            2.32            4.33            2.84
              Jul. 1            2.85            4.10            4.58
              Oct. 1            3.68            2.39            3.23

                      X =       3.01            3.62            3.67
                      S =       0.57            0.86            0.78

    ln(MCL) = 3.00

     Step 1.   The logarithms of  the  data are used to  calculate  a confidence
interval.   Take  the  natural log  of the concentrations  in  the  top  part  of
Table 6-2 to find the values given in the lower part of the table.  For
example, ln(24.2) = 3.19, ..., ln(25.3) = 3.23.  Also, take the logarithm of the
MCL to find that ln(20)  = 3.00.

     Step 2.  Calculate  the  mean  and  standard deviation of the log concentra-
tions for each compliance well.   These are shown in the table.

     Step 3.  Form the confidence intervals for each compliance well.


               Well 1:  $3.01 \pm 4.541(0.57)/\sqrt{4} = (1.72, 4.30)$

               Well 2:  $3.62 \pm 4.541(0.86)/\sqrt{4} = (1.67, 5.57)$

               Well 3:  $3.67 \pm 4.541(0.78)/\sqrt{4} = (1.90, 5.44)$


where 4.541  is  the value  obtained  from  the t-table (Table 6 in Appendix B)  as
in the previous example.

     Step 4.  Compare the individual well confidence intervals with the MCL
(expressed on the log scale).  The natural log of the MCL of 20 ppb is 3.00.
None of the individual well confidence intervals for the mean has a lower
limit that exceeds this value, so none of the individual well mean
concentrations significantly exceeds the MCL.

     Note:   The  lower  and  upper limits  of  the confidence  interval  for each
well's mean  concentration could  be converted  back  to the original  scale  by
taking antilogs.  For example, on the original scale, the confidence intervals
would be:


               Well 1:  (exp(1.72), exp(4.30)) or (5.58, 73.70)

               Well 2:  (exp(1.67), exp(5.57)) or (5.31, 262.43)

               Well 3:  (exp(1.90), exp(5.44)) or (6.69, 230.44)


These  limits could be compared directly  with the  MCL of 20 ppb.   It  is gen-
erally easier  to  take the logarithm  of  the  MCL rather than the antilogarithm
of all of the intervals for comparison.
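
     The four steps of this example can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function names are
hypothetical, and the t-value (4.541 for n = 4) is taken from the t-table and
passed in by the user rather than computed.

```python
import math
import statistics


def lognormal_mean_ci(concentrations, t_value):
    """Two-sided confidence interval for the mean of the log-transformed data.

    t_value is the t-table entry for 0.99 and n-1 degrees of freedom
    (4.541 when n = 4); it is supplied by the caller, not computed here.
    """
    logs = [math.log(c) for c in concentrations]           # Step 1
    mean_log = statistics.mean(logs)                       # Step 2
    sd_log = statistics.stdev(logs)                        # sample SD, n-1 divisor
    half_width = t_value * sd_log / math.sqrt(len(logs))   # Step 3
    return mean_log - half_width, mean_log + half_width


def exceeds_limit(concentrations, limit, t_value):
    """Step 4: True only when the lower limit lies above ln(limit)."""
    lower, _ = lognormal_mean_ci(concentrations, t_value)
    return lower > math.log(limit)


well_1 = [24.2, 10.2, 17.4, 39.7]    # Table 6-2, Well 1 (ppb)
lower, upper = lognormal_mean_ci(well_1, t_value=4.541)
print(lower, upper)                   # limits on the log scale
print(math.exp(lower), math.exp(upper))   # limits on the original scale
print(exceeds_limit(well_1, limit=20.0, t_value=4.541))
```

Because the sketch carries full precision through each step, its limits can
differ in the second decimal place from the hand calculation above, which
uses the rounded mean and standard deviation from Table 6-2.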

INTERPRETATION

      If  the original data  are  not normal,  but  the  log-transformation ade-
quately  normalizes the data, the  confidence  interval  (on the log  scale) is  an
interval  constructed  so that the  lower confidence limit  should  be less than
the  true or  population mean  (on the log scale) with specified confidence (99%
in this case).  If the lower end of the confidence interval exceeds the appro-
priate compliance limit, then the mean  concentration must  exceed that compli-
ance  limit.    These  results provide  statistically  significant evidence  of
contamination.

6.2.1.3  Nonparametric Confidence Interval

     If the data  do  not adequately follow the normal  distribution  even after
the logarithm transformation, a  nonparametric confidence interval  can be con-
structed.   This  interval  is for the  median concentration  (which  equals the
mean if the distribution is symmetric).  The nonparametric  confidence interval
is generally wider and  requires more data than the  corresponding  normal dis-
tribution  interval,  and  so the  normal  or  log-normal distribution  interval
should be used whenever it is appropriate.  It requires a minimum of seven (7)
observations  in  order to  construct  an  interval  with  a two-sided  confidence
coefficient of  98%,   corresponding  to  a  one-sided  confidence  coefficient  of
99%.   Consequently,   it  is  applicable  only for  the  pooled  concentration  of
compliance wells  at a single point  in  time  or for special  sampling to produce
a minimum of seven observations at a single well  during the sampling period.

PURPOSE

     The nonparametric confidence interval is used when the raw data have been
found  to  violate the  normality assumption,  a  log-transformation  fails  to
normalize  the  data,  and no  other  specific  distribution is  assumed.   It pro-
duces  a  simple confidence  interval  that is designed  to  contain the  true  or
population median concentration with specified confidence (here 99%).  If this
confidence interval  contains the compliance  limit,  it is  concluded  that the
median  concentration  does  not  differ  significantly  from  the  compliance
limit.   If the interval's  lower limit exceeds the compliance  limit,  this  is
statistically significant  evidence that  the  concentration  exceeds  the compli-
ance limit and the unit is out of compliance.

PROCEDURE

     Step 1.  Within each compliance well, order the n data from least to
greatest, denoting the ordered data by X(1), ..., X(n), where X(i) is the ith
value in the ordered data.

     Step 2.  Determine the critical values of the order statistics as
follows.  If the minimum of seven observations is used, the critical values are
1 and 7.  Otherwise, find the smallest integer, M, such that the cumulative
binomial distribution with parameters n (the sample size) and p = 0.5 is at
least 0.99.  Table 6-3 gives the values of M and n+1-M together with the exact
confidence coefficient for sample sizes from 4 to 11.  For larger samples,
take as an approximation the nearest integer value to

               $$M = n/2 + 1 + Z_{0.99}\sqrt{n/4}$$

where $Z_{0.99}$ is the 99th percentile from the normal distribution (Table 4,
Appendix B) and equals 2.33.

               TABLE 6-3.  VALUES OF M AND n+1-M AND CONFIDENCE
                        COEFFICIENTS FOR SMALL SAMPLES

                                                    Two-sided
                    n          M        n+1-M       confidence
                    4          4          1           87.5%
                    5          5          1           93.8%
                    6          6          1           96.9%
                    7          7          1           98.4%
                    8          8          1           99.2%
                    9          9          1           99.6%
                   10          9          2           97.9%
                   11         10          2           98.8%

     Step 3.  Once M has been determined in Step 2, find n+1-M and take as the
confidence limits the order statistics, X(M) and X(n+1-M).  (With the minimum
seven observations, use X(1) and X(7).)

     Step 4.  Compare the confidence limits found in Step 3 to the compliance
limit.  If the lower limit, X(n+1-M), exceeds the compliance limit, there is
statistically significant evidence of contamination.  Otherwise, the unit remains
in compliance.
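
     The confidence coefficients in Table 6-3 and the large-sample
approximation for M can be checked with a short Python sketch.  The function
names are hypothetical, and the sketch assumes a continuous distribution, so
that the number of observations below the median is binomial with p = 0.5.

```python
import math


def binom_cdf_half(n, k):
    """Return P(B <= k) for B ~ Binomial(n, 0.5)."""
    if k < 0:
        return 0.0
    return sum(math.comb(n, i) for i in range(min(k, n) + 1)) / 2 ** n


def two_sided_confidence(n, m):
    """Exact confidence that the interval [X(n+1-M), X(M)] covers the median."""
    lower_rank = n + 1 - m
    return 1.0 - 2.0 * binom_cdf_half(n, lower_rank - 1)


def one_sided_alpha(n, m):
    """Chance that the lower limit X(n+1-M) falls above the true median."""
    return binom_cdf_half(n, n - m)


def approximate_m(n, z=2.33):
    """Large-sample approximation from Step 2 (z = 2.33 for the 99th percentile)."""
    return round(n / 2 + 1 + z * math.sqrt(n / 4))


# (n, M) pairs as listed in Table 6-3
for n, m in [(4, 4), (7, 7), (10, 9), (11, 10)]:
    print(n, m, n + 1 - m, f"{100 * two_sided_confidence(n, m):.1f}%")
```

With n = 7 and M = 7 the one-sided level is 1/128, or about 0.8%, which is
why seven observations are the minimum for a 1% test; with n = 4 it is
1/16, or 6.25%.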

REMARK

     The nonparametric  confidence interval procedure requires at least seven
observations in order to obtain a (one-sided) significance level  of 1% (confi-
dence of  99%).    This  means that data from two  (or more) wells or sampling
periods  would  have  to  be  pooled to  achieve  this  level.   If only  the four
observations from  one  well  taken at  a single sampling period were  used,  the
one-sided significance  level would be  6.25%.   This  would also  be  the false
alarm rate.

     Ties do not affect the  procedure.   If there  are ties, order the observa-
tions as  before,  including  all of  the tied values  as  separate  observations.
That  is,  each of  the  observations  with  a  common value  is included  in  the
ordered list (e.g., 1, 2,  2, 2, 3, 4,  etc.).  For ties,  use the average of the
tied ranks as in Section 5.2.2, Step 1 of the example.  The ordered statistics
are  found  by counting  positions  up from  the bottom of  the list as before.
Multiple values from separate observations are counted separately.

EXAMPLE

     Table 6-4 contains concentrations of T-29 in parts per million from two
hypothetical compliance wells.  The data are assumed to consist of four
samples taken each quarter for a year, so that sixteen observations are
available from each well.  The data are not normally distributed, either as
raw data or when log transformed.  Thus, the nonparametric confidence interval
is used.  The MCL is taken to be 15 ppm.

             TABLE 6-4.  EXAMPLE DATA FOR NONPARAMETRIC CONFIDENCE
                      INTERVAL: T-29 CONCENTRATIONS (ppm)

                               Well 1                      Well 2
        Sampling      Concentration               Concentration
          date            (ppm)        Rank           (ppm)        Rank
         Jan. 1            3.17         (2)            3.52         (6)
                           2.32         (1)           12.32        (15)
                           7.37        (11)            2.28         (4)
                           4.44         (6)            5.30         (7)

         Apr. 1            9.50        (13)            8.12        (11)
                          21.36        (16)            3.36         (5)
                           5.15         (7)           11.02        (14)
                          15.70        (15)           35.05        (16)

         Jul. 1            5.58         (8)            2.20         (3)
                           3.39         (3)            0.00       (1.5)
                           8.44        (12)            9.30        (12)
                          10.25        (14)           10.30        (13)

         Oct. 1            3.65         (4)            5.93         (8)
                           6.15         (9)            6.39         (9)
                           6.94        (10)            0.00       (1.5)
                           3.74         (5)            6.53        (10)

     Step  1.   Order  the  16 measurements  from  least  to greatest within each
well separately.   The  numbers  in  parentheses beside  each concentration  in
Table 6-4 are the ranks or  order of the observation.   For example,  in Well  1,
the smallest  observation  is 2.32, which has  rank  1.   The  second smallest  is
3.17,  which  has  rank 2,  and so  forth,  with the largest observation  of  21.36
having rank 16.

     Step  2.   The sample  size  is large enough  so  that the approximation  is
used to find M.


                   $$M = 16/2 + 1 + 2.33\sqrt{16/4} = 13.7 \approx 14$$


     Step 3.  The approximate 98% confidence limits are given by the
16 + 1 - 14 = 3rd and the 14th ordered observations (counting up from the
smallest).  For Well 1, the 3rd ordered observation is 3.39 and the 14th
ordered observation is 10.25.  Thus the confidence limits for Well 1 are
(3.39, 10.25).  Similarly for Well 2, the 3rd and 14th ordered observations
give the confidence interval (2.20, 11.02).  Note that for Well 2
there were two values below detection.  These were assigned a value of zero
and received the two smallest ranks.  Had there been three or more values
below the limit of detection, the lower limit of the confidence interval would
have been the limit of detection because these values would have been the
smallest values and so would have included the third order statistic.

     Step 4.  Neither of the two confidence intervals'  lower limit exceeds the
MCL of  15.   In fact,  the upper limit is  less  than  the MCL, implying that the
concentration in each well is significantly below the MCL.
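
     The example can be reproduced with the following Python sketch.  The
function name is hypothetical, and the guard that caps M at n is an added
assumption, since the approximation is intended only for larger samples.

```python
import math


def nonparametric_median_ci(values, z=2.33):
    """Return (X(n+1-M), X(M)) using the large-sample approximation for M."""
    ordered = sorted(values)                       # Step 1: order the data
    n = len(ordered)
    m = round(n / 2 + 1 + z * math.sqrt(n / 4))    # Step 2: approximate M
    m = min(m, n)                                  # assumption: keep M within the sample
    lower_rank = n + 1 - m
    return ordered[lower_rank - 1], ordered[m - 1]  # Step 3: order statistics


# Table 6-4 data (ppm); nondetects in Well 2 are entered as zero, as in the text
well_1 = [3.17, 2.32, 7.37, 4.44, 9.50, 21.36, 5.15, 15.70,
          5.58, 3.39, 8.44, 10.25, 3.65, 6.15, 6.94, 3.74]
well_2 = [3.52, 12.32, 2.28, 5.30, 8.12, 3.36, 11.02, 35.05,
          2.20, 0.00, 9.30, 10.30, 5.93, 6.39, 0.00, 6.53]

mcl = 15.0
for name, data in [("Well 1", well_1), ("Well 2", well_2)]:
    lower, upper = nonparametric_median_ci(data)
    # Step 4: evidence of contamination only if the lower limit exceeds the MCL
    print(name, (lower, upper), "exceeds MCL:", lower > mcl)
```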

INTERPRETATION

     The rank-order  statistics used  to  form  the confidence interval  in the
nonparametric confidence interval procedure will contain the population median
with  confidence  coefficient of  98%.  The  population  median equals  the mean
whenever the distribution is symmetric.  The nonparametric confidence interval
is  generally  wider  and requires more data  than the corresponding normal dis-
tribution  interval,  and  so the  normal  or log-normal  distribution  interval
should be used whenever it is appropriate.

      If the confidence interval  contains the compliance  limit  (either  MCL or
ACL), then  it is reasonable to conclude  that  the median compliance well con-
centration  does  not differ significantly from the compliance  limit.   If the
lower end  of the  confidence  interval exceeds  the compliance  limit,  this  is
statistically significant evidence  at the 1% level  that  the median compliance
well concentration exceeds the compliance limit and the unit is out of
compliance.

6.2.2  Tolerance Intervals for Compliance Limits

      In  some  cases  a  permit may specify  that  a compliance limit (MCL  or ACL)
is  not  to  be  exceeded  more  than a specified fraction of the time.  Since lim-
ited  data  will  be  available from each monitoring well, these data can be used
to  estimate a tolerance  interval  for concentrations from  that well.   If the
upper end of  the tolerance interval  (i.e., upper tolerance  limit) is less than
the compliance  limit,  the data indicate  that the unit  is in compliance.  That
is, concentrations  should be less  than  the  compliance  limit at least a speci-
fied  fraction of  the  time.   If  the upper tolerance  limit of  the  interval
exceeds  the compliance limit,  then the  concentration  of  the  hazardous con-
stituent could  exceed  the compliance limit  more than the specified proportion
of  the time.

      This procedure compares an upper tolerance limit to the MCL or ACL.  With
small sample  sizes the upper tolerance limit can be fairly  large, particularly
if  large  coverage  with high confidence  is  desired.   If  the owner or operator
wishes  to  use  a tolerance  limit  in this  application,  he/she  should  suggest
values  for the  parameters  of  the  procedure subject  to  the approval  of the
Regional Administrator.  For example, the owner or operator could suggest
95% coverage with 95% confidence.  This means that the upper tolerance limit
is a value which, with 95% confidence, will be exceeded less than 5% of the
time.

PURPOSE

     The purpose of the tolerance  interval  approach  is to construct an inter-
val that should contain a specified fraction of the concentration measurements
from compliance wells  with  a specified degree  of  confidence.   In this appli-
cation it is generally desired to have the tolerance  interval  contain 95% of
the measurements of concentration with confidence at least 95%.

PROCEDURE

     It is assumed that the data used  to  construct the tolerance interval are
approximately normal.  The data may consist of  the concentration measurements
themselves if they are adequately normal (see Section 4.2 for tests of normal-
ity),  or  the data used may  be  the natural  logarithms of  the  concentration
data.  It is important that  the  compliance  limit (MCL or ACL) be expressed in
the same  units  (either concentrations  or logarithm  of  the  concentrations) as
the observations.

     Step 1.   Calculate the  mean,  X,  and  the  standard deviation,  S,  of the
compliance well concentration data.

     Step 2.   Determine the factor, K, from Table 5, Appendix B, for the sam-
ple size, n, and form the one-sided tolerance interval

                                 [0, X + KS]

Table  5, Appendix B contains the factors for a 95% coverage tolerance interval
with confidence factor 95%.

     Step 3.   Compare  the upper  limit of the  tolerance  interval  computed in
Step 2 to the compliance  limit.   If the upper limit of the tolerance interval
exceeds that  limit,  this is  statistically significant  evidence  of contamina-
tion.

EXAMPLE

     Table 6-5 contains Aldicarb concentrations at a hypothetical facility in
compliance monitoring.  The data are concentrations in parts per million (ppm)
and represent observations at three compliance wells.  Assume that the permit
establishes an  ACL of 50 ppm that is  not to  be exceeded more than  5% of the
time.

     Step  1.   Calculate the  mean  and  standard deviation of the observations
from each well.  These are given in the table.

                   TABLE 6-5.  EXAMPLE DATA FOR A TOLERANCE
                          INTERVAL COMPARED TO AN ACL

                                  Aldicarb concentrations (ppm)
               Sampling
                 date           Well 1       Well 2       Well 3
                Jan. 1           19.          23.7         25.6
                Feb. 1           29.          21.9         23.
                Mar. 1           18.          26.9         22.
                Apr. 1           24.          26.1         26.9

                Mean             23.1         24.7         24.5
                SD                4.93         2.28         2.10

         ACL = 50 ppm

     Step 2.  For n = 4, the factor, K, in Table 5, Appendix B, is found to
be 5.145.  Form the upper tolerance interval limits as:

               Well 1:   23.1 + 5.145(4.93)  = 48.5

               Well 2:   24.7 + 5.145(2.28)  = 36.4

               Well 3:   24.5 + 5.145(2.10)  = 35.3

     Step 3.  Compare the tolerance limits with the ACL of 50 ppm.  Since the
upper tolerance limits  are  below the ACL,   there  is  no  statistically signifi-
cant evidence  of  contamination  at  any well.   The site remains  in  detection
monitoring.
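
     The upper tolerance limit calculation can be sketched in Python as
follows.  The function name is hypothetical, and the factor K (5.145 for
n = 4, 95% coverage, 95% confidence) is read from the table of tolerance
factors and supplied by the user.  The Well 2 values from Table 6-5 are used.

```python
import statistics


def upper_tolerance_limit(values, k):
    """Return X + K*S, the upper end of the one-sided tolerance interval."""
    mean = statistics.mean(values)      # Step 1: sample mean
    sd = statistics.stdev(values)       # Step 1: sample SD (n-1 divisor)
    return mean + k * sd                # Step 2: upper tolerance limit


well_2 = [23.7, 21.9, 26.9, 26.1]       # Table 6-5, Well 2 (ppm)
acl = 50.0

limit = upper_tolerance_limit(well_2, k=5.145)
below_acl = limit < acl                 # Step 3: compare with the ACL
print(limit, below_acl)
```

If the data had been log-transformed to achieve normality, the ACL would
have to be converted to its logarithm before the comparison in the last step.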

INTERPRETATION

     It may be desirable in a permit to specify a compliance limit that is not
to  be  exceeded more than  5% of the time.   A tolerance  interval  constructed
from the compliance well data provides  an estimated interval that will contain
95% of  the  data with confidence 95%.  If the upper  limit  of  this interval is
below the selected compliance limit, concentrations measured at the compliance
wells should  exceed  the compliance  limit  less  than 5%  of the time.   If the
upper limit of the tolerance  interval  exceeds  the compliance  limit,  then more
than 5% of the  concentration  measurements  would be expected to exceed the
compliance limit.

6.2.3  Special Cases with Limited Variance

     Occasionally, all   four concentrations from  a  compliance well at  a par-
ticular sampling period could  be identical.   If  this is the case, the formula
for estimating the  standard deviation  at that specific  sampling  period would
give  zero,  and the  methods for  calculating parametric  confidence intervals
would give  the same  limits for the  upper and  lower  ends of  the intervals,
which is not appropriate.

      In the case of  identical concentrations, one  should  assume that there is
some  variation in the data,  but that  the concentrations were rounded and give
the  same  values after  rounding.    To account  for the  variability that was
present before rounding,  take  the least  significant  digit  in  the  reported
concentration as having resulted from rounding.  Assume that rounding results
in  a uniform error on  the interval  centered at the  reported  value  with the
interval  ranging  up  or down one  half  unit  from  the  reported value.   This
assumed rounding is used to  obtain a  nonzero estimate  of the variance for use
in cases where all  the measured concentrations were found to be identical.

PURPOSE

     The  purpose  of  this  procedure  is  to obtain  a  nonzero estimate  of the
variance when all  observations from a well during a given sampling period gave
identical  results.   Once  this modified variance is obtained,  its square root
is used in place of  the usual  sample  standard deviation,  S, to construct con-
fidence intervals  or tolerance intervals.

PROCEDURE

      Step 1.   Determine the least significant value of any data  point.  That
is,  determine whether the  data were reported  to  the  nearest 10 ppm, nearest 1
ppm,  nearest 100 ppm, etc.  Denote this value by 2R.

      Step 2.  The data are assumed to have been rounded to the nearest 2R, so
each observation is actually the reported value ±R.  Assuming that the
observations were identical because of rounding, the variance is estimated to be
$R^2/3$, assuming the uniform distribution for the rounding error.  This gives
the estimated standard deviation as

                                  $$S' = R/\sqrt{3}$$

      Step 3.  Take this estimated value from Step 2 and use it as the estimate
of the standard deviation in the appropriate parametric procedure.  That is,
replace S by S'.
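
      The variance used in Step 2 is the variance of a uniform distribution.
As a brief check, assuming the rounding error $e$ is uniformly distributed on
the interval from $-R$ to $R$ (so that its mean is zero):

```latex
\mathrm{Var}(e) = \int_{-R}^{R} \frac{e^{2}}{2R}\, de
               = \frac{1}{2R} \cdot \frac{2R^{3}}{3}
               = \frac{R^{2}}{3},
\qquad
S' = \sqrt{\frac{R^{2}}{3}} = \frac{R}{\sqrt{3}}
```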

EXAMPLE

      In calculating  a confidence  interval  for a single compliance well, sup-
pose  that  four  observations  were taken  during  a  sampling  period  and  all
resulted  in  590 ppm.    There  is  no variance  among the four values  590, 590,
590,  and 590.

      Step 1.   Assume that each of the values 590  came  from rounding  the con-
centration to  the  nearest  10 ppm.  That is,  590 could actually  be any value
between 585.0 and  594.99.  Thus, 2R is 10 ppm (rounded  off), so R  is 5 ppm.


     Step 2.  The estimate of the standard deviation is

                        $$S' = 5/\sqrt{3} = 5/1.732 = 2.89 \text{ ppm}$$

     Step 3.  Use S' = 2.89 and X = 590 to calculate the confidence interval
(see Section 6.2.1) for the mean concentration from this well.  This gives

                   $$590 \pm (4.541)(2.89/\sqrt{4}) = (583.4, 596.6)$$


as the 98% confidence  interval of  the  average concentration.   Note that 4.541
is the 99th  percentile from the  t-distribution (Table 6, Appendix  B)  with  3
degrees of freedom since the sample size was 4.
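
     The same calculation can be sketched in Python.  The function names are
hypothetical; the reporting unit (2R) and the t-value are supplied by the
user, the latter from the t-table.

```python
import math


def rounding_sd(reporting_unit):
    """Estimated SD when identical values result from rounding.

    reporting_unit is 2R, the smallest unit of reporting, so R is half of it
    and the uniform-error standard deviation is R / sqrt(3).
    """
    r = reporting_unit / 2.0
    return r / math.sqrt(3.0)


def ci_for_identical_values(reported_value, n, reporting_unit, t_value):
    """Confidence interval for the mean, with S replaced by the rounding SD."""
    half_width = t_value * rounding_sd(reporting_unit) / math.sqrt(n)
    return reported_value - half_width, reported_value + half_width


# Four identical observations of 590 ppm, reported to the nearest 10 ppm
lower, upper = ci_for_identical_values(590.0, n=4, reporting_unit=10.0,
                                       t_value=4.541)
print(rounding_sd(10.0), lower, upper)
```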

INTERPRETATION

     When identical results are obtained from  several  different  samples, the
interpretation is that the data are not reported to enough significant figures
to show the random differences.   If  there is no extrinsic evidence invalidat-
ing  the  data, the  data  are regarded  as  having  resulted from rounding  more
precise  results  to the reported  observations.   The  rounding is  assumed  to
result in variability  that follows the uniform distribution  on the range ±R,
where 2R is the smallest unit of reporting.  This assumption is used to calcu-
late a standard  deviation for  the observations that  otherwise appear to  have
no variability.

REMARK

     Assuming  that  the data are  reported correctly  to the  units  indicated,
other distributions for the rounding variability could be  assumed.   The  max-
imum standard  deviation that could result from rounding  when the observation
is ±R is the value R.
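
     The bound stated above follows from a short argument, sketched here
under the assumption that the rounding error $e$ never exceeds $R$ in
absolute value.  Equality holds when the error takes only the two extreme
values $+R$ and $-R$ with equal probability; the uniform assumption used in
the procedure gives the smaller value $R/\sqrt{3}$, or about $0.58R$.

```latex
\mathrm{Var}(e) \le E\!\left[e^{2}\right] \le R^{2}
\quad\Longrightarrow\quad
\mathrm{SD}(e) \le R
```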

-------
                                  SECTION  7

                  CONTROL  CHARTS FOR  INTRA-WELL COMPARISONS


     The previous sections cover various  situations  where the compliance well
data are  compared  to the  background  well  data or to  specified concentration
limits (ACL or MCL)  to  detect  possible  contamination.   This section discusses
the case  where  the   level  of  each  constituent within  a single uncontaminated
well is being monitored over time.   In  essence,  the  data for each constituent
in  each well  are plotted  on  a time scale and inspected  for obvious features
such as  trends  or  sudden  changes  in concentration  levels.    The  method sug-
gested  here  is  a   combined  Shewhart-CUSUM control  chart for each well  and
constituent.

     The  control  chart  method  is  recommended for uncontaminated  wells only,
when data comprising at least eight independent samples over a one-year period
are available.   This requirement is specified under current RCRA regulations
and applies to each constituent in each well.

     As discussed in Section 2, a common  sampling  plan will  obtain four inde-
pendent samples from each  well  on a  semi-annual  basis.  With this plan a con-
trol chart can be implemented when one year's  data are  available.  As a result
of  Monte  Carlo  simulations,  Starks  (1988) recommended  at  least four sampling
periods at a unit of eight or  more  wells,  and at least eight sampling periods
at  a unit with fewer than four wells.

     The  use  of  control charts can  be  an effective technique for monitoring
the levels of  a constituent at a  given well  over  time.  It  also provides a
visual means of  detecting  deviations from a  "state  of control."   It is clear
that plotting of the data is an important part of the analysis process.  Plot-
ting  is  an easy task,  although time-consuming if many  data sets need  to  be
plotted.   Advantage should be  taken  of graphics software,  since  plotting  of
time series data will be an ongoing process.   New data  points will be added to
the already existing data base each time new data are available.  The following
few sections discuss, in general terms, the advantages of plotting
time series data; the corrective steps one could take when seasonality
is present in the data; and, finally, the detailed procedure for
constructing a Shewhart-CUSUM control chart, along with a demonstration of that
procedure.

7.1  ADVANTAGES OF PLOTTING DATA

     While analyzing the data  by means  of any of  the  appropriate statistical
procedures discussed in earlier  sections is  recommended,  we  also  recommend
plotting  the  data.   Each  data point should  be  plotted against time  using a
time scale (e.g., month, quarter).  A plot should be generated for each
constituent measured in each well.  For visual comparison purposes, the scale
should be kept identical from well to well  for a given constituent.

     Another important application of the  plotting  procedure  is for detecting
possible trends  or  drifts in the  data  from a given well.   Furthermore, when
visually comparing the  plots  from several  wells within  a unit, possible con-
tamination of one rather  than all downgradient wells  could  be detected which
would then warrant a closer look at that well.  In general, graphs can provide
highly effective  illustrations  of the  time  series,  allowing the  analyst  to
obtain a  much greater  sense of  the  data.   Seasonal  fluctuations  or  sudden
changes, for example, may become quite evident,  thereby supporting the analyst
in his/her decision of which  statistical procedure  to  use.  General upward or
downward trends, if present, can be detected and the analyst can follow up
with  a  test for  trend,  such  as  the  nonparametric Mann-Kendall  test  (Mann,
1945; Kendall, 1975).  If, in addition, seasonality is suspected, the user can
perform  the  seasonal   Kendall  test  for  trend  developed by  Hirsch  et al.
(1982).   The reader  is  also  referred  to  Chapters  16 "Detecting and Estimating
Trends"  and 17 "Trends  and Seasonality"  of Gilbert's  "Statistical Methods for
Environmental  Pollution  Monitoring,"  1987.   In any  of  the  above  cases,  the
help of a professional statistician is recommended.
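
     For orientation only, the basic Mann-Kendall statistic can be sketched
in Python as shown below.  This guidance does not spell out the test, so the
sketch follows the standard textbook form: it assumes no tied values (no tie
correction is applied) and uses the normal approximation with a continuity
correction, which is suited to longer series.  The function name and the
sample series are hypothetical; the cited references should be consulted for
small samples, ties, and the seasonal version of the test.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for a time-ordered series, assuming no tied values."""
    n = len(series)
    s = 0
    # S counts later-minus-earlier differences: +1 if positive, -1 if negative
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0   # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical series for illustration only
s, z = mann_kendall([2.1, 2.3, 2.2, 2.6, 2.8, 2.7, 3.0, 3.1])
print(s, z)   # a large positive Z suggests an upward trend
```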

     Another important  use of data plots  is  that of  identifying unusual data
points (e.g., outliers).   These  points  should  then be  investigated for pos-
sible QC problems, data entry errors, or whether they are truly outliers.

     Many software packages are  available for computer graphics, developed for
mainframes, mini-, or  microcomputers.   For example, SAS  features an easy-to-
use plotting procedure, PROC  PLOT; where the  hardware  and software are avail-
able, a series of more sophisticated plotting routines can be accessed through
SAS  GRAPH.    On  microcomputers,  almost  everybody has  his  or  her favorite
graphics software that  they  use  on a  regular  basis  and no recommendation will
be made as to the most appropriate one.  The plots shown in this document were
generated using LOTUS 1-2-3.

     Once the data  for each  constituent and  each well  are plotted, the plots
should  be  examined  for seasonality  and  a correction  is recommended  should
seasonality be present.   A fairly simple-to-use procedure for deseasonalizing
data  is presented in the following paragraphs.

7.2   CORRECTING FOR SEASONALITY

      A  necessary precaution   before  constructing  a control  chart  is  to take
into  account seasonal  variation  of the  data to  minimize  the chance of mistak-
ing  seasonal  effect for  evidence of well  contamination.  This  could  result
from  variations   in  chemical   concentrations  with   recharge  rates  during
different  seasons throughout the years.    If  seasonality  is  present,  then
deseasonalizing  the  data prior  to using the combined  Shewhart-CUSUM control
chart procedure  is recommended.

      Many approaches  to deseasonalize  data exist.   If  the seasonal pattern is
regular,  it may be  modeled  with a sine or cosine  function.   Moving averages
can be used, or differences (of order 12 for monthly data, for example) can be
used.  However, time series models may include rather complicated methods for
deseasonalizing the data.  Another simpler  method  exists which should be ade-
quate for the situations described in this  document.   It has the advantage of
being easy to understand and apply,  and  of  providing natural estimates of the
monthly or quarterly effects  via the monthly or quarterly  means.  The method
proposed here can be applied to  any  seasonal  cycle—typically an annual cycle
for monthly or quarterly data.

NOTE

     Corrections for  seasonality should  be used with  great caution  as  they
represent extrapolation  into  the future.   There should  be  a good scientific
explanation for the seasonality as well as good empirical evidence for the
seasonality before corrections are made.  Larger than average rainfalls for
two or three Augusts in a row do not justify the belief that there will
never be a drought in August, and this idea extends directly to groundwater
quality.   In addition,  the  quality  (bias,  robustness,  and  variance)  of the
estimates of the proper corrections must  be  considered even in cases where
corrections  are called for.  If  seasonality is  suspected, the user might want
to seek the help of a professional statistician.

PURPOSE

     When seasonality  is known  to exist  in a  time  series  of concentrations,
then the data should be deseasonalized  prior to constructing control charts in
order to  take  into  account seasonal  variation  rather than  mistaking seasonal
effects for evidence of contamination.

PROCEDURE

     The  following  instructions  to adjust  a  time  series for  seasonality are
based on monthly data with a yearly  cycle.   The procedure can be easily modi-
fied to accommodate a yearly cycle of quarterly data.

     Assume that N years of monthly data are available.  Let $X_{ij}$ denote the
unadjusted observation for the ith month during the jth year.

     Step 1.   Compute the average concentration for month  i over  the  N-year
period:


                           $$\bar{X}_i = (X_{i1} + \cdots + X_{iN})/N$$


This is  the  average of  all  observations taken in  different  years  but  during
the  same month.  That  is,  calculate  the  mean  concentrations  for  all Januarys,
then the mean for all  Februarys  and so  on for each  of the 12 months.

     Step 2.    Calculate the grand mean, \bar{X}, of all N*12 observations,

     \bar{X} = \sum_{i=1}^{12} \sum_{j=1}^{N} X_{ij} / (12N) = \sum_{i=1}^{12} \bar{X}_i / 12

     Step 3.   Compute the adjusted concentrations,

                    Z_{ij} = X_{ij} - \bar{X}_i + \bar{X}

Computing X_{ij} - \bar{X}_i removes the average effect of month i from the
monthly data, and adding \bar{X}, the overall mean, places the adjusted Z_{ij}
values about the same mean, \bar{X}.  It follows that the overall mean adjusted
observation, \bar{Z}, equals the overall mean unadjusted value, \bar{X}.

EXAMPLE

     Columns 2  through  4 of  Table 7-1  show monthly unadjusted  concentrations
of a fictitious analyte over  a 3-year period.


           TABLE 7-1.  EXAMPLE COMPUTATION FOR DESEASONALIZING DATA

                   Unadjusted          Monthly      Monthly adjusted
                 concentrations        average       concentrations
Month          1983   1984   1985    (3 years)     1983   1984   1985

January        1.99   2.01   2.15      2.05        2.10   2.13   2.27
February       2.10   2.10   2.17      2.12        2.14   2.15   2.21
March          2.12   2.17   2.27      2.19        2.10   2.15   2.25
April          2.12   2.13   2.23      2.16        2.13   2.14   2.24
May            2.11   2.13   2.24      2.16        2.12   2.13   2.25
June           2.15   2.18   2.26      2.20        2.12   2.15   2.23
July           2.19   2.25   2.31      2.25        2.11   2.16   2.23
August         2.18   2.24   2.32      2.25        2.10   2.16   2.24
September      2.16   2.22   2.28      2.22        2.11   2.17   2.22
October        2.08   2.13   2.22      2.14        2.10   2.16   2.24
November       2.05   2.08   2.19      2.11        2.11   2.14   2.25
December       2.08   2.16   2.22      2.16        2.09   2.17   2.23

Overall 3-year average = 2.17

     Step 1.   Compute the monthly averages across the 3 years.  These values
are shown in the fifth column of Table 7-1.

     Step 2.   The grand mean over the 3-year period is calculated to be 2.17.

     Step 3.   Within each month and year, subtract the average monthly con-
centration for that month and add the grand mean.  For example, for January
1983, the adjusted concentration becomes


                           1.99 -  2.05 + 2.17  =  2.11


The adjusted concentrations are shown in the last three columns of Table 7-1.

     The reader can  check that the average of  all  36  adjusted concentrations
equals 2.17, the average  unadjusted concentration.   Figure 7-1 shows the plot
of the unadjusted and adjusted data.  The raw data clearly exhibit seasonality
as well  as  an upwards  trend  which is  less  evident  by simply  looking at the
data table.
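
     The three steps above can be sketched in a few lines of Python.  This is
a minimal illustration rather than part of the guidance: the function name
deseasonalize and the dictionary layout are assumptions made for the example,
and the unadjusted values are those read from Table 7-1.

```python
# Unadjusted monthly concentrations as read from Table 7-1,
# one list per year (January through December).
unadjusted = {
    1983: [1.99, 2.10, 2.12, 2.12, 2.11, 2.15, 2.19, 2.18, 2.16, 2.08, 2.05, 2.08],
    1984: [2.01, 2.10, 2.17, 2.13, 2.13, 2.18, 2.25, 2.24, 2.22, 2.13, 2.08, 2.16],
    1985: [2.15, 2.17, 2.27, 2.23, 2.24, 2.26, 2.31, 2.32, 2.28, 2.22, 2.19, 2.22],
}


def deseasonalize(series_by_year):
    """Return (monthly means, grand mean, adjusted series) for a yearly cycle.

    Hypothetical helper: series_by_year maps each year to a list holding
    one observation per month (or per quarter).
    """
    years = sorted(series_by_year)
    n_years = len(years)
    n_seasons = len(series_by_year[years[0]])

    # Step 1: average of each month across all years.
    monthly_mean = [
        sum(series_by_year[y][i] for y in years) / n_years
        for i in range(n_seasons)
    ]

    # Step 2: grand mean of all observations (mean of the monthly means).
    grand_mean = sum(monthly_mean) / n_seasons

    # Step 3: subtract the monthly mean and add back the grand mean.
    adjusted = {
        y: [x - monthly_mean[i] + grand_mean
            for i, x in enumerate(series_by_year[y])]
        for y in years
    }
    return monthly_mean, grand_mean, adjusted


monthly_mean, grand_mean, adjusted = deseasonalize(unadjusted)
all_adjusted = [z for y in adjusted for z in adjusted[y]]
mean_adjusted = sum(all_adjusted) / len(all_adjusted)

print(f"grand mean            = {grand_mean:.2f}")
print(f"mean of adjusted data = {mean_adjusted:.2f}")
print(f"January 1983 adjusted = {adjusted[1983][0]:.2f}")
```

Because the sketch carries unrounded monthly means, individual adjusted
values can differ in the last digit from hand calculations that use the
rounded averages.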

INTERPRETATION

     As  can  be  seen  in  Figure  7-1,  seasonal   effects  were  present  in  the
data.  After adjusting for monthly effects, the seasonality was removed as can
be seen in the adjusted data plotted in the same figure.

7.3  COMBINED SHEWHART-CUSUM CONTROL CHARTS FOR EACH WELL AND CONSTITUENT

     Control charts are widely used as  a statistical  tool in industry as well
as research  and development laboratories.   The concept  of  control  charts is
relatively  simple, which  makes them  attractive to use.    From the population
distribution of a  given variable, such  as concentrations  of a given constit-
uent, repeated  random  samples  are taken at  intervals  over time.   Statistics,
for example the mean of replicate values at  a point in time, are computed and
plotted together with upper and/or lower predetermined limits on a chart where
the x-axis represents time.   If  a result falls  outside these boundaries, then
the  process  is declared  to be  "out  of control";  otherwise,  the  process is
declared to  be  "in control."   The widespread use  of control  charts  is due to
their ease  of  construction and the fact that they  can provide a  quick visual
evaluation of a situation, and remedial  action can be taken, if necessary.

     In the context of  ground  water monitoring, control  charts can be used to
monitor  the inherent  statistical  variation  of the data collected  within  a
single  well,  and   to  flag anomalous  results.   Further investigation of data
points  lying  outside  the  established  boundaries will  be  necessary before any
direct action is taken.

     A control chart that can be used on a real  time basis must be constructed
from  a  data  set   large  enough  to characterize  the  behavior  of a  specific
well.   It  is recommended  that data from a minimum of eight  samples  within  a
year be collected for each constituent at each well to permit an evaluation of
the consistency of monitoring  results with the current concept of the hydro-
geology  of  the site.   Starks  (1988)  recommends a  minimum of four  sampling
periods at  a unit with eight  or more  wells  and  a minimum  of  eight sampling
periods at  a  unit  with less than four wells.   Once  the control chart for the
specific constituent at a given well is acceptable, then subsequent data
points can be plotted on it to provide a quick evaluation as to whether the
process is in control.

[Figure 7-1.  Plot of unadjusted and adjusted data; vertical axis: analyte
concentration (mg/L).]

     The standard assumptions in the use of control charts are that the data
generated by the process, when it is in control, are independently (see
Section 2.4.2) and normally distributed with a fixed mean μ and constant
variance σ².  The most important assumption is that of independence; control
charts are not robust with respect to departure from independence (e.g.,
serial correlation, see glossary).  In general, the sampling scheme will be
such that the possibility of obtaining serially correlated results is
minimized, as noted in Section 2.  The assumption of normality is of somewhat
less concern, but should be investigated before plotting the charts.  A
transformation (e.g., log-transform, square root transform) can be applied to
the raw data so as to obtain errors normally distributed about the mean.  An
additional situation which may decrease the effectiveness of control charts is
seasonality in the data.  The problem of seasonality can be handled by
removing the seasonality effect from the data, provided that sufficient data
to cover at least two seasons of the same type are available (e.g., 2 years
when the seasonal effect is monthly or quarterly).  A procedure to correct a
time series for seasonality was shown above in Section 7.2.

PURPOSE

     Combined Shewhart-cumulative  sum (CUSUM)  control charts are constructed
for each constituent at each well to provide a visual tool for detecting both
trends and abrupt changes in concentration levels.

PROCEDURE

     Assume that data from at least eight independent samples of monitoring
are available to provide reliable estimates of the mean, μ, and standard
deviation, σ, of the constituent's concentration levels in a given well.

     Step  1.   To construct  a  combined Shewhart-CUSUM chart,  three parameters
need to be selected prior to plotting:

     h - a decision interval value
     k - a reference value
     SCL - Shewhart control limit (denoted by U in Starks (1988))

     The parameter  k  of the  CUSUM  scheme  is  directly  obtained from the value,
D, of the displacement that should be quickly detected; k = D/2.  It is recom-
mended to select k = 1, which will allow a displacement of two standard devia-
tions to be detected quickly.

     When k is selected to be 1,  the parameter  h is usually set at values of 4
or  5.   The parameter h is the value  against which the cumulative  sum in the
CUSUM scheme will be compared.   In the  context of groundwater  monitoring,  a
value of h = 5 is recommended (Starks, 1988; Lucas, 1982).

-------
     The upper Shewhart limit  is  set  at SCL = 4.5 in units of standard devia-
tion.  This  combination  of k = 1, h =  5,  and SCL = 4.5 was found most appro-
priate for  the  application of combined  Shewhart-CUSUM  charts for groundwater
monitoring (Starks, 1988).

     Step 2.   Assume that at time period T_i, n_i concentration measurements
X_1, ..., X_{n_i} are available.  Compute their average \bar{X}_i.

     Step 3.   Calculate the standardized mean


                    Z_i = (\bar{X}_i - \mu) \sqrt{n_i} / \sigma

where μ and σ are the mean and standard deviation obtained from prior monitor-
ing at the same well (at least four sampling periods in a year).

     Step 4.   At each time period, T_i, compute the cumulative sum, S_i, as:


                    S_i = max{0, (Z_i - k) + S_{i-1}}


where max{A, B} is the maximum of A and B, starting with S_0 = 0.

     Step 5.   Plot the values of S_i versus T_i on a time chart for this com-
bined Shewhart-CUSUM scheme.  Declare an "out-of-control" situation at sam-
pling period T_i if, for the first time, S_i > h or Z_i > SCL.  This will
indicate probable contamination at the well, and further investigations will
be necessary.

REFERENCES

Lucas, J. M.  1982.  "Combined Shewhart-CUSUM Quality Control Schemes."
Journal of Quality Technology.  Vol. 14, pp. 51-59.

Starks, T. H.  1988 (Draft).  "Evaluation of Control Chart Methodologies for
RCRA Waste Sites."

Hockman, K. K., and J. M. Lucas.  1987.  "Variability Reduction Through
Subvessel CUSUM Control."  Journal of Quality Technology.  Vol. 19,
pp. 113-121.

EXAMPLE

     The procedure is demonstrated on a set of carbon tetrachloride measure-
ments taken monthly at a compliance well over a 1-year period.  The monthly
means of two measurements each (n_i = 2 for all i) are presented in the third
column of Table 7-2 below.  Estimates of μ and σ, the mean and standard
deviation of carbon tetrachloride measurements at that particular well, were
obtained from a preceding monitoring period at that well: μ = 5.5 μg/L and
σ = 0.4 μg/L.

          TABLE 7-2.  EXAMPLE DATA FOR COMBINED SHEWHART-CUSUM CHART:
                   CARBON TETRACHLORIDE CONCENTRATION (μg/L)

          Sampling        Mean         Standardized
           period     concentration,    \bar{X}_i,                  CUSUM,
Date        T_i         \bar{X}_i          Z_i        Z_i - k        S_i

Jan 6        1            5.52             0.07        -0.93         0
Feb 3        2            5.60             0.35        -0.65         0
Mar 3        3            5.45            -0.18        -1.18         0
Apr 7        4            5.15            -1.24        -2.24         0
May 5        5            5.95             1.59         0.59         0.59
Jun 2        6            5.54             0.14        -0.86         0.00
Jul 7        7            5.49            -0.04        -1.04         0.00
Aug 4        8            6.08             2.05         1.05         1.05
Sep 1        9            6.91             4.99a        3.99         5.04b
Oct 6       10            6.78             4.53a        3.53         8.56b
Nov 3       11            6.71             4.28         3.28        11.84b
Dec 1       12            6.65             4.07         3.07        14.91b

  Parameters:  Mean = 5.50; std = 0.4; k = 1; h = 5; SCL = 4.5.

  a  Indicates "out-of-control" process via Shewhart control limit (Z_i > 4.5).

  b  CUSUM "out-of-control" signal (S_i > 5).


     Step  1.  The  three  parameters   necessary   to  construct   a  combined
Shewhart-CUSUM chart  were selected  as h = 5;  k  = 1;  SCL = 4.5  in  units  of
standard deviation.

     Step  2.  The  monthly  means  are  presented  in   the  third  column  of
Table 7-2.

     Step 3.  Standardize the means within each sampling period.  These
computations are shown in the fourth column of Table 7-2.  For example,
Z_1 = (5.52 - 5.50)\sqrt{2}/0.4 = 0.07.

     Step 4.    Compute the quantities S_i, i = 1, ..., 12.  For example,

     S_1 = max{0, -0.93 + 0} = 0
     S_2 = max{0, -0.65 + 0} = 0
       .
       .
     S_5 = max{0, 0.59 + S_4} = max{0, 0.59 + 0} = 0.59
     S_6 = max{0, -0.86 + S_5} = max{0, -0.86 + 0.59} = max{0, -0.27} = 0
     etc.

These quantities are shown in the last column of Table 7-2.

     Step 5.   Construct the control chart.  The y-axis is in units of stan-
dard deviations.  The x-axis represents time, or the sampling periods.  For
each sampling period, T_i, record the values of Z_i and S_i.  Draw horizontal
lines at values h = 5 and SCL = 4.5.  These two lines represent the upper con-
trol limit for the CUSUM scheme and the Shewhart control limit, respec-
tively.  The chart for this example data set is shown in Figure 7-2.

     The combined chart indicates statistically significant evidence of con-
tamination starting at sampling period T_9.  Both the CUSUM limit and the
Shewhart control limit were exceeded, by S_9 and Z_9, respectively.  Investi-
gation of the situation should begin to confirm contamination and action
should be required to bring the variability of the data back to its previous
level.
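
     The computations of Steps 2 through 5 can be sketched in Python as
follows.  This is an illustrative sketch only: the function name
shewhart_cusum and the layout of its output are assumptions made for the
example, while the monthly means and the parameters (mean 5.5, standard
deviation 0.4, k = 1, h = 5, SCL = 4.5, two measurements per period) are
those of Table 7-2.

```python
import math


def shewhart_cusum(means, n, mu, sigma, k=1.0, h=5.0, scl=4.5):
    """Compute Z_i, S_i and the two out-of-control flags for each period.

    Hypothetical helper: means holds the period averages, n is the number
    of measurements behind each average, and mu and sigma come from prior
    monitoring at the same well.
    """
    rows = []
    s_prev = 0.0  # S_0 = 0
    for period, xbar in enumerate(means, start=1):
        z = (xbar - mu) * math.sqrt(n) / sigma   # standardized mean
        s = max(0.0, (z - k) + s_prev)           # cumulative sum
        rows.append({
            "period": period,
            "z": z,
            "s": s,
            "shewhart_signal": z > scl,
            "cusum_signal": s > h,
        })
        s_prev = s
    return rows


# Monthly means of carbon tetrachloride (ug/L) from Table 7-2.
means = [5.52, 5.60, 5.45, 5.15, 5.95, 5.54,
         5.49, 6.08, 6.91, 6.78, 6.71, 6.65]

rows = shewhart_cusum(means, n=2, mu=5.5, sigma=0.4)

# First sampling period at which either limit is exceeded.
first_signal = next(
    (r["period"] for r in rows
     if r["shewhart_signal"] or r["cusum_signal"]),
    None,
)

for r in rows:
    print(f"T{r['period']:<2d}  Z = {r['z']:6.2f}  S = {r['s']:6.2f}")
print("first out-of-control period:", first_signal)
```

With these inputs the first signal appears at the ninth sampling period, in
agreement with the footnoted entries of Table 7-2.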

INTERPRETATION

     The combined Shewhart-CUSUM control scheme was applied to an example data
set of  carbon  tetrachloride measurements taken on a monthly basis at a well.
The  statistic  used  in  the  construction of  the  chart  was  the  mean  of  two
measurements per sampling period.  (It should be noted that this method can be
used on an individual measurement as well, in which case n_i = 1.)  Estimates
of  the  mean and  standard  deviation  of the measurements  were  available from
previous data collected at that well over at least four sampling periods.

     The parameters of  the  combined chart were selected to be k = 1 unit, the
reference value or allowable slack  for  the  process;  h = 5 units,  the decision
interval for the CUSUM scheme;  and  SCL = 4.5 units, the upper Shewhart control
limit.  All parameters are in units of σ, the standard deviation obtained from
the previous monitoring results.  Various combinations of parameter values can
be selected.  The particular values recommended here appear to be the best for
the  initial  use  of the procedure from  a review  of the  simulations and recom-
mendations in  the  references.   A discussion on this subject is given by Lucas
(1982), Hockman and Lucas  (1987), and Starks  (1988).  The choice of the param-
eters  h and k  of a CUSUM chart is  based  on the desired  performance  of the
chart.   The  criterion used to  evaluate a control scheme  is the average number
of  samples or  time  periods before  an  out-of-control  signal is obtained.  This
criterion  is  denoted by ARL or  average run length.   The  ARL should be large
when  the mean  concentration of a  hazardous  constituent is near  its target
value  and  small  when the  mean has shifted too  far from  the  target.   Tables
have been developed by simulation methods to estimate ARLs for given combina-
tions of the parameters (Lucas, 1982; Hockman and Lucas, 1987; Starks, 1988).
The user is referred to these articles for further reading.

[Figure 7-2.  Combined Shewhart-CUSUM control chart for the example data set.]

7.4  UPDATE OF A CONTROL CHART

     The control chart is based on preselected performance parameters as well
as on estimates of μ and σ, the parameters of the distribution of the measure-
ments in question.  As monitoring continues and the process is found to be in
control, these parameters need periodic updating so as to incorporate this new
information into the control charts.  Starks (1988) has suggested that, in
general, adjustments in sample means and standard deviations be made after
sampling periods 4, 8, 12, 20, and 32, following the initial monitoring period
recommended to be at least eight sampling periods.  Also, the performance
parameters h, k, and SCL would need to be updated.  The author suggests that
h = 5, k = 1, and SCL = 4.5 be kept at those values for the first 12 sampling
periods following the initial monitoring plan, and that k be reduced to 0.75
and SCL to 4.0 for all subsequent sampling periods.  These values and sampling
period numbers are not mandatory.  In the event of an out-of-control state or
a trend, the control chart should not be updated.

7.5  NONDETECTS IN A CONTROL CHART

     Regulations require that four independent  water  samples be taken at each
well at a given sampling period.  The mean of the four concentration measure-
ments  of  a particular  constituent  is  used in  the construction of  a control
chart.  Now  situations  will  arise when the concentration of a constituent is
below  detection  limit for  one or more  samples.   The following  approach is
suggested for treating nondetects when plotting control charts.

     If only one of the four measurements is a nondetect, then replace it with
one half of the detection limit (MDL/2) or with one half of the practical
quantitation limit (PQL/2) and proceed as described in Section 7.3.

     If either two or three of the  measurements  are  nondetects,  use only the
quantitated values (two  or  one,  respectively)  for  the control  chart and pro-
ceed as discussed earlier in Section 7.3.

     If all four measurements are nondetects, then use one half of the detec-
tion limit or practical quantitation limit as the value for the construction
of the control chart.  This is an obvious situation of no contamination of the
well.
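
     The three rules above can be written as a short Python sketch.  The
function name, the use of None to mark a nondetect, and the single limit
argument (either the MDL or the PQL) are assumptions made for illustration.
The rules are stated in the text for four samples; applying the middle rule
to any count of nondetects between two and "all but one" is a generalization
adopted here, not something the guidance specifies.

```python
def control_chart_value(measurements, limit):
    """Return the value to plot for one sampling period.

    Hypothetical helper: measurements is a list of concentrations in
    which None marks a nondetect; limit is the detection limit (MDL)
    or practical quantitation limit (PQL) to be halved.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")

    detected = [x for x in measurements if x is not None]
    n_nondetect = len(measurements) - len(detected)
    half_limit = limit / 2.0

    if n_nondetect == 0:
        # No nondetects: ordinary mean of the measurements.
        return sum(detected) / len(detected)
    if n_nondetect == len(measurements):
        # All nondetects: use one half of the limit as the value.
        return half_limit
    if n_nondetect == 1:
        # One nondetect: substitute one half of the limit, then average.
        return (sum(detected) + half_limit) / len(measurements)
    # Two or three nondetects (of four): average the quantitated values only.
    return sum(detected) / len(detected)


print(control_chart_value([5.2, 5.6, None, 5.4], limit=1.0))
print(control_chart_value([5.2, None, None, 5.4], limit=1.0))
print(control_chart_value([None, None, None, None], limit=1.0))
```

The sketch returns only the value to be plotted; how many measurements that
value should count for when standardizing it is not addressed in the text
and is left to the user.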

     In the event that a control  chart requires updating and a certain propor-
tion  of  the measurements is  below  detection  limit, then adjust  the mean and
standard  deviation  necessary for the  control  chart  by using  Cohen's  method
described in Section 8.1.4.  In that case, the proportion of nondetects
applies to  the pool  of  data  available  at the time of  the  updating and would
include all  nondetects  up  to that time,  not  just the four  measurements taken
at  the last sampling period.

CAUTIONARY NOTE:  Control charts are a useful supplement to other statistical
techniques because they are graphical and simple to use.  However, it is
inappropriate to construct a control chart on wells that have shown evidence
of contamination or an increasing trend (see §264.97(a)(1)(i)).  Further,
contamination may not be present in a well in the form of a steadily
increasing concentration profile; it may be present intermittently or may
increase in a step function.  Therefore, the absence of an increasing trend
does not necessarily prove that a release has not occurred.
-------
                                  SECTION 8

                             MISCELLANEOUS TOPICS
     This chapter contains a variety of special topics that are relatively
short and self-contained.  These topics include methods to deal with data
below the limit of detection and methods to check for, and deal with, outliers
or extreme values in the data.


8.1  LIMIT OF DETECTION

     In a chemical  analysis some compounds  may be below  the detection limit
(DL)  of  the  analytical  procedure.   These  are  generally  reported   as  not
detected  (rather  than as zero or not present)  and  the appropriate  limit of
detection is usually given.   Data  that  include  not  detected results  are  a
special  case referred to as censored data  in the statistical literature.  For
compounds not  detected,  the  concentration   of  the  compound  is  not  known.
Rather,  it  is  only  known that  the concentration  of the  compound  is  less than
the detection limit.

     There are a  variety of ways to deal with  data that include  values below
detection.   There is  no general  procedure  that  is  applicable in all  cases.
However there  are some  general  guidelines that  usually prove adequate.   If
these do  not cover  a specific situation, the  user  should  consult  a  profes-
sional statistician for the most appropriate  way to deal  with the values below
detection.

     A summary of  suggested  approaches to deal with  data  below the  detection
limit is  presented  as Table 8-1.   The method  suggested  depends on the amount
of  data  below the  detection  limit.   For  small  amounts  of  below  detection
values, simply replacing a "ND" (not detected) report with a small number, say
the detection  limit divided by two,  and  proceeding with  the usual  analysis is
satisfactory.   For moderate  amounts  of  below detection  limit data,  a more
detailed  adjustment  is  appropriate, while for  large  amounts  one  may  need to
only  consider  whether  a compound  was  detected  or  not  as  the  variable  of
analysis.

     The meaning of  small,  moderate,  and large above  is subject  to  judgment.
Table 8-1 contains some  suggested values.  It should  be  recognized that these
values are not hard and  fast rules,  but  are  based on  judgment.  If there is a
question about how to handle values below detection,  consult a statistician.
-------
            TABLE 8-1.  METHODS FOR BELOW DETECTION LIMIT VALUES

   Percentage
  of nondetects               Statistical                   Section of
 in the data base           analysis method             guidance document

Less than 15%         Replace NDs with MDL/2 or          Section 8.1.1
                      PQL/2, then proceed with
                      parametric procedures:

                      • ANOVA                            Section 5.2.1
                      • Tolerance Limits                 Section 5.3
                      • Prediction Intervals             Section 5.4
                      • Control Charts                   Section 7

Between 15 and 50%    Use NDs as ties, then              Section 5.2.2
                      proceed with Nonparametric
                      ANOVA
                      or
                      use Cohen's adjustment,            Section 8.1.3
                      then proceed with:

                      • Tolerance Limits                 Section 5.3
                      • Confidence Intervals             Section 6.2.1
                      • Control Charts                   Section 7

More than 50%         Test of Proportions                Section 8.1.2
-------
     It should be noted that the nonparametric methods presented earlier auto-
matically deal with values  below detection  by regarding them as all tied at a
level below any quantitated results.  The nonparametric methods may be used if
there is a moderate amount of data below detection.  If the proportion of non-
quantified values  in  the  data exceeds 25%, these  methods  should be used with
caution.  They should  probably  not  be used  if less than half of the data con-
sists of quantified concentrations.
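
     As a rough illustration, the percentage bands of Table 8-1 can be coded
as a simple lookup.  The function name and the returned labels are hypo-
thetical, and the treatment of the boundary values (exactly 15% and exactly
50% fall in the middle band) is an assumption; the table itself presents the
bands as judgment-based guidelines rather than fixed rules.

```python
def suggest_nondetect_method(n_nondetects, n_total):
    """Map the percentage of nondetects to the approach listed in Table 8-1.

    Hypothetical helper; boundary handling is an assumption (see lead-in).
    """
    if n_total <= 0 or not 0 <= n_nondetects <= n_total:
        raise ValueError("counts must satisfy 0 <= n_nondetects <= n_total")

    pct = 100.0 * n_nondetects / n_total
    if pct < 15.0:
        return "substitute MDL/2 or PQL/2, then parametric procedures (Section 8.1.1)"
    if pct <= 50.0:
        return "NDs as ties with nonparametric ANOVA (Section 5.2.2) or Cohen's adjustment (Section 8.1.3)"
    return "test of proportions (Section 8.1.2)"


print(suggest_nondetect_method(2, 20))    # 10% nondetects
print(suggest_nondetect_method(6, 20))    # 30% nondetects
print(suggest_nondetect_method(15, 20))   # 75% nondetects
```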

8.1.1  The DL/2 Method

     The amount  of data that are  below detection plays an  important  role in
selecting the method to deal  with the limit of detection problem.  If a small
proportion of the observations  are  not  detected,  these may be replaced with a
small number, usually the method detection limit divided by 2 (MDL/2), and the
usual analysis performed.  This is the recommended method for use with the
analysis of variance procedure of Section 5.2.1.  Seek professional help if in
doubt about dealing with values below detection limit.  The results of the
analysis are generally not sensitive to the specific choice of the replacement
number.

     As a guideline,  if  15% or fewer of  the  values  are not detected,  replace
them with  the method  detection limit  divided by two  and  proceed with  the
appropriate analysis using these modified values.  Practical quantitation
limits (PQL) for Appendix IX compounds were published by EPA in the Federal
Register (Vol 52, No 131, July 9, 1987, pp 25947-25952).  These give practical
quantitation limits by compound and analytical method that may be used in
replacing  a   small  amount  of  nondetected  data  with  the  quantitation  limit
divided by 2.  If  approved  by the  Regional  Administrator,  site specific PQL's
may be used in this procedure.  If more than 15% of the values are reported as
not detected, it is preferable to use a nonparametric method or a test of pro-
portions.

8.1.2  Test of Proportions

     If more  than  50%  of  the  data are below detection  but  at least 10% of the
observations are quantified, a  test  of  proportions may be  used to compare the
background well  data  with  the compliance well data.   Clearly,  if  none of the
background well  observations  were  above  the  detection limit,  but  all  of the
compliance well observations were above the detection limit, one would suspect
contamination.   In  general  the  difference may not be  as obvious.   However,  a
higher proportion of quantitated values in compliance wells could provide evi-
dence of  contamination.    The test  of  proportions is  a method to  determine
whether a difference  in  proportion  of detected values  in  the background well
observations  and compliance  well observations  provides statistically  signifi-
cant evidence of contamination.

     The test of proportions  should  be  used when  the  proportion of quantified
values is small to moderate  (i.e.,  between  10% and 50%).   If very few quanti-
fied values are found, a method based on  the  Poisson  distribution may be used
as  an  alternative  approach.    A  method  based  on a tolerance limit  for  the
number  of  detected  compounds  and  the maximum  concentration  found  for  any
detected compound has been proposed by Gibbons (1988).  This alternative would
be appropriate when the number of detected compounds is quite small relative
to the number of compounds analyzed for, as might occur in detection
monitoring.

PURPOSE

     The test  of proportions  determines  whether the  proportion  of compounds
detected in the compliance well data differs significantly from the proportion
of compounds detected in  the  background well  data.   If there is a significant
difference, this is statistically significant evidence of contamination.

PROCEDURE

     The procedure uses  the normal  distribution  approximation to the binomial
distribution.  This  assumes that  the  sample size is reasonably large.  Gener-
ally,  if  the proportion  of detected  values  is  denoted  by P,  and  the sample
size  is  n,  then  the normal   approximation  is adequate,  provided that  nP and
n(1-P) both are greater than or equal to 5.

     Step  1.  Check criterion for using the normal approximation.

           Determine  X,  the number  of  background  well  samples in  which the
           compound was detected,  and  Y, the number  of compliance well samples
           in which the compound was detected.

           Let n_b be the total number of background well samples analyzed and
           n_c be the total number of compliance well samples analyzed.  Let
           n = n_b + n_c.

           Estimate P with P = (X + Y)/n.

           Compute nP and n(1 - P).  If both products are greater than or
           equal to 5, then the normal approximation may be used.

     Step  2.    Compute   the  proportion  of  detects  in  the  background  well
samples:

                                   P_b = X/n_b

     Step  3.    Compute   the  proportion  of  detects  in  the  compliance  well
samples:

                                   P_c = Y/n_c

     Step 4.  Compute the standard error of the difference in proportions:

     S_D = \{ [(X+Y)/(n_b+n_c)] [1 - (X+Y)/(n_b+n_c)] [1/n_b + 1/n_c] \}^{1/2}

and form the statistic:

                               Z = (P_b - P_c)/S_D


     Step 5.   Compare the absolute value of Z to the 97.5th percentile from
the standard  normal  distribution, 1.96.   If the absolute  value  of Z exceeds
1.96,  this provides  statistically significant  evidence  at the 5% significance
level  that  the proportion  of  compliance  well  samples  where  the  compound was
detected exceeds the proportion of background  well  samples where the compound
was detected.   This  would  be  interpreted  as evidence of contamination.  (The
two-sided test is used  to  provide  information  about  differences  in either
direction.)

EXAMPLE

     Table 8-2  contains  data on cadmium concentrations  measured in background
well  and  compliance  wells  at a  facility.   In  the table, "BDL"  is  used for
below detection limit.

     Step 1.  Check the adequacy  of the normal  approximation.  From Table 8-2,
X = 8, nb = 24, Y = 24, nc  = 64,  and hence n = 88.

     Calculate:  P = (8 + 24)/(24 + 64) = 0.364

     Compute:   nP = 88(0.364) = 32

                n(1-P) = 88(1 - 0.364) = 56

     Since both of these exceed 5, the normal approximation is justified.

     Step  2.    Estimate  the  proportion   above  detection  in the  background
wells.   As  shown  in  Table 8-2,  there  were  24 samples  from  background wells
analyzed for cadmium, so nb  = 24.   Of these, 16 were below detection and X = 8
were above detection, so Pb  = 8/24 = 0.333.

     Step 3.  Estimate the proportion above detection in the compliance
wells.  There were 64 samples from compliance wells analyzed for cadmium, with
40 below detection and 24 detected values.  This gives nc = 64, Y = 24, so
Pc = 24/64 = 0.375.

     Step 4.   Calculate the  standard error of the difference in proportions.

      SD = {[(8+24)/(24+64)][1-(8+24)/(24+64)](1/24 + 1/64)}^(1/2) = 0.115

     Step 5.   Form the statistic  Z and compare it to the normal distribution.

                       Z = (0.375 - 0.333)/0.115 = 0.37

which  is  less  in  absolute  value  than  the value  from the normal  distribution,
1.96.   Consequently,  there is no  statistically  significant  evidence  that the
proportion of  samples with  cadmium levels above the detection limit differs in
the background  well and compliance well samples.
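
     The steps of this example can be checked with a short script.  The
following Python sketch is not part of the original guidance; the function
name proportions_test and its argument names are illustrative assumptions.  It
applies the Step 1 criterion, then computes the two proportions, the standard
error, and Z from the detect counts in Table 8-2.

```python
import math

def proportions_test(x, nb, y, nc, z_crit=1.96):
    """Two-sided test of proportions using the normal approximation.

    x, y   : number of detects in background / compliance samples
    nb, nc : number of background / compliance samples analyzed
    z_crit : 97.5th percentile of the standard normal (1.96 in the text)
    """
    n = nb + nc
    p = (x + y) / n  # pooled proportion of detects
    # Step 1: the approximation needs nP and n(1 - P) both at least 5.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation is not justified")
    pb = x / nb  # Step 2: background proportion
    pc = y / nc  # Step 3: compliance proportion
    # Step 4: standard error of the difference and the Z statistic.
    sd = math.sqrt(p * (1 - p) * (1 / nb + 1 / nc))
    z = (pb - pc) / sd
    # Step 5: two-sided comparison against the critical value.
    return pb, pc, sd, z, abs(z) > z_crit

# Cadmium counts from Table 8-2.
pb, pc, sd, z, significant = proportions_test(x=8, nb=24, y=24, nc=64)
print(round(pb, 3), round(pc, 3), round(sd, 3), round(abs(z), 2), significant)
```

     Carried at full precision, the absolute value of Z is about 0.36; the
0.37 shown in Step 5 results from rounding the proportions and the standard
error before dividing.  Either way the statistic is well below 1.96.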
      TABLE 8-2.  EXAMPLE DATA FOR A TEST OF PROPORTIONS

Cadmium concentration (µg/L)     Cadmium concentration (µg/L)
     at background well               at compliance wells
        (24 samples)                     (64 samples)

     0.1      BDL            0.12    0.12    0.11    0.12    0.1
     0.12     BDL            0.08    0.07    0.06    0.08    0.04
     BDL*     BDL            BDL     BDL     BDL     BDL     BDL
     0.26     BDL            0.2     0.19    0.23    0.26    BDL
     BDL      BDL            BDL     BDL     BDL     BDL     0.1
     0.1      0.12           0.1     0.1     0.11    0.02    BDL
     BDL      BDL            BDL     BDL     BDL     BDL     0.01
     0.014    0.21           0.012   0.01    0.031   0.024   BDL
     BDL      BDL            BDL     BDL     BDL     BDL     BDL
     BDL      0.12           BDL     BDL     BDL     BDL     BDL
     BDL      BDL            BDL     BDL     BDL     BDL     BDL
     BDL      BDL            BDL     BDL     BDL     BDL     BDL
                             BDL     BDL     BDL     BDL

 * BDL means below detection limit.
INTERPRETATION

     Since the proportion of water samples with detected amounts of cadmium in
the compliance wells was not significantly different from that in the
background wells, the data are interpreted to provide no evidence of
contamination.  Had the proportion of samples with detectable levels of cadmium
in the compliance wells been significantly higher than that in the background
wells, this would have been evidence of contamination.  Had the proportion been
significantly higher in the background wells, additional study would have been
required.  This could indicate that contamination was migrating from an
off-site source, or it could mean that the hydraulic gradient had been
incorrectly estimated or had changed and that contamination was occurring from
the facility, but the ground-water flow was not in the direction originally
estimated.  Mounding of contaminants in the ground water near the background
wells could also be a possible explanation of this observation.

8.1.3  Cohen's Method

     If  a confidence interval or a  tolerance  interval  based upon the normal
distribution  is being  constructed,  a  technique presented  by  Cohen  (1959)
specifies a method to  adjust  the  sample  mean and  sample standard deviation to
account for data below the detection limit.  The only requirements for the use
of this technique are that the data be normally distributed and that the
detection limit always be the same.  This technique is demonstrated below.

PURPOSE

     Cohen's method provides  estimates of  the  sample mean  and standard devia-
tion when some  (< 50%)  observations  are  below detection.  These  estimates can
then be used to construct tolerance, confidence, or prediction intervals.

PROCEDURE

     Let n be the total number of observations, m represent the number of data
points above the detection limit (DL), and xi represent the ith constituent
value above the detection limit.

     Step 1.  Compute the sample mean xd from the data above the detection
limit as follows:

                  xd = (1/m) Σ xi,   summed over i = 1, ..., m

     Step 2.  Compute the sample variance Sd^2 from the data above the
detection limit as follows:

          Sd^2 = Σ (xi - xd)^2 / (m - 1)

               = [Σ xi^2 - (1/m)(Σ xi)^2] / (m - 1)

where each sum is taken over i = 1, ..., m.
     Step 3.  Compute the two parameters, h and γ (lowercase gamma), as
follows:

                                 h = (n - m)/n

and

                             γ = Sd^2/(xd - DL)^2

where n is the total number of observations (i.e., above and below the
detection limit), and where DL is equal to the detection limit.

     These values are then used to determine the value of the parameter λ
(lowercase lambda) from Table 7 in Appendix B.

     Step 4.  Estimate the corrected sample mean, which accounts for the data
below detection limit, as follows:

                            X = xd - λ(xd - DL)

     Step 5.  Estimate the corrected sample standard deviation, which accounts
for the data below detection limit, as follows:

                       S = [Sd^2 + λ(xd - DL)^2]^(1/2)

     Step 6.   Use the corrected values  of  X and S  in  the procedure for con-
structing a  tolerance interval  (Section 5.3)  or a  confidence  interval  (Sec-
tion 6.2.1).

REFERENCE

Cohen, A. C.,  Jr.  1959.   "Simplified  Estimators for the Normal Distribution
When Samples are Singly Censored or Truncated."  Technometrics.  1:217-237.

EXAMPLE

     Table 8-3 contains data on sulfate  concentrations.  Three observations of
the  24 were  below  the  detection  limit  of  1,450  mg/L  and  are  denoted  by
"< 1,450" in the  table.

     Step 1.  Calculate the mean from the m = 21 values above detection

                                 xd = 1,771.9

     Step 2.  Calculate the sample variance from the 21 quantified  values

                                 Sd^2 = 8,593.69


                     TABLE 8-3.  EXAMPLE DATA FOR COHEN'S TEST
                            Sulfate concentration (mg/L)
                                         1,850
                                         1,760
                                       < 1,450
                                         1,710
                                         1,575
                                         1,475
                                         1,780
                                         1,790
                                         1,780
                                       < 1,450
                                         1,790
                                         1,800
                                       < 1,450
                                         1,800
                                         1,840
                                         1,820
                                         1,860
                                         1,780
                                         1,760
                                         1,800
                                         1,900
                                         1,770
                                         1,790
                                         1,780
            DL = 1,450 mg/L

            Note: A symbol "<" before a number indicates that the value
            is not detected.   The number following is then the limit of
            detection.
     Step 3.  Determine

                            h = (24-21)/24 = 0.125

and

                   γ = 8,593.69/(1,771.9 - 1,450)^2 = 0.083

     Enter Table 7 of Appendix B at h = 0.125 and γ = 0.083 to determine the
value of λ.  Since the table does not contain these entries exactly, double
linear interpolation was used to estimate λ = 0.14986.
REMARK

     For the interested reader, the details of the double linear interpolation
are provided.

     The values from Table 7 between which the user needs to interpolate are:


                γ            h = 0.10         h = 0.15

               0.05          0.11431          0.17935
               0.10          0.11804          0.18479


     There are 0.025 units between 0.10 and 0.125 on the h-scale.  There are
0.05 units between 0.10 and 0.15.  Therefore, the value of interest (0.125)
lies (0.025/0.05 * 100) = 50% of the distance along the interval between 0.10
and 0.15.  To linearly interpolate between the tabulated values on the h-axis,
the range between the values must be calculated, the value that is 50% of the
distance along the range must be computed, and then that value must be added to
the lower point on the tabulated values.  The result is the interpolated
value.  The interpolated points on the h-scale for the current example are:

          0.17935 - 0.11431 = 0.06504        0.06504 * 0.50 = 0.03252
          0.11431 + 0.03252 = 0.14683

          0.18479 - 0.11804 = 0.06675        0.06675 * 0.50 = 0.033375
          0.11804 + 0.033375 = 0.151415

     On the γ-axis there are 0.033 units between 0.05 and 0.083.  There are
0.05 units between 0.05 and 0.10.  The value of interest (0.083) lies
(0.033/0.05 * 100) = 66% of the distance along the interval between 0.05 and
0.10.  The interpolated point on the γ-axis is:

          0.151415 - 0.14683 = 0.004585      0.004585 * 0.66 = 0.0030261
          0.14683 + 0.0030261 = 0.14986

     Thus, λ = 0.14986.
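
     The double linear interpolation can also be written as a small routine.
The sketch below is an illustration only and is not part of the original
guidance; the function name interp_table7 is hypothetical, and the routine
holds only the four Table 7 entries quoted in this remark, so it is valid only
for h between 0.10 and 0.15 and gamma between 0.05 and 0.10.

```python
def interp_table7(h, gamma):
    """Double linear interpolation among four entries of Table 7.

    Only the four entries quoted in the remark are stored, so the
    arguments must satisfy 0.10 <= h <= 0.15 and 0.05 <= gamma <= 0.10.
    """
    h_lo, h_hi = 0.10, 0.15
    g_lo, g_hi = 0.05, 0.10
    # Keys are (gamma, h); values are the tabulated parameter.
    table = {
        (0.05, 0.10): 0.11431, (0.05, 0.15): 0.17935,
        (0.10, 0.10): 0.11804, (0.10, 0.15): 0.18479,
    }
    if not (h_lo <= h <= h_hi and g_lo <= gamma <= g_hi):
        raise ValueError("arguments fall outside the stored table cell")
    frac_h = (h - h_lo) / (h_hi - h_lo)          # 50% in the example
    frac_g = (gamma - g_lo) / (g_hi - g_lo)      # 66% in the example
    # First interpolate along h within each gamma row.
    row_lo = table[(g_lo, h_lo)] + frac_h * (table[(g_lo, h_hi)] - table[(g_lo, h_lo)])
    row_hi = table[(g_hi, h_lo)] + frac_h * (table[(g_hi, h_hi)] - table[(g_hi, h_lo)])
    # Then interpolate along gamma between the two row results.
    return row_lo + frac_g * (row_hi - row_lo)

lam = interp_table7(h=0.125, gamma=0.083)
print(round(lam, 5))
```

     Interpolating along h first and then along gamma follows the order used
in the remark; for linear interpolation the opposite order gives the same
result.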

     Steps 4 and 5.  The corrected sample mean and standard deviation are then
estimated as follows:

              X = 1,771.9 - 0.14986(1,771.9 - 1,450) = 1,723.66

            S = [8,593.69 + 0.14986(1,771.9 - 1,450)^2]^(1/2) = 155.31

     Step  6.   These modified estimates  of the mean, X =  1723.66,  and of the
standard deviation,  S =  155.31, would  be  used in the tolerance or confidence
interval  procedure.   For  example,  if  the  sulfate  concentrations  represent
background at a facility, the upper  95%  tolerance limit becomes

                     1,723.7 + (155.3)(2.309) = 2,082.3 mg/L


Observations from  compliance wells  in  excess  of  2,082 mg/L would  give sta-
tistically significant evidence of contamination.
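
     The arithmetic of this example can be reproduced with a short script.
The Python sketch below is an illustration under stated assumptions and is not
part of the original guidance: the function names are hypothetical, and the
Table 7 parameter (0.14986) and the tolerance factor (2.309) are taken from
the text rather than computed.

```python
import math
from statistics import mean, variance

# The 21 quantified sulfate values from Table 8-3 (mg/L); the three
# "< 1,450" results are represented only through n and the detection limit.
detected = [1850, 1760, 1710, 1575, 1475, 1780, 1790, 1780, 1790, 1800,
            1800, 1840, 1820, 1860, 1780, 1760, 1800, 1900, 1770, 1790, 1780]
n = 24        # total number of observations
dl = 1450     # detection limit, mg/L

def cohen_inputs(values, n_total, detection_limit):
    """Steps 1-3: mean and variance of the detects, then h and gamma."""
    m = len(values)
    xd = mean(values)
    sd2 = variance(values)                  # sample variance, divisor m - 1
    h = (n_total - m) / n_total
    gamma = sd2 / (xd - detection_limit) ** 2
    return xd, sd2, h, gamma

def cohen_adjust(xd, sd2, detection_limit, lam):
    """Steps 4-5: corrected mean and standard deviation."""
    mean_adj = xd - lam * (xd - detection_limit)
    sd_adj = math.sqrt(sd2 + lam * (xd - detection_limit) ** 2)
    return mean_adj, sd_adj

xd, sd2, h, gamma = cohen_inputs(detected, n, dl)
lam = 0.14986                               # Table 7 value quoted in the text
mean_adj, sd_adj = cohen_adjust(xd, sd2, dl, lam)
upper_limit = mean_adj + 2.309 * sd_adj     # factor 2.309 as given in the text
print(round(xd, 1), round(sd2, 2), h, round(gamma, 3))
print(round(mean_adj, 2), round(sd_adj, 2), round(upper_limit, 1))
```

     Small differences in the last digit relative to the worked example can
arise because the script carries full precision while the text rounds
intermediate values.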

INTERPRETATION

     Cohen's method provides maximum likelihood estimates of the mean and
variance of a censored normal distribution, that is, of observations that
follow a normal distribution except for those below a limit of detection,
which are reported as "not detected."  The modified estimates reflect the fact
that the not detected observations are below the limit of detection, but not
necessarily zero.  The large sample properties of the modified estimates allow
them to be used with the normal theory procedures as a means of adjusting
for not detected values in the data.  Use of Cohen's method in more
complicated calculations, such as those required for analysis of variance
procedures, requires special consideration from a professional statistician.

8.2  OUTLIERS

     A  ground-water  constituent  concentration value that  is much  different
from most other values  in a data set  for the  same  ground-water constituent
concentration  can  be  referred  to  as  an  "outlier."    Possible  reasons  for
outliers can be:

     •    A catastrophic unnatural occurrence such as a spill;

     •    Inconsistent sampling or analytical chemistry methodology that may
          result in laboratory contamination or other anomalies;

     •    Errors in the transcription of data values or decimal points; and

     •    True but extreme ground-water constituent concentration
          measurements.

     There  are  several   tests  to  determine  if there is  statistical  evidence
that an observation  is  an  outlier.   The reference  for the test presented here
is ASTM paper E178-75.

PURPOSE

     The  purpose of  a  test  for  outliers  is  to determine  whether  there  is
statistical evidence that an observation that appears extreme does not fit the
distribution of  the  rest of  the data.   If  a suspect observation  is identified
as  an  outlier,   then steps need  to  be  taken  to determine  whether it  is the
result of an error or a valid extreme observation.

PROCEDURE

     Let the sample of observations of a hazardous constituent of ground water
be denoted by X1, ..., Xn.  For specificity, assume that the data have been
ordered and that the largest observation, denoted by Xn, is suspected of being
an outlier.  Generally, inspection of the data suggests values that do not
appear to belong to the data set.  For example, if the largest observation is
an order of magnitude larger than the other observations, it would be suspect.

     Step 1.  Calculate the mean, X and the standard deviation, S, of the data
including all observations.

     Step 2.  Form the statistic, Tn:

                                Tn = (Xn - X)/S

Note that Tn is the difference between the largest observation and the sample
mean, divided by the sample standard deviation.

     Step 3.  Compare the statistic Tn to the critical value given the sample
size, n, in Table 8 in Appendix B.  If the Tn statistic exceeds the critical
value from the table, this is evidence that the suspect observation, Xn, is a
statistical outlier.

     Step 4.   If  the value is  identified as an outlier, one  of the actions
outlined  below should  be  taken.    (The  appropriate  action depends on what can
be  learned  about  the  observation.)  The records  of the sampling and analysis
of  the  sample  that  led to  it should  be  investigated to determine whether the
outlier resulted from an error that can be identified.

     •    If an error (in transcription, dilution, analytical procedure, etc.)
can be identified and the correct value recovered, the observation should be
replaced by its corrected value and the appropriate statistical analysis done
with the corrected value.

     •    If it can be determined that the observation is in error, but the
correct value cannot be determined, then the observation should be deleted
from the data set and the appropriate statistical analysis performed.  The
fact that the observation was deleted and the reason for its deletion should
be reported when reporting the results of the statistical analysis.

     •    If no error in the value can be documented, then it must be assumed
that the observation is a true but extreme value.  In this case it must not be
altered.  It may be desirable to obtain another sample to confirm the
observation.  However, analysis and reporting should retain the observation
and state that no error was found in tracing the sample that led to the
extreme observation.

EXAMPLE

     Table  8-4 contains  19 values of  total  organic  carbon (TOC)  that were
obtained  from  a monitoring well.   Inspection shows one value which at 11,000
mg/L  is  nearly an  order  of magnitude larger than  most  of the other observa-
tions.   It  is a suspected outlier.
                TABLE 8-4.  EXAMPLE DATA  FOR TESTING  FOR  AN  OUTLIER
                            Total organic carbon  (mg/L)
                                         1,700
                                         1,900
                                         1,500
                                         1,300
                                        11,000
                                         1,250
                                         1,000
                                         1,300
                                         1,200
                                         1,450
                                         1,000
                                         1,300
                                         1,000
                                         2,200
                                         4,900
                                         3,700
                                         1,600
                                         2,500
                                         1,900
     Step 1.  Calculate the mean and standard deviation of the data.

                            X =  2300 and  S  =  2325.9

     Step 2.  Calculate the statistic T19.

                       T19 = (11,000 - 2,300)/2,325.9 = 3.74

     Step 3.  Referring to Table 8 of Appendix B for the upper 5% significance
level, with  n =  19,  the critical  value  is  2.532.   Since  the value  of  the
statistic T19   =  3.74 is greater than  2.532,  there is  statistical  evidence
that the largest observation is an outlier.

     Step 4.  In  this  case, tracking  the data revealed  that the unusual value
of 11,000 resulted  from  a keying error  and  that  the  correct value was 1,100.
This correction was then made  in the data.
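
     The outlier calculation for this example can be sketched in a few lines
of Python.  The sketch is illustrative and is not part of the original
guidance; the function name outlier_statistic is hypothetical, and the
critical value 2.532 is the Table 8 entry quoted in Step 3 rather than a
computed quantity.

```python
from statistics import mean, stdev

# Total organic carbon values from Table 8-4 (mg/L).
toc = [1700, 1900, 1500, 1300, 11000, 1250, 1000, 1300, 1200, 1450,
       1000, 1300, 1000, 2200, 4900, 3700, 1600, 2500, 1900]

def outlier_statistic(values):
    """Return (mean, standard deviation, Tn) for the largest observation."""
    xbar = mean(values)
    s = stdev(values)                 # sample standard deviation, divisor n - 1
    tn = (max(values) - xbar) / s
    return xbar, s, tn

CRITICAL_N19 = 2.532                  # upper 5% value for n = 19, from the text

xbar, s, t19 = outlier_statistic(toc)
is_outlier = t19 > CRITICAL_N19
print(round(xbar, 1), round(s, 1), round(t19, 2), is_outlier)
```

     The script flags the observation only; as Step 4 requires, whether the
value is corrected, deleted, or retained depends on what the sampling and
analysis records show.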

INTERPRETATION

     An observation that  is 4  or 5 times  as  large as the  rest  of  the data is
generally viewed with suspicion.  An observation that is an order of magnitude
different could arise by a common error of  misplacing a decimal.  The test for
an outlier provides a statistical basis for determining whether an observation
is statistically different from the rest of  the  data.   If it is,  then it is a
statistical  outlier.   However, a  statistical  outlier  may  not be  dropped  or
altered just because it has been identified  as  an  outlier.   The test provides
a formal  identification of an  observation as  an  outlier, but does  not identify
the cause of the difference.

     Whether or not a statistical test  is  done,  any  suspect data  point should
be checked.   An observation  may be  corrected  or dropped  only if  it  can  be
determined  that  an  error has  occurred.   If the error  can be  identified  and
corrected (as  in  transcription or keying) the  correction  should  be made  and
the corrected values used.   A value that  is demonstrated to  be incorrect  may
be deleted  from  the data.  However,  if no specific error  can  be documented,
the observation must be  retained  in the data.   Identification  of an observa-
tion  as  an outlier  but with  no error  documented could  be  used  to suggest
resampling to confirm the value.
-------
              APPENDIX A


GENERAL STATISTICAL CONSIDERATIONS AND
    GLOSSARY OF STATISTICAL TERMS
-------
                      GENERAL STATISTICAL CONSIDERATIONS
FALSE ALARMS OR TYPE I ERRORS

     The statistical analysis of data from ground-water monitoring at RCRA
sites has as its goal the determination of whether the data provide evidence
of the presence of, or an increase in the level of, contamination.  In the case
of detection monitoring, the goal of the statistical analysis is to determine
whether statistically significant evidence of contamination exists.  In the
case of compliance monitoring, the goal is to determine whether statistically
significant evidence of concentration levels exceeding compliance limits
exists.  In monitoring sites in corrective action, the goal is to determine
whether levels of the hazardous constituents are still above compliance limits
or have been reduced to or below the compliance limit.

     These questions are addressed by the use of hypothesis tests.  In the
case of detection monitoring, it is hypothesized that a site is not
contaminated; that is, the hazardous constituents are not present in the
ground water.  Samples of the ground water are taken and analyzed for the
constituents in question.  A hypothesis test is used to decide whether the
data indicate the presence of the hazardous constituent.  The test consists of
calculating one or more statistics from the data and comparing the calculated
results to some prespecified critical levels.

     In performing  a  statistical  test,  there  are four possible outcomes.  Two
of  the  possible outcomes  result  in  the  correct decision:   (a)  the  test may
correctly indicate  that no contamination is  present  or (b)  the test may cor-
rectly  indicate  the presence of  contamination.   The other  two possibilities
are  errors:   (c)  the test may indicate that  contamination is present when in
fact  it  is  not or  (d)  the test may  fail  to  detect contamination  when it is
present.

     If  the stated hypothesis  is that no  contamination  is  present  (usually
called  the  null  hypothesis)  and  the  test  indicates  that  contamination  is
present  when  in fact it is  not,  this is called  a Type I error.  Statistical
hypothesis  tests  are generally  set   up to  control  the  probability  of Type I
error to be no more than a specified  value, called the significance level, and
usually denoted by a.   Thus in detection monitoring, the  null hypothesis would
be  that  the  level  of  each hazardous constituent is  zero (or  at  least below
detection).   The test  would  reject  this hypothesis  if some measure of concen-
tration  were  too  large, indicating contamination.  A  Type I  error  would be a
false alarm or a triggering event that is inappropriate.

     In  compliance  monitoring,  the  null hypothesis is  that  the level of each
hazardous  constituent  is  less  than  or equal  to the  appropriate  compliance


limit.   For  the purpose of  setting up the statistical procedure,  the simple
null  hypothesis  that the  level is  equal to  the compliance  limit would  be
used.   As  in detection monitoring,  the test would  indicate  contamination if
some  measure  of  concentration  is too  large.   A  false alarm or  Type  I  error
would occur  if the  statistical procedure  indicated  that  levels  exceed  the
appropriate compliance limits when,  in fact,  they do not.   Such an error would
be a false alarm in that it would indicate falsely that compliance limits were
being exceeded.

PROBABILITY OF DETECTION AND TYPE II ERROR

     The other  type  of  error that  can occur is  called  a Type II  error.   It
occurs  if  the test  fails  to detect  contamination that  is  present.  Thus  a
Type II  error is a missed  detection.   While  the  probability  of a Type I  error
can be  specified, since it  is the probability  that the test  will  give a false
alarm, the probability of  a Type II  error depends on several  factors, includ-
ing the  statistical test,  the sample size, and  the significance level or prob-
ability  of Type I error.   In  addition,  it depends on the degree of contamina-
tion present.  In general,  the probability of a Type II error decreases as the
level of contamination  increases.   Thus  a test may be likely to miss low lev-
els  of  contamination, less  likely  to miss  moderate contamination, and  very
unlikely to miss high levels of contamination.

     One can discuss  the probability of  a Type II error  as  the probability of
a  missed  detection,   or one can discuss  the complement  (one minus  the  prob-
ability  of Type II error)  of this probability.   The complement, or probability
of detection, is also called  the power of the  test.   It  depends on the magni-
tude of  the  contamination  so that  the power or  probability  of detecting con-
tamination increases with the degree of contamination.

      If  the probability of  a  Type I  error is specified,  then for a given sta-
tistical test,  the power  depends  on the  sample  size and the  alternative  of
interest.   In order  to specify a  desired power  or  probability of detection,
one must specify the  alternative that should  be detected.   Since generally the
power will  increase  as the alternative differs  more  and more from the  null
hypothesis, one  usually tries  to specify  the  alternative that  is  closest to
the null hypothesis, yet enough different that  it is important to detect.

      In  the  detection monitoring situation,  the  null  hypothesis  is that  the
concentration of  the  hazardous  constituent is  zero  (or at  least  below detec-
tion).   In this case the  alternative  of interest is that there  is  a  concen-
tration of the hazardous constituent that  is above the detection  limit and is
large enough so that  the monitoring procedure should detect it.  Since it is a
very difficult problem to select a concentration  of each  hazardous constituent
that  should  be  detectable  with  specified power,   a more useful  approach  is to
determine  the  power  of  a test  at several  alternatives and decide whether the
procedure  is acceptable on the  basis of this  power function rather than on the
power against a single alternative.
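
     The idea of a power function can be illustrated numerically.  The
Python sketch below is not a procedure from this guidance; it assumes, purely
for illustration, a one-sided z-test on a mean with known standard deviation,
and the function name power_one_sided_z and the chosen alternatives and sample
sizes are hypothetical.  It tabulates the probability of detection at several
alternatives rather than at a single one.

```python
from statistics import NormalDist

def power_one_sided_z(delta, sigma, n, alpha=0.05):
    """Power of a one-sided z-test (illustrative assumption, known sigma).

    delta : true increase in the mean above the null value
    sigma : standard deviation of a single measurement
    n     : number of independent samples
    alpha : probability of a Type I error (significance level)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # The test rejects when the standardized mean exceeds z_crit.
    return 1 - std_normal.cdf(z_crit - delta * n ** 0.5 / sigma)

alternatives = (0.0, 0.5, 1.0, 2.0)   # increases, in units of sigma
sample_sizes = (4, 8, 16)
powers = {
    (delta, n): power_one_sided_z(delta, sigma=1.0, n=n)
    for delta in alternatives
    for n in sample_sizes
}
for delta in alternatives:
    print(delta, [round(powers[(delta, n)], 3) for n in sample_sizes])
```

     At delta = 0 the computed value equals the significance level, which is
the false alarm rate; the values then rise with both the size of the increase
and the sample size.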

      In  order to  increase  the  power,  a larger   sample must  be  taken.   This
would mean sampling  at  more  frequent  intervals.   There is a limit to how much
can  be  achieved,  however.    In  cases with limited  water  flow, it  may not be
possible to  sample wells as  frequently as desired.   If samples close together

in  time  prove  to be  correlated,  this  correlation  reduces  the  information
available from  the different  samples.   The  additional cost  of  sampling and
analysis will also impose practical limitations on the sample size that can be
used.

     Additional wells could also be used to increase the performance of the
test.  The additional monitoring wells would primarily be helpful in ensuring
that a plume would not escape detection by missing the monitoring wells.
However, in some situations the additional wells would contribute to a larger
sample size and so improve the power.

     In compliance monitoring  the  emphasis  is  on determining whether addi-
tional contamination  has occurred, raising the  concentration  above  a compli-
ance limit.   If  the  compliance limit is  determined  from  the  background well
levels, the  null  hypothesis  is  that the  difference between the background and
compliance well concentrations  is zero.    The alternative  of interest is that
the compliance well concentration exceeds  the background concentration.  This
situation  is essentially  the same  for  power considerations  as   that  of the
detection monitoring situation.

     If compliance monitoring  is  relative  to  a  compliance limit (MCL or ACL),
specified as a constant, then the situation is different.  Here the null hypo-
thesis  is  that the  concentration  is less  than or  equal  to  the compliance
limit, with  equality  used  to establish  the test.  The alternative is that the
concentration is  above  the  compliance  limit.   In order to specify  power,  a
minimum amount above the compliance limit  must be established and power speci-
fied for that alternative or the power function evaluated for several possible
alternatives.

SAMPLE DESIGNS AND ASSUMPTIONS

     As discussed in  Section 2,  the  sample design to  be  employed at a regu-
lated  unit  will   primarily  depend  on  the  hydrogeologic  evaluation  of  the
site.   Wells should  be  sited to provide  multiple background  wells hydrauli-
cally  upgradient   from  the  regulated unit.   The  background wells  allow for
determination of  natural  spatial  variability in ground-water  quality.   They
also  allow  for estimation  of background  levels with  greater  precision than
would  be  possible from a single  upgradient well.   Compliance  wells should be
sited  hydraulically  downgradient to  each  regulated  unit.   The  location and
spacing of  the  wells, as  well  as the depth  of  sampling,  would be determined
from the  hydrogeology to ensure that at least one of the wells should inter-
cept a plume of contamination of reasonable size.

     Thus  the assumed sample  design is for  a  sample  of wells to  include  a
number of background  wells  for the site,  together with a number of compliance
wells  for each  regulated unit at the site.  In the event that a site has only
a  single  regulated unit,  there  would be  two groups  of wells,  background and
compliance.   If a site  has  multiple  regulated units, there would  be a set of
compliance wells  for each regulated unit,  allowing for detection monitoring or
compliance monitoring  separately at each regulated unit.

     Data from the analysis  of the water at each well are initially assumed to
follow  a  normal  distribution.   This is likely  to be the  case  for detection

monitoring of  analytes in  that  levels should be  near zero and  errors would
likely represent  instrument or other  sampling  and analysis variability.   If
contamination is  present,  then the distribution of the  data may  be skewed to
the right,  giving a few  very  large values.   The  assumption of  normality of
errors in the  detection  monitoring case is quite  reasonable,  with deviations
from normality likely  indicating some  degree  of  contamination.   Tests of nor-
mality are recommended to ensure that the data  are  adequately represented by
the normal distribution.

     In the compliance monitoring  case,  the data  for  each  analyte will again
initially be  assumed  to   follow  the  normal  distribution.   In  this case, how-
ever,   since  there  is  a   nonzero concentration  of  the  analyte in  the ground
water, normality is more  of an issue.  Tests of normality are recommended.  If
evidence  of  nonnormality  is  found,  the data  should  be  transformed or  a
distribution-free test be used to  determine  whether  statistically significant
evidence of contamination exists.

     The  standard  situation would  result in  multiple  samples  (taken  at dif-
ferent times) of  water from each well.  The  wells would form  groups of back-
ground wells  and  compliance wells  for each  regulated unit.   The statistical
procedures  recommended would  allow for  testing  each  compliance  well  group
against  the  background  group.   Further, tests  among  the compliance  wells
within a  group are  recommended  to determine whether  a single well  might be
intercepting an isolated plume.  The specific procedures discussed and
recommended in the preceding sections should cover the majority of cases.  They
do not cover all of the possibilities.  In the event that none of the
procedures described and illustrated appears to apply to a particular case at a
given regulated site, consultation with a statistician should be sought to
determine an appropriate statistical procedure.

     The following  approach  is recommended.   If  a  regulated unit is in detec-
tion monitoring,  it will  remain  in  detection  monitoring  until  or unless there
is statistically significant evidence of contamination, in which case it would
be placed in compliance monitoring.   Likewise,  if  a  regulated  unit is in com-
pliance monitoring,  it will remain  in compliance  monitoring unless  or until
there  is statistically significant evidence of further contamination, in which
case it would move  into corrective action.

     In monitoring  a regulated unit with  multiple  compliance wells, two types
of significance levels are  considered.   One  is  an  experimentwise significance
level   and the other is a  comparisonwise  significance  level.  When a procedure
such as  analysis  of variance  is used  that considers  several  compliance wells
simultaneously,  the  significance   is   an experimentwise   significance.    If
individual well comparisons are  made,  each of those comparisons  is done at a
comparisonwise significance  level.

     The  fact  that many  comparisons will be made at  a regulated  unit with
multiple  compliance wells can make  the  probability  that at least  one of the
comparisons will  be incorrectly significant  too high.  To control  the false
positive  rate,  multiple   comparisons procedures  are  allowed that  control  the
experimentwise significance  level to be 5%.   That  is,  the probability that one
or more  of  the comparisons will falsely  indicate  contamination is controlled


                                     A-6
-------
at 5%.   However,  to provide  some  assurance of adequate power  to  detect real
contamination,  the  comparisonwise  significance  level  for  comparing  each
individual well to the background is required to be no less than 1%.

     Control of the experimentwise significance level via multiple comparisons
procedures is allowed for comparisons among several wells.   However, use of an
experimentwise significance level for the comparisons among the different haz-
ardous constituents is not permitted.   Each hazardous constituent  to be moni-
tored for in the permit must be treated separately.
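
     The relationship between the two significance levels can be illustrated
with a short calculation.  The sketch below is illustrative only and is not
part of the regulation.  It assumes that the comparisons are statistically
independent, splits the experimentwise level equally among the comparisons
(the Bonferroni approach), and applies the 1% floor on the comparisonwise
level described above.  The function names are hypothetical.

```python
def experimentwise_rate(alpha_c: float, m: int) -> float:
    """Probability of at least one false positive among m independent
    comparisons, each carried out at comparisonwise level alpha_c."""
    return 1.0 - (1.0 - alpha_c) ** m


def comparisonwise_level(alpha_e: float, m: int, floor: float = 0.01) -> float:
    """Equal (Bonferroni) split of the experimentwise level alpha_e over
    m comparisons, never allowed to fall below the stated floor."""
    return max(alpha_e / m, floor)


if __name__ == "__main__":
    # Number of compliance wells compared with background (hypothetical).
    for m in (1, 3, 5, 10):
        a_c = comparisonwise_level(0.05, m)
        print(m, round(a_c, 4), round(experimentwise_rate(a_c, m), 4))
```

     Under these assumptions the 1% floor becomes the binding constraint once
more than five comparisons are made, and the chance of at least one false
positive then rises above 5%.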
                                     A-7
-------
                        GLOSSARY OF STATISTICAL TERMS
                 (underlined terms are explained subsequently)

Alpha (α)                     A Greek letter used to denote the significance
                              level or probability of a Type I error.

Alpha-error                   Sometimes used for Type I error.

Alternative hypothesis        An alternative hypothesis specifies that the
                              underlying distribution differs from the null
                              hypothesis.  The alternative hypothesis usually
                              specifies the value of a parameter, for example
                              the mean concentration, that one is trying to
                              detect.

Arithmetic average            The arithmetic average of a set of observations
                              is their sum divided by the number of
                              observations.

Autocorrelation               This is a measure of dependence among sequential
                              observations from the same well.  There are
                              different orders of autocorrelation, depending
                              on how far apart in time the correlation
                              persists.  For example, the first-order
                              autocorrelation is the correlation between
                              successive pairs of observations.

Biased estimator              A biased estimator is an estimator that has an
                              expectation or average value that is not equal
                              to the parameter it is estimating.  Often the
                              bias decreases as the sample size increases.

Bonferroni t                  This is an approach, developed by Bonferroni, to
                              control the experimentwise error rate in
                              multiple comparisons.  The number of comparisons
                              or hypotheses to be tested is fixed (at k) and a
                              "t" statistic is computed to test each of
                              these.  Instead of the usual "t" table, where
                              each of these tests would be done at the
                              significance level alpha, a special table is
                              used so that each test is done at level alpha/k.
                              This ensures that the experimentwise error rate
                              is no more than alpha.
                                      A-8
-------
Comparisonwise error rate     This term is used in association with multiple
                              comparisons.  It refers to the probability of an
                              error occurring on a single comparison of
                              several that might be done.  It is computed
                              assuming that the single comparison or
                              hypothesis test is the only one being done.

Composite hypothesis          This is a hypothesis for which not all relevant
                              parameters are specified.  A composite
                              hypothesis is made up of two or more simple
                              hypotheses.  For example, the hypothesis that
                              the data are normally distributed with
                              unspecified mean and variance is a composite
                              hypothesis.

Confidence coefficient        The confidence coefficient of a confidence
                              interval for a parameter is the probability that
                              the random interval constructed from the sample
                              data contains the true value of the parameter.
                              The confidence coefficient is related to the
                              significance level of an associated hypothesis
                              test by the fact that the significance level (in
                              percent) is one hundred minus the confidence
                              coefficient (in percent).

Confidence interval           A confidence interval for a parameter is a
                              random interval constructed from sample data in
                              such a way that the probability that the
                              interval will contain the true value of the
                              parameter is a specified value.

Cumulative distribution       The distribution function for a random variable,
   function                   X, is a function that specifies the probability
                              that X is less than or equal to t, for all real
                              values of t.

Distribution-free             This is sometimes used as a synonym for
                              nonparametric.  A statistic is distribution-free
                              if its distribution does not depend upon which
                              specific distribution function (in a large
                              class) the observations follow.

Distribution function         This document uses "Cumulative Distribution
                              Function" and "Distribution Function"
                              interchangeably.  See Cumulative Distribution
                              Function.

Estimator                     An estimator is a statistic computed from the
                              observed data.  It is used to estimate a
                              parameter of interest; for example, the
                              population mean.  Often estimators are the
                              sample equivalents of the population parameters.
                                      A-9
-------
Experimentwise error rate     This term refers to multiple comparisons.  If a
                              total of n decisions are made about comparisons
                              (for example of compliance wells to background
                              wells) and x of the decisions are wrong, then
                              the experimentwise error rate is x/n.  The
                              probability that X exceeds zero is the
                              experimentwise significance.

Hypothesis                    This is a formal statement about a parameter of
                              interest and the distribution of a statistic.
                              It is usually used as a null hypothesis or an
                              alternative hypothesis.  For example, the null
                              hypothesis might specify that ground water had a
                              zero concentration of benzene and that
                              analytical errors followed a normal distribution
                              with mean zero and standard deviation 1 ppm.

Independence                  A set of events are independent if the
                              probability of the joint occurrence of any
                              subset of the events factors into the product of
                              the probabilities of the events.  A set of
                              observations is independent if the joint
                              distribution function of the random errors
                              associated with the observations factors into
                              the product of the distribution functions.

Mean                          Arithmetic average.

Median                        This is the middle value of a sample when the
                              observations have been ordered from least to
                              greatest.  If the number of observations is odd,
                              it is the middle observation.  If the number of
                              observations is even, it is customary to take
                              the midpoint between the two middle
                              observations.  For a distribution, the median is
                              a value such that the probability is one-half
                              that an observation will fall above or below the
                              median.

Multiple comparison           This is a statistical procedure that makes a
   procedure                  large number of decisions or comparisons on one
                              set of data.  For example, at a sampling period,
                              several compliance well concentrations may be
                              compared to the background well concentration.

Nonparametric statistical     A nonparametric statistical procedure is a
   procedure                  statistical procedure that has desirable
                              properties that hold under mild assumptions
                              regarding the data.  Typically the procedure is
                              valid for a large class of distributions rather
                              than for a specific distribution of the data
                              such as the normal.
                                     A-10
-------
Normal population,            The errors associated with the observations
normality                     follow the normal or Gaussian distribution
                              function.

Null hypothesis               A null hypothesis specifies the underlying
                              distribution of the data completely.  Often the
                              null distribution specifies that there is no
                              difference between the mean concentration in
                              background well water samples and compliance
                              well water samples.  Typically, the null
                              hypothesis is a simple hypothesis.

One-sided test                A one-sided test is appropriate if
                              concentrations higher than those specified by
                              the null hypothesis are of concern.  A one-sided
                              test only rejects for differences that are large
                              and in a prespecified direction.

One-sided tolerance limit     This is an upper limit on observations from a
                              specified distribution.

One-sided confidence limit    This is an upper limit on a parameter of a
                              distribution.

Order statistics              The sample values observed after they have been
                              arranged in increasing order.

Outlier                       An outlier is an observation that is found to
                              lie an unusually long way from the rest of the
                              observations in a series of replicate
                              observations.

Parameter                     A parameter is an unknown constant associated
                              with a population.  For example, the mean
                              concentration of a hazardous constituent in
                              ground water is a parameter of interest.

Percentile                    A percentile of a distribution is a value below
                              which a specified proportion or percent of the
                              observations from that distribution will fall.

Post hoc comparison           This is a comparison, say between hazardous
                              constituent concentrations in two wells, that
                              was found to be of interest after the data were
                              collected.  Special methods must be used to
                              determine significance levels for post hoc
                              comparisons.

Power                         The power of a test is the probability that the
                              test will reject under a specified alternative
                              hypothesis.  This is one minus the probability
                              of a Type II error.  The power is a measure of
                              the test's ability to detect a difference of
                              specified size from the null hypothesis.
                                     A-11
-------
Sample standard deviation     This is the square root of the sample variance.

Sample variance               This is a statistic (computed on a sample of
                              observations rather than on the whole
                              population) that measures the variability or
                              spread of the observations about the sample
                              mean.  It is the sum of the squared differences
                              from the sample mean, divided by the number of
                              observations less one.

Serial correlation            This is the correlation of observations spaced a
                              constant interval apart in a series.  For
                              example, the first order serial correlation is
                              the correlation between adjacent observations.
                              The first order serial correlation is found by
                              correlating the pairs consisting of the first
                              and second, second and third, third and fourth,
                              etc., observations.

Significance level            Sometimes referred to as the alpha level, the
                              significance level of a test is the probability
                              of falsely rejecting a true null hypothesis.
                              The probability of a Type I error.

Simple hypothesis             A hypothesis which completely specifies the
                              distribution of the observed random variables.
                              To completely define a distribution, both the
                              type of distribution and numeric values for the
                              parameters must be given.

Test statistic                A test statistic is a value computed from the
                              observed data.  This value is used to test a
                              hypothesis by relating the value to a
                              distribution table and rejecting the hypothesis
                              if the computed value falls in a region that has
                              low probability under the hypothesis being
                              tested.  A "t" statistic, an "F" statistic, and
                              a chi-squared statistic are examples.

Trend analysis                This refers to a collection of statistical
                              methods that analyze data to determine trends
                              over time.  The trends may be of various types,
                              steady increases (or decreases), or a step
                              increase at a point in time.

Type I error                  A Type I error occurs when a true null
                              hypothesis is rejected erroneously.  In the
                              monitoring context a Type I error occurs when a
                              test incorrectly indicates contamination or an
                              increase in contamination at a regulated unit.
                                     A-12
-------

Type II error                 A Type II error occurs when one  fails  to reject
                              a null  hypothesis  that  is false.   In  the  moni-
                              toring  context,  a  Type  II   error  occurs  when
                              monitoring fails  to  detect contamination or  an
                              increase  in  a  concentration  of  a  hazardous
                              constituent.

Unbiased estimator            An unbiased estimator  is an  estimator  that  has
                              zero bias.  That is,  its expectation is equal  to
                              the  parameter  it  is  estimating.    Its  average
                              value is the parameter.
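
     Several of the sample statistics defined in this glossary can be computed
directly.  The following is a minimal sketch using the Python standard
library; the concentration values are hypothetical and the helper name
lag1_serial_correlation is an assumption made for illustration, not a
procedure prescribed by this guidance.

```python
import statistics


def lag1_serial_correlation(values):
    """First order serial correlation: correlate the pairs
    (1st, 2nd), (2nd, 3rd), ... of a time-ordered series."""
    earlier, later = values[:-1], values[1:]
    mean_e = statistics.fmean(earlier)
    mean_l = statistics.fmean(later)
    cov = sum((a - mean_e) * (b - mean_l) for a, b in zip(earlier, later))
    ss_e = sum((a - mean_e) ** 2 for a in earlier)
    ss_l = sum((b - mean_l) ** 2 for b in later)
    return cov / (ss_e * ss_l) ** 0.5


# Hypothetical concentrations (ppm) from one well, in sampling order.
conc = [4.0, 5.0, 7.0, 6.0, 8.0, 9.0]

mean = statistics.fmean(conc)          # arithmetic average
variance = statistics.variance(conc)   # squared deviations divided by n - 1
std_dev = statistics.stdev(conc)       # square root of the sample variance
median = statistics.median(conc)       # midpoint of the two middle values
r1 = lag1_serial_correlation(conc)
```

     Note that statistics.variance divides by the number of observations less
one, in agreement with the definition of the sample variance given above.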
                                     A-13
-------
    APPENDIX B





STATISTICAL TABLES
       B-1
-------
                                CONTENTS
Table                                                               Page
  1        Percentiles of the χ² Distribution With
           ν Degrees of Freedom, χ²(ν,p)                             B-4
  2        95th Percentiles of the F-Distribution With ν1 and
           ν2 Degrees of Freedom, F(ν1,ν2,0.95)                      B-5
  3        95th Percentiles of the Bonferroni t-Statistics,
           t(ν, α/m)                                                 B-6
  4        Percentiles of the Standard Normal Distribution, Up       B-7
  5        Tolerance Factors (K) for One-Sided Normal Tolerance
           Intervals With Probability Level (Confidence Factor)
           γ = 0.95 and Coverage P = 95%                             B-9
  6        Percentiles of Student's t-Distribution                  B-10
  7        Values of the Parameter λ for Cohen's Estimates
           Adjusting for Nondetected Values                         B-11
  8        Critical Values for Tn (One-Sided Test) When the
           Standard Deviation Is Calculated From the Same Sample    B-12
                                   B-3
-------
              TABLE 1.   PERCENTILES OF THE χ² DISTRIBUTION WITH
                         ν DEGREES OF FREEDOM, χ²(ν,p)

                                      p
    ν      0.750    0.900    0.950    0.975    0.990    0.995    0.999
    1      1.323    2.706    3.841    5.024    6.635    7.879    10.83
    2      2.773    4.605    5.991    7.378    9.210    10.60    13.82
    3      4.108    6.251    7.815    9.348    11.34    12.84    16.27
    4      5.385    7.779    9.488    11.14    13.28    14.86    18.47
    5      6.626    9.236    11.07    12.83    15.09    16.75    20.52
    6      7.841    10.64    12.59    14.45    16.81    18.55    22.46
    7      9.037    12.02    14.07    16.01    18.48    20.28    24.32
    8      10.22    13.36    15.51    17.53    20.09    21.96    26.12
    9      11.39    14.68    16.92    19.02    21.67    23.59    27.88
   10      12.55    15.99    18.31    20.48    23.21    25.19    29.59
   11      13.70    17.28    19.68    21.92    24.72    26.76    31.26
   12      14.85    18.55    21.03    23.34    26.22    28.30    32.91
   13      15.98    19.81    22.36    24.74    27.69    29.82    34.53
   14      17.12    21.06    23.68    26.12    29.14    31.32    36.12
   15      18.25    22.31    25.00    27.49    30.58    32.80    37.70
   16      19.37    23.54    26.30    28.85    32.00    34.27    39.25
   17      20.49    24.77    27.59    30.19    33.41    35.72    40.79
   18      21.60    25.99    28.87    31.53    34.81    37.16    42.31
   19      22.72    27.20    30.14    32.85    36.19    38.58    43.82
   20      23.83    28.41    31.41    34.17    37.57    40.00    45.32
   21      24.93    29.62    32.67    35.48    38.93    41.40    46.80
   22      26.04    30.81    33.92    36.78    40.29    42.80    48.27
   23      27.14    32.01    35.17    38.08    41.64    44.18    49.73
   24      28.24    33.20    36.42    39.36    42.98    45.56    51.18
   25      29.34    34.38    37.65    40.65    44.31    46.93    52.62
   26      30.43    35.56    38.89    41.92    45.64    48.29    54.05
   27      31.53    36.74    40.11    43.19    46.96    49.64    55.48
   28      32.62    37.92    41.34    44.46    48.28    50.99    56.89
   29      33.71    39.09    42.56    45.72    49.59    52.34    58.30
   30      34.80    40.26    43.77    46.98    50.89    53.67    59.70
   40      45.62    51.80    55.76    59.34    63.69    66.77    73.40
   50      56.33    63.17    67.50    71.42    76.15    79.49    86.66
   60      66.98    74.40    79.08    83.30    88.38    91.95    99.61
   70      77.58    85.53    90.53    95.02    100.4    104.2    112.3
   80      88.13    96.58    101.9    106.6    112.3    116.3    124.8
   90      98.65    107.6    113.1    118.1    124.1    128.3    137.2
  100      109.1    118.5    124.3    129.6    135.8    140.2    149.4

SOURCE:  Johnson, Norman L. and F. C. Leone.  1977.  Statistics and Experimental
Design in Engineering and the Physical Sciences.  Vol. I.  Second Edition.  John
Wiley and Sons, New York.
                                      B-4
-------
            TABLE 2.  95th  PERCENTILES  OF THE F-DISTRIBUTION WITH
                  ν1 AND  ν2 DEGREES  OF  FREEDOM,  F(ν1,ν2,0.95)

                                 ν1 = 1 to 7
   ν2        1       2       3       4       5       6       7
    1    161.4   199.5   215.7   224.6   230.2   234.0   236.8
    2    18.51   19.00   19.16   19.25   19.30   19.33   19.35
    3    10.13    9.55    9.28    9.12    9.01    8.94    8.89
    4     7.71    6.94    6.59    6.39    6.26    6.16    6.09
    5     6.61    5.79    5.41    5.19    5.05    4.95    4.88
    6     5.99    5.14    4.76    4.53    4.39    4.28    4.21
    7     5.59    4.74    4.35    4.12    3.97    3.87    3.79
    8     5.32    4.46    4.07    3.84    3.69    3.58    3.50
    9     5.12    4.26    3.86    3.63    3.48    3.37    3.29
   10     4.96    4.10    3.71    3.48    3.33    3.22    3.14
   11     4.84    3.98    3.59    3.36    3.20    3.09    3.01
   12     4.75    3.89    3.49    3.26    3.11    3.00    2.91
   13     4.67    3.81    3.41    3.18    3.03    2.92    2.83
   14     4.60    3.74    3.34    3.11    2.96    2.85    2.76
   15     4.54    3.68    3.29    3.06    2.90    2.79    2.71
   16     4.49    3.63    3.24    3.01    2.85    2.74    2.66
   17     4.45    3.59    3.20    2.96    2.81    2.70    2.61
   18     4.41    3.55    3.16    2.93    2.77    2.66    2.58
   19     4.38    3.52    3.13    2.90    2.74    2.63    2.54
   20     4.35    3.49    3.10    2.87    2.71    2.60    2.51
   21     4.32    3.47    3.07    2.84    2.68    2.57    2.49
   22     4.30    3.44    3.05    2.82    2.66    2.55    2.46
   23     4.28    3.42    3.03    2.80    2.64    2.53    2.44
   24     4.26    3.40    3.01    2.78    2.62    2.51    2.42
   25     4.24    3.39    2.99    2.76    2.60    2.49    2.40
   26     4.23    3.37    2.98    2.74    2.59    2.47    2.39
   27     4.21    3.35    2.96    2.73    2.57    2.46    2.37
   28     4.20    3.34    2.95    2.71    2.56    2.45    2.36
   29     4.18    3.33    2.93    2.70    2.55    2.43    2.35
   30     4.17    3.32    2.92    2.69    2.53    2.42    2.33
   40     4.08    3.23    2.84    2.61    2.45    2.34    2.25
   60     4.00    3.15    2.76    2.53    2.37    2.25    2.17
  120     3.92    3.07    2.68    2.45    2.29    2.17    2.09
    ∞     3.84    3.00    2.60    2.37    2.21    2.10    2.01

                                 ν1 = 8 to ∞
   ν2        8       9      10      12      15      20      24      30      40      60     120       ∞
    1    238.9   240.5   241.9   243.9   245.9   248.0   249.1   250.1   251.1   252.2   253.3   254.3
    2    19.37   19.38   19.40   19.41   19.43   19.45   19.45   19.46   19.47   19.48   19.49   19.50
    3     8.85    8.81    8.79    8.74    8.70    8.66    8.64    8.62    8.59    8.57    8.55    8.53
    4     6.04    6.00    5.96    5.91    5.86    5.80    5.77    5.75    5.72    5.69    5.66    5.63
    5     4.82    4.77    4.74    4.68    4.62    4.56    4.53    4.50    4.46    4.43    4.40    4.36
    6     4.15    4.10    4.06    4.00    3.94    3.87    3.84    3.81    3.77    3.74    3.70    3.67
    7     3.73    3.68    3.64    3.57    3.51    3.44    3.41    3.38    3.34    3.30    3.27    3.23
    8     3.44    3.39    3.35    3.28    3.22    3.15    3.12    3.08    3.04    3.01    2.97    2.93
    9     3.23    3.18    3.14    3.07    3.01    2.94    2.90    2.86    2.83    2.79    2.75    2.71
   10     3.07    3.02    2.98    2.91    2.85    2.77    2.74    2.70    2.66    2.62    2.58    2.54
   11     2.95    2.90    2.85    2.79    2.72    2.65    2.61    2.57    2.53    2.49    2.45    2.40
   12     2.85    2.80    2.75    2.69    2.62    2.54    2.51    2.47    2.43    2.38    2.34    2.30
   13     2.77    2.71    2.67    2.60    2.53    2.46    2.42    2.38    2.34    2.30    2.25    2.21
   14     2.70    2.65    2.60    2.53    2.46    2.39    2.35    2.31    2.27    2.22    2.18    2.13
   15     2.64    2.59    2.54    2.48    2.40    2.33    2.29    2.25    2.20    2.16    2.11    2.07
   16     2.59    2.54    2.49    2.42    2.35    2.28    2.24    2.19    2.15    2.11    2.06    2.01
   17     2.55    2.49    2.45    2.38    2.31    2.23    2.19    2.15    2.10    2.06    2.01    1.96
   18     2.51    2.46    2.41    2.34    2.27    2.19    2.15    2.11    2.06    2.02    1.97    1.92
   19     2.48    2.42    2.38    2.31    2.23    2.16    2.11    2.07    2.03    1.98    1.93    1.88
   20     2.45    2.39    2.35    2.28    2.20    2.12    2.08    2.04    1.99    1.95    1.90    1.84
   21     2.42    2.37    2.32    2.25    2.18    2.10    2.05    2.01    1.96    1.92    1.87    1.81
   22     2.40    2.34    2.30    2.23    2.15    2.07    2.03    1.98    1.94    1.89    1.84    1.78
   23     2.37    2.32    2.27    2.20    2.13    2.05    2.01    1.96    1.91    1.86    1.81    1.76
   24     2.36    2.30    2.25    2.18    2.11    2.03    1.98    1.94    1.89    1.84    1.79    1.73
   25     2.34    2.28    2.24    2.16    2.09    2.01    1.96    1.92    1.87    1.82    1.77    1.71
   26     2.32    2.27    2.22    2.15    2.07    1.99    1.95    1.90    1.85    1.80    1.75    1.69
   27     2.31    2.25    2.20    2.13    2.06    1.97    1.93    1.88    1.84    1.79    1.73    1.67
   28     2.29    2.24    2.19    2.12    2.04    1.96    1.91    1.87    1.82    1.77    1.71    1.65
   29     2.28    2.22    2.18    2.10    2.03    1.94    1.90    1.85    1.81    1.75    1.70    1.64
   30     2.27    2.21    2.16    2.09    2.01    1.93    1.89    1.84    1.79    1.74    1.68    1.62
   40     2.18    2.12    2.08    2.00    1.92    1.84    1.79    1.74    1.69    1.64    1.58    1.51
   60     2.10    2.04    1.99    1.92    1.84    1.75    1.70    1.65    1.59    1.53    1.47    1.39
  120     2.02    1.96    1.91    1.83    1.75    1.66    1.61    1.55    1.50    1.43    1.35    1.25
    ∞     1.94    1.88    1.83    1.75    1.67    1.57    1.52    1.46    1.39    1.32    1.22    1.00

NOTE:  ν1:  Degrees of freedom for  numerator
       ν2:  Degrees of freedom for  denominator

SOURCE:  Johnson, Norman L. and F.  C.  Leone.   1977.   Statistics and Experimental
Design in Engineering and the Physical Sciences.  Vol. I.  Second  Edition.   John
Wiley and Sons, New York.
                                      B-5
-------
       TABLE  3.  95th  PERCENTILES OF THE  BONFERRONI
                  t-STATISTICS, t(ν, α/m)

where ν = degrees of freedom associated with the mean
          squares error
      m = number of comparisons
      α = 0.05, the experimentwise error level

          m        1        2        3        4        5
    ν   α/m     0.05    0.025   0.0167   0.0125     0.01
    4           2.13     2.78     3.20     3.51     3.75
    5           2.02     2.57     2.90     3.17     3.37
    6           1.94     2.45     2.74     2.97     3.14
    7           1.90     2.37     2.63     2.83     3.00
    8           1.86     2.31     2.55     2.74     2.90
    9           1.83     2.26     2.50     2.67     2.82
   10           1.81     2.23     2.45     2.61     2.76
   15           1.75     2.13     2.32     2.47     2.60
   20           1.73     2.09     2.27     2.40     2.53
   30           1.70     2.04     2.21     2.34     2.46
    ∞           1.65     1.96     2.13     2.24     2.33

SOURCE:  For α/m = 0.05, 0.025, and 0.01, the percentiles
were extracted from the t-table (Table 6, Appendix B) for
values of F = 1-α of 0.95, 0.975, and 0.99, respectively.

For α/m = 0.05/3 and 0.05/4, the percentiles were
estimated using "A Nomograph of Student's t" by Nelson,
L. S.  1975.  Journal of Quality Technology, Vol. 7,
pp. 200-201.
                            B-6
-------
         TABLE 4.   PERCENTILES OF THE STANDARD NORMAL DISTRIBUTION, Up

    P      0.000   0.001   0.002   0.003   0.004   0.005   0.006   0.007   0.008   0.009
   0.50   0.0000  0.0025  0.0050  0.0075  0.0100  0.0125  0.0150  0.0175  0.0201  0.0226
   0.51   0.0251  0.0276  0.0301  0.0326  0.0351  0.0376  0.0401  0.0426  0.0451  0.0476
   0.52   0.0502  0.0527  0.0552  0.0577  0.0602  0.0627  0.0652  0.0677  0.0702  0.0728
   0.53   0.0753  0.0778  0.0803  0.0828  0.0853  0.0878  0.0904  0.0929  0.0954  0.0979
   0.54   0.1004  0.1030  0.1055  0.1080  0.1105  0.1130  0.1156  0.1181  0.1206  0.1231
   0.55   0.1257  0.1282  0.1307  0.1332  0.1358  0.1383  0.1408  0.1434  0.1459  0.1484
   0.56   0.1510  0.1535  0.1560  0.1586  0.1611  0.1637  0.1662  0.1687  0.1713  0.1738
   0.57   0.1764  0.1789  0.1815  0.1840  0.1866  0.1891  0.1917  0.1942  0.1968  0.1993
   0.58   0.2019  0.2045  0.2070  0.2096  0.2121  0.2147  0.2173  0.2198  0.2224  0.2250
   0.59   0.2275  0.2301  0.2327  0.2353  0.2378  0.2404  0.2430  0.2456  0.2482  0.2508
   0.60   0.2533  0.2559  0.2585  0.2611  0.2637  0.2663  0.2689  0.2715  0.2741  0.2767
   0.61   0.2793  0.2819  0.2845  0.2871  0.2898  0.2924  0.2950  0.2976  0.3002  0.3029
   0.62   0.3055  0.3081  0.3107  0.3134  0.3160  0.3186  0.3213  0.3239  0.3266  0.3292
   0.63   0.3319  0.3345  0.3372  0.3398  0.3425  0.3451  0.3478  0.3505  0.3531  0.3558
   0.64   0.3585  0.3611  0.3638  0.3665  0.3692  0.3719  0.3745  0.3772  0.3799  0.3826
   0.65   0.3853  0.3880  0.3907  0.3934  0.3961  0.3989  0.4016  0.4043  0.4070  0.4097
   0.66   0.4125  0.4152  0.4179  0.4207  0.4234  0.4261  0.4289  0.4316  0.4344  0.4372
   0.67   0.4399  0.4427  0.4454  0.4482  0.4510  0.4538  0.4565  0.4593  0.4621  0.4649
   0.68   0.4677  0.4705  0.4733  0.4761  0.4789  0.4817  0.4845  0.4874  0.4902  0.4930
   0.69   0.4959  0.4987  0.5015  0.5044  0.5072  0.5101  0.5129  0.5158  0.5187  0.5215
   0.70   0.5244  0.5273  0.5302  0.5330  0.5359  0.5388  0.5417  0.5446  0.5476  0.5505
   0.71   0.5534  0.5563  0.5592  0.5622  0.5651  0.5681  0.5710  0.5740  0.5769  0.5799
   0.72   0.5828  0.5858  0.5888  0.5918  0.5948  0.5978  0.6008  0.6038  0.6068  0.6098
   0.73   0.6128  0.6158  0.6189  0.6219  0.6250  0.6280  0.6311  0.6341  0.6372  0.6403
   0.74   0.6433  0.6464  0.6495  0.6526  0.6557  0.6588  0.6620  0.6651  0.6682  0.6713

NOTE:  For values of P below 0.5, obtain the value of U(1-P) from Table 4 and
change its sign.   For example,  U0.45 = -U(1-0.45) = -U0.55 = -0.1257.
                                 (Continued)
                                     B-7
-------
                              TABLE 4 (Continued)

    P      0.000   0.001   0.002   0.003   0.004   0.005   0.006   0.007   0.008   0.009
   0.75   0.6745  0.6776  0.6808  0.6840  0.6871  0.6903  0.6935  0.6967  0.6999  0.7031
   0.76   0.7063  0.7095  0.7128  0.7160  0.7192  0.7225  0.7257  0.7290  0.7323  0.7356
   0.77   0.7388  0.7421  0.7454  0.7488  0.7521  0.7554  0.7588  0.7621  0.7655  0.7688
   0.78   0.7722  0.7756  0.7790  0.7824  0.7858  0.7892  0.7926  0.7961  0.7995  0.8030
   0.79   0.8064  0.8099  0.8134  0.8169  0.8204  0.8239  0.8274  0.8310  0.8345  0.8381
   0.80   0.8416  0.8452  0.8488  0.8524  0.8560  0.8596  0.8633  0.8669  0.8705  0.8742
   0.81   0.8779  0.8816  0.8853  0.8890  0.8927  0.8965  0.9002  0.9040  0.9078  0.9116
   0.82   0.9154  0.9192  0.9230  0.9269  0.9307  0.9346  0.9385  0.9424  0.9463  0.9502
   0.83   0.9542  0.9581  0.9621  0.9661  0.9701  0.9741  0.9782  0.9822  0.9863  0.9904
   0.84   0.9945  0.9986  1.0027  1.0069  1.0110  1.0152  1.0194  1.0237  1.0279  1.0322
   0.85   1.0364  1.0407  1.0450  1.0494  1.0537  1.0581  1.0625  1.0669  1.0714  1.0758
   0.86   1.0803  1.0848  1.0893  1.0939  1.0985  1.1031  1.1077  1.1123  1.1170  1.1217
   0.87   1.1264  1.1311  1.1359  1.1407  1.1455  1.1503  1.1552  1.1601  1.1650  1.1700
   0.88   1.1750  1.1800  1.1850  1.1901  1.1952  1.2004  1.2055  1.2107  1.2160  1.2212
   0.89   1.2265  1.2319  1.2372  1.2426  1.2481  1.2536  1.2591  1.2646  1.2702  1.2759
   0.90   1.2816  1.2873  1.2930  1.2988  1.3047  1.3106  1.3165  1.3225  1.3285  1.3346
   0.91   1.3408  1.3469  1.3532  1.3595  1.3658  1.3722  1.3787  1.3852  1.3917  1.3984
   0.92   1.4051  1.4118  1.4187  1.4255  1.4325  1.4395  1.4466  1.4538  1.4611  1.4684
   0.93   1.4758  1.4833  1.4909  1.4985  1.5063  1.5141  1.5220  1.5301  1.5382  1.5464
   0.94   1.5548  1.5632  1.5718  1.5805  1.5893  1.5982  1.6072  1.6164  1.6258  1.6352
   0.95   1.6449  1.6546  1.6646  1.6747  1.6849  1.6954  1.7060  1.7169  1.7279  1.7392
   0.96   1.7507  1.7624  1.7744  1.7866  1.7991  1.8119  1.8250  1.8384  1.8522  1.8663
   0.97   1.8808  1.8957  1.9110  1.9268  1.9431  1.9600  1.9774  1.9954  2.0141  2.0335
   0.98   2.0537  2.0749  2.0969  2.1201  2.1444  2.1701  2.1973  2.2262  2.2571  2.2904
   0.99   2.3263  2.3656  2.4089  2.4573  2.5121  2.5758  2.6521  2.7478  2.8782  3.0902

SOURCE:  Johnson, Norman L. and F. C. Leone.  1977.  Statistics and Experimental
Design in Engineering and the Physical Sciences.  Vol. I.  Second Edition.  John
Wiley and Sons, New York.
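
     Where software is available, the percentiles in Table 4 can be checked
or extended without interpolation.  The sketch below uses the NormalDist
class from the Python standard library; the function names u and
u_from_upper_half are hypothetical.  The second function reproduces the
look-up rule in the NOTE to Table 4, under which only values of P of 0.5 or
more are tabulated.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def u(p: float) -> float:
    """Return the P-th percentile of the standard normal distribution."""
    return _STD_NORMAL.inv_cdf(p)


def u_from_upper_half(p: float) -> float:
    """Mimic the table look-up: for P below 0.5, take the percentile
    for 1 - P and change its sign."""
    return u(p) if p >= 0.5 else -u(1.0 - p)


if __name__ == "__main__":
    print(round(u(0.55), 4), round(u_from_upper_half(0.45), 4))
```

     The two routes agree because the standard normal distribution is
symmetric about zero.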
                                       B-8
-------
        TABLE 5.  TOLERANCE FACTORS  (K)  FOR  ONE-SIDED  NORMAL TOLERANCE
             INTERVALS WITH PROBABILITY  LEVEL  (CONFIDENCE FACTOR)
                         γ = 0.95 AND COVERAGE P = 95%

                    n        K      |       n        K
                    3      7.655    |      75      1.972
                    4      5.145    |     100      1.924
                    5      4.202    |     125      1.891
                    6      3.707    |     150      1.868
                    7      3.399    |     175      1.850
                    8      3.188    |     200      1.836
                    9      3.031    |     225      1.824
                   10      2.911    |     250      1.814
                   11      2.815    |     275      1.806
                   12      2.736    |     300      1.799
                   13      2.670    |     325      1.792
                   14      2.614    |     350      1.787
                   15      2.566    |     375      1.782
                   16      2.523    |     400      1.777
                   17      2.486    |     425      1.773
                   18      2.453    |     450      1.769
                   19      2.423    |     475      1.766
                   20      2.396    |     500      1.763
                   21      2.371    |     525      1.760
                   22      2.350    |     550      1.757
                   23      2.329    |     575      1.754
                   24      2.309    |     600      1.752
                   25      2.292    |     625      1.750
                   30      2.220    |     650      1.748
                   35      2.166    |     675      1.746
                   40      2.126    |     700      1.744
                   45      2.092    |     725      1.742
                   50      2.065    |     750      1.740
                   55      2.036    |     775      1.739
                   60      2.017    |     800      1.737
                   65      2.000    |     825      1.736
                   70      1.986    |     850      1.734
                                    |     875      1.733
                                    |     900      1.732
                                    |     925      1.731
                                    |     950      1.729
                                    |     975      1.728
                                    |    1000      1.727

SOURCE:  (a) for sample sizes ≤ 50:  Lieberman, Gerald  F.   1958.   "Tables for
One-sided Statistical Tolerance Limits."  Industrial Quality Control.   Vol.  XIV,
No. 10.  (b) for sample sizes > 50:  K values were calculated  from large
sample approximation.
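
     As an illustration of how a tolerance factor from Table 5 might be
applied, the sketch below computes an upper one-sided tolerance limit,
assuming the usual form for normally distributed data (sample mean plus K
times the sample standard deviation).  The background concentrations and
the function name upper_tolerance_limit are hypothetical, and only a few
rows of Table 5 are copied into the look-up.

```python
import statistics

# A few tolerance factors K copied from Table 5
# (95% coverage, 95% confidence); keys are sample sizes n.
K_TABLE = {10: 2.911, 15: 2.566, 20: 2.396, 25: 2.292}


def upper_tolerance_limit(background, k_table=K_TABLE):
    """Upper one-sided normal tolerance limit: mean + K * s, where s is
    the sample standard deviation (divisor n - 1)."""
    n = len(background)
    if n not in k_table:
        raise KeyError(f"no tolerance factor stored for n = {n}")
    mean = statistics.fmean(background)
    s = statistics.stdev(background)
    return mean + k_table[n] * s


# Hypothetical background concentrations (ppm), n = 10.
background = [4.0, 5.0, 7.0, 6.0, 8.0, 9.0, 5.0, 4.0, 6.0, 6.0]
limit = upper_tolerance_limit(background)
```

     A sample size not stored in the look-up raises an error rather than
interpolating, so that the factor used can always be traced back to a row
of the table.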
                                      B-9
-------
             TABLE 6.  PERCENTILES OF STUDENT'S t-DISTRIBUTION

                         (F = 1-α; n = degrees of freedom)

                                        F
    n       .60      .75      .90      .95     .975      .99     .995    .9995
    1      .325    1.000    3.078    6.314   12.706   31.821   63.657  636.619
    2      .289     .816    1.886    2.920    4.303    6.965    9.925   31.598
    3      .277     .765    1.638    2.353    3.182    4.541    5.841   12.941
    4      .271     .741    1.533    2.132    2.776    3.747    4.604    8.610
    5      .267     .727    1.476    2.015    2.571    3.365    4.032    6.859
    6      .265     .718    1.440    1.943    2.447    3.143    3.707    5.959
    7      .263     .711    1.415    1.895    2.365    2.998    3.499    5.405
    8      .262     .706    1.397    1.860    2.306    2.896    3.355    5.041
    9      .261     .703    1.383    1.833    2.262    2.821    3.250    4.781
   10      .260     .700    1.372    1.812    2.228    2.764    3.169    4.587
   11      .260     .697    1.363    1.796    2.201    2.718    3.106    4.437
   12      .259     .695    1.356    1.782    2.179    2.681    3.055    4.318
   13      .259     .694    1.350    1.771    2.160    2.650    3.012    4.221
   14      .258     .692    1.345    1.761    2.145    2.624    2.977    4.140
   15      .258     .691    1.341    1.753    2.131    2.602    2.947    4.073
   16      .258     .690    1.337    1.746    2.120    2.583    2.921    4.015
   17      .257     .689    1.333    1.740    2.110    2.567    2.898    3.965
   18      .257     .688    1.330    1.734    2.101    2.552    2.878    3.922
   19      .257     .688    1.328    1.729    2.093    2.539    2.861    3.883
   20      .257     .687    1.325    1.725    2.086    2.528    2.845    3.850
   21      .257     .686    1.323    1.721    2.080    2.518    2.831    3.819
   22      .256     .686    1.321    1.717    2.074    2.508    2.819    3.792
   23      .256     .685    1.319    1.714    2.069    2.500    2.807    3.767
   24      .256     .685    1.318    1.711    2.064    2.492    2.797    3.745
   25      .256     .684    1.316    1.708    2.060    2.485    2.787    3.725
   26      .256     .684    1.315    1.706    2.056    2.479    2.779    3.707
   27      .256     .684    1.314    1.703    2.052    2.473    2.771    3.690
   28      .256     .683    1.313    1.701    2.048    2.467    2.763    3.674
   29      .256     .683    1.311    1.699    2.045    2.462    2.756    3.659
   30      .256     .683    1.310    1.697    2.042    2.457    2.750    3.646
   40      .255     .681    1.303    1.684    2.021    2.423    2.704    3.551
   60      .254     .679    1.296    1.671    2.000    2.390    2.660    3.460
  120      .254     .677    1.289    1.658    1.980    2.358    2.617    3.373
    ∞      .253     .674    1.282    1.645    1.960    2.326    2.576    3.291

SOURCE: CRC Handbook of Tables for Probability and Statistics.   1966.
W. H. Beyer, Editor.  Published by the Chemical  Rubber Company.  Cleveland,
Ohio.
                                      B-10
-------
           TABLE  7.  VALUES  OF  THE  PARAMETER λ FOR COHEN'S ESTIMATES
                       ADJUSTING FOR NONDETECTED VALUES
X -
.00 .9MIOO
.at .910941
.10 .010930
.13 .011.310
.30 .3US43
.23 .OUS12
.30 .012243
.33 .012S30
.40 .012784
.ti \ .01303*
.M .013179
.34 .013913
.«O .013739
.14 .013911
-TO .014.171
.75 .914378
.10 .01*37*
.84 .014775
.90 .4149*7
.iS .313134
1.90 .01133*

!\ .=.
.03
.02O40O
.021294
.0220*2
.022791
.3334M
.3240T«
.02414*
.02 3211
.023738
.02*243
.33175*
.327194
.027*49
.3210*7
.021314
.028927
.029330
.029723
.030107
.0304*3
.03MM

.30
.03
.330*03
.332223
.03339*
.11344*4
.035413
.038177
.037249
.03*077
.038*8*
.339*24
.040332
.041034
.341733
.042391
.343030
.043*32
.04423*
.044*4*
.043423
.043*19
.044340

.33
.04
.0413*3
.0433*3
.044902
.04131*
.047*2*
.041*3*
.03091*
.011120
.031173
.033113
.934133
.033019
.033993
.031*74
.OS7T21
.031331
.0393(4
.0*0133
.0*0923
.0*1*7*
.9*3413

.40
.as
.033307
.0*4670
.03*391
.03833*
.0399*0
.011322
.012919
.01434J
.013*10
.011121
.011133
.01930*
.070439
.371331
.3721O3
.073*43
.074*33
.073142
.079108
.077349
. 07*471

.43
.04
.013127
.061189
.011413
.070311
.072139
.074372
.071101
.077711
.079332
.010*43
.042301
.01370*
.OI3O11
.011381
.017170
.01*917
.09O133
.091319
.092477
.0»3111
.0*4720

.30
.or
.1)74931
.077909
.91094*
.M3OO9
.013210
.017413
.5*9433
.391333
.093193
.0*493*
.094*37
.09129*
.099*17
.10143
. 10292
.10431
.101*0
.10719
.10134
.10917
.1111*

.13
.0*
.0*44«*
.019134
.092132
.093129
.09121*
.100*3
. 1029S
.10311
. 10723
. 1092*
.11131
.1130*
.11490
. 111*1
.il»37
.12004
.m«7
.12323
.124*0
.12*32
.127*0

.M
.01
.09*24
.10197
.10334
. 10)43
.11134
.11401
.11SS7
.11914
.12130
. 12377
.13299
. 1290*
.13011
.13309
.13402
.13390
.-J773
. 13912
.14 126
.14297
.1*4*3

«3
.10
. 11030
.11431
.11*04
.1314*
.1341*
.13772
.13039
. 13333
.13193
.13847
.14090
. 14323
. 14312
.14773
.149*7
.11191
. 1.5400
. 11199
. 13793
.119*3
.11170

.10
.19
.17342
.17933
. 18479
.1*913
.1*4*0
. 19910
.20330
.237+7
.:;i39
.21317
.21812
.22233
.22378
.22910
.23234
.2333O
.22*31
.24111
.24412
.24740
.23012

.10
~\^
.242*1 00
.230331 .03
.14741 .10
.31403 .13
.27031 .20
.2T12I .23
.21193 .30
.3*737 .31
.29250 .40
.297*3 .41
.30233 .10
.30725 13
.31184 M
.31133 11
.320*3 .TO
.32419 .75
. 32903 1 JO
.33307 13
.23703 ,*>
.34091 .93
.34471 1.00

.90 */\
.30 | .31892
.33 .33793
. 10 33142
.13 .34410
.20 .31233
.23 .31993
.30 .38700

.40 .38033
.43 .3*143

.30 .39271
.11 .39470
.*0 .40447
.43 .4100*
.70 .41133
.73 42090
.40 .42812
.93 .43122
.90 .43823
1 .15 .44112
1.00 .44392
.4021
.4130
.4223
.4330
.4423
.4110
.4193

.4733
.4131

.49O4
.4978
.1043
'.3114
.1110
.1243
.1308
.1370
.1430
.349O
.3341
.'.941
.KM*
.3184
.129*
.1403
.1301
.1404

.1791
. »»0

.19*7
.1031
.3133
.0213
.1291
.3387
.5441
.6113
.63*4
.111*
.1724
.3911 .7091 .1381 .»OI 1.143 .33* 1.141 2.178 3.283 .90 I
.5101 .7232 .8540 .9994 t.iSl .318 1.183 2.203 3.314 93
.8234 .7400 .8703 1.017 I. 183 .379 1 308 2.229 3.3451 10
.6361 .7342 .3860 1.033 1.204 .400 '..530 2 231 1.3781 .13
.6413 .7873 ,3012 1.011 1.222 .413 all 2.280 3.403 23 j
.540O .7810 .911* 1.D4T 1.240 .439 .473 2.303 3.433 .23
.6713 .7937 .8300 1.013 1.2.17 .437 .iJ3 2.329 3.4«4 .30
,3821 .8040 .9437 1.09* 1.274 1.478 .713 2.333 3.492 ,33
.S927 .8179 .9370 1.113 1.290 1.494 .732 2.378 3.120 .40
.7329 .8293 .9700 1.137 1.204 1.311 .731 2.399 3.147| 43
1
.7119 .8401 .9121 1.141 1.221 1.32* .770 2.421 3.3751 .SO
.7115 .1317 .9»50 1.134 1.337 1.343 ,78* 2.443 3-JOi; .34
.7320 .3121 1.307 1.188 1.331 1.3*1 .S0« 2.4«3 3.5281 .SO
.7412 .8719 1.319 1.182 1.36* 1.377 .124 2.13S j.sul .43
.7302 .3U2 1.030 1.193 1.280 1.193 .341 2.107 3.379 .70
.7190 .J932 1.042 1.207 1.394 I. SOS .33* 2.128 2. 703 1 .75
.7878 .9031 1.033 1.220 1.408 U524 .873 2.148 3.730! .30
.7781 .J127 1.044 1.232 1.422 1.539 .492 2.3
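
Table 7 is entered with two quantities computed from the data: h, the proportion of nondetected values, and gamma, the variance of the detected values divided by the squared difference between their mean and the detection limit. The tabulated λ is then used to adjust the mean and standard deviation. The Python sketch below illustrates that arithmetic only. The function names are illustrative, the variance is assumed to use the m - 1 divisor, and the value of λ in the usage note is a placeholder, not a Table 7 entry.

```python
import math
import statistics


def cohen_table_keys(detected, n_total, detection_limit):
    """Return (h, gamma), the two quantities used to enter Table 7."""
    m = len(detected)
    mean_d = statistics.mean(detected)
    var_d = statistics.variance(detected)  # assumption: m - 1 divisor
    h = (n_total - m) / n_total            # proportion of nondetects
    gamma = var_d / (mean_d - detection_limit) ** 2
    return h, gamma


def cohen_adjust(detected, detection_limit, lam):
    """Adjust the mean and standard deviation of the detected values.

    lam is the lambda value read from Table 7 by the caller.
    """
    mean_d = statistics.mean(detected)
    var_d = statistics.variance(detected)  # assumption: m - 1 divisor
    gap = mean_d - detection_limit
    mean_adj = mean_d - lam * gap
    sd_adj = math.sqrt(var_d + lam * gap ** 2)
    return mean_adj, sd_adj
```

For example, with detected values 2, 4, and 6 out of four samples and a detection limit of 1, cohen_table_keys returns h = 0.25 and gamma = 4/9. The caller would read λ from Table 7 at those coordinates and pass it to cohen_adjust.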
-------
TABLE 8.  CRITICAL VALUES FOR Tn (ONE-SIDED TEST) WHEN THE
           STANDARD  DEVIATION  IS  CALCULATED  FROM
                     THE  SAME SAMPLE
Number of       Upper 0.1%    Upper 0.5%    Upper 1%      Upper 2.5%    Upper 5%      Upper 10%
Observations,   Significance  Significance  Significance  Significance  Significance  Significance
n               Level         Level         Level         Level         Level         Level

  3             1.155         1.155         1.155         1.155         1.153         1.148
  4             1.499         1.496         1.492         1.481         1.463         1.425
  5             1.780         1.764         1.749         1.715         1.672         1.602
  6             2.011         1.973         1.944         1.887         1.822         1.729
  7             2.201         2.139         2.097         2.020         1.938         1.828
  8             2.358         2.274         2.221         2.126         2.032         1.909
  9             2.492         2.387         2.323         2.215         2.110         1.977
 10             2.606         2.482         2.410         2.290         2.176         2.036
 11             2.705         2.564         2.485         2.355         2.234         2.088
 12             2.791         2.636         2.550         2.412         2.285         2.134
 13             2.867         2.699         2.607         2.462         2.331         2.175
 14             2.935         2.755         2.659         2.507         2.371         2.213
 15             2.997         2.806         2.705         2.549         2.409         2.247
 16             3.052         2.852         2.747         2.585         2.443         2.279
 17             3.103         2.894         2.785         2.620         2.475         2.309
 18             3.149         2.932         2.821         2.651         2.504         2.335
 19             3.191         2.968         2.854         2.681         2.532         2.361
 20             3.230         3.001         2.884         2.709         2.557         2.385
 21             3.266         3.031         2.912         2.733         2.580         2.408
 22             3.300         3.060         2.939         2.758         2.603         2.429
 23             3.332         3.087         2.963         2.781         2.624         2.448
 24             3.362         3.112         2.987         2.802         2.644         2.467
 25             3.389         3.135         3.009         2.822         2.663         2.486
 26             3.415         3.157         3.029         2.841         2.681         2.502
 27             3.440         3.178         3.049         2.859         2.698         2.519
 28             3.464         3.199         3.068         2.876         2.714         2.534
 29             3.486         3.218         3.085         2.893         2.730         2.549
 30             3.507         3.236         3.103         2.908         2.745         2.563
 31             3.528         3.253         3.119         2.924         2.759         2.577
 32             3.546         3.270         3.135         2.938         2.773         2.591
 33             3.565         3.286         3.150         2.952         2.786         2.604
 34             3.582         3.301         3.164         2.965         2.799         2.616
 35             3.599         3.316         3.178         2.979         2.811         2.628
 36             3.616         3.330         3.191         2.991         2.823         2.639
 37             3.631         3.343         3.204         3.003         2.835         2.650
 38             3.646         3.356         3.216         3.014         2.846         2.661
 39             3.660         3.369         3.228         3.025         2.857         2.671
 40             3.673         3.381         3.240         3.036         2.866         2.682
 41             3.687         3.393         3.251         3.046         2.877         2.692
 42             3.700         3.404         3.261         3.057         2.887         2.700
 43             3.712         3.415         3.271         3.067         2.896         2.710
 44             3.724         3.425         3.282         3.075         2.905         2.719
 45             3.736         3.435         3.292         3.085         2.914         2.727
 46             3.747         3.445         3.302         3.094         2.923         2.736
 47             3.757         3.455         3.310         3.103         2.931         2.744
 48             3.768         3.464         3.319         3.111         2.940         2.753
 49             3.779         3.474         3.329         3.120         2.948         2.760
 50             3.789         3.483         3.336         3.128         2.956         2.768
                        (Continued)
                            B-12
-------
TABLE 8 (Continued)
Number of
Observations,
n
51
52
53
54
55
i6
57
58
59
60
61
«:
63
64
65
66
67
6S
69
70
71
72
73
74
75
76
77
7S
79
.10
3!
32
83
S4
85
36
87
Sa
39
90
91
9:
93
94
95
96
97
91
•>»
100
Upper 0.1%
Significance
Level
3.798
3.808
3.816
3.825
3.834
3.342
3.851
3.8JS
3.867
3.J74
3SS2
3 Si-*
3.396
3.903
3.910
3.917
3.923
3.930
3936
3.W2
3.94g
3.954
J.960
3.965
3.971
3.97T
3.9S2
3.9*7
3.992
3.998
4.00:
4007
4oi:
4.017
4021
4026
4031
4035
4.03*
4.044
4.049
4.053
4057
4 OKI
40M
4.0»9
4.073
407i
4.0W
««i4
Upper 0.5%
Significance
Level
3.491
3.500
3.507
3.516,
3.524
3.J31
3.53*
3.J46
ZJi)
3.5<0
3.<6«
3.5"
3.J7»
3J>6
3J»2
3.3M
3.605
3.610
3.617
3.622
3.627
3.633
3.63*
3.043
3.<4S
3.6-4
3.653
3.M3
3.469
3.673
J*T7
3.6S2
3.6S7
3.691
3.645
3.699
3.704
3.708
3.712
3.716
1.720
3.715
3.72S
3.732
3.736
3.739
3.744
3.747
3.750
3.754
Upper 1%
Significance
Level
3J45
3J53
3J6I
3J6S
3J76
3.383
3J9I
3.397
3.405
3.411
3.415
3424
3.430
3.437
3.442
1.449
3.454
3.460
3.466
3.471
3.476
3482
3.487
3.492
3.496
3.502
3J07
3.511
3.516
3J2I
3.325
3.529
3.534
3.539
3.543
3.347
3.551
3-555
3J59
3.363
3.567
3-570
3.375
3.379
3.382
3.5S6
3.5S9
3.393
3.597
3.600
Upper 2.5%
Significance
Level
3.136
3.143
3.151
3.15J
3.166
3.172
3.180
J.I 36
3.193
3.IW
.'.205
? :i2
J2'.S
3.224
3.230
3035
3-241
3.246
3.252
3.257
3262
3.267
3.272
3.278
3.232
3.2S7
3.291
3.297
3.301
3.305
3.309
3.315
3.319
3.323
3.327
3.331
3.335
3.339
3.343
3.347
3-550
3.355
3.358
3.362
3J65
3.369
J.372
3.377
3.3SO
3-583
Upper 5%
Significance
Level
2.964
1971
2.978
2.9S6
2.992
3.000
3.006
3.013
3.019
3.025
3.032
3037
J044
3.049
3.055
3.061
3.066
3.071
3.076
30S2
3087
3.092
3.093
3 102
3.107
3.1 II
3.117
3.121
3.125
3 130
3.134
3.139
3.U3
3.147
3.151
3.155
3.160
3.163
3.167
3.171
J.174
3.179
3.182
3 186
3.189
3.193
3.196
3.201
3.204
3.207
Upper 10%
Significance
Level
2.77J
2.78J
2.790
1798
2.J04
2.811
2.8 IS
1824
2.831
2.837
2.M2
2.M*
:.s:4
:.,S60
2.S66
2.871
2.877
2.833
2.3XS
2.393
2.897
2.903
2.90S
2.912
2.917
2.922
2.927
2.931
2.935
2.940
2.945
2.949
2.953
2.957
2.961
3.966
2.970
2.973
2.977
2.981
2.9S4
2.989
2.993
2.996
3000
3.003
3.006
3.011
3014
3.017
     (Continued)
        B-13
-------
                    TABLE  8 (Continued)
          Loci
                 Significance
 Leper ir
Sifmlicancc
Sitnifioj.xc
  Lerci
Significance
  Lori
L'npcr WV
Significance
  Le<-ei
i
1
1
1
1
1
!
i
1
1
1
1
1
1
!
1
i
1
1
!
1
I
!
1

i
i
1
1
j
1

1
!
'
1
i


;-

i


'•
i.
;.
jl
12
J3
»
J5
)t>
J7
"5
N
0
1
2
3
4
;
t>
7
1
9
n
.
2
3
4
5
5
7
j
t
0
1
2
i
4
5
t.

i
4
(i
1!
12
3
4
<
p
7
4.0»S
4095
4.0VS
4.102
4.105
4 109
4 112
4.116
4 1 19
4 12:
4 125
4.1:9
4.132
4 135
4 US
4 141
4 144
4 146
4 150
4153
4 156
4 ;59
4 161
4 1 />4
4 |6<»
4 IW*
4 173
4175
4 rs
4 ISO
4 183
4 1 85
4 ij.\
4 1VO
4.193
4 196
4 19*
4 2-lfi
4.2D3
4205
4.207
4209
4 212
4214
4216
42N
3.7M
3.765
37t.li
3771
3.774
3777
3 7*0
1?84
3.787
3.7W
3. ^3
3794
3.759
3.802
3 !05
3 80*
3 I'M
3SI4
3SI7
3SI9
3 S22
3 s:*
3827
3 S3 1
3S33
3 1-36
3 S3S
3 S40
3 543
3 £45
3 H»
3 .ISO
3 i.".'
.1 >Jo
3S7t
3.S7V
38S1
3 »>J
3.603
3.W
3 610
3 6U
3.617
3*20
3.623
3626
3 629
363:
3636
3639
jw:
1645
3647
3650
3653
3656
3659
3662
3.665
3667
1670
367:
3675
3677
36SO
3.6*3
3.M!6
)6j»
36"0
369.1
369!
3697
3 •••»
3-0:
3-04
3 ^O1
3 710
3 712
3.714
3.716
3 719
3 T:i
3723
3 *:;
3 •':•'
338*
3.390
3.393
3.39-
3400
3 40?
3.406
3.4C«
341:
3.415
3 418
3422
3 424
3427
3430
3433
3433
3,438
3441
3444
3.447
3.450
3452
3455
3.457
> 4150
3 4ft2
3 465
34o7
3 4TO
1471
' 47?
t 4~X
1 ^,\o
3 4»2
34S4
3.487
3 4«9
3491
.' 493
349">
.i 499
3 501
3 503
3 50<
3 507
3.509
3.214
3.217
3.220
3.2Z4
3.227
3.230
3.233
3 236
32--9
3.242
3 245
3.245
3.251
32J4
3257
3.259
3262
3265
3267
3.270
3 :-4
3 :-6
3 2 "9
3.231
3 254
3.2S6
3 2S9
3 291
3 294
5.296
j 29S
3 302
3 J04
3 306
3 309
3 311
3 3 1 3
3315
3 3'S
3320
3322
3.324
3.3>
332S
3.331
3 334
3.021
3024
3.02?
3030
3033
3037
30-WJ
3043
.* 04«
3049
305:
3055
30J!
3061
3064
3067
3070
3C.7J
3.0-5
3078
3.041
3C5.1
3 OD6
3 "if
3 o-:
3 045
1 OT
3 !
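
The critical values in Table 8 are compared with the statistic Tn, the difference between the largest observation and the sample mean divided by the sample standard deviation computed from the same sample. The Python sketch below shows that computation under simple assumptions. The function names are illustrative, and the critical value is supplied by the caller from Table 8 for the sample size and significance level in use.

```python
import statistics


def outlier_statistic(values):
    """Tn for the largest observation: (max - mean) / sample std. deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # computed from the same sample, n - 1 divisor
    return (max(values) - mean) / sd


def is_high_outlier(values, critical_value):
    """True when Tn exceeds the caller-supplied Table 8 critical value."""
    return outlier_statistic(values) > critical_value
```

For the five observations 1, 2, 3, 4, and 10, Tn is about 1.697. If the entry for n = 5 at the upper 5% significance level is read as 1.672, the largest observation would be flagged for further review.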
-------
     APPENDIX C





GENERAL BIBLIOGRAPHY
       C-l
-------
     The  following  list  provides  the  reader with  those references  directly
mentioned  in  the text.   It  also includes, for those readers  desiring further
information, references to literature dealing  with  selected  subject matters in
a broader  sense.   This list  is  in  alphabetical order.

ASTM Designation:  E178-75.   1975.   "Standard  Recommended Practice for Dealing
with Outlying Observations."

ASTM  Manual on  Presentation of Data and  Control Chart  Analysis.   1976.   ASTM
Special Technical  Publication 15D.

Barari, A.,  and  L.  S.  Hedges.   1985.   "Movement  of  Water in  Glacial  Till."
Proceedings of the 17th International Congress of the International Association of
Hydrogeologists.   pp. 129-134.

Barcelona, M. J.,  J. P. Gibb,  J. A.  Helfrich,  and E. E.  Garske.   1985.
"Practical  Guide  for Ground-Water Sampling."   Report by Illinois  State Water
Survey, Department  of Energy and Natural Resources  for USEPA.   EPA/600/2-85/104.

Bartlett,  M.  S.   1937.    "Properties of  Sufficiency  and Statistical  Tests."
Journal of the Royal Statistical  Society, Series A.   160:268-282.

Box, G. E. P., and G. M.  Jenkins.   1970.   Time Series Analysis.   Holden-Day, San
Francisco, California.

Brown, K.  W.,  and D. C.  Andersen.   1981.   "Effects of  Organic Solvents on the
Permeability of  Clay Soils."   EPA  600/2-83-016,  Publication  No.  83179978, U.S.
EPA, Cincinnati,  Ohio.

Cohen, A.  C.,  Jr.   1959.   "Simplified  Estimators  for  the Normal  Distribution
When Samples Are  Singly Censored or Truncated."   Technometrics.   1:217-237.

Cohen, A.  C., Jr.   1961.   "Tables  for  Maximum  Likelihood Estimates:   Singly
Truncated  and Singly Censored Samples."  Technometrics.   3:535-541.

Conover, W.  J.   1980.   Practical Nonparametric Statistics.  Second Edition, John
Wiley  and  Sons,  New  York, New York.

CRC  Handbook of Tables  for  Probability  and Statistics.   1966.   William H.  Beyer
(ed.).  The  Chemical Rubber  Company.

Current Index  to  Statistics.    Applications,  Methods  and  Theory.    Sponsored  by
American  Statistical  Association   and  Institute of  Mathematical  Statistics.
Annual series providing indexing coverage  for  the broad field  of statistics.

David, H.  A.   1956.   "The Ranking  of Variances  in  Normal Populations."
Journal of the American Statistical Association.   Vol. 51, pp. 621-626.

Davis, J.  C.   1986.   Statistics and  Data Analysis in Geology.  Second  Edition.
John Wiley and Sons, New  York,  New York.
                                      C-3
-------
Dixon, W. J.,  and  F.  J.  Massey, Jr.  1983.  Introduction to Statistical  Analysis.
Fourth Edition.  McGraw-Hill, New York,  New York.

Freeze, R.  A., and J. A.  Cherry.   1979.  Groundwater.   Prentice-Hall,  Inc.,
Englewood Cliffs,  New Jersey.

Gibbons, R. D.  1987.  "Statistical Prediction  Intervals for  the Evaluation of
Ground-Water Quality."  Ground  Water.  Vol. 25,  pp.  455-465.

Gibbons,  R. D.    1988.    "Statistical  Models   for  the Analysis  of  Volatile
Organic Compounds  in Waste Disposal Sites."  Ground Water.  Vol. 26.

Gilbert,  R.   1987.   Statistical  Methods for  Environmental  Pollution Monitoring.
Professional Books Series, Van  Nostrand  Reinhold.

Hahn,  G.  and W. Nelson.   1973.  "A Survey of  Prediction Intervals  and  Their
Applications."  Journal of Quality Technology.   5:178-188.

Heath,  R.   C.   1983.   Basic  Ground-Water Hydrology.   U.S. Geological  Survey
Water  Supply Paper.  2220, 84 p.

Hirsch,  R.  M., J. R. Slack, and  R.  A.  Smith.    1982.   "Techniques  of  Trend
Analysis for Monthly Water Quality  Data."  Water Resources  Research.   Vol. 18,
No. 1, pp.  107-121.

Hockman,  K. K., and J. M.  Lucas.   1987.  "Variability Reduction Through
Subvessel CUSUM Control."  Journal of Quality Technology.  Vol.  19,  pp. 113-121.

Hollander,  M.,  and D. A. Wolfe.  1973.   Nonparametric  Statistical Methods.  John
Wiley  and Sons, New  York,  New York.

Huntsberger, D. V., and  P.  Billingsley.  1981.  Elements of Statistical  Inference.
Fifth Edition.  Allyn and Bacon,  Inc., Boston,  Massachusetts.

Johnson,  N. L., and F. C.  Leone.   1977.  Statistics and Experimental  Design  in
Engineering  and the Physical Sciences.   2 Vol., Second  Edition.   John  Wiley and
Sons,  New York, New  York.

Kendall,  M. G.,  and A.  Stuart.   1966.   The  Advanced  Theory  of  Statistics.
3 Vol.  Hafner Publishing Company, Inc.,  New York,  New York.

Kendall,  M. G., and W. R.  Buckland.   1971.  A Dictionary of Statistical  Terms.
Third  Edition.  Hafner Publishing Company, Inc., New York,  New York.

Kendall, M. G.  1975.  Rank Correlation Methods.  Charles Griffin, London.

Langley,  R. A.   1971.   Practical Statistics Simply Explained.    Second  Edition.
Dover  Publications,  Inc.,  New York, New  York.

Lehmann, E. L.  1975.  Nonparametric Statistical Methods Based on Ranks.  Holden-
Day, San Francisco,  California.
                                      C-4
-------
Lieberman,  G.  J.    1958.     "Tables  for  One-Sided  Statistical  Tolerance
Limits."  Industrial Quality Control.  Vol. XIV, No. 10.

Lilliefors, H.  W.   1967.   "On the Kolmogorov-Smirnov  Test for Normality with
Mean  and  Variance  Unknown."   Journal of  the  American Statistical Association.
64:399-402.

Lindgren, B. W.  1976.  Statistical Theory.  Third  Edition.   Macmillan.

Lucas, J. M.   1982.   "Combined Shewhart-CUSUM Quality  Control  Schemes."
Journal of Quality Technology.   Vol. 14, pp. 51-59.

Mann,  H.  B.    1945.    "Non-parametric  Tests Against  Trend."   Econometrica.
Vol. 13, pp. 245-259.

Miller, R.  G.,  Jr.    1981.   Simultaneous  Statistical  Inference.   Second Edition.
Springer-Verlag, New  York,  New York.

Mull,  D.  S., T.  O.   Liebermann,  J.   L.  Smoot,  and L.  H.  Woosley,  Jr.    1988.
"Application  of  Dye-Tracing  Techniques  for  Determining  Solute  Transport
Characteristics of Ground Water in Karst Terranes."   USEPA, EPA 904/6-88-001,
October 1988.   103 pp.

Nelson,  L.  S.    1987.    "Upper  10%,  5%,  and  1%—Points  of  the  Maximum  F-
Ratio."  Journal of Quality Technology.   Vol.  19, p. 165.

Nelson, L.  S.   1987.   "A Gap  Test for Variances."   Journal of Quality
Technology.  Vol.  19,  pp. 107-109.

Noether, G. E.  1967.  Elements of Nonparametric Statistics.   Wiley,  New York.

Pearson, E.  S., and  H.  O.  Hartley.   1976.  Biometrika Tables for Statisticians.
Vol. 1, Biometrika Trust, University  College, London.

Quade, D.   1966.   "On Analysis of Variance for the K-Sample Problem."  Annals
of Mathematical  Statistics.   37:1747-1748.

Quinlan,  J. F.   "Ground-Water  Monitoring  in  Karst  Terranes:    Recommended
Protocols and Implicit Assumptions."   EPA/600/X-89/050, March 1989.

Remington, R. D., and M.  A.  Schork.  1970.  Statistics with Applications to the
Biological and Health Sciences.   Prentice-Hall, pp.  235-236.

Shapiro, S. S., and M. R. Wilk.   1965.  "An Analysis  of Variance Test for
Normality (Complete Samples)."   Biometrika.  Vol.  52, pp.  591-611.

Snedecor, G.  W.,  and W.  G.  Cochran.   1980.  Statistical Methods.   Seventh
Edition.  The  Iowa State University  Press,  Ames, Iowa.
                                      C-5
-------
Starks, T.  H.   1988 (Draft).   "Evaluation  of  Control  Chart Methodologies for
RCRA Waste  Sites."   Report  by Environmental  Research Center,  University of
Nevada,  Las Vegas,  for  Exposure  Assessment Research  Division,  Environmental
Monitoring Systems Laboratory-Las Vegas, Nevada.   CR814342-01-3.

"Statistical  Methods  for  the  Attainment  of  Superfund  Cleanup  Standards
(Volume 2:  Ground Water—Draft)."

Steel, R. G. D., and J. H. Torrie.  1980.   Principles and Procedures  of Statistics,
A  Biometrical Approach.   Second Edition.   McGraw-Hill  Book  Company, New York,
New York.

Todd, D.  K.   1980.  Ground  Water Hydrology.   John Wiley and  Sons, New York,
534 p.

Tukey, J.  W.   1949.    "Comparing  Individual  Means  in the  Analysis  of
Variance."  Biometrics.  Vol.  5,  pp. 99-114.


Statistical Software Packages:

BMDP Statistical  Software.    1983.   1985  Printing.  University  of California
Press, Berkeley.

Lotus  1-2-3 Release 2.   1986.   Lotus Development Corporation,  55 Cambridge
Parkway, Cambridge, Massachusetts 02142.

SAS:  Statistical Analysis System, SAS Institute, Inc.
          SAS® User's Guide:  Basics, Version 5 Edition, 1985.
          SAS® User's Guide:  Statistics,  Version 5 Edition, 1985.

SPSS:  Statistical Package for the Social  Sciences.  1982.  McGraw-Hill.

SYSTAT:   Statistical Software  Package  for  the  PC.   Systat,  Inc., 1800 Sherman
Avenue, Evanston, Illinois 60201.
                                      C-6
-------
            APPENDIX D






FEDERAL REGISTER, 40  CFR, Part 264
               D-l
-------
   Tuesday
   October 11, 1988
   Part II



   Environmental

   Protection Agency

   40 CFR Part 264
   Statistical Methods for Evaluating
   Ground-Water Monitoring From
   Hazardous Waste Facilities; Final Rule
D-3
-------
39728    Federal Register /  Vol.  53,  No. 196  /  Tuesday, October 11,  1988 / Rules and Regulations
final authorization will have to revise
their programs to cover the additional
requirements in today's announcement.
Generally, these authorized State
programs must be revised within one
year of the date of promulgation of such
standards, or within two years if the
State must amend or enact a statute in
order to make the required revision (see
40 CFR 271.21). However, States may
always impose requirements which are
more stringent or have greater coverage
than EPA's programs.
  Regulations which are broader in
scope, however, may not be enforced as
part of the federally-authorized RCRA
program.

B. Regulatory Impact Analysis
  Executive  Order 12291 (46 FR 13191,
February 9, 1981) requires that a
regulatory agency determine whether a
new regulation will be "major" and, if
so, that a Regulatory Impact Analysis be
conducted. A major rule is defined as a
regulation that is likely to result in:
  1. An annual effect on the economy of
$100 million  or more;
  2. A major increase in costs or prices
for consumers, individual industries,
Federal, State, or local government
agencies or geographic regions; or
  3. Significant adverse effects on
competition, employment, investment,
productivity, innovation, or the ability of
United States-based enterprises to
compete with foreign-based enterprises
in domestic or export markets.
  The Agency has determined that
today's regulation is not a major rule
because it does not meet the above
criteria. Today's action should produce
a net decrease in the cost of ground-
water monitoring at each facility. This
final rule has been submitted to the
Office of Management and Budget
(OMB) for review in accordance with
Executive Order 12291. OMB has
concurred with this final rule.
C. Regulatory Flexibility Act
   Pursuant to the Regulatory Flexibility
Act, 5 U.S.C. 601 et seq., whenever an
agency is required to publish a general
notice of rulemaking for any proposed or
final rule, it  must prepare and make
available for public comment a
regulatory flexibility analysis which
describes the impact of the rule on small
entities (e.g., small businesses, small
organizations, and small governmental
jurisdictions). The Administrator may
certify, however, that the rule will not
have a significant economic impact on a
substantial number of small entities. As
stated above, this final rule will have no
adverse impacts on businesses of any
size. Accordingly, I hereby certify  that
this regulation will not have a
significant economic impact on a
substantial number of small entities.
This final rule, therefore, does not
require a regulatory flexibility analysis.

List of Subjects in 40 CFR Part 264
  Hazardous material, Reporting and
recordkeeping requirements, Waste
treatment and disposal, Ground water,
Environmental monitoring.
  Date: September 28, 1988.
Lee M. Thomas,
Administrator.
  Therefore, 40 CFR Chapter I is
amended as  follows:

PART 264—STANDARDS FOR
OWNERS AND OPERATORS OF
HAZARDOUS WASTE TREATMENT,
STORAGE, AND DISPOSAL
FACILITIES

  1. The authority citation for Part 264
continues to read as follows:
  Authority: Secs. 1006, 2002(a), 3004, and
3005 of the Solid Waste Disposal Act, as
amended by the Resource Conservation and
Recovery Act, as amended (42 U.S.C. 6905,
6912(a), 6924, and 6925).
  2. In § 264.91 by revising paragraphs
(a)(1)  and (a)(2) to read as follows:

§ 264.91  Required programs.
  (a) * * *
  (1) Whenever hazardous constituents
under § 264.93 from a regulated unit are
detected at a compliance point under
§ 264.95, the owner or operator must
institute a compliance monitoring
program under § 264.99. Detected is
defined as statistically significant
evidence of contamination as described
in § 264.98(f);
  (2) Whenever the ground-water
protection standard under § 264.92 is
exceeded, the owner or operator must
institute a corrective action program
under § 264.100. Exceeded is defined as
statistically  significant evidence of
increased contamination as described in
§ 264.99(d);
*****
   3. Section 264.92 is revised to read as
follows:

§ 264.92  Ground-water protection
standard.
   The owner or operator must comply
with conditions specified in the facility
permit that are designed to ensure that
hazardous constituents under § 264.93
detected in the ground water from a
regulated  unit do not exceed the
concentration limits under § 264.94 in
the uppermost aquifer underlying the
waste management area beyond the
point of compliance under § 264.95
during the compliance period under
§ 264.96. The Regional Administrator
will establish this ground-water
protection standard in the facility permit
when hazardous constituents have been
detected in the ground water.
  4. In § 264.97 by removing the word
"and" from the end of (a)(1),
redesignating and revising (g)(3) as
(a)(1)(i), adding (a)(3), revising
paragraphs (g) and (h), and adding (i)
and (j), to read as follows:

§ 264.97  General ground-water monitoring
requirements.
  (a) * *  *
  (1) * *  *
  (i) A determination of background
quality may include sampling of wells
that are not hydraulically upgradient of
the waste management area where:
  (A) Hydrogeologic conditions do not
allow the owner or operator to
determine what wells are hydraulically
upgradient; and
  (B) Sampling at other wells will
provide an indication of background
ground-water quality that is
representative or more representative
than that provided  by the upgradient
wells; and
*****
  (3) Allow for the  detection of
contamination when hazardous waste or
hazardous constituents have migrated
from the waste management area to the
uppermost aquifer.
*****
  (g) In detection monitoring or where
appropriate in compliance monitoring,
data on each hazardous constituent
specified in the permit will be collected
from background wells and wells at the
compliance point(s). The number and
kinds of samples collected to establish
background shall be appropriate for the
form of statistical test employed,
following generally accepted statistical
principles. The sample size shall be as
large as necessary to ensure with
reasonable confidence that a
contaminant release to ground  water
from a facility will be detected. The
owner or operator will determine an
appropriate sampling procedure and
interval for each hazardous constituent
listed in the facility permit which shall
be specified in the unit permit upon
approval by the Regional Administrator.
This sampling procedure shall be:
  (1) A sequence of at least four
samples, taken at an interval that
assures, to the greatest extent
technically feasible, that an independent
sample is  obtained, by reference to the
uppermost aquifer's effective porosity,
hydraulic  conductivity, and hydraulic
gradient, and the fate and transport
characteristics of the potential
contaminants, or
  (2) an alternate sampling procedure
proposed by the owner or operator and
approved by the Regional
Administrator.
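
The sampling interval in paragraph (g)(1) depends on how quickly ground water moves past the well. The Python sketch below is one way to illustrate that reasoning, assuming the average horizontal seepage velocity is estimated from Darcy's law as hydraulic conductivity times hydraulic gradient divided by effective porosity. The function names are illustrative. Treating the time for water to travel one well diameter as a lower bound on the interval is an assumption of this sketch, not a requirement stated in the rule.

```python
def seepage_velocity(hydraulic_conductivity, hydraulic_gradient, effective_porosity):
    """Average horizontal seepage velocity, in the length/time units of K."""
    if not 0 < effective_porosity <= 1:
        raise ValueError("effective porosity must be in (0, 1]")
    return hydraulic_conductivity * hydraulic_gradient / effective_porosity


def time_to_pass_well(well_diameter, velocity):
    """Time for ground water to travel one well diameter (assumed lower bound)."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return well_diameter / velocity
```

For example, a conductivity of 10 ft/day, a gradient of 0.01, and an effective porosity of 0.25 give a velocity of about 0.4 ft/day, so water would take about 2.5 days to cross a distance of 1 ft.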
  (h) The owner or operator will specify
one of the following statistical methods
to be used in evaluating ground-water
monitoring data for each hazardous
constituent which, upon approval by the
Regional Administrator, will be
specified in the unit permit. The
statistical test chosen shall be
conducted separately for each
hazardous constituent in each well.
Where practical quantification limits
(pql's) are used in any of the following
statistical procedures to comply with
§ 264.97(i)(5), the pql must be proposed
by the owner or operator and  approved
by the Regional Administrator. Use of
any of the following statistical methods
must be protective  of human health and
the environment and must comply with
the performance standards outlined in
paragraph (i)  of this section.
  (1) A parametric  analysis of variance
(ANOVA) followed by multiple
comparisons procedures to identify
statistically significant evidence of
contamination. The method must
include estimation and testing of the
contrasts between each compliance
well's mean and the background mean
levels for each constituent.
  (2) An analysis of variance (ANOVA)
based on ranks followed by multiple
comparisons procedures to identify
statistically significant evidence of
contamination. The method must
include estimation and testing of the
contrasts between each compliance
well's median and the background
median levels for each constituent.
  (3) A tolerance or prediction interval
procedure in which an interval for each
constituent is established from the
distribution of the background data, and
the level of each constituent in each
compliance well is compared to the
upper tolerance or prediction limit.
  (4) A control chart approach that gives
control limits for each constituent.
  (5) Another statistical test method
submitted by the owner or operator and
approved by the Regional
Administrator.
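
As an illustration of the first method listed in paragraph (h), the Python sketch below computes the one-way analysis of variance F statistic for one constituent, with each group holding the samples from one well (background or compliance). It is a minimal sketch, not the procedure prescribed by the rule. The function name is illustrative, the resulting F value would still be compared with a tabulated critical value, and the multiple comparisons step that follows a significant result is not shown.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0  # variation of well means about the grand mean
    ss_within = 0.0   # variation of samples about their own well mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three wells having samples (1, 2, 3), (2, 3, 4), and (5, 6, 7), the between-well sum of squares is 26 and the within-well sum of squares is 6, giving F = 13 on 2 and 6 degrees of freedom.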
  (i) Any statistical method chosen
under § 264.97(h) for specification in the
unit permit shall comply with  the
following performance standards, as
appropriate:
  (1)  The statistical method used to
evaluate ground-water monitoring data
shall  be appropriate for the distribution
of chemical parameters or hazardous
constituents.  If the distribution of the
chemical parameters or hazardous
constituents is shown by the owner or
operator to be inappropriate for a
normal theory test, then the data should
be transformed or a distribution-free
theory test should be used. If the
distributions for the constituents differ,
more than one statistical method may be
needed.
  (2) If an individual  well comparison
procedure is used to compare an
individual compliance well constituent
concentration with background
constituent concentrations or a ground-
water protection standard, the test shall
be done at a Type I error level no less
than 0.01 for each testing period. If a
multiple comparisons procedure is used,
the Type I experimentwise error rate for
each testing period shall be  no less than
0.05; however, the Type I error of no less
than 0.01 for individual well
comparisons must be maintained. This
performance standard does  not apply to
tolerance intervals, prediction intervals
or control charts.
  (3) If a control chart approach is used
to evaluate ground-water monitoring
data, the specific type of control chart
and its associated parameter values
shall be proposed by the owner or
operator and approved by the Regional
Administrator if he or she finds it to be
protective of human health and the
environment.
  (4) If a tolerance interval or a
prediction interval is used to evaluate
groundwater monitoring data, the levels
of confidence and, for tolerance
intervals, the percentage of  the
population that the interval  must
contain, shall be proposed by the owner
or operator and approved by the
Regional Administrator if he or she finds
these parameters to be protective of
human health and the environment.
These parameters will be determined
after considering the  number of samples
in the background data base, the data
distribution, and the range of the
concentration values for each
constituent of concern.
  (5) The statistical method shall
account for data below the limit of
detection with one or more statistical
procedures that are protective of human
health and the environment. Any
practical quantification limit (pql)
approved by the Regional Administrator
under § 264.97(h) that is used in the
statistical method shall be the lowest
concentration level that can  be reliably
achieved within specified limits of
precision and accuracy during routine
laboratory operating conditions that are
available to the facility.
  (6) If necessary, the statistical method
shall include procedures to control or
correct for seasonal and spatial
variability as well as temporal
correlation in the data.
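
Two of the performance standards in paragraph (i) can be illustrated with short calculations. The Python sketch below is hedged throughout. The first function assumes the experimentwise rate of paragraph (i)(2) is split equally among the comparisons (a Bonferroni-style allocation, which the rule does not name) and then applies the 0.01 minimum for each individual comparison. The second function shows one simple treatment of data below the limit of detection under paragraph (i)(5), replacing each nondetect with half the limit. That substitution is an assumption of this sketch, and any such procedure remains subject to approval by the Regional Administrator.

```python
def per_comparison_alpha(n_comparisons, experimentwise=0.05, individual_floor=0.01):
    """Type I error level for each individual well comparison.

    Assumes an equal split of the experimentwise rate, never dropping
    below the minimum individual rate.
    """
    if n_comparisons < 1:
        raise ValueError("at least one comparison is required")
    return max(experimentwise / n_comparisons, individual_floor)


def substitute_nondetects(results, limit):
    """Replace nondetects (given as None) with half the limit.

    Returns the filled list and the fraction of nondetects, so the
    analyst can judge whether simple substitution is reasonable.
    """
    filled = [limit / 2 if r is None else r for r in results]
    fraction = sum(r is None for r in results) / len(results)
    return filled, fraction
```

With four compliance wells, the equal split gives 0.0125 per comparison. With ten wells, the split would give 0.005, so the 0.01 minimum applies and the experimentwise rate exceeds 0.05.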
  (j) Ground-water monitoring data
collected in accordance with paragraph
(g) of this section including actual levels
of constituents must be maintained in
the facility operating record. The
Regional Administrator will specify in
the permit when the data must be
submitted for review.
  5. In  § 264.98 by removing paragraphs
(i), (j) and (k), and by revising
paragraphs (c), (d), (f), (g), and (h) to
read as follows:

§ 264.98 Detection monitoring program.
*****
  (c) The owner or operator must
conduct a ground-water monitoring
program for each chemical parameter
and hazardous constituent specified in
the permit pursuant to paragraph (a) of
this section in accordance with
§ 264.97(g). The owner or operator must
maintain a record of ground-water
analytical data as measured and in a
form necessary for the determination of
statistical significance under § 264.97(h).
  (d) The Regional Administrator will
specify the frequencies for collecting
samples and conducting  statistical tests
to determine whether there is
statistically significant evidence of
contamination for any parameter or
hazardous constituent specified  in the
permit  under paragraph (a) of this
section in accordance with § 264.97(g). A
sequence of at least four samples from
each well (background and compliance
wells) must be collected  at least semi-
annually during detection monitoring.
*****
  (f) The owner or operator must
determine whether there is statistically
significant evidence of contamination
for any chemical parameter of
hazardous constituent specified  in the
permit  pursuant to paragraph (a) of this
section at a frequency specified  under
paragraph (d) of this section.
  (1) In determining whether
statistically significant evidence of
contamination exists, the owner or
operator must use the method(s)
specified in the permit under § 264.97(h).
These method(s) must compare data
collected at the compliance point(s) to
the background ground-water quality
data.
  (2) The owner or operator must
determine whether there is statistically
significant evidence of contamination at
each monitoring well as the compliance
point within a reasonable period of time
after completion of sampling. The
Regional Administrator will specify in
the facility permit what period of time is
reasonable, after considering the
complexity of the statistical test and the
availability of laboratory facilities to
perform the analysis of ground-water
samples.
  (g) If the owner or operator
determines pursuant to paragraph (f) of
this section that there is statistically
significant evidence of contamination
for chemical parameters or hazardous
constituents specified pursuant to
paragraph (a) of this section at any
monitoring well at the compliance point,
he or she must:
  (1) Notify the Regional Administrator
of this finding in writing within seven
days. The notification must indicate
what chemical parameters or hazardous
constituents have shown statistically
significant evidence of contamination;
  (2) Immediately sample the ground
water in all monitoring  wells and
determine whether constituents in the
list of Appendix IX of Part 264 are
present, and if so, in what
concentration.
  (3) For any Appendix IX compounds
found in the analysis pursuant to
paragraph (g)(2) of  this  section, the
owner or operator may  resample within
one month and repeat the analysis for
those compounds detected. If the results
of the second analysis confirm the initial
results, then these constituents will form
the basis for compliance monitoring. If
the owner or operator does not resample
for the compounds  found pursuant to
paragraph (g)(2) of  this  section, the
hazardous constituents found during this
initial Appendix IX analysis will form
the basis for compliance monitoring.
  (4) Within 90 days, submit to the
Regional Administrator an application
for a permit modification to establish a
compliance monitoring  program meeting
the requirements of § 264.99. The
application must include the following
information:
  (i) An identification of the
concentration or any Appendix IX
constituent detected in the ground water
at each monitoring well at the
compliance point;
  (ii) Any proposed changes to the
ground-water monitoring system at the
facility  necessary to meet the
requirements of § 264.99;
   (iii) Any proposed additions or
changes to the monitoring frequency,
sampling  and analysis procedures or
methods,  or statistical methods used at
the facility necessary to meet the
requirements of § 264.99;
   (iv) For each  hazardous constituent
detected at the compliance point, a
proposed concentration limit under
§ 264.94(a) (1) or (2), or a notice of intent
to seek an alternate concentration limit
under § 264.94(b); and
  (5) Within 180 days, submit to the
Regional Administrator:
  (i) All data necessary to justify an
alternate concentration limit sought
under § 264.94(b); and
  (ii) An engineering feasibility plan for
a corrective action program necessary to
meet the requirement of § 264.100,
unless:
  (A) All hazardous constituents
identified under paragraph (g)(2) of this
section are listed in Table 1 of § 264.94
and their concentrations do not exceed
the respective values given in that
Table; or
  (B) The owner or operator has sought
an alternate concentration limit under
§ 264.94(b) for every hazardous
constituent identified under paragraph
(g)(2) of this section.
  (6) If the owner or operator
determines, pursuant to paragraph (f) of
this section, that there is a statistically
significant difference for chemical
parameters or hazardous constituents
specified pursuant to paragraph (a) of
this section at any monitoring well at
the compliance  point, he or she may
demonstrate that a source other than a
regulated unit caused the contamination
or that the detection is an artifact
caused by an error in sampling,
analysis, or statistical evaluation or
natural variation in the ground water.
The owner or operator may make a
demonstration under this paragraph  in
addition to, or in lieu of, submitting a
permit modification application under
paragraph (g)(4) of this  section;
however, the  owner or operator is not
relieved of the requirement to submit a
permit modification application within
the time specified in paragraph (g)(4) of
this section unless the demonstration
made under this paragraph successfully
shows that a  source other than a
regulated unit caused the increase, or
that the increase resulted from error in
sampling, analysis, or evaluation. In
making a demonstration under this
paragraph, the owner or operator must:
  (i) Notify the  Regional Administrator
in writing within seven days of
determining statistically significant
evidence of contamination at the
compliance point that he intends to
make a demonstration under this
paragraph;
  (ii) Within  90 days, submit a report to
the Regional  Administrator which
demonstrates that a source  other than a
regulated unit caused the contamination
or that the contamination resulted from
error in sampling, analysis,  or
evaluation;
  (iii) Within 90 days, submit to the
Regional Administrator an application
for a permit modification to make any
appropriate changes to the detection
monitoring program facility; and
  (iv) Continue to monitor in accordance
with the detection monitoring program
established under this section.
  (h) If the owner or operator
determines that the detection monitoring
program no longer satisfies the
requirements of this section,  he or she
must, within 90 days, submit an
application for a permit modification to
make any appropriate changes to the
program.
  6. In § 264.99 by revising paragraph
(c), revising paragraphs (d), (f), and (g),
removing paragraph (h), redesignating
paragraph (i) as (h), (j) as (i) and (k) as
(j), revising the redesignated paragraphs
(h) introductory text and (i) introductory
text, and removing paragraph (l) to read
as follows:

§ 264.99   Compliance monitoring program.
*    *    *    *    *
  (c) The Regional Administrator will
specify the sampling procedures and
statistical methods appropriate for the
constituents and the facility, consistent
with § 264.97 (g) and (h).
  (1) The owner or operator must
conduct a sampling program for each
chemical parameter or hazardous
constituent in accordance with
§ 264.97(g).
  (2) The owner or operator must record
ground-water analytical data as
measured and in form necessary for the
determination of statistical significance
under § 264.97(h) for the compliance
period of the facility.
  (d) The owner or operator  must
determine whether there is statistically
significant evidence of increased
contamination for any chemical
parameter or hazardous constituent
specified in the permit, pursuant to
paragraph (a) of this section, at a
frequency specified under paragraph (f)
under this section.
  (1) In determining whether
statistically  significant evidence of
increased contamination  exists, the
owner or operator must use the
method(s) specified in the permit under
§ 264.97(h). The method(s) must
compare data collected at the
compliance point(s) to a concentration
limit developed in accordance with
§ 264.94.
  (2) The owner or operator  must
determine whether there  is statistically
significant evidence of increased
contamination at each monitoring well
at the compliance point within a
reasonable time period after completion
of sampling. The Regional Administrator
will specify  that time period in the
facility permit, after considering the
complexity of the statistical test and the
availability of laboratory facilities to
perform the analysis of ground-water
samples.
*    *    *    *    *
  (f) The Regional Administrator will
specify the frequencies for collecting
samples and conducting statistical tests
to determine statistically significant
evidence of increased contamination in
accordance with § 264.97(g). A sequence
of at least four samples from each well
(background and compliance wells)
must be collected at least semi-annually
during the compliance period of the
facility.
  (g) The owner or operator must
analyze samples from all monitoring
wells at the compliance point for all
constituents contained in Appendix  IX
of Part 264 at least annually to
determine whether additional hazardous
constituents are present in the
uppermost aquifer and, if so, at what
concentration, pursuant to procedures in
§ 264.98(f). If the owner or operator finds
Appendix IX constituents in the ground
water that are not already identified in
the permit as monitoring constituents,
the owner or operator may resample
within one month and repeat the
Appendix IX analysis. If the second
analysis confirms the presence of new
constituents, the owner or operator must
report the concentration of these
additional constituents to the Regional
Administrator within seven days after
the completion of the second analysis
and add them to the monitoring list. If
the owner or operator chooses not to
resample, then he or she must report the
concentrations of these additional
constituents to the Regional
Administrator within seven days after
completion of the initial analysis and
add them to the monitoring list.
  (h) If the owner or operator
determines pursuant to paragraph (d) of
this section that any concentration
limits under § 264.94 are being exceeded
at any monitoring well at the point of
compliance he or she must:
*    *    *    *    *

  (i) If the owner or operator
determines, pursuant to paragraph (d) of
this section, that the ground-water
concentration limits under this section
are being exceeded at any monitoring
well at the point of compliance, he or
she may demonstrate that a source other
than a regulated unit caused the
contamination or that the detection is an
artifact caused by an error in sampling,
analysis, or statistical evaluation or
natural variation in the ground water. In
making a demonstration under this
paragraph, the owner or operator must:
*    *    *     *     *

[FR Doc. 88-22913 Filed 10-7-88; 8:45 am]
BILLING CODE 6560-50-M
-------